Collectives
High-level communication patterns expanded into concrete flows.
Collectives are higher-level communication patterns (common in ML training workloads) that are expanded into concrete flows at simulation start (Topology::process_collectives in src/topos/topo.rs).
Parsing is implemented in src/flows/collective.rs.
Supported collective_type values:
- Broadcast
- Gather
- AllReduce
- RingAllReduce
Collectives can be specified individually ([[collective]]) or as sets ([[collective_set]]).
[[collective]] schema
Example (broadcast with packet distribution flows):
[[collective]]
collective_type = "Broadcast"
flow_type = "PacketDistribution"
flow_count = 2
sources = [0, 0]
sinks = [1, 2]
[collective.traffic]
initial_delay = 1.0
duration = 10.0
arr_dist = { type = "Uniform", low = 1.0, high = 1.0 }
pkt_size_dist = { type = "DiscreteUniform", low = 1000, high = 1500 }
Supported fields:
- collective_type (required)
- first_flow_id (optional): base ID for flows produced by this collective
- flow_type (required): PacketDistribution or TCP (or DCQCN if enabled)
- flow_count (required): number of flows in the collective
- paths (optional): explicit per-flow paths; sources/sinks are derived from each path’s endpoints
- sources / sinks (optional): explicit per-flow endpoints
- routing (optional): routing protocol for produced flows
- traffic (required): [collective.traffic] (same schema as flows)
Notes:
- The optional graph field is currently stored but not used to derive endpoints; to control endpoints, use sources/sinks or paths.
Endpoint selection rules
Days determines (source_host, sink_host) for each flow in this order:
- If sources and sinks are provided, they must each contain flow_count entries and satisfy type-specific constraints:
  - Broadcast: all entries of sources must be the same
  - Gather: all entries of sinks must be the same
  - RingAllReduce: ring consistency (sources[i] == sinks[i-1])
- Else, if paths is provided, it must contain flow_count paths:
  - sources are path[0], sinks are path[last]
  - for RingAllReduce, each sink must match the next flow’s source
- Else, endpoints are generated randomly from hosts (seeded by seed):
  - Broadcast: pick one source, choose sinks from the other hosts
  - Gather / AllReduce: pick one sink, choose sources from the other hosts
  - RingAllReduce: requires flow_count == hosts.len() and uses all hosts in a ring
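The checks on explicit endpoints can be sketched in Rust as follows. This is an illustrative sketch only, not the code in src/flows/collective.rs: the names CollectiveType, check_endpoints, and path_endpoints are hypothetical, host IDs are assumed to be usize, and the ring check covers only the documented sources[i] == sinks[i-1] relation for i >= 1 (whether the last sink must wrap around to the first source is not stated here, so it is not checked).

```rust
// Illustrative sketch; all names here are hypothetical, not Days' API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CollectiveType {
    Broadcast,
    Gather,
    AllReduce,
    RingAllReduce,
}

/// Check explicit `sources`/`sinks` against the documented constraints.
pub fn check_endpoints(
    kind: CollectiveType,
    flow_count: usize,
    sources: &[usize],
    sinks: &[usize],
) -> Result<(), String> {
    // Both lists must contain exactly `flow_count` entries.
    if sources.len() != flow_count || sinks.len() != flow_count {
        return Err(format!(
            "expected {flow_count} sources and sinks, got {} and {}",
            sources.len(),
            sinks.len()
        ));
    }
    // True when every entry equals its neighbour (also true for empty lists).
    let all_same = |v: &[usize]| v.windows(2).all(|w| w[0] == w[1]);
    match kind {
        CollectiveType::Broadcast if !all_same(sources) => {
            Err("Broadcast: all sources must be the same".into())
        }
        CollectiveType::Gather if !all_same(sinks) => {
            Err("Gather: all sinks must be the same".into())
        }
        // Only the documented relation is checked; no wrap-around is assumed.
        CollectiveType::RingAllReduce
            if (1..flow_count).any(|i| sources[i] != sinks[i - 1]) =>
        {
            Err("RingAllReduce: sources[i] must equal sinks[i-1]".into())
        }
        // No type-specific constraint is documented for AllReduce.
        _ => Ok(()),
    }
}

/// Derive (source, sink) from an explicit path: its first and last node.
/// Returns None for an empty path.
pub fn path_endpoints(path: &[usize]) -> Option<(usize, usize)> {
    Some((*path.first()?, *path.last()?))
}
```

With these definitions, the Broadcast example above (sources = [0, 0], sinks = [1, 2]) passes the check, while a Broadcast with sources = [0, 1] would be rejected.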
[[collective_set]] schema
Collective sets generate multiple collectives with the same shape.
Example:
[[collective_set]]
collective_type = "Gather"
collective_count = 2
flow_type = "PacketDistribution"
flow_count = 2
sources = [[0, 1], [1, 2]]
sinks = [[2, 2], [3, 3]]
[collective_set.traffic]
initial_delay = 1.0
duration = 10.0
arr_dist = { type = "Uniform", low = 1.0, high = 1.0 }
pkt_size_dist = { type = "DiscreteUniform", low = 512, high = 512 }
Fields:
- collective_type (required)
- collective_count (required)
- first_flow_id (optional): base flow ID for the entire set
- flow_type (required)
- flow_count (required): number of flows per collective in the set
- sources / sinks (optional): list-of-lists, one entry per collective
- routing (optional)
- traffic (required)
Flow ID assignment:
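- Each collective in the set gets a contiguous flow ID range of flow_count IDs, where index is the collective's zero-based position in the set and the upper bound is exclusive: first_flow_id + index * flow_count .. first_flow_id + (index + 1) * flow_count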
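The range formula can be written out as a small Rust helper. This is a sketch: flow_id_range is a hypothetical name, and u64 is an assumed ID type rather than the one Days uses.

```rust
use std::ops::Range;

/// Flow ID range for the collective at `index` (zero-based) within a set.
/// The upper bound is exclusive, so consecutive collectives do not overlap.
pub fn flow_id_range(first_flow_id: u64, index: u64, flow_count: u64) -> Range<u64> {
    let start = first_flow_id + index * flow_count;
    start..start + flow_count
}
```

For example, with first_flow_id = 100 and flow_count = 2, the first collective gets IDs 100 and 101 and the second gets 102 and 103.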
TCP collectives and application data
For TCP collectives, Days can optionally back flows with application-level byte buffers (src/flows/app_source.rs).
Behavior:
- TCP Broadcast: all flows in the broadcast share the same buffer, so each receiver gets the same byte stream.
- TCP RingAllReduce: each hop sends a specific chunk (byte range) of the overall buffer, using per-flow offset/length handles.
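The difference between the two behaviors can be illustrated with offset/length views over one shared buffer. This is a rough sketch and not the code in src/flows/app_source.rs: ChunkHandle, whole_buffer_handles, and ring_chunk_handles are hypothetical names, and the near-equal split of the buffer across flows is an assumption made for illustration (how Days sizes chunks is not described here).

```rust
use std::sync::Arc;

/// Hypothetical handle: a view of `len` bytes starting at `offset`
/// in a buffer shared between flows.
#[derive(Debug, Clone)]
pub struct ChunkHandle {
    buf: Arc<[u8]>,
    offset: usize,
    len: usize,
}

impl ChunkHandle {
    /// The byte range this handle refers to.
    pub fn bytes(&self) -> &[u8] {
        &self.buf[self.offset..self.offset + self.len]
    }
}

/// Broadcast-style: every flow refers to the whole shared buffer,
/// so each receiver sees the same byte stream.
pub fn whole_buffer_handles(buf: &Arc<[u8]>, flow_count: usize) -> Vec<ChunkHandle> {
    (0..flow_count)
        .map(|_| ChunkHandle { buf: Arc::clone(buf), offset: 0, len: buf.len() })
        .collect()
}

/// Ring-style: one chunk per flow. ASSUMPTION: near-equal split, with any
/// remainder bytes spread over the first chunks.
pub fn ring_chunk_handles(buf: &Arc<[u8]>, flow_count: usize) -> Vec<ChunkHandle> {
    let parts = flow_count.max(1); // avoid division by zero
    let base = buf.len() / parts;
    let extra = buf.len() % parts;
    let mut offset = 0;
    (0..flow_count)
        .map(|i| {
            let len = base + usize::from(i < extra);
            let handle = ChunkHandle { buf: Arc::clone(buf), offset, len };
            offset += len;
            handle
        })
        .collect()
}
```

In both cases the handles share one allocation through Arc, so no bytes are copied per flow; only the offset and length differ.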
Tune the app-source actor via [app_source] (see /docs/configuration/logging).