mscclpp.language.program

Classes

CollectiveProgram(name, collective, num_ranks)

A program definition for MSCCL++ collective communication operations.

class mscclpp.language.program.CollectiveProgram(name, collective, num_ranks, instances=1, protocol='Simple', instr_fusion=True, auto_sync=True, replication_policy=ReplicationPolicy.interleaved, reuse_resources=False, num_threads_per_block=1024, use_double_scratch_buffer=False, buffer_alignment=16, min_message_size=0, max_message_size=18446744073709551615)

Bases: object

A program definition for MSCCL++ collective communication operations.

CollectiveProgram is the main container for defining and executing collective communication programs with the MSCCL++ DSL. It manages GPU resources, channels, and operations, and serializes the program to JSON for execution.
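As a minimal sketch of how the constructor's keyword arguments fit together, the following collects the defaults from the signature above into a plain dictionary and merges overrides into it. The helper name program_options and its sanity checks are hypothetical and not part of MSCCL++; replication_policy is left out so the snippet runs without the library installed.

```python
# Keyword defaults copied from the documented CollectiveProgram signature.
# replication_policy is omitted because its default is an MSCCL++ enum value.
DEFAULTS = {
    "instances": 1,
    "protocol": "Simple",
    "instr_fusion": True,
    "auto_sync": True,
    "reuse_resources": False,
    "num_threads_per_block": 1024,
    "use_double_scratch_buffer": False,
    "buffer_alignment": 16,
    "min_message_size": 0,
    "max_message_size": 2**64 - 1,
}


def program_options(**overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown option(s): {sorted(unknown)}")
    options = {**DEFAULTS, **overrides}
    # Sanity checks assumed for illustration; the library may check differently.
    if options["protocol"] not in ("Simple", "LL"):
        raise ValueError("protocol must be 'Simple' or 'LL'")
    if options["min_message_size"] > options["max_message_size"]:
        raise ValueError("min_message_size exceeds max_message_size")
    return options


options = program_options(protocol="LL", instances=2)
# With MSCCL++ installed, the options would be passed as keyword arguments:
#   CollectiveProgram(name, collective, num_ranks, **options)
```

Keeping the options in a dictionary makes it easy to reuse one configuration across several programs that differ only in name or collective.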

Parameters:
  • name (str)
  • collective (Collective)
  • num_ranks (int)
  • instances (int)
  • protocol (str)
  • instr_fusion (bool)
  • auto_sync (bool)
  • replication_policy (ReplicationPolicy)
  • reuse_resources (bool)
  • num_threads_per_block (int)
  • use_double_scratch_buffer (bool)
  • buffer_alignment (int)
  • min_message_size (int)
  • max_message_size (int)

name

The name of the program.

Type:

str

collective

The collective operation this program implements.

Type:

Collective

num_ranks

The number of ranks participating in the program.

Type:

int

instances

The number of instances to replicate.

Type:

int

protocol

The communication protocol ("Simple" or "LL").

Type:

str

instr_fusion

Whether to enable instruction fusion optimization.

Type:

bool

replication_policy

The policy for replicating operations.

Type:

ReplicationPolicy

reuse_resources

Whether to reuse resources across instances.

Type:

bool

num_threads_per_block

Number of threads per GPU thread block.

Type:

int

use_double_scratch_buffer

Whether to use double scratch buffering.

Type:

bool

buffer_alignment

Buffer alignment in bytes.

Type:

int
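For illustration, aligning a size to a byte boundary usually means rounding it up to the next multiple of the alignment. The helper align_up below is hypothetical; how MSCCL++ applies buffer_alignment internally is not stated here.

```python
def align_up(size: int, alignment: int = 16) -> int:
    """Round size up to the next multiple of alignment (hypothetical helper)."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    # Integer arithmetic: add (alignment - 1), then truncate to a multiple.
    return (size + alignment - 1) // alignment * alignment


# Sizes already on a 16-byte boundary are unchanged; others are rounded up.
aligned = [align_up(n) for n in (0, 1, 16, 17)]
```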

min_message_size

Minimum message size for this program.

Type:

int

max_message_size

Maximum message size for this program. The default, 18446744073709551615, is 2**64 - 1, the largest unsigned 64-bit integer.

Type:

int
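One plausible use of the two bounds is to pick, for a given message size, a program whose range covers it. The sketch below is illustrative only: ProgramRange and select_program are hypothetical names that do not exist in MSCCL++, and treating both bounds as inclusive is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

UINT64_MAX = 2**64 - 1  # equals the documented default, 18446744073709551615


@dataclass(frozen=True)
class ProgramRange:
    """Hypothetical stand-in holding only a program's name and size bounds."""

    name: str
    min_message_size: int = 0
    max_message_size: int = UINT64_MAX

    def covers(self, message_size: int) -> bool:
        # Assumption: both bounds are inclusive.
        return self.min_message_size <= message_size <= self.max_message_size


def select_program(
    programs: Sequence[ProgramRange], message_size: int
) -> Optional[ProgramRange]:
    """Return the first program whose range covers message_size, else None."""
    for program in programs:
        if program.covers(message_size):
            return program
    return None


programs = [
    ProgramRange("small_messages", max_message_size=1 << 20),
    ProgramRange("large_messages", min_message_size=(1 << 20) + 1),
]
chosen = select_program(programs, 4 << 20)
```

With the default bounds of 0 and 2**64 - 1, a single program covers every message size, so narrower ranges only matter when several programs are registered for the same collective.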

buffers

Buffer configurations for each rank.

Type:

list

gpus

List of GPU objects representing each rank.

Type:

List[Gpu]

loop_context

Current pipeline loop context, if any.