mscclpp

MSCCL++ Python API.

Functions

deprecated(new_cls)

get_include()

Return the directory that contains the MSCCL++ headers.

get_lib()

Return the directory that contains the MSCCL++ library.

class mscclpp.Algorithm(id=None, execution_plan=None, native_handle=None, tags=None, constraint=None)

Bases: object

A wrapper for collective communication algorithms.

This class provides a Python interface for collective communication algorithms such as allreduce, allgather, and reduce-scatter. Algorithms can be either DSL-based (defined using MSCCL++ execution plans) or native (implemented in C++/CUDA).

Parameters:
  • id (Optional[str])

  • execution_plan (Optional[CppExecutionPlan])

  • native_handle (Optional[CppAlgorithm])

  • tags (Optional[Dict[str, int]])

  • constraint (Optional[Constraint])

name

Human-readable name of the algorithm.

collective

The collective operation this algorithm implements (e.g., “allreduce”).

message_size_range

Tuple of (min_size, max_size) in bytes for valid message sizes.

tags

Dictionary of tag names to tag values for algorithm selection hints.

buffer_mode

The buffer mode supported by this algorithm (IN_PLACE, OUT_OF_PLACE, or ANY).

class Constraint(world_size=0, n_ranks_per_node=0)

Bases: object

Constraints that define valid execution environments for the algorithm.

Parameters:
  • world_size (int) – Required world size (number of ranks). 0 means any size.

  • n_ranks_per_node (int) – Required number of ranks per node. 0 means any.
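The "0 means any" rule for both fields can be illustrated with a small standalone check. This is a sketch of the semantics described above, not the library's own matching code; `satisfies_constraint` and its parameter names are hypothetical.

```python
def satisfies_constraint(world_size: int,
                         n_ranks_per_node: int,
                         required_world_size: int = 0,
                         required_n_ranks_per_node: int = 0) -> bool:
    """Return True if the environment meets the requirement.

    A required value of 0 is treated as a wildcard, mirroring the
    defaults of Constraint(world_size=0, n_ranks_per_node=0).
    Hypothetical helper for illustration only.
    """
    world_ok = required_world_size in (0, world_size)
    node_ok = required_n_ranks_per_node in (0, n_ranks_per_node)
    return world_ok and node_ok


# A 16-rank job with 8 ranks per node:
print(satisfies_constraint(16, 8))                                # no requirement
print(satisfies_constraint(16, 8, required_world_size=8))         # wrong world size
print(satisfies_constraint(16, 8, required_n_ranks_per_node=8))   # matching node size
```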

property buffer_mode: CppCollectiveBufferMode

The buffer mode supported by this algorithm (IN_PLACE, OUT_OF_PLACE, or ANY).

property collective: str

The collective operation this algorithm implements (e.g., “allreduce”, “allgather”).

classmethod create_from_native_capsule(obj)

Create an Algorithm instance from a PyCapsule object.

Parameters:

obj – A PyCapsule containing a native algorithm pointer.

Returns:

A new Algorithm instance wrapping the algorithm from the capsule.

classmethod create_from_native_handle(handle)

Create an Algorithm instance from a native C++ algorithm handle.

Parameters:

handle (CppAlgorithm) – The native C++ algorithm handle.

Returns:

A new Algorithm instance wrapping the native handle.

execute(comm, input_buffer, output_buffer, input_size, output_size, dtype, op=CppReduceOp.NOP, stream=0, executor=None, nblocks=0, nthreads_per_block=0, extras=None)

Execute the collective algorithm.

Parameters:
  • comm (CppCommunicator) – The communicator to use.

  • input_buffer (int) – Device pointer to the input buffer.

  • output_buffer (int) – Device pointer to the output buffer.

  • input_size (int) – Size of the input buffer in bytes.

  • output_size (int) – Size of the output buffer in bytes.

  • dtype (CppDataType) – Data type of the elements.

  • op (CppReduceOp) – Reduction operation for reduce-type collectives (default: NOP).

  • stream (int) – CUDA stream to execute on (default: 0).

  • executor (Optional[CppExecutor]) – The executor for DSL algorithms (required for DSL, optional for native).

  • nblocks (int) – Number of CUDA blocks (0 for auto-selection).

  • nthreads_per_block (int) – Number of threads per block (0 for auto-selection).

  • extras (Optional[Dict[str, int]]) – Additional algorithm-specific parameters.

Return type:

int

Returns:

The result code (0 for success).
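Because execute() reports failure through its integer result and requires an executor for DSL algorithms, callers may want a thin wrapper that checks both. The following is a minimal sketch under those two documented rules; `run_algorithm` and `FakeAlgorithm` are hypothetical names, and the fake class exists only so the sketch runs without a GPU or an installed mscclpp package.

```python
from typing import Any, Optional


class FakeAlgorithm:
    """Stand-in exposing the two methods used below; not part of MSCCL++."""

    def __init__(self, dsl: bool, result_code: int = 0) -> None:
        self._dsl = dsl
        self._result_code = result_code
        self.calls = 0

    def is_dsl_algorithm(self) -> bool:
        return self._dsl

    def execute(self, comm, input_buffer, output_buffer, input_size,
                output_size, dtype, stream=0, executor=None) -> int:
        self.calls += 1
        return self._result_code


def run_algorithm(algo: Any, comm: Any, input_ptr: int, output_ptr: int,
                  input_size: int, output_size: int, dtype: Any,
                  stream: int = 0, executor: Optional[Any] = None) -> None:
    """Call algo.execute(...) and raise on a non-zero result code."""
    # The executor is documented as required for DSL algorithms.
    if algo.is_dsl_algorithm() and executor is None:
        raise ValueError("DSL algorithms need an executor")
    # op, nblocks, nthreads_per_block and extras keep their defaults.
    rc = algo.execute(comm, input_ptr, output_ptr, input_size, output_size,
                      dtype, stream=stream, executor=executor)
    if rc != 0:
        raise RuntimeError(f"execute() returned result code {rc}")


algo = FakeAlgorithm(dsl=False)
run_algorithm(algo, comm=None, input_ptr=0, output_ptr=0,
              input_size=1024, output_size=1024, dtype=None)
print(algo.calls)
```

With a real Algorithm instance, the buffer arguments would be device pointers passed as integers and `dtype` a CppDataType value, as listed in the parameters above.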

is_dsl_algorithm()

Check if this is a DSL-based algorithm.

Return type:

bool

Returns:

True if this algorithm is defined using DSL/execution plan, False otherwise.

is_native_algorithm()

Check if this is a native C++/CUDA algorithm.

Return type:

bool

Returns:

True if this algorithm is implemented natively, False otherwise.

property message_size_range: Tuple[int, int]

The valid message size range (min_size, max_size) in bytes.

property name: str

The human-readable name of the algorithm.

property tags: Dict[str, int]

Dictionary of tag names to tag values for algorithm selection hints.
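The collective and message_size_range properties lend themselves to a simple filter when several algorithms are available. The sketch below is illustrative only: `AlgoInfo` is a hypothetical stand-in carrying the same attribute names, and treating the range as inclusive on both ends is an assumption, since the reference does not say whether the bounds are inclusive.

```python
from typing import Iterable, List, NamedTuple, Tuple


class AlgoInfo(NamedTuple):
    """Hypothetical stand-in mirroring three Algorithm properties."""
    name: str
    collective: str
    message_size_range: Tuple[int, int]


def candidates_for(algos: Iterable[AlgoInfo], collective: str,
                   nbytes: int) -> List[AlgoInfo]:
    """Keep algorithms for `collective` whose size range covers nbytes.

    Assumption: both ends of (min_size, max_size) are inclusive.
    """
    return [
        a for a in algos
        if a.collective == collective
        and a.message_size_range[0] <= nbytes <= a.message_size_range[1]
    ]


# Made-up entries for demonstration; the names are not real MSCCL++ algorithms.
algos = [
    AlgoInfo("small_allreduce", "allreduce", (0, 1 << 20)),
    AlgoInfo("large_allreduce", "allreduce", ((1 << 20) + 1, 1 << 30)),
    AlgoInfo("basic_allgather", "allgather", (0, 1 << 30)),
]
print([a.name for a in candidates_for(algos, "allreduce", 4096)])
```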

class mscclpp.GpuBuffer(*args, **kwargs)

Bases: ndarray

mscclpp.get_include()

Return the directory that contains the MSCCL++ headers.

Return type:

str

mscclpp.get_lib()

Return the directory that contains the MSCCL++ library.

Return type:

str
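One common use of these two helpers is to assemble compiler and linker flags for an extension that builds against MSCCL++. The sketch below takes the directories as plain strings so it runs without the package installed; `build_compile_flags` is a hypothetical helper, and the default library name `mscclpp` is an assumption that should be checked against the files actually present in the directory returned by get_lib().

```python
from typing import List


def build_compile_flags(include_dir: str, lib_dir: str,
                        lib_name: str = "mscclpp") -> List[str]:
    """Assemble -I/-L/-l flags from the two directories.

    lib_name is an assumed default, not taken from the reference.
    """
    return [f"-I{include_dir}", f"-L{lib_dir}", f"-l{lib_name}"]


# With mscclpp installed, the directories would come from the package:
#   import mscclpp
#   flags = build_compile_flags(mscclpp.get_include(), mscclpp.get_lib())
print(build_compile_flags("/opt/mscclpp/include", "/opt/mscclpp/lib"))
```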

Modules

default_algos

ext

language

MSCCL++ DSL.

utils