Accera v1.2 Reference
accera.Plan.tensorize(indices, mma_shape [, use_static_offsets, num_total_passes, num_fused_passes, scheduling_policy, prologue_op, epilogue_op])
Only available for targets with native matrix multiplication instruction (tensor core) support. Marks the dimensions of the iteration space for tensorization. Only perfectly nested loops of the following form can be tensorized:
for i in range(M):
    for k in range(N):
        for j in range(K):
            C[i, j] += A[i, k] * B[k, j]
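To make the required loop shape concrete, here is a minimal pure-Python sketch of that nest next to a tile-by-tile version of the same computation. It does not use Accera. The function names and the tile sizes are hypothetical, and each innermost tile merely stands in for the work one hardware MMA instruction might cover. Note that in this nest `k` is the reduction index, so `A` is M x N, `B` is N x K, and `C` is M x K.

```python
def matmul_reference(A, B, M, N, K):
    """The loop nest from the text: A is M x N, B is N x K, C is M x K."""
    C = [[0.0] * K for _ in range(M)]
    for i in range(M):
        for k in range(N):          # k is the reduction index
            for j in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_tiled(A, B, M, N, K, tile=(2, 2, 2)):
    """Same computation, visited one small (ti x tk x tj) tile at a time.

    The tile sizes are illustrative only; real MMA shapes are fixed by
    the hardware and selected through mma_shape.
    """
    ti, tk, tj = tile
    # This sketch assumes the tile sizes divide the problem sizes evenly.
    assert M % ti == 0 and N % tk == 0 and K % tj == 0
    C = [[0.0] * K for _ in range(M)]
    for i0 in range(0, M, ti):
        for k0 in range(0, N, tk):
            for j0 in range(0, K, tj):
                # Innermost tile: the part a matrix instruction would replace.
                for i in range(i0, i0 + ti):
                    for k in range(k0, k0 + tk):
                        for j in range(j0, j0 + tj):
                            C[i][j] += A[i][k] * B[k][j]
    return C
```

Both functions accumulate each `C[i][j]` over `k` in the same order, so they return identical results for the same inputs.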
Arguments
argument | description | type/default |
---|---|---|
`indices` | The 3-dimensional iteration space to tensorize. | 3-D tuple of `accera.Index` |
`mma_shape` | The type of MMA operation to use. | `accera.MMAShape` |
`use_static_offsets` | An optimization flag. When enabled, precomputed offset maps stored in device constant memory are used. Defaults to `False`. | bool |
`num_total_passes` | The total number of passes to run. Defaults to 1. | positive integer |
`num_fused_passes` | The number of passes for which register allocation is done; the higher the value, the more registers are allocated. Defaults to `None`, which fuses all the passes specified by `num_total_passes`. | positive integer |
`scheduling_policy` | For multi-block MMA operations, controls whether matrix multiplication is done block-by-block or pass-by-pass (affects register usage). Defaults to `accera.MMASchedulingPolicy.PASS_ORDER`. | `accera.MMASchedulingPolicy` |
`prologue_op` | The element-wise operation to apply to matrix fragment data as part of initialization (pre-tensorization). Defaults to `accera.MMAFragmentOp.NONE`. | `accera.MMAFragmentOp` |
`epilogue_op` | The element-wise operation to apply to matrix fragment data as part of the final store (post-tensorization). Defaults to `accera.MMAFragmentOp.NONE`. | `accera.MMAFragmentOp` |
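As a rough mental model of the pass and scheduling arguments, the sketch below enumerates the order in which (block, pass) pairs could be visited and how passes could be grouped for fusing. It is not Accera's code generation. The function names are hypothetical, the `pass_order` boolean stands in for the scheduling policy, and the grouping of fused passes into consecutive chunks is an assumption made for illustration.

```python
def mma_visit_order(num_blocks, num_passes, pass_order=True):
    """Enumerate (block, pass) pairs in a plausible visiting order.

    pass_order=True  : finish one pass across every block, then the next pass.
    pass_order=False : finish every pass of one block, then the next block.
    """
    if num_blocks < 1 or num_passes < 1:
        raise ValueError("num_blocks and num_passes must be positive")
    if pass_order:
        return [(b, p) for p in range(num_passes) for b in range(num_blocks)]
    return [(b, p) for b in range(num_blocks) for p in range(num_passes)]


def fused_pass_groups(num_total_passes, num_fused_passes=None):
    """Group pass indices into fused chunks.

    None means "fuse everything", matching the documented default.
    Consecutive chunking is an assumption of this sketch.
    """
    if num_total_passes < 1:
        raise ValueError("num_total_passes must be positive")
    if num_fused_passes is None:
        num_fused_passes = num_total_passes
    if num_fused_passes < 1:
        raise ValueError("num_fused_passes must be positive")
    return [
        list(range(start, min(start + num_fused_passes, num_total_passes)))
        for start in range(0, num_total_passes, num_fused_passes)
    ]
```

For example, `fused_pass_groups(4)` yields a single group holding all four passes, while `fused_pass_groups(4, 2)` yields two groups of two, which in this model would need fewer registers live at once.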
The different values of the enum `MMAShape` are explained here: accera.MMAShape
The different values of the enum `MMASchedulingPolicy` (applicable only for AMD targets supporting MFMA ops, such as `accera.Target.Model.AMD_MI100`) are mentioned here: accera.MMASchedulingPolicy
The different values of the enum `MMAFragmentOp` are explained here: accera.MMAFragmentOp
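The role of `prologue_op` and `epilogue_op` can be pictured with a small pure-Python sketch: an element-wise function applied to the accumulator tile when it is initialized, and another applied just before the result is stored. This is a conceptual illustration and not Accera's implementation. The function `tensorized_tile` and the example operations (a reset to zero and a ReLU) are hypothetical stand-ins, not values of `accera.MMAFragmentOp`.

```python
def tensorized_tile(a_tile, b_tile, c_tile, prologue=None, epilogue=None):
    """Multiply-accumulate one tile with optional element-wise hooks.

    prologue: applied to each accumulator element after loading c_tile.
    epilogue: applied to each accumulator element right before the store.
    """
    rows, inner, cols = len(a_tile), len(b_tile), len(b_tile[0])
    acc = [row[:] for row in c_tile]  # load the accumulator fragment
    if prologue is not None:
        acc = [[prologue(x) for x in row] for row in acc]
    for i in range(rows):
        for k in range(inner):
            for j in range(cols):
                acc[i][j] += a_tile[i][k] * b_tile[k][j]
    if epilogue is not None:
        acc = [[epilogue(x) for x in row] for row in acc]
    return acc  # the caller stores this back into C


# Hypothetical element-wise operations for illustration.
def set_zero(x):
    return 0.0


def relu(x):
    return max(x, 0.0)
```

With no hooks the tile behaves like the plain `C += A * B` update; passing `prologue=set_zero` ignores whatever was previously in the accumulator, and passing `epilogue=relu` clamps negative results before they are written out.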
Examples
Mark the dimensions `ii`, `jj`, and `kk` for tensorization execution:
plan.tensorize(indices=(ii, jj, kk), mma_shape=accera.MMAShape.M16xN16xK4_B1)
Last update: 2023-04-17