Accera v1.2 Reference

`accera.Plan.tensorize(indices, mma_shape [, use_static_offsets, num_total_passes, num_fused_passes, scheduling_policy, prologue_op, epilogue_op])`

Marks the dimensions of the iteration space for tensorization. Available only on targets with native matrix-multiplication instruction (tensor core) support. Only perfectly nested loops of the following form can be tensorized:

for i in range(M):
    for k in range(N):
        for j in range(K):
            C[i, j] += A[i, k] * B[k, j]
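
As a minimal sketch of what this loop nest computes, the plain-Python version below runs it on small nested lists. The shapes (`A` is M x N, `B` is N x K, `C` is M x K) are inferred from the loop bounds above, and the helper name `matmul_accumulate` is hypothetical. This is not Accera code, only a reference for the pattern that tensorization targets.

```python
def matmul_accumulate(A, B, C):
    """Accumulate the matrix product of A and B into C, in place."""
    M = len(A)       # rows of A and C
    N = len(B)       # reduction dimension: columns of A, rows of B
    K = len(B[0])    # columns of B and C
    # Same perfectly nested i, k, j order as the loop nest above.
    for i in range(M):
        for k in range(N):
            for j in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C


A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2
C = [[0, 0],
     [0, 0]]         # 2 x 2, accumulated in place

result = matmul_accumulate(A, B, C)
print(result)        # [[7, 8], [16, 17]]
```

Because the body only accumulates into `C`, a non-zero initial `C` is added to rather than overwritten.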

Arguments

| argument | description | type/default |
| --- | --- | --- |
| `indices` | The 3-dimensional iteration space to tensorize. | 3-D tuple of `accera.Index` |
| `mma_shape` | The type of MMA operation to use. | `accera.MMAShape` |
| `use_static_offsets` | Optimization flag. When enabled, precomputed offset maps stored in device constant memory are used. Defaults to `False`. | bool |
| `num_total_passes` | The total number of passes to run. Defaults to 1. | positive integer |
| `num_fused_passes` | The number of passes for which register allocation is done; a higher value allocates more registers. Defaults to `None`, which fuses all the passes specified by `num_total_passes`. | positive integer |
| `scheduling_policy` | For multi-block MMA operations, controls whether matrix multiplication is done block-by-block or pass-by-pass (affects register usage). Defaults to `accera.MMASchedulingPolicy.PASS_ORDER`. | `accera.MMASchedulingPolicy` |
| `prologue_op` | The element-wise operation to apply to matrix fragment data as part of initialization (pre-tensorization). Defaults to `accera.MMAFragmentOp.NONE`. | `accera.MMAFragmentOp` |
| `epilogue_op` | The element-wise operation to apply to matrix fragment data as part of the final store (post-tensorization). Defaults to `accera.MMAFragmentOp.NONE`. | `accera.MMAFragmentOp` |
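
The interaction between `num_total_passes` and `num_fused_passes` can be pictured with a small plain-Python sketch. The function name `plan_fused_passes` is hypothetical, and the grouping of passes into consecutive fused chunks is an assumption made for illustration; only the `None` default (fuse all passes) comes from the table above. This is not how Accera itself is implemented.

```python
def plan_fused_passes(num_total_passes=1, num_fused_passes=None):
    """Return lists of pass indices, one list per fused group (illustrative only)."""
    if num_total_passes < 1:
        raise ValueError("num_total_passes must be a positive integer")
    # Documented default: None fuses all passes given by num_total_passes.
    if num_fused_passes is None:
        num_fused_passes = num_total_passes
    if num_fused_passes < 1:
        raise ValueError("num_fused_passes must be a positive integer")
    # Assumption: fused passes form consecutive groups of num_fused_passes.
    return [
        list(range(start, min(start + num_fused_passes, num_total_passes)))
        for start in range(0, num_total_passes, num_fused_passes)
    ]


default_groups = plan_fused_passes(num_total_passes=4)
paired_groups = plan_fused_passes(num_total_passes=4, num_fused_passes=2)
print(default_groups)   # [[0, 1, 2, 3]]
print(paired_groups)    # [[0, 1], [2, 3]]
```

Under this picture, larger groups correspond to more registers being live at once, which matches the trade-off described for `num_fused_passes`.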

The values of the `MMAShape` enum are described in the `accera.MMAShape` reference.

The values of the `MMASchedulingPolicy` enum (applicable only to AMD targets that support MFMA ops, such as `accera.Target.Model.AMD_MI100`) are described in the `accera.MMASchedulingPolicy` reference.
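
To make the block-by-block versus pass-by-pass distinction concrete, the sketch below lists the order in which (block, pass) pairs would be visited under each scheme. The function `mma_issue_order` and the policy strings are hypothetical names chosen for this illustration, not Accera identifiers, and the loop orders are a plain reading of the two phrases rather than a description of the generated code.

```python
from itertools import product


def mma_issue_order(num_blocks, num_passes, policy):
    """List (block, pass) pairs in the order the given policy visits them."""
    if policy == "pass_by_pass":
        # Outer loop over passes: every block completes pass p before pass p + 1.
        return [(b, p) for p, b in product(range(num_passes), range(num_blocks))]
    if policy == "block_by_block":
        # Outer loop over blocks: one block completes all passes before the next.
        return [(b, p) for b, p in product(range(num_blocks), range(num_passes))]
    raise ValueError(f"unknown policy: {policy!r}")


pass_order = mma_issue_order(2, 2, "pass_by_pass")
block_order = mma_issue_order(2, 2, "block_by_block")
print(pass_order)    # [(0, 0), (1, 0), (0, 1), (1, 1)]
print(block_order)   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Both orders visit the same set of pairs; they differ only in which data stays live between consecutive operations, which is why the choice affects register usage.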

The values of the `MMAFragmentOp` enum are described in the `accera.MMAFragmentOp` reference.

Examples

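Mark the dimensions `ii`, `jj`, and `kk` for tensorization. `mma_shape` is a required argument; the shape shown here is only an illustrative choice and should be replaced with a value from `accera.MMAShape` that the target supports:

plan.tensorize(indices=(ii, jj, kk), mma_shape=accera.MMAShape.M16xN16xK4_B1)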

Last update: 2023-04-17