DirectX-Specs

Direct3D 12 Tight Placed Resource Alignment

Background

When placed resources were introduced in D3D12, there was an intentional decision to simplify alignment restrictions and take the greatest common denominator across the IHVs. This resulted in the following alignment requirements:

Type | MIN | MAX
Buffers | 64 KiB (aligned to page table size) | 64 KiB
Textures | 4 KiB (must match definition of “Small resource”) | 64 KiB
MSAA | 64 KiB (must match definition of “Small resource”) | 4 MiB

A “Small resource” is defined as:

  1. MUST have UNKNOWN layout.
  2. MUST NOT be RENDER_TARGET or DEPTH_STENCIL.
  3. The estimated size of the most-detailed mip level MUST be no larger than the larger alignment restriction. The runtime will use an architecture-independent mechanism of size estimation that mimics the way standard swizzle and D3D11 tiled resources are sized. However, the tile sizes will be of the smaller alignment restriction for such calculations. Additional data associated with resources, which is typically associated with compression, will not be added into this size. So for a normal texture, when this calculated size is <= 64 KiB, you can use the alignment of 4 KiB. For an MSAA texture, when this calculated size is <= 4 MiB, you can use the alignment of 64 KiB.
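The three conditions above can be sketched as a small decision function. This is only an illustration: `TextureInfo` and `PlacedTextureAlignment` are hypothetical names, not D3D12 types, and the size estimate is assumed to have already been produced by the runtime's architecture-independent mechanism.

```cpp
#include <cassert>
#include <cstdint>

// Alignment values taken from the table above.
constexpr std::uint64_t kSmallTextureAlignment   = 4ull * 1024;         // 4 KiB
constexpr std::uint64_t kDefaultTextureAlignment = 64ull * 1024;        // 64 KiB
constexpr std::uint64_t kSmallMsaaAlignment      = 64ull * 1024;        // 64 KiB
constexpr std::uint64_t kDefaultMsaaAlignment    = 4ull * 1024 * 1024;  // 4 MiB

// Hypothetical description of a texture (not a D3D12 struct).
struct TextureInfo {
    bool layoutUnknown;               // condition 1: UNKNOWN layout
    bool renderTargetOrDepthStencil;  // condition 2: must be false
    bool msaa;                        // selects the MSAA row of the table
    std::uint64_t estimatedTopMipBytes;  // condition 3: runtime's size estimate
};

// Returns the smaller alignment when all "Small resource" conditions hold,
// otherwise the larger one.
inline std::uint64_t PlacedTextureAlignment(const TextureInfo& t) {
    assert(t.estimatedTopMipBytes > 0);
    const std::uint64_t larger  = t.msaa ? kDefaultMsaaAlignment : kDefaultTextureAlignment;
    const std::uint64_t smaller = t.msaa ? kSmallMsaaAlignment : kSmallTextureAlignment;
    const bool isSmall = t.layoutUnknown &&
                         !t.renderTargetOrDepthStencil &&
                         t.estimatedTopMipBytes <= larger;
    return isSmall ? smaller : larger;
}
```

For example, a non-MSAA texture with an estimate of exactly 64 KiB qualifies for 4 KiB alignment, while one byte more falls back to 64 KiB.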

There were a number of reasons for these choices, such as:

There was an intention to migrate to tighter alignment across the ecosystem over time, but this hasn’t happened yet. Developers have noticed that it is actually pretty common to have numerous tiny resources (meaningfully smaller than the alignment requirements), and they must now make a tradeoff:

Even in the second case, developers are further limited by the fact that creation of SRVs can’t take an offset, so all elements within a placed resource need to have the same stride. This is particularly annoying for the “bag of bits” Buffer resources, and texture data will have a rough time attempting this approach anyway since all of the formats will also need to match the original parent resource allocation.

So simply adding an offset to SRV creation probably isn’t the solution that we need…

Proposed Solution

We already have the ID3D12Device::GetResourceAllocationInfo[1, 2, 3] APIs for developers to get allocation info based on resource desc(s). Under the hood these call the CheckResourceAllocationInfo DDI, which gets alignment, size, etc. from the driver. CheckResourceAllocationInfo contains a UINT32 field for AlignmentRestriction as well as a D3D12DDIARG_CREATERESOURCE_0088 struct parameter that has a bitfield of flags. IHVs agree that they can all align buffers at 256B or less; we just need to allow them to report this capability rather than forcing 64KB alignment.

We will update D3D12DDI_RESOURCE_FLAGS_0003 to include D3D12DDI_RESOURCE_FLAG_0111_USE_TIGHT_ALIGNMENT to indicate to drivers that they should handle allocations for this resource in tight alignment mode. The AlignmentRestriction parameter of the DDI will be set to the minimum acceptable alignment value based on the tables below during resource creation (varies based on placed vs committed resource requirements, which can’t be inferred in the CheckResourceAllocationInfo call the way it can in CreateHeapAndResource). Drivers must return an alignment between the AlignmentRestriction and the max alignment value in the tables below.

In the runtime, we will update D3D12_RESOURCE_FLAGS to include D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT. We will also add an API cap (D3D12_FEATURE_D3D12_TIGHT_ALIGNMENT) that developers can check to know whether a driver claims support for Tight Alignment. The flag is only valid when the driver reports support.

The expected tight alignment ranges are as follows and will be validated via HLK:

Placed Resources

Type | MIN | MAX
Buffers | 8B | 256B
Textures | 8B | 64KiB (4KiB when it meets the definition of a Small Resource)
MSAA | 8B | 4MiB (64KiB when it meets the definition of a Small Resource)

The minimum of 8B is to ensure safety of 64-bit atomics. Buffer max alignment has been reduced to 256B, as this seems to cover the worst-case alignment needs of in-market hardware.
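To put the placed-buffer numbers in perspective, the following sketch computes how much heap space a run of tiny buffers consumes at a given placement alignment. The helper names and the workload (1000 buffers of 100 bytes each) are illustrative assumptions, not part of the spec.

```cpp
#include <cassert>
#include <cstdint>

// Rounds value up to the next multiple of alignment (power of two only).
inline std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Heap bytes consumed when `count` buffers of `bufferBytes` each are placed
// one after another, each starting at a multiple of `alignment`.
inline std::uint64_t HeapBytesForBuffers(std::uint64_t count,
                                         std::uint64_t bufferBytes,
                                         std::uint64_t alignment) {
    std::uint64_t end = 0;
    for (std::uint64_t i = 0; i < count; ++i) {
        end = AlignUp(end, alignment) + bufferBytes;
    }
    return end;
}
```

With this hypothetical workload, `HeapBytesForBuffers(1000, 100, 64 * 1024)` comes to roughly 62 MiB, while `HeapBytesForBuffers(1000, 100, 256)` comes to roughly 250 KiB.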

Textures and multisample resources were deemed less likely to benefit from tighter alignment and aren’t the biggest pain points for ISVs at this time, so while drivers may opt to align them more tightly when possible, doing so is not currently a requirement.

Committed Resources

Committed resources can also benefit from having their minimum alignment reduced, specifically committed buffers. There is some nuance here though due to the fact that a heap is implicitly created for each committed resource, and VidMm’s minimum alignment and size granularity for managing memory is 4KB. Per the original User Mode Heaps spec that laid out the resource creation flow, the spirit of the DDI requires that each committed resource creation call result in 1 allocation. This means that we don’t want drivers to need to manage suballocations, which limits minimum alignment for committed resources to 4KB or larger.

Note: For drivers that report LargePageSupport for VidMM allocations, it is acceptable and optimal for allocations that are a multiple of the LargePage size to be aligned to the large page size instead of using the 4KiB alignment.
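A minimal sketch of the choice described in the note, assuming the driver knows its own large page size. `CommittedAllocationAlignment` and its parameters are hypothetical; a `largePageBytes` of 0 stands for a driver that does not report LargePageSupport.

```cpp
#include <cassert>
#include <cstdint>

// VidMm's minimum alignment and size granularity, per the text above.
constexpr std::uint64_t kVidMmGranularity = 4ull * 1024;  // 4 KiB

// Picks the large page size when the allocation is an exact multiple of it,
// and the 4 KiB VidMm granularity otherwise.
inline std::uint64_t CommittedAllocationAlignment(std::uint64_t allocationBytes,
                                                  std::uint64_t largePageBytes) {
    assert(allocationBytes > 0);
    if (largePageBytes != 0 && allocationBytes % largePageBytes == 0) {
        return largePageBytes;
    }
    return kVidMmGranularity;
}
```

The sketch only covers the lower bound for buffers; textures and MSAA resources may still need a larger alignment within the ranges listed for committed resources.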

Type | MIN | MAX
Buffers | 4KiB | 4KiB
Textures | 4KiB | 64KiB (4KiB when it meets the definition of a Small Resource)
MSAA | 4KiB | 4MiB (64KiB when it meets the definition of a Small Resource)
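The placed and committed ranges can be expressed as a lookup plus a range check, roughly the kind of check a conformance test might make on the alignment a driver returns. This is a sketch only; the enum and function names are hypothetical and do not describe the actual HLK implementation.

```cpp
#include <cassert>
#include <cstdint>

enum class ResourceKind { Buffer, Texture, MsaaTexture };
enum class CreationKind { Placed, Committed };

struct AlignmentRange {
    std::uint64_t minimum;
    std::uint64_t maximum;
};

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;

// Returns the MIN/MAX pair from the placed and committed resource tables.
inline AlignmentRange TightAlignmentRange(ResourceKind kind, CreationKind creation,
                                          bool isSmallResource) {
    const bool placed = (creation == CreationKind::Placed);
    const std::uint64_t minimum = placed ? 8 : 4 * kKiB;
    switch (kind) {
        case ResourceKind::Buffer:
            return {minimum, placed ? 256 : 4 * kKiB};
        case ResourceKind::Texture:
            return {minimum, isSmallResource ? 4 * kKiB : 64 * kKiB};
        case ResourceKind::MsaaTexture:
            return {minimum, isSmallResource ? 64 * kKiB : 4 * kMiB};
    }
    return {minimum, minimum};  // not reached for valid enum values
}

// True when the driver-reported alignment lies within the allowed range.
inline bool IsAcceptableAlignment(const AlignmentRange& range,
                                  std::uint64_t driverAlignment) {
    assert(range.minimum <= range.maximum);
    return driverAlignment >= range.minimum && driverAlignment <= range.maximum;
}
```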

Runtime changes

The d3d12.h header will be updated as shown below:

typedef enum D3D12_FEATURE
{
  ... // existing features
  D3D12_FEATURE_D3D12_TIGHT_ALIGNMENT = 54
} D3D12_FEATURE;

typedef enum D3D12_TIGHT_ALIGNMENT_TIER
{
  D3D12_TIGHT_ALIGNMENT_TIER_NOT_SUPPORTED,
  D3D12_TIGHT_ALIGNMENT_TIER_1  // Tight alignment of buffers supported 
} D3D12_TIGHT_ALIGNMENT_TIER;

typedef struct D3D12_FEATURE_DATA_TIGHT_ALIGNMENT
{
  D3D12_TIGHT_ALIGNMENT_TIER SupportTier;
} D3D12_FEATURE_DATA_TIGHT_ALIGNMENT;

typedef enum D3D12_RESOURCE_FLAGS
{
    D3D12_RESOURCE_FLAG_NONE	= 0,
    ... // existing flags
    D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT = 0x400,
    ... // masks
} 	D3D12_RESOURCE_FLAGS;
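An application is expected to query the cap before opting in. The sketch below shows that gating logic with stand-in copies of the declarations above so that it compiles on its own; a real application would include d3d12.h instead and fill the caps struct through ID3D12Device::CheckFeatureSupport. `WithTightAlignmentIfSupported` and `Example` are hypothetical helpers.

```cpp
#include <cassert>

// Stand-ins mirroring the header changes above (real code uses d3d12.h).
enum D3D12_TIGHT_ALIGNMENT_TIER {
    D3D12_TIGHT_ALIGNMENT_TIER_NOT_SUPPORTED,
    D3D12_TIGHT_ALIGNMENT_TIER_1
};
struct D3D12_FEATURE_DATA_TIGHT_ALIGNMENT {
    D3D12_TIGHT_ALIGNMENT_TIER SupportTier;
};
enum D3D12_RESOURCE_FLAGS : unsigned {
    D3D12_RESOURCE_FLAG_NONE = 0,
    D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT = 0x400
};

// Adds the tight alignment flag only when the driver reports support,
// since the flag is only valid in that case.
inline D3D12_RESOURCE_FLAGS WithTightAlignmentIfSupported(
    D3D12_RESOURCE_FLAGS flags, const D3D12_FEATURE_DATA_TIGHT_ALIGNMENT& caps) {
    if (caps.SupportTier >= D3D12_TIGHT_ALIGNMENT_TIER_1) {
        return static_cast<D3D12_RESOURCE_FLAGS>(
            flags | D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT);
    }
    return flags;
}

// Usage with a hand-filled caps struct standing in for a CheckFeatureSupport result.
inline void Example() {
    D3D12_FEATURE_DATA_TIGHT_ALIGNMENT caps = {D3D12_TIGHT_ALIGNMENT_TIER_1};
    assert(WithTightAlignmentIfSupported(D3D12_RESOURCE_FLAG_NONE, caps) ==
           D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT);
}
```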

Validation:

Calls to GetResourceAllocationInfo will continue to function as they do today, except that when D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT is set for an element, that element may be aligned as tightly as possible. That said, the C++ rules for calculating a structure’s size and alignment still apply when multiple descriptors are passed in: alignment is always based on the largest alignment required, and size depends on the order of the elements. For a contrived example, consider a three-element array with two tiny 256B-aligned resources and a tiny 2MiB-aligned resource. The API will report differing sizes based on the order of the array:

The Alignment returned would always be 2MiB, because it’s the largest of all alignments in the resource array. Note that in the real world you probably wouldn’t do this since there would be so much space wasted on padding. A more realistic scenario would be to have 8192 256B resources (or the equivalent total size) followed by the 2MiB resource. In this case, you are no longer wasting memory on padding and are benefitting from only making a single allocation.
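The order dependence can be reproduced with a struct-style layout pass. The sketch below is an approximation under stated assumptions: every resource is given an illustrative size of 256 bytes, the names are hypothetical, and any tail padding the runtime may add to the reported size is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ResourceRequirement {
    std::uint64_t sizeBytes;
    std::uint64_t alignment;  // power of two
};

struct ArrayLayout {
    std::vector<std::uint64_t> offsets;  // start offset of each element
    std::uint64_t endOffset;             // end of the last element
    std::uint64_t alignment;             // largest alignment in the array
};

inline std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Places each element at the next offset that satisfies its alignment,
// the way members of a C++ struct are laid out in declaration order.
inline ArrayLayout LayOutInOrder(const std::vector<ResourceRequirement>& resources) {
    ArrayLayout layout{{}, 0, 1};
    for (const ResourceRequirement& r : resources) {
        const std::uint64_t offset = AlignUp(layout.endOffset, r.alignment);
        layout.offsets.push_back(offset);
        layout.endOffset = offset + r.sizeBytes;
        if (r.alignment > layout.alignment) {
            layout.alignment = r.alignment;
        }
    }
    return layout;
}
```

Under these assumptions, placing the 2MiB-aligned resource first lets both 256B-aligned resources pack directly behind it, whereas placing it last pushes it out to the 2MiB boundary. In the 8192-resource scenario, the small resources exactly fill the first 2MiB, so the large resource lands at offset 2MiB with no padding.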

DDI

The new D3D12_RESOURCE_FLAG_USE_TIGHT_ALIGNMENT will be forwarded to the driver via the D3D12DDI_RESOURCE_FLAGS_0003 bitfield. To ensure there is no impact to existing drivers, the flag will not be forwarded to the driver until it reports support for a new DDI cap:

typedef enum D3D12DDI_RESOURCE_FLAGS_0003 
{
    D3D12DDI_RESOURCE_FLAG_0003_NONE 	= 0,
    ... // existing flags
    D3D12DDI_RESOURCE_FLAG_0111_USE_TIGHT_ALIGNMENT = 0x8000,
    ... // masks
} D3D12DDI_RESOURCE_FLAGS_0003;

typedef enum D3D12DDI_TIGHT_ALIGNMENT_TIER
{
    D3D12DDI_TIGHT_ALIGNMENT_TIER_NOT_SUPPORTED = 0,
    D3D12DDI_TIGHT_ALIGNMENT_TIER_1 = 1,
} D3D12DDI_TIGHT_ALIGNMENT_TIER;

// A new caps type is defined
typedef enum D3D12DDICAPS_TYPE
{
    //existing caps types
    D3D12DDICAPS_TYPE_TIGHT_ALIGNMENT_TIER_0111 = 1089,
} D3D12DDICAPS_TYPE;

// The corresponding data struct for the cap
typedef struct D3D12DDI_TIGHT_ALIGNMENT_TIER_DATA_0111
{
    D3D12DDI_TIGHT_ALIGNMENT_TIER SupportTier;
} D3D12DDI_TIGHT_ALIGNMENT_TIER_DATA_0111;

As a reminder, this bitfield is passed to the CheckResourceAllocationInfo DDI as a member of the D3D12DDIARG_CREATERESOURCE_0088 parameter.

IMPORTANT: Drivers should size buffers appropriately (desc.width) rather than assume 64KB size when D3D12DDI_RESOURCE_FLAG_0111_USE_TIGHT_ALIGNMENT is set. CheckExistingResourceAllocationInfo_* should also return appropriate values for resources that were created with tight alignment.
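As a rough illustration of the sizing note, the sketch below contrasts the legacy 64KB behavior with a width-based size when the tight alignment flag is set. It is not driver code: `PlacedBufferAllocationInfo` and `hwBufferAlignment` are hypothetical, and rounding the size up to the chosen alignment is an assumption rather than something this spec mandates.

```cpp
#include <cassert>
#include <cstdint>

// Flag value taken from the D3D12DDI_RESOURCE_FLAGS_0003 listing above.
constexpr unsigned kUseTightAlignmentFlag = 0x8000;
constexpr std::uint64_t kLegacyBufferAlignment = 64ull * 1024;  // 64 KiB

struct BufferAllocationInfo {
    std::uint64_t sizeBytes;
    std::uint64_t alignment;
};

inline std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// hwBufferAlignment is the hardware's own placed-buffer requirement, which
// must fall inside the 8B..256B range allowed for placed buffers.
inline BufferAllocationInfo PlacedBufferAllocationInfo(std::uint64_t width,
                                                       unsigned ddiResourceFlags,
                                                       std::uint64_t hwBufferAlignment) {
    assert(hwBufferAlignment >= 8 && hwBufferAlignment <= 256);
    const bool tight = (ddiResourceFlags & kUseTightAlignmentFlag) != 0;
    const std::uint64_t alignment = tight ? hwBufferAlignment : kLegacyBufferAlignment;
    // Size follows the requested width instead of a fixed 64 KiB multiple.
    return {AlignUp(width, alignment), alignment};
}
```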

HLK tests

These tests will not be part of the Germanium HLK playlist, but will be included in future playlists.

Addendum - New Heap flag to indicate that a heap is implicitly created for committed resources

A long-standing pain point for drivers has been that committed buffers and heaps for placed resources look the same when created. Today IHVs use fragile heuristics to determine whether a CreateHeapAndResource call is creating a heap or a committed buffer, which isn’t ideal. This is a relatively small change to make and is in the spirit of the tight alignment feature, so it will be bundled in. The DDI version will be 0115, though, since it was added later in the timeline.

DDI

Starting in DDI 0115, the runtime will set the new D3D12DDI_HEAP_FLAG_0115_IMPLICIT bit in the D3D12DDIARG_CREATEHEAP_0001::Flags parameter of CreateHeapAndResource calls for committed resources.

typedef enum D3D12DDI_HEAP_FLAGS
{
    // existing heap flags
    D3D12DDI_HEAP_FLAG_0115_IMPLICIT = 0x80,
} D3D12DDI_HEAP_FLAGS;
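With the flag available, a driver could replace its heuristics with a single bit test. The sketch below uses a stand-in copy of the enum so it compiles on its own; `IsImplicitHeap` and `Example` are hypothetical helpers, and the other flag value used in the example is arbitrary.

```cpp
#include <cassert>

// Stand-in for the DDI enum above (real drivers use the DDI headers).
enum D3D12DDI_HEAP_FLAGS : unsigned {
    D3D12DDI_HEAP_FLAG_0115_IMPLICIT = 0x80
};

// True when the heap is the implicit heap of a committed resource.
inline bool IsImplicitHeap(unsigned heapFlags) {
    return (heapFlags & D3D12DDI_HEAP_FLAG_0115_IMPLICIT) != 0;
}

inline void Example() {
    assert(IsImplicitHeap(D3D12DDI_HEAP_FLAG_0115_IMPLICIT | 0x1u));  // committed resource
    assert(!IsImplicitHeap(0x1u));  // explicit heap for placed resources
}
```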

FAQ

Why a Flag instead of a new API?

How does this affect Heap alignment and offsets?

Can this be the default behavior? That is, don’t require applications to opt in for each resource?


Open Questions

D3D Team

WDDM / VidMM team

All IHVs