DirectX-Specs

D3D12 CPU Timeline Query Resolution

This document proposes a new D3D12 feature that allows developers to resolve query data on the CPU timeline rather than requiring GPU-based resolution, improving convenience and performance for common query scenarios.


Introduction

D3D12’s current query system requires developers to explicitly resolve query data on the GPU timeline using ResolveQueryData() Command List operations. This approach, while providing fine-grained control, introduces several inconveniences:

This proposal introduces CPU-based query resolution APIs that allow the runtime/driver to resolve query data using the CPU after GPU execution completes, providing a more convenient and often more efficient alternative for common query scenarios.


Problem Statement

The current D3D12 query workflow requires several steps:

  1. Begin query operation
  2. Perform GPU work
  3. End query operation
  4. Submit command list
  5. Issue ResolveQueryData operation in the same or subsequent command list
  6. Wait for GPU completion
  7. Read resolved data from destination buffer

This workflow is cumbersome for scenarios where developers simply want to read query results on the CPU after GPU work completes. Common use cases that would benefit from CPU resolution include:


Goals


Non-Goals


Overall Design

The CPU Timeline Query Resolution feature introduces one new API:

The design allows developers to resolve query data on the CPU timeline after GPU execution completes. Any post-processing of the query data is abstracted by the runtime and driver.

For drivers or hardware that do not support native CPU resolution, the runtime provides transparent fallback by automatically generating the necessary GPU-based resolution operations.


API

D3D12_QUERY_HEAP_FLAGS

typedef enum D3D12_QUERY_HEAP_FLAGS
{
    D3D12_QUERY_HEAP_FLAG_NONE = 0,
    D3D12_QUERY_HEAP_FLAG_CPU_RESOLVE = 1,
};

Members:

Remarks:

Query heaps created without the D3D12_QUERY_HEAP_FLAG_CPU_RESOLVE flag cannot be used with the ID3D12Device::ResolveQueryData() method; the method returns E_INVALIDARG if this is attempted. Conversely, heaps created with the CPU resolve flag cannot be used with the Command List resolve API.
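The pairing rule above can be sketched as a small validation routine. This is an illustrative model only, not the runtime's implementation: `Result`, `ResolvePath`, and `ValidateResolvePath` are hypothetical names, and the local flag constants merely mirror the enum proposed above so the sketch builds without d3d12.h. The document does not say how a Command List resolve on a CPU-resolve heap is reported, so that case is modeled as a generic rejection.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins mirroring the proposed D3D12_QUERY_HEAP_FLAGS values.
enum QueryHeapFlags : uint32_t {
    QUERY_HEAP_FLAG_NONE        = 0,
    QUERY_HEAP_FLAG_CPU_RESOLVE = 1,
};

// Hypothetical outcome type. InvalidArg models the documented E_INVALIDARG;
// Rejected models a failure whose exact reporting this document does not specify.
enum class Result { Ok, InvalidArg, Rejected };

// Which resolve API the application is trying to use (hypothetical).
enum class ResolvePath { Device, CommandList };

// Returns Ok only when the heap's creation flags match the resolve path.
inline Result ValidateResolvePath(uint32_t heapFlags, ResolvePath path) {
    const bool cpuResolve = (heapFlags & QUERY_HEAP_FLAG_CPU_RESOLVE) != 0;
    if (path == ResolvePath::Device) {
        // ID3D12Device::ResolveQueryData requires the CPU resolve flag.
        return cpuResolve ? Result::Ok : Result::InvalidArg;
    }
    // The Command List resolve API requires a heap without the flag.
    return cpuResolve ? Result::Rejected : Result::Ok;
}
```

Under this model a heap is bound to exactly one resolve path at creation time, so an application that needs both paths would create two heaps.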

ID3D12Device::CreateQueryHeap1

Creates a query heap with the option for CPU-based resolves.

HRESULT CreateQueryHeap1( 
  [in] const D3D12_QUERY_HEAP_DESC *pDesc,
  [in] D3D12_QUERY_HEAP_FLAGS Flags,
  [in] REFIID riid,
  [inout] void **ppvHeap);

Parameters:

Return Value:

Remarks:

This method extends the original CreateQueryHeap() method by adding support for query heap creation flags. Query heaps created with this method are functionally identical to those created with CreateQueryHeap() when using D3D12_QUERY_HEAP_FLAG_NONE.

When the D3D12_QUERY_HEAP_FLAG_CPU_RESOLVE flag is specified, the runtime and driver may optimize the query heap allocation for CPU access. This can enable more efficient CPU-based query resolution but may have different memory characteristics compared to standard query heaps.

ID3D12Device::ResolveQueryData

CPU-based query resolution method on a Device.

HRESULT ResolveQueryData(
    [in]  ID3D12QueryHeap *pQueryHeap,
    [in]  D3D12_QUERY_TYPE Type,
    [in]  UINT StartIndex,
    [in]  UINT NumQueries,
    [inout] void* pResolvedQueryData);

Parameters:

Return Value:

Remarks:

This method resolves query data on the CPU timeline after the command list that initiated the queries has completed execution on the GPU. Any required post-processing of the raw query data generated on the GPU timeline is performed on the CPU immediately by the runtime and/or driver.

The output buffer must be sized correctly for the query type and number of queries to resolve.

The query heap must have been created with the D3D12_QUERY_HEAP_FLAG_CPU_RESOLVE flag for this method to succeed.
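As a rough illustration of the sizing requirement, the helper below computes the output buffer size for a few query types. It assumes the CPU path writes the same result layouts as the existing Command List resolve (one 64-bit value for occlusion, binary occlusion, and timestamp queries; eleven 64-bit counters for pipeline statistics), which this document does not state explicitly. `QueryType`, `PipelineStatistics`, `ResolvedQuerySize`, and `MakeResolveBuffer` are local stand-ins so the sketch builds without d3d12.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for a subset of D3D12_QUERY_TYPE.
enum class QueryType { Occlusion, BinaryOcclusion, Timestamp, PipelineStatistics };

// Mirrors the eleven UINT64 counters of D3D12_QUERY_DATA_PIPELINE_STATISTICS.
struct PipelineStatistics {
    uint64_t IAVertices, IAPrimitives, VSInvocations, GSInvocations, GSPrimitives,
             CInvocations, CPrimitives, PSInvocations, HSInvocations, DSInvocations,
             CSInvocations;
};

// Bytes written per resolved query, under the layout assumption stated above.
constexpr std::size_t ResolvedQuerySize(QueryType type) {
    return type == QueryType::PipelineStatistics ? sizeof(PipelineStatistics)
                                                 : sizeof(uint64_t);
}

// Allocates a 64-bit-aligned buffer large enough for numQueries results.
inline std::vector<uint64_t> MakeResolveBuffer(QueryType type, uint32_t numQueries) {
    const std::size_t bytes = ResolvedQuerySize(type) * numQueries;
    return std::vector<uint64_t>(bytes / sizeof(uint64_t));
}
```

An application would pass `buffer.data()` as pResolvedQueryData with the same NumQueries it used to size the buffer.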


DDI

D3D12DDI_QUERY_HEAP_FLAGS

typedef enum D3D12DDI_QUERY_HEAP_FLAGS
{
    D3D12DDI_QUERY_HEAP_FLAG_NONE = 0,
    D3D12DDI_QUERY_HEAP_FLAG_CPU_RESOLVE = 1,
};

PFND3D12DDI_CREATE_QUERY_HEAP_0119

DDI for CreateQueryHeap1.

typedef HRESULT (APIENTRY* PFND3D12DDI_CREATE_QUERY_HEAP_0119)(
  D3D12DDI_HDEVICE hDevice,
  _In_ CONST D3D12DDIARG_CREATE_QUERY_HEAP_0001* pCreate,
  D3D12DDI_QUERY_HEAP_FLAGS Flags,
  D3D12DDI_HQUERYHEAP hQueryHeap
);

PFND3D12DDI_RESOLVE_QUERY_DATA

DDI for CPU-based query resolution.

typedef HRESULT (APIENTRY* PFND3D12DDI_RESOLVE_QUERY_DATA)(
    D3D12DDI_HDEVICE hDevice,
    D3D12DDI_HQUERYHEAP hQueryHeap,
    D3D12_QUERY_TYPE Type,
    UINT StartIndex,
    UINT NumQueries,
    void* pResolvedQueryData
);

Runtime Fallback

For drivers or hardware that do not support native CPU-based query resolution, the D3D12 runtime provides transparent fallback behavior to ensure the feature works universally across all D3D12-capable hardware.

Fallback Mechanism

When a query heap is created with the D3D12_QUERY_HEAP_FLAG_CPU_RESOLVE flag on hardware or drivers that lack native support:

  1. Automatic GPU Resolution Injection: The runtime automatically injects GPU-based ResolveQueryData operations into command lists at Close() time for any query heaps that have been used and marked for CPU resolution.

  2. Internal Buffer Management: The runtime creates and manages internal GPU-visible destination buffers to store the resolved query results. These buffers are sized appropriately for the query types and counts used.

  3. Transparent Operation: Applications using ID3D12Device::ResolveQueryData() are unaware of the fallback; the method behaves identically whether it uses native support or runtime emulation.
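The injection step of the fallback can be pictured with a toy model. This is a simplified sketch under stated assumptions, not the runtime's actual design: `MockQueryHeap` and `MockCommandList` are hypothetical types, recorded operations are represented as strings, and the internal destination buffers are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a query heap; only the creation flag is modeled.
struct MockQueryHeap {
    bool cpuResolve;  // created with D3D12_QUERY_HEAP_FLAG_CPU_RESOLVE
};

// Hypothetical command list that models resolve injection at Close() time.
class MockCommandList {
public:
    explicit MockCommandList(bool driverHasNativeCpuResolve)
        : native_(driverHasNativeCpuResolve) {}

    // Records the query and, when emulating, remembers the heap for later.
    void EndQuery(const MockQueryHeap* heap, uint32_t index) {
        ops_.push_back("EndQuery[" + std::to_string(index) + "]");
        if (heap->cpuResolve && !native_) pending_.insert(heap);
    }

    // Appends one GPU-based resolve per CPU-resolve heap used in this list.
    void Close() {
        for (std::size_t i = 0; i < pending_.size(); ++i)
            ops_.push_back("ResolveQueryData(injected)");
        pending_.clear();
    }

    const std::vector<std::string>& Ops() const { return ops_; }

private:
    bool native_;
    std::set<const MockQueryHeap*> pending_;  // heaps awaiting an injected resolve
    std::vector<std::string> ops_;            // recorded operations, in order
};
```

With a driver that lacks native support, two EndQuery calls on the same heap followed by Close() yield three recorded operations, the last being the injected resolve. With native support, nothing is injected.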

Test Plan

Conformance Tests

Functional Tests


Change Log

Version  Date            Description
1.0      August 2025     Initial version
1.1      September 2025  Remove TOP/BOP Timestamps from proposal