The High Level Shader Language (HLSL) is the GPU programming language provided in conjunction with the DirectX runtime. Over many years its use has expanded to cover every major rendering API across all major development platforms. Despite its popularity and long history, HLSL has never had a formal language specification. This document seeks to change that.
HLSL draws heavy inspiration originally from ISO C and later from ISO C++, with additions specific to graphics and parallel computation programming. The language is also influenced to a lesser degree by other popular graphics and parallel programming languages.
HLSL has two reference implementations which this specification draws heavily from. The original reference implementation, FXC, has been in use since DirectX 9. The more recent reference implementation, the DirectX Shader Compiler (DXC), has been the primary shader compiler since DirectX 12.
In writing this specification, preference is given to the language behavior of DXC over the behavior of FXC, although that can vary by context.
In very rare instances this specification will be aspirational, and may diverge from the behavior of both reference implementations. This will only be done where there is an intent to alter implementation behavior in the future. Since this document and the implementations are living sources, one or the other may be ahead in different regards at any point in time.
This document specifies the requirements for implementations of HLSL. The HLSL specification is based on and highly influenced by the specifications for the C Programming Language (C) and the C++ Programming Language (C++).
This document covers both the language grammar and semantics for HLSL and (in later sections) the standard library of data types used in shader programming.
The following referenced documents provide significant influence on this document and should be used in conjunction with interpreting this standard.
ISO C, Programming languages - C
ISO C++, Programming languages - C++
DirectX Specifications, https://microsoft.github.io/DirectX-Specs/
This document aims to use terms consistent with their definitions in ISO C and ISO C++. In cases where the definitions are unclear, or where this document diverges from ISO C and ISO C++, the definitions in this section, the remaining sections in this chapter, and the attached glossary ([main]) supersede other sources.
The following definitions are consistent between HLSL and the ISO C and ISO C++ specifications; however, they are included here for reader convenience.
Data is correct if it represents values that have specified or unspecified but not undefined behavior for all the operations in which it is used. Data that is the result of undefined behavior is not correct, and may be treated as undefined.
A diagnostic message is an implementation-defined message, belonging to a subset of the implementation’s output messages, which communicates diagnostic information to the user.
An ill-formed program is a program that is not well-formed, for which the implementation is expected to return unsuccessfully and produce one or more diagnostic messages.
Implementation-defined behavior is behavior of a well-formed program and correct data which may vary by the implementation, and which the implementation is expected to document.
Implementation limits are restrictions imposed upon programs by the implementation of either the compiler or runtime environment. The compiler may seek to surface runtime-imposed limits to the user for improved user experience.
Undefined behavior is behavior of invalid program constructs or incorrect data for which this standard imposes no requirements, or does not sufficiently detail.
Unspecified behavior is behavior of a well-formed program and correct data which may vary by the implementation, and which the implementation is not expected to document.
A well-formed program is an HLSL program constructed according to the syntax rules, diagnosable semantic rules, and the One Definition Rule.
A runtime implementation refers to a full-stack implementation of a software runtime that can facilitate the execution of HLSL programs. This broad definition includes libraries and device driver implementations. The HLSL specification does not distinguish between the user-facing programming interfaces and the vendor-specific backing implementation.
HLSL emerged from the evolution of DirectX to grant greater control over GPU geometry and color processing. It gained popularity because it targeted a common hardware description which all conforming drivers were required to support. This common hardware description, called a Shader Model, is an integral part of the description for HLSL. Some HLSL features require specific Shader Model features, and are only supported by compilers when targeting those Shader Model versions or later.
HLSL uses a Single Program Multiple Data (SPMD) programming model where a program describes operations on a single element of data, but when the program executes it executes across more than one element at a time. This programming model is useful because GPUs are largely Single Instruction Multiple Data (SIMD) hardware architectures, where each instruction natively executes across multiple data elements at the same time.
There are many different terms of art for describing the elements of a GPU architecture and the way they relate to the SPMD program model. In this document we will use the terms as defined in the following subsections.
HLSL is a data-parallel programming language designed for programming auxiliary processors in a larger system. In this context the host refers to the primary processing unit that runs the application which in turn uses a runtime to execute HLSL programs on a supported device. There is no strict requirement that the host and device be different physical hardware, although they commonly are. The separation of host and device in this specification is useful for defining the execution and memory model as well as specific semantics of language constructs.
A lane represents a single computed element in an SPMD program. In a traditional programming model it would be analogous to a thread of execution, however it differs in one key way. In multi-threaded programming threads advance independently of each other. In SPMD programs, a group of lanes may execute instructions in lockstep because each instruction may be a SIMD instruction computing the results for multiple lanes simultaneously, or synchronizing execution across multiple lanes or waves. A lane has an associated lane state which denotes the execution status of the lane (1.6.1.7).
A grouping of lanes for execution is called a wave. The size of a wave is defined as the maximum number of active lanes the wave supports. Wave sizes vary by hardware architecture, and are required to be powers of two. The number of active lanes in a wave can be any value between one and the wave size.
Some hardware implementations support multiple wave sizes. There is no overall minimum wave size requirement, although some language features do have minimum lane size requirements.
HLSL is explicitly designed to run on hardware with arbitrary wave sizes. Hardware architectures may implement waves as Single Instruction Multiple Thread (SIMT), where each thread executes instructions in lockstep. This is not a requirement of the model. Some constructs in HLSL require synchronized execution. Such constructs will explicitly specify that requirement.
A quad is a subdivision of four lanes in a wave which are computing adjacent values. In pixel shaders a quad may represent four adjacent pixels and quad operations allow passing data between adjacent lanes. In compute shaders quads may be one or two dimensional depending on the workload dimensionality. Quad operations require four active lanes.
A grouping of lanes executing the same shader to produce a combined result is called a threadgroup. Threadgroups are independent of SIMD hardware specifications. The size of a threadgroup is defined in three dimensions. The maximum extent along each dimension of a threadgroup, and the total size of a threadgroup, are implementation limits defined by the runtime and enforced by the compiler. If a threadgroup’s size is not a whole multiple of the hardware wave size, the unused hardware lanes are implicitly inactive.
If a threadgroup size is smaller than the wave size, or if the threadgroup size is not a whole multiple of the wave size, the remaining lanes are inactive lanes.
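The relationship between threadgroup size, wave size, and inactive lanes can be sketched on the host side. The C++ below is only an illustration and is not part of any HLSL implementation: the names `WaveLayout` and `layout_threadgroup` are hypothetical, and the sketch assumes a threadgroup is packed densely into as few waves as possible, which this specification does not require.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper type: how one threadgroup maps onto hardware waves.
struct WaveLayout {
    std::uint32_t waves;          // number of waves needed
    std::uint32_t inactive_lanes; // hardware lanes left inactive
};

// Wave sizes are required to be powers of two.
constexpr bool is_power_of_two(std::uint32_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Assumption: dense packing, so every wave except possibly the last is full.
inline WaveLayout layout_threadgroup(std::uint32_t threadgroup_size,
                                     std::uint32_t wave_size) {
    if (threadgroup_size == 0 || !is_power_of_two(wave_size))
        throw std::invalid_argument("invalid threadgroup or wave size");
    // Round up: a partially filled wave still occupies a whole wave.
    const std::uint32_t waves = (threadgroup_size + wave_size - 1) / wave_size;
    return {waves, waves * wave_size - threadgroup_size};
}
```

Under these assumptions a threadgroup of 48 lanes on hardware with a wave size of 32 occupies two waves and leaves 16 lanes inactive, while a threadgroup of 8 lanes occupies one wave and leaves 24 lanes inactive.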
A dispatch is a grouping of threadgroups which represents the full execution of an HLSL program and produces a completed result for all input data elements.
Lanes may be in four primary states: active, helper, inactive, and predicated off.
An active lane is enabled to perform computations and produce output results based on the initial launch conditions and program control flow.
A helper lane is a lane which would not be executed by the initial launch conditions except that its computations are required for adjacent pixel operations in pixel fragment shaders. A helper lane will execute all computations but will not perform writes to buffers, and any outputs it produces are discarded. Helper lanes may be required for lane-cooperative operations to execute correctly.
An inactive lane is a lane that is not executed by the initial launch conditions. This can occur if there are insufficient inputs to fill all lanes in the wave, or to reduce per-thread memory requirements or register pressure.
A predicated off lane is a lane that is not being executed due to program control flow. A lane may be predicated off when control flow for the lanes in a wave diverge and one or more lanes are temporarily not executing.
The diagram below illustrates the state transitions between lane states:
A runtime implementation shall provide an implementation-defined mechanism for defining a dispatch. A runtime shall manage hardware resources and schedule execution to conform to the behaviors defined in this specification in an implementation-defined way. A runtime implementation may sort the threadgroups of a dispatch into waves in an implementation-defined way. During execution no guarantees are made that all lanes in a wave are actively executing.
Wave, quad, and threadgroup operations require execution synchronization of applicable active and helper lanes as defined by the individual operation.
An optimizing compiler may not optimize code generation such that it changes the behavior of a well-formed program except in the presence of implementation-defined or unspecified behavior.
The presence of wave, quad, or threadgroup operations may further limit the valid transformations of a program. Specifically, control flow operations which result in changing which lanes, quads, or waves are actively executing are illegal in the presence of cooperative operations if the optimization alters the behavior of the program.
Memory accesses for Shader Model 5.0 and earlier operate on 128-bit slots aligned on 128-bit boundaries. This was optimized for the common case in early shaders, where data being processed on the GPU was usually 4-element vectors of 32-bit data types.
On modern hardware memory access restrictions are loosened: reads of 32-bit multiples are supported starting with Shader Model 5.1, and reads of 16-bit multiples are supported with Shader Model 6.0. Shader Model features are fully documented in the DirectX Specifications, and this document will not attempt to elaborate further.
HLSL programs manipulate data stored in four distinct memory spaces: thread, threadgroup, device and constant.
Thread memory is local to the lane. It is the default memory space used to store local variables. Thread memory cannot be directly read from other threads without the use of intrinsics to synchronize execution and memory.
Threadgroup memory is denoted in HLSL with the groupshared keyword. The underlying memory for any declaration annotated with groupshared is shared across an entire threadgroup. Reads and writes to threadgroup memory may occur in any order except as restricted by synchronization intrinsics or other memory annotations.
Device memory is memory available to all lanes executing on the device. This memory may be read or written to by multiple threadgroups that are executing concurrently. Reads and writes to device memory may occur in any order except as restricted by synchronization intrinsics or other memory annotations. Some device memory may be visible to the host. Device memory that is visible to the host may have additional synchronization concerns for host visibility.
Constant memory is similar to device memory in that it is available to all lanes executing on the device. Constant memory is read-only, and an implementation can assume that constant memory is immutable and cannot change during execution.
The text of HLSL programs is collected in source and header files. The distinction between source and header files is social and not technical. An implementation will construct a translation unit from a single source file and any included source or header files referenced via the #include preprocessing directive conforming to the ISO C preprocessor specification.
An implementation may implicitly include additional sources as required to expose the HLSL library functionality as defined in (12).
HLSL inherits the phases of translation from ISO C++, with minor alterations, specifically the removal of support for trigraph and digraph sequences. Below is a description of the phases.
1. Source file characters are mapped to the basic source character set in an implementation-defined manner.
2. Any sequence of backslash (\) immediately followed by a new line is deleted, resulting in splicing lines together.
3. Tokenization occurs and comments are isolated. If a source file ends in a partial comment or preprocessor token the program is ill-formed and a diagnostic shall be issued. Each comment block shall be treated as a single white-space character.
4. Preprocessing directives are executed, macros are expanded, pragma and other unary operator expressions are executed. Processing of #include directives results in all preceding steps being executed on the resolved file, and can continue recursively. Finally all preprocessing directives are removed from the source.
5. Character and string literal specifiers are converted into the appropriate character set for the execution environment.
6. Adjacent string literal tokens are concatenated.
7. White-space is no longer significant. Syntactic and semantic analysis occurs translating the whole translation unit into an implementation-defined representation.
8. The translation unit is processed to determine required instantiations, the definitions of the required instantiations are located, and the translation and instantiation units are merged. The program is ill-formed if any required instantiation cannot be located or fails during instantiation.
9. External references are resolved, library references linked, and all translation output is collected into a single output.
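The line-splicing phase described above (a backslash immediately followed by a new line is deleted) is simple enough to sketch directly. The C++ function below is a minimal illustration, not the implementation used by any HLSL compiler; the name `splice_lines` is hypothetical, and the sketch assumes line endings have already been normalized to a single `\n` character.

```cpp
#include <cstddef>
#include <string>

// Deletes every backslash that is immediately followed by a new line,
// together with that new line, joining the two physical lines.
// Assumption: new lines are represented as a single '\n' character.
inline std::string splice_lines(const std::string& src) {
    std::string out;
    out.reserve(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i; // skip the new line as well as the backslash
            continue;
        }
        out.push_back(src[i]);
    }
    return out;
}
```

A backslash that is not directly followed by a new line is left untouched, so only genuine line continuations are removed.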
The basic source character set is a subset of the ASCII character set. The table below lists the valid characters and their ASCII values:
Hex ASCII Value | Character Name | Glyph or C Escape Sequence |
---|---|---|
0x09 | Horizontal Tab | \t |
0x0A | Line Feed | \n |
0x0D | Carriage Return | \r |
0x20 | Space | |
0x21 | Exclamation Mark | ! |
0x22 | Quotation Mark | " |
0x23 | Number Sign | # |
0x25 | Percent Sign | % |
0x26 | Ampersand | & |
0x27 | Apostrophe | ' |
0x28 | Left Parenthesis | ( |
0x29 | Right Parenthesis | ) |
0x2A | Asterisk | * |
0x2B | Plus Sign | + |
0x2C | Comma | , |
0x2D | Hyphen-Minus | - |
0x2E | Full Stop | . |
0x2F | Solidus | / |
0x30 .. 0x39 | Digit Zero .. Nine | 0 1 2 3 4 5 6 7 8 9 |
0x3A | Colon | : |
0x3B | Semicolon | ; |
0x3C | Less-than Sign | < |
0x3D | Equals Sign | = |
0x3E | Greater-than Sign | > |
0x3F | Question Mark | ? |
0x41 .. 0x5A | Latin Capital Letter A .. Z | A B C D E F G H I J K L M N O P Q R S T U V W X Y Z |
0x5B | Left Square Bracket | [ |
0x5C | Reverse Solidus | \ |
0x5D | Right Square Bracket | ] |
0x5E | Circumflex Accent | ^ |
0x5F | Underscore | _ |
0x61 .. 0x7A | Latin Small Letter a .. z | a b c d e f g h i j k l m n o p q r s t u v w x y z |
0x7B | Left Curly Bracket | { |
0x7C | Vertical Line | \| |
0x7D | Right Curly Bracket | } |
An implementation may allow source files to be written in alternate extended character sets as long as that set is a superset of the basic character set. The translation character set is an extended character set or the basic character set as chosen by the implementation.
preprocessing-token:
* header-name
* identifier
* pp-number
* character-literal
* string-literal
* preprocessing-op-or-punc
* each non-whitespace character from the translation character set that cannot be one of the above
Each preprocessing token that is converted to a token shall have the lexical form of a keyword, an identifier, a constant, a string literal or an operator or punctuator.
Preprocessing tokens are the minimal lexical elements of the language during translation phases 3 through 6 (2.2). Preprocessing tokens can be separated by whitespace in the form of comments, white space characters, or both. White space may appear within a preprocessing token only as part of a header name or between the quotation characters in a character constant or string literal.
Header name preprocessing tokens are only recognized within #include preprocessing directives, __has_include expressions, and implementation-defined locations within #pragma directives. In those contexts, a sequence of characters that could be either a header name or a string literal is recognized as a header name.
token:
* identifier
* keyword
* literal
* operator-or-punctuator
There are four kinds of tokens: identifiers, keywords, literals, and operators or punctuators. All whitespace characters and comments are ignored except as they separate tokens.
The characters /* start a comment which terminates with the characters */. The characters // start a comment which terminates at the next new line.
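The two comment forms, together with the rule that each comment is treated as a single white-space character, can be sketched as a small pass over the source text. This is a simplified C++ illustration under stated assumptions: the name `replace_comments` is hypothetical, and the sketch deliberately ignores string and character literals, which a real tokenizer has to recognize so that comment markers inside them are not misread.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Replaces each /* ... */ and // ... comment with one space character.
// Assumption: the input contains no string or character literals.
inline std::string replace_comments(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src.compare(i, 2, "/*") == 0) {
            const std::size_t end = src.find("*/", i + 2);
            if (end == std::string::npos) // partial comment: ill-formed
                throw std::runtime_error("unterminated comment");
            out.push_back(' ');
            i = end + 2;
        } else if (src.compare(i, 2, "//") == 0) {
            const std::size_t end = src.find('\n', i + 2);
            out.push_back(' ');
            // The terminating new line is not part of the comment.
            i = (end == std::string::npos) ? src.size() : end;
        } else {
            out.push_back(src[i++]);
        }
    }
    return out;
}
```

Throwing an exception stands in for the diagnostic that an implementation would issue for a source file ending in a partial comment.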
header-name:
* < h-char-sequence >
* " q-char-sequence "
h-char-sequence:
* h-char
* h-char-sequence h-char
h-char:
* any character in the translation character set except newline or >
q-char-sequence:
* q-char
* q-char-sequence q-char
q-char:
* any character in the translation character set except newline or "
Character sequences in header names are mapped to header files or external source file names in an implementation-defined way.
pp-number:
* digit
* . digit
* pp-number ' digit
* pp-number ' non-digit
* pp-number e sign
* pp-number E sign
* pp-number p sign
* pp-number P sign
* pp-number .
Preprocessing numbers begin with a digit or period (.), and may be followed by valid identifier characters and floating point literal suffixes (e+, e-, E+, E-, p+, p-, P+, and P-). Preprocessing number tokens lexically include all integer-literal and floating-literal tokens.
Preprocessing numbers do not have types or values. Types and values are assigned to integer-literal, floating-literal, and vector-literal tokens on successful conversion from preprocessing numbers.
A preprocessing number cannot end in a period (.) if the immediate next token is a scalar-element-sequence (2.9.4). In this situation the pp-number token is truncated to end before the period.
literal:
* integer-literal
* character-literal
* floating-literal
* string-literal
* boolean-literal
* vector-literal
integer-literal:
* decimal-literal integer-suffix_opt
* octal-literal integer-suffix_opt
* hexadecimal-literal integer-suffix_opt
decimal-literal:
* nonzero-digit
* decimal-literal digit
octal-literal:
* 0
* octal-literal octal-digit
hexadecimal-literal:
* 0x hexadecimal-digit
* 0X hexadecimal-digit
* hexadecimal-literal hexadecimal-digit
nonzero-digit: one of
* 1 2 3 4 5 6 7 8 9
octal-digit: one of
* 0 1 2 3 4 5 6 7
hexadecimal-digit: one of
* 0 1 2 3 4 5 6 7 8 9
* a b c d e f
* A B C D E F
integer-suffix:
* unsigned-suffix long-suffix_opt
* long-suffix unsigned-suffix_opt
unsigned-suffix: one of
* u U
long-suffix: one of
* l L
An integer literal is an optional base prefix, a sequence of digits in the appropriate base, and an optional type suffix. An integer literal shall not contain a period or exponent specifier.
The type of an integer literal is the first of the corresponding list in the table below in which its value can be represented.
Suffix | Decimal constant | Octal or hexadecimal constant |
---|---|---|
none | int32_t, int64_t | int32_t, uint32_t, int64_t, uint64_t |
u or U | uint32_t, uint64_t | uint32_t, uint64_t |
l or L | int64_t | int64_t, uint64_t |
Both u or U and l or L | uint64_t | uint64_t |
If the specified value of an integer literal cannot be represented by any type in the corresponding list, the integer literal has no type and the program is ill-formed.
An implementation may support the integer suffixes ll and ull as equivalent to l and ul respectively.
floating-literal:
* fractional-constant exponent-part_opt floating-suffix_opt
* digit-sequence exponent-part floating-suffix_opt
fractional-constant:
* digit-sequence_opt . digit-sequence
* digit-sequence .
exponent-part:
* e sign_opt digit-sequence
* E sign_opt digit-sequence
sign: one of
* + -
digit-sequence:
* digit
* digit-sequence digit
floating-suffix: one of
* h f l H F L
A floating literal is written either as a fractional-constant with an optional exponent-part and optional floating-suffix, or as an integer digit-sequence with a required exponent-part and optional floating-suffix.
The type of a floating literal is float, unless explicitly specified by a suffix. The suffixes h and H specify half, the suffixes f and F specify float, and the suffixes l and L specify double. If a value specified in the source is not in the range of representable values for its type, the program is ill-formed.
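The suffix rule above amounts to a lookup on the last character of the literal. The following C++ sketch illustrates that mapping; `floating_literal_type` is a hypothetical name, and the function assumes its argument is already a valid floating-literal token, performing no validation or range checking.

```cpp
#include <string>

// Returns the HLSL type name selected by the suffix of `literal`.
// Assumption: `literal` is already a valid floating-literal token; no
// validation or range checking is attempted here.
inline const char* floating_literal_type(const std::string& literal) {
    if (literal.empty()) return nullptr;
    switch (literal.back()) {
    case 'h': case 'H': return "half";
    case 'f': case 'F': return "float";
    case 'l': case 'L': return "double";
    default:            return "float"; // no suffix
    }
}
```

With this mapping, `1.0h` is typed as half, `2.5` and `3.f` as float, and `1e3L` as double.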
vector-literal:
* integer-literal . scalar-element-sequence
* floating-literal . scalar-element-sequence
scalar-element-sequence:
* scalar-element-sequence-x
* scalar-element-sequence-r
scalar-element-sequence-x:
* x
* scalar-element-sequence-x x
scalar-element-sequence-r:
* r
* scalar-element-sequence-r r
A vector-literal is an integer-literal or floating-point literal followed by a period (.) and a scalar-element-sequence.
A scalar-element-sequence is a vector-swizzle-sequence where only the first vector element accessor is valid (x or r). A scalar-element-sequence is equivalent to a vector splat conversion performed on the integer-literal or floating-literal value (4.9).
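The splat behavior of a vector-literal can be modeled on the host side: a literal such as `1.xxx` behaves like the scalar 1 replicated into three elements. The C++ sketch below is an illustration under stated assumptions: `splat_literal` is a hypothetical name, element values are modeled as `double` regardless of the literal's actual type, and the upper bound of four elements is taken from the vector length limit stated elsewhere in this document rather than from the grammar itself.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Models the value of "<literal>.<sequence>": the scalar replicated once
// per accessor in the scalar-element-sequence.
// Assumptions: values are modeled as double; the length is capped at 4.
inline std::vector<double> splat_literal(double value, const std::string& seq) {
    if (seq.empty() || seq.size() > 4)
        throw std::invalid_argument("sequence length must be 1 to 4");
    if (seq[0] != 'x' && seq[0] != 'r')
        throw std::invalid_argument("only x or r accessors are valid");
    for (char c : seq) {
        if (c != seq[0]) // x and r cannot be mixed in one sequence
            throw std::invalid_argument("mixed accessors");
    }
    return std::vector<double>(seq.size(), value);
}
```

Rejecting a mixed sequence such as `xr` follows from the grammar, which builds a sequence entirely from x or entirely from r.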
HLSL inherits a significant portion of its language semantics from C and C++. Some of this is a result of intentional adoption of syntax early in the development of the language and some a side-effect of the Clang-based implementation of DXC. This chapter includes a lot of definitions that are inherited from C and C++. Some are identical to C or C++, others are slightly different. HLSL is neither a subset nor a superset of C or C++, and cannot be simply described in terms of C or C++. This specification includes all necessary definitions for clarity.
An entity is a value, object, function, enumerator, type, class member, bit-field, template, template specialization, namespace, or pack.
A name is a use of an identifier (5.2.4), operator-function-id ([Overload.operator]), conversion-function-id (9.2), or template-id (10) that denotes any entity or label (6.1).
Every name that denotes an entity is introduced by a declaration. Every name that denotes a label is introduced by a labeled statement (6.1).
A variable is introduced by the declaration of a reference other than a non-static data member, or of an object. The variable’s name denotes the reference or object.
Whenever a name is encountered it is necessary to determine if the name denotes an entity that is a type or template. The process for determining if a name refers to a type or template is called name lookup.
Two names are the same name if:
* they are identifiers composed of the same character sequence, or
* they are operator-function-ids formed with the same operator, or
* they are conversion-function-ids formed with the same type, or
* they are template-ids that refer to the same class or function.
This section matches ISO C++ section [basic] except for the exclusion of goto and literal operators.
A declaration (7) may introduce one or more names into a translation unit or redeclare names introduced by previous declarations. If a declaration introduces names, it specifies the interpretation and attributes of these names. A declaration may also have effects such as:
* verifying a static assertion (7),
* use of attributes (7), and
* controlling template instantiation (10.1).
A declaration is a definition unless:
* it declares a function without specifying the function’s body (7.5),
* it is a parameter declaration in a function declaration that does not specify the function’s body (7.5),
* it is a global or namespace member declaration without the static specifier,
* it declares a static data member in a class definition,
* it is a class name declaration,
* it is a template parameter,
* it is a typedef declaration (7),
* it is an alias-declaration (7),
* it is a using-declaration (7),
* it is a static_assert-declaration (7),
* it is an empty-declaration (7), or
* it is a using-directive (7).
The two examples below are adapted from ISO C++ [basic.def]. All but two of the following are definitions:
```hlsl
int f(int x) { return x+1; }   // defines f and x
struct S { int a; int b; };    // defines S, S::a, and S::b
struct X {                     // defines X
  int x;                       // defines non-static member x
  static int y;                // declares static data member y
};
int X::y = 1;                  // defines X::y
enum { up, down };             // defines up and down
namespace N {                  // defines N
  int d;                       // declares N::d
  static int i;                // defines N::i
}
```
All of the following are declarations:
```hlsl
int a;                 // declares a
const int c;           // declares c
X anX;                 // declares anX
int f(int);            // declares f
struct S;              // declares S
typedef int Int;       // declares Int
using N::d;            // declares d
using Float = float;   // declares Float
cbuffer CB {           // does not declare CB
  int z;               // declares z
}
tbuffer TB {           // does not declare TB
  int w;               // declares w
}
```
The ISO C++ One-definition rule is adopted as defined in ISO C++ [basic.def.odr].
A translation unit (2.1) is comprised of a sequence of declarations:
translation-unit:
* declaration-sequence_opt
A program is one or more translation units linked together. A program built from a single translation unit, bypassing a linking step, is called freestanding.
A program is said to be fully linked when it contains no unresolved external declarations, and all exported declarations are entry point declarations (3.7). A program is said to be partially linked when it contains at least one unresolved external declaration or at least one exported declaration that is not an entry point.
An implementation may generate programs as fully linked or partially linked as requested by the user, and a runtime may allow fully linked or partially linked programs as the implementation allows.
A name has linkage if it can refer to the same entity as a name introduced by a declaration in another scope. If a variable, function, or another entity with the same name is declared in several scopes, but does not have sufficient linkage, then several instances of the entity are generated.
A name with no linkage may not be referred to by names from any other scope.
A name with internal linkage may be referred to by names from other scopes within the same translation unit.
A name with external linkage may be referred to by names from other scopes within the same translation unit, and by names from scopes of other translation units.
A name with program linkage may be referred to by names from other scopes within the same translation unit, by names from scopes of other translation units, by names from scopes of other programs, and by a runtime implementation.
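The four kinds of linkage described above form a strict ordering of visibility, which can be sketched as a small predicate. The C++ below is an illustration only; the `Linkage` and `Referrer` enumerations and the `can_refer` function are hypothetical names introduced for this sketch and do not correspond to any compiler interface.

```cpp
// Hypothetical model of the four kinds of linkage.
enum class Linkage { None, Internal, External, Program };

// Hypothetical model of where the referring name is located.
enum class Referrer {
    SameScope,             // the scope that declares the name
    OtherScopeSameUnit,    // another scope in the same translation unit
    OtherTranslationUnit,  // a scope in a different translation unit
    OtherProgramOrRuntime  // another program, or a runtime implementation
};

// Returns true when a name with the given linkage may be referred to
// from the given location, following the four definitions above.
inline bool can_refer(Linkage linkage, Referrer from) {
    switch (from) {
    case Referrer::SameScope:
        return true;
    case Referrer::OtherScopeSameUnit:
        return linkage != Linkage::None;
    case Referrer::OtherTranslationUnit:
        return linkage == Linkage::External || linkage == Linkage::Program;
    case Referrer::OtherProgramOrRuntime:
        return linkage == Linkage::Program;
    }
    return false;
}
```

Each level of linkage permits everything the previous level permits plus one wider set of referrers, which is why only names with program linkage remain visible to a runtime implementation.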
When merging translation units through linking or generating a freestanding program only names with program linkage must be retained in the final program.
Entities with program linkage can be referred to from other partially linked programs or a runtime implementation.
The following entities have program linkage:
* entry point functions (3.7)
* functions marked with the export keyword (7.7)
* declarations contained within an export-declaration-group (7.7)
Entities with external linkage can be referred to from the scopes in the other translation units and enable linking between them.
The following entities in HLSL have external linkage:
* global variables that are not marked static or groupshared
* static data members of classes or template classes
Linkage of functions (including template functions) that are not entry points or marked with the export keyword is implementation dependent.
Entities with internal linkage can be referred to from all scopes in the current translation unit.
The following entities in HLSL have internal linkage:
* global variables marked as static or groupshared
* all entities declared in an unnamed namespace or a namespace within an unnamed namespace
* enumerations
* classes or template classes, their member functions, and nested classes and enumerations
An entity with no linkage can be referred to only from the scope it is in.
Any of the following entities declared at function scope or block scopes derived from function scope have no linkage:
* local variables
* local classes and their member functions
* other entities declared at function scope or block scopes derived from function scope, such as typedefs, enumerations, and enumerators
A fully linked program shall contain one or more global functions, which are the designated starting points for the program. These global functions are called entry points, because they denote the location where execution inside the program begins.
Entry point functions have different requirements based on the target runtime and execution mode (3.7.1).
Parameters to entry functions and entry function return types must be of scalar, vector, or non-intangible class type (3.8). Scalar and vector parameters and return types must be annotated with semantic annotations (7.6.1). Class type input and output parameters must have all fields annotated with semantic annotations.
A runtime may define a set of execution modes in an implementation-defined way. Each execution mode will have a set of implementation-defined rules which restrict available language functionality as appropriate for the execution mode.
The object representation of an object of type T is the sequence of N bytes taken up by the object of type T, where N equals sizeof(T). The object representation of an object may be different based on the memory space it is stored in (1.7.1).
The value representation of an object is the set of bits that hold the value of type T. Bits in the object representation that are not part of the value representation are padding bits.
An object type is a type that is not a function type, not a reference type, and not a void type.
A class type is a data type declared with either the class or struct keywords (9). A class type T may be declared as incomplete at one point in a translation unit via a forward declaration, and complete later with a full definition. The type T is the same type throughout the translation unit.
There are special implementation-defined types such as handle types, which fall into a category of standard intangible types. Intangible types are types that have no defined object representation or value representation, as such the size is unknown at compile time.
A class type T is an intangible class type if it contains a base class or members of intangible class type, standard intangible type, or arrays of such types. Standard intangible types and intangible class types are collectively called intangible types (11).
An object type is an incomplete type if the compiler lacks sufficient information to determine the size of an object of type T, and it is not an intangible type. It is a complete type if the compiler has sufficient information to determine the size of an object of type T, or if the type is known to be an intangible type. An object may not be defined to have an incomplete type.
Arithmetic types (3.8.1), enumeration types, and cv-qualified versions of these types are collectively called scalar types.
Vectors of scalar types declared with the built-in vector<T,N> template are vector types. Vector lengths must be between 1 and 4 (i.e. 1 ≤ N ≤ 4).
Matrices of scalar types declared with the built-in matrix<T,N,M> template are matrix types. Matrix dimensions, N and M, must be between 1 and 4 (i.e. 1 ≤ N ≤ 4 and 1 ≤ M ≤ 4).
There are three standard signed integer types: int16_t, int32_t, and int64_t. Each of the signed integer types is explicitly named for the size in bits of the type’s object representation. There is also the type alias int which is an alias of int32_t. There is one minimum precision signed integer type: min16int. The minimum precision signed integer type is named for the required minimum value representation size in bits. The object representation of min16int is int. The standard signed integer types and minimum precision signed integer type are collectively called signed integer types.
There are three standard unsigned integer types: uint16_t, uint32_t, and uint64_t. Each of the unsigned integer types is explicitly named for the size in bits of the type’s object representation. There is also the type alias uint which is an alias of uint32_t. There is one minimum precision unsigned integer type: min16uint. The minimum precision unsigned integer type is named for the required minimum value representation size in bits. The object representation of min16uint is uint. The standard unsigned integer types and minimum precision unsigned integer type are collectively called unsigned integer types.
The minimum precision signed integer types and minimum precision unsigned integer types are collectively called minimum precision integer types. The standard signed integer types and standard unsigned integer types are collectively called standard integer types. The signed integer types and unsigned integer types are collectively called integer types. Integer types inherit the object representation of integers defined in isoC23 [10]. Integer types shall satisfy the constraints defined in isoCPP, section basic.fundamental.
There are three standard floating point types: half, float, and double. The float type is a 32-bit floating point type. The double type is a 64-bit floating point type. Both the float and double types have object representations as defined in IEEE754. The half type may be either 16-bit or 32-bit as controlled by implementation defined compiler settings. If half is 32-bit it will have an object representation as defined in IEEE754, otherwise it will have an object representation matching the binary16 format defined in IEEE754 [11]. There is one minimum precision floating point type: min16float. The minimum precision floating point type is named for the required minimum value representation size in bits. The object representation of min16float is float [12]. The standard floating point types and minimum precision floating point type are collectively called floating point types.
Integer and floating point types are collectively called arithmetic types.
The void type is inherited from isoCPP, which defines it as having an empty set of values and being an incomplete type that can never be completed. The void type is used to signify the return type of a function that returns no value. Any expression can be explicitly converted to void.
All types T have a scalarized representation, SR(T), which is a list of one or more types representing each scalar element of T.
Scalarized representations are determined as follows:
The scalarized representation of an array T[n] is SR(T_0), ..., SR(T_n).
The scalarized representation of a vector vector<T,n> is T_0, ..., T_n.
The scalarized representation of a matrix matrix<T,n,m> is T_0, ..., T_{n×m}.
The scalarized representation of a class type T, SR(T), is computed recursively as SR(T::base), SR(T::0), ..., SR(T::n) where T::base is T’s base class if it has one, and T::n represents the n non-static members of T.
The scalarized representation for an enumeration type is the underlying arithmetic type.
The scalarized representation for arithmetic, intangible types, and any other type T is T.
Two types cv1 T1 and cv2 T2 are scalar-layout-compatible types if T1 and T2 are the same type or if the sequence of types defined by the scalar representation SR(T1) and scalar representation SR(T2) are identical.
Expressions are classified by the type(s) of values they produce. The valid types of values produced by expressions are:
An lvalue represents a function or object.
An rvalue represents a temporary object.
An xvalue (expiring value) represents an object near the end of its lifetime.
A cxvalue (casted expiring value) is an xvalue which, on expiration, assigns its value to a bound lvalue.
A glvalue is an lvalue, xvalue, or cxvalue.
A prvalue is an rvalue that is not an xvalue.
hlsl inherits standard conversions similar to isoCPP. This chapter enumerates the full set of conversions. A standard conversion sequence is a sequence of standard conversions in the following order:
Zero or one conversion of either lvalue-to-rvalue, or array-to-pointer.
Zero or one conversion from the following: integral conversion, floating point conversion, floating point-integral conversion, boolean conversion, derived-to-base-lvalue conversion, or flat conversion [13].
Zero or one conversion of scalar-vector splat or vector/matrix truncation [14].
Zero or one qualification conversion.
Standard conversion sequences are applied to expressions, if necessary, to convert them to a required destination type.
A glvalue of a non-function type T can be converted to a prvalue. The program is ill-formed if T is an incomplete type. If the glvalue refers to an object that is not of type T and is not an object of a type derived from T, the program is ill-formed. If the glvalue refers to an object that is uninitialized, the behavior is undefined. Otherwise the prvalue is of type T.
If the glvalue refers to an array of type T, the prvalue will refer to a copy of the array, not memory referred to by the glvalue.
An lvalue or rvalue of type T[] (unsized array) can be converted to a prvalue of type pointer to T [15][16].
An integral promotion is a conversion of:
a glvalue of integer type other than bool to a cxvalue of integer type of higher conversion rank, or
a prvalue of integer type other than bool to a prvalue of integer type of higher conversion rank, or
a glvalue of type bool to a cxvalue of integer type, or
a prvalue of type bool to a prvalue of integer type.
Integer conversion ranks are defined in section 4.13.1.
A conversion is only a promotion if the destination type can represent all of the values of the source type.
A glvalue of a floating point type can be converted to a cxvalue of a floating point type of higher conversion rank, or a prvalue of a floating point type can be converted to a prvalue of a floating point type of higher conversion rank.
Floating point conversion ranks are defined in section [Conf.rank.float].
A glvalue of an integer type can be converted to a cxvalue of any other non-enumeration integer type. A prvalue of an integer type can be converted to a prvalue of any other integer type.
If the destination type is unsigned, integer conversion maintains the bit pattern of the source value in the destination type, truncating or extending the value to fit the destination type.
If the destination type is signed, the value is unchanged if the destination type can represent the source value. If the destination type cannot represent the source value, the result is implementation-defined.
If the source type is bool, the values true and false are converted to one and zero respectively.
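The integral conversion rules above can be sketched in C++14, whose conversions to unsigned types behave analogously. This is only an illustrative model: the C++ fixed-width types stand in for the HLSL types of the same width, and the helper names ToUint16, ToUint64 and FromBool are hypothetical. The signed-destination case is omitted because its out-of-range result is implementation-defined.

```cpp
#include <cstdint>

// C++ fixed-width types stand in for the HLSL types of the same width
// (modelling assumption, not part of the specification).

// Unsigned destination narrower than the source: the low-order bits are kept.
constexpr std::uint16_t ToUint16(std::int32_t Value) {
  return static_cast<std::uint16_t>(Value);
}

// Unsigned destination wider than the source: the bit pattern is extended.
constexpr std::uint64_t ToUint64(std::int32_t Value) {
  return static_cast<std::uint64_t>(Value);
}

// bool source: true and false become one and zero.
constexpr std::int32_t FromBool(bool Value) {
  return static_cast<std::int32_t>(Value);
}

static_assert(ToUint16(0x12345678) == 0x5678, "truncation keeps the low 16 bits");
static_assert(ToUint16(-1) == 0xFFFF, "the bit pattern of -1 is preserved");
static_assert(ToUint64(-1) == 0xFFFFFFFFFFFFFFFFull, "a negative value is extended");
static_assert(FromBool(true) == 1 && FromBool(false) == 0, "bool converts to 1 or 0");
```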
A glvalue of a floating point type can be converted to a cxvalue of any other floating point type. A prvalue of a floating point type can be converted to a prvalue of any other floating point type.
If the source value can be exactly represented in the destination type, the conversion produces the exact representation of the source value. If the source value cannot be exactly represented, the conversion to a best-approximation of the source value is implementation defined.
A glvalue of floating point type can be converted to a cxvalue of integer type. A prvalue of floating point type can be converted to a prvalue of integer type. Conversion of floating point values to integer values truncates by discarding the fractional value. The behavior is undefined if the truncated value cannot be represented in the destination type.
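As an illustration of truncation by discarding the fractional value, the following C++14 sketch uses the C++ float-to-int conversion, which rounds toward zero in the same way. The helper name ToInt is hypothetical, and out-of-range inputs are deliberately not exercised because the behavior is undefined.

```cpp
// Model of floating point to integer conversion: the fraction is discarded.
// Only values representable in the destination type are used here.
constexpr int ToInt(float Value) { return static_cast<int>(Value); }

static_assert(ToInt(2.9f) == 2, "the fractional part is discarded");
static_assert(ToInt(-2.9f) == -2, "truncation is toward zero");
static_assert(ToInt(7.0f) == 7, "an integral value is unchanged");
```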
A glvalue of integer type can be converted to a cxvalue of floating point type. A prvalue of integer type can be converted to a prvalue of floating point type. If the destination type can exactly represent the source value, the result is the exact value. If the destination type cannot exactly represent the source value, the conversion to a best-approximation of the source value is implementation defined.
A glvalue of arithmetic type can be converted to a cxvalue of boolean type. A prvalue of arithmetic or unscoped enumeration type can be converted to a prvalue of boolean type. A zero value is converted to false; all other values are converted to true.
A glvalue of type T can be converted to a cxvalue of type vector<T,x> or a prvalue of type T can be converted to a prvalue of type vector<T,x>. The destination value is the source value replicated into each element of the destination.
A glvalue of type T can be converted to a cxvalue of type matrix<T,x,y> or a prvalue of type T can be converted to a prvalue of type matrix<T,x,y>. The destination value is the source value replicated into each element of the destination.
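A splat can be pictured as filling every element of the destination with the same scalar. The C++14 sketch below models this with plain aggregates; Vec, Mat, SplatVec and SplatMat are hypothetical names used only for illustration and are not the HLSL vector and matrix templates.

```cpp
// Minimal aggregate stand-ins for vector<float, N> and matrix<float, R, C>.
template <int N> struct Vec { float E[N]; };
template <int R, int C> struct Mat { float E[R][C]; };

// Replicate the scalar into each element of a vector.
template <int N> constexpr Vec<N> SplatVec(float Scalar) {
  Vec<N> Result{};
  for (int I = 0; I < N; ++I)
    Result.E[I] = Scalar;
  return Result;
}

// Replicate the scalar into each element of a matrix.
template <int R, int C> constexpr Mat<R, C> SplatMat(float Scalar) {
  Mat<R, C> Result{};
  for (int I = 0; I < R; ++I)
    for (int J = 0; J < C; ++J)
      Result.E[I][J] = Scalar;
  return Result;
}

static_assert(SplatVec<3>(1.5f).E[0] == 1.5f && SplatVec<3>(1.5f).E[2] == 1.5f,
              "every vector element holds the source value");
static_assert(SplatMat<2, 3>(2.0f).E[1][2] == 2.0f,
              "every matrix element holds the source value");
```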
A prvalue of type vector<T,x> can be converted to a prvalue of type:
vector<T,y> only if y < x, or
T
The resulting value of vector truncation is comprised of elements [0..y), dropping elements [y..x).
A prvalue of type matrix<T,x,y> can be converted to a prvalue of type:
matrix<T,z,w> only if x ≥ z and y ≥ w,
vector<T,z> only if x ≥ z, or
T.
Matrix truncation is performed on each row and column dimension separately. The resulting value is comprised of vectors [0..z) which are each separately comprised of elements [0..w). Trailing vectors and elements are dropped.
Reducing the dimension of a vector to one (vector<T,1>), can produce either a single element vector or a scalar of type T. Reducing the rows of a matrix to one (matrix<T,x,1>), can produce either a single row matrix, a vector of type vector<T,x>, or a scalar of type T.
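The truncation rules can be sketched in C++14 as copying the leading elements and dropping the rest. The types Vec and Mat and the functions TruncateVec and TruncateMat are hypothetical stand-ins, and the static_assert inside each function mirrors the dimension constraints stated above.

```cpp
// Minimal aggregate stand-ins for vector<int, N> and matrix<int, Rows, Cols>.
template <int N> struct Vec { int E[N]; };
template <int Rows, int Cols> struct Mat { int E[Rows][Cols]; };

// vector<T,x> to vector<T,y>: keep elements [0..y), drop elements [y..x).
template <int Y, int X> constexpr Vec<Y> TruncateVec(Vec<X> V) {
  static_assert(Y < X, "vector truncation requires y < x");
  Vec<Y> Result{};
  for (int I = 0; I < Y; ++I)
    Result.E[I] = V.E[I];
  return Result;
}

// matrix<T,x,y> to matrix<T,z,w>: each dimension is truncated separately.
template <int Z, int W, int X, int Y>
constexpr Mat<Z, W> TruncateMat(Mat<X, Y> M) {
  static_assert(X >= Z && Y >= W, "matrix truncation requires x >= z and y >= w");
  Mat<Z, W> Result{};
  for (int I = 0; I < Z; ++I)
    for (int J = 0; J < W; ++J)
      Result.E[I][J] = M.E[I][J];
  return Result;
}

constexpr Vec<4> V4{{1, 2, 3, 4}};
constexpr Mat<3, 3> M33{{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}};

static_assert(TruncateVec<2>(V4).E[0] == 1 && TruncateVec<2>(V4).E[1] == 2,
              "the leading elements survive");
static_assert(TruncateMat<2, 2>(M33).E[1][0] == 4 &&
                  TruncateMat<2, 2>(M33).E[1][1] == 5,
              "trailing vectors and elements are dropped");
```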
A glvalue of type vector<T,x> can be converted to a cxvalue of type vector<V,x>, or a prvalue of type vector<T,x> can be converted to a prvalue of type vector<V,x>. The source value is converted by performing the appropriate conversion of each element of type T to an element of type V following the rules for standard conversions in chapter 4.
A glvalue of type matrix<T,x,y> can be converted to a cxvalue of type matrix<V,x,y>, or a prvalue of type matrix<T,x,y> can be converted to a prvalue of type matrix<V,x,y>. The source value is converted by performing the appropriate conversion of each element of type T to an element of type V following the rules for standard conversions in chapter 4.
A prvalue of type "cv1 T" can be converted to a prvalue of type "cv2 T" if type "cv2 T" is more cv-qualified than "cv1 T".
Every integer and floating point type has a defined conversion rank. These conversion ranks are used to differentiate between promotions and other conversions (see: [Conv.iprom] and [Conv.fpprom]).
No two signed integer types shall have the same conversion rank even if they have the same representation.
The rank of a signed integer type shall be greater than the rank of any signed integer type with a smaller size.
The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type.
The rank of bool shall be less than the rank of all other standard integer types.
The rank of a minimum precision integer type shall be less than the rank of any other minimum precision integer type with a larger minimum value representation size.
The rank of a minimum precision integer type shall be less than the rank of all standard integer types.
For all integer types T1, T2, and T3: if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 shall have greater rank than T3.
The rank of half shall be greater than the rank of min16float.
The rank of float shall be greater than the rank of half.
The rank of double shall be greater than the rank of float.
For all floating point types T1, T2, and T3: if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 shall have greater rank than T3.
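One way to picture how ranks separate promotions from other conversions is the C++14 model below. It assumes, purely for illustration, that the rank of a standard integer type can be represented by its size in bits, which satisfies the ordering rules above; minimum precision types and bool are left out, and all names are hypothetical.

```cpp
// A standard integer type described by its size and signedness.
struct IntType {
  int Bits;
  bool IsSigned;
};

constexpr IntType Int16{16, true}, Int32{32, true}, Int64{64, true};
constexpr IntType UInt16{16, false}, UInt32{32, false}, UInt64{64, false};

// Assumption: size in bits is used as the rank. Larger signed types rank
// higher, and an unsigned type shares the rank of its signed counterpart.
constexpr int Rank(IntType T) { return T.Bits; }

// True if every value of Src is representable in Dst.
constexpr bool CanRepresentAll(IntType Src, IntType Dst) {
  return Src.IsSigned == Dst.IsSigned ? Dst.Bits >= Src.Bits
         : Dst.IsSigned               ? Dst.Bits > Src.Bits // unsigned to signed
                                      : false;              // signed to unsigned
}

// A promotion needs a higher rank and a value-preserving destination.
constexpr bool IsIntegralPromotion(IntType Src, IntType Dst) {
  return Rank(Dst) > Rank(Src) && CanRepresentAll(Src, Dst);
}

// Floating point types listed in increasing rank order.
enum class FloatType { Min16Float, Half, Float, Double };
constexpr int Rank(FloatType T) { return static_cast<int>(T); }

static_assert(IsIntegralPromotion(Int16, Int32), "higher rank, value preserving");
static_assert(!IsIntegralPromotion(Int32, Int16), "lower rank is not a promotion");
static_assert(!IsIntegralPromotion(Int16, UInt32), "negative values are not representable");
static_assert(Rank(FloatType::Half) > Rank(FloatType::Min16Float) &&
                  Rank(FloatType::Double) > Rank(FloatType::Float),
              "floating point rank ordering");
```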
This chapter defines the formulations of expressions and the behavior of operators when they are not overloaded. Only member operators may be overloaded [17]. Operator overloading does not alter the rules for operators defined by this standard.
An expression may also be an unevaluated operand when it appears in some contexts. An unevaluated operand is an expression which is not evaluated in the program [18].
Whenever a glvalue appears in an expression that expects a prvalue, a standard conversion sequence is applied based on the rules in 4.
Binary operators for arithmetic and enumeration type require that both operands are of a common type. When the types do not match the usual arithmetic conversions are applied to yield a common type. When usual arithmetic conversions are applied to vector operands they behave as component-wise conversions (4.11). The usual arithmetic conversions are:
If either operand is of scoped enumeration type no conversion is performed, and the expression is ill-formed if the types do not match.
If either operand is a vector<T,X>, vector truncation or scalar extension is performed with the following rules:
If both vectors are of the same length, no dimension conversion is required.
If one operand is a vector and the other operand is a scalar, the scalar is extended to a vector via a Splat conversion (4.9) to match the length of the vector.
Otherwise, if both operands are vectors of different lengths, the vector of longer length is truncated to match the length of the shorter vector (4.10).
If either operand is of type double or vector<double, X>, the other operand shall be converted to match.
Otherwise, if either operand is of type float or vector<float, X>, the other operand shall be converted to match.
Otherwise, if either operand is of type half or vector<half, X>, the other operand shall be converted to match.
Otherwise, integer promotions are performed on each scalar or vector operand following the appropriate scalar or component-wise conversion (4).
If both operands have signed scalar or vector element types, or both have unsigned scalar or vector element types, the operand of lesser integer conversion rank shall be converted to the type of the operand with greater rank.
Otherwise, if the operand of unsigned scalar or vector element type is of greater rank than the operand of signed scalar or vector element type, the signed operand is converted to the type of the unsigned operand.
Otherwise, if the operand of signed scalar or vector element type is able to represent all values of the operand of unsigned scalar or vector element type, the unsigned operand is converted to the type of the signed operand.
Otherwise, both operands are converted to a scalar or vector type of the unsigned integer type corresponding to the type of the operand with signed integer scalar or vector element type.
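The steps above can be sketched as a small C++14 model that computes the common type of two operands. It is a simplification under stated assumptions: enumerations, bool and minimum precision types are omitted, the integer promotion step is skipped, integer rank is approximated by size in bits, and a length of 0 is used to mark a scalar. All names are hypothetical.

```cpp
enum class Elem { Int16, UInt16, Int32, UInt32, Int64, UInt64, Half, Float, Double };

struct Type {
  Elem Element;
  int Length; // 0 models a scalar, 1 to 4 a vector length (assumption)
};

constexpr bool IsSigned(Elem E) {
  return E == Elem::Int16 || E == Elem::Int32 || E == Elem::Int64;
}
// Only meaningful for integer element types; used as the rank (assumption).
constexpr int Bits(Elem E) {
  return (E == Elem::Int16 || E == Elem::UInt16)   ? 16
         : (E == Elem::Int32 || E == Elem::UInt32) ? 32
                                                   : 64;
}
constexpr Elem ToUnsigned(Elem E) {
  return E == Elem::Int16 ? Elem::UInt16 : E == Elem::Int32 ? Elem::UInt32 : Elem::UInt64;
}

constexpr Elem CommonElement(Elem A, Elem B) {
  if (A == Elem::Double || B == Elem::Double) return Elem::Double;
  if (A == Elem::Float || B == Elem::Float) return Elem::Float;
  if (A == Elem::Half || B == Elem::Half) return Elem::Half;
  // Same signedness: the operand of greater rank wins.
  if (IsSigned(A) == IsSigned(B)) return Bits(A) >= Bits(B) ? A : B;
  const Elem S = IsSigned(A) ? A : B; // the signed operand
  const Elem U = IsSigned(A) ? B : A; // the unsigned operand
  if (Bits(U) > Bits(S)) return U;    // unsigned has greater rank
  if (Bits(S) > Bits(U)) return S;    // signed can represent all unsigned values
  return ToUnsigned(S);               // fall back to the unsigned counterpart
}

constexpr int CommonLength(int A, int B) {
  if (A == B) return A;  // same shape, nothing to do
  if (A == 0) return B;  // scalar is splatted to the vector length
  if (B == 0) return A;
  return A < B ? A : B;  // longer vector is truncated
}

constexpr Type Common(Type A, Type B) {
  return Type{CommonElement(A.Element, B.Element), CommonLength(A.Length, B.Length)};
}
constexpr bool Same(Type A, Type B) {
  return A.Element == B.Element && A.Length == B.Length;
}

static_assert(Same(Common(Type{Elem::Int32, 0}, Type{Elem::Float, 3}), Type{Elem::Float, 3}),
              "scalar int with float3 yields float3");
static_assert(Same(Common(Type{Elem::Int32, 0}, Type{Elem::UInt32, 0}), Type{Elem::UInt32, 0}),
              "equal rank mixed signedness yields the unsigned type");
```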
primary-expression:
* literal
* this
* ( expression )
* id-expression
The type of a literal is determined based on the grammar forms specified in 2.9.1.
The keyword this names a reference to the implicit object of non-static member functions. The this parameter is always a prvalue of non-cv-qualified type [19].
A this expression shall not appear outside the declaration of a non-static member function.
An expression (E) enclosed in parentheses has the same type, result and value category as E without the enclosing parentheses. A parenthesized expression may be used in the same contexts with the same meaning as the same non-parenthesized expression.
The grammar and behaviors of this section are almost identical to C/C++ with some subtractions (notably lambdas and destructors).
id-expression:
* unqualified-id
* qualified-id
unqualified-id:
* identifier
* operator-function-id
* conversion-function-id
* template-id
qualified-id:
* nested-name-specifier template_opt unqualified-id
nested-name-specifier:
* ::
* type-name ::
* namespace-name ::
* nested-name-specifier identifier ::
* nested-name-specifier template_opt simple-template-id ::
postfix-expression:
* primary-expression
* postfix-expression [ expression ]
* postfix-expression [ braced-init-list ]
* postfix-expression ( expression-list_opt )
* simple-type-specifier ( expression-list_opt )
* typename-specifier ( expression_opt )
* simple-type-specifier braced-init-list
* typename-specifier braced-init-list
* postfix-expression . template_opt id-expression
* postfix-expression -> template_opt id-expression
* postfix-expression ++
* postfix-expression --
A postfix-expression followed by an expression in square brackets ([]) is a subscript expression. In an array subscript expression of the form E1[E2], E1 must either be a variable of array type T[], or an object of type T where T provides an overloaded implementation of operator[] (8) [20].
A function call may call an ordinary function or a member function. In a function call to an ordinary function, the postfix-expression must be an lvalue that refers to a function. In a function call to a member function, the postfix-expression will be an implicit or explicit class member access whose id-expression is a member function name.
When a function is called, each parameter shall be initialized with its corresponding argument. The order in which parameters are initialized is unspecified [21].
If the function is a non-static member function the this argument shall be initialized to a reference to the object of the call as if casted by an explicit cast expression to an lvalue reference of the type that the function is declared as a member of.
Parameters are either input parameters, output parameters, or input/output parameters as denoted in the called function’s declaration (7.5). For all types of parameters the argument expressions are evaluated before the function call occurs.
Input parameters are passed by-value into a function. If an argument to an input parameter is of constant-sized array type, the array is copied to a temporary and the temporary value is converted to an address via array-to-pointer decay. If an argument is an unsized array type, the array lvalue directly decays via array-to-pointer decay [22].
Arguments to output and input/output parameters must be lvalues. Output parameters are not initialized prior to the call; they are passed as an uninitialized cxvalue (3.9). An output parameter is only initialized explicitly inside the called function. It is undefined behavior to not explicitly initialize an output parameter before returning from the function in which it is defined. The cxvalue created from an argument to an input/output parameter is initialized through copy-initialization from the lvalue argument expression. Overload resolution shall occur on argument initialization as if the expression T Param = Arg were evaluated. In both cases, the cxvalue shall have the type of the parameter and the argument can be converted to that type through implicit or explicit conversion.
If an argument to an output or input/output parameter is a constant sized array, the array is copied to a temporary cxvalue following the same rules for any other data type. If an argument to an output or input/output parameter is an unsized array type, the array lvalue directly decays via array-to-pointer decay. An argument of a constant sized array of type T[N] can be converted to a cxvalue of an unsized array of type T[] through array to pointer decay. An unsized array of type T[] cannot be implicitly converted to a constant sized array of type T[N].
On expiration of the cxvalue, the value is assigned back to the argument lvalue expression using a resolved assignment expression as if the expression Arg = Param were written [23]. The argument expression must be of a type or able to convert to a type that has defined copy-initialization to and assignment from the parameter type. The lifetime of the cxvalue begins at argument expression evaluation, and ends after the function returns. A cxvalue argument is passed by-address to the callee.
If the lvalue passed to an output or input/output parameter does not alias any other parameter passed to that function, an implementation may avoid the creation of excess temporaries by passing the address of the lvalue instead of creating the cxvalue.
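The copy-in and copy-out behavior of output and input/output parameters can be modelled in C++14 by making the temporary explicit. In this sketch the callee takes a reference to the temporary, which plays the role of the cxvalue, and the caller performs the write-back as if Arg = Param were written. The functions Fill and AddHalf are hypothetical; in HLSL they would be declared with out float and inout float parameters respectively.

```cpp
// Hypothetical callees. HLSL spellings would be:
//   void Fill(out float F)      and      void AddHalf(inout float F)
constexpr void Fill(float &Param) { Param = 2.75f; }
constexpr void AddHalf(float &Param) { Param += 0.5f; }

// Models calling Fill with an int lvalue argument.
constexpr int CallFill() {
  int Arg = 0;
  // The temporary for an out parameter is conceptually uninitialized; it is
  // zeroed here only because C++14 constant evaluation requires it.
  float Param = 0.0f;
  Fill(Param);                   // the callee initializes the parameter
  Arg = static_cast<int>(Param); // write-back, as if Arg = Param
  return Arg;
}

// Models calling AddHalf with an int lvalue argument.
constexpr int CallAddHalf(int Arg) {
  float Param = static_cast<float>(Arg); // copy-in from the argument
  AddHalf(Param);
  Arg = static_cast<int>(Param);         // copy-out with float to int truncation
  return Arg;
}

static_assert(CallFill() == 2, "2.75 is truncated when written back to the int");
static_assert(CallAddHalf(3) == 3, "3 becomes 3.5 in the callee, then 3 on write-back");
```

The model also shows why the argument type needs both copy-initialization to and assignment from the parameter type: the conversion runs once in each direction.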
When a function is called, any parameter of object type must have completely defined type, and any parameter of array of object type must have completely defined element type [24]. The lifetime of a parameter ends on return of the function in which it is defined [25]. Initialization and destruction of each parameter occurs within the context of the calling function.
The value of a function call is the value returned by the called function.
A function call is an lvalue if the result type is an lvalue reference type; otherwise it is a prvalue.
If a function call is a prvalue of object type, the type of the prvalue must be complete.
statement:
* labeled-statement
* attribute-specifier-sequence_opt expression-statement
* attribute-specifier-sequence_opt compound-statement
* attribute-specifier-sequence_opt iteration-statement
* attribute-specifier-sequence_opt selection-statement
* declaration-statement
The optional attribute-specifier-sequence applies to the statement that immediately follows it.
The [unroll] attribute is only valid when applied to iteration-statements. It is used to indicate that iteration-statements like for, while and do while can be unrolled. This attribute qualifier can be used to specify full unrolling or partial unrolling by a specified amount. This is a compiler hint and the compiler may ignore this directive.
The unroll attribute may optionally have an unroll factor represented as a single argument n that is an integer constant expression value greater than zero. If n is not specified, the compiler determines the unrolling factor for the loop. The [unroll] attribute cannot be applied to the same iteration-statement as the [loop] attribute.
The [loop] attribute tells the compiler to execute each iteration of the loop. In other words, it is a hint to indicate that a loop should not be unrolled. Therefore it is not compatible with the [unroll] attribute.
Declarations generally specify how names are to be interpreted. Declarations have the form
declaration-seq:
* declaration
* declaration-seq declaration
declaration:
* name-declaration
* special-declaration
name-declaration:
* ...
special-declaration:
* export-declaration-group
* ...
The specifiers that can be used in a declaration are
decl-specifier:
* function-specifier
* ...
A function-specifier can be used only in a function declaration.
function-specifier:
* export
The export specifier denotes that the function has program linkage (3.6.1).
The export specifier cannot be used on functions directly or indirectly within an unnamed namespace.
Functions with program linkage can also be specified in export-declaration-group (7.7).
If a function is declared with an export specifier then all redeclarations of the same function must also use the export specifier or be part of export-declaration-group (7.7).
One or more functions with external linkage can also be specified in the form of
export-declaration-group:
* export { function-declaration-seq_opt }
function-declaration-seq:
* function-declaration function-declaration-seq_opt
The export specifier denotes that every function-declaration included in function-declaration-seq has external linkage (3.6.2).
The export-declaration-group declaration cannot appear directly or indirectly within an unnamed namespace.
Functions with external linkage can also be declared with an export specifier (7.2.2).
If a function is part of an export-declaration-group then all redeclarations of the same function must also be part of an export-declaration-group or be declared with an export specifier (7.2.2).
HLSL inherits much of its overloading behavior from C++. This chapter is extremely similar to isoCPP clause [over]. Notable differences exist around HLSL’s parameter modifier keywords, program entry points, and overload conversion sequence ranking.
When a single name is declared with two or more different declarations in the same scope, the name is overloaded. A declaration that declares an overloaded name is called an overloaded declaration. The set of overloaded declarations that declare the same overloaded name is that name’s overload set.
Only function and template declarations can be overloaded; variable and type declarations cannot be overloaded.
This section specifies the cases in which a function declaration cannot be overloaded. Any program that contains an invalid overload set is ill-formed.
An overload set is invalid if:
Two or more declarations in the overload set differ only by return type.
int Yeet();
uint Yeet(); // ill-formed: decls differ only by return type
An overload set contains more than one member function declaration with the same parameter-type-list, and one of those declarations is a static member function declaration (9.1).
class Doggo {
  static void pet();
  void pet(); // ill-formed: static pet has the same parameter-type-list
  void pet() const; // ill-formed: static pet has the same parameter-type-list
  void wagTail(); // valid: no conflicting static declaration.
  void wagTail() const; // valid: no conflicting static declaration.
  static void bark(Doggo D);
  void bark(); // valid: static bark parameter-type-list is different
  void bark() const; // valid: static bark parameter-type-list is different
};
An overload set contains more than one entry function declaration (7.6.2).
void VS();
void VS(int); // valid: only one entry point.
[shader("vertex")] void Entry();
[shader("compute")] void Entry(int); // ill-formed: an overload set cannot have more than one entry function
An overload set contains more than one function declaration which only differ in parameter declarations of equivalent types.
void F(int4 I);
void F(vector<int, 4> I); // ill-formed: int4 is a type alias of vector<int, 4>
An overload set contains more than one function declaration which only differ in const specifiers.
void G(int);
void G(const int); // ill-formed: redeclaration of G(int)
void G(int) {}
void G(const int) {} // ill-formed: redefinition of G(int)
An overload set contains more than one function declaration which only differ in parameters mismatching out and inout.
void H(int);
void H(in int); // valid: redeclaration of H(int)
void H(inout int); // valid: overloading between in and inout is allowed
void I(in int);
void I(out int); // valid: overloading between in and out is allowed
void J(out int);
void J(inout int); // ill-formed: Cannot overload based on out/inout mismatch
Overload resolution is the process by which a function call is mapped to the best overloaded function declaration. Overload resolution uses a set of functions called the candidate set, and a list of expressions that comprise the argument list for the call.
Overload resolution selects the function to call in the following contexts [26]:
invocation of a function named in a function call expression;
invocation of a function call operator on a class object named in function call syntax;
invocation of the operator referenced in an expression;
invocation of a user-defined conversion for copy-initialization of a class object;
invocation of a conversion function for initialization of an object of a nonclass type from an expression of class type.
In each of these contexts a unique method is used to construct the overload candidate set and argument expression list.
isoCPP goes into a lot of detail in this section about how candidate functions and argument lists are selected for each context where overload resolution is performed. HLSL matches C++ for the contexts that HLSL inherits. For now, this section will be left as a stub, but HLSL inherits the following sections from C++:
Given the candidate set and argument expressions as determined by the relevant context (8.2.1), a subset of viable functions can be selected from the candidate set.
A function candidate F(P_0...P_m) is not a viable function for a call with argument list A_0...A_n if:
The function has fewer parameters than there are arguments in the argument list (m < n).
The function has more parameters than there are arguments in the argument list (m > n), and function parameters P_{n+1}...P_m do not all have default arguments.
There is no implicit conversion sequence that converts each argument A_i to the type of the corresponding parameter P_i.
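The viability conditions above can be sketched as a small C++14 predicate. The model counts parameters and arguments rather than using the zero-based indices m and n, assumes that default arguments occupy the trailing parameters, and reduces the conversion check to a single flag. Candidate and IsViable are hypothetical names.

```cpp
// A candidate described only by its parameter counts (modelling assumption).
struct Candidate {
  int NumParams;           // total number of parameters
  int NumTrailingDefaults; // how many trailing parameters have default arguments
};

// AllArgsConvertible stands in for "an implicit conversion sequence exists
// for every argument", which is not modelled in detail here.
constexpr bool IsViable(Candidate F, int NumArgs, bool AllArgsConvertible) {
  if (F.NumParams < NumArgs)
    return false; // fewer parameters than arguments
  if (F.NumParams - F.NumTrailingDefaults > NumArgs)
    return false; // an unmatched parameter has no default argument
  return AllArgsConvertible;
}

// A hypothetical candidate with two parameters, the second one defaulted.
constexpr Candidate TwoParamsOneDefault{2, 1};

static_assert(IsViable(TwoParamsOneDefault, 2, true), "all parameters matched");
static_assert(IsViable(TwoParamsOneDefault, 1, true), "default fills the second parameter");
static_assert(!IsViable(TwoParamsOneDefault, 0, true), "first parameter has no default");
static_assert(!IsViable(TwoParamsOneDefault, 3, true), "too many arguments");
```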
For an overloaded call with arguments A_0...A_n, each viable function F(P_0...P_m) has a set of implicit conversion sequences ICS_0(F)...ICS_m(F) defining the conversion sequences for each argument A_i to the type of parameter P_i.
A viable function F is defined to be a better function than another viable function F' if for all arguments ICS_i(F) is not a worse conversion sequence than ICS_i(F'), and:
for some argument j, ICS_j(F) is a better conversion than ICS_j(F'), or
in the context of an initialization by user-defined conversion, the conversion sequence from the return type of F to the destination type is a better conversion sequence than the return type of F' to the destination type, or
F is a non-template function and F' is a function template specialization, or
F and F' are both function template specializations and F is more specialized than F' according to function template partial ordering rules (10.2).
If there is one viable function that is a better function than all the other viable functions, it is the selected function; otherwise the call is ill-formed.
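The selection rule can be illustrated with a C++14 model in which each conversion sequence is reduced to an integer cost. This encoding is an assumption made for the sketch: a smaller number means a better conversion sequence, and only the first tie-breaker (a strictly better conversion for some argument) is modelled. The names Viable, OverloadSet, IsBetter and SelectBest are hypothetical.

```cpp
constexpr int MaxArgs = 4;

// Cost[i] encodes the quality of ICS_i for one viable function.
// Assumption: smaller is better. Unused entries stay zero.
struct Viable {
  int Cost[MaxArgs];
};

struct OverloadSet {
  Viable Fn[3];
  int Count;
};

// F is better than G if it is no worse for every argument and strictly
// better for at least one.
constexpr bool IsBetter(Viable F, Viable G, int NumArgs) {
  bool SomeBetter = false;
  for (int I = 0; I < NumArgs; ++I) {
    if (F.Cost[I] > G.Cost[I])
      return false;
    if (F.Cost[I] < G.Cost[I])
      SomeBetter = true;
  }
  return SomeBetter;
}

// Returns the index of the function that is better than all others,
// or -1 when no such function exists (the call would be ill-formed).
constexpr int SelectBest(OverloadSet S, int NumArgs) {
  for (int I = 0; I < S.Count; ++I) {
    bool BeatsAll = true;
    for (int J = 0; J < S.Count; ++J)
      if (J != I && !IsBetter(S.Fn[I], S.Fn[J], NumArgs))
        BeatsAll = false;
    if (BeatsAll)
      return I;
  }
  return -1;
}

static_assert(SelectBest(OverloadSet{{Viable{{0, 1}}, Viable{{1, 1}}}, 2}, 2) == 0,
              "the first function is better for argument 0 and no worse for argument 1");
static_assert(SelectBest(OverloadSet{{Viable{{0, 2}}, Viable{{1, 1}}}, 2}, 2) == -1,
              "each function wins one argument, so the call is ambiguous");
```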
If the resolved overload is a function with multiple declarations, and if at least two of these declarations specify a default argument that made the function viable, the program is ill-formed.
void F(int X = 1);
void F(float Y = 2.0f);
void Fn() {
  F(1); // Okay.
  F(3.0f); // Okay.
  F(); // Ill-formed.
}
An implicit conversion sequence is a sequence of conversions which converts a source value to a prvalue of destination type. In overload resolution the source value is the argument expression in a function call, and the destination type is the type of the corresponding parameter of the function being called.
When a parameter is a cxvalue an inverted implicit conversion sequence is required to convert the parameter type back to the argument type for writing back to the argument expression lvalue. An inverted implicit conversion sequence must be a well-formed implicit conversion sequence where the source value is the implicit cxvalue of the parameter type, and the destination type is the argument expression’s lvalue type.
A well-formed implicit conversion sequence is either a standard conversion sequence, or a user-defined conversion sequence.
In the following contexts an implicit conversion sequence can only be a standard conversion sequence:
Argument conversion for a user-defined conversion function.
Copying a temporary for class copy-initialization.
When passing an initializer-list as a single argument.
Copy-initialization of a class by user-defined conversion.
An implicit conversion sequence models a copy-initialization unless it is an inverted implicit conversion sequence, in which case it models an assignment. Any difference in top-level cv-qualification is handled by the copy-initialization or assignment, and does not constitute a conversion [27].
When the source value type and the destination type are the same, the implicit conversion sequence is an identity conversion, which signifies no conversion.
Only standard conversion sequences that do not create temporary objects are valid for implicit object parameters or left operand to assignment operators.
If no sequence of conversions can be found to convert a source value to the destination type, an implicit conversion sequence cannot be formed.
If several different sequences of conversions exist that convert the source value to the destination type, the implicit conversion sequence is defined to be the unique conversion sequence designated the ambiguous conversion sequence. For the purpose of ranking implicit conversion sequences, the ambiguous conversion sequence is treated as a user-defined sequence that is indistinguishable from any other user-defined conversion sequence. If overload resolution selects a function using the ambiguous conversion sequence as the best match for a call, the call is ill-formed.
The conversions that comprise a standard conversion sequence and the composition of the sequence are defined in Chapter 4.
Each standard conversion is given a category and rank as defined in the table below:
Conversion | Category | Rank | Reference |
---|---|---|---|
No conversion | Identity | Exact Match | |
Lvalue-to-rvalue | Lvalue Transformation | Exact Match | 4.1 |
Array-to-pointer | Lvalue Transformation | Exact Match | 4.2 |
Qualification | Qualification Adjustment | Exact Match | 4.12 |
Scalar splat (without conversion) | Scalar Extension | Extension | 4.9 |
Integral promotion | Promotion | Promotion | 4.5 & 4.13.1 |
Floating point promotion | Promotion | Promotion | 4.6 & 4.13.2 |
Component-wise promotion | Promotion | Promotion | 4.11 |
Scalar splat promotion | Scalar Extension Promotion | Promotion Extension | 4.9 |
Integral conversion | Conversion | Conversion | 4.5 |
Floating point conversion | Conversion | Conversion | 4.6 |
Floating-integral conversion | Conversion | Conversion | 4.7 |
Boolean conversion | Conversion | Conversion | 4.8 |
Component-wise conversion | Conversion | Conversion | 4.11 |
Scalar splat conversion | Scalar Extension Conversion | Conversion Extension | 4.9 |
Vector truncation (without conversion) | Dimensionality Reduction | Truncation | 4.10 |
Vector truncation promotion | Dimensionality Reduction Promotion | Promotion Truncation | 4.10 |
Vector truncation conversion | Dimensionality Reduction Conversion | Conversion Truncation | 4.10 |
If a scalar splat conversion occurs in a conversion sequence where all other conversions are Exact Match rank, the conversion is ranked as Extension. If a scalar splat occurs in a conversion sequence with a Promotion conversion, the conversion is ranked as Promotion Extension. If a scalar splat occurs in a conversion sequence with a Conversion conversion, the conversion is ranked as Conversion Extension.
If a vector truncation conversion occurs in a conversion sequence where all other conversions are Exact Match rank, the conversion is ranked as Truncation. If a vector truncation occurs in a conversion sequence with a Promotion conversion, the conversion is ranked as Promotion Truncation. If a vector truncation occurs in a conversion sequence with a Conversion conversion, the conversion is ranked as Conversion Truncation.
Otherwise, the rank of a conversion sequence is determined by considering the rank of each conversion.
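As an illustration of how a scalar splat or vector truncation combines with the other conversions in a sequence, the sketch below annotates each call with the rank its argument conversion would be expected to receive under the rules above. All function names are hypothetical, and the example assumes that `float` to `double` is a floating point promotion and that `float` to `int` is a floating-integral conversion.

```hlsl
// Hypothetical functions, one per destination type being illustrated.
void TakesFloat3(float3 V) {}
void TakesDouble3(double3 V) {}
void TakesInt3(int3 V) {}
void TakesFloat2(float2 V) {}
void TakesDouble2(double2 V) {}
void TakesInt2(int2 V) {}

void Fn() {
  float S = 1.0f;
  float4 V = float4(1.0f, 2.0f, 3.0f, 4.0f);

  TakesFloat3(S);  // splat only:                     Extension
  TakesDouble3(S); // promotion + splat:              Promotion Extension
  TakesInt3(S);    // conversion + splat:             Conversion Extension

  TakesFloat2(V);  // truncation only:                Truncation
  TakesDouble2(V); // promotion + truncation:         Promotion Truncation
  TakesInt2(V);    // conversion + truncation:        Conversion Truncation
}
```

The lvalue-to-rvalue conversion applied to `S` and `V` in each call has Exact Match rank, so it does not change the rank of the overall sequence.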
Conversion sequence ranks are ordered from better to worse as:
Exact Match
Extension
Promotion
Promotion Extension
Conversion
Conversion Extension
Truncation
Promotion Truncation
Conversion Truncation
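The ordering can be read as a tie-breaker between candidate functions. The following sketch shows three overload sets where the candidates differ only in the rank of the argument conversion; the function names and return values are hypothetical and exist only to make the selected candidate visible. The comments state the selection expected from the ordering listed above, not a verified compiler result.

```hlsl
// Promotion vs. Conversion.
int Widen(double X) { return 0; }
int Widen(int X)    { return 1; }

// Extension vs. Conversion Extension.
int Splat(float3 V) { return 0; }
int Splat(int3 V)   { return 1; }

// Truncation vs. Conversion Truncation.
int Trunc(float2 V) { return 0; }
int Trunc(int2 V)   { return 1; }

void Fn() {
  float S = 1.0f;
  float4 V = float4(1.0f, 2.0f, 3.0f, 4.0f);

  // float -> double is a Promotion; float -> int is a Conversion.
  // Promotion is better, so Widen(double) is the expected match.
  int R0 = Widen(S);

  // float -> float3 is an Extension; float -> int3 is a Conversion
  // Extension. Extension is better, so Splat(float3) is expected.
  int R1 = Splat(S);

  // float4 -> float2 is a Truncation; float4 -> int2 is a Conversion
  // Truncation. Truncation is better, so Trunc(float2) is expected.
  int R2 = Trunc(V);
}
```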
A partial ordering of implicit conversion sequences exists based on defining relationships for better conversion sequence, and better conversion. If an implicit conversion sequence $ICS(f)$ is a better conversion sequence than $ICS(f')$, then the inverse is also true: $ICS(f')$ is a worse conversion sequence than $ICS(f)$. If $ICS(f)$ is neither better nor worse than $ICS(f')$, the conversion sequences are indistinguishable conversion sequences.
A standard conversion sequence is always better than a user-defined conversion sequence.
Standard conversion sequences are ordered by their ranks. Two conversion sequences with the same rank are indistinguishable unless one of the following rules applies:
If class B is derived directly or indirectly from class A and class C is derived directly or indirectly from class B,
binding of an expression of type C to a cxvalue of type B is better than binding an expression of type C to a cxvalue of type A,
conversion of C to B is better than conversion of C to A,
binding of an expression of type B to a cxvalue of type A is better than binding an expression of type C to a cxvalue of type A,
conversion of B to A is better than conversion of C to A.
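A minimal sketch of the class-derivation rule for by-value conversions follows. The struct and function names are hypothetical, and the comment states the selection expected from the rule "conversion of C to B is better than conversion of C to A"; it is not a verified compiler result.

```hlsl
struct A { int X; };
struct B : A { int Y; };
struct C : B { int Z; };

// Hypothetical overloads: the return value identifies the candidate.
int Select(A Obj) { return 0; }
int Select(B Obj) { return 1; }

void Fn() {
  C Obj;
  Obj.X = 0;
  Obj.Y = 0;
  Obj.Z = 0;

  // Both candidates require a derived-to-base conversion of the same
  // rank. B is the nearer base of C, so C -> B is the better
  // conversion and Select(B) is the expected match.
  int R = Select(Obj);
}
```

Without this rule the two sequences would be indistinguishable and the call would be ambiguous.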