Accera v1.2 Reference

accera.Target([architecture, cache_lines, cache_sizes, category, extensions, family, frequency_GHz, known_name, model, name, num_cores, num_threads, runtime, tensor_core_info, turbo_frequency_GHz, vector_bytes, vector_registers])

Defines the capabilities of a target processor.

Arguments

| argument | description | type/default |
| --- | --- | --- |
| architecture | The processor architecture | accera.Target.Architecture |
| cache_lines | Cache lines (kilobytes) | list of positive integers |
| cache_sizes | Cache sizes (bytes) | list of positive integers |
| category | The processor category | accera.Target.Category |
| extensions | Supported processor extensions | list of extension codes |
| family | The processor family | string |
| frequency_GHz | The processor frequency (GHz) | positive number |
| known_name | A name of a device known to Accera | string or accera.Target.Model / "HOST" |
| model | The processor model | accera.Target.Model |
| name | The processor name | string |
| num_cores | Number of cores | positive integer |
| num_threads | Number of threads | positive integer |
| runtime | The runtime | accera.Target.Runtime |
| tensor_core_info | The tensor core capabilities, such as the supported input type, output type, and shapes | accera.Targets.TensorCoreInformation |
| turbo_frequency_GHz | Turbo frequency (GHz) | positive number |
| vector_bytes | Bytes per vector register | positive number |
| vector_registers | Total number of SIMD registers | positive number |

Known device names

Accera provides a predefined list of known targets through the accera.Target.Model enumeration.

These known targets provide typical hardware settings and may not match your hardware exactly. If your target is close to (but not identical to) one of them, you can start from the known target and update its properties as needed.

If your target is your host machine, Accera first tries to find the host machine's CPU in the list of known devices and, if it is found, uses the corresponding capabilities. If it is not found, we recommend inspecting the closest matching device in the accera.Target.Model enumeration in order to generate optimal code. If no device closely matches your host machine, see the following section to define a CPU target in Accera.

Examples

Let's have a look at some examples to understand how to define a CPU target in Accera.

Create a custom CPU target:

import accera as acc
cpu_target = acc.Target(name="Custom processor", category=acc.Target.Category.CPU, architecture=acc.Target.Architecture.X86_64, num_cores=10)

Create a known CPU target and selectively override some of its fields:

gen10 = acc.Target(
    known_name="Intel 7940X",
    category=acc.Target.Category.CPU,
    extensions=["SSE4.1", "SSE4.2", "AVX2"])

In this example, we created a target device of a known CPU but overrode the extensions to remove AVX512 support.

You can use this example as a starting point to define any other Intel Core Processor. Their specifications are listed in the table above.
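The override pattern described above (start from known defaults, then replace selected fields) can be sketched in plain Python without Accera installed. `TargetSpec` below is a hypothetical stand-in for illustration only, not `accera.Target`, and the field values are made up rather than taken from any real device entry.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class TargetSpec:
    """Hypothetical stand-in for a target description (not accera.Target)."""
    name: str
    num_cores: int
    extensions: Tuple[str, ...]


# Illustrative "known" device; these values are assumptions, not Accera data.
known = TargetSpec(
    name="Example CPU",
    num_cores=8,
    extensions=("SSE4.1", "SSE4.2", "AVX2", "AVX512"),
)

# Override only the extensions, dropping AVX512; every other field is inherited.
custom = replace(
    known,
    extensions=tuple(e for e in known.extensions if e != "AVX512"),
)

print(custom)
```

Only the overridden field changes; `name` and `num_cores` carry over from the known entry, which mirrors the intent of passing `known_name` together with an explicit `extensions` list.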

Create a predefined GPU target representing an NVIDIA Tesla V100 processor:

v100 = acc.Target(model=acc.Target.Model.NVIDIA_TESLA_V100)

Create a custom GPU target:

gpu_target = acc.Target(name="Custom GPU processor", category=acc.Target.Category.GPU, default_block_size=16)

Additional Notes on Instruction Set Extensions

It is important to identify the number of vector registers and the number of bytes in each SIMD register. These values help you determine whether you are using the vector units of the underlying hardware to their full capability.
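As a rough illustration of why these two values matter, the sketch below computes how many elements fit in one SIMD register and how many fit in the whole register file. The helper names are hypothetical and are not part of the Accera API; the inputs correspond to the `vector_bytes` and `vector_registers` arguments.

```python
def lanes_per_register(vector_bytes: int, element_bytes: int) -> int:
    """Number of elements of the given size that fit in one SIMD register."""
    if vector_bytes <= 0 or element_bytes <= 0:
        raise ValueError("sizes must be positive")
    return vector_bytes // element_bytes


def register_file_elements(vector_bytes: int, vector_registers: int,
                           element_bytes: int) -> int:
    """Total number of elements that all SIMD registers can hold at once."""
    if vector_registers <= 0:
        raise ValueError("vector_registers must be positive")
    return vector_registers * lanes_per_register(vector_bytes, element_bytes)


# Example: 32-byte (256-bit) registers, 16 registers, 4-byte float32 elements.
print(lanes_per_register(32, 4))          # 8
print(register_file_elements(32, 16, 4))  # 128
```

A kernel whose innermost vectorized dimension is not a multiple of the lane count, or whose working set of accumulators exceeds the register file, is unlikely to use the vector units fully.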

AVX

Advanced Vector Extensions (AVX) promotes legacy 128-bit SIMD instructions that operate on XMM registers to use a vector-extension (VEX) prefix and operate on 256-bit YMM registers.

Intel AVX introduced support for 256-bit wide SIMD registers (YMM0-YMM7 in operating modes that are 32-bit or less, YMM0-YMM15 in 64-bit mode). For Accera, 64-bit mode is the default. The lower 128 bits of each YMM register are aliased to the corresponding 128-bit XMM register. In 64-bit mode, Intel AVX therefore provides 16 YMM registers, each 256 bits (32 bytes) wide, whose lower halves are the 16 XMM registers.

AVX512

AVX-512 is a further extension offering 32 ZMM registers, and each SIMD register is 512 bits (64 bytes) wide.

SSE4 Extension

In 64-bit mode there are 16 XMM registers (XMM0 to XMM15), each 128 bits (16 bytes) wide. The eight additional registers, XMM8-XMM15, are available only in 64-bit mode and are accessed by using REX prefixes.
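The register figures from the notes above (64-bit mode) can be collected into a small lookup table, which is one way to choose `vector_bytes` and `vector_registers` for a custom target. This is a minimal sketch: the dictionary keys are hypothetical labels chosen for this example and are not necessarily the extension codes Accera accepts.

```python
from typing import Iterable, Optional, Tuple

# extension label -> (bytes per SIMD register, number of SIMD registers),
# taken from the notes above for 64-bit mode. Labels are illustrative only.
SIMD_REGISTERS = {
    "SSE4": (16, 16),    # 128-bit XMM0-XMM15
    "AVX": (32, 16),     # 256-bit YMM0-YMM15
    "AVX512": (64, 32),  # 512-bit ZMM registers
}


def widest_extension(extensions: Iterable[str]) -> Optional[Tuple[str, int, int]]:
    """Return (label, vector_bytes, vector_registers) for the widest known
    extension in `extensions`, or None if none of them is in the table."""
    best = None
    for ext in extensions:
        info = SIMD_REGISTERS.get(ext)
        if info is None:
            continue  # label not covered by the notes above
        if best is None or info[0] > best[1]:
            best = (ext, info[0], info[1])
    return best


print(widest_extension(["SSE4", "AVX"]))
print(widest_extension(["AVX512", "AVX"]))
```

The returned byte width and register count are the values you would pass as `vector_bytes` and `vector_registers` when describing a target that supports those extensions.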


Last update: 2023-04-17