STREAM

STREAM is a synthetic benchmark designed to measure sustainable memory bandwidth and the corresponding computation rate for simple vector kernels. It works on data sets much larger than the available cache, so the results reflect main-memory performance rather than the behavior of any particular platform's cache hierarchy. It has become a de facto industry standard for measuring memory bandwidth.

What is Being Measured?

The STREAM benchmark measures sustainable memory bandwidth (in MB/s where 1 MB = 10^6 bytes, not 2^20 bytes) using four simple vector kernels. These kernels are designed to be simple enough to avoid introducing computational bottlenecks while being complex enough to be representative of real application behavior. Each kernel represents a common pattern found in scientific and engineering applications.

The STREAM benchmark runs the following memory bandwidth tests:

| Bandwidth Benchmark | Description |
|---|---|
| Copy | Measures memory copy operation speeds (a(i) = b(i)) |
| Scale | Measures memory scale operation speeds (a(i) = q*b(i)) |
| Add | Measures memory add operation speeds (a(i) = b(i) + c(i)) |
| Triad | Measures memory triad operation speeds (a(i) = b(i) + q*c(i)) |
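The four kernels and the MB/s calculation can be sketched in a few lines of Python. This is an illustration only, not the STREAM source or the Virtual Client implementation: the function names (`make_arrays`, `bytes_moved`, `best_rate_mb_per_s`) are hypothetical, and the arrays follow the a/b/c naming used in the table above. The byte counts follow the usual STREAM convention of counting two arrays per iteration for Copy and Scale and three for Add and Triad, at 8 bytes per double, with 1 MB = 10^6 bytes. Interpreter overhead dominates in pure Python, so the rates it prints say nothing about real memory bandwidth.

```python
import time
from array import array

BYTES_PER_ELEMENT = 8  # array("d") holds 8-byte double-precision floats


def make_arrays(n):
    """Return three n-element double arrays: a = 0.0, b = 1.0, c = 2.0."""
    return array("d", [0.0]) * n, array("d", [1.0]) * n, array("d", [2.0]) * n


def copy(a, b, c, q):
    for i in range(len(a)):
        a[i] = b[i]


def scale(a, b, c, q):
    for i in range(len(a)):
        a[i] = q * b[i]


def add(a, b, c, q):
    for i in range(len(a)):
        a[i] = b[i] + c[i]


def triad(a, b, c, q):
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


# Kernel name -> (function, number of arrays read or written per iteration).
KERNELS = {
    "Copy": (copy, 2),
    "Scale": (scale, 2),
    "Add": (add, 3),
    "Triad": (triad, 3),
}


def bytes_moved(name, n):
    """Bytes counted for one pass of the named kernel over n elements."""
    return KERNELS[name][1] * BYTES_PER_ELEMENT * n


def best_rate_mb_per_s(name, n, trials=3, q=3.0):
    """Best rate in MB/s (1 MB = 10^6 bytes), taken from the fastest trial."""
    kernel, _ = KERNELS[name]
    a, b, c = make_arrays(n)
    best_time = float("inf")
    for _ in range(trials):
        start = time.perf_counter()
        kernel(a, b, c, q)
        best_time = min(best_time, time.perf_counter() - start)
    return bytes_moved(name, n) / best_time / 1e6


if __name__ == "__main__":
    for name in KERNELS:
        print(f"Best Rate {name}: {best_rate_mb_per_s(name, 100_000):.1f} MB/s")
```

The best rate is derived from the shortest observed time, which is why the metric is named "Best Rate" rather than an average.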

Workload Metrics

The following metrics are examples of those captured by the Virtual Client when running the STREAM workload. Virtual Client supports both the standard STREAM benchmark and the Microsoft-optimized STREAM implementation, which provides additional metrics, including latency measurements and rates for Read and Write operations.

Standard STREAM Metrics

| Metric Name | Example Value (min) | Example Value (max) | Example Value (avg) | Unit |
|---|---|---|---|---|
| Best Rate Add | 8635.5 | 327893.5 | 42849.75 | MB/s |
| Best Rate Copy | 6787.4 | 346279.0 | 30720.40 | MB/s |
| Best Rate Scale | 6747.1 | 320023.2 | 30578.70 | MB/s |
| Best Rate Triad | 10141.2 | 305781.6 | 42735.67 | MB/s |
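As a rough illustration of where the "Best Rate" metrics come from, the sketch below parses a results table in the layout printed by the reference STREAM program (one line per kernel with the best rate followed by average, minimum, and maximum times). It is not Virtual Client's parser: the function name `parse_stream_output` is hypothetical, and the numbers in `SAMPLE_OUTPUT` are invented for the example rather than taken from a real run.

```python
import re

# Invented sample in the layout of the reference STREAM report (not real results).
SAMPLE_OUTPUT = """\
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           10000.0     0.016500     0.016000     0.017000
Scale:           8000.0     0.020500     0.020000     0.021000
Add:            12000.0     0.020400     0.020000     0.021000
Triad:          10000.0     0.024300     0.024000     0.025000
"""

# One kernel per line: name, best rate (MB/s), then avg/min/max time in seconds.
LINE_PATTERN = re.compile(
    r"^(Copy|Scale|Add|Triad):\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$",
    re.MULTILINE,
)


def parse_stream_output(text):
    """Map metric names such as 'Best Rate Copy' to their MB/s values."""
    metrics = {}
    for match in LINE_PATTERN.finditer(text):
        kernel, best_rate = match.group(1), float(match.group(2))
        metrics[f"Best Rate {kernel}"] = best_rate
    return metrics


if __name__ == "__main__":
    for name, value in parse_stream_output(SAMPLE_OUTPUT).items():
        print(f"{name}: {value} MB/s")
```

The timing columns are matched but not returned here; a fuller parser could expose them as additional metrics.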

Microsoft STREAM Metrics

The Microsoft STREAM implementation provides additional detailed metrics, including minimum, average, and best rates for all operations, as well as latency measurements. It also reports Read and Write operations in addition to the four standard STREAM kernels.

Bandwidth Metrics

| Metric Name | Example Value (min) | Example Value (max) | Example Value (avg) | Unit |
|---|---|---|---|---|
| Best Rate Add | 53351.0 | 54544.0 | 54011.50 | MB/s |
| Best Rate Copy | 72073.0 | 74208.0 | 73171.83 | MB/s |
| Best Rate Read | 48497.0 | 51087.0 | 50461.00 | MB/s |
| Best Rate Scale | 72486.0 | 74990.0 | 73716.00 | MB/s |
| Best Rate Triad | 54780.0 | 56567.0 | 55725.50 | MB/s |
| Best Rate Write | 87466.0 | 133326.0 | 116689.67 | MB/s |
| Avg Rate Add | 52848.0 | 54074.0 | 53553.17 | MB/s |
| Avg Rate Copy | 71016.0 | 73323.0 | 72106.67 | MB/s |
| Avg Rate Read | 47981.0 | 50433.0 | 49814.33 | MB/s |
| Avg Rate Scale | 71056.0 | 73315.0 | 72425.50 | MB/s |
| Avg Rate Triad | 53849.0 | 55720.0 | 55032.50 | MB/s |
| Avg Rate Write | 86252.0 | 124882.0 | 105829.00 | MB/s |
| Min Rate Add | 52199.0 | 53658.0 | 52981.17 | MB/s |
| Min Rate Copy | 70192.0 | 72302.0 | 71153.50 | MB/s |
| Min Rate Read | 47600.0 | 50057.0 | 49066.33 | MB/s |
| Min Rate Scale | 70110.0 | 72308.0 | 71063.83 | MB/s |
| Min Rate Triad | 53180.0 | 55407.0 | 54307.67 | MB/s |
| Min Rate Write | 84744.0 | 118017.0 | 96797.83 | MB/s |

Latency Metrics

| Metric Name | Example Value (min) | Example Value (max) | Example Value (avg) | Unit |
|---|---|---|---|---|
| Avg Latency Add | 144.0 | 161.0 | 151.17 | nanoseconds |
| Avg Latency Copy | 155.0 | 183.0 | 164.83 | nanoseconds |
| Avg Latency Read | 144.0 | 152.0 | 147.50 | nanoseconds |
| Avg Latency Scale | 155.0 | 179.0 | 161.83 | nanoseconds |
| Avg Latency Triad | 141.0 | 163.0 | 151.67 | nanoseconds |
| Avg Latency Write | 196.0 | 377.0 | 261.17 | nanoseconds |
| Max Latency Add | 158.0 | 189.0 | 172.67 | nanoseconds |
| Max Latency Copy | 173.0 | 203.0 | 183.83 | nanoseconds |
| Max Latency Read | 156.0 | 181.0 | 168.50 | nanoseconds |
| Max Latency Scale | 169.0 | 205.0 | 178.67 | nanoseconds |
| Max Latency Triad | 157.0 | 193.0 | 171.00 | nanoseconds |
| Max Latency Write | 220.0 | 446.0 | 299.17 | nanoseconds |
| Min Latency Add | 126.0 | 145.0 | 134.50 | nanoseconds |
| Min Latency Copy | 143.0 | 169.0 | 151.67 | nanoseconds |
| Min Latency Read | 121.0 | 135.0 | 128.00 | nanoseconds |
| Min Latency Scale | 137.0 | 159.0 | 145.50 | nanoseconds |
| Min Latency Triad | 128.0 | 151.0 | 135.17 | nanoseconds |
| Min Latency Write | 175.0 | 319.0 | 224.00 | nanoseconds |
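
The three example-value columns in the tables above can be read as the minimum, maximum, and mean of a metric across repeated runs. That reading is an assumption, since the source does not say how the columns were computed. Under that assumption, a minimal sketch of the summary step might look like the following; the function name `summarize` and the sample values are made up for illustration.

```python
from statistics import mean


def summarize(samples):
    """Return (min, max, avg) for a list of metric samples, avg rounded to 2 places."""
    if not samples:
        raise ValueError("at least one sample is required")
    return min(samples), max(samples), round(mean(samples), 2)


if __name__ == "__main__":
    # Hypothetical latency samples in nanoseconds from three runs.
    latency_samples = [150.0, 160.0, 155.0]
    low, high, avg = summarize(latency_samples)
    print(f"min={low} max={high} avg={avg} nanoseconds")
```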