MLPerf

MLPerf is a consortium of AI leaders from academia, research labs, and industry whose mission is to “build fair and useful benchmarks” that provide unbiased evaluations of training and inference performance for hardware, software, and services—all conducted under prescribed conditions. To stay on the cutting edge of industry trends, MLPerf continues to evolve, holding new tests at regular intervals and adding new workloads that represent the state of the art in AI.

System Requirements

This is a GPU-specific workload and requires high-performance graphics cards to run. It is recommended that the system under test have a high-performing NVIDIA (e.g. M60 or higher) or AMD (e.g. MI25 or higher) graphics card.

Supported Hardware Systems

The following section lists the hardware systems/SKUs on which the MLPerf workload runs effectively in cloud environments. These systems contain the GPU components that the MLPerf workload is designed to test.

  • Datacenter systems MLPerf Inference

    • A100-SXM-80GBx8 (NVIDIA DGX A100, 80GB variant)
    • A100-SXM-80GBx4 (NVIDIA DGX Station A100, "Red October", 80GB variant)
    • A100-PCIex8 (80GB variant)
    • A2x2
    • A30x8
  • Edge Systems MLPerf Inference

    • A100-SXM-80GBx1
    • A100-PCIex1 (80 GB variant)
    • A30x1
    • A2x1
    • Orin
    • Xavier NX
  • Supported Config Files for MLPerf BERT Training (config_{nodes}x{gpus per node}x{local batch size}x{gradient accumulation}.sh)

    • config_A30_1x2x224x14.sh
    • config_DGXA100_1x4x56x2.sh
    • config_DGXA100_1x8x56x1.sh
    • config_DGXA100_4gpu_common.sh
    • config_DGXA100_512x8x2x1_pack.sh
    • config_DGXA100_8x8x48x1.sh
    • config_DGXA100_common.sh

Source: link

Additional details on whether a system is supported can be found in the documentation here; for each benchmark, check its respective implementation folder: https://github.com/mlcommons/training_results_v2.1/tree/main/NVIDIA/benchmarks https://github.com/mlcommons/inference_results_v2.0/tree/master/closed/NVIDIA
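
The BERT training config file names listed above encode the run topology in the name itself. The following is a minimal sketch, not part of MLPerf or Virtual Client, of how those fields could be pulled out of a file name. It assumes the names carry a system prefix (e.g. A30, DGXA100) before the four numeric fields and an optional suffix such as _pack, as the listed files do. The names BertConfig and parse_config_name are hypothetical. Shared files such as config_DGXA100_common.sh do not follow the pattern and yield None.

```python
import re
from typing import NamedTuple, Optional


class BertConfig(NamedTuple):
    """Fields encoded in a BERT training config file name (hypothetical helper type)."""
    system: str
    nodes: int
    gpus_per_node: int
    local_batch_size: int
    gradient_accumulation: int

    @property
    def total_gpus(self) -> int:
        # Total GPUs across all nodes.
        return self.nodes * self.gpus_per_node


# Assumed layout: config_<system>_<nodes>x<gpus>x<batch>x<accum>[_<suffix>].sh
_PATTERN = re.compile(
    r"^config_(?P<system>[A-Za-z0-9]+)_"
    r"(?P<nodes>\d+)x(?P<gpus>\d+)x(?P<batch>\d+)x(?P<accum>\d+)"
    r"(?:_[A-Za-z0-9]+)?\.sh$"
)


def parse_config_name(filename: str) -> Optional[BertConfig]:
    """Return the encoded fields, or None if the name does not match the pattern."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return BertConfig(
        system=match["system"],
        nodes=int(match["nodes"]),
        gpus_per_node=int(match["gpus"]),
        local_batch_size=int(match["batch"]),
        gradient_accumulation=int(match["accum"]),
    )


if __name__ == "__main__":
    for name in ("config_DGXA100_1x8x56x1.sh", "config_DGXA100_common.sh"):
        print(name, "->", parse_config_name(name))
```

For example, config_DGXA100_512x8x2x1_pack.sh parses to 512 nodes with 8 GPUs each, which is 4096 GPUs in total.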

What is Being Measured?

GPU performance across a wide range of inference models and, for BERT, training. Work is planned for integrating support for the remaining training models as well.

  • Training Benchmarks

    • bert
    • dlrm (not supported yet)
    • maskrcnn (not supported yet)
    • minigo (not supported yet)
    • resnet (not supported yet)
    • rnnt (not supported yet)
    • ssd (not supported yet)
    • unet3 (not supported yet)
  • Inference Benchmarks

    • bert
    • rnnt
    • ssd-mobilenet
    • ssd-resnet34
    • resnet50 (not supported yet)
    • DLRM (not supported yet)
    • 3D UNET (not supported yet)

Workload Metrics MLPerf Inference

The following metrics are examples of those captured by the Virtual Client when running the MLPerf Inference workload.

| Scenario | Metric Name | Example Value (min) | Example Value (max) | Example Value (avg) | Unit |
|---|---|---|---|---|---|
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_9_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_9_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_9_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_9_MaxP-Server-PerformanceMode | 0.0 | 1.0 | 0.5333333333333333 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Server-PerformanceMode | 0.0 | 1.0 | 0.8333333333333334 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Server-PerformanceMode | 0.0 | 1.0 | 0.7954545454545454 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_9_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_9_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_9_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_9_MaxP-Server-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| bert | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Server-PerformanceMode | 0.0 | 1.0 | 0.9680851063829787 | VALID/INVALID |
| rnnt | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| rnnt | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| rnnt | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| rnnt | DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_MaxP-Server-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| ssd-mobilenet | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| ssd-mobilenet | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| ssd-mobilenet | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| ssd-mobilenet | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT-lwis_k_99_MaxP-Server-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Offline-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Offline-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Server-AccuracyMode | 1.0 | 1.0 | 1.0 | PASS/FAIL |
| ssd-resnet34 | DGX-A100_A100-SXM4-40GBx8_TRT_Triton-triton_k_99_MaxP-Server-PerformanceMode | 1.0 | 1.0 | 1.0 | VALID/INVALID |
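
Each inference metric name above ends in a scenario (Offline or Server) and a mode (AccuracyMode or PerformanceMode), and the values appear to encode the outcome as 1.0 or 0.0. The following sketch illustrates one way to split such a name and to read a fractional average. It rests on two assumptions that the source does not state outright: that the last two hyphen-separated fields are always scenario and mode, and that 1.0 stands for PASS/VALID and 0.0 for FAIL/INVALID, so the average is the share of passing runs. The function names are hypothetical, and the run counts in the usage example are illustrative only.

```python
from typing import NamedTuple, Sequence


class MetricName(NamedTuple):
    """Parts of an inference metric name (hypothetical helper type)."""
    config: str    # e.g. DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_9_MaxP
    scenario: str  # Offline or Server
    mode: str      # AccuracyMode or PerformanceMode


def split_metric_name(name: str) -> MetricName:
    """Split on the last two hyphens; the config part may itself contain hyphens."""
    config, scenario, mode = name.rsplit("-", 2)
    return MetricName(config, scenario, mode)


def pass_rate(flags: Sequence[float]) -> float:
    """Average of 1.0/0.0 outcome flags (assumed encoding), i.e. the share of passing runs."""
    if not flags:
        raise ValueError("no runs recorded")
    return sum(flags) / len(flags)


if __name__ == "__main__":
    name = "DGX-A100_A100-SXM4-40GBx8_TRT-custom_k_99_9_MaxP-Server-PerformanceMode"
    print(split_metric_name(name))
    # Illustrative only: 8 valid runs out of 15 would give an average near 0.533.
    print(pass_rate([1.0] * 8 + [0.0] * 7))
```

Read this way, an average below 1.0 with a minimum of 0.0, as in several Server PerformanceMode rows, would indicate that some runs were reported INVALID.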

Workload Metrics MLPerf Training

| Scenario | Metric Name | Example Value (min) | Example Value (max) | Example Value (avg) | Unit |
|---|---|---|---|---|---|
| training-mlperf-bert-batchsize-45-gpu-8 | eval_mlm_accuracy | 0.650552854 | 0.672552854 | 0.662552854 | % |
| training-mlperf-bert-batchsize-45-gpu-8 | e2e_time | 1071.040571 | 1078.040571 | 1074.040571 | s |
| training-mlperf-bert-batchsize-45-gpu-8 | training_sequences_per_second | 2288.463615 | 2300.463615 | 2295.463615 | |
| training-mlperf-bert-batchsize-45-gpu-8 | final_loss | 0 | 0 | 0 | |
| training-mlperf-bert-batchsize-45-gpu-8 | raw_train_time | 1053.982237 | 1070.982237 | 1063.982237 | s |