
Releasing SuperBench v0.5

Peng Cheng
SuperBench Team

We are very happy to announce that SuperBench 0.5.0 is officially released today!

You can install and try SuperBench by following the Getting Started Tutorial.

SuperBench 0.5.0 Release Notes

Micro-benchmark Improvements

  • Support NIC-only NCCL bandwidth benchmark on a single node in the NCCL/RCCL bandwidth test.
  • Support bi-directional bandwidth benchmark in GPU copy bandwidth test.
  • Support data checking in GPU copy bandwidth test.
  • Update the rccl-tests submodule to fix a divide-by-zero error.
  • Add GPU-Burn micro-benchmark.

Model-benchmark Improvements

  • Sync results on root rank for e2e model benchmarks in distributed mode.
  • Support customized environment variables (env) in local and torch.distributed modes.
  • Add support for pytorch>=1.9.0.
  • Keep BatchNorm in fp32 for PyTorch CNN models cast to fp16.
  • Remove the type-conversion time for FP16 samples.
  • Support FAMBench.

Inference Benchmark Improvements

  • Revise the default settings for inference benchmarks.
  • Add percentile metrics for inference benchmarks.
  • Support T4 and A10 in GEMM benchmark.
  • Add a configuration with inference benchmarks.
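
As an illustration of the percentile metrics mentioned above, here is a minimal sketch of how tail-latency percentiles can be derived from per-request latencies. It uses the nearest-rank definition, which is an assumption; the release notes do not say which percentile method SuperBench uses. The function names `percentile` and `summarize_latencies`, the metric keys, and the sample data are all hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Rank is 1-based; multiply before dividing to limit floating-point error.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def summarize_latencies(latencies_ms, percentiles=(50, 90, 95, 99)):
    """Return a dict of hypothetical metric names (p50, p90, ...) to latency values."""
    return {f"p{p}": percentile(latencies_ms, p) for p in percentiles}


# Made-up latencies in milliseconds, with one slow outlier.
latencies_ms = [12.0, 11.5, 13.2, 11.9, 40.3, 12.4, 12.1, 11.8, 12.6, 12.2]
summary = summarize_latencies(latencies_ms)
print(summary)
```

With data like this, the mean is pulled up by the single outlier, while the median stays near the typical latency and the high percentiles expose the tail, which is why percentiles are commonly reported for inference workloads.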

Other Improvements

  • Add a command to list all optional parameters for benchmarks.
  • Unify the benchmark naming convention and support multiple tests of the same benchmark with different parameters/options in one configuration file.
  • Support a timeout to detect benchmark failure and stop the process automatically.
  • Add a ROCm 5.0 Dockerfile.
  • Improve output interface.
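
The timeout behavior listed above can be pictured with a small, generic sketch: run a benchmark as a child process, and if it does not finish within the time limit, stop it and record the failure. This is not SuperBench's implementation; it relies only on Python's standard `subprocess.run`, and the helper name `run_with_timeout` and the result dictionary layout are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; if it exceeds timeout_s seconds, it is killed and reported as a timeout."""
    try:
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before raising.
        return {"status": "timeout", "return_code": None}
    status = "ok" if completed.returncode == 0 else "failed"
    return {"status": status, "return_code": completed.returncode}


# A stand-in for a hung benchmark: sleeps far longer than the allowed time.
hung = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1)
# A stand-in for a healthy benchmark that finishes quickly.
quick = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=10)
print(hung["status"], quick["status"])
```

Treating a timeout as a distinct status, rather than as a generic failure, makes it easier to tell a hung benchmark apart from one that exited with an error.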

Data Diagnosis and Analysis

  • Support multi-benchmark check.
  • Support result summary in Markdown, HTML, and Excel formats.
  • Support data diagnosis in Markdown and HTML formats.
  • Support result output for all nodes in data diagnosis.