Releasing SuperBench v0.5
We are very happy to announce that SuperBench version 0.5.0 is officially released today!
You can install and try SuperBench by following the Getting Started Tutorial.

SuperBench 0.5.0 Release Notes

Micro-benchmark Improvements
- Support NIC-only NCCL bandwidth benchmark on a single node in NCCL/RCCL bandwidth test.
- Support bi-directional bandwidth benchmark in GPU copy bandwidth test.
- Support data checking in GPU copy bandwidth test.
- Update rccl-tests submodule to fix a divide-by-zero error.
- Add GPU-Burn micro-benchmark.

Model-benchmark Improvements
- Sync results on root rank for e2e model benchmarks in distributed mode.
- Support customized env in local and torch.distributed mode.
- Add support for pytorch>=1.9.0.
- Keep BatchNorm as fp32 for PyTorch CNN models cast to fp16.
- Remove the type conversion time for FP16 samples.
- Support FAMBench.

Inference Benchmark Improvements
- Revise the default settings for inference benchmarks.
- Add percentile metrics for inference benchmarks.
- Support T4 and A10 in GEMM benchmark.
- Add a configuration with inference benchmarks.
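
To illustrate what percentile metrics add over a plain average, here is a minimal sketch that summarizes a list of per-request latencies. The function names, the `p50`/`p90`/`p95`/`p99` labels, and the nearest-rank method are assumptions made for illustration; the release notes do not state how SuperBench computes or names these metrics.

```python
def percentile(samples, pct):
    """Return the nearest-rank percentile of samples (pct is an integer in 1..100).

    Hypothetical helper for illustration; not SuperBench's implementation.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not isinstance(pct, int) or not 1 <= pct <= 100:
        raise ValueError("pct must be an integer between 1 and 100")
    ordered = sorted(samples)
    # Nearest-rank: smallest rank r such that r / n >= pct / 100.
    # Integer ceiling division avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize_latencies(latencies_ms, percentiles=(50, 90, 95, 99)):
    """Map labels such as 'p99' to the corresponding latency value."""
    return {f"p{p}": percentile(latencies_ms, p) for p in percentiles}
```

Tail percentiles such as p99 expose occasional slow requests that a mean would hide, which is typically why they are reported for inference workloads.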

Other Improvements
- Add a command to list all optional parameters for benchmarks.
- Unify the benchmark naming convention and support multiple tests with the same benchmark but different parameters/options in one configuration file.
- Support a timeout to detect benchmark failure and stop the process automatically.
- Add ROCm 5.0 Dockerfile.
- Improve output interface.
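
The timeout item above can be pictured with a small sketch built on Python's standard `subprocess` module. This is a generic illustration, not SuperBench's code: the function name `run_with_timeout` and the returned dictionary layout are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report 'ok', 'failed', or 'timeout' (hypothetical helper)."""
    try:
        # subprocess.run kills the child process and raises TimeoutExpired
        # if it has not finished within timeout_s seconds.
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "returncode": None, "stdout": ""}
    status = "ok" if completed.returncode == 0 else "failed"
    return {
        "status": status,
        "returncode": completed.returncode,
        "stdout": completed.stdout,
    }
```

For example, `run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)` reports a timeout instead of blocking for the full five seconds, so a hung benchmark is recorded as a failure rather than stalling the whole run.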

Data Diagnosis and Analysis
- Support multi-benchmark check.
- Support result summary in Markdown, HTML, and Excel formats.
- Support data diagnosis in Markdown and HTML formats.
- Support result output for all nodes in data diagnosis.