# Result Summary

## Introduction

This tool generates a readable summary report from the raw benchmark results of a single machine or multiple machines.
## Usage

1. Install SuperBench on the local machine.
2. Prepare the raw data and rule file on the local machine.
3. Generate the result summary automatically using the `sb result summary` command. The detailed command can be found in SuperBench CLI.
   ```bash
   sb result summary --data-file ./results-summary.jsonl --rule-file ./rule.yaml --output-file-format md --output-dir ${output-dir}
   ```
4. Find the output result file named `results-summary.md` under `${output-dir}`.
## Input

The input includes two files:

- Raw Data: a JSON Lines (jsonl) file containing multiple nodes' results, generated automatically by the SuperBench runner (see the hypothetical excerpt after this list).

  > **Tip:** The raw data file can be found at `${output-dir}/results-summary.jsonl` after each successful run.

- Rule File: a YAML file that defines how to generate the result summary, including how to classify the metrics and which statistical methods (p50, mean, etc.) are applied.
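For illustration, the raw data file contains one JSON object per line. The excerpt below is hypothetical (the `node` field and the exact metric names are assumptions; they depend on your cluster and configuration):

```json
{"node": "node-0", "kernel-launch/event_time:0": 0.0055, "kernel-launch/event_time:1": 0.0057}
{"node": "node-1", "kernel-launch/event_time:0": 0.0056, "kernel-launch/event_time:1": 0.0054}
```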
## Rule File

This section describes how to write rules in the rule file.

The convention is the same as the SuperBench Config File; please review it first.
Here is an overview of the rule file structure:
```yaml
version: string
superbench:
  rules:
    ${rule_name}:
      statistics:
        - ${statistic_name}
      categories: string
      aggregate: (optional)[bool|string]
      metrics:
        - ${benchmark_name}/regex
        - ${benchmark_name}/regex
```
```yaml
# SuperBench rules
version: v0.11
superbench:
  rules:
    kernel_launch:
      statistics:
        - mean
        - p90
        - min
        - max
      aggregate: True
      categories: KernelLaunch
      metrics:
        - kernel-launch/event_time
        - kernel-launch/wall_time
    nccl:
      statistics: mean
      categories: NCCL
      metrics:
        - nccl-bw/allreduce_8388608_busbw
    ib-loopback:
      statistics: mean
      categories: RDMA
      metrics:
        - ib-loopback/IB_write_8388608_Avg_\d+
      aggregate: ib-loopback/IB_write_.*_Avg_(\d+)
```
This rule file describes the rules used for the result summary.
They are organized by rule name, and each rule mainly includes the following elements:
### metrics

The list of metrics for this rule. Each metric is in the format of `${benchmark_name}/regex`; a regex can be used after the first `/`, but note that the benchmark name cannot be a regex. A matching sketch is shown below.
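As a rough illustration of how such a pattern selects metric names, here is a minimal sketch, assuming the benchmark name is compared literally while the remainder is treated as a regex (`metric_matches` is a hypothetical helper, not part of SuperBench):

```python
import re

def metric_matches(rule_metric: str, metric_name: str) -> bool:
    """Hypothetical helper: the benchmark name (before the first '/') must
    match literally; the remainder of the rule entry is treated as a regex."""
    benchmark, _, pattern = rule_metric.partition('/')
    name_benchmark, _, rest = metric_name.partition('/')
    return benchmark == name_benchmark and re.fullmatch(pattern, rest) is not None

print(metric_matches(r'ib-loopback/IB_write_8388608_Avg_\d+',
                     'ib-loopback/IB_write_8388608_Avg_0'))   # True
print(metric_matches(r'ib-loopback/IB_write_8388608_Avg_\d+',
                     'nccl-bw/allreduce_8388608_busbw'))      # False
```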
### categories

A user-defined category name (a string) that the rule belongs to, used to classify and organize the metrics.
### aggregate

Determines whether to aggregate the benchmark results from multiple devices and treat them as one collection, e.g., aggregating the kernel-launch overhead results from 8 GPU devices into one collection.

The value of this item should be a bool or a regex pattern string:

- bool:
  - `False` (default): no aggregation.
  - `True`: aggregate the results of multiple ranks. In detail, metric names in `metrics` of the form `metric:\d+` will be aggregated into `metric` for most micro-benchmark metrics.
- regex pattern string: aggregate the results using the pattern string, which is matched against the metric names in `metrics`. In detail, the part of the metric name that matches the contents of `()` in the pattern will be replaced with `*`, while the other parts of the name remain unchanged. A sketch of this renaming is shown after this list.
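The renaming behavior can be sketched as follows; this is a minimal illustration of the rules described above, not SuperBench's actual implementation (`aggregate_name` is a hypothetical helper):

```python
import re

def aggregate_name(metric: str, aggregate) -> str:
    """Map a per-rank metric name to its aggregated name (illustrative only)."""
    if aggregate is True:
        # 'kernel-launch/event_time:3' -> 'kernel-launch/event_time'
        return re.sub(r':\d+$', '', metric)
    if isinstance(aggregate, str):
        # Replace whatever the capture group matched with '*'.
        m = re.match(aggregate, metric)
        if m and m.groups():
            start, end = m.span(1)
            return metric[:start] + '*' + metric[end:]
    return metric

print(aggregate_name('kernel-launch/event_time:3', True))
# -> kernel-launch/event_time
print(aggregate_name('ib-loopback/IB_write_8388608_Avg_3',
                     r'ib-loopback/IB_write_.*_Avg_(\d+)'))
# -> ib-loopback/IB_write_8388608_Avg_*
```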
### statistics

The list of statistical functions used for this rule to compute statistics over the results from multiple nodes/ranks.

The following statistical functions are supported:

- `count`
- `max`
- `mean`
- `min`
- `p${value}`: `${value}` can be 1-99. For example, p50, p90, etc.
- `std`
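Applying these statistics across nodes is conceptually similar to the following pandas sketch (illustrative only; the column name and values are hypothetical):

```python
import pandas as pd

# One row per node, one column per metric (hypothetical values).
df = pd.DataFrame({'kernel-launch/event_time': [0.0055, 0.0056, 0.0060, 0.0052]})

print(df.mean())          # mean across nodes
print(df.min())           # min
print(df.max())           # max
print(df.quantile(0.90))  # p90
print(df.std())           # std
print(df.count())         # count of non-missing values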
## Output

Different output formats are supported for the result summary, including Markdown, HTML, etc. The output includes the metrics grouped by category and their values obtained by applying the statistical methods to all raw results.
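For illustration only, a Markdown summary might look roughly like the excerpt below; the exact layout and all values here are hypothetical and depend on your rules and data:

```markdown
| Category     | Metric                    | mean   | p90    | min    | max    |
|--------------|---------------------------|--------|--------|--------|--------|
| KernelLaunch | kernel-launch/event_time  | 0.0056 | 0.0059 | 0.0052 | 0.0060 |
| KernelLaunch | kernel-launch/wall_time   | 0.0096 | 0.0099 | 0.0092 | 0.0100 |
```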