Result Summary

Introduction

This tool generates a readable summary report from the raw benchmark results of one or more machines.

Usage

  1. Install SuperBench on the local machine.

  2. Prepare the raw data and rule file on the local machine.

  3. Generate the result summary automatically using the sb result summary command (a format variant is shown after these steps). The detailed command reference can be found in SuperBench CLI.

    sb result summary --data-file ./results-summary.jsonl --rule-file ./rule.yaml --output-file-format md --output-dir ${output_dir}
  4. Find the output result file named 'results-summary.md' under ${output_dir}.
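The Output section below notes that formats such as markdown and html are supported; assuming the --output-file-format flag accepts html as a value, the same command can produce an HTML report instead:

    sb result summary --data-file ./results-summary.jsonl --rule-file ./rule.yaml --output-file-format html --output-dir ${output_dir}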

Input

The input consists of two files:

  • Raw Data: a JSONL file containing multiple nodes' results, generated automatically by the SuperBench runner (an illustrative line is shown after this list).

    Tip: the raw data file can be found at ${output_dir}/results-summary.jsonl after each successful run.

  • Rule File: a YAML file that defines how to generate the result summary, including how to classify the metrics and which statistical methods (P50, mean, etc.) to apply.
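For illustration, each line of the raw data file is a single JSON object holding one node's results; the exact keys depend on which benchmarks were run, so the line below is hypothetical (the ':N' suffixes are per-rank results, as discussed under aggregate):

    {"node": "node-0", "kernel-launch/event_time:0": 0.0055, "kernel-launch/event_time:1": 0.0056, "nccl-bw/allreduce_8388608_busbw:0": 230.5}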

Rule File

This section describes how to write rules in the rule file.

The conventions are the same as in the SuperBench Config File; please review it first.

Here is an overview of the rule file structure:

Scheme

    version: string
    superbench:
      rules:
        ${rule_name}:
          statistics:
            - ${statistic_name}
          categories: string
          aggregate: (optional)[bool|string]
          metrics:
            - ${benchmark_name}/regex
            - ${benchmark_name}/regex

Example

    # SuperBench rules
    version: v0.11
    superbench:
      rules:
        kernel_launch:
          statistics:
            - mean
            - p90
            - min
            - max
          aggregate: True
          categories: KernelLaunch
          metrics:
            - kernel-launch/event_time
            - kernel-launch/wall_time
        nccl:
          statistics: mean
          categories: NCCL
          metrics:
            - nccl-bw/allreduce_8388608_busbw
        ib-loopback:
          statistics: mean
          categories: RDMA
          metrics:
            - ib-loopback/IB_write_8388608_Avg_\d+
          aggregate: ib-loopback/IB_write_.*_Avg_(\d+)

This rule file describes the rules used for the result summary.

Rules are organized by rule name, and each rule includes the following elements:

metrics

The list of metrics for this rule. Each metric is in the format ${benchmark_name}/regex: a regex can be used after the first '/', but note that the benchmark name itself cannot be a regex.
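For example (metric names borrowed from the rule file example above, so they are illustrative), a rule can mix a literal metric name with a regex that matches a whole family of metrics:

    metrics:
      - kernel-launch/event_time
      - ib-loopback/IB_write_\d+_Avg_\d+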

categories

A user-defined category name (a string) for the rule, used to classify and organize the metrics.

aggregate

Whether to aggregate the benchmark results from multiple devices and treat them as one collection, e.g. aggregating the kernel-launch overhead results from 8 GPU devices into a single collection.

The value of this item should be a bool or a regex pattern string (a sketch of the renaming behavior follows this list):

  • bool:
    • False (default): no aggregation.
    • True: aggregate the results of multiple ranks. In detail, metric names in metrics of the form 'metric:\d+' are collapsed into 'metric'; this covers most micro-benchmark metrics.
  • regex pattern string: aggregate the results using the pattern string, which is matched against the metric names in metrics. In detail, the part of a metric name that matches the contents of '()' in the pattern is turned into '*', while the rest of the name remains unchanged.
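As a minimal sketch of the renaming behavior described above, written in Python against the metric names from the example rule file (an illustration of the rule semantics, not SuperBench's actual implementation):

    import re

    def aggregate_name(metric, pattern=None):
        # pattern=None emulates `aggregate: True`: strip the ':rank' suffix.
        if pattern is None:
            return re.sub(r':\d+$', '', metric)
        # Pattern form: replace each '()' capture with '*', keep the rest.
        match = re.match(pattern, metric)
        if not match:
            return metric
        name = metric
        for i in range(match.lastindex or 0, 0, -1):  # right to left
            start, end = match.span(i)
            name = name[:start] + '*' + name[end:]
        return name

    # aggregate: True -> per-rank metrics fold into one collection name.
    print(aggregate_name('kernel-launch/event_time:3'))
    # kernel-launch/event_time

    # aggregate: ib-loopback/IB_write_.*_Avg_(\d+)
    print(aggregate_name('ib-loopback/IB_write_8388608_Avg_0',
                         r'ib-loopback/IB_write_.*_Avg_(\d+)'))
    # ib-loopback/IB_write_8388608_Avg_*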

statistics

A list of statistical functions applied by this rule to compute statistics over the results from multiple nodes/ranks.

The supported statistical functions are listed below, followed by a short illustration:

  • count
  • max
  • mean
  • min
  • p${value}: ${value} can be 1-99. For example, p50, p90, etc.
  • std
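As a quick illustration of what these functions compute over a set of per-node values (a sketch using numpy, not SuperBench's internal code; the numbers are made up):

    import numpy as np

    values = np.array([0.0052, 0.0055, 0.0056, 0.0060])  # hypothetical results

    summary = {
        'count': values.size,
        'max': values.max(),
        'mean': values.mean(),
        'min': values.min(),
        'p50': np.percentile(values, 50),
        'p90': np.percentile(values, 90),
        'std': values.std(),
    }
    print(summary)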

Output

Different output formats are supported for the result summary, including markdown, html, etc. The output contains the metrics grouped by category, together with the values obtained by applying the statistical methods to all raw results.
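For instance, a markdown report would roughly group metrics by category, with the requested statistics as columns. The layout below is illustrative only and the values are made up:

    KernelLaunch

    | metric                   | mean   | p90    | min    | max    |
    |--------------------------|--------|--------|--------|--------|
    | kernel-launch/event_time | 0.0055 | 0.0059 | 0.0052 | 0.0060 |
    | kernel-launch/wall_time  | 0.0100 | 0.0105 | 0.0095 | 0.0110 |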