Installation
SuperBench runs validations for AI infrastructure, so you need to prepare one control node, which runs the SuperBench commands, and one or more managed nodes, which are the machines to be validated.
Usually the control node can be a CPU node, while the managed nodes are GPU nodes with high-speed interconnects.
Tips
It is fine if you have only one GPU node and want to try SuperBench on it. The control node and the managed node can be co-located on the same machine.
Control node
Here are the system requirements for the control node.
Requirements
- Latest version of Linux; Ubuntu 18.04 or later is highly recommended.
- Python version 3.6 or later (which can be checked by running python3 --version).
- Pip version 18.0 or later (which can be checked by running python3 -m pip --version).
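The two version checks above can be combined into one small script. This is a minimal sketch, not part of SuperBench: the helper names version_ge and check_min are hypothetical, and the comparison assumes a sort that supports -V (GNU coreutils, as shipped with Ubuntu).

```shell
#!/usr/bin/env bash
# Sketch: compare the control node's Python and pip versions with the minimums above.

# version_ge A B: succeeds if dotted version A >= B (assumes `sort -V` is available).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_min NAME FOUND REQUIRED: print a one-line verdict (hypothetical helper).
check_min() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

if command -v python3 >/dev/null 2>&1; then
    # `python3 --version` prints "Python X.Y.Z"; the second field is the version.
    check_min python "$(python3 --version 2>&1 | awk '{print $2}')" 3.6
    # `python3 -m pip --version` prints "pip X.Y.Z from ..."; the second field is the version.
    pip_version="$(python3 -m pip --version 2>/dev/null | awk '{print $2}')"
    if [ -n "$pip_version" ]; then
        check_min pip "$pip_version" 18.0
    else
        echo "pip: not found"
    fi
else
    echo "python3: not found"
fi
```

The script only reports; it does not install or upgrade anything.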
Note
Windows is not supported due to lack of Ansible support, but you can still use WSL2.
In addition, the control node should be able to access all managed nodes through SSH.
If you are going to use a password instead of a private key for SSH, you also need to install sshpass.
sudo apt-get install sshpass
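Before going further, it can help to confirm that the control node really reaches every managed node over SSH. The following is a minimal sketch under stated assumptions: the function name check_ssh and the SSH_BIN override are hypothetical, the node names are passed as arguments, and BatchMode=yes disables password prompts, so the check covers key-based login only (with password login through sshpass, a node will show as unreachable here).

```shell
#!/usr/bin/env bash
# Sketch: verify that the control node can reach each managed node over SSH.

# SSH_BIN can be overridden for a dry run (for example SSH_BIN=true).
SSH_BIN="${SSH_BIN:-ssh}"

# check_ssh NODE...: try a no-op remote command on each node (hypothetical helper).
# Returns non-zero if at least one node cannot be reached.
check_ssh() {
    local failed=0 node
    for node in "$@"; do
        # BatchMode=yes: never prompt for a password; ConnectTimeout=5: give up after 5 seconds.
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$node" true >/dev/null 2>&1; then
            echo "$node: reachable"
        else
            echo "$node: unreachable"
            failed=1
        fi
    done
    return "$failed"
}

# Run only when node names are given on the command line.
if [ "$#" -gt 0 ]; then
    check_ssh "$@"
fi
```

Saved as, say, check_ssh.sh, it would be invoked with your own host names as arguments; nothing runs when no arguments are given.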
It is also recommended to use venv for virtual environments, but it is not strictly necessary.
# create a new virtual environment
python3 -m venv ./venv
# activate the virtual environment
source ./venv/bin/activate

# exit the virtual environment later
# after you finish running superbench
deactivate
Build
You can clone the source from GitHub and build it.
Note
You should check out the corresponding tag to use a release version, for example,
git clone -b v0.11.0 https://github.com/microsoft/superbenchmark
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark

python3 -m pip install .
make postinstall
After installation, you should be able to run the SB CLI.
sb
Managed nodes
Here are the system requirements for all managed GPU nodes.
Requirements
NVIDIA GPU
- Latest version of Linux; Ubuntu 18.04 or later is highly recommended.
- Compatible GPU drivers should be installed correctly. Driver version can be checked by running nvidia-smi.
- Docker CE version 20.10 or later (which can be checked by running docker --version).
- NVIDIA GPU support in Docker; install nvidia-container-toolkit.
AMD GPU
- Latest version of Linux; Ubuntu 18.04 or later is highly recommended.
- Compatible GPU drivers should be installed correctly, and group permission should be set to access GPU resources. You should be able to run rocm-smi and rocminfo directly to check GPU usage and information.
- Docker CE version 20.10 or later (which can be checked by running docker --version).
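As a quick sanity check on a managed node, the tools named in the requirements above can be looked up on PATH in one pass. This is a rough sketch, not a SuperBench feature: the helpers have, report, and tools_for and the GPU_VENDOR variable are hypothetical names introduced for illustration, and the script checks only that the commands exist, not their versions or that the GPU driver works.

```shell
#!/usr/bin/env bash
# Sketch: report which managed-node prerequisites can be found on this machine.

# have CMD: succeeds if the shell can resolve CMD (binary on PATH, builtin, or function).
have() {
    command -v "$1" >/dev/null 2>&1
}

# report CMD...: print found/missing per command; return non-zero if any is missing.
report() {
    local missing=0 cmd
    for cmd in "$@"; do
        if have "$cmd"; then
            echo "$cmd: found"
        else
            echo "$cmd: missing"
            missing=1
        fi
    done
    return "$missing"
}

# tools_for VENDOR: print the commands the requirements mention for that vendor.
tools_for() {
    case "$1" in
        amd) echo "docker rocm-smi rocminfo" ;;
        *)   echo "docker nvidia-smi" ;;
    esac
}

# GPU_VENDOR is a hypothetical switch: "nvidia" (default) or "amd".
# Word splitting of the tool list is intentional here.
report $(tools_for "${GPU_VENDOR:-nvidia}") || echo "Some prerequisites are missing."
```

On an AMD node it would be run with GPU_VENDOR=amd set in the environment.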