CLI Reference
submit-aml
Submit a job to be run on Azure Machine Learning.
Unrecognized arguments are not parsed by the CLI; they are propagated to the script unchanged.
```shell
submit-aml \
  --script run.py \
  --experiment-name "my-experiment" \
  --mount "vindr_dir=VINDR-CXR-V2" \
  --my-script-arg "hello"
```
Usage: submit-aml [OPTIONS]
Options:
-e, --experiment-name TEXT Name of the Azure ML experiment to which the
job will be submitted. If not provided, the
name of the current directory is used.
-r, --run-name TEXT Display name of the Azure ML run.
--workspace TEXT Name of the Azure ML workspace.
-g, --resource-group TEXT Name of the Azure ML resource group.
--subscription TEXT Subscription ID of the workspace.
--description TEXT Description for the Azure ML job. If not
provided, the local command will be used.
-c, --compute-target TEXT Name of the Azure ML compute target to run
the job on.
-i, --docker-image TEXT Base Docker image to use for the job.
[default: mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.8-cudnn8-ubuntu22.04]
--build-context / --no-build-context
Whether to build a Docker context from the
project directory. [default: build-context]
--docker-run TEXT Extra command to run in Docker build before
syncing the environment.
--aml-environment TEXT Name of an existing Azure ML environment to
use for the job. If provided, the Docker
image and build context arguments will be
ignored.
--shared-memory INTEGER Amount of shared memory for the Docker
container (in GB). [default: 256]
-n, --num-nodes INTEGER Number of nodes to use for the job.
[default: 1]
-d, --download TEXT Azure ML dataset or job output folder to
download. To download an Azure ML dataset, the
argument takes the form 'alias=name:version'; for
example: 'vindr_dir=VINDR-CXR-V2:1'. If the
version is omitted, the latest one is used. To
download the output folder of a previous job, the
argument takes the form
'alias=job_dir:<job_id>:<path/in/job/outputs>';
for example:
'checkpoint=job_dir:crusty_hat_43s6lmvb25:outputs/checkpoint-10000'.
The alias can be used to pass input datasets to
the script, e.g., '${{inputs.vindr_dir}}' or
'${{inputs.checkpoint}}'. This option can be
used multiple times.
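As a sketch, a submission that downloads both a dataset and a previous job's checkpoint and passes them to the script (the script and its `--data-dir`/`--resume-from` arguments are hypothetical; unrecognized arguments are propagated to the script):

```shell
# Hypothetical invocation; train.py and its arguments are placeholders.
submit-aml \
  --script train.py \
  --download "vindr_dir=VINDR-CXR-V2:1" \
  --download "checkpoint=job_dir:crusty_hat_43s6lmvb25:outputs/checkpoint-10000" \
  --data-dir '${{inputs.vindr_dir}}' \
  --resume-from '${{inputs.checkpoint}}'
```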
-m, --mount TEXT Azure ML dataset or job output folder to
mount. For an Azure ML dataset, the alias,
name and version should be provided; for a
job output folder, the alias, job ID and path
within the job outputs should be provided.
See the --download option for more
information.
-o, --output TEXT Alias, datastore and path to folder into
which outputs will be written, expressed as
"alias=datastore/path/to/dir". For example:
"out_dir=mydatastore/my_dataset". The alias
can be used to pass outputs to the script,
e.g., "${{outputs.out_dir}}". See the
example for more information. This option
can be used multiple times.
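A minimal sketch of writing outputs to a datastore folder and handing the resolved path to the script (the script and its `--save-dir` argument are hypothetical):

```shell
# Hypothetical invocation; train.py and --save-dir are placeholders.
submit-aml \
  --script train.py \
  --output "out_dir=mydatastore/my_dataset" \
  --save-dir '${{outputs.out_dir}}'
```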
--command-prefix TEXT Prefix to prepend to the command. For
example, `uv run`. [default: uv run --no-default-groups]
--executable TEXT The executable, e.g., `python`, `'torchrun
--nproc-per-node auto'`, `bash`, or
`nvidia-smi`. [default: python]
-s, --script PATH Path to the script that will be run on Azure
ML.
--sweep TEXT Azure ML hyperparameter for sweep jobs.
Examples: "seed=[0, 1, 2]",
"model/unet=['tiny', 'small']",
"+trainer.max_epochs=[10, 20]",
"model.learning_rate=[1.0e-4, 2.0e-4]". If a
`--sweep-prefix` is passed, the sweep
arguments will be added to the command with
the prefix. The keys are adapted to be
compatible with Azure ML Inputs and will be
available as environment variables in the
job. For the examples above, the environment
variables will be `AZUREML_SWEEP_seed`,
`AZUREML_SWEEP_model_unet`,
`AZUREML_SWEEP_trainer_max_epochs`, and
`AZUREML_SWEEP_model_learning_rate`.
--sweep-prefix TEXT Prefix to prepend to the sweep arguments in
the command. If not provided, the sweep
arguments will not be added to the command.
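The key-to-environment-variable mapping described above can be sketched in shell. The exact sanitization rule is an assumption inferred from the listed examples: a leading `+` is dropped, and `.` and `/` are replaced with underscores.

```shell
# Sketch of the sweep-key sanitization inferred from the examples above
# (assumed rule: drop '+', map '.' and '/' to '_').
for key in "seed" "model/unet" "+trainer.max_epochs" "model.learning_rate"; do
  sanitized=$(printf '%s' "$key" | tr -d '+' | tr './' '__')
  echo "AZUREML_SWEEP_${sanitized}"
done
```

This prints `AZUREML_SWEEP_seed`, `AZUREML_SWEEP_model_unet`, `AZUREML_SWEEP_trainer_max_epochs`, and `AZUREML_SWEEP_model_learning_rate`, matching the variable names listed for the `--sweep` examples.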
--max-concurrent-trials INTEGER
Maximum number of concurrent trials for the
sweep job.
-l, --stream-logs Wait for completion and stream the logs of
the job.
--source-dir PATH Path to the directory containing the source
code for the job. If not provided, the
current directory is used.
-P, --project-dir PATH Directory containing a pyproject.toml,
uv.lock and .python-version file. These
files will be used to build the Docker
image. If not provided, the current
directory is used.
--num-gpus INTEGER Number of requested GPUs per node. This
should typically match the number of GPUs in
the compute target. If provided, the
`PyTorchDistribution` will be selected.
Otherwise, the `MpiDistribution` will be
used and `--executable` should be set to
`'torchrun --nproc-per-node auto'` for
multi-GPU PyTorch runs. Must not be set for
Lightning jobs. More information at
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-train-distributed-gpu?view=azureml-api-2.
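The two launch modes described above can be sketched as follows (hypothetical invocations; the script and compute target names are placeholders):

```shell
# With --num-gpus, PyTorchDistribution launches one process per GPU:
submit-aml --script train.py --compute-target gpu-cluster --num-gpus 4

# Without it, MpiDistribution is used and torchrun handles process launch:
submit-aml --script train.py --compute-target gpu-cluster \
  --executable 'torchrun --nproc-per-node auto'
```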
--debug / --no-debug Install debugpy on AML and run the command
using debugpy. The job will not start until
a remote debugger is attached. More
information at
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-interactive-jobs?view=azureml-api-2&tabs=ui#attach-a-debugger-to-a-job.
[default: no-debug]
--tensorboard / --no-tensorboard
Enable a TensorBoard interactive service for
the job. [default: tensorboard]
--tensorboard-dir PATH Directory in which the TensorBoard logs are
expected to be stored. [default:
logs/tensorboard]
--profiler / --no-profiler Enable profiling on Azure ML. Needs CUDA >=
12 and PyTorch >= 2. [default: no-profiler]
-G, --dependency-group TEXT Dependency groups to install in the Docker
image. If not provided, no dependency groups
are installed. The groups are defined in the
pyproject.toml file. This option can be used
multiple times.
--extra TEXT Optional dependency groups (extras) to
install in the Docker image. If not
provided, no extras are installed. The
optional groups are defined in the
pyproject.toml file. This option can be used
multiple times.
--conda-env-file PATH Path to a conda environment YAML file (e.g.,
environment.yml). If provided, a conda
environment will be used instead of the
Docker build context. Cannot be used together
with --build-context, --aml-environment, or
uv-specific options.
--only-env Exit after instantiating the environment.
This is useful during development so that
the AML environment build runs immediately
and the job starts faster once the script is
ready to be submitted.
-E, --set TEXT Environment variables to set on the job. The
format is `KEY=VALUE`. This option can be
used multiple times.
-D, --dry-run Exit before submitting the job.
--install-completion Install completion for the current shell.
--show-completion Show completion for the current shell, to
copy it or customize the installation.
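Putting several of the options above together, a hypothetical submission might look like this (experiment, compute target, datastore and dependency-group names are placeholders; `--dry-run` exits before actually submitting):

```shell
# Hypothetical combined invocation; all names are placeholders.
submit-aml \
  --script train.py \
  --experiment-name "cxr-classification" \
  --compute-target gpu-cluster \
  --mount "vindr_dir=VINDR-CXR-V2" \
  --output "out_dir=mydatastore/my_dataset" \
  --dependency-group train \
  --set "WANDB_MODE=offline" \
  --dry-run
```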