Isaac Lab reinforcement learning training with SKRL and RSL-RL backends. Both Azure ML and OSMO platforms support distributed GPU training with automatic checkpointing and MLflow experiment tracking.
| Component | Requirement |
|---|---|
| Infrastructure | AKS cluster deployed via Infrastructure Guide |
| Azure ML | Extension installed via 02-deploy-azureml-extension.sh |
| OSMO | Control plane and backend via 03-deploy-osmo-control-plane.sh |
| Terraform outputs | Available in infrastructure/terraform/ (or provide values via CLI / environment vars) |
| Azure CLI | az with ml extension for Azure ML submissions |
| OSMO CLI | osmo CLI installed and authenticated for OSMO submissions |
./scripts/submit-azureml-training.sh \
--task Isaac-Velocity-Rough-Anymal-C-v0 \
--num-envs 2048 \
--stream
./scripts/submit-osmo-training.sh \
--task Isaac-Velocity-Rough-Anymal-C-v0 \
--num-envs 2048
./scripts/submit-osmo-dataset-training.sh \
--task Isaac-Velocity-Rough-Anymal-C-v0 \
--dataset-name my-training-v1
Dataset injection removes the ~1 MB payload size limit of base64-encoded archives and enables dataset reuse across runs.
| Aspect | Azure ML | OSMO |
|---|---|---|
| Submission | az ml job create via YAML templates |
osmo workflow submit |
| Orchestration | AKS compute targets | KAI Scheduler / Volcano integration |
| Experiment tracking | MLflow (managed) | MLflow (Azure ML backend) |
| Dataset delivery | Azure ML datastores | Base64 payload or OSMO bucket upload |
| Monitoring | Azure ML Studio | OSMO UI Dashboard |
| Payload modes | Single (YAML template) | Base64 or dataset folder injection |
Azure ML provides managed compute and experiment tracking through Azure ML Studio. OSMO adds distributed training coordination, KAI Scheduler integration, and a dataset versioning system.
Core parameters shared across platforms:
| Parameter | Default | Description |
|---|---|---|
--task |
Isaac-Velocity-Rough-Anymal-C-v0 |
Isaac Lab task identifier |
--num-envs |
2048 |
Parallel simulation environments |
--max-iterations |
(unset) | Training iteration limit |
--image |
nvcr.io/nvidia/isaac-lab:2.2.0 |
Container image |
--backend |
skrl |
Training backend: skrl or rsl_rl |
--headless |
true |
Disable rendering |
Values resolve in order: CLI arguments → environment variables → Terraform outputs.
| Backend | Algorithms | Use Case |
|---|---|---|
| SKRL | PPO, IPPO, MAPPO, AMP, SAC | General-purpose RL with MLflow |
| RSL-RL | PPO, Distillation | Locomotion-focused, teacher-student |
SKRL is the default backend and supports automatic MLflow metric logging via monkey-patching. See MLflow Integration for metric details.
Four checkpoint modes control how training initializes:
| Mode | Behavior |
|---|---|
from-scratch |
Default. No checkpoint loaded, training starts fresh. |
warm-start |
Load weights only. Resets optimizer and iteration counters. |
resume |
Load full state. Continues from exact training position. |
fresh |
Load model architecture only. Reinitializes all parameters. |
# Resume interrupted training (Azure ML)
./scripts/submit-azureml-training.sh \
--checkpoint-uri "runs:/abc123/checkpoint" \
--checkpoint-mode resume
# Warm-start from a registered model (OSMO)
./scripts/submit-osmo-training.sh \
--checkpoint-uri "models:/anymal-c-velocity/1" \
--checkpoint-mode warm-start
Training scripts register checkpoints to Azure ML automatically. Override the model name or skip registration:
# Custom model name
./scripts/submit-azureml-training.sh \
--register-checkpoint my-custom-model
# Skip registration
./scripts/submit-osmo-training.sh \
--skip-register-checkpoint
OSMO supports two payload delivery modes for training code:
| Mode | Script | Size Limit | Versioning |
|---|---|---|---|
| Base64 payload | submit-osmo-training.sh |
~1 MB | None |
| Dataset injection | submit-osmo-dataset-training.sh |
Unlimited | Automatic |
Dataset injection uploads training/rl/ as a versioned OSMO dataset, mounted at /data/<dataset_name>/training in the container:
./scripts/submit-osmo-dataset-training.sh \
--dataset-bucket custom-bucket \
--dataset-name my-training-v1 \
--task Isaac-Velocity-Rough-Anymal-C-v0
The script stages files to exclude __pycache__ and build artifacts via .amlignore patterns before upload.
🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.