
# Script Examples

Detailed submission examples for training, inference, and pipeline workflows on OSMO and Azure ML platforms.

> [!NOTE]
> For CLI argument reference and script inventory, see Script Reference.

## OSMO Dataset Training

The `submit-osmo-dataset-training.sh` script uploads `training/rl/` as a versioned OSMO dataset. This approach removes the ~1 MB size limit of base64-encoded archives and enables dataset reuse across runs.
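The base64 overhead behind that limit is easy to demonstrate: encoding emits 4 output bytes for every 3 input bytes (~33% inflation), so an encoded archive hits a size cap well before the raw code does. A quick check:

```shell
# base64 expands every 3 input bytes to 4 output bytes (~33% overhead),
# which is why base64-encoded archives hit size caps earlier than raw files.
payload="$(head -c 300000 /dev/zero | tr '\0' 'a')"
encoded="$(printf '%s' "$payload" | base64 | tr -d '\n')"
echo "raw=${#payload} encoded=${#encoded}"
# → raw=300000 encoded=400000
```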

### Dataset Submission Example

```shell
# Default dataset configuration
./submit-osmo-dataset-training.sh --task Isaac-Velocity-Rough-Anymal-C-v0

# Custom dataset bucket and name
./submit-osmo-dataset-training.sh \
  --dataset-bucket custom-bucket \
  --dataset-name my-training-v1 \
  --task Isaac-Velocity-Rough-Anymal-C-v0

# With checkpoint resume
./submit-osmo-dataset-training.sh \
  --task Isaac-Velocity-Rough-Anymal-C-v0 \
  --checkpoint-uri "runs:/abc123/checkpoint" \
  --checkpoint-mode resume
```

### Dataset Parameters

| Parameter | Default | Description |
|---|---|---|
| `--dataset-bucket` | `training` | OSMO bucket for training code |
| `--dataset-name` | `training-code` | Dataset name (auto-versioned) |
| `--training-path` | `training/rl` | Local folder to upload |

Before upload, the script stages files to a temporary directory, excluding `__pycache__` and build artifacts via `.amlignore` patterns.
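That staging step can be sketched as a filtered copy; the exclusion list and staging location below are assumptions for illustration, not the script's actual contents:

```shell
# Sketch: copy a source tree into a temp staging dir, dropping caches and
# build artifacts. The exclusion patterns are illustrative.
stage_training_code() {
  local src="$1" stage_dir
  stage_dir="$(mktemp -d)"
  tar -C "$src" \
    --exclude='__pycache__' \
    --exclude='*.pyc' \
    --exclude='build' \
    -cf - . | tar -C "$stage_dir" -xf -
  echo "$stage_dir"
}

# Demo: build a tiny tree and stage it.
src="$(mktemp -d)"
mkdir -p "$src/__pycache__"
touch "$src/train.py" "$src/__pycache__/train.cpython-311.pyc"
staged="$(stage_training_code "$src")"
ls "$staged"   # train.py only; __pycache__ was filtered out
```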

## LeRobot Behavioral Cloning

The `submit-osmo-lerobot-training.sh` script submits LeRobot training workflows that support the ACT and Diffusion policy architectures. It uses Hugging Face Hub datasets and installs runtime dependencies from `training/il/lerobot/pyproject.toml`.

### LeRobot Submission Examples

```shell
# ACT policy with WANDB logging
./submit-osmo-lerobot-training.sh -d user/my-dataset

# Diffusion policy with Azure MLflow
./submit-osmo-lerobot-training.sh \
  -d user/my-dataset \
  -p diffusion \
  --mlflow-enable \
  -r my-model-name

# Fine-tune from pre-trained policy
./submit-osmo-lerobot-training.sh \
  -d user/my-dataset \
  --policy-repo-id user/pretrained-act \
  --training-steps 50000 \
  --batch-size 16
```

### LeRobot Parameters

| Parameter | Default | Description |
|---|---|---|
| `--dataset-repo-id` | (required) | HuggingFace dataset repository ID |
| `--policy-type` | `act` | Policy: `act`, `diffusion` |
| `--job-name` | `lerobot-act-training` | Job identifier |
| `--wandb-enable` | enabled | WANDB logging (default) |
| `--mlflow-enable` | disabled | Azure ML MLflow logging |
| `--policy-repo-id` | (none) | Pre-trained policy for fine-tuning |
| `--training-steps` | (LeRobot default) | Total training iterations |
| `--save-freq` | `5000` | Checkpoint save frequency |

## LeRobot Inference

The `submit-osmo-lerobot-inference.sh` script evaluates trained LeRobot policies from the Hugging Face Hub. It downloads the policy, runs evaluation, and optionally registers the model to Azure ML.

### LeRobot Inference Examples

```shell
# Evaluate a trained policy
./submit-osmo-lerobot-inference.sh --policy-repo-id user/trained-act-policy

# Evaluate with model registration
./submit-osmo-lerobot-inference.sh \
  --policy-repo-id user/trained-act-policy \
  -r my-evaluated-model

# Diffusion policy evaluation
./submit-osmo-lerobot-inference.sh \
  --policy-repo-id user/trained-diffusion \
  -p diffusion \
  --eval-episodes 50
```

### Inference Parameters

| Parameter | Default | Description |
|---|---|---|
| `--policy-repo-id` | (required) | HuggingFace policy repository |
| `--policy-type` | `act` | Policy: `act`, `diffusion` |
| `--eval-episodes` | `10` | Number of evaluation episodes |
| `--register-model` | (none) | Model name for Azure ML registration |
| `--dataset-repo-id` | (none) | Dataset for environment replay |
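Registration of the kind `--register-model` enables typically boils down to an `az ml model create` call. A minimal sketch (the wrapper function and the flag subset are illustrative, not the script's actual code):

```shell
# Sketch: register a local model directory with Azure ML.
# az ml model create is the real CLI command; this wrapper and its
# minimal flag set are illustrative.
register_model() {
  local name="$1" path="$2"
  az ml model create \
    --name "$name" \
    --path "$path" \
    --type custom_model
}
```

In practice the workspace and resource group would come from `az configure` defaults or explicit `--workspace-name`/`--resource-group` flags.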

## AzureML LeRobot Training

The `submit-azureml-lerobot-training.sh` script submits LeRobot training directly to Azure ML instead of OSMO. It registers an environment, compiles runtime dependencies from `training/il/lerobot/pyproject.toml`, and submits via `az ml job create`.
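The submitted job spec is a standard Azure ML command-job YAML; a minimal sketch of its likely shape (the command, environment, and compute values are illustrative, not the script's generated spec):

```yaml
# job.yml — illustrative command job passed to `az ml job create --file job.yml`
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  python -m lerobot.scripts.train
  --dataset.repo_id user/my-dataset
  --policy.type act
environment: azureml:lerobot-env@latest
compute: azureml:my-gpu-cluster
experiment_name: lerobot-act-training
```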

### AzureML LeRobot Examples

```shell
# ACT policy training
./submit-azureml-lerobot-training.sh -d user/my-dataset

# With model registration and log streaming
./submit-azureml-lerobot-training.sh \
  -d user/my-dataset \
  -r my-act-model \
  --stream

# Custom environment and compute
./submit-azureml-lerobot-training.sh \
  -d user/my-dataset \
  --image custom-registry.io/lerobot:latest \
  --compute my-gpu-cluster
```

## End-to-End Pipeline

The `run-lerobot-pipeline.sh` script orchestrates the full LeRobot lifecycle: training → polling → inference → model registration. It delegates to the individual submission scripts and polls OSMO workflow status between stages.

### Pipeline Stages

| Stage | Action | Script Used |
|---|---|---|
| 1 | Submit training workflow | `submit-osmo-lerobot-training.sh` |
| 2 | Poll workflow status until completion | `osmo workflow status` |
| 3 | Submit inference/evaluation workflow | `submit-osmo-lerobot-inference.sh` |
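Stage 2's polling can be sketched as a bounded loop. The status strings and the parsing of `osmo workflow status` output are assumptions here, so treat this as the shape of the logic, not the script's actual implementation:

```shell
# Sketch: poll a workflow until it reaches a terminal state or a timeout.
# Status values and output parsing are assumed, not taken from the real script.
poll_workflow() {
  local workflow_id="$1" interval="${2:-60}" timeout_s="${3:-43200}"
  local elapsed=0 status
  while [ "$elapsed" -lt "$timeout_s" ]; do
    status="$(osmo workflow status "$workflow_id")"
    case "$status" in
      *SUCCEEDED*) return 0 ;;            # proceed to inference stage
      *FAILED*|*CANCELLED*) return 1 ;;   # abort the pipeline
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 2  # timed out
}
```

The pipeline's `--poll-interval` (seconds) and `--timeout` (minutes, 720 by default, i.e. 43200 s) parameters map onto the interval and timeout arguments in this sketch.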

### Pipeline Examples

```shell
# Full pipeline: train → evaluate → register
./run-lerobot-pipeline.sh \
  -d lerobot/aloha_sim_insertion_human \
  --policy-repo-id user/my-act-policy \
  -r my-act-model

# Async mode (submit training and exit)
./run-lerobot-pipeline.sh \
  -d user/my-dataset \
  --skip-wait

# Diffusion pipeline with MLflow
./run-lerobot-pipeline.sh \
  -d user/my-dataset \
  --policy-repo-id user/my-diffusion \
  -p diffusion \
  --mlflow-enable \
  --training-steps 100000 \
  -r my-diffusion-model

# Skip inference (training only with polling)
./run-lerobot-pipeline.sh \
  -d user/my-dataset \
  --skip-inference
```

### Pipeline Parameters

| Parameter | Default | Description |
|---|---|---|
| `--dataset-repo-id` | (required) | HuggingFace dataset repository |
| `--policy-repo-id` | (required*) | HuggingFace policy target repo |
| `--policy-type` | `act` | Policy: `act`, `diffusion` |
| `--register-model` | (none) | Azure ML model registration name |
| `--poll-interval` | `60` | Status check interval (seconds) |
| `--timeout` | `720` | Training timeout (minutes) |
| `--skip-wait` | disabled | Async mode: submit and exit |
| `--skip-inference` | disabled | Skip inference stage |