
# LeRobot ACT Policy Inference

Run a trained ACT (Action Chunking with Transformers) policy locally against dataset observations or on a live UR10E robot via ROS2.

## 📋 Prerequisites

| Tool | Version | Install |
| --- | --- | --- |
| Python | 3.10+ | System or pyenv |
| uv or pip | Latest | `pip install uv` |
| Azure CLI | 2.50+ | `uv pip install azure-cli` |
| az ml extension | 2.22+ | `az extension add -n ml` |

## 🚀 Quick Start

### Pull the Model

The trained checkpoint is available from two sources.

**From Azure ML:**

```bash
az ml model download \
  --name hve-robo-act-train --version 1 \
  --download-path ./checkpoint \
  --resource-group rg-osmorbt3-dev-001 \
  --workspace-name mlw-osmorbt3-dev-001
```

**From HuggingFace Hub:**

```bash
pip install huggingface-hub
huggingface-cli download alizaidi/hve-robo-act-train --local-dir ./checkpoint/hve-robo-act-train
```

Both produce the same directory layout:

```text
hve-robo-act-train/
├── config.json                 # Policy architecture config
├── model.safetensors           # Trained weights (197 MB)
├── policy_preprocessor.json    # Input normalization pipeline
├── policy_preprocessor_step_3_normalizer_processor.safetensors
├── policy_postprocessor.json   # Output unnormalization pipeline
├── policy_postprocessor_step_0_unnormalizer_processor.safetensors
└── train_config.json           # Training hyperparameters
```
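A partial download produces confusing stack traces later, so it can be worth verifying the checkpoint directory first. A minimal sketch (the helper function is ours; the filenames come from the listing above):

```python
from pathlib import Path

# Expected checkpoint contents, per the directory listing above.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "policy_preprocessor.json",
    "policy_preprocessor_step_3_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
    "train_config.json",
]

def missing_checkpoint_files(root: str) -> list[str]:
    """Return the expected checkpoint files that are absent under root."""
    base = Path(root)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]
```

An empty return value means the checkpoint is complete and ready for inference.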

### Install Dependencies

```bash
uv pip install lerobot av pyarrow
```

### Run Offline Inference

Validate the model against recorded dataset observations:

```bash
python scripts/test-lerobot-inference.py \
  --policy-repo alizaidi/hve-robo-act-train \
  --dataset-dir /path/to/hve-robo-cell \
  --episode 0 --start-frame 100 --num-steps 30 \
  --device cuda
```

Use `--policy-repo ./checkpoint/hve-robo-act-train` to load from a local path instead of HuggingFace Hub.

Expected output:

```text
Episode 0: 668 frames, starting at frame 100, testing 30 steps
step 0: pred=[ 0.001, 0.002, -0.001, -0.004, -0.019, 0.000] gt=[ 0.001, 0.002, -0.002, -0.005, -0.019, 0.000]

============================================================
Inference Results
============================================================
Steps evaluated: 30
MSE (all joints): 0.000004
MAE (all joints): 0.001173
Throughput: 130.0 steps/s
Realtime capable: yes (need 30 Hz)
```
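The summary metrics are simple reductions over the predicted and ground-truth action arrays. A sketch of how they can be computed, assuming `(steps, 6)` arrays (the function name and exact reductions are illustrative, not the script's code):

```python
import numpy as np

def summarize(pred: np.ndarray, gt: np.ndarray, elapsed_s: float,
              control_hz: float = 30.0) -> dict:
    """Summarize predicted vs ground-truth actions, each shaped (steps, 6)."""
    err = pred - gt
    throughput = pred.shape[0] / elapsed_s
    return {
        "mse": float(np.mean(err ** 2)),         # MSE (all joints)
        "mae": float(np.mean(np.abs(err))),      # MAE (all joints)
        "throughput_hz": throughput,             # inference steps per second
        # Realtime capable: inference must keep up with the control loop.
        "realtime_capable": throughput >= control_hz,
    }
```

The realtime check mirrors the last line of the output: at a 30 Hz control frequency, throughput must reach at least 30 steps/s.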

## ⚙️ Configuration

### Inference Script Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--policy-repo` | `alizaidi/hve-robo-act-train` | HuggingFace repo ID or local path |
| `--dataset-dir` | (required) | LeRobot v3 dataset root directory |
| `--episode` | `0` | Episode index for test observations |
| `--start-frame` | `0` | Starting frame within the episode |
| `--num-steps` | `30` | Number of inference steps |
| `--device` | `cuda` | Inference device (`cuda`, `cpu`, or `mps`) |
| `--output` | (none) | Save predictions to an `.npz` file |

### Model Details

| Property | Value |
| --- | --- |
| Policy type | ACT (Action Chunking with Transformers) |
| Parameters | 51.6M |
| State dim | 6 (UR10E joint positions in radians) |
| Action dim | 6 (joint position deltas) |
| Image input | 480 x 848 RGB |
| Control frequency | 30 Hz |
| Backbone | ResNet-18 |

## 📊 OSMO Evaluation with MLflow Plots

Run batch evaluation across multiple episodes on OSMO with trajectory plots logged directly to AzureML Studio via MLflow.

Submit with MLflow Enabled

scripts/submit-osmo-lerobot-inference.sh \
--policy-repo-id alizaidi/hve-robo-act-train \
--dataset-repo-id alizaidi/hve-robo-cell \
--eval-episodes 10 \
--mlflow-enable \
--experiment-name lerobot-act-eval

### Viewing Plots in AzureML Studio

Navigate to AzureML Studio > Jobs > (run name) > Images. The left panel shows a folder tree organized by episode, and plots render inline with tab navigation across all images.

Each episode produces four plots plus one aggregate summary across all episodes:

| Plot | Description |
| --- | --- |
| `action_deltas.png` | Per-joint predicted vs ground-truth action overlays |
| `cumulative_positions.png` | Reconstructed absolute joint positions |
| `error_heatmap.png` | Time x joint absolute-error heatmap |
| `summary_panel.png` | 2x2 panel: all joints, error boxplots, latency, MAE bars |
| `aggregate_summary.png` | Cross-episode comparison of MAE, MSE, throughput, and per-joint error |

Numeric metrics appear on the Metrics tab: per-episode values (`ep0_mse`, `ep0_mae`, `ep0_throughput_hz`) and aggregate summaries (`aggregate_mse`, `aggregate_mae`).
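The per-episode metric names follow a simple `ep{N}_` prefix convention. A sketch of how such a dict can be assembled (the helper is hypothetical; the resulting dict could then be logged in one call with MLflow's `mlflow.log_metrics`):

```python
def episode_metrics(episode_idx: int, mse: float, mae: float,
                    throughput_hz: float) -> dict:
    """Build per-episode metric names as they appear on the Metrics tab."""
    prefix = f"ep{episode_idx}"
    return {
        f"{prefix}_mse": mse,
        f"{prefix}_mae": mae,
        f"{prefix}_throughput_hz": throughput_hz,
    }
```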

### OSMO Inference Script Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--policy-repo-id` | (required) | HuggingFace policy repository |
| `--dataset-repo-id` | (none) | HuggingFace dataset for replay evaluation |
| `--eval-episodes` | `10` | Number of episodes to evaluate |
| `--mlflow-enable` | `false` | Log plots and metrics to AzureML via MLflow |
| `--experiment-name` | auto-derived | MLflow experiment name |
| `--register-model` | (none) | Register the model to AzureML after evaluation |

## 🤖 ROS2 Deployment

For real robot control, use the ROS2 inference node in `fleet-deployment/inference/act_inference_node.py`.

### Data Classes

`evaluation/sil/robot_types.py` defines the interface between the robot and the policy:

| Type | Maps to | Shape |
| --- | --- | --- |
| `RobotObservation.joint_positions` | `observation.state` | `(6,)` radians |
| `RobotObservation.color_image` | `observation.images.color` | `(480, 848, 3)` uint8 |
| `JointPositionCommand.positions` | `action` | `(6,)` radians |
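The mapping in the table can be mirrored with plain dataclasses. A rough sketch assuming NumPy arrays (field names follow the table; the `to_policy_inputs` helper is illustrative, not the actual code in `robot_types.py`):

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class RobotObservation:
    joint_positions: np.ndarray  # (6,) radians  -> "observation.state"
    color_image: np.ndarray      # (480, 848, 3) uint8 -> "observation.images.color"

@dataclass
class JointPositionCommand:
    positions: np.ndarray        # (6,) radians  -> "action"

def to_policy_inputs(obs: RobotObservation) -> dict:
    """Map a robot observation onto the policy's expected feature names."""
    return {
        "observation.state": obs.joint_positions,
        "observation.images.color": obs.color_image,
    }
```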

### Dry Run (No Robot Commands)

```bash
ros2 run lerobot_inference act_inference_node \
  --ros-args -p policy_repo:=alizaidi/hve-robo-act-train \
  -p device:=cuda \
  -p enable_control:=false
```

Monitor predictions on `/lerobot/status`.

### Live Control

```bash
ros2 run lerobot_inference act_inference_node \
  --ros-args -p policy_repo:=alizaidi/hve-robo-act-train \
  -p device:=cuda \
  -p enable_control:=true \
  -p action_mode:=delta
```

> [!WARNING]
> Set `enable_control:=false` first and verify that predictions on `/lerobot/status` are reasonable before enabling live robot commands.

### ROS2 Node Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `policy_repo` | `alizaidi/hve-robo-act-train` | Model source |
| `device` | `cuda` | Inference device |
| `control_hz` | `30.0` | Control loop frequency |
| `action_mode` | `delta` | `delta` (add to current) or `absolute` |
| `enable_control` | `false` | Publish commands to the robot |
| `camera_topic` | `/camera/color/image_raw` | RGB image topic |
| `joint_states_topic` | `/joint_states` | Joint state topic |
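The `action_mode` parameter decides how a predicted action becomes a joint target. A minimal sketch of the two modes (the helper name is ours; the node's actual implementation may differ):

```python
import numpy as np

def apply_action(current_joints: np.ndarray, action: np.ndarray,
                 action_mode: str = "delta") -> np.ndarray:
    """Resolve a (6,) policy action into target joint positions, per action_mode."""
    if action_mode == "delta":
        # Action is an offset added to the current joint positions.
        return current_joints + action
    if action_mode == "absolute":
        # Action is the target joint position itself.
        return action.copy()
    raise ValueError(f"unknown action_mode: {action_mode!r}")
```

With `delta`, small mispredictions only nudge the arm from its current pose, which is why it is the default for this policy's position-delta actions.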

### ROS2 Topics

| Topic | Type | Direction |
| --- | --- | --- |
| `/joint_states` | `sensor_msgs/JointState` | Subscribe |
| `/camera/color/image_raw` | `sensor_msgs/Image` | Subscribe |
| `/lerobot/joint_commands` | `trajectory_msgs/JointTrajectory` | Publish |
| `/lerobot/status` | `std_msgs/String` | Publish |

🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.