The training pipeline uses monkey-patching to wrap the agent’s _update method, intercepting training updates to extract and log metrics to MLflow. This approach provides comprehensive experiment tracking without modifying the underlying SKRL agent implementation or training code.
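In essence, the wrapper keeps a reference to the original `_update`, calls it, then logs whatever appears in `agent.tracking_data`. The following minimal sketch illustrates the pattern with a stand-in agent class; `DemoAgent` and `wrap_update` are illustrative names, not the project's actual implementation:

```python
import functools

class DemoAgent:
    """Stand-in for an SKRL agent; `tracking_data` mirrors SKRL's metric store."""
    def __init__(self):
        self.tracking_data = {}

    def _update(self, timestep):
        # A real agent would run an optimization step here.
        self.tracking_data["Loss / Policy loss"] = [0.5 / (timestep + 1)]

def wrap_update(agent, log_fn):
    """Replace agent._update with a closure that logs after each call."""
    original_update = agent._update  # keep the bound method
    @functools.wraps(original_update)
    def wrapped(timestep):
        result = original_update(timestep)       # run the real update
        for name, values in agent.tracking_data.items():
            log_fn(name, values[-1], step=timestep)  # log latest value
        return result
    return wrapped

logged = []
agent = DemoAgent()
agent._update = wrap_update(agent, lambda k, v, step: logged.append((k, v, step)))
agent._update(0)  # logged now contains ("Loss / Policy loss", 0.5, 0)
```

The same closure pattern underlies the real integration, with `log_fn` replaced by calls into the `mlflow` module.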
The MLflow integration automatically extracts metrics from SKRL agents across several categories:
- `episode_reward` - Reward for the current episode
- `episode_reward_mean` - Mean reward across recent episodes
- `episode_length` - Length of the current episode
- `episode_length_mean` - Mean episode length across recent episodes
- `cumulative_rewards` - Cumulative rewards over time
- `mean_rewards` - Mean reward values
- `success_rate` - Success rate for task-specific metrics
- `policy_loss` - Policy network loss
- `value_loss` - Value network loss (critic loss for some algorithms)
- `critic_loss` - Critic network loss (SAC, TD3, DDPG)
- `entropy` - Policy entropy for exploration
- `learning_rate` - Current learning rate
- `grad_norm` - Gradient norm for monitoring optimization
- `kl_divergence` - KL divergence between old and new policies (PPO)
- `timesteps` - Total environment timesteps
- `iterations` - Training iteration count
- `fps` - Training frames per second
- `epoch_time` - Time per training epoch
- `rollout_time` - Time spent collecting experience
- `learning_time` - Time spent in optimization

For metrics with multiple values (tensors or arrays), the integration extracts statistical aggregates:
- `metric_name/mean` - Mean value
- `metric_name/std` - Standard deviation
- `metric_name/min` - Minimum value
- `metric_name/max` - Maximum value

All entries in `agent.tracking_data` are automatically extracted, supporting algorithm-specific metrics from PPO, SAC, TD3, DDPG, A2C, and other SKRL implementations.
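As an illustration, reducing an array-valued metric to these four aggregates might look like the sketch below. This is not the module's actual extraction code; `aggregate_metric` is a hypothetical helper:

```python
import numpy as np

def aggregate_metric(name, values):
    """Reduce a tensor/array metric to mean/std/min/max entries."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 1:
        return {name: float(arr.item())}  # scalar metric: log as-is
    return {
        f"{name}/mean": float(arr.mean()),
        f"{name}/std": float(arr.std()),
        f"{name}/min": float(arr.min()),
        f"{name}/max": float(arr.max()),
    }

print(aggregate_metric("episode_reward", [1.0, 2.0, 3.0]))
```

Per-environment episode rewards from hundreds of vectorized environments, for example, collapse into four scalars that MLflow can chart.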
The integration uses the `create_mlflow_logging_wrapper` function from the `skrl_mlflow_agent` module to create a closure that wraps the agent's `_update` method. The wrapper is applied after the SKRL `Runner` is instantiated but before training begins.
The function takes the following parameters:

- `agent` - The SKRL agent instance to extract metrics from (required)
- `mlflow_module` - The `mlflow` module for logging metrics (required)
- `metric_filter` - Optional set of metric names to log (default: `None`)
The MLflow logging interval is controlled via the --mlflow_log_interval CLI argument:
- `step` - Log metrics after every training step (most frequent)
- `balanced` - Log metrics every 10 steps (default, recommended)
- `rollout` - Log metrics once per rollout cycle
- An integer value (for example, `100`) - Log metrics every N steps

To customize which metrics are logged, modify the `create_mlflow_logging_wrapper` call in `skrl_training.py`:
```python
from training.rl.scripts.skrl_mlflow_agent import create_mlflow_logging_wrapper

# Track only core training metrics
basic_metrics = {
    "episode_reward_mean",
    "episode_length_mean",
    "policy_loss",
    "value_loss",
}

wrapper_func = create_mlflow_logging_wrapper(
    agent=runner.agent,
    mlflow_module=mlflow,
    metric_filter=basic_metrics,
)
runner.agent._update = wrapper_func
```
For example, to focus on optimization diagnostics instead:

```python
# Track metrics relevant to optimization behavior
optimization_metrics = {
    "learning_rate",
    "grad_norm",
    "kl_divergence",
    "policy_loss",
    "value_loss",
}

wrapper_func = create_mlflow_logging_wrapper(
    agent=runner.agent,
    mlflow_module=mlflow,
    metric_filter=optimization_metrics,
)
runner.agent._update = wrapper_func
```
The monkey-patching approach is applied after creating the SKRL Runner:
```python
import mlflow
from skrl.utils.runner.torch import Runner

from training.rl.scripts.skrl_mlflow_agent import create_mlflow_logging_wrapper

mlflow.set_tracking_uri("azureml://...")
mlflow.set_experiment("isaaclab-training")

with mlflow.start_run():
    runner = Runner(env, agent_cfg)

    # Wrap _update after the Runner is created but before training starts
    wrapper_func = create_mlflow_logging_wrapper(
        agent=runner.agent,
        mlflow_module=mlflow,
        metric_filter=None,  # None logs all available metrics
    )
    runner.agent._update = wrapper_func

    runner.run()
```
Use CLI arguments to control logging frequency:
```bash
# Log after every training step
python training/rl/scripts/skrl_training.py --mlflow_log_interval step

# Log every 10 steps (default)
python training/rl/scripts/skrl_training.py --mlflow_log_interval balanced

# Log once per rollout
python training/rl/scripts/skrl_training.py --mlflow_log_interval rollout

# Log every 100 steps
python training/rl/scripts/skrl_training.py --mlflow_log_interval 100
```
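Because the flag accepts both preset names and raw integers, the script has to normalize the value during argument parsing. The sketch below shows one way such a mixed flag could be validated with `argparse`; the actual parsing in `skrl_training.py` may differ:

```python
import argparse

# Hypothetical preset mapping; "rollout" would be resolved from the
# agent's rollout length at runtime, so it is passed through as a string.
PRESETS = {"step": 1, "balanced": 10}

def parse_log_interval(value):
    """Map a preset name or integer string to a logging interval."""
    if value in PRESETS:
        return PRESETS[value]
    if value == "rollout":
        return "rollout"
    try:
        steps = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"invalid interval: {value!r}")
    if steps < 1:
        raise argparse.ArgumentTypeError("interval must be >= 1")
    return steps

parser = argparse.ArgumentParser()
parser.add_argument("--mlflow_log_interval", type=parse_log_interval,
                    default="balanced")
args = parser.parse_args(["--mlflow_log_interval", "100"])
# args.mlflow_log_interval is now the integer 100
```

Using an `argparse` `type` callable keeps validation errors at the CLI boundary rather than deep inside the training loop.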
Modify the wrapper creation in skrl_training.py:
```python
# Log only the metrics needed for production monitoring
production_metrics = {
    "episode_reward_mean",
    "episode_length_mean",
    "success_rate",
}

wrapper_func = create_mlflow_logging_wrapper(
    agent=runner.agent,
    mlflow_module=mlflow,
    metric_filter=production_metrics,
)
runner.agent._update = wrapper_func
```
The MLflow integration is automatically applied in skrl_training.py when training with Isaac Lab tasks:
```bash
python training/rl/scripts/skrl_training.py \
    --task Isaac-Cartpole-v0 \
    --num_envs 512 \
    --headless
```
The training script handles MLflow setup and monkey-patching automatically. To customize the logging interval, use the --mlflow_log_interval argument. To customize metric filtering, modify the create_mlflow_logging_wrapper call in skrl_training.py.
**Symptom:** Training runs complete but no metrics appear in MLflow.

**Solutions:**

- Verify that `mlflow.set_tracking_uri()` is called with the correct Azure ML workspace URI and that authentication is valid.
- Verify that `create_mlflow_logging_wrapper` is called after `Runner` instantiation and that `runner.agent._update` is replaced before `runner.run()`.
- Note that `tracking_data` may not populate until after the first rollout.
- If using `metric_filter`, verify the filter set contains matching metric names.

**Symptom:** Some expected metrics are not logged while others are.

**Solutions:**

- Inspect `agent.tracking_data` for available entries.
- If using `metric_filter`, ensure metric names match exactly (case-sensitive).

**Symptom:** `AttributeError: Agent must have 'tracking_data' attribute`

**Solutions:**

- The agent does not expose `tracking_data`. Verify SKRL version compatibility.
- Verify that `runner.agent` and `runner.agent._update` exist before replacement.

**Symptom:** Training slows down due to excessive MLflow API calls.

**Solutions:**

- Use `--mlflow_log_interval 100` or higher.
- Use `metric_filter` to log only essential metrics.

**Symptom:** Log messages like "Failed to extract or log metrics at step X".

**Possible Causes:**

- Transient changes to the `tracking_data` structure during training
- Incompatible metric types

**Solutions:**

- Occasional warnings are harmless.
- For persistent warnings, check the exception details and modify `_extract_from_value()` in `skrl_mlflow_agent.py` for the specific metric types.

**Symptom:** Integration runs but extracts zero metrics.

**Possible Causes:**

- `tracking_data` is empty
- A `metric_filter` with no matching metric names
- Nested metrics beyond `max_depth=2` are not extracted

**Solutions:**

- Increase `max_depth` in `_extract_from_tracking_data()` if needed.
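When diagnosing missing or zero metrics, listing the keys of `agent.tracking_data` confirms the exact strings a `metric_filter` set must contain. A small sketch using a stand-in dictionary (the real names depend on the agent and algorithm):

```python
# Stand-in for agent.tracking_data; a real SKRL agent maps metric names
# to lists of recently tracked values.
tracking_data = {
    "Reward / Total reward (mean)": [12.3, 14.1],
    "Loss / Policy loss": [0.42],
}

# Print every available metric name with its latest value -- these are
# the exact, case-sensitive strings to use in a metric_filter.
for name, values in sorted(tracking_data.items()):
    print(f"{name}: latest={values[-1]} ({len(values)} tracked value(s))")
```

Running such a loop once after the first rollout quickly reveals naming mismatches between the filter set and the agent's actual entries.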