Synthetic Data

Generate photorealistic training data from simulation using NVIDIA Cosmos world foundation models. This section covers the SDG pipeline architecture, Cosmos model integration, and workflow submission.

SDG Pipeline

The synthetic data generation pipeline chains three Cosmos capabilities:

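| Stage | Model | Description |
|----------|---------------------|----------------------------------------------------|
| Transfer | Cosmos Transfer 2.5 | Convert simulation renders to photorealistic images |
| Predict | Cosmos Predict 2.5 | Generate plausible future frame sequences |
| Reason | Cosmos Reason 2 | Assess and curate data for training quality |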
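The chaining described above can be pictured as three functions applied in order, each consuming the previous stage's output. The following is a minimal sketch only: the `Stage` type, the stub functions, and the string-based "frames" are hypothetical placeholders for illustration, not the Cosmos APIs or this repository's actual code.

```python
from dataclasses import dataclass
from typing import Callable, List

Frames = List[str]  # hypothetical stand-in for real image or video data


@dataclass(frozen=True)
class Stage:
    name: str
    model: str
    run: Callable[[Frames], Frames]


def transfer_stub(frames: Frames) -> Frames:
    # Placeholder for Transfer: mark each simulation render as converted.
    return [f"photoreal({f})" for f in frames]


def predict_stub(frames: Frames) -> Frames:
    # Placeholder for Predict: append one "future" frame after the last input.
    return frames + [f"future({frames[-1]})"] if frames else []


def reason_stub(frames: Frames) -> Frames:
    # Placeholder for Reason: keep frames whose label does not contain "corrupt".
    return [f for f in frames if "corrupt" not in f]


# Stage order follows the table: Transfer, then Predict, then Reason.
PIPELINE = [
    Stage("Transfer", "Cosmos Transfer 2.5", transfer_stub),
    Stage("Predict", "Cosmos Predict 2.5", predict_stub),
    Stage("Reason", "Cosmos Reason 2", reason_stub),
]


def run_pipeline(frames: Frames) -> Frames:
    # Feed each stage's output into the next stage.
    for stage in PIPELINE:
        frames = stage.run(frames)
    return frames
```

Calling `run_pipeline(["render_corrupt", "render_000"])` with these stubs keeps only the frames derived from `render_000`. In a real deployment each stub would presumably be replaced by a call into the corresponding model integration under `cosmos/`.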

🏗️ Architecture

synthetic-data/
├── workflows/          # OSMO and AzureML SDG job definitions
│   ├── osmo/           # OSMO workflow YAML (Jinja templates)
│   └── azureml/        # AzureML job YAML (commandJob schema)
├── cosmos/             # Cosmos model integration
│   ├── transfer/       # Cosmos Transfer 2.5
│   ├── predict/        # Cosmos Predict 2.5
│   ├── reason/         # Cosmos Reason 2
│   └── configs/        # Model configuration templates
├── examples/           # Pipeline examples
└── specifications/     # Domain specifications
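
As a quick sanity check on a local checkout, the layout above can be verified with a few lines of Python. This is a sketch under the assumption that the directory names match the tree exactly; the helper `missing_dirs` and the `EXPECTED_DIRS` constant are hypothetical and are not part of the repository.

```python
from pathlib import Path
from typing import List, Union

# Subdirectories taken from the tree above, relative to synthetic-data/.
EXPECTED_DIRS = (
    "workflows/osmo",
    "workflows/azureml",
    "cosmos/transfer",
    "cosmos/predict",
    "cosmos/reason",
    "cosmos/configs",
    "examples",
    "specifications",
)


def missing_dirs(root: Union[str, Path]) -> List[str]:
    """Return the expected subdirectories that do not exist under root."""
    base = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (base / rel).is_dir()]
```

For example, `missing_dirs("synthetic-data")` returns an empty list when every directory in the tree is present, and otherwise lists the relative paths that are absent.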