
Trident Architecture

Trident is an image-based OS lifecycle agent providing atomic installation, A/B updates, and runtime configuration management. This document explains the architectural components, design principles, and operational workflows.

For a higher-level introduction, see What Is Trident and How Does Trident Work.

Overview

Trident's architecture follows a modular, subsystem-based design. A declarative Host Configuration describes the desired state of the target OS. Trident compares this against the current state and determines the necessary actions to reconcile them.

Execution Modes

Trident ships as a single binary that supports three execution modes:

  1. CLI — The operator invokes commands directly (e.g., trident install, trident update). Each command runs to completion and exits.
  2. Daemon — trident daemon starts a long-running gRPC server (tridentd) that listens on a Unix domain socket. External orchestrators connect as gRPC clients to request servicing operations.
  3. gRPC Client — A built-in client that connects to a running daemon and issues commands over gRPC, providing the same operations as the CLI but through the daemon's API.

All three modes funnel into the same engine, so behavior is identical regardless of how Trident is invoked.

Engine

The engine is the central orchestrator. It receives a Host Configuration (the desired state) and the current Host Status (from the datastore), determines what servicing type is needed, and executes the appropriate operation.

Servicing Type Selection

When a new Host Configuration is provided, the engine compares it against the stored Host Status and selects one of the following servicing types:

  • Clean Install — No prior state exists. Trident partitions the disk, deploys the OS image, and configures the system from scratch.
  • A/B Update — The host has a prior installation with an A/B partition layout. Trident stages the new OS onto the inactive volume, configures it, and switches the boot target.
  • Runtime Update — Configuration-only changes that do not require a new OS image (e.g., adding users or changing network settings).
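
The selection logic above can be sketched in a few lines of Python. This is an illustrative model only: the HostState type, its fields, and the rule that a changed image hash implies an A/B update are assumptions made for the example, not Trident's actual data model or decision procedure. Returning None when nothing differs is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ServicingType(Enum):
    CLEAN_INSTALL = "clean-install"
    AB_UPDATE = "ab-update"
    RUNTIME_UPDATE = "runtime-update"


@dataclass
class HostState:
    """Hypothetical, simplified stand-in for a Host Configuration or Host Status."""
    image_hash: str                                   # identifies the OS image
    os_settings: dict = field(default_factory=dict)   # e.g. {"hostname": "node1"}


def select_servicing_type(status: Optional[HostState],
                          desired: HostState) -> Optional[ServicingType]:
    """Pick a servicing type by comparing stored status with the desired state."""
    if status is None:
        # No prior state exists: install from scratch.
        return ServicingType.CLEAN_INSTALL
    if status.image_hash != desired.image_hash:
        # A new OS image is requested: stage it on the inactive volume.
        return ServicingType.AB_UPDATE
    if status.os_settings != desired.os_settings:
        # Same image, different settings: configuration-only change.
        return ServicingType.RUNTIME_UPDATE
    # Assumption: nothing to do when desired and current state match.
    return None
```

In this model, a host with no stored status always gets a clean install, and a settings-only change never triggers an image deployment.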

Subsystem Lifecycle

Each subsystem implements a trait with three ordered lifecycle steps:

  1. Prepare — Non-destructive work: validate configuration, check prerequisites, and plan changes.
  2. Provision — Initialize or migrate state on the target OS from the servicing OS: deploy images, set up encryption, install the ESP.
  3. Configure — Apply OS settings as specified by the Host Configuration and update the Host Status.

The engine executes all subsystems through each step in order — every subsystem completes its prepare step before any subsystem begins provisioning, and so on.
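
The step-major ordering described above can be illustrated with a short Python sketch. The Subsystem class and run_lifecycle function are hypothetical names chosen for the example; the real engine implements this through a trait, and its signatures are not shown in this document.

```python
from typing import List


class Subsystem:
    """Hypothetical base class standing in for the subsystem trait."""
    name = "subsystem"

    def prepare(self, log: List[str]) -> None:
        log.append(f"{self.name}:prepare")

    def provision(self, log: List[str]) -> None:
        log.append(f"{self.name}:provision")

    def configure(self, log: List[str]) -> None:
        log.append(f"{self.name}:configure")


class Storage(Subsystem):
    name = "storage"


class Boot(Subsystem):
    name = "boot"


def run_lifecycle(subsystems: List[Subsystem], log: List[str]) -> List[str]:
    """Run every subsystem through one step before any subsystem starts the next."""
    for step in ("prepare", "provision", "configure"):
        for subsystem in subsystems:
            getattr(subsystem, step)(log)
    return log
```

The outer loop iterates over steps and the inner loop over subsystems, so a validation failure in any subsystem's prepare step would surface before any subsystem has provisioned anything.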

Subsystems

Subsystems are the building blocks that carry out the actual work. The engine invokes them in a fixed order during each operation. Each subsystem is responsible for one domain:

Subsystem   Responsibility
Storage     Disk partitioning, RAID, encryption, filesystem creation, swap
Boot        Bootloader installation, UEFI variables, A/B boot switching
ESP         EFI System Partition file management
OS Image    Streaming and deploying COSI images to target block devices
OS Config   Users, hostname, kernel parameters, systemd services
Network     Network configuration via Netplan
SELinux     SELinux policy and labeling
Extensions  System extensions (sysexts)
Hooks       Custom pre/post scripts executed at defined points
Management  Management OS configuration for the deployment environment
InitRD      Initramfs configuration

Storage Pipeline

The storage subsystem has the most complex pipeline, executing in this order:

  1. Partitioning — Create or adopt partitions on target disks using the layout specified in the Host Configuration.
  2. RAID — Assemble software RAID arrays (mdadm) if configured.
  3. Encryption — Set up LUKS volumes (cryptsetup) with optional TPM-bound keys.
  4. Image Deployment — Stream filesystem images from COSI files directly onto target block devices.
  5. Filesystem Creation — Create any filesystems not provided by the OS image (ext4, XFS, FAT32, NTFS).
  6. Swap — Configure swap partitions.
  7. Verity — Set up dm-verity for root or /usr integrity verification.
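
Because the order of these steps is fixed while several of them are optional, planning a run amounts to filtering a canonical list. The sketch below shows that idea; the step identifiers and the plan_storage_steps function are hypothetical, and which steps are mandatory in practice is not modeled here.

```python
from typing import Iterable, List

# Canonical order, taken from the list above. The identifiers are illustrative.
STORAGE_PIPELINE = (
    "partitioning",
    "raid",
    "encryption",
    "image-deployment",
    "filesystem-creation",
    "swap",
    "verity",
)


def plan_storage_steps(configured: Iterable[str]) -> List[str]:
    """Return the configured steps in pipeline order, rejecting unknown names."""
    requested = set(configured)
    unknown = requested - set(STORAGE_PIPELINE)
    if unknown:
        raise ValueError(f"unknown storage steps: {sorted(unknown)}")
    return [step for step in STORAGE_PIPELINE if step in requested]
```

The caller can pass steps in any order; the result always follows the pipeline order, so RAID is assembled before encryption and dm-verity is set up last.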

Boot Subsystem

The boot subsystem manages the bootloader and firmware configuration:

  • GRUB2 — Installs and configures GRUB2, generates boot entries, and manages the A/B boot switching logic.
  • systemd-boot — UEFI boot manager integration.
  • UEFI Variables — Sets EFI boot order and fallback entries so that a failed update automatically rolls back to the previous OS.
  • ESP Management — Handles EFI System Partition detection, redundant ESP setups, and partition adoption.

Commands

trident install

Performs a clean install from a servicing OS (typically booted from ISO or PXE):

  1. Provisioning Network Setup — Establish network connectivity.
  2. Storage Preparation — Partition disks according to Host Configuration.
  3. Image Deployment — Stream COSI filesystem images to target partitions.
  4. System Configuration — Apply OS settings, users, and security policies.
  5. Bootloader Installation — Configure GRUB2 or systemd-boot.
  6. DataStore Creation — Establish persistent state tracking.

trident offline-initialize

For virtual machines, offline initialization runs as part of VM image creation:

  1. Image History — Read COSI metadata to understand the image layout.
  2. Disk Layout — Map the COSI partition layout.
  3. DataStore Creation — Establish persistent state to enable future servicing.

trident update

Performs an update (A/B or runtime, as selected in step 2) from within the running host OS:

  1. State Analysis — Compare current Host Status with new Host Configuration.
  2. Servicing Type Selection — Determine update strategy (A/B or runtime).
  3. Image Staging — Download and validate new COSI images.
  4. A/B Volume Preparation — Install updates to inactive volume.
  5. Configuration Migration — Transfer persistent state between A/B volumes.
  6. Boot Configuration Update — Modify bootloader to target the updated volume.
  7. Rollback Preparation — Configure UEFI fallback for safe rollback.

trident commit

Run after the operator or orchestrator has verified a successful update. The command certifies the deployment and updates the boot configuration to reflect the new active volume.

trident rebuild-raid

Rebuilds RAID arrays: detects existing arrays, validates the desired configuration, and initiates the rebuild process.

trident validate

Validates a Host Configuration without making changes: checks schema correctness, logical consistency, and dependency availability.

trident get

Retrieves information from the datastore. The subcommand accepts a kind argument (defaults to status):

Kind             Description
status           Current Host Status (default)
configuration    Active Host Configuration
last-error       Last recorded fatal error
rollback-chain   Full history of available rollback points
rollback-target  The specific state that would be restored by a rollback

Output can be directed to a file with --outfile.

trident rollback

Triggers a manual rollback to the previous system state. Supports two modes:

  1. Runtime Rollback (--runtime) — Reverts runtime configuration changes without rebooting. Only available when the last operation was a runtime update.
  2. A/B Rollback (--ab) — Switches the active/inactive volume pair back, effectively reverting to the previous OS version. Requires a reboot to take effect.

A --check flag is available to preview what rollback operation would be performed without executing it. Like update, rollback supports --allowed-operations to control stage/finalize phases independently.
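
The constraints on the two rollback modes can be summarized as a small decision function, similar in spirit to what --check previews. This is a sketch under assumptions: the plan_rollback function, the operation names, and the returned dictionary are hypothetical, and the model does not cover any additional preconditions the real command may enforce for A/B rollback.

```python
def plan_rollback(last_operation: str, mode: str) -> dict:
    """Describe the rollback that would run, without performing it."""
    if mode == "runtime":
        # Runtime rollback is only available right after a runtime update.
        if last_operation != "runtime-update":
            raise ValueError("runtime rollback requires the last operation "
                             "to be a runtime update")
        return {"rollback": "runtime", "reboot_required": False}
    if mode == "ab":
        # Switching the volume pair back only takes effect after a reboot.
        return {"rollback": "ab", "reboot_required": True}
    raise ValueError(f"unknown rollback mode: {mode}")
```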

trident diagnose

Generates a diagnostic support bundle as a compressed tarball. Collects:

  • Trident logs and datastore state
  • System information
  • Optionally, full system journal and dmesg output (--journal)
  • Optionally, SELinux audit logs (--selinux)

The bundle is saved to the path specified by --output and can be shared for troubleshooting.

trident stream-disk (gRPC client only)

Streams a disk image from a URL directly to the target device. This command is available only through the gRPC client interface and is used for low-level image deployment scenarios. Accepts an optional --hash parameter for manifest integrity verification.

A/B Update Mechanism

Trident's A/B update model uses paired volumes (e.g., root-a / root-b). At any time, one volume is active and one is inactive:

  1. Stage — The new OS image is streamed onto the inactive volume. OS configuration is applied in a deployment chroot. Running workloads on the active volume are unaffected.
  2. Finalize — The boot configuration is updated to point at the newly staged volume. UEFI fallback is configured so that a boot failure automatically reverts to the previously active volume.
  3. Reboot — The system boots into the new OS.
  4. Commit — After the operator or orchestrator verifies the update, a commit marks the deployment as successful and updates the boot configuration to reflect the new active volume.

If the commit never happens (e.g., the new OS fails health checks), the UEFI fallback triggers an automatic rollback on the next reboot.
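
The stage, finalize, reboot, and commit sequence, including the fallback when no commit arrives, can be modeled as a small state machine. The AbHost class below is a toy model written for this explanation: its fields and the "one uncommitted boot, then fall back" rule are simplifying assumptions and do not describe how Trident programs UEFI variables.

```python
from dataclasses import dataclass


@dataclass
class AbHost:
    active: str = "a"        # volume the system is currently running from
    boot_target: str = "a"   # volume the firmware will try on the next boot
    fallback: str = "a"      # volume to return to if the target is never committed
    committed: bool = True

    def inactive(self) -> str:
        return "b" if self.active == "a" else "a"

    def stage(self) -> str:
        """Write the new image to the inactive volume; boot settings are untouched."""
        return self.inactive()

    def finalize(self) -> None:
        """Point the next boot at the staged volume and remember the fallback."""
        self.fallback = self.active
        self.boot_target = self.inactive()
        self.committed = False

    def commit(self) -> None:
        """Mark the running volume as good; it becomes its own fallback."""
        self.committed = True
        self.fallback = self.active

    def reboot(self) -> None:
        """Boot the target, or fall back if it already booted once uncommitted."""
        if not self.committed and self.active == self.boot_target:
            self.boot_target = self.fallback
            self.committed = True
        self.active = self.boot_target
```

In this model a host that is finalized, rebooted, and committed stays on the new volume across later reboots, while a host that is rebooted again without a commit returns to the previous volume.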

COSI Image Format

Trident uses the Composable OS Image (COSI) format for atomic image deployment:

image.cosi (tarball)
├── metadata.json        # Image metadata and filesystem descriptions
└── images/              # Compressed filesystem images
    ├── root.img.zst     # Root filesystem
    ├── usr.img.zst      # /usr partition
    └── ...

Key properties:

  • Streaming Support — Filesystem images are deployed directly to target block devices without intermediate storage.
  • Integrity Verification — SHA-384 checksums for all components.
  • Compression — ZSTD compression for efficient transfer.
  • Metadata Integration — Rich metadata from the build eliminates configuration duplication between image creation and deployment.

For details on how Trident consumes COSI files, see How Trident Consumes COSI.
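
To make the layout and the integrity check concrete, the following Python sketch builds a tiny tarball with the same shape and verifies SHA-384 checksums while reading each image as a stream. The metadata schema used here (an "images" list with "path" and "sha384" keys) is invented for the example, as is the assumption that the checksum covers the compressed file; consult the COSI documentation for the real schema. The payload is placeholder bytes, not a real ZSTD stream.

```python
import hashlib
import io
import json
import tarfile


def build_demo_cosi() -> bytes:
    """Build an in-memory tarball shaped like the layout above (demo data only)."""
    payload = b"placeholder bytes standing in for a zstd image"
    metadata = {  # hypothetical schema, for illustration only
        "images": [{"path": "images/root.img.zst",
                    "sha384": hashlib.sha384(payload).hexdigest()}]
    }
    files = (("metadata.json", json.dumps(metadata).encode()),
             ("images/root.img.zst", payload))
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def verify_cosi(blob: bytes) -> dict:
    """Map each image path to True if its SHA-384 matches the metadata entry."""
    results = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        metadata = json.load(tar.extractfile("metadata.json"))
        for entry in metadata["images"]:
            digest = hashlib.sha384()
            member = tar.extractfile(entry["path"])
            # Hash in chunks so large images never need to fit in memory.
            for chunk in iter(lambda: member.read(65536), b""):
                digest.update(chunk)
            results[entry["path"]] = digest.hexdigest() == entry["sha384"]
    return results
```

Hashing in fixed-size chunks mirrors the streaming property listed above: the image is checked as it is read, without an intermediate copy.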

gRPC Server

The daemon exposes a gRPC API over a Unix domain socket (/run/trident/trident.sock). Key design points:

  • Tonic + Tokio — Built on the Tonic gRPC framework with the Tokio async runtime.
  • Socket Activation — Integrates with systemd socket activation so the daemon only runs when a client connects.
  • Streaming Responses — All servicing operations return a stream of progress messages (Started → Log records → Completed).
  • Concurrency Control — A read-write lock allows multiple status queries concurrently but restricts servicing operations to one at a time.
  • Inactivity Shutdown — The daemon shuts down automatically after a configurable idle period (default: 5 minutes).

For full details, see the gRPC Server explanation.
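
Two of the points above, streamed progress messages and one servicing operation at a time, can be sketched with a Python generator and a lock. This is not the daemon's implementation, which is built on Tonic and Tokio. The message tuples are illustrative, and rejecting a second operation (instead of queueing it) is an assumption. Only the exclusive side of the read-write lock is modeled; concurrent status queries are omitted.

```python
import threading
from typing import Iterable, Iterator, Tuple

# Exclusive gate for servicing operations (the "write" side of the lock).
_servicing_lock = threading.Lock()


def run_servicing(name: str, log_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield a Started message, then log records, then a Completed message."""
    if not _servicing_lock.acquire(blocking=False):
        # Assumption: a concurrent request is rejected rather than queued.
        yield ("rejected", "another servicing operation is in progress")
        return
    try:
        yield ("started", name)
        for line in log_lines:
            yield ("log", line)
        yield ("completed", name)
    finally:
        # Released when the stream is exhausted or closed early.
        _servicing_lock.release()
```

A client consumes the stream message by message, so progress is visible while the operation is still running and the gate stays held until the final message has been delivered.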

Datastore

Trident maintains a SQLite database on the managed filesystem. The datastore records:

  • The Host Configuration used for each servicing operation.
  • The resulting Host Status after each operation.
  • A history of all servicing operations for audit and diagnostics.

The datastore operates in two modes: persistent for ongoing servicing of an installed host, or temporary for installer scenarios where the datastore is created fresh and written to the target filesystem.

This enables Trident to determine the current system state on subsequent invocations without rescanning hardware.
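
As a rough illustration of the idea, the sketch below keeps one row per servicing operation in SQLite and reads the most recent Host Status back. The table name, columns, and JSON encoding are hypothetical and chosen for the example; they do not reflect Trident's actual datastore schema.

```python
import json
import sqlite3
from typing import Optional


def open_datastore(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a datastore with a hypothetical operations table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS operations (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               servicing_type TEXT NOT NULL,
               host_configuration TEXT NOT NULL,
               host_status TEXT NOT NULL
           )"""
    )
    return conn


def record_operation(conn: sqlite3.Connection, servicing_type: str,
                     configuration: dict, status: dict) -> None:
    """Append one servicing operation together with its resulting status."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO operations (servicing_type, host_configuration, host_status) "
            "VALUES (?, ?, ?)",
            (servicing_type, json.dumps(configuration), json.dumps(status)),
        )


def current_status(conn: sqlite3.Connection) -> Optional[dict]:
    """Return the Host Status of the latest operation, or None if none exists."""
    row = conn.execute(
        "SELECT host_status FROM operations ORDER BY id DESC LIMIT 1"
    ).fetchone()
    return json.loads(row[0]) if row else None
```

An empty datastore returning None corresponds to the "no prior state exists" case that leads to a clean install.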

Host Configuration

All behavior is driven by the Host Configuration YAML file. It specifies:

trident:        # Trident agent configuration
storage:        # Storage layout: disks, partitions, RAID, encryption
os:             # OS settings: users, SELinux, network, services
image:          # COSI image URL and integrity hash
scripts:        # Custom pre/post automation hooks
managementOs:   # Servicing OS settings

The engine compares a new Host Configuration against the stored Host Status to decide which subsystems need to run and what servicing type to use.

For a complete example, see the sample Host Configuration.

Data Flow

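In a typical servicing operation, a Host Configuration arrives through the CLI or the gRPC API. The engine compares it with the Host Status stored in the datastore and selects a servicing type. The subsystems then run their prepare, provision, and configure steps, and the resulting Host Status is written back to the datastore.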

Design Principles

  • Declarative Configuration — The Host Configuration describes the desired end state. Trident determines the necessary actions. Operations are idempotent.
  • Separation of Concerns — Each subsystem manages a specific OS layer with clear interfaces between components.
  • Safety and Reliability — A/B updates provide automatic rollback. Changes are validated before execution. State tracking enables recovery from failures.
  • Platform Agnostic — Core servicing logic is separated from product-specific concerns. Extensibility is provided through hooks and scripts.

External Tool Integration

Trident wraps standard Linux utilities rather than reimplementing their functionality:

Storage & Partitioning:

Tool              Used For
systemd-repart    Declarative partition management
sfdisk            Low-level partition table manipulation
partx             Kernel partition table re-reading
mdadm             Software RAID creation and management
mkfs              Filesystem creation
resize2fs         ext filesystem resizing
e2fsck            ext filesystem consistency checks
tune2fs           ext filesystem parameter tuning
mkswap            Swap area creation
swapon / swapoff  Swap activation and deactivation
wipefs            Filesystem signature removal
dd                Raw block-level data copying
losetup           Loop device management

Encryption & Integrity:

Tool                       Used For
cryptsetup                 LUKS disk encryption
systemd-cryptenroll        LUKS token enrollment (TPM2, FIDO2)
veritysetup                dm-verity integrity verification
openssl                    Certificate and key operations
tpm2_clear / tpm2_pcrread  TPM2 device management and PCR inspection
systemd-pcrlock            Predictive PCR measurement locking

Boot:

Tool               Used For
grub2-mkconfig     GRUB2 configuration generation
efibootmgr         UEFI boot entry management
efivar             UEFI variable access
dracut / mkinitrd  Initramfs generation
systemd-firstboot  Initial system identity setup

System & Device Management:

Tool                                   Used For
systemctl                              systemd service management
systemd-sysext / systemd-confext       System and configuration extensions
udevadm                                Device event handling and settling
lsblk / blkid                          Block device discovery and identification
findmnt / mount / umount / mountpoint  Filesystem mount operations
lsof                                   Open file detection (safe unmount checks)
eject                                  Removable media ejection
uname                                  Kernel version detection
df                                     Filesystem space reporting

OS Configuration:

Tool                          Used For
useradd / usermod / chpasswd  User and credential management
setfiles                      SELinux filesystem relabeling
netplan                       Network configuration
iptables                      Firewall rule management
journalctl                    System log collection (diagnostics)