Galaxy Trajectory Report

Overview

The Galaxy Trajectory Report (output.md) is an automatically generated execution log that documents the full lifecycle of a multi-device task execution session in Galaxy. This human-readable Markdown report walks step by step through constellation evolution, task execution, and device coordination.

Report Location

After each Galaxy session completes, the trajectory report is automatically written to the session's log directory:

logs/galaxy/<session_name>/output.md
logs/galaxy/<session_name>/topology_images/  # DAG visualizations

Example:

logs/galaxy/request_20251111_140216_1/
├── output.md                    # Main trajectory report
├── response.log                 # Raw JSONL execution log
├── request.log                  # Request details  
├── evaluation.log               # Optional evaluation
├── result.json                  # Performance metrics
└── topology_images/             # Generated DAG topology graphs
    ├── step1_after_constellation_xxx.png
    ├── step2_after_constellation_xxx.png
    └── step999_final_constellation_xxx.png
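The layout above can be checked from a script. The following is a minimal sketch, assuming only the directory structure shown here; `locate_report` is a hypothetical helper for illustration and is not part of Galaxy.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def locate_report(session_dir: str) -> Tuple[Optional[Path], List[Path]]:
    """Return (path to output.md or None, sorted list of topology PNGs).

    Hypothetical helper: it only inspects the file layout described above.
    """
    root = Path(session_dir)
    report = root / "output.md"
    images_dir = root / "topology_images"
    # A session may have no topology_images/ directory at all
    images = sorted(images_dir.glob("*.png")) if images_dir.is_dir() else []
    return (report if report.is_file() else None, images)
```

Calling `locate_report("logs/galaxy/request_20251111_140216_1")` returns `None` for the report when output.md has not been generated yet, which is a quick way to detect sessions that need manual regeneration.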

Report Structure

1. Executive Summary

High-level session overview:

## Executive Summary

- **User Request**: type hi on all linux and write results to windows notepad
- **Total Steps**: 4
- **Total Time**: 31.54s

Components:

- User Request: Original natural language task description
- Total Steps: Number of orchestration steps (DAG creation + execution rounds)
- Total Time: End-to-end session duration in seconds

2. Step-by-Step Execution

Detailed breakdown of each orchestration step with:

Step Metadata

### Step 2

- **Agent**: constellation_agent (ConstellationAgent)
- **Status**: CONTINUE
- **Round**: 0 | **Round Step**: 0
- **Execution Time**: 9.27s
- **Time Breakdown**:
  - LLM_INTERACTION: 8.96s
  - ACTION_EXECUTION: 0.29s
  - MEMORY_UPDATE: 0.00s

Fields:

- Agent: Agent name and type (ConstellationAgent for orchestration)
- Status: Step outcome (CONTINUE, FINISH, ERROR)
- Round/Round Step: ReAct iteration counters
- Execution Time: Total step duration
- Time Breakdown: Profiling data for LLM calls, action execution, memory updates
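To see where a step spends its time, the Time Breakdown values can be turned into percentages. This is a small sketch using the sample numbers from Step 2; `time_breakdown_shares` is a hypothetical helper, and the shares are computed against the sum of the listed phases (9.25s), which can differ slightly from the reported Execution Time (9.27s).

```python
from typing import Dict


def time_breakdown_shares(execution_times: Dict[str, float]) -> Dict[str, float]:
    """Return each phase's share of the summed phase time, in percent."""
    total = sum(execution_times.values())
    if total == 0:
        # Avoid division by zero for steps with no recorded timings
        return {phase: 0.0 for phase in execution_times}
    return {
        phase: round(100 * seconds / total, 1)
        for phase, seconds in execution_times.items()
    }


shares = time_breakdown_shares(
    {"LLM_INTERACTION": 8.96, "ACTION_EXECUTION": 0.29, "MEMORY_UPDATE": 0.00}
)
print(shares)
```

For the sample step, the LLM call accounts for roughly 97% of the measured time, so prompt size and model latency are the first things to look at.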

Actions Performed

Documents agent actions with collapsible argument details:

#### Actions Performed

**Function**: `build_constellation`

<details>
<summary>Arguments (click to expand)</summary>

```json
{
  "config": {
    "constellation_id": "constellation_xxx",
    "tasks": { ... },
    "dependencies": { ... }
  }
}
```

</details>

Common Functions:

- `build_constellation`: Initial DAG creation
- `edit_constellation`: Dynamic DAG modification
- `execute_constellation`: Trigger task execution

Constellation Evolution

Visualizes DAG state changes with embedded topology graphs:

#### Constellation Evolution

<details>
<summary>Constellation AFTER (click to expand)</summary>

**Constellation ID**: constellation_bcd1726e_20251105_134526
**State**: created

##### Dependency Graph (Topology)

<img src="topology_images/step2_after_constellation_xxx.png" width="600">

##### Task Summary Table

| Task ID | Name | Status | Device | Duration |
|---------|------|--------|--------|----------|
| task-1 | Type hi on linux_agent_1 | pending | linux_agent_1 | N/A |
| task-2 | Type hi on linux_agent_2 | pending | linux_agent_2 | N/A |
| task-3 | Type hi on linux_agent_3 | pending | linux_agent_3 | N/A |

</details>

Topology Visualization Features:

- Color-coded nodes by task status:
  - 🟢 Green: Completed
  - 🔵 Cyan: Running
  - ⚫ Gray: Pending
  - 🔴 Red: Failed/Error
- Edge styles for dependencies:
  - Solid green: Satisfied dependencies
  - Dashed orange: Pending dependencies
- Automatic layout with hierarchical spring algorithm
- Legend showing node/edge meanings

Detailed Task Information

Comprehensive task metadata with execution details:

#### Task task-1: Type hi on linux_agent_1

- **Status**: completed
- **Target Device**: linux_agent_1
- **Priority**: 2
- **Description**: On device linux_agent_1 (Linux), open a terminal and execute the command: echo 'hi'. Return the output text.
- **Tips**:
  - Ensure CLI access is available.
  - Expected textual result: Return the exact output of the command, which should be 'hi'.
- **Result**: 
  ```
  hi
  ```
- **Started**: 2025-11-05T05:45:26.395208+00:00
- **Ended**: 2025-11-05T05:45:42.981859+00:00
- **Duration**: 16.59s

Task Fields:

- Status: Current execution state (pending, running, completed, failed, cancelled)
- Target Device: Assigned device agent ID
- Priority: Task scheduling priority (1=HIGH, 2=MEDIUM, 3=LOW)
- Description: Natural language task specification for device agent
- Tips: Execution hints and expected output guidance
- Result: Task execution output (truncated if large)
- Error: Error message if task failed
- Timing: Start/end timestamps and duration
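The Duration field is consistent with the difference between the Started and Ended timestamps. A short sketch, using the sample values from task-1 and only the standard library (`task_duration_seconds` is a hypothetical helper):

```python
from datetime import datetime


def task_duration_seconds(started: str, ended: str) -> float:
    """Seconds elapsed between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(ended) - datetime.fromisoformat(started)
    return delta.total_seconds()


duration = task_duration_seconds(
    "2025-11-05T05:45:26.395208+00:00",
    "2025-11-05T05:45:42.981859+00:00",
)
print(f"{duration:.2f}s")  # prints "16.59s"
```

This is handy when a task has timestamps but no duration, for example when a log was cut off before the duration was recorded.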

Dependency Details

Shows task relationships and satisfaction status:

| Line ID | From Task | To Task | Type | Satisfied | Condition |
|---------|-----------|---------|------|-----------|----------|
| l1 | t1 | t4 | unconditional | [PENDING] | Output from linux_agent_1 collected successfully. |
| l2 | t2 | t4 | unconditional | [OK] | Output from linux_agent_2 collected successfully. |

Dependency Types:

- unconditional: Always active when source task completes
- conditional: Activated based on result evaluation
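One way to read the Satisfied column is as a gate on the downstream task: a task can start only when every incoming dependency line is satisfied. The sketch below illustrates that reading; it is not Galaxy's scheduler. The `from_task_id` and `to_task_id` keys match the dependency records used elsewhere in this document, while `is_satisfied` is an assumed field name and `ready_tasks` is a hypothetical helper.

```python
from typing import Any, Dict, List


def ready_tasks(
    tasks: Dict[str, Dict[str, Any]],
    dependencies: Dict[str, Dict[str, Any]],
) -> List[str]:
    """Return IDs of pending tasks whose incoming dependencies are all satisfied."""
    blocked = {
        dep["to_task_id"]
        for dep in dependencies.values()
        if not dep.get("is_satisfied", False)  # assumed field name
    }
    return sorted(
        task_id
        for task_id, task in tasks.items()
        if task.get("status") == "pending" and task_id not in blocked
    )


# Mirrors the sample table: l1 is still pending, l2 is satisfied
tasks = {
    "t1": {"status": "running"},
    "t2": {"status": "completed"},
    "t4": {"status": "pending"},
}
dependencies = {
    "l1": {"from_task_id": "t1", "to_task_id": "t4", "is_satisfied": False},
    "l2": {"from_task_id": "t2", "to_task_id": "t4", "is_satisfied": True},
}
print(ready_tasks(tasks, dependencies))  # prints [] because l1 still blocks t4
```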

Connected Devices

Device registry snapshot at step completion:

<details>
<summary>Connected Devices</summary>

| Device ID | OS | Status | Last Heartbeat |
|-----------|----|---------|--------------|
| windowsagent | windows | idle | 2025-11-05T05:45:43 |
| linux_agent_1 | linux | idle | 2025-11-05T05:45:43 |
| linux_agent_2 | linux | idle | 2025-11-05T05:45:43 |
| linux_agent_3 | linux | idle | 2025-11-05T05:45:43 |

</details>

Device Statuses:

- idle: Connected and available
- busy: Executing task
- disconnected: WebSocket connection lost

3. Final Constellation State

Complete final DAG with all task results:

## Final Constellation State

**ID**: constellation_bcd1726e_20251105_134526
**State**: completed
**Created**: 2025-11-05T05:45:26.230930+00:00
**Updated**: 2025-11-05T05:45:42.981859+00:00

### Task Details
[Full task information with results]

### Task Summary Table
[Aggregated task status table]

### Final Dependency Graph
[Final topology visualization]

Generation Process

Automatic Generation

The trajectory report is generated automatically by GalaxySession upon completion:

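# galaxy/session/galaxy_session.py
import os

async def close_session(self):
    """Generate the trajectory report when the session closes."""
    trajectory = GalaxyTrajectory(self.log_path)
    # os.path.join works whether or not log_path ends with a path separator
    trajectory.to_markdown(os.path.join(self.log_path, "output.md"))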

Trigger Points:

1. Normal session completion (GalaxyClient.shutdown())
2. User termination (Ctrl+C in interactive mode)
3. Error-induced session end

Manual Generation

You can regenerate reports manually using the CLI tool:

# Generate report for specific session
python -m galaxy.trajectory.generate_report logs/galaxy/test1

# Custom output path
python -m galaxy.trajectory.generate_report logs/galaxy/test1 -o custom_report.md

# Minimal report (exclude details)
python -m galaxy.trajectory.generate_report logs/galaxy/test1 \
  --no-constellation --no-tasks --no-devices

CLI Options:

- --no-constellation: Exclude constellation evolution details
- --no-tasks: Exclude detailed task information
- --no-devices: Exclude device connection information
- -o, --output: Custom output file path
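Each `--no-*` flag corresponds to one of the `include_*` parameters of `to_markdown()`. The sketch below shows one plausible mapping with `argparse`; it is an illustration under that assumption, not the actual source of `galaxy.trajectory.generate_report`. The function names `parse_report_args` and `to_markdown_kwargs` are hypothetical, and defaulting the output to output.md inside the log directory is also an assumption.

```python
import argparse
import os
from typing import Any, Dict, List, Optional


def parse_report_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the documented CLI options (hypothetical re-implementation)."""
    parser = argparse.ArgumentParser(description="Generate a Galaxy trajectory report")
    parser.add_argument("log_dir", help="Session log directory, e.g. logs/galaxy/test1")
    parser.add_argument("-o", "--output", default=None, help="Custom output file path")
    parser.add_argument("--no-constellation", action="store_true")
    parser.add_argument("--no-tasks", action="store_true")
    parser.add_argument("--no-devices", action="store_true")
    return parser.parse_args(argv)


def to_markdown_kwargs(args: argparse.Namespace) -> Dict[str, Any]:
    """Translate parsed flags into keyword arguments for to_markdown()."""
    return {
        # Assumed default: write output.md next to the logs
        "output_path": args.output or os.path.join(args.log_dir, "output.md"),
        "include_constellation_details": not args.no_constellation,
        "include_task_details": not args.no_tasks,
        "include_device_info": not args.no_devices,
    }


args = parse_report_args(["logs/galaxy/test1", "--no-tasks", "-o", "custom_report.md"])
print(to_markdown_kwargs(args))
```

Negative flags keep the full report as the default, so a bare invocation produces the same content as the automatically generated report.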

Batch Generation

Process multiple sessions at once:

# galaxy/trajectory/galaxy_parser.py
from pathlib import Path

if __name__ == "__main__":
    # Process all Galaxy task logs and generate Markdown reports.

    galaxy_logs_dir = Path("logs/galaxy")
    task_dirs = sorted([d for d in galaxy_logs_dir.iterdir() if d.is_dir()])

    for task_dir in task_dirs:
        trajectory = GalaxyTrajectory(str(task_dir))
        output_path = task_dir / "trajectory_report.md"
        trajectory.to_markdown(str(output_path))

Run batch processing:

cd <repository_root>
python -m galaxy.trajectory.galaxy_parser

Output:

Galaxy Trajectory Parser - Batch Mode
Found 42 task directories

Processing task_1... [OK]
Processing task_2... [OK]
Processing test1... [OK]
...

=====================================================
Summary:
  Total: 42
  Success: 40
  Skipped: 2
  Failed: 0
=====================================================

Programmatic Access

Loading Trajectory Data

from galaxy.trajectory import GalaxyTrajectory

# Load trajectory from log directory
trajectory = GalaxyTrajectory("logs/galaxy/test1")

# Access metadata
print(f"Request: {trajectory.request}")
print(f"Steps: {trajectory.total_steps}")
print(f"Cost: ${trajectory.total_cost:.4f}")
print(f"Time: {trajectory.total_time:.2f}s")

# Iterate through steps
for idx, step in enumerate(trajectory.step_log, 1):
    agent = step.get("agent_name")
    status = step.get("status")
    time = step.get("total_time", 0)
    print(f"Step {idx}: {agent} - {status} ({time:.2f}s)")

Extracting Constellation Data

# Get final constellation state
last_step = trajectory.step_log[-1]
final_constellation = trajectory._parse_constellation(
    last_step.get("constellation_after")
)

if final_constellation:
    constellation_id = final_constellation.get("constellation_id")
    state = final_constellation.get("state")
    tasks = final_constellation.get("tasks", {})

    print(f"Constellation {constellation_id}: {state}")
    print(f"Tasks: {len(tasks)}")

    # Analyze task outcomes
    completed = sum(1 for t in tasks.values() if t.get("status") == "completed")
    failed = sum(1 for t in tasks.values() if t.get("status") == "failed")

    print(f"Completed: {completed}/{len(tasks)}")
    print(f"Failed: {failed}/{len(tasks)}")

Custom Report Generation

# Generate custom report with specific options
trajectory.to_markdown(
    output_path="custom_report.md",
    include_constellation_details=True,  # Show DAG evolution
    include_task_details=True,          # Show task results
    include_device_info=False           # Hide device info
)

Visualization Features

Topology Graph Generation

The trajectory report includes dynamically generated DAG topology images:

Implementation:

from typing import Any, Dict, Optional

import networkx as nx

def _generate_topology_image(
    self,
    dependencies: Dict[str, Any],
    tasks: Dict[str, Any],
    constellation_id: str,
    step_number: int,
    state: str = "before"
) -> Optional[str]:
    """Generate beautiful topology graph using networkx and matplotlib"""

    # Create directed graph
    G = nx.DiGraph()

    # Add all tasks as nodes
    for task_id in tasks.keys():
        G.add_node(task_id)

    # Add dependency edges
    for dep in dependencies.values():
        from_task = dep["from_task_id"]
        to_task = dep["to_task_id"]
        G.add_edge(from_task, to_task)

    # Color nodes by status
    status_colors = {
        "completed": "#28A745",  # Green
        "running": "#17A2B8",    # Cyan
        "pending": "#6C757D",    # Gray
        "failed": "#DC3545",     # Red
    }

    # Generate layout and save image
    pos = nx.spring_layout(G, k=1.5, iterations=100)
    # ... [matplotlib rendering code]

Graph Features:

- Hierarchical Layout: Spring algorithm with optimized spacing (k=1.5)
- Adaptive Node Size: Ellipses scale with task ID length
- Color-Coded Status: Bootstrap-inspired color scheme
- Edge Differentiation: Solid (satisfied) vs dashed (pending)
- Legend: Automatic status and dependency type legend
- High Quality: 120 DPI PNG with antialiasing

Image Organization

topology_images/
├── step1_after_constellation_7b3c0f47_20251104_182305.png
├── step2_before_constellation_bcd1726e_20251105_134526.png
├── step2_after_constellation_bcd1726e_20251105_134526.png
├── step3_before_constellation_bcd1726e_20251105_134526.png
├── step3_after_constellation_bcd1726e_20251105_134526.png
└── step999_final_constellation_bcd1726e_20251105_134526.png

Naming Convention:

- step{N}_{state}_{constellation_id}.png
- state: before, after, or final
- step999: Reserved for final summary graph
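Because the file names follow a fixed pattern, they can be parsed back into their parts, for example to sort images by step or to group them by constellation. A minimal sketch based only on the convention above; `TopologyImage` and `parse_topology_filename` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class TopologyImage(NamedTuple):
    step: int
    state: str
    constellation_id: str


# step{N}_{state}_{constellation_id}.png with state in before/after/final
_PATTERN = re.compile(r"^step(\d+)_(before|after|final)_(.+)\.png$")


def parse_topology_filename(name: str) -> Optional[TopologyImage]:
    """Split a topology image file name into step, state, and constellation ID."""
    match = _PATTERN.match(name)
    if match is None:
        return None  # Not a topology image produced under this convention
    return TopologyImage(int(match.group(1)), match.group(2), match.group(3))


info = parse_topology_filename("step2_after_constellation_bcd1726e_20251105_134526.png")
print(info.step, info.state, info.constellation_id)
```

Files that do not match the pattern return `None`, so unrelated files in the directory can be skipped safely.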

Use Cases

1. Debugging Failed Sessions

Identify which task failed and why:

from galaxy.trajectory import GalaxyTrajectory

trajectory = GalaxyTrajectory("logs/galaxy/failed_session")

for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if not constellation:
        continue

    tasks = constellation.get("tasks", {})
    for task_id, task in tasks.items():
        if task.get("status") == "failed":
            print(f"❌ Task {task_id}: {task.get('name')}")
            print(f"   Device: {task.get('target_device_id')}")
            print(f"   Error: {task.get('error')}")

2. Performance Analysis

Correlate with result.json for bottleneck identification:

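import json

from galaxy.trajectory import GalaxyTrajectory

# Load trajectory for execution timeline
trajectory = GalaxyTrajectory("logs/galaxy/task_32")

# Load metrics for performance data
with open("logs/galaxy/task_32/result.json") as f:
    result = json.load(f)

metrics = result["session_results"]["metrics"]
task_stats = metrics["task_statistics"]
print(f"Task statistics: {task_stats}")

# Keep the latest recorded duration per task; the same task appears in
# every step's snapshot, so later steps overwrite earlier ones
durations = {}
for step in trajectory.step_log:
    # _parse_constellation may return None, so fall back to an empty dict
    constellation = trajectory._parse_constellation(step.get("constellation_after")) or {}
    for tid, task in constellation.get("tasks", {}).items():
        durations[tid] = task.get("execution_duration") or 0.0

# Find slowest tasks
slow_tasks = sorted(durations.items(), key=lambda x: x[1], reverse=True)
print("Top 5 slowest tasks:")
for tid, duration in slow_tasks[:5]:
    print(f"  {tid}: {duration:.2f}s")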

3. Constellation Evolution Analysis

Track DAG modifications across steps:

from galaxy.trajectory import GalaxyTrajectory

trajectory = GalaxyTrajectory("logs/galaxy/adaptive_session")

for idx, step in enumerate(trajectory.step_log, 1):
    before = trajectory._parse_constellation(step.get("constellation_before"))
    after = trajectory._parse_constellation(step.get("constellation_after"))

    if before and after:
        tasks_before = len(before.get("tasks", {}))
        tasks_after = len(after.get("tasks", {}))

        if tasks_after > tasks_before:
            print(f"Step {idx}: Added {tasks_after - tasks_before} tasks")
        elif tasks_after < tasks_before:
            print(f"Step {idx}: Removed {tasks_before - tasks_after} tasks")

4. Device Utilization Tracking

Analyze device workload distribution:

from collections import Counter

from galaxy.trajectory import GalaxyTrajectory

trajectory = GalaxyTrajectory("logs/galaxy/multi_device")

# Record each task's device once; the same task appears in every step's
# snapshot, so counting per step would inflate the totals
task_device = {}
for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if not constellation:
        continue

    for task_id, task in constellation.get("tasks", {}).items():
        task_device[task_id] = task.get("target_device_id")

# Count tasks per device
device_tasks = Counter(task_device.values())

print("Task distribution:")
for device, count in device_tasks.most_common():
    print(f"  {device}: {count} tasks")

5. Session Comparison

Compare multiple sessions for regression testing:

from galaxy.trajectory import GalaxyTrajectory

def compare_sessions(session1_path, session2_path):
    t1 = GalaxyTrajectory(session1_path)
    t2 = GalaxyTrajectory(session2_path)

    print("Session 1 vs Session 2:")
    print(f"  Steps: {t1.total_steps} vs {t2.total_steps}")
    print(f"  Time: {t1.total_time:.2f}s vs {t2.total_time:.2f}s")
    print(f"  Cost: ${t1.total_cost:.4f} vs ${t2.total_cost:.4f}")

    # Positive values mean session 2 was faster; skip if session 1 has no recorded time
    if t1.total_time > 0:
        speedup = (t1.total_time - t2.total_time) / t1.total_time * 100
        print(f"  Performance: {speedup:+.1f}%")

compare_sessions("logs/galaxy/test_v1", "logs/galaxy/test_v2")

Data Sources

The trajectory report aggregates data from multiple log sources:

1. response.log (Primary Source)

JSONL file with per-step execution records:

{
  "request": "type hi on all linux devices",
  "agent_name": "constellation_agent",
  "agent_type": "ConstellationAgent",
  "status": "CONTINUE",
  "round_num": 0,
  "round_step": 0,
  "total_time": 9.27,
  "cost": 0.0042,
  "execution_times": {
    "LLM_INTERACTION": 8.96,
    "ACTION_EXECUTION": 0.29,
    "MEMORY_UPDATE": 0.00
  },
  "action": [
    {
      "function": "build_constellation",
      "arguments": { ... }
    }
  ],
  "constellation_before": "{...}",
  "constellation_after": "{...}",
  "device_info": { ... }
}
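Since response.log is JSONL, each non-empty line is an independent JSON object and can be read without loading the whole file at once. The following sketch reads such a file with the standard library and totals the per-step `total_time` and `cost` fields shown above. `read_jsonl` and `summarize_steps` are hypothetical helpers, and summing per-step values is an assumption about how session totals relate to step records, not a description of GalaxyTrajectory's internals.

```python
import json
from typing import Any, Dict, Iterator, List


def read_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # Skip blank lines between records
                yield json.loads(line)


def summarize_steps(steps: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Total the per-step time and cost fields, treating missing values as 0."""
    return {
        "steps": len(steps),
        "total_time": sum(step.get("total_time") or 0.0 for step in steps),
        "total_cost": sum(step.get("cost") or 0.0 for step in steps),
    }
```

For example, `summarize_steps(list(read_jsonl("logs/galaxy/test1/response.log")))` gives a quick cross-check against the totals reported in result.json.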

2. result.json (Performance Metrics)

Aggregated session-level metrics:

{
  "session_results": {
    "request": "type hi on all linux devices",
    "status": "completed",
    "total_cost": 0.0156,
    "total_rounds": 1,
    "total_steps": 4,
    "total_time": 31.54,
    "metrics": {
      "task_statistics": { ... },
      "constellation_statistics": { ... }
    }
  }
}

3. evaluation.log (Optional)

User-provided evaluation results:

{
  "task_success": true,
  "evaluation_score": 5,
  "comments": "All tasks completed successfully"
}

Configuration

Customizing Report Content

Control report verbosity via generation parameters:

trajectory.to_markdown(
    output_path="output.md",
    include_constellation_details=True,  # DAG evolution (default: True)
    include_task_details=True,          # Task execution logs (default: True)
    include_device_info=True            # Device status (default: True)
)

Report Size Impact:

- Full report (all options enabled): ~200KB for 10-task session
- Minimal report (all options disabled): ~20KB
- Topology images: ~50KB each

Topology Graph Styling

Customize graph appearance by modifying _generate_topology_image():

# Adjust node colors
status_colors = {
    "completed": "#28A745",  # Change to custom color
    "running": "#17A2B8",
    # ...
}

# Adjust layout parameters
pos = nx.spring_layout(
    G,
    k=1.5,        # Node spacing (higher = more spread)
    iterations=100,  # Layout quality (higher = better but slower)
    seed=42       # Deterministic layout
)

# Adjust image quality
plt.savefig(
    image_path,
    dpi=120,           # Resolution (higher = larger files)
    bbox_inches="tight",
    facecolor="white"
)

Best Practices

1. Regular Report Review

Monitor trajectory reports to catch issues early:

# Generate reports for recent sessions (bash)
for dir in logs/galaxy/*/; do
    python -m galaxy.trajectory.generate_report "$dir"
done

# Open a report in the default viewer for visual inspection
# (`start` is the Windows command; use `open` on macOS or `xdg-open` on Linux)
start logs/galaxy/test1/output.md

2. Archive Trajectory Reports

Store reports with version control for reproducibility:

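# Create timestamped archive with one subdirectory per session,
# so identically named files do not overwrite each other
archive="trajectory_archives/$(date +%Y-%m-%d)"
for dir in logs/galaxy/*/; do
    session="$(basename "$dir")"
    mkdir -p "$archive/$session"
    cp "${dir}output.md" "${dir}result.json" "$archive/$session/"
done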

3. Automated Analysis

Integrate trajectory parsing into CI/CD pipelines:

# test/analyze_trajectory.py
from galaxy.trajectory import GalaxyTrajectory

def validate_trajectory(log_dir):
    trajectory = GalaxyTrajectory(log_dir)

    # Check for failures
    for step in trajectory.step_log:
        if step.get("status") == "ERROR":
            raise AssertionError(f"Session failed at step {step.get('_line_number')}")

    # Check performance thresholds
    if trajectory.total_time > 60.0:
        print(f"WARNING: Session took {trajectory.total_time:.2f}s (>60s threshold)")

    return True

4. Compare Before/After States

Use constellation evolution to verify correctness:

from galaxy.trajectory import GalaxyTrajectory

# Verify DAG grows monotonically (no premature task deletion)
trajectory = GalaxyTrajectory("logs/galaxy/session")

prev_task_count = 0
for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if constellation:
        task_count = len(constellation.get("tasks", {}))
        if task_count < prev_task_count:
            print(f"WARNING: Task count decreased from {prev_task_count} to {task_count}")
        prev_task_count = task_count

Related Documentation

- Performance Metrics: Quantitative session analysis with result.json
- Result JSON Reference: Complete result.json schema documentation
- Galaxy Overview: Main Galaxy framework documentation
- Constellation Orchestrator: DAG execution engine
- Task Constellation: DAG data structure and validation

Troubleshooting

Empty or Missing Report

Problem: output.md not generated after session

Solutions:

Check for response.log existence:

ls logs/galaxy/<session_name>/response.log

Manually trigger generation:

python -m galaxy.trajectory.generate_report logs/galaxy/<session_name>

Verify session closed properly (check for exception in terminal)

Parse Errors in Report

Problem: ⚠️ Parse Error warnings in report

Cause: Legacy log format with serialization bugs (tasks as Python strings instead of JSON)

Solution: This is a known issue that is fixed in current versions. Reports generated from affected logs display:

##### ⚠️ Parse Error

**Error Type**: `legacy_serialization_bug`
**Message**: Tasks field contains Python object representations (not pure JSON). 
This is due to a serialization bug in older versions.

Workaround: Re-run session with updated codebase to generate proper logs.

Missing Topology Images

Problem: Broken image links in report

Solutions:

Check topology_images/ directory exists:

ls logs/galaxy/<session_name>/topology_images/

Verify matplotlib backend:

import matplotlib
matplotlib.use("Agg")  # Non-interactive backend; select it before importing matplotlib.pyplot

Regenerate report to recreate images:

python -m galaxy.trajectory.generate_report logs/galaxy/<session_name>

Large Report Files

Problem: output.md exceeds 10MB

Solutions:

Generate minimal report:

python -m galaxy.trajectory.generate_report logs/galaxy/<session_name> \
  --no-constellation --no-tasks

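Reduce topology image quality (edit galaxy_parser.py). Images are linked from output.md rather than embedded, so this shrinks the files in topology_images/, not output.md itself: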
```
plt.savefig(image_path, dpi=80)  # Lower DPI
```

Archive and compress:

gzip logs/galaxy/<session_name>/output.md

API Reference

GalaxyTrajectory Class

class GalaxyTrajectory:
    """Parser for Galaxy agent logs with constellation visualization"""

    def __init__(self, folder_path: str) -> None:
        """
        Initialize trajectory parser.

        Args:
            folder_path: Path to Galaxy log directory (e.g., logs/galaxy/task_1)

        Raises:
            ValueError: If response.log file not found
        """

    @property
    def step_log(self) -> List[Dict[str, Any]]:
        """Get all step logs from response.log"""

    @property
    def evaluation_log(self) -> Dict[str, Any]:
        """Get evaluation results from evaluation.log"""

    @property
    def request(self) -> Optional[str]:
        """Get original user request"""

    @property
    def total_steps(self) -> int:
        """Get total number of steps"""

    @property
    def total_cost(self) -> float:
        """Calculate total LLM cost"""

    @property
    def total_time(self) -> float:
        """Calculate total execution time"""

    def to_markdown(
        self,
        output_path: str,
        include_constellation_details: bool = True,
        include_task_details: bool = True,
        include_device_info: bool = True
    ) -> None:
        """
        Export trajectory to Markdown file.

        Args:
            output_path: Path to save markdown file
            include_constellation_details: Include DAG evolution details
            include_task_details: Include task execution logs
            include_device_info: Include device status information
        """

Next Steps:

- Combine trajectory reports with result.json metrics for comprehensive analysis
- Automate report generation in CI/CD pipelines
- Visualize execution timelines with custom scripts
- Compare session trajectories for performance regression testing
