Galaxy Trajectory Report
Overview
The Galaxy Trajectory Report (output.md) is an automatically generated execution log that documents the full lifecycle of a multi-device task execution session in Galaxy. This human-readable Markdown report shows, step by step, how the constellation evolved, how tasks were executed, and how devices were coordinated.
Report Location
After each Galaxy session completes, the trajectory report is automatically generated:
logs/galaxy/<session_name>/output.md
logs/galaxy/<session_name>/topology_images/ # DAG visualizations
Example:
logs/galaxy/request_20251111_140216_1/
├── output.md # Main trajectory report
├── response.log # Raw JSONL execution log
├── request.log # Request details
├── evaluation.log # Optional evaluation
├── result.json # Performance metrics
└── topology_images/ # Generated DAG topology graphs
    ├── step1_after_constellation_xxx.png
    ├── step2_after_constellation_xxx.png
    └── step999_final_constellation_xxx.png
Report Structure
1. Executive Summary
High-level session overview:
## Executive Summary
- **User Request**: type hi on all linux and write results to windows notepad
- **Total Steps**: 4
- **Total Time**: 31.54s
Components:
- User Request: Original natural language task description
- Total Steps: Number of orchestration steps (DAG creation + execution rounds)
- Total Time: End-to-end session duration in seconds
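Because the summary follows a fixed `- **Label**: value` pattern, it can be read back from a report with a regular expression. The following is a minimal sketch under that assumption; `parse_executive_summary` is a hypothetical helper and not part of Galaxy.

```python
import re

# Matches lines such as "- **Total Steps**: 4"
FIELD = re.compile(r"^- \*\*(?P<label>[^*]+)\*\*: (?P<value>.*)$")


def parse_executive_summary(markdown_text):
    """Collect the label/value pairs listed under '## Executive Summary'."""
    fields = {}
    in_summary = False
    for line in markdown_text.splitlines():
        if line.startswith("#"):
            # Any heading starts a new section; only one of them is the summary.
            in_summary = line.strip() == "## Executive Summary"
            continue
        match = FIELD.match(line.strip())
        if in_summary and match:
            fields[match["label"]] = match["value"]
    return fields


sample = """## Executive Summary
- **User Request**: type hi on all linux and write results to windows notepad
- **Total Steps**: 4
- **Total Time**: 31.54s
### Step 1
- **Agent**: constellation_agent (ConstellationAgent)
"""
summary = parse_executive_summary(sample)
total_seconds = float(summary["Total Time"].rstrip("s"))
print(summary["Total Steps"], total_seconds)
```

Fields from later sections (such as `Agent`) are ignored because the parser stops collecting at the next heading.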
2. Step-by-Step Execution
Detailed breakdown of each orchestration step with:
Step Metadata
### Step 2
- **Agent**: constellation_agent (ConstellationAgent)
- **Status**: CONTINUE
- **Round**: 0 | **Round Step**: 0
- **Execution Time**: 9.27s
- **Time Breakdown**:
- LLM_INTERACTION: 8.96s
- ACTION_EXECUTION: 0.29s
- MEMORY_UPDATE: 0.00s
Fields:
- Agent: Agent name and type (ConstellationAgent for orchestration)
- Status: Step outcome (CONTINUE, FINISH, ERROR)
- Round/Round Step: ReAct iteration counters
- Execution Time: Total step duration
- Time Breakdown: Profiling data for LLM calls, action execution, memory updates
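The time breakdown makes it easy to see where a step spends its time. The sketch below uses the numbers from the Step 2 example; `breakdown_shares` is a hypothetical helper, and it assumes the breakdown values are seconds measured within the step's execution time. In this example, LLM interaction accounts for roughly 97% of the step.

```python
def breakdown_shares(execution_time, breakdown):
    """Return each phase's fraction of the step time and the unattributed remainder."""
    if execution_time <= 0:
        raise ValueError("execution_time must be positive")
    shares = {phase: seconds / execution_time for phase, seconds in breakdown.items()}
    unattributed = execution_time - sum(breakdown.values())
    return shares, unattributed


# Values copied from the Step 2 example above.
shares, unattributed = breakdown_shares(
    9.27,
    {"LLM_INTERACTION": 8.96, "ACTION_EXECUTION": 0.29, "MEMORY_UPDATE": 0.00},
)
for phase, share in shares.items():
    print(f"{phase}: {share:.1%}")
print(f"Unattributed: {unattributed:.2f}s")
```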
Actions Performed
Documents agent actions with collapsible argument details:
#### Actions Performed
**Function**: `build_constellation`
<details>
<summary>Arguments (click to expand)</summary>
```json
{
  "config": {
    "constellation_id": "constellation_xxx",
    "tasks": { ... },
    "dependencies": { ... }
  }
}
```
</details>
**Common Functions:**
- `build_constellation`: Initial DAG creation
- `edit_constellation`: Dynamic DAG modification
- `execute_constellation`: Trigger task execution
Constellation Evolution
Visualizes DAG state changes with topology graphs rendered as static PNG images:
```markdown
#### Constellation Evolution
<details>
<summary>Constellation AFTER (click to expand)</summary>
**Constellation ID**: constellation_bcd1726e_20251105_134526
**State**: created
##### Dependency Graph (Topology)
<img src="topology_images/step2_after_constellation_xxx.png" width="600">
##### Task Summary Table
| Task ID | Name | Status | Device | Duration |
|---------|------|--------|--------|----------|
| task-1 | Type hi on linux_agent_1 | pending | linux_agent_1 | N/A |
| task-2 | Type hi on linux_agent_2 | pending | linux_agent_2 | N/A |
| task-3 | Type hi on linux_agent_3 | pending | linux_agent_3 | N/A |
</details>
```
Topology Visualization Features:
- Color-coded nodes by task status:
  - 🟢 Green: Completed
  - 🔵 Cyan: Running
  - ⚫ Gray: Pending
  - 🔴 Red: Failed/Error
- Edge styles for dependencies:
  - Solid green: Satisfied dependencies
  - Dashed orange: Pending dependencies
- Automatic layout with a force-directed spring algorithm
- Legend showing node/edge meanings
Detailed Task Information
Comprehensive task metadata with execution details:
#### Task task-1: Type hi on linux_agent_1
- **Status**: completed
- **Target Device**: linux_agent_1
- **Priority**: 2
- **Description**: On device linux_agent_1 (Linux), open a terminal and execute the command: echo 'hi'. Return the output text.
- **Tips**:
- Ensure CLI access is available.
- Expected textual result: Return the exact output of the command, which should be 'hi'.
- **Result**:
```
hi
```
- **Started**: 2025-11-05T05:45:26.395208+00:00
- **Ended**: 2025-11-05T05:45:42.981859+00:00
- **Duration**: 16.59s
Task Fields:
- Status: Current execution state (pending, running, completed, failed, cancelled)
- Target Device: Assigned device agent ID
- Priority: Task scheduling priority (1=HIGH, 2=MEDIUM, 3=LOW)
- Description: Natural language task specification for device agent
- Tips: Execution hints and expected output guidance
- Result: Task execution output (truncated if large)
- Error: Error message if task failed
- Timing: Start/end timestamps and duration
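The Duration field can be cross-checked against the Started and Ended timestamps, which are ISO 8601 strings with a UTC offset. A minimal sketch using only the standard library (`task_duration_seconds` is a hypothetical helper):

```python
from datetime import datetime


def task_duration_seconds(started, ended):
    """Compute elapsed seconds between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(ended) - datetime.fromisoformat(started)
    return delta.total_seconds()


# Timestamps copied from the task-1 example above.
duration = task_duration_seconds(
    "2025-11-05T05:45:26.395208+00:00",
    "2025-11-05T05:45:42.981859+00:00",
)
print(f"{duration:.2f}s")  # 16.59s, matching the Duration field in the example
```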
Dependency Details
Shows task relationships and satisfaction status:
| Line ID | From Task | To Task | Type | Satisfied | Condition |
|---------|-----------|---------|------|-----------|----------|
| l1 | t1 | t4 | unconditional | [PENDING] | Output from linux_agent_1 collected successfully. |
| l2 | t2 | t4 | unconditional | [OK] | Output from linux_agent_2 collected successfully. |
Dependency Types:
- unconditional: Always active when source task completes
- conditional: Activated based on result evaluation
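One way to read the dependency table is that a task can start only after every task it depends on has completed. The sketch below illustrates that reading. It is not the orchestrator's actual scheduling logic: `ready_tasks` is a hypothetical helper, every dependency is treated as unconditional, and the `from_task_id` / `to_task_id` keys are assumed from the topology code shown later in this document.

```python
def ready_tasks(tasks, dependencies):
    """Return pending task IDs whose upstream tasks have all completed.

    Simplification: every dependency is treated as unconditional.
    """
    blocked = {
        dep["to_task_id"]
        for dep in dependencies.values()
        if tasks[dep["from_task_id"]]["status"] != "completed"
    }
    return sorted(
        task_id
        for task_id, task in tasks.items()
        if task["status"] == "pending" and task_id not in blocked
    )


# Illustrative data: t4 waits for both t1 and t2; t3 has no dependencies.
tasks = {
    "t1": {"status": "running"},
    "t2": {"status": "completed"},
    "t3": {"status": "pending"},
    "t4": {"status": "pending"},
}
dependencies = {
    "l1": {"from_task_id": "t1", "to_task_id": "t4"},
    "l2": {"from_task_id": "t2", "to_task_id": "t4"},
}
print(ready_tasks(tasks, dependencies))  # ['t3']
```

Once `t1` completes, `t4` is no longer blocked and appears in the result as well.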
Connected Devices
Device registry snapshot at step completion:
<details>
<summary>Connected Devices</summary>
| Device ID | OS | Status | Last Heartbeat |
|-----------|----|---------|--------------|
| windowsagent | windows | idle | 2025-11-05T05:45:43 |
| linux_agent_1 | linux | idle | 2025-11-05T05:45:43 |
| linux_agent_2 | linux | idle | 2025-11-05T05:45:43 |
| linux_agent_3 | linux | idle | 2025-11-05T05:45:43 |
</details>
Device Statuses:
- idle: Connected and available
- busy: Executing task
- disconnected: WebSocket connection lost
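Tables in the report, such as the device table, are plain pipe-delimited Markdown and can be read back into dictionaries. The sketch below is a hypothetical helper, not part of Galaxy; it assumes cells never contain a literal `|`, and one device status is changed to `busy` purely for illustration.

```python
def parse_markdown_table(lines):
    """Turn a pipe-delimited Markdown table into a list of row dictionaries."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in lines
        if line.strip().startswith("|")
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    return [dict(zip(header, row)) for row in body]


table = """
| Device ID | OS | Status | Last Heartbeat |
|-----------|----|---------|--------------|
| windowsagent | windows | idle | 2025-11-05T05:45:43 |
| linux_agent_1 | linux | busy | 2025-11-05T05:45:43 |
""".splitlines()

devices = parse_markdown_table(table)
idle = [device["Device ID"] for device in devices if device["Status"] == "idle"]
print(idle)  # ['windowsagent']
```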
3. Final Constellation State
Complete final DAG with all task results:
## Final Constellation State
**ID**: constellation_bcd1726e_20251105_134526
**State**: completed
**Created**: 2025-11-05T05:45:26.230930+00:00
**Updated**: 2025-11-05T05:45:42.981859+00:00
### Task Details
[Full task information with results]
### Task Summary Table
[Aggregated task status table]
### Final Dependency Graph
[Final topology visualization]
Generation Process
Automatic Generation
The trajectory report is generated automatically by GalaxySession upon completion:
# galaxy/session/galaxy_session.py
import os

async def close_session(self):
    """Generate trajectory report on session close"""
    trajectory = GalaxyTrajectory(self.log_path)
    # os.path.join works whether or not log_path ends with a path separator
    trajectory.to_markdown(os.path.join(self.log_path, "output.md"))
Trigger Points:
1. Normal session completion (GalaxyClient.shutdown())
2. User termination (Ctrl+C in interactive mode)
3. Error-induced session end
Manual Generation
You can regenerate reports manually using the CLI tool:
# Generate report for specific session
python -m galaxy.trajectory.generate_report logs/galaxy/test1
# Custom output path
python -m galaxy.trajectory.generate_report logs/galaxy/test1 -o custom_report.md
# Minimal report (exclude details)
python -m galaxy.trajectory.generate_report logs/galaxy/test1 \
--no-constellation --no-tasks --no-devices
CLI Options:
- --no-constellation: Exclude constellation evolution details
- --no-tasks: Exclude detailed task information
- --no-devices: Exclude device connection information
- -o, --output: Custom output file path
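The `--no-*` flags are negative switches, while the `to_markdown()` parameters described under Custom Report Generation are positive `include_*` booleans. The sketch below shows one way such flags could be translated. It is a hypothetical illustration built with `argparse`, not the actual implementation of `generate_report`, and the flag-to-parameter mapping is inferred from the names alone.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options (not Galaxy's own code)."""
    parser = argparse.ArgumentParser(prog="generate_report")
    parser.add_argument("log_dir", help="Session log directory, e.g. logs/galaxy/test1")
    parser.add_argument("-o", "--output", default=None, help="Custom output file path")
    parser.add_argument("--no-constellation", action="store_true")
    parser.add_argument("--no-tasks", action="store_true")
    parser.add_argument("--no-devices", action="store_true")
    return parser


def to_markdown_kwargs(args):
    """Translate the negative CLI flags into include_* keyword arguments."""
    return {
        "include_constellation_details": not args.no_constellation,
        "include_task_details": not args.no_tasks,
        "include_device_info": not args.no_devices,
    }


args = build_parser().parse_args(["logs/galaxy/test1", "--no-tasks", "--no-devices"])
kwargs = to_markdown_kwargs(args)
print(kwargs)
```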
Batch Generation
Process multiple sessions at once:
# galaxy/trajectory/galaxy_parser.py
from pathlib import Path

if __name__ == "__main__":
    # Process all Galaxy task logs and generate markdown reports.
    galaxy_logs_dir = Path("logs/galaxy")
    task_dirs = sorted(d for d in galaxy_logs_dir.iterdir() if d.is_dir())
    for task_dir in task_dirs:
        trajectory = GalaxyTrajectory(str(task_dir))
        output_path = task_dir / "trajectory_report.md"
        trajectory.to_markdown(str(output_path))
Run batch processing:
cd <repo_root>  # the directory that contains the galaxy/ package
python -m galaxy.trajectory.galaxy_parser
Output:
[BOLD BLUE] Galaxy Trajectory Parser - Batch Mode
Found 42 task directories
Processing task_1... [OK]
Processing task_2... [OK]
Processing test1... [OK]
...
=====================================================
Summary:
Total: 42
Success: 40
Skipped: 2
Failed: 0
=====================================================
Programmatic Access
Loading Trajectory Data
from galaxy.trajectory import GalaxyTrajectory

# Load trajectory from log directory
trajectory = GalaxyTrajectory("logs/galaxy/test1")

# Access metadata
print(f"Request: {trajectory.request}")
print(f"Steps: {trajectory.total_steps}")
print(f"Cost: ${trajectory.total_cost:.4f}")
print(f"Time: {trajectory.total_time:.2f}s")

# Iterate through steps
for idx, step in enumerate(trajectory.step_log, 1):
    agent = step.get("agent_name")
    status = step.get("status")
    elapsed = step.get("total_time", 0)  # named to avoid shadowing the stdlib time module
    print(f"Step {idx}: {agent} - {status} ({elapsed:.2f}s)")
Extracting Constellation Data
# Get final constellation state
last_step = trajectory.step_log[-1]
final_constellation = trajectory._parse_constellation(
    last_step.get("constellation_after")
)
if final_constellation:
    constellation_id = final_constellation.get("constellation_id")
    state = final_constellation.get("state")
    tasks = final_constellation.get("tasks", {})
    print(f"Constellation {constellation_id}: {state}")
    print(f"Tasks: {len(tasks)}")

    # Analyze task outcomes
    completed = sum(1 for t in tasks.values() if t.get("status") == "completed")
    failed = sum(1 for t in tasks.values() if t.get("status") == "failed")
    print(f"Completed: {completed}/{len(tasks)}")
    print(f"Failed: {failed}/{len(tasks)}")
Custom Report Generation
# Generate custom report with specific options
trajectory.to_markdown(
    output_path="custom_report.md",
    include_constellation_details=True,  # Show DAG evolution
    include_task_details=True,           # Show task results
    include_device_info=False            # Hide device info
)
Visualization Features
Topology Graph Generation
The trajectory report includes dynamically generated DAG topology images:
Implementation:
def _generate_topology_image(
    self,
    dependencies: Dict[str, Any],
    tasks: Dict[str, Any],
    constellation_id: str,
    step_number: int,
    state: str = "before"
) -> Optional[str]:
    """Generate beautiful topology graph using networkx and matplotlib"""
    # Create directed graph
    G = nx.DiGraph()

    # Add all tasks as nodes
    for task_id in tasks.keys():
        G.add_node(task_id)

    # Add dependency edges
    for dep in dependencies.values():
        from_task = dep["from_task_id"]
        to_task = dep["to_task_id"]
        G.add_edge(from_task, to_task)

    # Color nodes by status
    status_colors = {
        "completed": "#28A745",  # Green
        "running": "#17A2B8",    # Cyan
        "pending": "#6C757D",    # Gray
        "failed": "#DC3545",     # Red
    }

    # Generate layout and save image
    pos = nx.spring_layout(G, k=1.5, iterations=100)
    # ... [matplotlib rendering code]
Graph Features:
- Spring Layout: Force-directed node placement (nx.spring_layout) with spacing parameter k=1.5
- Adaptive Node Size: Ellipses scale with task ID length
- Color-Coded Status: Bootstrap-inspired color scheme
- Edge Differentiation: Solid (satisfied) vs dashed (pending)
- Legend: Automatic status and dependency type legend
- High Quality: 120 DPI PNG with antialiasing
Image Organization
topology_images/
├── step1_after_constellation_7b3c0f47_20251104_182305.png
├── step2_before_constellation_bcd1726e_20251105_134526.png
├── step2_after_constellation_bcd1726e_20251105_134526.png
├── step3_before_constellation_bcd1726e_20251105_134526.png
├── step3_after_constellation_bcd1726e_20251105_134526.png
└── step999_final_constellation_bcd1726e_20251105_134526.png
Naming Convention:
- step{N}_{state}_{constellation_id}.png
- state: before, after, or final
- step999: Reserved for final summary graph
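The naming convention can be parsed with a regular expression, for example to order images by step. This is a minimal sketch; `parse_image_name` is a hypothetical helper that assumes file names follow the pattern exactly as listed above.

```python
import re

# step{N}_{state}_{constellation_id}.png
IMAGE_NAME = re.compile(
    r"^step(?P<step>\d+)_(?P<state>before|after|final)_(?P<cid>.+)\.png$"
)


def parse_image_name(filename):
    """Return (step, state, constellation_id), or None if the name does not match."""
    match = IMAGE_NAME.match(filename)
    if match is None:
        return None
    return int(match["step"]), match["state"], match["cid"]


names = [
    "step3_after_constellation_bcd1726e_20251105_134526.png",
    "step999_final_constellation_bcd1726e_20251105_134526.png",
    "step2_before_constellation_bcd1726e_20251105_134526.png",
]
# Sort numerically by step; the reserved step999 image sorts last.
ordered = sorted(names, key=lambda name: parse_image_name(name)[0])
print(ordered)
```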
Use Cases
1. Debugging Failed Sessions
Identify which task failed and why:
trajectory = GalaxyTrajectory("logs/galaxy/failed_session")
for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if not constellation:
        continue
    tasks = constellation.get("tasks", {})
    for task_id, task in tasks.items():
        if task.get("status") == "failed":
            print(f"❌ Task {task_id}: {task.get('name')}")
            print(f"   Device: {task.get('target_device_id')}")
            print(f"   Error: {task.get('error')}")
2. Performance Analysis
Correlate with result.json for bottleneck identification:
import json

# Load trajectory for execution timeline
trajectory = GalaxyTrajectory("logs/galaxy/task_32")

# Load metrics for performance data
with open("logs/galaxy/task_32/result.json") as f:
    result = json.load(f)
metrics = result["session_results"]["metrics"]
task_stats = metrics["task_statistics"]

# Find slowest tasks. Keep the latest recorded duration per task ID so that a
# task appearing in several step snapshots is not listed more than once.
durations = {}
for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if not constellation:
        continue  # Snapshot missing or unparseable
    for tid, task in constellation.get("tasks", {}).items():
        durations[tid] = task.get("execution_duration") or 0

slow_tasks = sorted(durations.items(), key=lambda x: x[1], reverse=True)
print("Top 5 slowest tasks:")
for tid, duration in slow_tasks[:5]:
    print(f"  {tid}: {duration:.2f}s")
3. Constellation Evolution Analysis
Track DAG modifications across steps:
trajectory = GalaxyTrajectory("logs/galaxy/adaptive_session")
for idx, step in enumerate(trajectory.step_log, 1):
    before = trajectory._parse_constellation(step.get("constellation_before"))
    after = trajectory._parse_constellation(step.get("constellation_after"))
    if before and after:
        tasks_before = len(before.get("tasks", {}))
        tasks_after = len(after.get("tasks", {}))
        if tasks_after > tasks_before:
            print(f"Step {idx}: Added {tasks_after - tasks_before} tasks")
        elif tasks_after < tasks_before:
            print(f"Step {idx}: Removed {tasks_before - tasks_after} tasks")
4. Device Utilization Tracking
Analyze device workload distribution:
trajectory = GalaxyTrajectory("logs/galaxy/multi_device")

# Map each task to its device. Keying by task ID avoids counting a task once
# per step when it appears in several step snapshots.
task_device = {}
for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if not constellation:
        continue
    for task_id, task in constellation.get("tasks", {}).items():
        task_device[task_id] = task.get("target_device_id")

# Count tasks per device
device_tasks = {}
for device in task_device.values():
    device_tasks[device] = device_tasks.get(device, 0) + 1

print("Task distribution:")
for device, count in sorted(device_tasks.items(), key=lambda x: x[1], reverse=True):
    print(f"  {device}: {count} tasks")
5. Session Comparison
Compare multiple sessions for regression testing:
def compare_sessions(session1_path, session2_path):
    t1 = GalaxyTrajectory(session1_path)
    t2 = GalaxyTrajectory(session2_path)
    print("Session 1 vs Session 2:")
    print(f"  Steps: {t1.total_steps} vs {t2.total_steps}")
    print(f"  Time: {t1.total_time:.2f}s vs {t2.total_time:.2f}s")
    print(f"  Cost: ${t1.total_cost:.4f} vs ${t2.total_cost:.4f}")
    if t1.total_time > 0:  # Avoid division by zero for empty sessions
        # Positive values mean session 2 was faster than session 1
        speedup = (t1.total_time - t2.total_time) / t1.total_time * 100
        print(f"  Performance: {speedup:+.1f}%")

compare_sessions("logs/galaxy/test_v1", "logs/galaxy/test_v2")
Data Sources
The trajectory report aggregates data from multiple log sources:
1. response.log (Primary Source)
JSONL file with one JSON record per step. Each record occupies a single line in the file; the example below is pretty-printed for readability:
{
"request": "type hi on all linux devices",
"agent_name": "constellation_agent",
"agent_type": "ConstellationAgent",
"status": "CONTINUE",
"round_num": 0,
"round_step": 0,
"total_time": 9.27,
"cost": 0.0042,
"execution_times": {
"LLM_INTERACTION": 8.96,
"ACTION_EXECUTION": 0.29,
"MEMORY_UPDATE": 0.00
},
"action": [
{
"function": "build_constellation",
"arguments": { ... }
}
],
"constellation_before": "{...}",
"constellation_after": "{...}",
"device_info": { ... }
}
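Because each line of a JSONL file is an independent JSON document, the log can be read with the standard library alone. The sketch below is a minimal illustration, not Galaxy's own parser: `parse_jsonl` is a hypothetical helper, and the two abbreviated records are illustrative (only the first uses values from the example above). Judging by the example, fields such as `constellation_after` hold JSON-encoded strings, so they would likely need a second `json.loads` call.

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


# Abbreviated, illustrative records. For a real file, pass an open file object:
#   with open("logs/galaxy/<session_name>/response.log", encoding="utf-8") as f:
#       steps = parse_jsonl(f)
sample_lines = [
    '{"agent_name": "constellation_agent", "status": "CONTINUE", "total_time": 9.27, "cost": 0.0042}',
    "",
    '{"agent_name": "constellation_agent", "status": "FINISH", "total_time": 3.5, "cost": 0.001}',
]
steps = parse_jsonl(sample_lines)
total_time = sum(step.get("total_time", 0.0) for step in steps)
total_cost = sum(step.get("cost", 0.0) for step in steps)
print(f"{len(steps)} steps, {total_time:.2f}s, ${total_cost:.4f}")
```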
2. result.json (Performance Metrics)
Aggregated session-level metrics:
{
"session_results": {
"request": "type hi on all linux devices",
"status": "completed",
"total_cost": 0.0156,
"total_rounds": 1,
"total_steps": 4,
"total_time": 31.54,
"metrics": {
"task_statistics": { ... },
"constellation_statistics": { ... }
}
}
}
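Session-level totals lend themselves to simple derived metrics such as average time and cost per step. The sketch below works on an abbreviated copy of the example above, with the elided statistics left empty; `average_step_metrics` is a hypothetical helper.

```python
import json

# Abbreviated copy of the example above; the elided statistics are left empty.
result_text = """
{
  "session_results": {
    "status": "completed",
    "total_cost": 0.0156,
    "total_steps": 4,
    "total_time": 31.54,
    "metrics": {"task_statistics": {}, "constellation_statistics": {}}
  }
}
"""


def average_step_metrics(result):
    """Return (seconds per step, cost per step) from a parsed result.json."""
    session = result["session_results"]
    steps = session["total_steps"]
    if steps == 0:
        return 0.0, 0.0
    return session["total_time"] / steps, session["total_cost"] / steps


avg_time, avg_cost = average_step_metrics(json.loads(result_text))
print(f"{avg_time:.3f}s and ${avg_cost:.4f} per step")
```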
3. evaluation.log (Optional)
User-provided evaluation results:
{
"task_success": true,
"evaluation_score": 5,
"comments": "All tasks completed successfully"
}
Configuration
Customizing Report Content
Control report verbosity via generation parameters:
trajectory.to_markdown(
    output_path="output.md",
    include_constellation_details=True,  # DAG evolution (default: True)
    include_task_details=True,           # Task execution logs (default: True)
    include_device_info=True             # Device status (default: True)
)
Report Size Impact (approximate):
- Full report (all options enabled): ~200KB for 10-task session
- Minimal report (all options disabled): ~20KB
- Topology images: ~50KB each
Topology Graph Styling
Customize graph appearance by modifying _generate_topology_image():
# Adjust node colors
status_colors = {
    "completed": "#28A745",  # Change to custom color
    "running": "#17A2B8",
    # ...
}

# Adjust layout parameters
pos = nx.spring_layout(
    G,
    k=1.5,           # Node spacing (higher = more spread)
    iterations=100,  # Layout quality (higher = better but slower)
    seed=42          # Deterministic layout
)

# Adjust image quality
plt.savefig(
    image_path,
    dpi=120,  # Resolution (higher = larger files)
    bbox_inches="tight",
    facecolor="white"
)
Best Practices
1. Regular Report Review
Monitor trajectory reports to catch issues early:
# Generate reports for recent sessions
for dir in logs/galaxy/*/; do
    python -m galaxy.trajectory.generate_report "$dir"
done
# Open a report with the default application for .md files
# (start is Windows-only; use open on macOS or xdg-open on Linux)
start logs/galaxy/test1/output.md
2. Archive Trajectory Reports
Store reports in dated archives for reproducibility:
# Create a timestamped archive with one subdirectory per session,
# so that identically named files do not overwrite each other
archive=trajectory_archives/$(date +%Y-%m-%d)
for dir in logs/galaxy/*/; do
    name=$(basename "$dir")
    mkdir -p "$archive/$name"
    cp "$dir/output.md" "$dir/result.json" "$archive/$name/"
done
3. Automated Analysis
Integrate trajectory parsing into CI/CD pipelines:
# test/analyze_trajectory.py
def validate_trajectory(log_dir):
    trajectory = GalaxyTrajectory(log_dir)

    # Check for failures
    for step in trajectory.step_log:
        if step.get("status") == "ERROR":
            raise AssertionError(f"Session failed at step {step.get('_line_number')}")

    # Check performance thresholds
    if trajectory.total_time > 60.0:
        print(f"WARNING: Session took {trajectory.total_time:.2f}s (>60s threshold)")

    return True
4. Compare Before/After States
Use constellation evolution to verify correctness:
# Verify DAG grows monotonically (no premature task deletion)
trajectory = GalaxyTrajectory("logs/galaxy/session")
prev_task_count = 0
for step in trajectory.step_log:
    constellation = trajectory._parse_constellation(step.get("constellation_after"))
    if constellation:
        task_count = len(constellation.get("tasks", {}))
        if task_count < prev_task_count:
            print(f"WARNING: Task count decreased from {prev_task_count} to {task_count}")
        prev_task_count = task_count
Related Documentation
- Performance Metrics - Quantitative session analysis with result.json
- Result JSON Reference - Complete result.json schema documentation
- Galaxy Overview - Main Galaxy framework documentation
- Constellation Orchestrator - DAG execution engine
- Task Constellation - DAG data structure and validation
Troubleshooting
Empty or Missing Report
Problem: output.md not generated after session
Solutions:
- Check that response.log exists:
  ls logs/galaxy/<session_name>/response.log
- Manually trigger generation:
  python -m galaxy.trajectory.generate_report logs/galaxy/<session_name>
- Verify that the session closed properly (check for exceptions in the terminal)
Parse Errors in Report
Problem: ⚠️ Parse Error warnings in report
Cause: Legacy log format with serialization bugs (tasks as Python strings instead of JSON)
Solution: This is a known issue fixed in current versions. Reports will display:
##### ⚠️ Parse Error
**Error Type**: `legacy_serialization_bug`
**Message**: Tasks field contains Python object representations (not pure JSON).
This is due to a serialization bug in older versions.
Workaround: Re-run session with updated codebase to generate proper logs.
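To check whether an old log is affected before re-running a session, a field can be tested for valid JSON. The sketch below is a hypothetical diagnostic, not the check Galaxy itself performs: `classify_payload` is an illustrative name, and the object representation in the third example is made up.

```python
import ast
import json


def classify_payload(text):
    """Classify a serialized field as 'json', 'python_literal', or 'unparseable'."""
    try:
        json.loads(text)
        return "json"
    except json.JSONDecodeError:
        pass
    try:
        ast.literal_eval(text)  # Evaluates literals only, never arbitrary code
        return "python_literal"
    except (ValueError, SyntaxError):
        return "unparseable"


print(classify_payload('{"task-1": {"status": "completed"}}'))  # json
print(classify_payload("{'task-1': {'status': 'completed'}}"))  # python_literal
print(classify_payload("{'task-1': <Task object at 0x1>}"))     # unparseable
```

A `python_literal` result means the field was written with Python's `str()` rather than `json.dumps()` but is still recoverable; `unparseable` suggests embedded object representations that cannot be reconstructed.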
Missing Topology Images
Problem: Broken image links in report
Solutions:
- Check that the topology_images/ directory exists:
  ls logs/galaxy/<session_name>/topology_images/
- Verify the matplotlib backend:
  import matplotlib
  matplotlib.use("Agg")  # Non-interactive backend required
- Regenerate the report to recreate the images:
  python -m galaxy.trajectory.generate_report logs/galaxy/<session_name>
Large Report Files
Problem: output.md exceeds 10MB
Solutions:
- Generate a minimal report:
  python -m galaxy.trajectory.generate_report logs/galaxy/<session_name> \
      --no-constellation --no-tasks
- Reduce topology image quality (edit galaxy_parser.py):
  plt.savefig(image_path, dpi=80)  # Lower DPI
- Archive and compress:
  gzip logs/galaxy/<session_name>/output.md
API Reference
GalaxyTrajectory Class
class GalaxyTrajectory:
    """Parser for Galaxy agent logs with constellation visualization"""

    def __init__(self, folder_path: str) -> None:
        """
        Initialize trajectory parser.

        Args:
            folder_path: Path to Galaxy log directory (e.g., logs/galaxy/task_1)

        Raises:
            ValueError: If response.log file not found
        """

    @property
    def step_log(self) -> List[Dict[str, Any]]:
        """Get all step logs from response.log"""

    @property
    def evaluation_log(self) -> Dict[str, Any]:
        """Get evaluation results from evaluation.log"""

    @property
    def request(self) -> Optional[str]:
        """Get original user request"""

    @property
    def total_steps(self) -> int:
        """Get total number of steps"""

    @property
    def total_cost(self) -> float:
        """Calculate total LLM cost"""

    @property
    def total_time(self) -> float:
        """Calculate total execution time"""

    def to_markdown(
        self,
        output_path: str,
        include_constellation_details: bool = True,
        include_task_details: bool = True,
        include_device_info: bool = True
    ) -> None:
        """
        Export trajectory to Markdown file.

        Args:
            output_path: Path to save markdown file
            include_constellation_details: Include DAG evolution details
            include_task_details: Include task execution logs
            include_device_info: Include device status information
        """
Next Steps:
- Combine trajectory reports with result.json metrics for comprehensive analysis
- Automate report generation in CI/CD pipelines
- Visualize execution timelines with custom scripts
- Compare session trajectories for performance regression testing