Agent Server Overview

The Agent Server is the central orchestration engine that transforms UFO into a distributed multi-agent system, enabling seamless task coordination across heterogeneous devices through persistent WebSocket connections and robust state management.

New to the Agent Server? Start with the Quick Start Guide to get up and running in minutes.

What is the Agent Server?

The Agent Server is a FastAPI-based asynchronous WebSocket server that serves as the communication hub for UFO's distributed architecture. It bridges constellation orchestrators, device agents, and external systems through a unified protocol interface.

Core Responsibilities

| Capability | Description | Key Benefit |
|---|---|---|
| 🔌 Connection Management | Tracks device & constellation client lifecycles | Real-time device availability awareness |
| 🎯 Task Orchestration | Coordinates execution across distributed devices | Centralized workflow control |
| 💾 State Management | Maintains session lifecycles & execution contexts | Stateful multi-turn task execution |
| 🌐 Dual API Interface | WebSocket (AIP) + HTTP (REST) endpoints | Flexible integration options |
| 🛡️ Resilience | Handles disconnections, timeouts, failures gracefully | Production-grade reliability |

Why Use the Agent Server?

  • Centralized Control: Single point of orchestration for multi-device workflows
  • Protocol Abstraction: Clients communicate via AIP, hiding network complexity
  • Async by Design: Non-blocking execution enables high concurrency
  • Platform Agnostic: Supports Windows and Linux; macOS support is in development

The Agent Server is part of UFO's distributed server-client architecture, where it handles orchestration and state management while Agent Clients handle command execution. See Server-Client Architecture for the complete design rationale and communication patterns.


Architecture

The server follows a clean separation of concerns with distinct layers for web service, connection management, and protocol handling.

Architectural Overview

Component Interaction Diagram:

graph TB
    subgraph "Web Layer"
        FastAPI[FastAPI App]
        HTTP[HTTP API]
        WS[WebSocket /ws]
    end
    subgraph "Service Layer"
        WSM[Client Manager]
        SM[Session Manager]
        WSH[WebSocket Handler]
    end
    subgraph "Clients"
        DC[Device Clients]
        CC[Constellation Clients]
    end
    FastAPI --> HTTP
    FastAPI --> WS
    HTTP --> SM
    HTTP --> WSM
    WS --> WSH
    WSH --> WSM
    WSH --> SM
    DC -->|WebSocket| WS
    CC -->|WebSocket| WS
    style FastAPI fill:#e1f5ff
    style WSM fill:#fff4e1
    style SM fill:#f0ffe1
    style WSH fill:#ffe1f5

This layered design ensures each component has a single, well-defined responsibility. The managers maintain state while the handler implements protocol logic.
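To make the layering concrete, here is a minimal, hypothetical sketch of how such a FastAPI app can wire the pieces together. The ClientManager class below is a simplified stand-in, not UFO's actual implementation; only the /ws and /api/health routes are taken from this page.

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

class ClientManager:
    """Connection registry (simplified stand-in)."""
    def __init__(self):
        self.clients: dict[str, WebSocket] = {}

client_manager = ClientManager()

@app.get("/api/health")
async def health():
    # HTTP layer reads state from the manager layer
    return {"status": "healthy", "online_clients": list(client_manager.clients)}

@app.websocket("/ws")
async def ws_endpoint(ws: WebSocket):
    await ws.accept()
    client_id = None
    try:
        while True:
            msg = await ws.receive_json()  # AIP messages are JSON
            if msg.get("message_type") == "REGISTER":
                client_id = msg["client_id"]
                client_manager.clients[client_id] = ws
            # ... a real handler would route TASK / COMMAND_RESULTS /
            # HEARTBEAT messages to the session manager here
    except WebSocketDisconnect:
        client_manager.clients.pop(client_id, None)  # lifecycle cleanup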

Core Components

| Component | Responsibility | Key Operations |
|---|---|---|
| FastAPI Application | Web service layer | ✅ HTTP endpoint routing<br/>✅ WebSocket connection acceptance<br/>✅ Request/response handling<br/>✅ CORS and middleware |
| Client Connection Manager | Connection registry | ✅ Client identity tracking<br/>✅ Session ↔ client mapping<br/>✅ Device info caching<br/>✅ Connection lifecycle hooks |
| Session Manager | Execution lifecycle | ✅ Platform-specific session creation<br/>✅ Background async task execution<br/>✅ Result callback delivery<br/>✅ Session cancellation |
| WebSocket Handler | Protocol implementation | ✅ AIP message parsing/routing<br/>✅ Client registration<br/>✅ Heartbeat monitoring<br/>✅ Task/command dispatch |

Component Documentation:

  • Session Manager - Session lifecycle and background execution
  • Client Connection Manager - Connection registry and client tracking
  • WebSocket Handler - AIP protocol message handling
  • HTTP API - REST endpoint specifications


Key Capabilities

1. Multi-Client Coordination

The server supports two distinct client types with different roles in the distributed architecture.

Client Type Comparison:

| Aspect | Device Client | Constellation Client |
|---|---|---|
| Role | Task executor | Task orchestrator |
| Connection | Long-lived WebSocket | Long-lived WebSocket |
| Registration | ClientType.DEVICE | ClientType.CONSTELLATION |
| Capabilities | Local execution, telemetry | Multi-device coordination |
| Target Field | Not required | Required for routing |
| Example | Windows agent, Linux agent | ConstellationClient orchestrator |

Device Clients

  • Execute tasks locally on Windows/Linux machines
  • Report hardware specs and real-time status
  • Respond to commands via MCP tool servers
  • Stream execution logs back to server

See Agent Client Overview for detailed client architecture.

Constellation Clients

  • Orchestrate multi-device workflows from a central point
  • Dispatch tasks to specific target devices via target_id
  • Coordinate complex cross-device DAG execution
  • Aggregate results from multiple devices

Both client types connect to /ws and register using the REGISTER message; the server differentiates behavior based on the client_type field, as illustrated below. For the complete server-client architecture and design rationale, see Server-Client Architecture.
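For illustration, the REGISTER payloads for the two client types differ mainly in client_type (a hedged sketch; the exact AIP schema may include additional fields):

device_register = {
    "message_type": "REGISTER",
    "client_id": "windows_agent_001",
    "client_type": "device",
    "metadata": {"platform": "windows"},
}

constellation_register = {
    "message_type": "REGISTER",
    "client_id": "constellation_001",
    "client_type": "constellation",
}

# Both are sent over the same /ws endpoint:
# await ws.send(json.dumps(device_register))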

See Quick Start for registration examples.


2. Session Lifecycle Management

Unlike stateless HTTP servers, the Agent Server maintains session state throughout task execution, enabling multi-turn interactions and result callbacks.

Session Lifecycle State Machine:

stateDiagram-v2
    [*] --> Created: create_session()
    Created --> Running: Start execution
    Running --> Completed: Success
    Running --> Failed: Error
    Running --> Cancelled: Disconnect
    Completed --> [*]
    Failed --> [*]
    Cancelled --> [*]
    note right of Running
        Async background execution
        Non-blocking server
    end note

Lifecycle Stages:

| Stage | Trigger | Session Manager Action | Server State |
|---|---|---|---|
| Created | HTTP dispatch or AIP TASK | Platform-specific session instantiation | Session ID generated |
| Running | Background task start | Async execution without blocking | Awaiting results |
| Completed | TASK_END (success) | Callback delivery to client | Results cached |
| Failed | TASK_END (error) | Error callback delivery | Error logged |
| Cancelled | Client disconnect | Cancel async task, cleanup | Session removed |

Platform-Specific Sessions

The SessionManager creates different session types based on the target platform:

  • Windows: WindowsSession with UI automation support
  • Linux: LinuxSession with bash automation
  • Platform is auto-detected, or overridden via the --platform flag

Session Manager Responsibilities (see the sketch after this list):

  • Platform abstraction: Hides Windows/Linux differences
  • Background execution: Non-blocking async task execution
  • Callback routing: Delivers results via WebSocket
  • Resource cleanup: Cancels tasks on disconnect
  • Result caching: Stores results for HTTP retrieval
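The sketch below illustrates this pattern under stated assumptions: SessionManager is a simplified stand-in, not UFO's actual class, and the result shape is invented for demonstration. It shows non-blocking background execution, result caching, and cancellation.

import asyncio
import platform
import uuid

class SessionManager:
    """Simplified stand-in for UFO's SessionManager."""

    def __init__(self):
        self.tasks: dict[str, asyncio.Task] = {}
        self.results: dict[str, dict] = {}

    def create_session(self, request: str, forced_platform: str | None = None) -> str:
        # Must be called from within a running event loop.
        session_id = f"session_{uuid.uuid4().hex[:8]}"
        plat = forced_platform or platform.system().lower()  # platform abstraction
        self.tasks[session_id] = asyncio.create_task(
            self._run(session_id, plat, request)
        )
        return session_id  # returns immediately; execution continues in background

    async def _run(self, session_id: str, plat: str, request: str) -> None:
        try:
            # A real session would drive WindowsSession / LinuxSession here.
            self.results[session_id] = {"status": "completed", "platform": plat}
        finally:
            self.tasks.pop(session_id, None)  # resource cleanup

    def cancel(self, session_id: str) -> None:
        if task := self.tasks.pop(session_id, None):
            task.cancel()  # stops the background task on disconnect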

3. Resilient Communication

The server implements the Agent Interaction Protocol (AIP), providing structured, type-safe communication with automatic failure handling.

Protocol Features:

| Feature | Implementation | Benefit |
|---|---|---|
| Structured Messages | Pydantic models with validation | Type safety, automatic serialization |
| Connection Health | Heartbeat every 20-30 s | Early failure detection |
| Error Recovery | Exponential backoff reconnection | Transient fault tolerance |
| State Tracking | Session ↔ client mapping | Proper cleanup on disconnect |
| Message Correlation | request_id, prev_response_id chains | Request-response tracing |
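As a concrete illustration of the heartbeat feature, a client-side keepalive loop might look like the following sketch (message fields are assumptions consistent with the message-type table later on this page):

import asyncio
import json

async def heartbeat(ws, client_id: str, interval: float = 25.0) -> None:
    """Send HEARTBEAT on a fixed cadence until the socket closes."""
    while True:
        await ws.send(json.dumps({
            "message_type": "HEARTBEAT",
            "client_id": client_id,
        }))
        await asyncio.sleep(interval)

# Typically run alongside the main receive loop:
# asyncio.create_task(heartbeat(ws, "windows_agent_001"))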

Disconnection Handling Flow:

sequenceDiagram
    participant Client
    participant Server
    participant SM as Session Manager
    Client-xServer: Connection lost
    Server->>SM: Cancel sessions
    SM->>SM: Cleanup resources
    Server->>Server: Remove from registry
    Note over Server: Client can reconnect<br/>with same client_id

Important: Session Cancellation on Disconnect

When a client disconnects (device or constellation), all associated sessions are immediately cancelled to prevent orphaned tasks and resource leaks.
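A minimal sketch of that rule, reusing the hypothetical SessionManager from the earlier sketch:

def on_client_disconnect(client_id: str,
                         client_sessions: dict[str, list[str]],
                         session_manager: "SessionManager") -> None:
    # Cancel every session mapped to the departing client.
    for session_id in client_sessions.pop(client_id, []):
        session_manager.cancel(session_id)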


4. Dual API Interface

The server provides two API styles to support different integration patterns: real-time WebSocket for agents and simple HTTP for external systems.

WebSocket API (AIP-based)

Purpose: Real-time bidirectional communication with agent clients

| Message Type | Direction | Purpose |
|---|---|---|
| REGISTER | Client → Server | Initial capability advertisement |
| TASK | Server → Client | Task assignment with commands |
| COMMAND | Server → Client | Individual command execution |
| COMMAND_RESULTS | Client → Server | Execution results |
| TASK_END | Bidirectional | Task completion notification |
| HEARTBEAT | Bidirectional | Connection keepalive |
| DEVICE_INFO_REQUEST/RESPONSE | Bidirectional | Telemetry exchange |
| ERROR | Bidirectional | Error condition reporting |
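The protocol-features table notes that AIP uses Pydantic models for validation; here is a hedged sketch of what such a message envelope could look like (field names beyond those in the tables above are assumptions, and the parsing call uses the Pydantic v2 API):

from enum import Enum
from typing import Any, Optional

from pydantic import BaseModel

class MessageType(str, Enum):
    REGISTER = "REGISTER"
    TASK = "TASK"
    COMMAND = "COMMAND"
    COMMAND_RESULTS = "COMMAND_RESULTS"
    TASK_END = "TASK_END"
    HEARTBEAT = "HEARTBEAT"
    ERROR = "ERROR"

class AIPMessage(BaseModel):
    message_type: MessageType
    client_id: str
    target_id: Optional[str] = None   # needed only for constellation routing
    request_id: Optional[str] = None  # request-response correlation
    payload: dict[str, Any] = {}

# Validation happens on parse; bad input raises pydantic.ValidationError.
msg = AIPMessage.model_validate_json(
    '{"message_type": "HEARTBEAT", "client_id": "windows_agent_001"}'
)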

WebSocket Connection

import json

import websockets

async with websockets.connect("ws://localhost:5000/ws") as ws:
    # Register as device client
    await ws.send(json.dumps({
        "message_type": "REGISTER",
        "client_id": "windows_agent_001",
        "client_type": "device",
        "metadata": {"platform": "windows", "gpu": "NVIDIA RTX 3080"}
    }))

HTTP REST API

Purpose: Task dispatch and monitoring from external systems (HTTP clients, CI/CD, etc.)

| Endpoint | Method | Purpose | Authentication |
|---|---|---|---|
| /api/dispatch | POST | Dispatch task to device | Optional (if configured) |
| /api/task_result/{task_name} | GET | Retrieve task results | Optional |
| /api/clients | GET | List connected clients | Optional |
| /api/health | GET | Server health check | None |

HTTP Task Dispatch

# Dispatch task to device
curl -X POST http://localhost:5000/api/dispatch \
  -H "Content-Type: application/json" \
  -d '{
    "client_id": "my_windows_device",
    "request": "Open Notepad and type Hello World",
    "task_name": "test_task_001"
  }'

# Response: {"status": "dispatched", "session_id": "session_abc123", "task_name": "test_task_001"}

# Retrieve results
curl http://localhost:5000/api/task_result/test_task_001

See HTTP API Reference for complete endpoint documentation.


Workflow Examples

Complete Task Dispatch Flow

End-to-End HTTP → WebSocket → Device Execution:

sequenceDiagram
    participant EXT as External System
    participant HTTP as HTTP API
    participant SM as Session Manager
    participant WSH as WebSocket Handler
    participant DC as Device Client
    EXT->>HTTP: POST /api/dispatch<br/>{client_id, request, task_name}
    HTTP->>SM: create_session()
    SM->>SM: Create platform session
    SM-->>HTTP: session_id
    HTTP-->>EXT: 200 {session_id, task_name}
    SM->>WSH: send_task(session_id, task)
    WSH->>DC: TASK message (AIP)
    DC-->>WSH: ACK
    rect rgb(240, 255, 240)
        Note over DC: Background Execution
        DC->>DC: Execute via MCP tools
        DC->>DC: Generate results
    end
    DC->>WSH: COMMAND_RESULTS
    WSH->>SM: on_result_callback()
    SM->>SM: Cache results
    DC->>WSH: TASK_END (COMPLETED)
    WSH->>SM: on_task_end()
    EXT->>HTTP: GET /task_result/{session_id}
    HTTP->>SM: get_results()
    SM-->>HTTP: results
    HTTP-->>EXT: 200 {results}

The green box highlights async execution on the device side, which doesn't block the server.
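A small Python helper that exercises this flow end-to-end over the two HTTP endpoints. The response shapes follow the examples on this page; the polling strategy (empty result until completion) is an assumption:

import time

import requests

def dispatch_and_wait(base: str, client_id: str, request: str,
                      task_name: str, timeout: float = 120.0) -> dict:
    r = requests.post(f"{base}/api/dispatch", json={
        "client_id": client_id,
        "request": request,
        "task_name": task_name,
    })
    r.raise_for_status()

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        res = requests.get(f"{base}/api/task_result/{task_name}")
        if res.ok and res.json():  # assumes an empty body until completion
            return res.json()
        time.sleep(2.0)
    raise TimeoutError(f"no result for {task_name} within {timeout}s")

# result = dispatch_and_wait("http://localhost:5000", "my_windows_device",
#                            "Open Notepad and type Hello World", "test_task_001")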

Multi-Device Constellation Workflow

Constellation Client Coordinating Multiple Devices:

sequenceDiagram
    participant CC as Constellation Client
    participant Server as Agent Server
    participant D1 as Device 1 (GPU)
    participant D2 as Device 2 (CPU)
    CC->>Server: REGISTER (constellation)
    Server-->>CC: HEARTBEAT (OK)
    Note over CC: Plan multi-device DAG
    CC->>Server: TASK (target: device_1)<br/>Subtask 1: Image processing
    Server->>D1: TASK (forward)
    CC->>Server: TASK (target: device_2)<br/>Subtask 2: Data extraction
    Server->>D2: TASK (forward)
    par Parallel Execution
        D1->>D1: Process image on GPU
        D2->>D2: Extract data from DB
    end
    D1->>Server: COMMAND_RESULTS
    Server->>CC: COMMAND_RESULTS (from device_1)
    D2->>Server: COMMAND_RESULTS
    Server->>CC: COMMAND_RESULTS (from device_2)
    Note over CC: Combine results,<br/>Update DAG
    D1->>Server: TASK_END
    D2->>Server: TASK_END
    Server->>CC: TASK_END (both tasks)

The server acts as a message router, forwarding tasks to target devices and routing results back to the constellation orchestrator. See Constellation Documentation for more details on multi-device orchestration.
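A hedged sketch of the constellation side of this diagram: register, fan out two targeted TASK messages, then collect TASK_END notifications in arrival order. The payload field is illustrative, not UFO's exact schema:

import asyncio
import json

import websockets

async def run_dag():
    async with websockets.connect("ws://localhost:5000/ws") as ws:
        await ws.send(json.dumps({"message_type": "REGISTER",
                                  "client_id": "constellation_001",
                                  "client_type": "constellation"}))
        subtasks = [("device_1", "Image processing"),
                    ("device_2", "Data extraction")]
        for target, request in subtasks:  # server forwards each TASK
            await ws.send(json.dumps({"message_type": "TASK",
                                      "client_id": "constellation_001",
                                      "target_id": target,
                                      "payload": {"request": request}}))
        done = 0
        while done < len(subtasks):       # results arrive in any order
            msg = json.loads(await ws.recv())
            if msg.get("message_type") == "TASK_END":
                done += 1

asyncio.run(run_dag())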


Platform Support

The server automatically detects client platforms and creates appropriate session implementations.

Supported Platforms:

| Platform | Session Type | Capabilities | Status |
|---|---|---|---|
| Windows | WindowsSession | UI automation (UIA)<br/>COM API integration<br/>Native app control<br/>Screenshot capture | Full support |
| Linux | LinuxSession | Bash automation<br/>GUI tools (xdotool)<br/>Package management<br/>Process control | Full support |
| macOS | (Planned) | AppleScript<br/>UI automation<br/>Native app control | 🚧 In development |

Platform Auto-Detection:

The server automatically detects the client's platform during registration. You can override this globally with the --platform flag when needed for testing or specific deployment scenarios.

python -m ufo.server.app --platform windows  # Force Windows sessions
python -m ufo.server.app --platform linux    # Force Linux sessions
python -m ufo.server.app                     # Auto-detect (default)
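The selection logic implied here can be sketched as a simple factory. WindowsSession and LinuxSession are stubbed below, since the real implementations live in UFO, and the constructor signatures are assumptions:

import platform

class WindowsSession:  # stub; the real class provides UI automation
    ...

class LinuxSession:    # stub; the real class provides bash automation
    ...

def make_session(override: str | None = None):
    plat = override or platform.system().lower()  # auto-detect by default
    if plat == "windows":
        return WindowsSession()
    if plat == "linux":
        return LinuxSession()
    raise NotImplementedError(f"platform {plat!r} not yet supported (macOS planned)")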

When to Use Platform Override

Use --platform override when:

  • Testing cross-platform sessions without actual devices
  • Running the server in a container different from the target platform
  • Debugging platform-specific session behavior

For more details on platform-specific implementations, see Windows Agent and Linux Agent.


Configuration

The server runs out-of-the-box with sensible defaults. Advanced configuration inherits from UFO's central config system.

Command-Line Arguments

python -m ufo.server.app [OPTIONS]

Available Options:

| Option | Type | Default | Description |
|---|---|---|---|
| --port | int | 5000 | Server listening port |
| --host | str | 0.0.0.0 | Bind address (use 127.0.0.1 for localhost only) |
| --platform | str | auto | Force platform (windows, linux) |
| --log-level | str | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL, OFF) |
| --local | flag | False | Restrict to local connections only |

Configuration Examples

# Development: Local-only with debug logging
python -m ufo.server.app --local --log-level DEBUG --port 8000

# Production: External access, info logging
python -m ufo.server.app --host 0.0.0.0 --port 5000 --log-level INFO

# Testing: Force Linux sessions
python -m ufo.server.app --platform linux --port 9000

UFO Configuration Inheritance

The server uses UFO's central configuration from config_dev.yaml:

| Config Section | Inherited Settings |
|---|---|
| Agent Strategies | HostAgent, AppAgent, EvaluationAgent configurations |
| LLM Models | Model endpoints, API keys, temperature settings |
| Automators | UI automation, COM API, web automation configs |
| Logging | Log file paths, rotation, format |
| Prompts | Agent system prompts, example templates |

See Configuration Guide for comprehensive config documentation.


Monitoring & Operations

Health Monitoring

Monitor server status and performance using HTTP endpoints.

Health Check Endpoints:

# Server health and uptime
curl http://localhost:5000/api/health

# Response:
# {
#   "status": "healthy",
#   "online_clients": [...]
# }

# Connected clients list
curl http://localhost:5000/api/clients

# Response:
# {
#   "online_clients": ["windows_001", "linux_002", ...]
# }
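These two endpoints are enough for a basic liveness monitor. A hedged sketch (the expected client IDs are illustrative):

import time

import requests

EXPECTED = {"windows_001", "linux_002"}   # illustrative client IDs

def check(base: str = "http://localhost:5000") -> None:
    try:
        health = requests.get(f"{base}/api/health", timeout=5).json()
    except requests.RequestException as exc:
        print(f"ALERT: server unreachable: {exc}")
        return
    online = set(health.get("online_clients", []))
    for missing in EXPECTED - online:
        print(f"ALERT: client offline: {missing}")

while True:
    check()
    time.sleep(30)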

For comprehensive monitoring strategies including performance metrics collection, log aggregation patterns, alert configuration, and dashboard setup, see Monitoring Guide.

Error Handling

The server handles common failure scenarios gracefully to maintain system stability.

Disconnection Handling Matrix:

| Scenario | Server Detection | Automatic Action | Client Impact |
|---|---|---|---|
| Device Disconnect | Heartbeat timeout / WebSocket close | Cancel device sessions, notify constellation | Task fails, constellation retries |
| Constellation Disconnect | Heartbeat timeout / WebSocket close | Continue device execution, skip callbacks | Device completes but results not delivered |
| Task Execution Failure | TASK_END with error status | Log error, store in results | Client receives error via callback/HTTP |
| Network Partition | Heartbeat timeout | Mark disconnected, enable reconnection | Client reconnects with same ID |
| Server Crash | N/A | Clients detect via heartbeat | Clients reconnect to new instance |

Reconnection Support

Clients can reconnect with the same client_id. The server will re-register the client and restore heartbeat monitoring, but will not restore previous sessions (sessions are ephemeral).
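A client-side sketch of this contract: reconnect with the same client_id using exponential backoff, and treat prior sessions as lost. The handle dispatcher is hypothetical:

import asyncio
import json

import websockets

def handle(msg: dict) -> None:
    print("received:", msg)  # hypothetical message dispatcher

async def run_with_reconnect(client_id: str,
                             url: str = "ws://localhost:5000/ws") -> None:
    delay = 1.0
    while True:
        try:
            async with websockets.connect(url) as ws:
                await ws.send(json.dumps({
                    "message_type": "REGISTER",
                    "client_id": client_id,   # same ID on every attempt
                    "client_type": "device",
                }))
                delay = 1.0                   # reset backoff once registered
                async for raw in ws:          # main receive loop
                    handle(json.loads(raw))
        except (OSError, websockets.ConnectionClosed):
            await asyncio.sleep(delay)
            delay = min(delay * 2, 60.0)      # exponential backoff, capped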


Best Practices

Development Environment

Optimize your development workflow with these recommended practices.

Recommended Development Configuration:

# Isolate to localhost, enable detailed logging
python -m ufo.server.app \
  --host 127.0.0.1 \
  --port 5000 \
  --local \
  --log-level DEBUG

Development Checklist:

  • Use --local flag to prevent external access
  • Enable DEBUG logging for detailed traces
  • Monitor logs in separate terminal: tail -f logs/ufo_server.log
  • Test with single device before adding multiple clients
  • Use HTTP API for quick task dispatch testing
  • Verify heartbeat monitoring with client disconnection

Development Testing Pattern

# Terminal 1: Start server with debug logging
python -m ufo.server.app --local --log-level DEBUG

# Terminal 2: Connect device client
python -m ufo.client.client --ws --ws-server ws://127.0.0.1:5000/ws

# Terminal 3: Dispatch test task
curl -X POST http://127.0.0.1:5000/api/dispatch \
  -H "Content-Type: application/json" \
  -d '{"client_id": "windowsagent", "request": "Open Notepad", "task_name": "test_001"}'

Production Deployment

The default configuration is not production-ready. Implement these security and reliability measures.

Production Architecture:

graph LR
    Internet[Internet]
    LB[Load Balancer<br/>nginx/HAProxy]
    SSL[SSL/TLS<br/>Termination]
    subgraph "UFO Server Cluster"
        S1[Server Instance 1<br/>:5000]
        S2[Server Instance 2<br/>:5001]
        S3[Server Instance 3<br/>:5002]
    end
    Monitor[Monitoring<br/>Prometheus/Grafana]
    PM[Process Manager<br/>systemd/PM2]
    Internet --> LB
    LB --> SSL
    SSL --> S1
    SSL --> S2
    SSL --> S3
    PM -.Manages.-> S1
    PM -.Manages.-> S2
    PM -.Manages.-> S3
    S1 -.Metrics.-> Monitor
    S2 -.Metrics.-> Monitor
    S3 -.Metrics.-> Monitor
    style LB fill:#ffe1f5
    style SSL fill:#fff4e1
    style Monitor fill:#f0ffe1

Production Checklist:

| Category | Recommendation | Rationale |
|---|---|---|
| Reverse Proxy | nginx, Apache, or cloud load balancer | SSL termination, rate limiting, DDoS protection |
| SSL/TLS | Enable WSS (WebSocket Secure) | Encrypt client-server communication |
| Authentication | Add auth middleware to FastAPI | Prevent unauthorized access |
| Process Management | systemd (Linux), PM2, Docker | Auto-restart on crash, resource limits |
| Monitoring | /api/health polling, metrics export | Detect issues proactively |
| Logging | Structured logging, log aggregation (ELK) | Centralized debugging and audit trails |
| Resource Limits | Set max connections, memory limits | Prevent resource exhaustion |

Example Nginx Configuration:

upstream ufo_server {
    server localhost:5000;
}

server {
    listen 443 ssl;
    server_name ufo-server.example.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    # WebSocket endpoint
    location /ws {
        proxy_pass http://ufo_server;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 3600s;
    }

    # HTTP API
    location /api/ {
        proxy_pass http://ufo_server;
        proxy_set_header Host $host;
    }
}

Scaling Strategies

The server can scale horizontally for high-load deployments, but requires careful session management.

Scaling Patterns:

| Pattern | Description | Use Case | Considerations |
|---|---|---|---|
| Vertical | Increase CPU/RAM on single instance | < 100 concurrent clients | Simplest, no session distribution |
| Horizontal (Sticky Sessions) | Multiple instances with session affinity | 100-1000 clients | Load balancer routes same client to same instance |
| Horizontal (Shared State) | Multiple instances with Redis | > 1000 clients | Requires session state externalization |

Current Limitation

The current implementation stores session state in-memory. For horizontal scaling, use sticky sessions (client affinity) in your load balancer to route clients to consistent server instances. Future: Shared state backend (Redis) for true stateless horizontal scaling.
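With nginx, one common way to get that client affinity is the ip_hash directive; a sketch extending the upstream block from the nginx example above (cookie-based affinity via HAProxy or nginx Plus also works):

upstream ufo_server {
    ip_hash;                  # hash client IP so each client sticks to one instance
    server localhost:5000;
    server localhost:5001;
    server localhost:5002;
}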


Troubleshooting

Common Issues

Issue: Clients Can't Connect

# Symptom: Connection refused
Error: WebSocket connection to 'ws://localhost:5000/ws' failed

# Diagnosis:
# 1. Check server is running: curl http://localhost:5000/api/health
# 2. Verify port: netstat -an | grep 5000
# 3. Check firewall: sudo ufw status

# Solution:
# Start server with correct host binding
python -m ufo.server.app --host 0.0.0.0 --port 5000

Issue: Sessions Not Executing

# Symptom: Task dispatched but no results

# Diagnosis:
# 1. Check server logs for errors
# 2. Verify client is connected: curl http://localhost:5000/api/clients
# 3. Check target_id matches client_id

# Solution:
# Ensure client_id in request matches registered client
curl -X POST http://localhost:5000/api/dispatch \
  -H "Content-Type: application/json" \
  -d '{"client_id": "correct_client_id", "request": "test", "task_name": "test_001"}'

Issue: Memory Leak / High Memory Usage

# Symptom: Server memory grows over time

# Diagnosis:
# 1. Check session cleanup in logs
# 2. Monitor /api/health for session count
# 3. Profile with memory_profiler

# Solution:
# Ensure clients send TASK_END to complete sessions
# Restart server periodically (systemd handles this)
# Implement session timeout (future feature)

Debug Mode

Enable Maximum Verbosity

# Ultra-verbose debugging
python -m ufo.server.app \
  --log-level DEBUG \
  --local \
  --port 5000 2>&1 | tee debug.log

# Watch logs in real-time
tail -f debug.log | grep -E "(ERROR|WARNING|Session|WebSocket)"

Documentation Map

Explore related documentation to deepen your understanding of the Agent Server ecosystem.

Getting Started

| Document | Purpose |
|---|---|
| Quick Start | Get server running in < 5 minutes |
| Client Registration | How clients connect to server |

Architecture & Components

| Document | Purpose |
|---|---|
| Session Manager | Task execution lifecycle deep-dive |
| Client Connection Manager | Connection registry internals |
| WebSocket Handler | AIP protocol message handling |
| HTTP API | REST endpoint specifications |

Operations

| Document | Purpose |
|---|---|
| Monitoring | Health checks, metrics, alerting |

Protocols & Concepts

| Document | Purpose |
|---|---|
| AIP Protocol | Communication protocol specification |
| Agent Architecture | Agent design and FSM framework |
| Server-Client Architecture | Distributed architecture rationale |
| Client Overview | Device client architecture |
| MCP Integration | Model Context Protocol tool servers |

Next Steps

Follow this recommended sequence to master the Agent Server:

1. Run the Server (5 minutes)
   • Follow the Quick Start Guide
   • Verify the server responds to /api/health

2. Connect a Client (10 minutes)
   • Use a Device Client
   • Verify registration in server logs
   • Check the /api/clients endpoint

3. Dispatch Tasks (15 minutes)
   • Use the HTTP API to send tasks
   • Retrieve results via /api/task_result
   • Observe WebSocket message flow in logs

4. Understand the Architecture (30 minutes)
   • Read Session Manager internals
   • Study the WebSocket Handler protocol implementation
   • Review AIP Protocol message types

5. Deploy to Production (varies)
   • Set up a reverse proxy (nginx)
   • Configure SSL/TLS
   • Implement monitoring
   • Test failover scenarios