Local MCP Servers

Local MCP servers run in-process with the UFO² agent, so tool calls incur no network overhead. They are the most common server type for built-in functionality.

Overview

UFO² includes several built-in local MCP servers, organized by functionality. This page is a quick reference; follow the link for each server to reach its complete documentation.

| Server                | Type            | Description                      | Full Documentation |
|-----------------------|-----------------|----------------------------------|--------------------|
| UICollector           | Data Collection | Windows UI observation           | → Full Docs        |
| HostUIExecutor        | Action          | Desktop-level UI automation      | → Full Docs        |
| AppUIExecutor         | Action          | Application-level UI automation  | → Full Docs        |
| CommandLineExecutor   | Action          | Shell command execution          | → Full Docs        |
| WordCOMExecutor       | Action          | Microsoft Word COM API           | → Full Docs        |
| ExcelCOMExecutor      | Action          | Microsoft Excel COM API          | → Full Docs        |
| PowerPointCOMExecutor | Action          | Microsoft PowerPoint COM API     | → Full Docs        |
| PDFReaderExecutor     | Action          | PDF text extraction              | → Full Docs        |
| ConstellationEditor   | Action          | Multi-device task orchestration  | → Full Docs        |

Server Summaries

UICollector

Type: Data Collection (read-only, automatically invoked)
Platform: Windows
Tools: 8 tools for screenshots, window lists, control info, and annotations

→ See complete UICollector documentation for all tool details, parameters, return values, and examples.


HostUIExecutor

Type: Action (LLM-selectable, state-modifying)
Platform: Windows
Agent: HostAgent
Tool: select_application_window - Window selection and focus management

→ See complete HostUIExecutor documentation for tool specifications and workflow examples.


AppUIExecutor

Type: Action (LLM-selectable, GUI automation)
Platform: Windows
Agent: AppAgent
Tools: 9 tools for clicks, typing, scrolling, and UI interaction

→ See complete AppUIExecutor documentation for all automation tools and usage patterns.


CommandLineExecutor

Type: Action (LLM-selectable, shell execution)
Platform: Cross-platform
Agent: HostAgent, AppAgent
Tool: run_shell - Execute shell commands

→ See complete CommandLineExecutor documentation for security guidelines and examples.


WordCOMExecutor

Type: Action (LLM-selectable, Word COM API)
Platform: Windows
Agent: AppAgent (WINWORD.EXE only)
Tools: 6 tools for Word document automation

→ See complete WordCOMExecutor documentation for all Word automation tools.


ExcelCOMExecutor

Type: Action (LLM-selectable, Excel COM API)
Platform: Windows
Agent: AppAgent (EXCEL.EXE only)
Tools: 6 tools for Excel automation

→ See complete ExcelCOMExecutor documentation for all Excel manipulation tools.


PowerPointCOMExecutor

Type: Action (LLM-selectable, PowerPoint COM API)
Platform: Windows
Agent: AppAgent (POWERPNT.EXE only)
Tools: 2 tools for PowerPoint automation

→ See complete PowerPointCOMExecutor documentation for PowerPoint tools and examples.


PDFReaderExecutor

Type: Action (LLM-selectable, PDF text extraction)
Platform: Windows
Agent: AppAgent (explorer.exe)
Tools: 3 tools for PDF text extraction with human simulation

→ See complete PDFReaderExecutor documentation for PDF extraction tools and workflows.


ConstellationEditor

Type: Action (LLM-selectable, multi-device orchestration)
Platform: Cross-platform
Agent: ConstellationAgent
Tools: 7 tools for task and dependency management

→ See complete ConstellationEditor documentation for multi-device workflow tools.


Configuration

All local servers are configured in config/ufo/mcp.yaml. For detailed configuration options, see:

  • MCP Configuration Guide - Complete configuration reference
  • Individual server documentation for server-specific configuration

Example configuration:

AppAgent:
  WINWORD.EXE:
    data_collection:
      - namespace: UICollector
        type: local
        reset: false
    action:
      - namespace: AppUIExecutor  # GUI automation
        type: local
        reset: false
      - namespace: WordCOMExecutor  # API automation
        type: local
        reset: true  # Reset when switching documents
      - namespace: CommandLineExecutor
        type: local
        reset: false
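
To make the shape of this configuration concrete, the following minimal sketch walks it and lists the local servers registered for one agent and process. It is an illustration only, not UFO²'s own loading code: `list_servers` and `servers_to_reset` are hypothetical helper names, the configuration is written inline as the nested dict a YAML loader would produce for the example above, and treating a missing `reset` key as `false` is an assumption.

```python
# Hypothetical helpers, not part of UFO²: they walk a config shaped like the
# example above and report the servers configured for one agent/process pair.

# The example configuration, written as the nested dict that a YAML loader
# (for instance PyYAML's yaml.safe_load) would produce for it.
CONFIG = {
    "AppAgent": {
        "WINWORD.EXE": {
            "data_collection": [
                {"namespace": "UICollector", "type": "local", "reset": False},
            ],
            "action": [
                {"namespace": "AppUIExecutor", "type": "local", "reset": False},
                {"namespace": "WordCOMExecutor", "type": "local", "reset": True},
                {"namespace": "CommandLineExecutor", "type": "local", "reset": False},
            ],
        }
    }
}


def list_servers(config, agent, process):
    """Return (category, namespace, reset) tuples for the local servers."""
    entries = config.get(agent, {}).get(process, {})
    result = []
    for category in ("data_collection", "action"):
        for server in entries.get(category, []):
            if server.get("type") == "local":
                # Assumption: an entry without a reset key behaves as reset: false.
                reset = server.get("reset", False)
                result.append((category, server["namespace"], reset))
    return result


def servers_to_reset(config, agent, process):
    """Return the namespaces whose entry sets reset: true."""
    return [ns for _, ns, reset in list_servers(config, agent, process) if reset]


if __name__ == "__main__":
    for category, namespace, reset in list_servers(CONFIG, "AppAgent", "WINWORD.EXE"):
        print(f"{category:16} {namespace:22} reset={reset}")
```

In this example only WordCOMExecutor is marked for reset, which matches the comment in the configuration: the COM-based server is reset when switching documents, while the GUI and command-line servers are not.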

See Also