Prompts
All prompts used in UFO are stored in the ufo/prompts directory. The folder structure is as follows:
📦prompts
┣ 📂demonstration # Prompts for summarizing human demonstrations
┣ 📂evaluation # Prompts for the EvaluationAgent
┣ 📂examples # Demonstration examples for in-context learning
┣ 📂nonvisual # Examples for non-visual LLMs
┗ 📂visual # Examples for visual LLMs
┣ 📂experience # Prompts for summarizing agent self-experience
┣ 📂share # Shared prompt templates
┗ 📂base # Basic version of shared prompts
┣ 📜api.yaml # Basic API prompt
┣ 📜app_agent.yaml # Basic AppAgent prompt template
┗ 📜host_agent.yaml # Basic HostAgent prompt template
┗ 📂third_party # Third-party integration prompts (e.g., Linux agents)
Visual LLMs can process screenshots while non-visual LLMs rely on text-only control information.
Agent Prompts
Agent prompts are constructed from the following components:
| Component | Description | Source |
|---|---|---|
| Basic Template | Base template with system and user roles | YAML files in share/base/ |
| API Documentation | Skills and APIs available to the agent | Dynamically generated from MCP tools |
| Examples | In-context learning demonstrations | YAML files in examples/visual/ or examples/nonvisual/ |
You can find the base templates in the share/base directory.
How Prompts Are Constructed
The agent's Prompter class is responsible for:
- Loading YAML templates from the file system
- Formatting API documentation from available tools
- Selecting appropriate examples based on model type (visual/nonvisual)
- Combining all components into a structured message list for the LLM
- Injecting runtime context (observations, screenshots, retrieved knowledge)
Each agent type has its own specialized Prompter:
- HostAgentPrompter: Desktop-level orchestration with third-party agent support
- AppAgentPrompter: Application-level interactions with multi-action capabilities
- EvaluationAgentPrompter: Task evaluation and success assessment
For comprehensive details about the Prompter class architecture, template loading, and prompt construction workflow, see the Prompter documentation.