Supported Models

UFO supports a wide variety of LLM models and APIs. You can configure different models for HOST_AGENT, APP_AGENT, BACKUP_AGENT, and EVALUATION_AGENT in the config/ufo/agents.yaml file to optimize for performance, cost, or specific capabilities.

Available Model Integrations

Provider	Documentation	Visual Support	Authentication
OpenAI	OpenAI API	✅	API Key
Azure OpenAI (AOAI)	Azure OpenAI API	✅	API Key / Azure AD
Google Gemini	Gemini API	✅	API Key
Anthropic Claude	Claude API	✅	API Key
Qwen (Alibaba)	Qwen API	✅	API Key
DeepSeek	DeepSeek API	❌	API Key
Ollama	Ollama API	⚠️ Limited	Local
OpenAI Operator	Operator (CUA)	✅	Azure AD
Custom Models	Custom API	Depends	Varies

Model Selection Guide

By Use Case

For Production Deployments: - Primary: OpenAI GPT-4o or Azure OpenAI (enterprise features) - Cost-optimized: GPT-4o-mini for APP_AGENT, GPT-4o for HOST_AGENT - Privacy-sensitive: Ollama (local models)

For Development & Testing: - Fast iteration: Gemini 2.0 Flash (high speed, low cost) - Local testing: Ollama with llama2 or similar - Budget-friendly: DeepSeek or Qwen models

For Specialized Tasks: - Computer control: OpenAI Operator (CUA model) - Code generation: DeepSeek-Coder or Claude - Long context: Gemini 1.5 Pro (large context window)

By Capability

Vision Support (Screenshot Understanding): - ✅ OpenAI GPT-4o, GPT-4-turbo - ✅ Azure OpenAI (vision-enabled deployments) - ✅ Google Gemini (all 1.5+ models) - ✅ Claude 3+ (all variants) - ✅ Qwen-VL models - ⚠️ Ollama (llava models only) - ❌ DeepSeek (text-only)

JSON Schema Support: - ✅ OpenAI / Azure OpenAI - ✅ Google Gemini - ⚠️ Limited: Claude, Qwen, Ollama

Configuration Architecture

Each model is implemented as a separate class in the ufo/llm directory, inheriting from the BaseService class in ufo/llm/base.py. All models implement the chat_completion method to maintain a consistent interface.

Key Configuration Files:

config/ufo/agents.yaml: Primary agent configuration (HOST, APP, BACKUP, EVALUATION, OPERATOR)
config/ufo/system.yaml: System-wide LLM parameters (MAX_TOKENS, TEMPERATURE, etc.)
config/ufo/prices.yaml: Cost tracking for different models

Multi-Provider Setup

You can mix and match providers for different agents to optimize cost and performance:

# Use OpenAI for planning
HOST_AGENT:
  API_TYPE: "openai"
  API_MODEL: "gpt-4o"

# Use Azure OpenAI for execution (cost control)
APP_AGENT:
  API_TYPE: "aoai"
  API_MODEL: "gpt-4o-mini"

# Use Claude for evaluation
EVALUATION_AGENT:
  API_TYPE: "claude"
  API_MODEL: "claude-3-5-sonnet-20241022"

Getting Started

Choose your LLM provider from the table above
Follow the provider-specific documentation to obtain API keys
Configure config/ufo/agents.yaml with your credentials
Refer to the Quick Start Guide to begin

For detailed configuration options:

Agent Configuration Guide - Complete configuration reference
System Configuration - LLM parameters and behavior
Quick Start Guide - Step-by-step setup