Session

A Session is a continuous conversation between the user and UFO. It manages multiple rounds of interaction, from the initial request to task completion, across different execution modes and platforms.

Quick Reference:

Session types? See Session Types
Lifecycle? See Session Lifecycle
Mode differences? See Execution Modes
Platform differences? See Platform-Specific Sessions

Overview

A Session represents a complete conversation workflow, containing one or more Rounds of agent execution. Sessions manage:

Context: Shared state across all rounds
Agents: HostAgent and AppAgent (or LinuxAgent)
Rounds: Individual request-response cycles
Evaluation: Optional task completion assessment
Experience: Learning from successful workflows

Relationship: Session vs Round

graph TB
    subgraph "Session (Conversation)"
        S[Session Instance]
        CTX[Context Shared State]
        R1[Round 1 Request 1]
        R2[Round 2 Request 2]
        R3[Round 3 Request 3]
        EVAL[Evaluation Optional]
    end

    subgraph "Round 1 Details"
        HOST1[HostAgent]
        APP1[AppAgent]
        CMD1[Commands]
    end

    subgraph "Round 2 Details"
        HOST2[HostAgent]
        APP2[AppAgent]
        CMD2[Commands]
    end

    S --> CTX
    S --> R1
    S --> R2
    S --> R3
    S --> EVAL

    R1 -.shares.-> CTX
    R2 -.shares.-> CTX
    R3 -.shares.-> CTX

    R1 --> HOST1
    HOST1 --> APP1
    APP1 --> CMD1

    R2 --> HOST2
    HOST2 --> APP2
    APP2 --> CMD2

    style S fill:#e1f5ff
    style CTX fill:#fff4e1
    style R1 fill:#f0ffe1
    style R2 fill:#f0ffe1
    style R3 fill:#f0ffe1
    style EVAL fill:#ffe1f5
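A minimal sketch of this relationship, assuming nothing about UFO's real classes: `ToySession` and `ToyRound` are hypothetical stand-ins, and the `history` key is invented for the example. The point is only that each round receives the session's context object itself, so state written in one round is visible in the next.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyRound:
    """One request-response cycle (hypothetical stand-in for a Round)."""

    id: int
    request: str
    context: Dict[str, Any]  # the session's context object, shared rather than copied

    def run(self) -> str:
        # Record this request in the shared state so later rounds can see it.
        self.context.setdefault("history", []).append(self.request)
        return f"done: {self.request}"


@dataclass
class ToySession:
    """A conversation that owns the context and creates rounds."""

    context: Dict[str, Any] = field(default_factory=dict)
    rounds: List[ToyRound] = field(default_factory=list)

    def create_new_round(self, request: str) -> ToyRound:
        new_round = ToyRound(id=len(self.rounds), request=request, context=self.context)
        self.rounds.append(new_round)
        return new_round


session = ToySession()
for text in ["Open Word", "Save document"]:
    session.create_new_round(text).run()
```

After both rounds have run, `session.context["history"]` holds both requests in order, even though each was written by a different round.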

Session Types

UFO supports 7 session types across Windows and Linux platforms:

Session Type	Platform	Mode	Description
Session	Windows	`normal`, `normal_operator`	Interactive with HostAgent
ServiceSession	Windows	`service`	WebSocket-controlled via AIP
FollowerSession	Windows	`follower`	Replays saved plans
FromFileSession	Windows	`batch_normal`	Executes from request files
OpenAIOperatorSession	Windows	`operator`	Pure operator mode
LinuxSession	Linux	`normal`, `normal_operator`	Interactive without HostAgent
LinuxServiceSession	Linux	`service`	WebSocket-controlled on Linux
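The table can be read as a lookup from platform and mode to a session class. The sketch below transcribes it into a dictionary; `SESSION_TYPES` and `session_type_for` are hypothetical names for illustration, and UFO's own factory may resolve session types differently.

```python
from typing import Dict, Tuple

# (platform, mode) -> session class name, transcribed from the table above.
SESSION_TYPES: Dict[Tuple[str, str], str] = {
    ("windows", "normal"): "Session",
    ("windows", "normal_operator"): "Session",
    ("windows", "service"): "ServiceSession",
    ("windows", "follower"): "FollowerSession",
    ("windows", "batch_normal"): "FromFileSession",
    ("windows", "operator"): "OpenAIOperatorSession",
    ("linux", "normal"): "LinuxSession",
    ("linux", "normal_operator"): "LinuxSession",
    ("linux", "service"): "LinuxServiceSession",
}


def session_type_for(platform: str, mode: str) -> str:
    """Return the session class name for a platform/mode pair."""
    key = (platform.lower(), mode.lower())
    if key not in SESSION_TYPES:
        raise ValueError(f"Mode {mode!r} is not supported on {platform!r}")
    return SESSION_TYPES[key]
```

A pair missing from the table, such as `follower` on Linux, raises `ValueError` instead of silently falling back to another session type.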

Class Hierarchy

graph TB
    BASE[BaseSession Abstract]
    WIN_BASE[WindowsBaseSession with HostAgent]
    LINUX_BASE[LinuxBaseSession without HostAgent]

    SESSION[Session Interactive]
    SERVICE[ServiceSession WebSocket]
    FOLLOWER[FollowerSession Plan Replay]
    FROMFILE[FromFileSession Batch]
    OPERATOR[OpenAIOperatorSession Operator]

    LINUX_SESS[LinuxSession Interactive]
    LINUX_SERVICE[LinuxServiceSession WebSocket]

    BASE --> WIN_BASE
    BASE --> LINUX_BASE

    WIN_BASE --> SESSION
    WIN_BASE --> SERVICE
    WIN_BASE --> FOLLOWER
    WIN_BASE --> FROMFILE
    WIN_BASE --> OPERATOR

    LINUX_BASE --> LINUX_SESS
    LINUX_BASE --> LINUX_SERVICE

    style BASE fill:#e1f5ff
    style WIN_BASE fill:#fff4e1
    style LINUX_BASE fill:#f0ffe1
    style SESSION fill:#e1ffe1
    style LINUX_SESS fill:#e1ffe1

Platform Base Classes

WindowsBaseSession: Creates HostAgent, supports two-tier architecture
LinuxBaseSession: Single-tier architecture with LinuxAgent only

Session Lifecycle

Standard Lifecycle

stateDiagram-v2
    [*] --> Initialized: __init__
    Initialized --> ContextReady: _init_context
    ContextReady --> Running: run()
    Running --> RoundCreate: create_new_round
    RoundCreate --> RoundExecute: round.run()
    RoundExecute --> RoundComplete: Round finishes
    RoundComplete --> CheckMore: is_finished?
    CheckMore --> RoundCreate: More requests
    CheckMore --> Snapshot: No more requests
    Snapshot --> Evaluation: capture_last_snapshot
    Evaluation --> CostPrint: evaluation() if enabled
    CostPrint --> [*]: Session complete

Core Execution Loop

The main session logic:

async def run(self) -> List[Dict[str, str]]:
    """
    Run the session.
    """

    while not self.is_finished():
        # Create a new round for each request
        round = self.create_new_round()
        if round is None:
            break

        # Execute the round and record its result
        round_result = await round.run()
        self.results.append({"request": round.request, "result": round_result})

    # Capture final state (the method itself checks for an application window)
    await self.capture_last_snapshot()

    # Evaluate if configured (evaluation() is synchronous)
    if self._should_evaluate and not self.is_error():
        self.evaluation()

    # Print cost summary
    self.print_cost()

    return self.results
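To see the loop's control flow in isolation, here is a self-contained toy version that can be run without UFO installed. `ToySession`, `ToyRound`, and the `events` list are hypothetical; the real session performs a snapshot, an evaluation, and a cost printout where this sketch only records the step names.

```python
import asyncio
from typing import Dict, List, Optional


class ToyRound:
    def __init__(self, request: str) -> None:
        self.request = request

    async def run(self) -> str:
        await asyncio.sleep(0)  # stand-in for agent work
        return f"handled {self.request!r}"


class ToySession:
    def __init__(self, requests: List[str], should_evaluate: bool) -> None:
        self._pending = list(requests)
        self._should_evaluate = should_evaluate
        self.results: List[Dict[str, str]] = []
        self.events: List[str] = []  # records the order of the closing steps

    def is_finished(self) -> bool:
        return not self._pending

    def create_new_round(self) -> Optional[ToyRound]:
        return ToyRound(self._pending.pop(0)) if self._pending else None

    async def run(self) -> List[Dict[str, str]]:
        while not self.is_finished():
            current = self.create_new_round()
            if current is None:
                break
            result = await current.run()
            self.results.append({"request": current.request, "result": result})

        # Closing steps, in the same order as the session loop above.
        self.events.append("snapshot")
        if self._should_evaluate:
            self.events.append("evaluation")
        self.events.append("cost")
        return self.results


toy = ToySession(["Open Word", "Save document"], should_evaluate=True)
results = asyncio.run(toy.run())
```

The closing steps run once, after the last round, regardless of how many rounds were executed.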

Lifecycle Stages

1. Initialization

session = Session(
    task="email_task",
    should_evaluate=True,
    id=0,
    request="Send an email to John",
    mode="normal"
)

What happens:

- Task name assigned
- Session ID set
- Initial request stored
- Mode configured

2. Context Initialization

def _init_context(self) -> None:
    """Initialize the session context."""
    super()._init_context()

    # Create MCP server manager
    mcp_server_manager = MCPServerManager()

    # Create local dispatcher
    command_dispatcher = LocalCommandDispatcher(
        session=self,
        mcp_server_manager=mcp_server_manager
    )

    # Attach to context
    self.context.attach_command_dispatcher(command_dispatcher)

What happens:

- Context object created
- Command dispatcher attached (Local or WebSocket)
- MCP servers initialized (if applicable)
- Application window tracked

3. Round Creation

def create_new_round(self):
    """Create a new round."""

    # Get request (first or new)
    if not self.context.get(ContextNames.REQUEST):
        request = first_request()
    else:
        request, complete = new_request()
        if complete:
            return None

    # Create round with request
    round = Round(
        task=self.task,
        context=self.context,
        request=request,
        id=self._round_num
    )

    self._round_num += 1
    return round

What happens:

- User prompted for request (interactive modes)
- Or request read from file/plan (non-interactive)
- Round object created with shared context
- Round counter incremented

4. Round Execution

await round.run()

What happens:

- HostAgent selects application (Windows)
- AppAgent executes in application (or LinuxAgent directly)
- Commands dispatched and executed
- Results captured in context
- Experience logged

5. Continuation Check

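def is_finished(self) -> bool:
    """Check if session is complete."""
    limit_reached = self.step >= ufo_config.system.max_step or self.total_rounds >= ufo_config.system.max_round
    return self._finish or limit_reached or self.is_error()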

What happens:

- Check if user wants another request
- Check if the step or round limit has been reached
- Check if error occurred
- Check if plan is complete (follower/batch modes)
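The continuation check can be modelled as a pure function, which makes the stopping conditions easy to test. This sketch follows the `BaseSession.is_finished` listing in the Reference section, which also stops on step and round limits; `SessionState` is a hypothetical container, and the default limits of 50 steps and 10 rounds are made-up values, not UFO's configuration.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of the values the check depends on."""

    finish_requested: bool = False
    step: int = 0
    total_rounds: int = 0
    in_error: bool = False


def is_finished(state: SessionState, max_step: int = 50, max_round: int = 10) -> bool:
    """Return True when the session should stop creating rounds."""
    if state.finish_requested:
        return True
    if state.step >= max_step or state.total_rounds >= max_round:
        return True
    return state.in_error
```

Any one condition is enough to end the session, so a session in an error state stops even if the user has more requests.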

6. Final Snapshot

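async def capture_last_snapshot(self) -> None:
    """Capture the last snapshot of the application."""

    # Nothing to capture if no application was ever selected
    if self.application_window is None and self.application_window_info is None:
        return

    # Capture the final application screenshot
    await self.capture_last_screenshot(self.log_path + "action_step_final.png")

    # Save the UI tree if configured
    if ufo_config.system.save_ui_tree:
        ui_tree_save_path = os.path.join(self.log_path, "ui_trees", "ui_tree_final.json")
        await self.capture_last_ui_tree(ui_tree_save_path)

    # Save a full-desktop screenshot if configured
    if ufo_config.system.save_full_screen:
        await self.capture_last_screenshot(self.log_path + "desktop_final.png", full_screen=True)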

What happens:

- Screenshot captured
- Control tree logged
- Final state preserved

7. Evaluation

def evaluation(self) -> None:
    """Evaluate the session."""

    evaluator = EvaluationAgent(
        name="eva_agent",
        is_visual=ufo_config.evaluation_agent.visual_mode,
        main_prompt=ufo_config.system.EVALUATION_PROMPT,
        example_prompt="",
    )

    # evaluate() is synchronous and returns the result together with its cost
    result, cost = evaluator.evaluate(
        request=self.request_to_evaluate(),
        log_path=self.log_path,
        eva_all_screenshots=ufo_config.system.eva_all_screenshots,
        context=self.context,
    )

    self._results.append(result)
    self.cost += cost
    self.evaluation_logger.write(json.dumps(result))

What happens:

- EvaluationAgent created
- Task completion assessed
- Score logged
- Feedback saved

8. Cost Summary

def print_cost(self) -> None:
    """Print the session cost."""

    total_cost = self.context.get(ContextNames.TOTAL_COST, 0.0)
    total_tokens = self.context.get(ContextNames.TOTAL_TOKENS, 0)

    console.print(f"[bold green]Session Complete[/bold green]")
    console.print(f"Total Cost: ${total_cost:.4f}")
    console.print(f"Total Tokens: {total_tokens}")

Execution Modes

Normal Mode

Interactive execution with user in the loop:

session = Session(
    task="document_edit",
    should_evaluate=True,
    id=0,
    request="",  # Will prompt user
    mode="normal"
)

await session.run()

Features:

- User prompted for initial request via first_request()
- User prompted for each new request via new_request()
- Commands executed locally via LocalCommandDispatcher
- User can exit anytime by typing "N"

Flow:

1. Display welcome panel
2. User enters: "Open Word"
3. HostAgent selects Word application
4. AppAgent types content
5. User asked: "What next?"
6. User enters: "Save document"
7. AppAgent saves file
8. User asked: "What next?"
9. User enters: "N" (exit)
10. Session ends

Normal_Operator Mode

Normal mode with operator capabilities:

session = Session(
    task="complex_workflow",
    should_evaluate=True,
    id=0,
    request="Organize my files by date",
    mode="normal_operator"
)

Differences from Normal:

- Agent can use operator-level actions
- More powerful command set
- Same interactive workflow

Service Mode

WebSocket-controlled remote execution:

from aip.protocol.task_execution import TaskExecutionProtocol

protocol = TaskExecutionProtocol(websocket_connection)

session = ServiceSession(
    task="remote_automation",
    should_evaluate=True,
    id="session_abc123",
    request="Click Submit button",
    task_protocol=protocol
)

await session.run()

Features:

- No user interaction prompts
- Single request per session
- Commands sent via WebSocket
- Results returned to server
- Uses WebSocketCommandDispatcher

Flow:

1. Server sends request via WebSocket
2. ServiceSession created
3. Agent generates commands
4. Commands sent to client via WebSocket
5. Client executes locally
6. Results sent back
7. Session finishes immediately

Key Difference:

def is_finished(self) -> bool:
    """Service session finishes after one round."""
    return self._round_num > 0

Follower Mode

Replay saved action plans:

session = FollowerSession(
    task="email_replay",
    plan_file="/plans/send_email.json",
    should_evaluate=True,
    id=0
)

await session.run()

Features:

- No user prompts
- Reads actions from plan file
- Deterministic execution
- Good for testing/demos

Plan File Format:

{
  "request": "Send an email to John",
  "actions": [
    {
      "agent": "HostAgent",
      "action": "select_application",
      "parameters": {"app_name": "Outlook"}
    },
    {
      "agent": "AppAgent",
      "action": "click_element",
      "parameters": {"label": "New Email"}
    }
  ]
}
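Before replaying a plan, it can help to check that the file has the shape shown above. The following is a minimal sketch using only the standard `json` module; `parse_plan` and `REQUIRED_ACTION_KEYS` are hypothetical helpers, and the set of required keys is inferred from this one example, not from a published schema.

```python
import json
from typing import Any, Dict, List

# Keys every action carries in the example plan above (an assumption, not a schema).
REQUIRED_ACTION_KEYS = {"agent", "action", "parameters"}


def parse_plan(text: str) -> Dict[str, Any]:
    """Parse a plan document and check the fields shown in the example."""
    plan = json.loads(text)
    if not isinstance(plan, dict):
        raise ValueError("plan must be a JSON object")
    if not isinstance(plan.get("request"), str) or not plan["request"]:
        raise ValueError("plan needs a non-empty 'request' string")

    actions: List[Dict[str, Any]] = plan.get("actions", [])
    for index, action in enumerate(actions):
        missing = REQUIRED_ACTION_KEYS - action.keys()
        if missing:
            raise ValueError(f"action {index} is missing {sorted(missing)}")
    return plan


example = """
{
  "request": "Send an email to John",
  "actions": [
    {"agent": "HostAgent", "action": "select_application",
     "parameters": {"app_name": "Outlook"}},
    {"agent": "AppAgent", "action": "click_element",
     "parameters": {"label": "New Email"}}
  ]
}
"""
plan = parse_plan(example)
```

Failing early with the index of the malformed action is usually easier to debug than a replay that stops partway through.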

Batch_Normal Mode

Execute multiple requests from files:

session = FromFileSession(
    task="batch_task",
    plan_file="/requests/task1.json",
    should_evaluate=True,
    id=0
)

await session.run()

Features:

- Request loaded from file
- No user interaction
- Can batch multiple files with SessionPool
- Task status tracking available

Request File:

{
  "request": "Create a spreadsheet with sales data"
}

Operator Mode

Pure operator-level execution:

session = OpenAIOperatorSession(
    task="system_automation",
    should_evaluate=True,
    id=0,
    request="Install and configure software"
)

await session.run()

Features:

- Operator-level permissions
- Can modify system settings
- More powerful than AppAgent
- Same interactive prompts as normal mode

Platform-Specific Sessions

Windows Sessions

Characteristics:

- Two-tier architecture: HostAgent → AppAgent
- Base class: WindowsBaseSession
- Agent flow: HostAgent selects app, AppAgent controls it
- Automation: Uses UIA (UI Automation)

Example:

class Session(WindowsBaseSession):
    """Windows interactive session."""

    def _init_context(self):
        """Initialize with HostAgent."""
        super()._init_context()

        # HostAgent created by WindowsBaseSession
        self.host_agent = self.create_host_agent()

        # MCP and LocalCommandDispatcher
        self.setup_command_dispatcher()

Linux Sessions

Characteristics:

- Single-tier architecture: LinuxAgent only (no HostAgent)
- Base class: LinuxBaseSession
- Agent flow: LinuxAgent controls application directly
- Automation: Platform-specific tools

Example:

class LinuxSession(LinuxBaseSession):
    """Linux interactive session."""

    def _init_context(self):
        """Initialize without HostAgent."""
        super()._init_context()

        # No HostAgent - direct LinuxAgent usage
        self.linux_agent = self.create_linux_agent(
            application_name=self.application_name
        )

Comparison:

Aspect	Windows	Linux
Architecture	Two-tier (HostAgent + AppAgent)	Single-tier (LinuxAgent)
Application Selection	HostAgent decides	Pre-specified or LinuxAgent decides
Agent Switching	Yes (HostAgent ↔ AppAgent)	No
Modes Supported	All 6 modes	normal, normal_operator, service
UI Automation	UIA (UIAutomation)	Platform tools

See Platform Sessions for detailed comparison.

Experience Saving

Sessions can save successful workflows for future learning:

# After successful task completion
if self.configs["SAVE_EXPERIENCE"] == "ask":
    save = experience_asker()

    if save:
        self.experience_saver()

Save Modes:

Mode	Behavior
`always`	Auto-save every successful session
`ask`	Prompt user after each session
`auto`	Save if evaluation score > threshold
`always_not`	Never save
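The four modes amount to a small decision function. The sketch below is one way to write it down; `should_save_experience` and its parameters are hypothetical, the `0.8` threshold is an assumed value (the source does not state one), and `ask_user` stands in for whatever prompt the session shows.

```python
from typing import Callable, Optional

SAVE_MODES = {"always", "ask", "auto", "always_not"}


def should_save_experience(
    mode: str,
    succeeded: bool,
    score: Optional[float] = None,
    threshold: float = 0.8,  # assumed value; the source does not give a threshold
    ask_user: Callable[[], bool] = lambda: False,
) -> bool:
    """Decide whether to save, following the save-mode table above."""
    if mode not in SAVE_MODES:
        raise ValueError(f"Unknown save mode: {mode!r}")
    if not succeeded or mode == "always_not":
        return False
    if mode == "always":
        return True
    if mode == "ask":
        return ask_user()
    # mode == "auto": needs an evaluation score above the threshold
    return score is not None and score > threshold
```

In `auto` mode a session without an evaluation score is never saved, which is why evaluation needs to be enabled for that mode to have any effect in this sketch.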

Saved Experience Structure:

{
  "task": "Send email",
  "request": "Send an email to John about the meeting",
  "trajectory": [
    {
      "round": 0,
      "agent": "HostAgent",
      "observation": "Desktop with Outlook icon",
      "action": "select_application",
      "parameters": {"app_name": "Outlook"}
    },
    {
      "round": 0,
      "agent": "AppAgent",
      "observation": "Outlook main window",
      "action": "click_element",
      "parameters": {"label": "New Email"}
    }
  ],
  "outcome": "success",
  "evaluation_score": 0.95,
  "cost": 0.0234,
  "tokens": 1542
}

Error Handling

Error States

Sessions track errors through context:

def is_error(self) -> bool:
    """Check if session encountered error."""
    return self.context.get(ContextNames.ERROR, False)

def set_error(self, error_message: str):
    """Set error state."""
    self.context.set(ContextNames.ERROR, True)
    self.context.set(ContextNames.ERROR_MESSAGE, error_message)

Error Recovery

while not self.is_finished():
    round = self.create_new_round()
    if round is None:
        break

    try:
        await round.run()
    except AgentError as e:
        self.set_error(str(e))
        self.logger.error(f"Round {self._round_num} failed: {e}")

        # Decide whether to continue or abort
        if self.can_recover(e):
            # Try next round
            continue
        else:
            # Abort session
            break

Common Errors

Error Type	Cause	Handling
TimeoutError	Command execution timeout	Retry or skip
ConnectionError	WebSocket/MCP disconnection	Reconnect or abort
AgentError	Agent decision failure	Log and retry
ValidationError	Invalid command parameters	Skip command
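One way to keep this table close to the code is a lookup from exception type to handling strategy. In the sketch below, `TimeoutError` and `ConnectionError` are Python built-ins, while `AgentError` and `ValidationError` are defined locally as placeholders because their real import paths are not shown here; the strategy strings and the `"abort"` fallback are also assumptions.

```python
from typing import Dict, Type


class AgentError(Exception):
    """Placeholder for the agent failure exception (real location not shown)."""


class ValidationError(Exception):
    """Placeholder for the invalid-parameters exception (real location not shown)."""


# Exception type -> handling strategy, transcribed from the table above.
HANDLING: Dict[Type[BaseException], str] = {
    TimeoutError: "retry_or_skip",
    ConnectionError: "reconnect_or_abort",
    AgentError: "log_and_retry",
    ValidationError: "skip_command",
}


def handling_for(error: BaseException) -> str:
    """Return the strategy for an error, or "abort" for anything unlisted."""
    for error_type, strategy in HANDLING.items():
        if isinstance(error, error_type):
            return strategy
    return "abort"
```

Because the lookup uses `isinstance`, subclasses such as `ConnectionResetError` inherit the strategy of their parent type.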

Best Practices

Session Creation

Efficient Sessions

✅ Use SessionFactory.create_session() for platform-aware creation
✅ Enable evaluation for quality tracking
✅ Choose appropriate mode for use case
✅ Set meaningful task names for logging
❌ Don't create sessions directly (use factory)
❌ Don't mix modes (each session has one mode)

Interactive Sessions

User Experience

✅ Provide clear initial requests
✅ Allow users to exit gracefully ("N" option)
✅ Show progress and confirmations
✅ Handle sensitive actions with confirmation
❌ Don't prompt excessively
❌ Don't hide errors from users

Service Sessions

WebSocket Considerations

✅ Always provide task_protocol
✅ Handle connection loss gracefully
✅ Set appropriate timeouts
✅ Validate requests before execution
❌ Don't assume connection is stable
❌ Don't block waiting for results indefinitely
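The advice about timeouts can be illustrated with `asyncio.wait_for`, which bounds how long a coroutine waits for a pending result. This is a generic sketch, not UFO's dispatcher code: `wait_for_result` is a hypothetical helper, and plain futures stand in for replies arriving over a WebSocket.

```python
import asyncio
from typing import Optional, Tuple


async def wait_for_result(pending: asyncio.Future, timeout_s: float) -> Optional[str]:
    """Wait for a reply, returning None instead of blocking past the timeout."""
    try:
        return await asyncio.wait_for(pending, timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


async def demo() -> Tuple[Optional[str], Optional[str]]:
    loop = asyncio.get_running_loop()

    # A reply that arrives shortly after the request.
    answered: asyncio.Future = loop.create_future()
    loop.call_later(0.01, answered.set_result, "clicked")

    # A reply that never arrives, as after a dropped connection.
    silent: asyncio.Future = loop.create_future()

    return (
        await wait_for_result(answered, timeout_s=1.0),
        await wait_for_result(silent, timeout_s=0.05),
    )


fast, slow = asyncio.run(demo())
```

The caller can then treat `None` as a signal to reconnect or abort, rather than waiting on a connection that may already be gone.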

Batch Sessions

Batch Processing

✅ Enable task status tracking
✅ Use descriptive file names
✅ Group similar tasks
✅ Log failures for retry
❌ Don't stop batch on first failure
❌ Don't run too many sessions in parallel
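The last two points, continuing past failures and capping parallelism, can be combined with `asyncio.Semaphore` and `asyncio.gather`. In this sketch `run_one` is a hypothetical stand-in for creating and running a single batch session, and the limit of two parallel sessions is an arbitrary choice.

```python
import asyncio
from typing import Dict, List


async def run_one(request: str) -> str:
    """Hypothetical stand-in for creating and running one batch session."""
    await asyncio.sleep(0)
    if "fail" in request:
        raise RuntimeError(f"could not complete {request!r}")
    return f"ok: {request}"


async def run_batch(requests: List[str], max_parallel: int = 2) -> Dict[str, List[str]]:
    gate = asyncio.Semaphore(max_parallel)  # caps how many sessions run at once

    async def guarded(request: str) -> str:
        async with gate:
            return await run_one(request)

    # return_exceptions=True hands back exceptions as results instead of
    # raising the first one, so every request gets an outcome.
    outcomes = await asyncio.gather(
        *(guarded(r) for r in requests), return_exceptions=True
    )

    report: Dict[str, List[str]] = {"succeeded": [], "failed": []}
    for request, outcome in zip(requests, outcomes):
        key = "failed" if isinstance(outcome, Exception) else "succeeded"
        report[key].append(request)
    return report


report = asyncio.run(run_batch(["task1", "fail-task2", "task3"]))
```

The `failed` list doubles as the retry queue suggested above.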

Examples

Example 1: Basic Interactive Session

from ufo.module.sessions.session import Session

# Create session
session = Session(
    task="word_editing",
    should_evaluate=True,
    id=0,
    request="",  # Will prompt user
    mode="normal"
)

# Run session
await session.run()

# User interaction:
# 1. Welcome panel shown
# 2. User enters: "Open Word and type Hello World"
# 3. HostAgent selects Word
# 4. AppAgent types text
# 5. User asked for next request
# 6. User enters: "N" to exit
# 7. Session evaluates and ends

Example 2: Service Session

from ufo.module.sessions.service_session import ServiceSession
from aip.protocol.task_execution import TaskExecutionProtocol

# WebSocket established
protocol = TaskExecutionProtocol(websocket)

# Create service session
session = ServiceSession(
    task="remote_click",
    should_evaluate=False,  # Server evaluates
    id="sess_12345",
    request="Click the Submit button",
    task_protocol=protocol
)

# Run (non-blocking for client)
await session.run()

# Session finishes after one request

Example 3: Follower Session

from ufo.module.sessions.session import FollowerSession

# Replay saved plan
session = FollowerSession(
    task="email_demo",
    plan_file="./plans/send_email.json",
    should_evaluate=True,
    id=0
)

await session.run()

# Executes exactly as recorded in plan file
# No user prompts
# Deterministic execution

Example 4: Linux Session

from ufo.module.sessions.linux_session import LinuxSession

# Linux interactive session
session = LinuxSession(
    task="linux_task",
    should_evaluate=True,
    id=0,
    request="Open gedit and type Hello Linux",
    mode="normal",
    application_name="gedit"
)

await session.run()

# Single-tier architecture
# No HostAgent
# LinuxAgent controls gedit directly

Reference

BaseSession

Bases: ABC

A basic session in UFO. A session consists of multiple rounds of interactions and conversations.

Initialize a session.

Parameters:	`task` (`str`) – The name of current task. `should_evaluate` (`bool`) – Whether to evaluate the session. `id` (`str`) – The id of the session.

Source code in module/basic.py

def __init__(self, task: str, should_evaluate: bool, id: str) -> None:
    """
    Initialize a session.
    :param task: The name of current task.
    :param should_evaluate: Whether to evaluate the session.
    :param id: The id of the session.
    """

    self._should_evaluate = should_evaluate
    self._id = id
    self.task = task

    # Logging-related properties
    self.log_path = f"logs/{task}/"
    utils.create_folder(self.log_path)

    self._rounds: Dict[int, BaseRound] = {}

    self._context = Context()
    self._init_context()
    self._finish = False
    self._results = []
    self.logger = logging.getLogger(__name__)

    # Initialize platform-specific agents
    # Subclasses should override _init_agents() to set up their agents
    self._host_agent: Optional[HostAgent] = None
    self._init_agents()

`application_window` `property` `writable`

Get the application of the session.

Returns:	The application of the session.

`application_window_info` `property` `writable`

Get the application window info of the session.

Returns:	The application window info of the session.

`context` `property`

Get the context of the session.

Returns:	The context of the session.

`cost` `property` `writable`

Get the cost of the session.

Returns:	The cost of the session.

`current_agent_class` `property`

Get the class name of the current agent.

Returns:	The class name of the current agent.

`current_round` `property`

Get the current round of the session.

Returns:	The current round of the session.

`evaluation_logger` `property`

Get the file writer for evaluation.

Returns:	The file writer for evaluation.

`host_agent` `property`

Get the host agent of the session. May return None for sessions that don't use a host agent (e.g., Linux).

Returns:	`Optional[HostAgent]` – The host agent of the session, or None if not applicable.

`id` `property`

Get the id of the session.

Returns:	The id of the session.

`results` `property` `writable`

Get the evaluation results of the session.

Returns:	The evaluation results of the session.

`rounds` `property`

Get the rounds of the session.

Returns:	The rounds of the session.

`session_type` `property`

Get the class name of the session.

Returns:	The class name of the session.

`step` `property`

Get the step of the session.

Returns:	The step of the session.

`total_rounds` `property`

Get the total number of rounds in the session.

Returns:	The total number of rounds in the session.

`add_round(id, round)`

Add a round to the session.

Parameters:	`id` (`int`) – The id of the round. `round` (`BaseRound`) – The round to be added.

Source code in module/basic.py

def add_round(self, id: int, round: BaseRound) -> None:
    """
    Add a round to the session.
    :param id: The id of the round.
    :param round: The round to be added.
    """
    self._rounds[id] = round

`capture_last_screenshot(save_path, full_screen=False)` `async`

Capture the last window screenshot.

Parameters:	`save_path` (`str`) – The path to save the window screenshot. `full_screen` (`bool`, default: `False` ) – Whether to capture the full screen or just the active window.

Source code in module/basic.py

async def capture_last_screenshot(
    self, save_path: str, full_screen: bool = False
) -> None:
    """
    Capture the last window screenshot.
    :param save_path: The path to save the window screenshot.
    :param full_screen: Whether to capture the full screen or just the active window.
    """

    try:
        if full_screen:
            command = Command(
                tool_name="capture_desktop_screenshot",
                parameters={"all_screens": True},
                tool_type="data_collection",
            )
        else:

            command = Command(
                tool_name="capture_window_screenshot",
                parameters={},
                tool_type="data_collection",
            )

        result = await self.context.command_dispatcher.execute_commands([command])
        image = result[0].result

        self.logger.info(f"Captured screenshot at final: {save_path}")
        if image:
            utils.save_image_string(image, save_path)

    except Exception as e:
        self.logger.warning(
            f"The last snapshot capture failed, due to the error: {e}"
        )

`capture_last_snapshot()` `async`

Capture the last snapshot of the application, including the screenshot and the XML file if configured.

Source code in module/basic.py

async def capture_last_snapshot(self) -> None:
    """
    Capture the last snapshot of the application, including the screenshot and the XML file if configured.
    """
    # Capture the final screenshot
    screenshot_save_path = self.log_path + "action_step_final.png"

    if (
        self.application_window is not None
        or self.application_window_info is not None
    ):

        await self.capture_last_screenshot(screenshot_save_path)

        if ufo_config.system.save_ui_tree:
            ui_tree_path = os.path.join(self.log_path, "ui_trees")
            ui_tree_file_name = "ui_tree_final.json"
            ui_tree_save_path = os.path.join(ui_tree_path, ui_tree_file_name)
            await self.capture_last_ui_tree(ui_tree_save_path)

        if ufo_config.system.save_full_screen:

            desktop_save_path = self.log_path + "desktop_final.png"

            await self.capture_last_screenshot(desktop_save_path, full_screen=True)

`capture_last_ui_tree(save_path)` `async`

Capture the last UI tree snapshot.

Parameters:	`save_path` (`str`) – The path to save the UI tree snapshot.

Source code in module/basic.py

async def capture_last_ui_tree(self, save_path: str) -> None:
    """
    Capture the last UI tree snapshot.
    :param save_path: The path to save the UI tree snapshot.
    """

    result = await self.context.command_dispatcher.execute_commands(
        [
            Command(
                tool_name="get_ui_tree",
                parameters={},
                tool_type="data_collection",
            )
        ]
    )

    if result and result[0].result:
        with open(save_path, "w") as file:
            json.dump(result[0].result, file, indent=4)

`create_following_round()`

Create a following round.

Returns:	`BaseRound` – The following round.

Source code in module/basic.py

def create_following_round(self) -> BaseRound:
    """
    Create a following round.
    return: The following round.
    """
    pass

`create_new_round()` `abstractmethod`

Create a new round.

Source code in module/basic.py

@abstractmethod
def create_new_round(self) -> Optional[BaseRound]:
    """
    Create a new round.
    """
    pass

`evaluation()`

Evaluate the session.

Source code in module/basic.py

def evaluation(self) -> None:
    """
    Evaluate the session.
    """
    console.print(_safe_console_text("📊 Evaluating the session..."), style="yellow")

    is_visual = ufo_config.evaluation_agent.visual_mode

    evaluator = EvaluationAgent(
        name="eva_agent",
        is_visual=is_visual,
        main_prompt=ufo_config.system.EVALUATION_PROMPT,
        example_prompt="",
    )

    requests = self.request_to_evaluate()

    # Evaluate the session, first use the default setting, if failed, then disable the screenshot evaluation.
    try:
        result, cost = evaluator.evaluate(
            request=requests,
            log_path=self.log_path,
            eva_all_screenshots=ufo_config.system.eva_all_screenshots,
            context=self.context,
        )
    except Exception as e:
        result, cost = evaluator.evaluate(
            request=requests,
            log_path=self.log_path,
            eva_all_screenshots=False,
            context=self.context,
        )

    # Add additional information to the evaluation result.
    additional_info = {
        "level": "session",
        "request": requests,
        "type": "evaluation_result",
    }
    result.update(additional_info)

    self._results.append(result)

    self.cost += cost

    evaluator.print_response(result)

    self.evaluation_logger.write(json.dumps(result))

    self.logger.info(
        f"Evaluation result saved to {os.path.join(self.log_path, 'evaluation.log')}"
    )

`experience_saver()`

Save the current trajectory as agent experience.

Source code in module/basic.py

def experience_saver(self) -> None:
    """
    Save the current trajectory as agent experience.
    """
    console.print(
        _safe_console_text(
            "📚 Summarizing and saving the execution flow as experience..."
        ),
        style="yellow",
    )

    summarizer = ExperienceSummarizer(
        ufo_config.app_agent.visual_mode,
        ufo_config.system.EXPERIENCE_PROMPT,
        ufo_config.system.APPAGENT_EXAMPLE_PROMPT,
        ufo_config.system.API_PROMPT,
    )
    experience = summarizer.read_logs(self.log_path)
    summaries, cost = summarizer.get_summary_list(experience)

    experience_path = ufo_config.system.EXPERIENCE_SAVED_PATH
    utils.create_folder(experience_path)
    summarizer.create_or_update_yaml(
        summaries, os.path.join(experience_path, "experience.yaml")
    )
    summarizer.create_or_update_vector_db(
        summaries, os.path.join(experience_path, "experience_db")
    )

    self.cost += cost
    self.logger.info(f"The experience has been saved to {experience_path}")

`initialize_logger(log_path, log_filename, mode='a')` `staticmethod`

Initialize logging.

Parameters:	`log_path` (`str`) – The path of the log file. `log_filename` (`str`) – The name of the log file.

Returns:	`Logger` – The logger.

Source code in module/basic.py

@staticmethod
def initialize_logger(log_path: str, log_filename: str, mode="a") -> logging.Logger:
    """
    Initialize logging.
    log_path: The path of the log file.
    log_filename: The name of the log file.
    return: The logger.
    """
    # Code for initializing logging
    logger = logging.Logger(log_filename)

    if not ufo_config.system.print_log:
        # Remove existing handlers if PRINT_LOG is False
        logger.handlers = []

    log_file_path = os.path.join(log_path, log_filename)
    file_handler = logging.FileHandler(log_file_path, mode=mode, encoding="utf-8")
    formatter = logging.Formatter("%(message)s")
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    logger.setLevel(ufo_config.system.log_level)

    return logger

`is_error()`

Check if the session is in error state.

Returns:	True if the session is in error state, otherwise False.

Source code in module/basic.py

def is_error(self):
    """
    Check if the session is in error state.
    return: True if the session is in error state, otherwise False.
    """
    if self.current_round is not None:
        return self.current_round.state.name() == AgentStatus.ERROR.value
    return False

`is_finished()`

Check if the session is ended.

Returns:	`bool` – True if the session is ended, otherwise False.

Source code in module/basic.py

def is_finished(self) -> bool:
    """
    Check if the session is ended.
    return: True if the session is ended, otherwise False.
    """
    if (
        self._finish
        or self.step >= ufo_config.system.max_step
        or self.total_rounds >= ufo_config.system.max_round
    ):
        return True

    if self.is_error():
        return True

    return False

`next_request()` `abstractmethod`

Get the next request of the session.

Returns:	`str` – The request of the session.

Source code in module/basic.py

@abstractmethod
def next_request(self) -> str:
    """
    Get the next request of the session.
    return: The request of the session.
    """
    pass

`print_cost()`

Print the total cost of the session.

Source code in module/basic.py

def print_cost(self) -> None:
    """
    Print the total cost of the session.
    """

    if isinstance(self.cost, float) and self.cost > 0:
        formatted_cost = "${:.2f}".format(self.cost)
        console.print(
            _safe_console_text(
                f"💰 Total request cost of the session: {formatted_cost}"
            ),
            style="yellow",
        )
    else:
        console.print(
            _safe_console_text(
                f"ℹ️  Cost is not available for the model {ufo_config.host_agent.api_model} or {ufo_config.app_agent.api_model}."
            ),
            style="yellow",
        )
        self.logger.warning("Cost information is not available.")

`request_to_evaluate()` `abstractmethod`

Get the request to evaluate.

Returns:	`str` – The request(s) to evaluate.

Source code in module/basic.py

@abstractmethod
def request_to_evaluate(self) -> str:
    """
    Get the request to evaluate.
    return: The request(s) to evaluate.
    """
    pass

`reset()` `abstractmethod`

Reset the session to initial state.

Source code in module/basic.py

@abstractmethod
def reset(self) -> None:
    """
    Reset the session to initial state.
    """
    pass

`run()` `async`

Run the session.

Returns:	`List[Dict[str, str]]` – The result per session

Source code in module/basic.py

async def run(self) -> List[Dict[str, str]]:
    """
    Run the session.
    :return: The result per session
    """

    while not self.is_finished():

        round = self.create_new_round()
        if round is None:
            break

        round_result = await round.run()

        self.results.append({"request": round.request, "result": round_result})

    await self.capture_last_snapshot()

    if self._should_evaluate and not self.is_error():
        self.evaluation()

    if ufo_config.system.log_to_markdown:

        self.save_log_to_markdown()

    self.print_cost()

    return self.results

`save_log_to_markdown()`

Save the log of the session to markdown file.

Source code in module/basic.py

def save_log_to_markdown(self) -> None:
    """
    Save the log of the session to markdown file.
    """

    file_path = self.log_path
    trajectory = Trajectory(file_path)
    trajectory.to_markdown(file_path + "/output.md")
    self.logger.info(f"Trajectory saved to {file_path + '/output.md'}")

Session (Windows)

Bases: WindowsBaseSession

A session for UFO.

Initialize a session.

Parameters:
`task` (`str`) – The name of the current task.
`should_evaluate` (`bool`) – Whether to evaluate the session.
`id` (`int`) – The id of the session.
`request` (`str`, default: `''`) – The user request of the session, optional. If not provided, UFO asks the user to enter the request.
`mode` (`str`, default: `'normal'`) – The mode of the task.

Source code in module/sessions/session.py

def __init__(
    self,
    task: str,
    should_evaluate: bool,
    id: int,
    request: str = "",
    mode: str = "normal",
) -> None:
    """
    Initialize a session.
    :param task: The name of current task.
    :param should_evaluate: Whether to evaluate the session.
    :param id: The id of the session.
    :param request: The user request of the session, optional. If not provided, UFO will ask the user to input the request.
    :param mode: The mode of the task.
    """

    self._mode = mode
    super().__init__(task, should_evaluate, id)

    self._init_request = request
    self.logger = logging.getLogger(__name__)

`create_new_round()`

Create a new round.

Source code in module/sessions/session.py

def create_new_round(self) -> Optional[BaseRound]:
    """
    Create a new round.
    """

    # Get a request for the new round.
    request = self.next_request()

    # Create a new round and return None if the session is finished.

    if self.is_finished():
        return None

    self._host_agent.set_state(self._host_agent.default_state)

    round = BaseRound(
        request=request,
        agent=self._host_agent,
        context=self.context,
        should_evaluate=ufo_config.system.eva_round,
        id=self.total_rounds,
    )

    self.add_round(round.id, round)

    return round

`next_request()`

Get the request for the host agent.

Returns:	`str` – The request for the host agent.

Source code in module/sessions/session.py

def next_request(self) -> str:
    """
    Get the request for the host agent.
    :return: The request for the host agent.
    """
    if self.total_rounds == 0:

        # If the request is provided via command line, use it directly.
        if self._init_request:
            return self._init_request
        # Otherwise, ask the user to input the request with enhanced UX.
        else:
            return interactor.first_request()
    else:
        request, iscomplete = interactor.new_request()
        if iscomplete:
            self._finish = True
        return request
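
The branching in `next_request()` reduces to a small pure function once the interactive prompts are passed in as callables. This is a sketch under that assumption: `pick_request`, `ask_first`, and `ask_next` are hypothetical names standing in for the method and for `interactor.first_request` / `interactor.new_request`, and the finished flag is returned instead of being stored on `self._finish`.

```python
from typing import Callable, Tuple


def pick_request(
    total_rounds: int,
    init_request: str,
    ask_first: Callable[[], str],
    ask_next: Callable[[], Tuple[str, bool]],
) -> Tuple[str, bool]:
    """Return (request, finished) following the branching in next_request()."""
    if total_rounds == 0:
        # First round: prefer the request supplied up front, else prompt.
        return (init_request or ask_first(), False)
    # Later rounds: the prompt itself reports whether the user is done.
    return ask_next()
```

Only a follow-up prompt can mark the session finished here. The first round always yields a request, whether it comes from the constructor argument or from the user.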

`request_to_evaluate()`

Get the request to evaluate.

Returns:	`str` – The request(s) to evaluate.

Source code in module/sessions/session.py

def request_to_evaluate(self) -> str:
    """
    Get the request to evaluate.
    return: The request(s) to evaluate.
    """
    request_memory = self._host_agent.blackboard.requests
    return request_memory.to_json()

`run()` `async`

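Run the session, then save the experience according to the `save_experience` setting (`always`, `ask`, `auto`, or `always_not`).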

Source code in module/sessions/session.py

async def run(self) -> None:
    """
    Run the session.
    """
    await super().run()
    # Save the experience if the user asks so.

    save_experience = ufo_config.system.save_experience

    self.logger.info(f"Save experience setting: {save_experience}")

    if save_experience == "always":
        self.experience_saver()
    elif save_experience == "ask":
        if interactor.experience_asker():
            self.experience_saver()

    elif save_experience == "auto":
        task_completed = self.results.get("complete", "no")
        if task_completed.lower() == "yes":
            self.experience_saver()

    elif save_experience == "always_not":
        pass
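
The four `save_experience` values amount to a small decision table. The sketch below expresses it as a standalone predicate. `should_save_experience` and `ask_user` are hypothetical names, and the completion flag is passed in as a plain string rather than read from `self.results`, since the shape of `results` at that point is not shown in this excerpt.

```python
from typing import Callable


def should_save_experience(
    setting: str,
    task_completed: str = "no",
    ask_user: Callable[[], bool] = lambda: False,
) -> bool:
    """Decide whether to save, mirroring the branches in Session.run()."""
    if setting == "always":
        return True
    if setting == "ask":
        # Defer to the user; stands in for interactor.experience_asker().
        return ask_user()
    if setting == "auto":
        # Save only when the task was reported as completed.
        return task_completed.lower() == "yes"
    # "always_not" and any unrecognised value: do not save.
    return False
```

As in the source listing, a value outside the four known settings falls through without saving.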

LinuxSession

Bases: LinuxBaseSession

A session for UFO on Linux platform. Unlike Windows sessions, Linux sessions don't use a HostAgent. They work directly with application agents.

Initialize a Linux session.

Parameters:
`task` (`str`) – The name of the current task.
`should_evaluate` (`bool`) – Whether to evaluate the session.
`id` (`int`) – The id of the session.
`request` (`str`, default: `''`) – The user request of the session.
`mode` (`str`, default: `'normal'`) – The mode of the task.

Source code in module/sessions/linux_session.py

def __init__(
    self,
    task: str,
    should_evaluate: bool,
    id: int,
    request: str = "",
    mode: str = "normal",
) -> None:
    """
    Initialize a Linux session.
    :param task: The name of current task.
    :param should_evaluate: Whether to evaluate the session.
    :param id: The id of the session.
    :param request: The user request of the session.
    :param mode: The mode of the task.
    """
    self._mode = mode
    self._init_request = request
    super().__init__(task, should_evaluate, id)
    self.logger = logging.getLogger(__name__)

`create_new_round()`

Create a new round for Linux session. Since there's no host agent, directly create app-level rounds.

Source code in module/sessions/linux_session.py

def create_new_round(self) -> Optional[BaseRound]:
    """
    Create a new round for Linux session.
    Since there's no host agent, directly create app-level rounds.
    """
    request = self.next_request()

    if self.is_finished():
        return None

    round = BaseRound(
        request=request,
        agent=self._agent,
        context=self.context,
        should_evaluate=ufo_config.system.eva_round,
        id=self.total_rounds,
    )

    self.add_round(round.id, round)
    return round

`next_request()`

Get the request for the app agent.

Returns:	`str` – The request for the app agent.

Source code in module/sessions/linux_session.py

def next_request(self) -> str:
    """
    Get the request for the app agent.
    :return: The request for the app agent.
    """
    if self.total_rounds == 0:
        if self._init_request:
            return self._init_request
        else:
            return interactor.first_request()
    else:
        request, iscomplete = interactor.new_request()
        if iscomplete:
            self._finish = True
        return request

`request_to_evaluate()`

Get the request to evaluate.

Returns:	`str` – The request(s) to evaluate.

Source code in module/sessions/linux_session.py

def request_to_evaluate(self) -> str:
    """
    Get the request to evaluate.
    :return: The request(s) to evaluate.
    """
    # For Linux session, collect requests from all rounds
    if self.current_round and hasattr(self.current_round.agent, "blackboard"):
        request_memory = self.current_round.agent.blackboard.requests
        return request_memory.to_json()
    return self._init_request
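
The fallback in this method can be checked without a real agent. The sketch below rewrites it as a free function and builds stand-in rounds with `SimpleNamespace`; `make_round` and the shape of the stub blackboard are assumptions made for illustration, not UFO types.

```python
import json
from types import SimpleNamespace
from typing import Any, List, Optional


def request_to_evaluate(current_round: Optional[Any], init_request: str) -> str:
    """Prefer the agent's recorded requests; fall back to the initial request."""
    if current_round and hasattr(current_round.agent, "blackboard"):
        return current_round.agent.blackboard.requests.to_json()
    return init_request


def make_round(requests: List[str]) -> SimpleNamespace:
    """Hypothetical round whose agent has a blackboard with a requests memory."""
    memory = SimpleNamespace(to_json=lambda: json.dumps(requests))
    agent = SimpleNamespace(blackboard=SimpleNamespace(requests=memory))
    return SimpleNamespace(agent=agent)
```

With no current round, or with an agent that has no `blackboard` attribute, the function returns the initial request unchanged.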
