AppAgent: Application Execution Agent

AppAgent is the core execution runtime in UFO, responsible for carrying out individual subtasks within a specific Windows application. Each AppAgent functions as an isolated, application-specialized worker process launched and orchestrated by the central HostAgent.

What is AppAgent?

AppAgent Architecture: Application-specialized worker process for subtask execution

AppAgent operates as a child agent under the HostAgent's orchestration:

Isolated Runtime: Each AppAgent is dedicated to a single Windows application
Subtask Executor: Executes specific subtasks delegated by HostAgent
Application Expert: Tailored with deep knowledge of the target app's API surface, control semantics, and domain logic
Hybrid Execution: Leverages both GUI automation and API-based actions through MCP commands

Unlike monolithic Computer-Using Agents (CUAs) that treat all GUI contexts uniformly, each AppAgent is tailored to a single application and operates with specialized knowledge of its interface and capabilities.

Core Responsibilities

graph TB
    subgraph "AppAgent Core Responsibilities"
        SR[Sense: Capture Application State]
        RE[Reason: Analyze Next Action]
        EX[Execute: GUI or API Action]
        RP[Report: Write Results to Blackboard]
    end
    SR --> RE
    RE --> EX
    EX --> RP
    RP --> SR
    style SR fill:#e3f2fd
    style RE fill:#fff3e0
    style EX fill:#f1f8e9
    style RP fill:#fce4ec

Responsibility	Description	Example
State Sensing	Capture application UI, detect controls, understand current state	Screenshot Word window → Detect 50 controls → Annotate UI elements
Reasoning	Analyze state and determine next action using LLM	"Table visible with Export button [12] → Click to export data"
Action Execution	Execute GUI clicks or API calls via MCP commands	`click_input(control_id=12)` or `execute_word_command("export_table")`
Result Reporting	Write execution results to shared Blackboard	Write extracted data to `subtask_result_1` for HostAgent

ReAct-Style Control Loop

Upon receiving a subtask and execution context from the HostAgent, the AppAgent initializes a ReAct-style control loop where it iteratively:

Observes the current application state (screenshot + control detection)
Thinks about the next step (LLM reasoning)
Acts by executing either a GUI or API-based action (MCP commands)

sequenceDiagram
    participant HostAgent
    participant AppAgent
    participant Application
    participant Blackboard
    HostAgent->>AppAgent: Delegate subtask "Extract table from Word"
    loop ReAct Loop
        AppAgent->>Application: Observe (screenshot + controls)
        Application-->>AppAgent: UI state
        AppAgent->>AppAgent: Think (LLM reasoning)
        AppAgent->>Application: Act (click/API call)
        Application-->>AppAgent: Action result
    end
    AppAgent->>Blackboard: Write result
    AppAgent->>HostAgent: Return control

The MCP command system enables reliable control over dynamic and complex UIs by favoring structured API commands whenever available, while retaining fallback to GUI-based interaction commands when necessary.
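
The observe, think, act cycle described above can be sketched in a few lines. This is a minimal illustration, not UFO's implementation: `ToyApp`, `Step`, `think`, and `run_react_loop` are hypothetical names, and the LLM reasoning step is replaced by a hard-coded rule.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """One decision produced by the (stubbed) reasoning step."""
    function: str
    args: Dict[str, str]
    status: str  # "CONTINUE" or "FINISH"


@dataclass
class ToyApp:
    """Hypothetical stand-in for an application: a click counter."""
    clicks: int = 0

    def observe(self) -> Dict[str, int]:
        # A real agent would capture a screenshot and detect controls here.
        return {"clicks": self.clicks}

    def act(self, step: Step) -> str:
        if step.function == "click_input":
            self.clicks += 1
            return "clicked"
        return "noop"


def think(observation: Dict[str, int], target: int) -> Step:
    """Stub for LLM reasoning: keep clicking until the target is reached."""
    if observation["clicks"] >= target:
        return Step("", {}, "FINISH")
    return Step("click_input", {"control_id": "12"}, "CONTINUE")


def run_react_loop(app: ToyApp, target: int, max_rounds: int = 10) -> List[str]:
    """Iterate observe -> think -> act until the stub reports FINISH."""
    trace: List[str] = []
    for _ in range(max_rounds):
        observation = app.observe()        # Observe
        step = think(observation, target)  # Think
        if step.status == "FINISH":
            trace.append("finish")
            break
        trace.append(app.act(step))        # Act
    return trace
```

The `max_rounds` bound is an assumption added so that the sketch always terminates even if the reasoning stub never reports `FINISH`.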

Execution Architecture

Finite State Machine

AppAgent uses a finite state machine with 7 states to control its execution flow:

CONTINUE: Continue processing the current subtask
FINISH: Successfully complete the subtask
ERROR: Encounter an unrecoverable error
FAIL: Fail to complete the subtask
PENDING: Wait for user input or clarification
CONFIRM: Request user confirmation for sensitive actions
SCREENSHOT: Capture and re-annotate the application screenshot

State Details: See State Machine Documentation for complete state definitions and transitions.
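
As a rough illustration of how the seven states could drive the loop, the sketch below models them as an enum. The state names come from the list above; the grouping into terminal and user-facing states and the `next_step` helper are assumptions made for this example, and the actual transitions are defined in the State Machine Documentation.

```python
from enum import Enum


class AppAgentStatus(Enum):
    CONTINUE = "CONTINUE"
    FINISH = "FINISH"
    ERROR = "ERROR"
    FAIL = "FAIL"
    PENDING = "PENDING"
    CONFIRM = "CONFIRM"
    SCREENSHOT = "SCREENSHOT"


# Assumption: these states end the subtask and hand control back to HostAgent.
TERMINAL = {AppAgentStatus.FINISH, AppAgentStatus.ERROR, AppAgentStatus.FAIL}
# Assumption: these states pause the loop until the user responds.
NEEDS_USER = {AppAgentStatus.PENDING, AppAgentStatus.CONFIRM}


def next_step(status: AppAgentStatus) -> str:
    """Map a status to a coarse description of what happens next."""
    if status in TERMINAL:
        return "return_to_host"
    if status in NEEDS_USER:
        return "wait_for_user"
    # CONTINUE and SCREENSHOT both lead to another execution round.
    return "run_next_round"
```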

4-Phase Processing Pipeline

Each execution round follows a 4-phase pipeline:

graph LR
    DC[Phase 1: DATA_COLLECTION Screenshot + Controls] --> LLM[Phase 2: LLM_INTERACTION Reasoning]
    LLM --> AE[Phase 3: ACTION_EXECUTION GUI/API Action]
    AE --> MU[Phase 4: MEMORY_UPDATE Record Action]
    style DC fill:#e1f5ff
    style LLM fill:#fff4e6
    style AE fill:#e8f5e9
    style MU fill:#fce4ec

Strategy Details: See Processing Strategy Documentation for complete pipeline implementation.
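
One way to picture a single round is as an ordered list of phase functions that share a context dictionary. The phase names below match the diagram; the phase bodies, the `PIPELINE` list, and `run_round` are hypothetical placeholders rather than the processor's real strategies.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]


def collect_data(ctx: Context) -> None:
    # Placeholder for screenshot capture and control detection.
    ctx["controls"] = ["Export [12]"]


def interact_with_llm(ctx: Context) -> None:
    # Placeholder for LLM reasoning over the collected state.
    ctx["action"] = {"function": "click_input", "control": ctx["controls"][0]}


def execute_action(ctx: Context) -> None:
    # Placeholder for dispatching the chosen command.
    ctx["result"] = f"executed {ctx['action']['function']}"


def update_memory(ctx: Context) -> None:
    # Record the outcome so that later rounds can refer to it.
    ctx.setdefault("memory", []).append(ctx["result"])


PIPELINE: List[Tuple[str, Callable[[Context], None]]] = [
    ("DATA_COLLECTION", collect_data),
    ("LLM_INTERACTION", interact_with_llm),
    ("ACTION_EXECUTION", execute_action),
    ("MEMORY_UPDATE", update_memory),
]


def run_round(ctx: Context) -> List[str]:
    """Run the four phases in order and return the names that executed."""
    executed: List[str] = []
    for name, phase in PIPELINE:
        phase(ctx)
        executed.append(name)
    return executed
```

Each phase reads what the previous one wrote into the context, which is why the order is fixed.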

Hybrid GUI–API Execution

AppAgent executes actions through the MCP (Model Context Protocol) command system, which provides a unified interface for both GUI automation and native API calls:

# GUI-based command (fallback)
command = Command(
    tool_name="click_input",
    parameters={"control_id": "12", "button": "left"}
)
await command_dispatcher.execute_commands([command])

# API-based command (preferred when available)
command = Command(
    tool_name="word_export_table",
    parameters={"format": "csv", "path": "output.csv"}
)
await command_dispatcher.execute_commands([command])

Implementation: See Hybrid Actions for details on the MCP command system.
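
The "API preferred, GUI as fallback" policy can be sketched as a small wrapper around a dispatcher. Everything here is a stand-in written for illustration: this `Command` dataclass only mirrors the shape used above, and `FakeDispatcher`, `run_with_fallback`, and the choice of `LookupError` for an unavailable tool are assumptions, not UFO's API.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass
class Command:
    """Hypothetical stand-in mirroring Command(tool_name=..., parameters=...)."""
    tool_name: str
    parameters: Dict[str, Any]


class FakeDispatcher:
    """Hypothetical dispatcher that only knows a fixed set of tools."""

    def __init__(self, available: Set[str]) -> None:
        self.available = available

    async def execute_commands(self, commands: List[Command]) -> List[str]:
        results: List[str] = []
        for command in commands:
            if command.tool_name not in self.available:
                raise LookupError(f"unknown tool: {command.tool_name}")
            results.append(f"ok:{command.tool_name}")
        return results


async def run_with_fallback(
    dispatcher: FakeDispatcher, preferred: Command, fallback: Command
) -> List[str]:
    """Try the API command first; use the GUI command if it is unavailable."""
    try:
        return await dispatcher.execute_commands([preferred])
    except LookupError:
        return await dispatcher.execute_commands([fallback])


def demo(available: Set[str]) -> List[str]:
    api = Command("word_export_table", {"format": "csv", "path": "output.csv"})
    gui = Command("click_input", {"control_id": "12", "button": "left"})
    return asyncio.run(run_with_fallback(FakeDispatcher(available), api, gui))
```

Calling `demo({"click_input"})` exercises the fallback path, because the API tool is not registered with the fake dispatcher.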

Knowledge Enhancement

AppAgent is enhanced with Retrieval Augmented Generation (RAG) from heterogeneous sources:

Knowledge Source	Purpose	Configuration
Help Documents	Application-specific documentation	Learning from Help Documents
Bing Search	Latest information and updates	Learning from Bing Search
Self-Demonstrations	Successful action trajectories	Experience Learning
Human Demonstrations	Expert-provided workflows	Learning from Demonstrations

Knowledge Substrate Overview: See Knowledge Substrate for the complete RAG architecture.

Command System

AppAgent executes actions through the MCP (Model Context Protocol) command system:

Application-Level Commands:

capture_window_screenshot - Capture application window
get_control_info - Detect UI controls via UIA/OmniParser
click_input - Click on UI control
set_edit_text - Type text into input field
annotation - Annotate screenshot with control labels

Command Details: See Command System Documentation for complete command reference.

Control Detection Backends

AppAgent supports multiple control detection backends for comprehensive UI understanding:

UIA (UI Automation):
Native Windows UI Automation API for standard controls

✅ Fast and accurate
✅ Works with most Windows applications
❌ May miss custom controls

OmniParser (Visual Detection):
Vision-based grounding model for visual elements

✅ Detects icons, images, custom controls
✅ Works with web content
❌ Requires external service

Hybrid (UIA + OmniParser):
Best of both worlds - maximum coverage

✅ Native controls + visual elements
✅ Comprehensive UI understanding

Control Detection Details: See Control Detection Overview.
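
To make the hybrid idea concrete, the sketch below merges two lists of bounding boxes and drops visual detections that overlap an already known control. The overlap test (intersection over union with a 0.5 threshold) is an assumption chosen for illustration; the source does not state how UFO deduplicates UIA and OmniParser results.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_controls(
    uia: List[Box], visual: List[Box], threshold: float = 0.5
) -> List[Box]:
    """Keep all UIA boxes and add visual boxes that do not duplicate them."""
    merged = list(uia)
    for box in visual:
        # A visual box is new only if it overlaps no known box too strongly.
        if all(iou(box, known) < threshold for known in merged):
            merged.append(box)
    return merged
```

Native controls are listed first, so when both backends find the same element the UIA entry is the one that survives.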

Input and Output

AppAgent Input

Input	Description	Source
User Request	Original user request in natural language	HostAgent
Sub-Task	Specific subtask to execute	HostAgent delegation
Application Context	Target app name, window info	HostAgent
Control Information	Detected UI controls with labels	Data collection phase
Screenshots	Clean, annotated, previous step images	Data collection phase
Blackboard	Shared memory for inter-agent communication	Global context
Retrieved Knowledge	Help docs, demos, search results	RAG system

AppAgent Output

Output	Description	Consumer
Observation	Current UI state description	LLM context
Thought	Reasoning about next action	Execution log
ControlLabel	Selected control to interact with	Action executor
Function	MCP command to execute (click_input, set_edit_text, etc.)	Command dispatcher
Args	Command parameters	Command dispatcher
Status	Agent state (CONTINUE, FINISH, etc.)	State machine
Blackboard Update	Execution results	HostAgent

Example Output:

{
    "Observation": "Word document with table, Export button at [12]",
    "Thought": "Click Export to extract table data",
    "ControlLabel": "12",
    "Function": "click_input",
    "Args": {"button": "left"},
    "Status": "CONTINUE"
}
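
A consumer of this output would typically parse and validate it before dispatching anything. The following sketch assumes the response arrives as a JSON string with the keys shown above; `ParsedResponse` and `parse_response` are hypothetical helpers, not part of the AppAgent API.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict

VALID_STATUSES = {
    "CONTINUE", "FINISH", "ERROR", "FAIL", "PENDING", "CONFIRM", "SCREENSHOT",
}


@dataclass
class ParsedResponse:
    observation: str
    thought: str
    control_label: str
    function: str
    args: Dict[str, Any]
    status: str


def parse_response(raw: str) -> ParsedResponse:
    """Parse the JSON output and reject statuses outside the seven FSM states."""
    data = json.loads(raw)
    status = data.get("Status", "")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return ParsedResponse(
        observation=data.get("Observation", ""),
        thought=data.get("Thought", ""),
        control_label=data.get("ControlLabel", ""),
        function=data.get("Function", ""),
        args=data.get("Args", {}),
        status=status,
    )
```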

Related Documentation

Detailed Documentation:

State Machine: Complete FSM with state definitions and transitions
Processing Strategy: 4-phase pipeline implementation details
Command System: Application-level MCP commands reference

Core Features:

Hybrid Actions: MCP command system for GUI–API execution
Control Detection: UIA and visual detection
Knowledge Substrate: RAG system overview

Tutorials:

Creating AppAgent: Step-by-step guide
Help Document Provision: Add help docs
Demonstration Provision: Add demos
Wrapping App-Native API: Integrate APIs

API Reference

Bases: BasicAgent

The AppAgent class that manages the interaction with the application.

Initialize the AppAgent.

Parameters:

name (str) –

The name of the agent.
process_name (str) –

The process name of the app.
app_root_name (str) –

The root name of the app.
is_visual (bool) –

The flag indicating whether the agent is visual or not.
main_prompt (str) –

The main prompt file path.
example_prompt (str) –

The example prompt file path.
skip_prompter (bool, default: False ) –

The flag indicating whether to skip the prompter initialization.
mode (str, default: 'normal' ) –

The mode of the agent.

Source code in agents/agent/app_agent.py

def __init__(
    self,
    name: str,
    process_name: str,
    app_root_name: str,
    is_visual: bool,
    main_prompt: str,
    example_prompt: str,
    skip_prompter: bool = False,
    mode: str = "normal",
) -> None:
    """
    Initialize the AppAgent.
    :param name: The name of the agent.
    :param process_name: The process name of the app.
    :param app_root_name: The root name of the app.
    :param is_visual: The flag indicating whether the agent is visual or not.
    :param main_prompt: The main prompt file path.
    :param example_prompt: The example prompt file path.
    :param skip_prompter: The flag indicating whether to skip the prompter initialization.
    :param mode: The mode of the agent.
    """
    super().__init__(name=name)
    if not skip_prompter:
        self.prompter = self.get_prompter(is_visual, main_prompt, example_prompt)
    self._process_name = process_name
    self._app_root_name = app_root_name
    self.offline_doc_retriever = None
    self.online_doc_retriever = None
    self.experience_retriever = None
    self.human_demonstration_retriever = None

    self._mode = mode

    self.set_state(self.default_state)

    self._context_provision_executed = False
    self.logger = logging.getLogger(__name__)

    self._processor: Optional[AppAgentProcessor] = None

`default_state` `property`

Get the default state.

`mode` `property`

Get the mode of the session.

`status_manager` `property`

Get the status manager.

`tools_info` `property` `writable`

Get the tools information.

Returns:	`List[MCPToolInfo]` – The list of MCPToolInfo objects.

`build_experience_retriever(db_path)`

Build the experience retriever.

Parameters:	`db_path` (`str`) – The path to the experience database.

Returns:	`None` – The retriever is stored in `self.experience_retriever`.

Source code in agents/agent/app_agent.py

def build_experience_retriever(self, db_path: str) -> None:
    """
    Build the experience retriever.
    :param db_path: The path to the experience database.
    :return: The experience retriever.
    """
    self.experience_retriever = self.retriever_factory.create_retriever(
        "experience", db_path
    )

`build_human_demonstration_retriever(db_path)`

Build the human demonstration retriever.

Parameters:	`db_path` (`str`) – The path to the human demonstration database.

Returns:	`None` – The retriever is stored in `self.human_demonstration_retriever`.

Source code in agents/agent/app_agent.py

def build_human_demonstration_retriever(self, db_path: str) -> None:
    """
    Build the human demonstration retriever.
    :param db_path: The path to the human demonstration database.
    :return: The human demonstration retriever.
    """
    self.human_demonstration_retriever = self.retriever_factory.create_retriever(
        "demonstration", db_path
    )

`build_offline_docs_retriever()`

Build the offline docs retriever.

Source code in agents/agent/app_agent.py

def build_offline_docs_retriever(self) -> None:
    """
    Build the offline docs retriever.
    """
    self.offline_doc_retriever = self.retriever_factory.create_retriever(
        "offline", self._app_root_name
    )

`build_online_search_retriever(request, top_k)`

Build the online search retriever.

Parameters:	`request` (`str`) – The request for online Bing search. `top_k` (`int`) – The number of documents to retrieve.

Source code in agents/agent/app_agent.py

def build_online_search_retriever(self, request: str, top_k: int) -> None:
    """
    Build the online search retriever.
    :param request: The request for online Bing search.
    :param top_k: The number of documents to retrieve.
    """
    self.online_doc_retriever = self.retriever_factory.create_retriever(
        "online", request, top_k
    )

`context_provision(request='', context=None)` `async`

Provision the context for the app agent.

Parameters:	`request` (`str`, default: `''` ) – The request sent to the Bing search retriever. `context` (`Context`, default: `None` ) – The context passed on to the MCP context loader.

Source code in agents/agent/app_agent.py

async def context_provision(
    self, request: str = "", context: Context = None
) -> None:
    """
    Provision the context for the app agent.
    :param request: The request sent to the Bing search retriever.
    """

    ufo_config = get_ufo_config()

    # Load the offline document indexer for the app agent if available.
    if ufo_config.rag.offline_docs:
        console.print(
            f"📚 Loading offline help document indexer for {self._process_name}...",
            style="magenta",
        )
        self.build_offline_docs_retriever()

    # Load the online search indexer for the app agent if available.

    if ufo_config.rag.online_search and request:
        console.print("🔍 Creating a Bing search indexer...", style="magenta")
        self.build_online_search_retriever(
            request, ufo_config.rag.online_search_topk
        )

    # Load the experience indexer for the app agent if available.
    if ufo_config.rag.experience:
        console.print("📖 Creating an experience indexer...", style="magenta")
        experience_path = ufo_config.rag.experience_saved_path
        db_path = os.path.join(experience_path, "experience_db")
        self.build_experience_retriever(db_path)

    # Load the demonstration indexer for the app agent if available.
    if ufo_config.rag.demonstration:
        console.print("🎬 Creating a demonstration indexer...", style="magenta")
        demonstration_path = ufo_config.rag.demonstration_saved_path
        db_path = os.path.join(demonstration_path, "demonstration_db")
        self.build_human_demonstration_retriever(db_path)

    await self._load_mcp_context(context)

`demonstration_prompt_helper(request)`

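Get the examples and tips for the AppAgent using the experience and demonstration retrievers.

Parameters:	`request` (`str`) – The request for the AppAgent.

Returns:	`Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]` – The experience results and the demonstration results, in that order. Each list is empty when the corresponding source is disabled in the configuration.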

Source code in agents/agent/app_agent.py

def demonstration_prompt_helper(
    self, request: str
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """
    Get the examples and tips for the AppAgent using the demonstration retriever.
    :param request: The request for the AppAgent.
    :return: The examples and tips for the AppAgent.
    """

    ufo_config = get_ufo_config()

    # Get the examples and tips for the AppAgent using the experience and demonstration retrievers.
    if ufo_config.rag.experience:
        experience_results = self.rag_experience_retrieve(
            request, ufo_config.rag.experience_retrieved_topk
        )
    else:
        experience_results = []

    if ufo_config.rag.demonstration:
        demonstration_results = self.rag_demonstration_retrieve(
            request, ufo_config.rag.demonstration_retrieved_topk
        )
    else:
        demonstration_results = []

    return experience_results, demonstration_results

`external_knowledge_prompt_helper(request, offline_top_k, online_top_k)`

Retrieve the external knowledge and construct the prompt.

Parameters:	`request` (`str`) – The request. `offline_top_k` (`int`) – The number of offline documents to retrieve. `online_top_k` (`int`) – The number of online documents to retrieve.

Returns:	`Tuple[str, str]` – The prompt message for the external_knowledge.

Source code in agents/agent/app_agent.py

def external_knowledge_prompt_helper(
    self, request: str, offline_top_k: int, online_top_k: int
) -> Tuple[str, str]:
    """
    Retrieve the external knowledge and construct the prompt.
    :param request: The request.
    :param offline_top_k: The number of offline documents to retrieve.
    :param online_top_k: The number of online documents to retrieve.
    :return: The prompt message for the external_knowledge.
    """

    # Retrieve offline documents and construct the prompt
    if self.offline_doc_retriever:

        offline_docs = self.offline_doc_retriever.retrieve(
            request,
            offline_top_k,
            filter=None,
        )

        format_string = "[Similar Requests]: {question}\nStep: {answer}\n"

        offline_docs_prompt = self.prompter.retrieved_documents_prompt_helper(
            "[Help Documents]",
            "",
            [
                format_string.format(
                    question=doc.metadata.get("title", ""),
                    answer=doc.metadata.get("text", ""),
                )
                for doc in offline_docs
            ],
        )
    else:
        offline_docs_prompt = ""

    # Retrieve online documents and construct the prompt
    if self.online_doc_retriever:
        online_search_docs = self.online_doc_retriever.retrieve(
            request, online_top_k, filter=None
        )
        online_docs_prompt = self.prompter.retrieved_documents_prompt_helper(
            "Online Search Results",
            "Search Result",
            [doc.page_content for doc in online_search_docs],
        )
    else:
        online_docs_prompt = ""

    return offline_docs_prompt, online_docs_prompt
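
To show what the help-document entries look like before they are handed to the prompter, the sketch below applies the same format string to a couple of fake documents. `FakeDoc` and `render_help_docs` are hypothetical; only the format string and the `title` and `text` metadata keys are taken from the listing above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document with metadata."""
    metadata: Dict[str, str] = field(default_factory=dict)


FORMAT_STRING = "[Similar Requests]: {question}\nStep: {answer}\n"


def render_help_docs(docs: List[FakeDoc]) -> List[str]:
    """Render each document; missing metadata falls back to an empty string."""
    return [
        FORMAT_STRING.format(
            question=doc.metadata.get("title", ""),
            answer=doc.metadata.get("text", ""),
        )
        for doc in docs
    ]
```

A document without `title` or `text` still yields an entry, just with empty fields, so callers may want to filter such documents out first.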

`get_prompter(is_visual, main_prompt, example_prompt)`

Get the prompter for the agent.

Parameters:	`is_visual` (`bool`) – The flag indicating whether the agent is visual or not. `main_prompt` (`str`) – The main prompt file path. `example_prompt` (`str`) – The example prompt file path.

Returns:	`AppAgentPrompter` – The prompter instance.

Source code in agents/agent/app_agent.py

def get_prompter(
    self,
    is_visual: bool,
    main_prompt: str,
    example_prompt: str,
) -> AppAgentPrompter:
    """
    Get the prompt for the agent.
    :param is_visual: The flag indicating whether the agent is visual or not.
    :param main_prompt: The main prompt file path.
    :param example_prompt: The example prompt file path.
    :return: The prompter instance.
    """
    return AppAgentPrompter(is_visual, main_prompt, example_prompt)

`message_constructor(dynamic_examples, dynamic_knowledge, image_list, control_info, prev_subtask, plan, request, subtask, current_application, host_message, blackboard_prompt, last_success_actions, include_last_screenshot)`

Construct the prompt message for the AppAgent.

Parameters:

dynamic_examples (str) –

The dynamic examples retrieved from the self-demonstration and human demonstration.
dynamic_knowledge (str) –

The dynamic knowledge retrieved from the external knowledge base.
image_list (List) –

The list of screenshot images.
control_info (str) –

The control information.
prev_subtask (List[Dict[str, str]]) –

The list of previous subtasks.
plan (List[str]) –

The plan list.
request (str) –

The overall user request.
subtask (str) –

The subtask for the current AppAgent to process.
current_application (str) –

The current application name.
host_message (List[str]) –

The message from the HostAgent.
blackboard_prompt (List[Dict[str, str]]) –

The prompt message from the blackboard.
last_success_actions (List[Dict[str, Any]]) –

The list of successful actions in the last step.
include_last_screenshot (bool) –

The flag indicating whether to include the last screenshot.

Returns:	`List[Dict[str, Union[str, List[Dict[str, str]]]]]` – The prompt message.

Source code in agents/agent/app_agent.py

def message_constructor(
    self,
    dynamic_examples: str,
    dynamic_knowledge: str,
    image_list: List,
    control_info: str,
    prev_subtask: List[Dict[str, str]],
    plan: List[str],
    request: str,
    subtask: str,
    current_application: str,
    host_message: List[str],
    blackboard_prompt: List[Dict[str, str]],
    last_success_actions: List[Dict[str, Any]],
    include_last_screenshot: bool,
) -> List[Dict[str, Union[str, List[Dict[str, str]]]]]:
    """
    Construct the prompt message for the AppAgent.
    :param dynamic_examples: The dynamic examples retrieved from the self-demonstration and human demonstration.
    :param dynamic_knowledge: The dynamic knowledge retrieved from the external knowledge base.
    :param image_list: The list of screenshot images.
    :param control_info: The control information.
    :param plan: The plan list.
    :param request: The overall user request.
    :param subtask: The subtask for the current AppAgent to process.
    :param current_application: The current application name.
    :param host_message: The message from the HostAgent.
    :param blackboard_prompt: The prompt message from the blackboard.
    :param last_success_actions: The list of successful actions in the last step.
    :param include_last_screenshot: The flag indicating whether to include the last screenshot.
    :return: The prompt message.
    """
    appagent_prompt_system_message = self.prompter.system_prompt_construction(
        dynamic_examples
    )

    appagent_prompt_user_message = self.prompter.user_content_construction(
        image_list=image_list,
        control_item=control_info,
        prev_subtask=prev_subtask,
        prev_plan=plan,
        user_request=request,
        subtask=subtask,
        current_application=current_application,
        host_message=host_message,
        retrieved_docs=dynamic_knowledge,
        last_success_actions=last_success_actions,
        include_last_screenshot=include_last_screenshot,
    )

    if blackboard_prompt:
        appagent_prompt_user_message = (
            blackboard_prompt + appagent_prompt_user_message
        )

    appagent_prompt_message = self.prompter.prompt_construction(
        appagent_prompt_system_message, appagent_prompt_user_message
    )

    return appagent_prompt_message

`print_response(response, print_action=True)`

Print the response using the presenter.

Parameters:	`response` (`AppAgentResponse`) – The response object to print. `print_action` (`bool`, default: `True` ) – The flag indicating whether to print the action.

Source code in agents/agent/app_agent.py

def print_response(
    self, response: AppAgentResponse, print_action: bool = True
) -> None:
    """
    Print the response using the presenter.
    :param response: The response object to print.
    :param print_action: The flag indicating whether to print the action.
    """
    self.presenter.present_app_agent_response(response, print_action=print_action)

`process(context)` `async`

Process the agent.

Parameters:	`context` (`Context`) – The context.

Source code in agents/agent/app_agent.py

async def process(self, context: Context) -> None:
    """
    Process the agent.
    :param context: The context.
    """
    if not self._context_provision_executed:
        await self.context_provision(context=context)
        self._context_provision_executed = True

    if not self._processor_cls:
        raise ValueError(f"{self.__class__.__name__} has no processor assigned.")

    self.processor: ProcessorTemplate = self._processor_cls(
        agent=self, global_context=context
    )
    await self.processor.process()

    self.status = self.processor.processing_context.get_local("status")
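
The guard at the top of `process` means context provisioning runs once, on the first round only. The pattern can be reproduced in isolation as follows; `LazyAgent` and `run_rounds` are hypothetical names used only to demonstrate the flag, and the counters exist purely so the behaviour can be checked.

```python
import asyncio


class LazyAgent:
    """Hypothetical agent that provisions its context on first use only."""

    def __init__(self) -> None:
        self._context_provision_executed = False
        self.provision_calls = 0
        self.rounds = 0

    async def context_provision(self) -> None:
        # Stand-in for building retrievers and loading MCP context.
        self.provision_calls += 1

    async def process(self) -> None:
        if not self._context_provision_executed:
            await self.context_provision()
            self._context_provision_executed = True
        self.rounds += 1


def run_rounds(count: int) -> LazyAgent:
    """Run `count` processing rounds and return the agent for inspection."""
    agent = LazyAgent()

    async def _run() -> None:
        for _ in range(count):
            await agent.process()

    asyncio.run(_run())
    return agent
```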

`process_confirmation()`

Process the user confirmation.

Returns:	`bool` – The decision.

Source code in agents/agent/app_agent.py

def process_confirmation(self) -> bool:
    """
    Process the user confirmation.
    :return: The decision.
    """
    action = self.processor.actions
    control_text = self.processor.control_text

    decision = interactor.sensitive_step_asker(action, control_text)

    if not decision:
        console.print("❌ The user has canceled the action.", style="red")

    return decision

`rag_demonstration_retrieve(request, demonstration_top_k)`

Retrieve demonstration examples for the user request.

Parameters:	`request` (`str`) – The user request. `demonstration_top_k` (`int`) – The number of documents to retrieve.

Returns:	`List[Dict[str, Any]]` – The retrieved examples and tips, one dictionary per document.

Source code in agents/agent/app_agent.py

def rag_demonstration_retrieve(
    self, request: str, demonstration_top_k: int
) -> List[Dict[str, Any]]:
    """
    Retrieving demonstration examples for the user request.
    :param request: The user request.
    :param demonstration_top_k: The number of documents to retrieve.
    :return: The retrieved examples and tips string.
    """

    retrieved_docs = []

    # Retrieve demonstration examples.
    demonstration_docs = self.human_demonstration_retriever.retrieve(
        request, demonstration_top_k
    )

    if demonstration_docs:
        for doc in demonstration_docs:
            example_request = doc.metadata.get("request", "")
            response = doc.metadata.get("example", {})
            subtask = doc.metadata.get("Sub-task", "")
            tips = doc.metadata.get("Tips", "")
            retrieved_docs.append(
                {
                    "Request": example_request,
                    "Response": response,
                    "Sub-task": subtask,
                    "Tips": tips,
                }
            )

        return retrieved_docs
    else:
        return []

`rag_experience_retrieve(request, experience_top_k)`

Retrieve experience examples for the user request. Only examples related to the current application are returned.

Parameters:	`request` (`str`) – The user request. `experience_top_k` (`int`) – The number of documents to retrieve.

Returns:	`List[Dict[str, Any]]` – The retrieved examples and tips, one dictionary per document.

Source code in agents/agent/app_agent.py

def rag_experience_retrieve(
    self, request: str, experience_top_k: int
) -> List[Dict[str, Any]]:
    """
    Retrieving experience examples for the user request.
    :param request: The user request.
    :param experience_top_k: The number of documents to retrieve.
    :return: The retrieved examples and tips dictionary.
    """

    retrieved_docs = []

    # Retrieve experience examples. Only retrieve the examples that are related to the current application.
    experience_docs = self.experience_retriever.retrieve(
        request,
        experience_top_k,
        filter=lambda x: self._app_root_name.lower()
        in [app.lower() for app in x["app_list"]],
    )

    if experience_docs:
        for doc in experience_docs:
            example_request = doc.metadata.get("request", "")
            response = doc.metadata.get("example", {})
            tips = doc.metadata.get("Tips", "")
            subtask = doc.metadata.get("Sub-task", "")
            retrieved_docs.append(
                {
                    "Request": example_request,
                    "Response": response,
                    "Sub-task": subtask,
                    "Tips": tips,
                }
            )

    return retrieved_docs
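
The `filter` argument above restricts results to experiences recorded for the current application, using a case-insensitive match against each record's `app_list`. The predicate can be tried on its own as below; `make_app_filter` is a hypothetical helper, and the process names are made-up sample values.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, List[str]]


def make_app_filter(app_root_name: str) -> Callable[[Metadata], bool]:
    """Build a predicate that keeps records whose app_list contains the app."""
    wanted = app_root_name.lower()
    return lambda meta: wanted in [app.lower() for app in meta["app_list"]]
```

For example, `make_app_filter("WINWORD.EXE")` accepts a record whose `app_list` is `["winword.exe", "excel.exe"]` and rejects one that lists only `"EXCEL.EXE"`.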

Summary

AppAgent Key Characteristics:

✅ Application-Specialized Worker: Dedicated to single Windows application
✅ ReAct Control Loop: Iterative observe → think → act execution
✅ Hybrid Execution: GUI automation + API calls via MCP commands
✅ 7-State FSM: Robust state management for execution control
✅ 4-Phase Pipeline: Structured data collection → reasoning → action → memory
✅ Knowledge-Enhanced: RAG from docs, demos, and search
✅ Orchestrated by HostAgent: Child agent in hierarchical architecture

Next Steps:

Deep Dive: Read State Machine and Processing Strategy for implementation details
Learn Features: Explore Core Features for advanced capabilities
Hands-On Tutorial: Follow Creating AppAgent guide
