Step Logs

The step log captures agent responses and execution details at every step. Each line in response.log is a JSON entry representing one agent action.

Location

logs/{task_name}/response.log

HostAgent Logs

LLM Response Fields

| Field | Description | Type |
| --- | --- | --- |
| observation | Desktop screenshot analysis and current state | String |
| thought | Reasoning process for task decomposition | String |
| current_subtask | Subtask to be executed by AppAgent | String |
| message | Instructions and context for AppAgent | List of Strings |
| control_label | Index of selected application | String |
| control_text | Name of selected application | String |
| plan | Future subtasks after current one | List of Strings |
| status | Agent state: FINISH, CONTINUE, PENDING, or ASSIGN | String |
| comment | User-facing summary or progress update | String |
| questions | Questions requiring user clarification | List of Strings |
| function | System command to execute (optional) | String |

Additional Metadata

| Field | Description | Type |
| --- | --- | --- |
| step | Global step number in session | Integer |
| round_step | Step number within current round | Integer |
| agent_step | Step number for this agent instance | Integer |
| round_num | Current round number | Integer |
| request | Original user request | String |
| agent_type | Set to HostAgent | String |
| agent_name | Agent instance name | String |
| application | Application process name | String |
| cost | LLM cost for this step | Float |
| result | Execution results | String |
| screenshot_clean | Clean desktop screenshot path | String |
| screenshot_annotated | Annotated screenshot path | String |
| screenshot_concat | Concatenated screenshot path | String |
| screenshot_selected_control | Selected control screenshot path | String |
| time_cost | Time spent on each processing phase | Dictionary |
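As a rough illustration of how the HostAgent fields fit together, the sketch below extracts the routing decision (which subtask went to which application, and with what status) from a single log line. The sample line and the helper name `summarize_host_step` are hypothetical, made up for this example; real entries carry more fields and real values.

```python
import json

# Hypothetical sample line; real entries contain more fields and real values.
sample_line = json.dumps({
    "step": 0,
    "agent_type": "HostAgent",
    "current_subtask": "Open the report in Word",
    "control_text": "Report.docx - Word",
    "plan": ["Export the report as PDF"],
    "status": "ASSIGN",
})


def summarize_host_step(line: str) -> dict:
    """Pick out the routing decision recorded in one HostAgent log line."""
    entry = json.loads(line)
    if entry.get("agent_type") != "HostAgent":
        raise ValueError("not a HostAgent entry")
    return {
        "step": entry.get("step"),
        "subtask": entry.get("current_subtask"),
        "application": entry.get("control_text"),
        "status": entry.get("status"),
        # 'plan' lists the subtasks still to come after the current one.
        "remaining_subtasks": len(entry.get("plan") or []),
    }


summary = summarize_host_step(sample_line)
print(summary)
```

Using `.get()` throughout keeps the helper tolerant of entries where an optional field is absent.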

AppAgent Logs

LLM Response Fields

| Field | Description | Type |
| --- | --- | --- |
| observation | Application UI analysis and status | String |
| thought | Reasoning for next action | String |
| control_label | Index of selected control element | String |
| control_text | Name of selected control element | String |
| action | Action details including function and arguments | Dictionary or List |
| status | Agent state (CONTINUE, FINISH, etc.) | String |
| plan | Planned steps after current action | List of Strings |
| comment | Progress summary or completion notes | String |
| save_screenshot | Screenshot save configuration | Dictionary |

Additional Metadata

| Field | Description | Type |
| --- | --- | --- |
| step | Global step number in session | Integer |
| round_step | Step number within current round | Integer |
| agent_step | Step number for this agent instance | Integer |
| round_num | Current round number | Integer |
| subtask | Subtask assigned by HostAgent | String |
| subtask_index | Index of subtask in current round | Integer |
| action_type | Type of action performed | String |
| request | Original user request | String |
| agent_type | Set to AppAgent | String |
| agent_name | Agent instance name | String |
| application | Application process name | String |
| cost | LLM cost for this step | Float |
| result | Execution results | String |
| screenshot_clean | Clean application screenshot path | String |
| screenshot_annotated | Annotated screenshot path | String |
| screenshot_concat | Concatenated screenshot path | String |
| time_cost | Time spent on each processing phase | Dictionary |
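Because each AppAgent entry records both `subtask_index` and `cost`, per-subtask spend can be totaled from already-parsed entries. The following is a minimal sketch under the assumption that `cost` is numeric when present; the function name `cost_by_subtask` and the sample entries are hypothetical and only illustrate the grouping.

```python
from collections import defaultdict


def cost_by_subtask(entries):
    """Sum LLM cost per (round_num, subtask_index) over AppAgent entries."""
    totals = defaultdict(float)
    for entry in entries:
        if entry.get("agent_type") != "AppAgent":
            continue  # HostAgent entries have no subtask_index
        key = (entry.get("round_num"), entry.get("subtask_index"))
        # Treat a missing or null cost as zero (an assumption of this sketch).
        totals[key] += float(entry.get("cost") or 0.0)
    return dict(totals)


# Hypothetical parsed entries, reduced to the fields used here.
sample_entries = [
    {"agent_type": "AppAgent", "round_num": 0, "subtask_index": 0, "cost": 0.25},
    {"agent_type": "AppAgent", "round_num": 0, "subtask_index": 0, "cost": 0.5},
    {"agent_type": "HostAgent", "round_num": 0, "cost": 1.0},
    {"agent_type": "AppAgent", "round_num": 0, "subtask_index": 1, "cost": 0.125},
]

totals = cost_by_subtask(sample_entries)
print(totals)
```

Keying on the pair `(round_num, subtask_index)` avoids merging subtasks from different rounds, since the index is defined within the current round.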

Reading Step Logs

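import json

task_name = "my_task"  # placeholder: replace with your task's folder name under logs/

with open(f"logs/{task_name}/response.log", "r", encoding="utf-8") as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        log = json.loads(line)  # each line is one JSON entry
        print(f"Step {log['step']} - Agent: {log['agent_type']}")
        print(f"Thought: {log.get('thought')}")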
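For longer sessions it can be convenient to filter entries by agent while reading. The generator below is one possible sketch: it accepts any iterable of lines (an open file works), skips blank or unparseable lines, and optionally keeps only one `agent_type`. The name `iter_steps` and the sample lines are hypothetical, and silently skipping malformed lines is a choice of this sketch rather than documented behavior.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_steps(lines: Iterable[str],
               agent_type: Optional[str] = None) -> Iterator[dict]:
    """Yield parsed log entries, optionally restricted to one agent type."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON (e.g. truncated)
        if not isinstance(entry, dict):
            continue  # a log entry is expected to be a JSON object
        if agent_type is None or entry.get("agent_type") == agent_type:
            yield entry


# Hypothetical lines standing in for the contents of response.log.
sample_lines = [
    '{"step": 0, "agent_type": "HostAgent", "thought": "Pick an app"}',
    "",
    '{"step": 1, "agent_type": "AppAgent", "thought": "Click Save"}',
    '{"step": 2, "agent_type": ',
]

app_steps = list(iter_steps(sample_lines, agent_type="AppAgent"))
for entry in app_steps:
    print(f"Step {entry['step']}: {entry.get('thought')}")
```

To run it against a real log, pass the open file object in place of `sample_lines`.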