Step Logs
The step log captures agent responses and execution details at every step. Each line in response.log is a JSON entry representing one agent action.
Location
logs/{task_name}/response.log
HostAgent Logs
LLM Response Fields
| Field | Description | Type |
|---|---|---|
| observation | Desktop screenshot analysis and current state | String |
| thought | Reasoning process for task decomposition | String |
| current_subtask | Subtask to be executed by AppAgent | String |
| message | Instructions and context for AppAgent | List of Strings |
| control_label | Index of selected application | String |
| control_text | Name of selected application | String |
| plan | Future subtasks after current one | List of Strings |
| status | Agent state: FINISH, CONTINUE, PENDING, or ASSIGN | String |
| comment | User-facing summary or progress update | String |
| questions | Questions requiring user clarification | List of Strings |
| function | System command to execute (optional) | String |
Additional Metadata
| Field | Description | Type |
|---|---|---|
| step | Global step number in session | Integer |
| round_step | Step number within current round | Integer |
| agent_step | Step number for this agent instance | Integer |
| round_num | Current round number | Integer |
| request | Original user request | String |
| agent_type | Set to HostAgent | String |
| agent_name | Agent instance name | String |
| application | Application process name | String |
| cost | LLM cost for this step | Float |
| result | Execution results | String |
| screenshot_clean | Clean desktop screenshot path | String |
| screenshot_annotated | Annotated screenshot path | String |
| screenshot_concat | Concatenated screenshot path | String |
| screenshot_selected_control | Selected control screenshot path | String |
| time_cost | Time spent on each processing phase | Dictionary |
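As an illustration of how the HostAgent fields fit together, the sketch below pulls a compact summary out of entries that have already been parsed into dictionaries. The helper name `host_agent_summary` and the sample entries are invented for this example; only the field names come from the tables above.

```python
from typing import Any, Iterable


def host_agent_summary(entries: Iterable[dict[str, Any]]) -> list[tuple[Any, Any, Any, Any]]:
    """Return (step, status, current_subtask, control_text) for each HostAgent entry."""
    rows = []
    for entry in entries:
        # agent_type is documented as "HostAgent" for HostAgent entries.
        if entry.get("agent_type") != "HostAgent":
            continue
        rows.append((
            entry.get("step"),
            entry.get("status"),
            entry.get("current_subtask"),
            entry.get("control_text"),
        ))
    return rows


# Invented sample entries, trimmed to the fields used here.
sample_entries = [
    {"step": 0, "agent_type": "HostAgent", "status": "ASSIGN",
     "current_subtask": "Open the document", "control_text": "Example App"},
    {"step": 1, "agent_type": "AppAgent", "status": "CONTINUE"},
]

for row in host_agent_summary(sample_entries):
    print(row)
```

Using `dict.get` rather than indexing keeps the helper from raising if an entry lacks one of the fields.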
AppAgent Logs
LLM Response Fields
| Field | Description | Type |
|---|---|---|
| observation | Application UI analysis and status | String |
| thought | Reasoning for next action | String |
| control_label | Index of selected control element | String |
| control_text | Name of selected control element | String |
| action | Action details including function and arguments | Dictionary or List |
| status | Agent state (CONTINUE, FINISH, etc.) | String |
| plan | Planned steps after current action | List of Strings |
| comment | Progress summary or completion notes | String |
| save_screenshot | Screenshot save configuration | Dictionary |
Additional Metadata
| Field | Description | Type |
|---|---|---|
| step | Global step number in session | Integer |
| round_step | Step number within current round | Integer |
| agent_step | Step number for this agent instance | Integer |
| round_num | Current round number | Integer |
| subtask | Subtask assigned by HostAgent | String |
| subtask_index | Index of subtask in current round | Integer |
| action_type | Type of action performed | String |
| request | Original user request | String |
| agent_type | Set to AppAgent | String |
| agent_name | Agent instance name | String |
| application | Application process name | String |
| cost | LLM cost for this step | Float |
| result | Execution results | String |
| screenshot_clean | Clean application screenshot path | String |
| screenshot_annotated | Annotated screenshot path | String |
| screenshot_concat | Concatenated screenshot path | String |
| time_cost | Time spent on each processing phase | Dictionary |
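Because `subtask_index` is scoped to the current round, AppAgent entries can be grouped by the pair (`round_num`, `subtask_index`). The following is a minimal sketch that totals `cost` per subtask under that assumption; the function name `cost_by_subtask` and the sample values are hypothetical, and a missing `cost` is treated as zero.

```python
from collections import defaultdict
from typing import Any, Iterable


def cost_by_subtask(entries: Iterable[dict[str, Any]]) -> dict[tuple[Any, Any], float]:
    """Sum the cost field of AppAgent entries per (round_num, subtask_index)."""
    totals: dict[tuple[Any, Any], float] = defaultdict(float)
    for entry in entries:
        if entry.get("agent_type") != "AppAgent":
            continue
        key = (entry.get("round_num"), entry.get("subtask_index"))
        # Assumption: a missing or null cost counts as 0.0.
        totals[key] += float(entry.get("cost") or 0.0)
    return dict(totals)


# Invented sample entries, trimmed to the fields used here.
sample_entries = [
    {"agent_type": "AppAgent", "round_num": 0, "subtask_index": 0, "cost": 0.5},
    {"agent_type": "AppAgent", "round_num": 0, "subtask_index": 0, "cost": 0.25},
    {"agent_type": "AppAgent", "round_num": 0, "subtask_index": 1, "cost": 0.125},
    {"agent_type": "HostAgent", "round_num": 0, "cost": 1.0},
]

print(cost_by_subtask(sample_entries))
```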
Reading Step Logs
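import json

# Replace {task_name} with the name of your task's log directory.
with open('logs/{task_name}/response.log', 'r') as f:
    for line in f:
        # Each line is one JSON entry describing a single agent action.
        log = json.loads(line)
        print(f"Step {log['step']} - Agent: {log['agent_type']}")
        print(f"Thought: {log['thought']}")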
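A slightly more defensive variant may be useful when a log contains blank or truncated lines, for example after an interrupted run. The sketch below is one possible approach, not part of the logging tool itself: `parse_step_log` and `steps_per_agent` are hypothetical helper names, and the sample lines are invented.

```python
import json
from collections import Counter
from typing import Any, Iterable, Iterator


def parse_step_log(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one dict per valid JSON line, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: a malformed line is skipped rather than treated as fatal.
            continue
        if isinstance(entry, dict):
            yield entry


def steps_per_agent(entries: Iterable[dict[str, Any]]) -> Counter:
    """Count entries per agent_type; entries without the field count as 'unknown'."""
    return Counter(entry.get("agent_type", "unknown") for entry in entries)


# Invented sample lines; the last one is deliberately truncated.
sample_lines = [
    '{"step": 0, "agent_type": "HostAgent", "thought": "pick an app"}',
    '',
    '{"step": 1, "agent_type": "AppAgent", "thought": "click a button"}',
    '{"step": 2, "agent_type": "AppAgent"',
]

entries = list(parse_step_log(sample_lines))
print(steps_per_agent(entries))
```

With a real log, an open file handle can be passed in place of `sample_lines`, since iterating over a text file yields one line at a time.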