Step Logs
The step log contains the agent's response to the user's request and additional information at every step. The step log is stored in the response.log
file. The log fields are different for HostAgent
and AppAgent
. The step log is at the info
level.
HostAgent Logs
The HostAgent
logs contain the following fields:
LLM Output
Field | Description | Type |
---|---|---|
Observation | The observation of current desktop screenshots. | String |
Thought | The logical reasoning process of the HostAgent . |
String |
Current Sub-Task | The current sub-task to be executed by the AppAgent . |
String |
Message | The message to be sent to the AppAgent for the completion of the sub-task. |
String |
ControlLabel | The index of the selected application to execute the sub-task. | String |
ControlText | The name of the selected application to execute the sub-task. | String |
Plan | The plan for the following sub-tasks after the current sub-task. | List of Strings |
Status | The status of the agent, mapped to the AgentState . |
String |
Comment | Additional comments or information provided to the user. | String |
Questions | The questions to be asked to the user for additional information. | List of Strings |
Bash | The bash command to be executed by the HostAgent . It can be used to open applications or execute system commands. |
String |
Additional Information
Field | Description | Type |
---|---|---|
Step | The step number of the session. | Integer |
RoundStep | The step number of the current round. | Integer |
AgentStep | The step number of the HostAgent . |
Integer |
Round | The round number of the session. | Integer |
ControlLabel | The index of the selected application to execute the sub-task. | Integer |
ControlText | The name of the selected application to execute the sub-task. | String |
Request | The user request. | String |
Agent | The agent that executed the step, set to HostAgent . |
String |
AgentName | The name of the agent. | String |
Application | The application process name. | String |
Cost | The cost of the step. | Float |
Results | The results of the step, set to an empty string. | String |
CleanScreenshot | The image path of the desktop screenshot. | String |
AnnotatedScreenshot | The image path of the annotated application screenshot. | String |
ConcatScreenshot | The image path of the concatenated application screenshot. | String |
SelectedControlScreenshot | The image path of the selected control screenshot. | String |
time_cost | The time cost of each step in the process. | Dictionary |
AppAgent Logs
The AppAgent
logs contain the following fields:
LLM Output
Field | Description | Type |
---|---|---|
Observation | The observation of the current application screenshots. | String |
Thought | The logical reasoning process of the AppAgent . |
String |
ControlLabel | The index of the selected control to interact with. | String |
ControlText | The name of the selected control to interact with. | String |
Function | The function to be executed on the selected control. | String |
Args | The arguments required for the function execution. | List of Strings |
Status | The status of the agent, mapped to the AgentState . |
String |
Plan | The plan for the following steps after the current action. | List of Strings |
Comment | Additional comments or information provided to the user. | String |
SaveScreenshot | The flag to save the screenshot of the application to the blackboard for future reference. |
Boolean |
Additional Information
Field | Description | Type |
---|---|---|
Step | The step number of the session. | Integer |
RoundStep | The step number of the current round. | Integer |
AgentStep | The step number of the AppAgent . |
Integer |
Round | The round number of the session. | Integer |
Subtask | The sub-task to be executed by the AppAgent . |
String |
SubtaskIndex | The index of the sub-task in the current round. | Integer |
Action | The action to be executed by the AppAgent . |
String |
ActionType | The type of the action to be executed. | String |
Request | The user request. | String |
Agent | The agent that executed the step, set to AppAgent . |
String |
AgentName | The name of the agent. | String |
Application | The application process name. | String |
Cost | The cost of the step. | Float |
Results | The results of the step. | String |
CleanScreenshot | The image path of the desktop screenshot. | String |
AnnotatedScreenshot | The image path of the annotated application screenshot. | String |
ConcatScreenshot | The image path of the concatenated application screenshot. | String |
time_cost | The time cost of each step in the process. | Dictionary |
Tip
You can use the following python code to read the request log:
import json
with open('logs/{task_name}/request.log', 'r') as f:
for line in f:
log = json.loads(line)
Info
The FollowerAgent
logs share the same fields as the AppAgent
logs.