Follower Agent πŸšΆπŸ½β€β™‚οΈ

The FollowerAgent is inherited from the AppAgent and is responsible for following the user's instructions to perform specific tasks within the application. The FollowerAgent is designed to execute a series of actions based on the user's guidance. It is particularly useful for software testing, when clear instructions are provided to validate the application's behavior.

Different from the AppAgent

The FollowerAgent shares most of the functionalities with the AppAgent, but it is designed to follow the step-by-step instructions provided by the user, instead of does its own reasoning to determine the next action.

Usage

The FollowerAgent is available in follower mode. You can find more details in the documentation. It also uses differnt Session and Processor to handle the user's instructions. The step-wise instructions are provided by the user in the in a json file, which is then parsed by the FollowerAgent to execute the actions. An example of the json file is shown below:

{
    "task": "Type in a bold text of 'Test For Fun'",
    "steps": 
    [
        "1.type in 'Test For Fun'",
        "2.select the text of 'Test For Fun'",
        "3.click on the bold"
    ],
    "object": "draft.docx"
}

Reference

Bases: AppAgent

The FollowerAgent class the manager of a FollowedAgent that follows the step-by-step instructions for action execution within an application. It is a subclass of the AppAgent, which completes the action execution within the application.

Initialize the FollowAgent.

Parameters:
  • name (str) –

    The name of the agent.

  • process_name (str) –

    The process name of the app.

  • app_root_name (str) –

    The root name of the app.

  • is_visual (bool) –

    The flag indicating whether the agent is visual or not.

  • main_prompt (str) –

    The main prompt file path.

  • example_prompt (str) –

    The example prompt file path.

  • api_prompt (str) –

    The API prompt file path.

  • app_info_prompt (str) –

    The app information prompt file path.

Source code in agents/agent/follower_agent.py
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
def __init__(
    self,
    name: str,
    process_name: str,
    app_root_name: str,
    is_visual: bool,
    main_prompt: str,
    example_prompt: str,
    api_prompt: str,
    app_info_prompt: str,
):
    """
    Initialize the FollowAgent.
    :param name: The name of the agent.
    :param process_name: The process name of the app.
    :param app_root_name: The root name of the app.
    :param is_visual: The flag indicating whether the agent is visual or not.
    :param main_prompt: The main prompt file path.
    :param example_prompt: The example prompt file path.
    :param api_prompt: The API prompt file path.
    :param app_info_prompt: The app information prompt file path.
    """
    super().__init__(
        name=name,
        process_name=process_name,
        app_root_name=app_root_name,
        is_visual=is_visual,
        main_prompt=main_prompt,
        example_prompt=example_prompt,
        api_prompt=api_prompt,
        skip_prompter=True,
    )

    self.prompter = self.get_prompter(
        is_visual,
        main_prompt,
        example_prompt,
        api_prompt,
        app_info_prompt,
        app_root_name,
    )

get_prompter(is_visual, main_prompt, example_prompt, api_prompt, app_info_prompt, app_root_name='')

Get the prompter for the follower agent.

Parameters:
  • is_visual (str) –

    The flag indicating whether the agent is visual or not.

  • main_prompt (str) –

    The main prompt file path.

  • example_prompt (str) –

    The example prompt file path.

  • api_prompt (str) –

    The API prompt file path.

  • app_info_prompt (str) –

    The app information prompt file path.

  • app_root_name (str, default: '' ) –

    The root name of the app.

Returns:
  • FollowerAgentPrompter –

    The prompter instance.

Source code in agents/agent/follower_agent.py
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
def get_prompter(
    self,
    is_visual: str,
    main_prompt: str,
    example_prompt: str,
    api_prompt: str,
    app_info_prompt: str,
    app_root_name: str = "",
) -> FollowerAgentPrompter:
    """
    Get the prompter for the follower agent.
    :param is_visual: The flag indicating whether the agent is visual or not.
    :param main_prompt: The main prompt file path.
    :param example_prompt: The example prompt file path.
    :param api_prompt: The API prompt file path.
    :param app_info_prompt: The app information prompt file path.
    :param app_root_name: The root name of the app.
    :return: The prompter instance.
    """
    return FollowerAgentPrompter(
        is_visual,
        main_prompt,
        example_prompt,
        api_prompt,
        app_info_prompt,
        app_root_name,
    )

message_constructor(dynamic_examples, dynamic_tips, dynamic_knowledge, image_list, control_info, prev_subtask, plan, request, subtask, host_message, current_state, state_diff, include_last_screenshot)

Construct the prompt message for the FollowAgent.

Parameters:
  • dynamic_examples (str) –

    The dynamic examples retrieved from the self-demonstration and human demonstration.

  • dynamic_tips (str) –

    The dynamic tips retrieved from the self-demonstration and human demonstration.

  • dynamic_knowledge (str) –

    The dynamic knowledge retrieved from the self-demonstration and human demonstration.

  • image_list (List[str]) –

    The list of screenshot images.

  • control_info (str) –

    The control information.

  • prev_subtask (List[str]) –

    The previous subtask.

  • plan (List[str]) –

    The plan.

  • request (str) –

    The request.

  • subtask (str) –

    The subtask.

  • host_message (List[str]) –

    The host message.

  • current_state (Dict[str, str]) –

    The current state of the app.

  • state_diff (Dict[str, str]) –

    The state difference between the current state and the previous state.

  • include_last_screenshot (bool) –

    The flag indicating whether the last screenshot should be included.

Returns:
  • List[Dict[str, str]] –

    The prompt message.

Source code in agents/agent/follower_agent.py
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
def message_constructor(
    self,
    dynamic_examples: str,
    dynamic_tips: str,
    dynamic_knowledge: str,
    image_list: List[str],
    control_info: str,
    prev_subtask: List[str],
    plan: List[str],
    request: str,
    subtask: str,
    host_message: List[str],
    current_state: Dict[str, str],
    state_diff: Dict[str, str],
    include_last_screenshot: bool,
) -> List[Dict[str, str]]:
    """
    Construct the prompt message for the FollowAgent.
    :param dynamic_examples: The dynamic examples retrieved from the self-demonstration and human demonstration.
    :param dynamic_tips: The dynamic tips retrieved from the self-demonstration and human demonstration.
    :param dynamic_knowledge: The dynamic knowledge retrieved from the self-demonstration and human demonstration.
    :param image_list: The list of screenshot images.
    :param control_info: The control information.
    :param prev_subtask: The previous subtask.
    :param plan: The plan.
    :param request: The request.
    :param subtask: The subtask.
    :param host_message: The host message.
    :param current_state: The current state of the app.
    :param state_diff: The state difference between the current state and the previous state.
    :param include_last_screenshot: The flag indicating whether the last screenshot should be included.
    :return: The prompt message.
    """
    followagent_prompt_system_message = self.prompter.system_prompt_construction(
        dynamic_examples, dynamic_tips
    )
    followagent_prompt_user_message = self.prompter.user_content_construction(
        image_list=image_list,
        control_item=control_info,
        prev_subtask=prev_subtask,
        prev_plan=plan,
        user_request=request,
        subtask=subtask,
        current_application=self._process_name,
        host_message=host_message,
        retrieved_docs=dynamic_knowledge,
        current_state=current_state,
        state_diff=state_diff,
        include_last_screenshot=include_last_screenshot,
    )

    followagent_prompt_message = self.prompter.prompt_construction(
        followagent_prompt_system_message, followagent_prompt_user_message
    )

    return followagent_prompt_message