Follower Mode

Follower mode enables UFO to execute a predefined list of steps written in natural language. Unlike normal mode, where the agent generates its own plan, follower mode creates an AppAgent that follows user-provided steps to interact with applications. This mode is particularly useful for debugging, software testing, and verification.

Quick Start

Step 1: Create a Plan File

Create a JSON plan file containing the steps for the agent to follow:

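| Field  | Description                                | Type            |
|--------|--------------------------------------------|-----------------|
| task   | The task description.                      | String          |
| steps  | The list of steps for the agent to follow. | List of Strings |
| object | The application or file to interact with.  | String          |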

Example plan file:

{
    "task": "Type in a text of 'Test For Fun' with heading 1 level",
    "steps": 
    [
        "1.type in 'Test For Fun'", 
        "2.Select the 'Test For Fun' text",
        "3.Click 'Home' tab to show the 'Styles' ribbon tab",
        "4.Click 'Styles' ribbon tab to show the style 'Heading 1'",
        "5.Click 'Heading 1' style to apply the style to the selected text"
    ],
    "object": "draft.docx"
}

The object field specifies the application or file the agent will interact with. This object should be opened and accessible before starting follower mode.
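A plan file like the one above can also be generated and sanity-checked from a script. The following is a minimal sketch using only the standard library; the helper names `validate_plan` and `write_plan` are hypothetical and not part of UFO, and the checks simply mirror the field table above.

```python
import json
import os
import tempfile

# Field names and types taken from the plan file table above.
REQUIRED_FIELDS = {"task": str, "steps": list, "object": str}


def validate_plan(plan: dict) -> list:
    """Return a list of problems found in a plan dict (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in plan:
            problems.append(f"missing field: {field}")
        elif not isinstance(plan[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    # Every step is expected to be a natural-language string.
    if isinstance(plan.get("steps"), list):
        for i, step in enumerate(plan["steps"]):
            if not isinstance(step, str):
                problems.append(f"steps[{i}] should be str")
    return problems


def write_plan(path: str, plan: dict) -> None:
    """Validate a plan and write it to `path` as JSON (hypothetical helper)."""
    problems = validate_plan(plan)
    if problems:
        raise ValueError("; ".join(problems))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(plan, f, indent=4)


example_plan = {
    "task": "Type in a text of 'Test For Fun' with heading 1 level",
    "steps": [
        "1.type in 'Test For Fun'",
        "2.Select the 'Test For Fun' text",
    ],
    "object": "draft.docx",
}

# Write the plan to a temporary folder so the sketch leaves no files behind.
plan_path = os.path.join(tempfile.mkdtemp(), "plan.json")
write_plan(plan_path, example_plan)
```

Validating before writing catches a misspelled field name early, before the agent is started.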

Step 2: Start Follower Mode

Run the following command:

# Assume you are in the cloned UFO folder
python -m ufo --task {task_name} --mode follower --plan {plan_file}

Parameters:
  • {task_name}: Name for this task execution (used for logging)
  • {plan_file}: Path to the plan JSON file
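When follower mode is launched from another script, the command shown above can be assembled as an argument list. This is a sketch only: `build_follower_command` is a hypothetical helper, the task name and plan path are placeholder values, and the flags are exactly those from the command above.

```python
import sys


def build_follower_command(task_name: str, plan_path: str) -> list:
    """Build the argument list for the follower-mode command shown above."""
    return [
        sys.executable, "-m", "ufo",
        "--task", task_name,
        "--mode", "follower",
        "--plan", plan_path,
    ]


# Placeholder task name and plan path, for illustration only.
command = build_follower_command("heading_test", "plans/heading.json")
print(" ".join(command))

# To actually launch it from the cloned UFO folder (not executed here):
# import subprocess
# subprocess.run(command, check=True)
```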

Step 3: Run in Batch (Optional)

To execute multiple plan files sequentially, provide a folder containing multiple plan files:

# Assume you are in the cloned UFO folder
python -m ufo --task {task_name} --mode follower --plan {plan_folder}

UFO will automatically detect and execute all plan files in the folder sequentially.

Parameters:
  • {task_name}: Name for this batch execution (used for logging)
  • {plan_folder}: Path to the folder containing plan JSON files
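To preview which files a batch run might pick up, the folder can be listed ahead of time. This is only a sketch: `list_plan_files` is a hypothetical helper, and it assumes that plan files carry a `.json` extension and are processed in sorted order; UFO's own discovery logic may differ.

```python
import json
import os
import tempfile


def list_plan_files(plan_folder: str) -> list:
    """Return the sorted paths of all .json files directly inside plan_folder."""
    return sorted(
        os.path.join(plan_folder, name)
        for name in os.listdir(plan_folder)
        if name.lower().endswith(".json")
        and os.path.isfile(os.path.join(plan_folder, name))
    )


# Build a throwaway folder with two plan files and one unrelated file.
demo_folder = tempfile.mkdtemp()
for name in ("b_plan.json", "a_plan.json"):
    with open(os.path.join(demo_folder, name), "w", encoding="utf-8") as f:
        json.dump({"task": name, "steps": [], "object": "draft.docx"}, f)
with open(os.path.join(demo_folder, "notes.txt"), "w", encoding="utf-8") as f:
    f.write("not a plan")

plan_files = list_plan_files(demo_folder)
for path in plan_files:
    print(os.path.basename(path))
```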

Evaluation

UFO can automatically evaluate task completion. To enable evaluation, ensure EVA_SESSION is set to True in config/ufo/system.yaml.

Check the evaluation results in logs/{task_name}/evaluation.log.

References

Follower mode uses a PlanReader to parse the plan file and creates a FollowerSession to execute the steps.

PlanReader

The PlanReader is located at ufo/module/sessions/plan_reader.py.

The reader for a plan file.

Initialize a plan reader.

Parameters:
  • plan_file (str) –

    The path of the plan file.

Source code in module/sessions/plan_reader.py
def __init__(self, plan_file: str):
    """
    Initialize a plan reader.
    :param plan_file: The path of the plan file.
    """

    self.plan_file = plan_file
    with open(plan_file, "r") as f:
        self.plan = json.load(f)
    self.remaining_steps = self.get_steps()
    self.support_apps = ["WINWORD.EXE", "EXCEL.EXE", "POWERPNT.EXE"]

get_close()

Check if the plan is closed.

Returns:
  • bool

    True if the plan needs to be closed, False otherwise.

Source code in module/sessions/plan_reader.py
def get_close(self) -> bool:
    """
    Check if the plan is closed.
    :return: True if the plan needs to be closed, False otherwise.
    """

    return self.plan.get("close", False)

get_host_agent_request()

Get the request for the host agent.

Returns:
  • str

    The request for the host agent.

Source code in module/sessions/plan_reader.py
def get_host_agent_request(self) -> str:
    """
    Get the request for the host agent.
    :return: The request for the host agent.
    """

    object_name = self.get_operation_object()

    request = (
        f"Open and select the application of {object_name}, and output the FINISH status immediately, without assigning any subtask. "
        "You must output the selected application with their control text and label even if it is already open."
    )

    return request

get_host_request()

Get the request for the host agent.

Returns:
  • str

    The request for the host agent.

Source code in module/sessions/plan_reader.py
def get_host_request(self) -> str:
    """
    Get the request for the host agent.
    :return: The request for the host agent.
    """

    task = self.get_task()
    object_name = self.get_operation_object()
    if object_name in self.support_apps:
        request = task
    else:
        request = (
            f"Your task is '{task}'. And open the application of {object_name}. "
            "You must output the selected application with their control text and label even if it is already open."
        )
    return request

get_initial_request()

Get the initial request in the plan.

Returns:
  • str

    The initial request.

Source code in module/sessions/plan_reader.py
def get_initial_request(self) -> str:
    """
    Get the initial request in the plan.
    :return: The initial request.
    """

    task = self.get_task()
    object_name = self.get_operation_object()

    request = f"{task} in {object_name}"

    return request

get_operation_object()

Get the operation object in the step.

Returns:
  • str

    The operation object.

Source code in module/sessions/plan_reader.py
def get_operation_object(self) -> str:
    """
    Get the operation object in the step.
    :return: The operation object.
    """

    return self.plan.get("object", None).lower()
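Two details of the excerpt above are easy to miss: the returned name is always lowercased, and a plan without an `object` field makes `.lower()` fail because the default is `None`. The sketch below reproduces both behaviours with plain dictionaries; `get_operation_object_safe` is a hypothetical defensive variant, not part of UFO.

```python
def get_operation_object(plan: dict) -> str:
    """Same expression as the PlanReader method, applied to a plain dict."""
    return plan.get("object", None).lower()


def get_operation_object_safe(plan: dict) -> str:
    """Hypothetical variant that returns an empty string when "object" is missing."""
    return (plan.get("object") or "").lower()


with_object = {"task": "demo", "steps": [], "object": "Draft.DOCX"}
without_object = {"task": "demo", "steps": []}

lowered = get_operation_object(with_object)

try:
    get_operation_object(without_object)
    missing_raises = False
except AttributeError:
    # None has no .lower(), so a missing "object" field raises here.
    missing_raises = True

fallback = get_operation_object_safe(without_object)

# Membership tests on lists of strings are case-sensitive, so a lowercased
# name does not match the uppercase entries used in support_apps.
support_apps = ["WINWORD.EXE", "EXCEL.EXE", "POWERPNT.EXE"]
matches_support_list = "winword.exe" in support_apps
```

Given these behaviours, it seems safest to always include the `object` field in a plan file.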

get_root_path()

Get the root path of the plan.

Returns:
  • str

    The root path of the plan.

Source code in module/sessions/plan_reader.py
def get_root_path(self) -> str:
    """
    Get the root path of the plan.
    :return: The root path of the plan.
    """

    return os.path.dirname(os.path.abspath(self.plan_file))

get_steps()

Get the steps in the plan.

Returns:
  • List[str]

    The steps in the plan.

Source code in module/sessions/plan_reader.py
def get_steps(self) -> List[str]:
    """
    Get the steps in the plan.
    :return: The steps in the plan.
    """

    return self.plan.get("steps", [])

get_support_apps()

Get the support apps in the plan.

Returns:
  • List[str]

    The support apps in the plan.

Source code in module/sessions/plan_reader.py
def get_support_apps(self) -> List[str]:
    """
    Get the support apps in the plan.
    :return: The support apps in the plan.
    """

    return self.support_apps

get_task()

Get the task name.

Returns:
  • str

    The task name.

Source code in module/sessions/plan_reader.py
def get_task(self) -> str:
    """
    Get the task name.
    :return: The task name.
    """

    return self.plan.get("task", "")

next_step()

Get the next step in the plan.

Returns:
  • Optional[str]

    The next step.

Source code in module/sessions/plan_reader.py
def next_step(self) -> Optional[str]:
    """
    Get the next step in the plan.
    :return: The next step.
    """

    if self.remaining_steps:
        step = self.remaining_steps.pop(0)
        return step

    return None

task_finished()

Check if the task is finished.

Returns:
  • bool

    True if the task is finished, False otherwise.

Source code in module/sessions/plan_reader.py
def task_finished(self) -> bool:
    """
    Check if the task is finished.
    :return: True if the task is finished, False otherwise.
    """

    return not self.remaining_steps

FollowerSession

The FollowerSession is located at ufo/module/sessions/session.py.

Bases: WindowsBaseSession

A session that follows a predefined plan step by step. This session is used for the follower agent, which accepts a plan file and reads it with the PlanReader.

Initialize a session.

Parameters:
  • task (str) –

    The name of the current task.

  • plan_file (str) –

    The path of the plan file to follow.

  • should_evaluate (bool) –

    Whether to evaluate the session.

  • id (int) –

    The id of the session.

Source code in module/sessions/session.py
def __init__(
    self, task: str, plan_file: str, should_evaluate: bool, id: int
) -> None:
    """
    Initialize a session.
    :param task: The name of the current task.
    :param plan_file: The path of the plan file to follow.
    :param should_evaluate: Whether to evaluate the session.
    :param id: The id of the session.
    """

    super().__init__(task, should_evaluate, id)

    self.plan_reader = PlanReader(plan_file)

create_new_round()

Create a new round.

Source code in module/sessions/session.py
def create_new_round(self) -> None:
    """
    Create a new round.
    """
    from ufo.agents.agent.host_agent import HostAgent

    # Get a request for the new round.
    request = self.next_request()

    # Create a new round and return None if the session is finished.
    if self.is_finished():
        return None

    if self.total_rounds == 0:
        console.print("Complete the following request:", style="yellow")
        console.print(self.plan_reader.get_initial_request(), style="cyan")
        agent: HostAgent = self._host_agent
    else:
        host_agent: HostAgent = self._host_agent
        self.context.set(ContextNames.SUBTASK, request)
        agent = host_agent.create_subagent(context=self.context)

        # Clear the memory and set the state to continue the app agent.
        agent.clear_memory()
        agent.blackboard.requests.clear()

        agent.set_state(ContinueAppAgentState())

    round = BaseRound(
        request=request,
        agent=agent,
        context=self.context,
        should_evaluate=ufo_config.system.eva_round,
        id=self.total_rounds,
    )

    self.add_round(round.id, round)

    return round

next_request()

Get the request for the new round.

Source code in module/sessions/session.py
def next_request(self) -> str:
    """
    Get the request for the new round.
    """

    # If the task is finished, return an empty string.
    if self.plan_reader.task_finished():
        self._finish = True
        return ""

    # Get the request from the plan reader.
    if self.total_rounds == 0:
        return self.plan_reader.get_host_agent_request()
    else:
        return self.plan_reader.next_step()
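The order in which requests are produced can be traced without running UFO. The sketch below uses a small stand-in class (`MiniFollower`, hypothetical) that copies only the branching logic of `next_request()` and `PlanReader.next_step()`; the real session also creates rounds and agents, which are omitted here. It assumes that `total_rounds` grows by one for each round created, and the host request is replaced by a placeholder string.

```python
from typing import List, Optional


class MiniFollower:
    """Stand-in that mirrors the request selection logic shown above."""

    def __init__(self, steps: List[str], object_name: str) -> None:
        self.remaining_steps = list(steps)
        self.object_name = object_name
        self.total_rounds = 0
        self.finished = False

    def next_step(self) -> Optional[str]:
        # Pop steps from the front, like PlanReader.next_step().
        if self.remaining_steps:
            return self.remaining_steps.pop(0)
        return None

    def next_request(self) -> str:
        # No steps left: mark the session finished and return an empty request.
        if not self.remaining_steps:
            self.finished = True
            return ""
        # Round 0 goes to the host agent to open the target application.
        if self.total_rounds == 0:
            return f"<host request for {self.object_name}>"
        return self.next_step()


def trace_requests(steps: List[str], object_name: str) -> List[str]:
    """Collect every request until the stand-in session reports it is finished."""
    session = MiniFollower(steps, object_name)
    requests = []
    while True:
        request = session.next_request()
        if session.finished:
            break
        requests.append(request)
        session.total_rounds += 1  # assumption: one round per request
    return requests


trace = trace_requests(["1.type in 'Test For Fun'", "2.Select the text"], "draft.docx")
empty_trace = trace_requests([], "draft.docx")
```

In this trace the first request goes to the host agent and each later request is one plan step. With an empty `steps` list, the finished check fires before the host request is ever issued.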

request_to_evaluate()

Get the request to evaluate.

Returns:
  • str

    The request(s) to evaluate.

Source code in module/sessions/session.py
def request_to_evaluate(self) -> str:
    """
    Get the request to evaluate.
    return: The request(s) to evaluate.
    """

    return self.plan_reader.get_task()