UI Automator
The UI Automator mimics mouse and keyboard operations on the application's UI controls. UFO uses the UIA or Win32 APIs to interact with controls such as buttons, edit boxes, and menus.
Configuration
Several options must be set in the config_dev.yaml file before using the UI Automator. The options related to the UI Automator are listed below:
| Configuration Option | Description | Type | Default Value |
| --- | --- | --- | --- |
| CONTROL_BACKEND | The backend for control actions, currently supporting uia and win32. | String | "uia" |
| CONTROL_LIST | The list of widgets allowed to be selected. | List | ["Button", "Edit", "TabItem", "Document", "ListItem", "MenuItem", "ScrollBar", "TreeItem", "Hyperlink", "ComboBox", "RadioButton", "DataItem"] |
| ANNOTATION_COLORS | The colors assigned to different control types for annotation. | Dictionary | {"Button": "#FFF68F", "Edit": "#A5F0B5", "TabItem": "#A5E7F0", "Document": "#FFD18A", "ListItem": "#D9C3FE", "MenuItem": "#E7FEC3", "ScrollBar": "#FEC3F8", "TreeItem": "#D6D6D6", "Hyperlink": "#91FFEB", "ComboBox": "#D8B6D4"} |
| API_PROMPT | The prompt for the UI automation API. | String | "ufo/prompts/share/base/api.yaml" |
| CLICK_API | The API used for the click action, either click_input or click. | String | "click_input" |
| INPUT_TEXT_API | The API used for the input text action, either type_keys or set_text. | String | "type_keys" |
| INPUT_TEXT_ENTER | Whether to press Enter after typing the text. | Boolean | False |
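As a rough illustration of how the string-valued options above constrain each other, the following sketch merges user overrides into the documented defaults and reports values outside the documented choices. The helper name check_ui_automator_config and the dictionaries are hypothetical and are not part of UFO; only the option names, defaults, and allowed values come from the table.

```python
from typing import Any, Dict, List

# Allowed values taken from the option descriptions in the table above.
ALLOWED_VALUES: Dict[str, set] = {
    "CONTROL_BACKEND": {"uia", "win32"},
    "CLICK_API": {"click_input", "click"},
    "INPUT_TEXT_API": {"type_keys", "set_text"},
}

# Defaults copied from the table; the list and color-map options are omitted for brevity.
DEFAULTS: Dict[str, Any] = {
    "CONTROL_BACKEND": "uia",
    "API_PROMPT": "ufo/prompts/share/base/api.yaml",
    "CLICK_API": "click_input",
    "INPUT_TEXT_API": "type_keys",
    "INPUT_TEXT_ENTER": False,
}


def check_ui_automator_config(overrides: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: merge overrides into the defaults and list invalid values."""
    merged = {**DEFAULTS, **overrides}
    problems: List[str] = []
    for key, allowed in ALLOWED_VALUES.items():
        if merged[key] not in allowed:
            problems.append(f"{key}={merged[key]!r} is not one of {sorted(allowed)}")
    if not isinstance(merged["INPUT_TEXT_ENTER"], bool):
        problems.append("INPUT_TEXT_ENTER must be a boolean")
    return problems


print(check_ui_automator_config({"CLICK_API": "click"}))      # valid override
print(check_ui_automator_config({"CONTROL_BACKEND": "gtk"}))  # unsupported backend
```

A check like this could run once after loading config_dev.yaml, so that a typo in an option value surfaces before any action is dispatched.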
Receiver
The receiver of the UI Automator is the ControlReceiver class defined in the ufo/automator/ui_control/controller/control_receiver module. It is initialized with the application's window handle and the control wrapper that executes the actions. The ControlReceiver provides functionality to interact with the application's UI controls. Below is the reference for the ControlReceiver class:
Bases: ReceiverBasic
The control receiver class.
Initialize the control receiver.
Parameters:
- control (Optional[UIAWrapper]) – The control element.
- application (Optional[UIAWrapper]) – The application element.

Source code in automator/ui_control/controller.py

    def __init__(
        self, control: Optional[UIAWrapper], application: Optional[UIAWrapper]
    ) -> None:
        """
        Initialize the control receiver.
        :param control: The control element.
        :param application: The application element.
        """
        self.control = control
        self.application = application
        if control:
            self.control.set_focus()
            self.wait_enabled()
        elif application:
            self.application.set_focus()
annotation(params, annotation_dict)
Take a screenshot of the current application window and annotate the control item on the screenshot.
Parameters:
- params (Dict[str, str]) – The arguments of the annotation method.
- annotation_dict (Dict[str, UIAWrapper]) – The dictionary of the control labels.

Source code in automator/ui_control/controller.py

    def annotation(
        self, params: Dict[str, str], annotation_dict: Dict[str, UIAWrapper]
    ) -> List[str]:
        """
        Take a screenshot of the current application window and annotate the control item on the screenshot.
        :param params: The arguments of the annotation method.
        :param annotation_dict: The dictionary of the control labels.
        """
        selected_controls_labels = params.get("control_labels", [])
        control_reannotate = [
            annotation_dict[str(label)] for label in selected_controls_labels
        ]
        return control_reannotate
atomic_execution(method_name, params)
Atomic execution of the action on the control elements.
Parameters:
- method_name (str) – The name of the method to execute.
- params (Dict[str, Any]) – The arguments of the method.

Returns:
- str – The result of the action.

Source code in automator/ui_control/controller.py

    def atomic_execution(self, method_name: str, params: Dict[str, Any]) -> str:
        """
        Atomic execution of the action on the control elements.
        :param method_name: The name of the method to execute.
        :param params: The arguments of the method.
        :return: The result of the action.
        """
        import traceback
        try:
            method = getattr(self.control, method_name)
            result = method(**params)
        except AttributeError:
            message = f"{self.control} doesn't have a method named {method_name}"
            print_with_color(f"Warning: {message}", "yellow")
            result = message
        except Exception as e:
            full_traceback = traceback.format_exc()
            message = f"An error occurred: {full_traceback}"
            print_with_color(f"Warning: {message}", "yellow")
            result = message
        return result
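The dispatch pattern used by atomic_execution (look up a method by name, call it with keyword arguments, and turn failures into a message rather than an exception) can be tried without a Windows UI. The sketch below is a standalone approximation: FakeControl is a hypothetical stand-in for a control wrapper, and the free function only mirrors the structure of the method shown above.

```python
import traceback
from typing import Any, Dict


class FakeControl:
    """Hypothetical stand-in for a control wrapper; not a pywinauto class."""

    def click_input(self, button: str = "left", double: bool = False) -> str:
        return f"clicked button={button} double={double}"


def atomic_execution(control: Any, method_name: str, params: Dict[str, Any]) -> str:
    """Call control.<method_name>(**params) and report failures as strings."""
    try:
        method = getattr(control, method_name)
        return method(**params)
    except AttributeError:
        # The control does not expose the requested method.
        return f"{type(control).__name__} doesn't have a method named {method_name}"
    except Exception:
        # Any other failure (for example a bad keyword argument) is returned as text.
        return f"An error occurred: {traceback.format_exc()}"


control = FakeControl()
print(atomic_execution(control, "click_input", {"button": "right"}))
print(atomic_execution(control, "fly", {}))
```

Returning the error text instead of raising keeps the caller's control flow simple, at the cost that callers have to inspect the returned string to detect a failure.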
click_input(params)
Click the control element.
Parameters:
- params (Dict[str, Union[str, bool]]) – The arguments of the click method.

Returns:
- str – The result of the click action.

Source code in automator/ui_control/controller.py

    def click_input(self, params: Dict[str, Union[str, bool]]) -> str:
        """
        Click the control element.
        :param params: The arguments of the click method.
        :return: The result of the click action.
        """
        api_name = configs.get("CLICK_API", "click_input")
        if api_name == "click":
            return self.atomic_execution("click", params)
        else:
            return self.atomic_execution("click_input", params)
click_on_coordinates(params)
Click on the coordinates of the control element.
Parameters:
- params (Dict[str, str]) – The arguments of the click on coordinates method.

Returns:
- str – The result of the click on coordinates action.

Source code in automator/ui_control/controller.py

    def click_on_coordinates(self, params: Dict[str, str]) -> str:
        """
        Click on the coordinates of the control element.
        :param params: The arguments of the click on coordinates method.
        :return: The result of the click on coordinates action.
        """
        # Get the relative coordinates fraction of the application window.
        x = float(params.get("x", 0))
        y = float(params.get("y", 0))
        button = params.get("button", "left")
        double = params.get("double", False)
        # Get the absolute coordinates of the application window.
        tranformed_x, tranformed_y = self.transform_point(x, y)
        self.application.set_focus()
        pyautogui.click(
            tranformed_x, tranformed_y, button=button, clicks=2 if double else 1
        )
        return ""
drag_on_coordinates(params)
Drag on the coordinates of the control element.
Parameters:
- params (Dict[str, str]) – The arguments of the drag on coordinates method.

Returns:
- str – The result of the drag on coordinates action.

Source code in automator/ui_control/controller.py

    def drag_on_coordinates(self, params: Dict[str, str]) -> str:
        """
        Drag on the coordinates of the control element.
        :param params: The arguments of the drag on coordinates method.
        :return: The result of the drag on coordinates action.
        """
        start = self.transform_point(
            float(params.get("start_x", 0)), float(params.get("start_y", 0))
        )
        end = self.transform_point(
            float(params.get("end_x", 0)), float(params.get("end_y", 0))
        )
        button = params.get("button", "left")
        self.application.set_focus()
        pyautogui.moveTo(start[0], start[1])
        pyautogui.dragTo(end[0], end[1], button=button)
        return ""
keyboard_input(params)
Keyboard input on the control element.
Parameters:
- params (Dict[str, str]) – The arguments of the keyboard input method.

Returns:
- str – The result of the keyboard input action.

Source code in automator/ui_control/controller.py

    def keyboard_input(self, params: Dict[str, str]) -> str:
        """
        Keyboard input on the control element.
        :param params: The arguments of the keyboard input method.
        :return: The result of the keyboard input action.
        """
        control_focus = params.get("control_focus", True)
        keys = params.get("keys", "")
        if control_focus:
            self.atomic_execution("type_keys", {"keys": keys})
        else:
            pyautogui.typewrite(keys)
        return keys
no_action()
No action on the control element.
Returns:
- The result of the no action.

Source code in automator/ui_control/controller.py

    def no_action(self):
        """
        No action on the control element.
        :return: The result of the no action.
        """
        return ""
set_edit_text(params)
Set the edit text of the control element.
Parameters:
- params (Dict[str, str]) – The arguments of the set edit text method.

Returns:
- str – The result of the set edit text action.

Source code in automator/ui_control/controller.py

    def set_edit_text(self, params: Dict[str, str]) -> str:
        """
        Set the edit text of the control element.
        :param params: The arguments of the set edit text method.
        :return: The result of the set edit text action.
        """
        text = params.get("text", "")
        inter_key_pause = configs.get("INPUT_TEXT_INTER_KEY_PAUSE", 0.1)
        if configs["INPUT_TEXT_API"] == "set_text":
            method_name = "set_edit_text"
            args = {"text": text}
        else:
            method_name = "type_keys"
            # Transform the text according to the tags.
            text = TextTransformer.transform_text(text, "all")
            args = {"keys": text, "pause": inter_key_pause, "with_spaces": True}
        try:
            result = self.atomic_execution(method_name, args)
            if (
                method_name == "set_text"
                and args["text"] not in self.control.window_text()
            ):
                raise Exception(f"Failed to use set_text: {args['text']}")
            if configs["INPUT_TEXT_ENTER"] and method_name in ["type_keys", "set_text"]:
                self.atomic_execution("type_keys", params={"keys": "{ENTER}"})
            return result
        except Exception as e:
            if method_name == "set_text":
                print_with_color(
                    f"{self.control} doesn't have a method named {method_name}, trying default input method",
                    "yellow",
                )
                method_name = "type_keys"
                clear_text_keys = "^a{BACKSPACE}"
                text_to_type = args["text"]
                keys_to_send = clear_text_keys + text_to_type
                method_name = "type_keys"
                args = {
                    "keys": keys_to_send,
                    "pause": inter_key_pause,
                    "with_spaces": True,
                }
                return self.atomic_execution(method_name, args)
            else:
                return f"An error occurred: {e}"
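The type_keys path in set_edit_text passes three arguments: the key string, an inter-key pause, and with_spaces=True. Its fallback branch prefixes the text with "^a{BACKSPACE}" (Ctrl+A followed by Backspace) to clear the field first. A minimal sketch of building that argument dictionary is shown below; the helper name build_type_keys_args and its clear_first flag are hypothetical and do not exist in UFO.

```python
from typing import Any, Dict


def build_type_keys_args(
    text: str, inter_key_pause: float = 0.1, clear_first: bool = False
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for a type_keys call."""
    clear_text_keys = "^a{BACKSPACE}"  # select all, then delete the selection
    keys = clear_text_keys + text if clear_first else text
    return {"keys": keys, "pause": inter_key_pause, "with_spaces": True}


print(build_type_keys_args("hello world"))
print(build_type_keys_args("hello world", clear_first=True))
```

Note that this sketch does not apply the tag transformation performed by TextTransformer.transform_text in the source above.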
summary(params)
Visual summary of the control element.
Parameters:
- params (Dict[str, str]) – The arguments of the visual summary method. Should contain a key "text" with the text summary.

Returns:
- str – The result of the visual summary action.

Source code in automator/ui_control/controller.py

    def summary(self, params: Dict[str, str]) -> str:
        """
        Visual summary of the control element.
        :param params: The arguments of the visual summary method. should contain a key "text" with the text summary.
        :return: The result of the visual summary action.
        """
        return params.get("text")
texts()
Get the text of the control element.
Returns:
- str – The text of the control element.

Source code in automator/ui_control/controller.py

    def texts(self) -> str:
        """
        Get the text of the control element.
        :return: The text of the control element.
        """
        return self.control.texts()
transform_point(fraction_x, fraction_y)
Transform the relative coordinates to the absolute coordinates.
Parameters:
- fraction_x (float) – The relative x coordinate.
- fraction_y (float) – The relative y coordinate.

Returns:
- Tuple[int, int] – The absolute coordinates.

Source code in automator/ui_control/controller.py

    def transform_point(self, fraction_x: float, fraction_y: float) -> Tuple[int, int]:
        """
        Transform the relative coordinates to the absolute coordinates.
        :param fraction_x: The relative x coordinate.
        :param fraction_y: The relative y coordinate.
        :return: The absolute coordinates.
        """
        application_rect: RECT = self.application.rectangle()
        application_x = application_rect.left
        application_y = application_rect.top
        application_width = application_rect.width()
        application_height = application_rect.height()
        x = application_x + int(application_width * fraction_x)
        y = application_y + int(application_height * fraction_y)
        return x, y
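A worked example may make the fractional-coordinate arithmetic concrete. The sketch below reimplements the same formula as a free function over a hypothetical Rect dataclass, which stands in for the window rectangle returned by rectangle(); it is an illustration, not the library's code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Rect:
    """Hypothetical stand-in for the application window rectangle."""

    left: int
    top: int
    right: int
    bottom: int

    def width(self) -> int:
        return self.right - self.left

    def height(self) -> int:
        return self.bottom - self.top


def transform_point(rect: Rect, fraction_x: float, fraction_y: float) -> Tuple[int, int]:
    """Map fractions of the window size to absolute screen coordinates."""
    x = rect.left + int(rect.width() * fraction_x)
    y = rect.top + int(rect.height() * fraction_y)
    return x, y


window = Rect(left=100, top=200, right=900, bottom=800)  # an 800 x 600 window
print(transform_point(window, 0.5, 0.25))  # (500, 350)
```

Here the window is 800 pixels wide and 600 pixels tall, so the fraction (0.5, 0.25) lands 400 pixels right of and 150 pixels below the top-left corner at (100, 200).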
wait_enabled(timeout=10, retry_interval=0.5)
Wait until the control is enabled.
Parameters:
- timeout (int, default: 10) – The timeout to wait.
- retry_interval (int, default: 0.5) – The retry interval to wait.

Source code in automator/ui_control/controller.py

    def wait_enabled(self, timeout: int = 10, retry_interval: int = 0.5) -> None:
        """
        Wait until the control is enabled.
        :param timeout: The timeout to wait.
        :param retry_interval: The retry interval to wait.
        """
        while not self.control.is_enabled():
            time.sleep(retry_interval)
            timeout -= retry_interval
            if timeout <= 0:
                warnings.warn(f"Timeout: {self.control} is not enabled.")
                break
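The polling loop behind wait_enabled can be exercised without a real UI. In the sketch below, SlowControl is a hypothetical control that reports itself enabled only after a fixed number of checks. Unlike the method above, this variant returns a boolean so the outcome is easy to inspect; that return value is an assumption made for the example.

```python
import time
import warnings


class SlowControl:
    """Hypothetical control that becomes enabled after a number of checks."""

    def __init__(self, checks_until_enabled: int) -> None:
        self.remaining = checks_until_enabled

    def is_enabled(self) -> bool:
        self.remaining -= 1
        return self.remaining < 0


def wait_enabled(control, timeout: float = 10, retry_interval: float = 0.5) -> bool:
    """Poll is_enabled(); return True once enabled, False if the timeout elapses."""
    while not control.is_enabled():
        time.sleep(retry_interval)
        timeout -= retry_interval
        if timeout <= 0:
            warnings.warn(f"Timeout: {control!r} is not enabled.")
            return False
    return True


# The control reports enabled on the third check, well within the timeout.
print(wait_enabled(SlowControl(2), timeout=1.0, retry_interval=0.01))
```

The timeout is tracked by subtracting the sleep interval rather than by reading a clock, so time spent inside is_enabled() itself is not counted against it.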
wait_visible(timeout=10, retry_interval=0.5)
Wait until the control is visible.
Parameters:
- timeout (int, default: 10) – The timeout to wait.
- retry_interval (int, default: 0.5) – The retry interval to wait.

Source code in automator/ui_control/controller.py

    def wait_visible(self, timeout: int = 10, retry_interval: int = 0.5) -> None:
        """
        Wait until the window is enabled.
        :param timeout: The timeout to wait.
        :param retry_interval: The retry interval to wait.
        """
        while not self.control.is_visible():
            time.sleep(retry_interval)
            timeout -= retry_interval
            if timeout <= 0:
                warnings.warn(f"Timeout: {self.control} is not visible.")
                break
wheel_mouse_input(params)
Wheel mouse input on the control element.
Parameters:
- params (Dict[str, str]) – The arguments of the wheel mouse input method.

Returns:
- The result of the wheel mouse input action.

Source code in automator/ui_control/controller.py

    def wheel_mouse_input(self, params: Dict[str, str]):
        """
        Wheel mouse input on the control element.
        :param params: The arguments of the wheel mouse input method.
        :return: The result of the wheel mouse input action.
        """
        return self.atomic_execution("wheel_mouse_input", params)
Command
The command of the UI Automator is the ControlCommand class defined in the ufo/automator/ui_control/controller/ControlCommand module. It encapsulates the function and parameters required to execute the action. The ControlCommand class is the base class for all commands in the UI Automator. Below is an example of a ClickInputCommand class that inherits from the ControlCommand class:
    @ControlReceiver.register
    class ClickInputCommand(ControlCommand):
        """
        The click input command class.
        """

        def execute(self) -> str:
            """
            Execute the click input command.
            :return: The result of the click input command.
            """
            return self.receiver.click_input(self.params)

        @classmethod
        def name(cls) -> str:
            """
            Get the name of the atomic command.
            :return: The name of the atomic command.
            """
            return "click_input"
Note
The concrete command classes must implement the execute method to execute the action and the name method to return the name of the atomic command.
Note
Each command must be registered with a specific ControlReceiver, using the @ControlReceiver.register decorator, before it can be executed.
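To show how a registration decorator and name-based lookup might fit together, here is a self-contained sketch of the command pattern described above. The Receiver and Command classes, the _command_registry attribute, and the run method are all hypothetical; the real ControlReceiver.register may be implemented differently.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Type


class Receiver:
    """Hypothetical receiver that keeps a registry of command classes by name."""

    _command_registry: Dict[str, Type["Command"]] = {}

    @classmethod
    def register(cls, command_class: Type["Command"]) -> Type["Command"]:
        # Store the class under its atomic command name and return it unchanged.
        cls._command_registry[command_class.name()] = command_class
        return command_class

    def click_input(self, params: Dict[str, Any]) -> str:
        return f"clicked with {params}"

    def run(self, command_name: str, params: Dict[str, Any]) -> str:
        # Look up the command class by name, bind it to this receiver, and execute it.
        command_class = self._command_registry[command_name]
        return command_class(self, params).execute()


class Command(ABC):
    """Hypothetical base command holding a receiver and its parameters."""

    def __init__(self, receiver: Receiver, params: Dict[str, Any]) -> None:
        self.receiver = receiver
        self.params = params

    @abstractmethod
    def execute(self) -> str: ...

    @classmethod
    @abstractmethod
    def name(cls) -> str: ...


@Receiver.register
class ClickInputCommand(Command):
    def execute(self) -> str:
        return self.receiver.click_input(self.params)

    @classmethod
    def name(cls) -> str:
        return "click_input"


print(Receiver().run("click_input", {"button": "left"}))
```

With this arrangement, adding a new action only requires defining a new decorated command class; the lookup code in the receiver does not change.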
Below is the list of available commands in the UI Automator that are currently supported by UFO:
| Command Name | Function Name | Description |
| --- | --- | --- |
| ClickInputCommand | click_input | Click the control item with the mouse. |
| ClickOnCoordinatesCommand | click_on_coordinates | Click on the specific fractional coordinates of the application window. |
| DragOnCoordinatesCommand | drag_on_coordinates | Drag the mouse on the specific fractional coordinates of the application window. |
| SetEditTextCommand | set_edit_text | Add new text to the control item. |
| GetTextsCommand | texts | Get the text of the control item. |
| WheelMouseInputCommand | wheel_mouse_input | Scroll the control item. |
| KeyboardInputCommand | keyboard_input | Simulate the keyboard input. |
Tip
Please refer to the ufo/prompts/share/base/api.yaml file for the detailed API documentation of the UI Automator.
Tip
You can customize the commands by adding new command classes to the ufo/automator/ui_control/controller/ControlCommand module.