Hybrid Detection
We also support hybrid control detection, which combines UIA and OmniParser-v2. The UI Automation (UIA) framework detects the standard controls in the application, while OmniParser-v2 visually detects custom controls that standard UIA methods may not recognize. The visually detected controls are then merged with the UIA controls, and duplicates are removed based on their Intersection over Union (IoU). Hybrid control detection is illustrated in the accompanying figure.
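The IoU-based merge described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names `iou` and `merge_controls`, the `(left, top, right, bottom)` box format, and the `0.5` threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (left, top, right, bottom) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_controls(
    uia_boxes: List[Box],
    visual_boxes: List[Box],
    iou_threshold: float = 0.5,  # assumed value, chosen only for illustration
) -> List[Box]:
    """Keep every UIA box, and add a visual box only if it does not
    overlap any UIA box by more than the threshold."""
    merged = list(uia_boxes)
    for candidate in visual_boxes:
        if all(iou(candidate, known) <= iou_threshold for known in uia_boxes):
            merged.append(candidate)
    return merged


uia = [(0, 0, 10, 10)]
visual = [(1, 1, 10, 10), (20, 20, 30, 30)]
print(merge_controls(uia, visual))
```

In this sketch the UIA controls take priority: a visually detected box is treated as a duplicate and dropped when it overlaps an existing UIA box too closely, so only the non-overlapping visual box is added.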
Configuration
Before using hybrid control detection, you need to deploy and configure the OmniParser model. Refer to the OmniParser deployment documentation for details.
To activate hybrid control detection, set CONTROL_BACKEND to ["uia", "omniparser"] in the config_dev.yaml file:
CONTROL_BACKEND: ["uia", "omniparser"]
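As a rough illustration of how this setting might be interpreted, the sketch below checks whether both backends are listed in an already-loaded configuration. The helper `is_hybrid_enabled` and the plain-dictionary representation of the configuration are assumptions for this example, not part of the documented API.

```python
from typing import Any, Dict


def is_hybrid_enabled(config: Dict[str, Any]) -> bool:
    """Return True when both the UIA and OmniParser backends are configured.

    `config` is assumed to be the parsed content of config_dev.yaml as a dict.
    """
    backends = config.get("CONTROL_BACKEND", [])
    if isinstance(backends, str):
        # Tolerate a single backend given as a bare string.
        backends = [backends]
    return {"uia", "omniparser"}.issubset(backends)


config = {"CONTROL_BACKEND": ["uia", "omniparser"]}
print(is_hybrid_enabled(config))
```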
Reference
The following classes are used for visual control detection in OmniParser:
Bases: BasicGrounding
The OmniparserGrounding class is a subclass of BasicGrounding that represents the OmniParser grounding model.
parse_results(results, application_window=None)
Parse the grounding results string into a list of control element information dictionaries.
Source code in automator/ui_control/grounding/omniparser.py
predict(image_path, box_threshold=0.05, iou_threshold=0.1, use_paddleocr=True, imgsz=640, api_name='/process')
Predict the grounding for the given image.
Source code in automator/ui_control/grounding/omniparser.py