Google Gemini
Step 1
To use the Google Gemini API, create an account on the Google Gemini website and obtain an API key.
Step 2
You may need to install additional dependencies to use the Google Gemini API. You can install the dependencies using the following command:
pip install -U google-generativeai==0.7.0
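To confirm that the install step took effect, a small standard-library check can report which version of the package is present. This is a minimal sketch: the helper name `installed_version` and the constant `PINNED_VERSION` are illustrative and not part of UFO or the Gemini SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version taken from the pip command above.
PINNED_VERSION = "0.7.0"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("google-generativeai")
    if found is None:
        print("google-generativeai is not installed")
    elif found != PINNED_VERSION:
        print(f"google-generativeai {found} is installed, expected {PINNED_VERSION}")
    else:
        print(f"google-generativeai {found} is installed")
```

Run the script in the same Python environment that UFO uses, since the result depends on the active interpreter.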
Step 3
Configure the HOST_AGENT and APP_AGENT in the config.yaml file (rename the config_template.yaml file to config.yaml) to use the Google Gemini API. The following is an example configuration for the Google Gemini API:
VISUAL_MODE: True, # Whether to use visual mode to understand screenshots and take actions
API_TYPE: "Gemini",
API_KEY: "YOUR_KEY",
API_MODEL: "YOUR_MODEL"
Tip
If you set VISUAL_MODE to True, make sure the API_MODEL supports visual inputs.
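Before launching UFO, the fields of the example configuration can be sanity-checked once they have been loaded into a Python dictionary. The sketch below is a hypothetical helper, not part of UFO: the function name `check_gemini_agent_config` is an assumption, and the exact-match test on API_TYPE simply mirrors the value shown in the example rather than UFO's own validation logic.

```python
from typing import Any, Dict, List

# Placeholder values copied from the example configuration.
PLACEHOLDERS = {"API_KEY": "YOUR_KEY", "API_MODEL": "YOUR_MODEL"}


def check_gemini_agent_config(agent_cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []

    # VISUAL_MODE is expected to be a boolean.
    if not isinstance(agent_cfg.get("VISUAL_MODE"), bool):
        problems.append("VISUAL_MODE should be True or False")

    # Assumption: API_TYPE must match the value shown in the example.
    if agent_cfg.get("API_TYPE") != "Gemini":
        problems.append('API_TYPE should be "Gemini"')

    # The key and model must be non-empty strings and no longer placeholders.
    for key, placeholder in PLACEHOLDERS.items():
        value = agent_cfg.get(key)
        if not isinstance(value, str) or not value.strip() or value == placeholder:
            problems.append(f"{key} is missing or still set to the placeholder")

    return problems


if __name__ == "__main__":
    example = {
        "VISUAL_MODE": True,
        "API_TYPE": "Gemini",
        "API_KEY": "YOUR_KEY",
        "API_MODEL": "YOUR_MODEL",
    }
    for problem in check_gemini_agent_config(example):
        print(problem)
```

The same check would be applied to both the HOST_AGENT and the APP_AGENT sections, since each carries its own set of these fields.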
Tip
API_MODEL is the model name used by the Gemini LLM API. You can find the model name in the Gemini LLM model list. If you encounter the error 429 Resource has been exhausted (e.g. check quota)., it may be because you have hit the rate limit of your Gemini API.
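A common way to cope with rate-limit errors such as the 429 above is to retry the request with exponential backoff. The following is a generic sketch under stated assumptions: `call_with_backoff` and `is_rate_limit_error` are hypothetical helpers (not part of UFO or the Gemini SDK), and the rate-limit check is a simple heuristic that matches on the error message text.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_rate_limit_error(exc: Exception) -> bool:
    """Heuristic: the error message mentions 429 or an exhausted resource."""
    message = str(exc)
    return "429" in message or "Resource has been exhausted" in message


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying with exponential backoff on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            # Re-raise unrelated errors and the final failed attempt.
            if not is_rate_limit_error(exc) or attempt == max_attempts - 1:
                raise
            # The delay doubles on each attempt, plus a small random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1 * base_delay)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

To use it, wrap whatever zero-argument function issues the Gemini request, for example `call_with_backoff(lambda: send_request(prompt))`, where `send_request` stands in for your own request function. Backoff only helps with short-lived rate limits; if the quota itself is exhausted, retries will keep failing until the quota resets or is raised.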
Step 4
After configuring the HOST_AGENT and APP_AGENT with the Gemini API, you can start using UFO, backed by the Gemini API, for various tasks on Windows. Please refer to the Quick Start Guide for more details on how to get started with UFO.