# LiteLLM

:::info
LiteLLM provides a unified interface for calling 100+ LLMs using the same input/output format, including OpenAI, Hugging Face, Anthropic, vLLM, Cohere, and even custom LLM API servers. With LiteLLM as the bridge, many LLMs can be onboarded to TaskWeaver. Here we use the OpenAI Proxy Server provided by LiteLLM for the configuration.
:::
- Install LiteLLM Proxy and configure the LLM server by following the instructions here. In general, there are a few steps (a consolidated sketch follows this list):
  - Install the package: `pip install litellm[proxy]`.
  - Set up the API key and any other environment variables required by your LLM. Taking Cohere as an example, you need to set `export COHERE_API_KEY=my-api-key`.
  - Run the LiteLLM proxy server with `litellm --model MODEL_NAME --drop_params`. For Cohere, for example, the model name can be `command-nightly`. The `--drop_params` argument is used to ensure API compatibility. A server will then start automatically on `http://0.0.0.0:8000`.
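Putting these steps together, a minimal sketch for the Cohere example could look as follows (replace `my-api-key` with your actual key):

```bash
# Install LiteLLM with its proxy dependencies
# (quotes keep the extras bracket safe in shells like zsh)
pip install 'litellm[proxy]'

# Expose the Cohere API key to the proxy via the environment
export COHERE_API_KEY=my-api-key

# Start the proxy for Cohere's command-nightly model;
# --drop_params drops request parameters the backend does not support
litellm --model command-nightly --drop_params
```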
:::tip
The full list of models supported by LiteLLM can be found on this page.
:::
- Add the following content to your `taskweaver_config.json` file:

```json
{
  "llm.api_base": "http://0.0.0.0:8000",
  "llm.api_key": "anything",
  "llm.model": "gpt-3.5-turbo"
}
```
:::info
`llm.api_key` and `llm.model` are mainly used as placeholders for the API call; their actual values are not used. If the configuration does not work, please refer to the LiteLLM documentation and test locally whether you can send requests to the LLM, as shown in the sketch below.
:::
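As a quick local test, you can send an OpenAI-style chat completion request straight to the proxy. This is a minimal sketch assuming the proxy is running on `http://0.0.0.0:8000` as above; the prompt is illustrative, and the model name is the placeholder from the configuration:

```bash
# Send a test request to the proxy's OpenAI-compatible
# /chat/completions endpoint
curl http://0.0.0.0:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say hello."}]
      }'
```

A JSON completion in the response means the proxy is reachable, and the TaskWeaver configuration above should then work as well.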
- Open a new terminal, start TaskWeaver, and chat (a sketch of the launch command is shown below). You can refer to the Quick Start for more details.
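For reference, launching TaskWeaver from a clone of its repository typically looks like the following; the `./project/` path is an assumption based on the default layout described in the Quick Start:

```bash
# Start the TaskWeaver command-line interface, pointing it at
# the project directory that holds taskweaver_config.json
python -m taskweaver -p ./project/
```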