{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Models\n", "\n", "In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for model clients and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n", "\n", "This section provides a quick overview of available model clients.\n", "For more details on how to use them directly, please refer to [Model Clients](../../core-user-guide/components/model-clients.ipynb) in the Core API documentation.\n", "\n", "```{note}\n", "See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## OpenAI\n", "\n", "To access OpenAI models, install the `openai` extension, which allows you to use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "pip install \"autogen-ext[openai]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You will also need to obtain an [API key](https://platform.openai.com/account/api-keys) from OpenAI." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from autogen_ext.models.openai import OpenAIChatCompletionClient\n", "\n", "openai_model_client = OpenAIChatCompletionClient(\n", "    model=\"gpt-4o-2024-08-06\",\n", "    # api_key=\"sk-...\", # Optional if you have an OPENAI_API_KEY environment variable set.\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test the model client, you can use the following code:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CreateResult(finish_reason='stop', content='The capital of France is Paris.', usage=RequestUsage(prompt_tokens=15, completion_tokens=7), cached=False, logprobs=None)\n" ] } ], "source": [ "from autogen_core.models import UserMessage\n", "\n", "result = await openai_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```{note}\n", "You can use this client with models hosted on OpenAI-compatible endpoints; however, we have not tested this functionality.\n", "See {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` for more information.\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Azure OpenAI\n", "\n", "Similarly, install the `azure` and `openai` extensions to use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "pip install \"autogen-ext[openai,azure]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use the client, you need to provide your deployment ID, Azure Cognitive Services endpoint, API version, and model capabilities.\n", "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n", "\n", "The following code snippet shows how to use AAD authentication.\n", "The identity used must be assigned the [Cognitive Services OpenAI User](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n", "from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n", "\n", "# Create the token provider\n", "token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n", "\n", "az_model_client = AzureOpenAIChatCompletionClient(\n", "    azure_deployment=\"{your-azure-deployment}\",\n", "    model=\"{model-name, such as gpt-4o}\",\n", "    api_version=\"2024-06-01\",\n", "    azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n", "    azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n", "    # api_key=\"sk-...\", # For key-based authentication.\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See the [Azure OpenAI managed identity documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly and for more information." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Azure AI Foundry\n", "\n", "[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n", "To use those models, you use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n", "\n", "You need to install the `azure` extra to use this client." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "pip install \"autogen-ext[azure]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n" ] } ], "source": [ "import os\n", "\n", "from autogen_core.models import UserMessage\n", "from autogen_ext.models.azure import AzureAIChatCompletionClient\n", "from azure.core.credentials import AzureKeyCredential\n", "\n", "client = AzureAIChatCompletionClient(\n", " model=\"Phi-4\",\n", " endpoint=\"https://models.inference.ai.azure.com\",\n", " # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n", " # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n", " credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n", " model_info={\n", " \"json_output\": False,\n", " \"function_calling\": False,\n", " \"vision\": False,\n", " \"family\": \"unknown\",\n", " },\n", ")\n", "\n", "result = await client.create([UserMessage(content=\"What is 
the capital of France?\", source=\"user\")])\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Ollama\n", "\n", "[Ollama](https://ollama.com/) is a model server that runs models locally on your machine.\n", "\n", "Currently, we recommend using the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n", "to interact with the Ollama server.\n", "\n", "```{note}\n", "Small local models are typically not as capable as larger models in the cloud.\n", "For some tasks they may not perform as well and the output may be surprising.\n", "```" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "finish_reason='unknown' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False logprobs=None\n" ] } ], "source": [ "from autogen_core.models import UserMessage\n", "from autogen_ext.models.openai import OpenAIChatCompletionClient\n", "\n", "model_client = OpenAIChatCompletionClient(\n", "    model=\"llama3.2:latest\",\n", "    base_url=\"http://localhost:11434/v1\",\n", "    api_key=\"placeholder\",\n", "    model_info={\n", "        \"vision\": False,\n", "        \"function_calling\": True,\n", "        \"json_output\": False,\n", "        \"family\": \"unknown\",\n", "    },\n", ")\n", "\n", "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n", "print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gemini (experimental)\n", "\n", "Gemini currently offers [an OpenAI-compatible API (beta)](https://ai.google.dev/gemini-api/docs/openai).\n", "You can therefore use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` with the Gemini API.\n", "\n", "```{note}\n", "While some model providers may offer OpenAI-compatible APIs, they may still have minor differences.\n", "For example, the `finish_reason` field may be different in the response.\n", 
"```" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "finish_reason='stop' content='Paris\\n' usage=RequestUsage(prompt_tokens=7, completion_tokens=2) cached=False logprobs=None thought=None\n" ] } ], "source": [ "from autogen_core.models import UserMessage\n", "from autogen_ext.models.openai import OpenAIChatCompletionClient\n", "\n", "model_client = OpenAIChatCompletionClient(\n", " model=\"gemini-1.5-flash-8b\",\n", " # api_key=\"GEMINI_API_KEY\",\n", ")\n", "\n", "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n", "print(response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Semantic Kernel Adapter\n", "\n", "The {py:class}`~autogen_ext.models.semantic_kernel.SKChatCompletionAdapter`\n", "allows you to use Semantic kernel model clients as a\n", "{py:class}`~autogen_core.models.ChatCompletionClient` by adapting them to the required interface.\n", "\n", "You need to install the relevant provider extras to use this adapter. \n", "\n", "The list of extras that can be installed:\n", "\n", "- `semantic-kernel-anthropic`: Install this extra to use Anthropic models.\n", "- `semantic-kernel-google`: Install this extra to use Google Gemini models.\n", "- `semantic-kernel-ollama`: Install this extra to use Ollama models.\n", "- `semantic-kernel-mistralai`: Install this extra to use MistralAI models.\n", "- `semantic-kernel-aws`: Install this extra to use AWS models.\n", "- `semantic-kernel-hugging-face`: Install this extra to use Hugging Face models.\n", "\n", "For example, to use Anthropic models, you need to install `semantic-kernel-anthropic`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "shellscript" } }, "outputs": [], "source": [ "# pip install \"autogen-ext[semantic-kernel-anthropic]\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use this adapter, you need to create a Semantic Kernel model client and pass it to the adapter.\n", "\n", "For example, to use the Anthropic model:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "finish_reason='stop' content='The capital of France is Paris. It is also the largest city in France and one of the most populous metropolitan areas in Europe.' usage=RequestUsage(prompt_tokens=0, completion_tokens=0) cached=False logprobs=None\n" ] } ], "source": [ "import os\n", "\n", "from autogen_core.models import UserMessage\n", "from autogen_ext.models.semantic_kernel import SKChatCompletionAdapter\n", "from semantic_kernel import Kernel\n", "from semantic_kernel.connectors.ai.anthropic import AnthropicChatCompletion, AnthropicChatPromptExecutionSettings\n", "from semantic_kernel.memory.null_memory import NullMemory\n", "\n", "sk_client = AnthropicChatCompletion(\n", "    ai_model_id=\"claude-3-5-sonnet-20241022\",\n", "    api_key=os.environ[\"ANTHROPIC_API_KEY\"],\n", "    service_id=\"my-service-id\", # Optional; for targeting specific services within Semantic Kernel\n", ")\n", "settings = AnthropicChatPromptExecutionSettings(\n", "    temperature=0.2,\n", ")\n", "\n", "anthropic_model_client = SKChatCompletionAdapter(\n", "    sk_client, kernel=Kernel(memory=NullMemory()), prompt_settings=settings\n", ")\n", "\n", "# Call the model directly.\n", "model_result = await anthropic_model_client.create(\n", "    messages=[UserMessage(content=\"What is the capital of France?\", source=\"User\")]\n", ")\n", "print(model_result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read more about the [Semantic Kernel 
Adapter](../../../reference/python/autogen_ext.models.semantic_kernel.rst)." ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 2 }