Model Clients#

AutoGen provides a suite of built-in model clients for using the ChatCompletion API. All model clients implement the ChatCompletionClient protocol class, which is defined in the autogen_core.models module.

Built-in Model Clients#

Currently there are two built-in model clients: OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient. Both clients are asynchronous.

To use the OpenAIChatCompletionClient, you need to provide the API key either through the environment variable OPENAI_API_KEY or through the api_key argument.

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Create an OpenAI model client.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    # api_key="sk-...", # Optional if you have an API key set in the environment.
)
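The key lookup described above (explicit argument first, environment variable as a fallback) can be sketched in plain Python. This is only an illustration of the precedence, not the client's actual implementation; the helper name resolve_api_key is hypothetical.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None, env_var: str = "OPENAI_API_KEY") -> str:
    """Hypothetical helper: prefer the explicit argument, else read the environment."""
    key = api_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key given and {env_var} is not set.")
    return key


# An explicit key is used as-is, regardless of the environment.
key = resolve_api_key(api_key="sk-placeholder")  # placeholder value, not a real key
```

Keeping the key in the environment avoids hard-coding secrets in notebooks or source files.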

You can call the create() method to create a chat completion request, and await a CreateResult object in return.

# Send a message list to the model and await the response.
messages = [
    UserMessage(content="What is the capital of France?", source="user"),
]
response = await model_client.create(messages=messages)

# Print the response
print(response.content)
The capital of France is Paris.
# Print the response token usage
print(response.usage)
RequestUsage(prompt_tokens=15, completion_tokens=7)

Default Model Capabilities may be overridden should the need arise.

Streaming Response#

You can use the create_stream() method to create a chat completion request with a streaming response.

messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream.
stream = model_client.create_stream(messages=messages)

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for response in stream:  # type: ignore
    if isinstance(response, str):
        # A partial response is a string.
        print(response, flush=True, end="")
    else:
        # The last response is a CreateResult object with the complete message.
        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(response.content, flush=True)
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(response.usage, flush=True)
Streamed responses:
In the heart of an ancient forest, beneath the shadow of snow-capped peaks, a dragon named Elara lived secretly for centuries. Elara was unlike any dragon from the old tales; her scales shimmered with a deep emerald hue, each scale engraved with symbols of lost wisdom. The villagers in the nearby valley spoke of mysterious lights dancing across the night sky, but none dared venture close enough to solve the enigma.

One cold winter's eve, a young girl named Lira, brimming with curiosity and armed with the innocence of youth, wandered into Elara’s domain. Instead of fire and fury, she found warmth and a gentle gaze. The dragon shared stories of a world long forgotten and in return, Lira gifted her simple stories of human life, rich in laughter and scent of earth.

From that night on, the villagers noticed subtle changes—the crops grew taller, and the air seemed sweeter. Elara had infused the valley with ancient magic, a guardian of balance, watching quietly as her new friend thrived under the stars. And so, Lira and Elara’s bond marked the beginning of a timeless friendship that spun tales of hope whispered through the leaves of the ever-verdant forest.

------------

The complete response:
In the heart of an ancient forest, beneath the shadow of snow-capped peaks, a dragon named Elara lived secretly for centuries. Elara was unlike any dragon from the old tales; her scales shimmered with a deep emerald hue, each scale engraved with symbols of lost wisdom. The villagers in the nearby valley spoke of mysterious lights dancing across the night sky, but none dared venture close enough to solve the enigma.

One cold winter's eve, a young girl named Lira, brimming with curiosity and armed with the innocence of youth, wandered into Elara’s domain. Instead of fire and fury, she found warmth and a gentle gaze. The dragon shared stories of a world long forgotten and in return, Lira gifted her simple stories of human life, rich in laughter and scent of earth.

From that night on, the villagers noticed subtle changes—the crops grew taller, and the air seemed sweeter. Elara had infused the valley with ancient magic, a guardian of balance, watching quietly as her new friend thrived under the stars. And so, Lira and Elara’s bond marked the beginning of a timeless friendship that spun tales of hope whispered through the leaves of the ever-verdant forest.


------------

The token usage was:
RequestUsage(prompt_tokens=0, completion_tokens=0)

Note

The last response in the streaming response is always the final response of the type CreateResult.

Note that, by default, the usage reported in a streaming response contains zero values.
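Because the stream yields string chunks followed by one final result object, it can be convenient to separate the two in a small helper. The sketch below uses only the standard library: FakeResult and fake_stream are hypothetical stand-ins for CreateResult and model_client.create_stream(...), so that the pattern can run without a model or network access.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional, Tuple, Union


@dataclass
class FakeResult:
    """Hypothetical stand-in for CreateResult; only `content` is modeled."""

    content: str


async def fake_stream() -> AsyncIterator[Union[str, FakeResult]]:
    """Hypothetical stand-in for model_client.create_stream(...)."""
    chunks = ["Once ", "upon ", "a time."]
    for chunk in chunks:
        yield chunk  # Partial responses are plain strings.
    yield FakeResult(content="".join(chunks))  # The last item is the full result.


async def collect(stream: AsyncIterator[Union[str, FakeResult]]) -> Tuple[List[str], FakeResult]:
    """Split a stream into its string chunks and its final result object."""
    chunks: List[str] = []
    final: Optional[FakeResult] = None
    async for item in stream:
        if isinstance(item, str):
            chunks.append(item)
        else:
            final = item
    if final is None:
        raise RuntimeError("Stream ended without a final result.")
    return chunks, final


chunks, final = asyncio.run(collect(fake_stream()))
print(chunks)
print(final.content)
```

With a real client, the same loop body applies; only the stream source changes.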

A Note on Token Usage Counts with Streaming#

Comparing the usage returned by the non-streaming model_client.create(messages=messages) call above with the usage returned by the streaming model_client.create_stream(messages=messages) call, we see a difference. By default, the non-streaming response returns valid prompt and completion token counts, while the streamed response returns zero values.

As documented in the OpenAI API Reference, an additional parameter, stream_options, can be specified to return valid usage counts. See stream_options.

Only set this parameter when you are streaming, i.e., when using create_stream.

To enable it in create_stream, set extra_create_args={"stream_options": {"include_usage": True}}.

  • Note: while other APIs such as LiteLLM also support this parameter, it is not always guaranteed to be fully supported or correct.
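One way to follow the "only when streaming" advice is to build the extra arguments in a single place. The helper below is a hypothetical sketch, not part of AutoGen; it returns only the dictionary that would be passed as extra_create_args.

```python
from typing import Any, Dict


def build_extra_create_args(streaming: bool, include_usage: bool = True) -> Dict[str, Any]:
    """Hypothetical helper: request usage counts only for streaming calls."""
    if streaming and include_usage:
        return {"stream_options": {"include_usage": True}}
    # Non-streaming calls report usage by default, so nothing extra is needed.
    return {}


streaming_args = build_extra_create_args(streaming=True)
non_streaming_args = build_extra_create_args(streaming=False)
print(streaming_args)
print(non_streaming_args)
```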

Streaming example with token usage#

messages = [
    UserMessage(content="Write a very short story about a dragon.", source="user"),
]

# Create a stream.
stream = model_client.create_stream(messages=messages, extra_create_args={"stream_options": {"include_usage": True}})

# Iterate over the stream and print the responses.
print("Streamed responses:")
async for response in stream:  # type: ignore
    if isinstance(response, str):
        # A partial response is a string.
        print(response, flush=True, end="")
    else:
        # The last response is a CreateResult object with the complete message.
        print("\n\n------------\n")
        print("The complete response:", flush=True)
        print(response.content, flush=True)
        print("\n\n------------\n")
        print("The token usage was:", flush=True)
        print(response.usage, flush=True)
Streamed responses:
In a lush, emerald valley hidden by towering peaks, there lived a dragon named Ember. Unlike others of her kind, Ember cherished solitude over treasure, and the songs of the stream over the roar of flames. One misty dawn, a young shepherd stumbled into her sanctuary, lost and frightened. 

Instead of fury, he was met with kindness as Ember extended a wing, guiding him back to safety. In gratitude, the shepherd visited yearly, bringing tales of his world beyond the mountains. Over time, a friendship blossomed, binding man and dragon in shared stories and laughter.

As the years passed, the legend of Ember the gentle-hearted spread far and wide, forever changing the way dragons were seen in the hearts of many.

------------

The complete response:
In a lush, emerald valley hidden by towering peaks, there lived a dragon named Ember. Unlike others of her kind, Ember cherished solitude over treasure, and the songs of the stream over the roar of flames. One misty dawn, a young shepherd stumbled into her sanctuary, lost and frightened. 

Instead of fury, he was met with kindness as Ember extended a wing, guiding him back to safety. In gratitude, the shepherd visited yearly, bringing tales of his world beyond the mountains. Over time, a friendship blossomed, binding man and dragon in shared stories and laughter.

As the years passed, the legend of Ember the gentle-hearted spread far and wide, forever changing the way dragons were seen in the hearts of many.


------------

The token usage was:
RequestUsage(prompt_tokens=17, completion_tokens=146)

Azure OpenAI#

To use the AzureOpenAIChatCompletionClient, you need to provide the deployment ID, Azure Cognitive Services endpoint, API version, and model capabilities. For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential. To use AAD authentication, you need to first install the azure-identity package.

# pip install azure-identity

The following code snippet shows how to use AAD authentication. The identity used must be assigned the Cognitive Services OpenAI User role.

from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Create the token provider
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

az_model_client = AzureOpenAIChatCompletionClient(
    azure_deployment="{your-azure-deployment}",
    model="{model-name, such as gpt-4o}",
    api_version="2024-06-01",
    azure_endpoint="https://{your-custom-endpoint}.openai.azure.com/",
    azure_ad_token_provider=token_provider,  # Optional if you choose key-based authentication.
    # api_key="sk-...", # For key-based authentication.
)

Note

See here for how to use the Azure client directly or for more info.

Build Agent using Model Client#

Let’s create a simple AI agent that can respond to messages using the ChatCompletion API.

from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, SingleThreadedAgentRuntime, message_handler
from autogen_core.models import ChatCompletionClient, SystemMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class SimpleAgent(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant.")]
        self._model_client = model_client

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        response = await self._model_client.create(
            self._system_messages + [user_message], cancellation_token=ctx.cancellation_token
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        return Message(content=response.content)

The SimpleAgent class is a subclass of the autogen_core.RoutedAgent class for the convenience of automatically routing messages to the appropriate handlers. It has a single handler, handle_user_message, which handles messages from the user. It uses the ChatCompletionClient to generate a response to the message. It then returns the response to the user, following the direct communication model.

Note

The cancellation_token of the type autogen_core.CancellationToken is used to cancel asynchronous operations. It is linked to async calls inside the message handlers and can be used by the caller to cancel the handlers.
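As a rough analogy for what cancelling an in-flight handler means, the sketch below cancels a long-running coroutine using plain asyncio. It does not use autogen_core.CancellationToken; slow_model_call is a hypothetical stand-in for a model request.

```python
import asyncio


async def slow_model_call() -> str:
    """Hypothetical stand-in for a slow model request."""
    await asyncio.sleep(10)
    return "response"


async def main() -> str:
    # Start the call as a task so the caller keeps a handle to it.
    task = asyncio.create_task(slow_model_call())
    # Let the task start, then cancel it from the caller's side.
    await asyncio.sleep(0.01)
    task.cancel()
    try:
        return await task
    except asyncio.CancelledError:
        # The awaited call was interrupted before it could finish.
        return "cancelled"


outcome = asyncio.run(main())
print(outcome)
```

In the agent above, passing ctx.cancellation_token to create() plays the role of the task handle: it lets the caller interrupt the pending model request.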

# Create the runtime and register the agent.
from autogen_core import AgentId

runtime = SingleThreadedAgentRuntime()
await SimpleAgent.register(
    runtime,
    "simple_agent",
    lambda: SimpleAgent(
        OpenAIChatCompletionClient(
            model="gpt-4o-mini",
            # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
        )
    ),
)
# Start the runtime processing messages.
runtime.start()
# Send a message to the agent and get the response.
message = Message("Hello, what are some fun things to do in Seattle?")
response = await runtime.send_message(message, AgentId("simple_agent", "default"))
print(response.content)
# Stop the runtime processing messages.
await runtime.stop()
Seattle is a vibrant city with a wide range of activities and attractions. Here are some fun things to do in Seattle:

1. **Space Needle**: Visit this iconic observation tower for stunning views of the city and surrounding mountains.

2. **Pike Place Market**: Explore this historic market where you can see the famous fish toss, buy local produce, and find unique crafts and eateries.

3. **Museum of Pop Culture (MoPOP)**: Dive into the world of contemporary culture, music, and science fiction at this interactive museum.

4. **Chihuly Garden and Glass**: Marvel at the beautiful glass art installations by artist Dale Chihuly, located right next to the Space Needle.

5. **Seattle Aquarium**: Discover the diverse marine life of the Pacific Northwest at this engaging aquarium.

6. **Seattle Art Museum**: Explore a vast collection of art from around the world, including contemporary and indigenous art.

7. **Kerry Park**: For one of the best views of the Seattle skyline, head to this small park on Queen Anne Hill.

8. **Ballard Locks**: Watch boats pass through the locks and observe the salmon ladder to see salmon migrating.

9. **Ferry to Bainbridge Island**: Take a scenic ferry ride across Puget Sound to enjoy charming shops, restaurants, and beautiful natural scenery.

10. **Olympic Sculpture Park**: Stroll through this outdoor park with large-scale sculptures and stunning views of the waterfront and mountains.

11. **Underground Tour**: Discover Seattle's history on this quirky tour of the city's underground passageways in Pioneer Square.

12. **Seattle Waterfront**: Enjoy the shops, restaurants, and attractions along the waterfront, including the Seattle Great Wheel and the aquarium.

13. **Discovery Park**: Explore the largest green space in Seattle, featuring trails, beaches, and views of Puget Sound.

14. **Food Tours**: Try out Seattle’s diverse culinary scene, including fresh seafood, international cuisines, and coffee culture (don’t miss the original Starbucks!).

15. **Attend a Sports Game**: Catch a Seahawks (NFL), Mariners (MLB), or Sounders (MLS) game for a lively local experience.

Whether you're interested in culture, nature, food, or history, Seattle has something for everyone to enjoy!

Manage Model Context#

The above SimpleAgent always responds with a fresh context that contains only the system message and the latest user’s message. We can use model context classes from autogen_core.components.model_context to make the agent “remember” previous conversations. A model context supports storage and retrieval of Chat Completion messages. It is always used together with a model client to generate LLM-based responses.

For example, BufferedChatCompletionContext is a most-recent-used (MRU) context that stores the most recent buffer_size number of messages. This is useful to avoid context overflow in many LLMs.
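The buffering idea can be illustrated with a bounded deque from the standard library. This is a simplified sketch under the assumption that only the newest buffer_size messages matter; the class BufferedMessages is hypothetical, stores plain strings, and is not how BufferedChatCompletionContext is necessarily implemented.

```python
from collections import deque
from typing import Deque, List


class BufferedMessages:
    """Hypothetical stand-in that keeps only the most recent messages."""

    def __init__(self, buffer_size: int) -> None:
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive.")
        # A deque with maxlen silently drops the oldest item when full.
        self._messages: Deque[str] = deque(maxlen=buffer_size)

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    def get_messages(self) -> List[str]:
        return list(self._messages)


context = BufferedMessages(buffer_size=3)
for i in range(1, 6):
    context.add_message(f"message {i}")
print(context.get_messages())  # ['message 3', 'message 4', 'message 5']
```

A small buffer keeps requests short, at the cost of the model forgetting anything older than the buffer.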

Let’s update the previous example to use BufferedChatCompletionContext.

from autogen_core.components.model_context import BufferedChatCompletionContext
from autogen_core.models import AssistantMessage


class SimpleAgentWithContext(RoutedAgent):
    def __init__(self, model_client: ChatCompletionClient) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant.")]
        self._model_client = model_client
        self._model_context = BufferedChatCompletionContext(buffer_size=5)

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        # Prepare input to the chat completion model.
        user_message = UserMessage(content=message.content, source="user")
        # Add message to model context.
        await self._model_context.add_message(user_message)
        # Generate a response.
        response = await self._model_client.create(
            self._system_messages + (await self._model_context.get_messages()),
            cancellation_token=ctx.cancellation_token,
        )
        # Return with the model's response.
        assert isinstance(response.content, str)
        # Add message to model context.
        await self._model_context.add_message(AssistantMessage(content=response.content, source=self.metadata["type"]))
        return Message(content=response.content)

Now let’s try asking a follow-up question after the first one.

runtime = SingleThreadedAgentRuntime()
await SimpleAgentWithContext.register(
    runtime,
    "simple_agent_context",
    lambda: SimpleAgentWithContext(
        OpenAIChatCompletionClient(
            model="gpt-4o-mini",
            # api_key="sk-...", # Optional if you have an OPENAI_API_KEY set in the environment.
        )
    ),
)
# Start the runtime processing messages.
runtime.start()
agent_id = AgentId("simple_agent_context", "default")

# First question.
message = Message("Hello, what are some fun things to do in Seattle?")
print(f"Question: {message.content}")
response = await runtime.send_message(message, agent_id)
print(f"Response: {response.content}")
print("-----")

# Second question.
message = Message("What was the first thing you mentioned?")
print(f"Question: {message.content}")
response = await runtime.send_message(message, agent_id)
print(f"Response: {response.content}")

# Stop the runtime processing messages.
await runtime.stop()
Question: Hello, what are some fun things to do in Seattle?
Response: Seattle offers a wide variety of fun activities and attractions for visitors. Here are some highlights:

1. **Pike Place Market**: Explore this iconic market, where you can find fresh produce, unique crafts, and the famous fish-throwing vendors. Don’t forget to visit the original Starbucks!

2. **Space Needle**: Enjoy breathtaking views of the city and Mount Rainier from the observation deck of this iconic structure. You can also dine at the SkyCity restaurant.

3. **Chihuly Garden and Glass**: Admire the stunning glass art installations created by artist Dale Chihuly. The garden and exhibit are particularly beautiful, especially in good weather.

4. **Museum of Pop Culture (MoPOP)**: Dive into the world of music, science fiction, and pop culture through interactive exhibits and memorabilia.

5. **Seattle Aquarium**: Located on the waterfront, the aquarium features a variety of marine life native to the Pacific Northwest, including otters and diving birds.

6. **Seattle Art Museum (SAM)**: Explore a diverse collection of art from around the world, including Native American art and contemporary pieces.

7. **Ballard Locks**: Watch boats travel between the Puget Sound and Lake Union, and see salmon navigating the fish ladder during spawning season.

8. **Fremont Troll**: Visit this quirky public art installation located under the Aurora Bridge, where you can take fun photos with the giant troll.

9. **Kerry Park**: For a picturesque view of the Seattle skyline, head to Kerry Park on Queen Anne Hill, especially at sunset.

10. **Take a Ferry Ride**: Enjoy the scenic views while taking a ferry to nearby Bainbridge Island or Vashon Island for a relaxing day trip.

11. **Underground Tour**: Explore Seattle’s history on an entertaining underground tour in Pioneer Square, where you’ll learn about the city’s early days.

12. **Attend a Sporting Event**: Depending on the season, catch a Seattle Seahawks (NFL) game, a Seattle Mariners (MLB) game, or a Seattle Sounders (MLS) match.

13. **Explore Discovery Park**: Enjoy nature with hiking trails, beach access, and stunning views of the Puget Sound and Olympic Mountains.

14. **West Seattle’s Alki Beach**: Relax at this beach with beautiful views of the Seattle skyline and enjoy beachside activities like biking or kayaking.

15. **Dining and Craft Beer**: Seattle has a vibrant food scene and is known for its seafood, coffee culture, and craft breweries. Make sure to explore local restaurants and breweries.

There’s something for everyone in Seattle, whether you’re interested in nature, art, history, or food!
-----
Question: What was the first thing you mentioned?
Response: The first thing I mentioned was **Pike Place Market**, an iconic market in Seattle where you can find fresh produce, unique crafts, and experience the famous fish-throwing vendors. It's also home to the original Starbucks and various charming shops and eateries.

From the second response, you can see that the agent can now recall its own previous responses.