Custom Agents
You may have agents with behaviors that do not fall into a preset. In such cases, you can build custom agents.
All agents in AgentChat inherit from the BaseChatAgent class and implement the following abstract methods and attributes:
on_messages(): The abstract method that defines the behavior of the agent in response to messages. This method is called when the agent is asked to provide a response in run(). It returns a Response object.
on_reset(): The abstract method that resets the agent to its initial state. This method is called when the agent is asked to reset itself.
produced_message_types: The list of possible ChatMessage message types the agent can produce in its response.
Optionally, you can implement the on_messages_stream() method to stream messages as they are generated by the agent. If this method is not implemented, the agent uses the default implementation of on_messages_stream(), which calls the on_messages() method and yields all messages in the response.
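The fallback described above can be pictured with a minimal sketch. It does not use the AgentChat classes: StandInResponse and StandInAgent are hypothetical stand-ins, and the ordering (inner messages first, then the response) is an assumption about how "yields all messages in the response" plays out.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncGenerator, List, Union


@dataclass
class StandInResponse:
    """Hypothetical stand-in for AgentChat's Response object."""

    chat_message: str
    inner_messages: List[str] = field(default_factory=list)


class StandInAgent:
    """Hypothetical agent that only implements on_messages()."""

    async def on_messages(self, messages: List[str]) -> StandInResponse:
        return StandInResponse(chat_message="final", inner_messages=["step 1", "step 2"])

    async def on_messages_stream(
        self, messages: List[str]
    ) -> AsyncGenerator[Union[str, StandInResponse], None]:
        # Fallback: compute the complete response first, then replay it as a stream.
        response = await self.on_messages(messages)
        for inner in response.inner_messages:
            yield inner
        yield response


async def collect_stream() -> List[Union[str, StandInResponse]]:
    agent = StandInAgent()
    return [item async for item in agent.on_messages_stream([])]


# In a notebook, use `await collect_stream()` instead of asyncio.run().
items = asyncio.run(collect_stream())
```

With such a fallback nothing reaches the caller until on_messages() has finished, which is why an agent that produces intermediate messages benefits from its own on_messages_stream().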
CountDownAgent
In this example, we create a simple agent that counts down from a given number to one, streaming a message with the current count at each step, and then responds with a final "Done!" message.
from typing import AsyncGenerator, List, Sequence

from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.base import Response
from autogen_agentchat.messages import AgentEvent, ChatMessage, TextMessage
from autogen_core import CancellationToken


class CountDownAgent(BaseChatAgent):
    def __init__(self, name: str, count: int = 3):
        super().__init__(name, "A simple agent that counts down.")
        self._count = count

    @property
    def produced_message_types(self) -> Sequence[type[ChatMessage]]:
        return (TextMessage,)

    async def on_messages(self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken) -> Response:
        # Calls the on_messages_stream.
        response: Response | None = None
        async for message in self.on_messages_stream(messages, cancellation_token):
            if isinstance(message, Response):
                response = message
        assert response is not None
        return response

    async def on_messages_stream(
        self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken
    ) -> AsyncGenerator[AgentEvent | ChatMessage | Response, None]:
        inner_messages: List[AgentEvent | ChatMessage] = []
        for i in range(self._count, 0, -1):
            msg = TextMessage(content=f"{i}...", source=self.name)
            inner_messages.append(msg)
            yield msg
        # The response is returned at the end of the stream.
        # It contains the final message and all the inner messages.
        yield Response(chat_message=TextMessage(content="Done!", source=self.name), inner_messages=inner_messages)

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        pass


async def run_countdown_agent() -> None:
    # Create a countdown agent.
    countdown_agent = CountDownAgent("countdown")

    # Run the agent with a given task and stream the response.
    async for message in countdown_agent.on_messages_stream([], CancellationToken()):
        if isinstance(message, Response):
            print(message.chat_message.content)
        else:
            print(message.content)


# Use asyncio.run(run_countdown_agent()) when running in a script.
await run_countdown_agent()
3...
2...
1...
Done!
ArithmeticAgent
In this example, we create an agent class that can perform simple arithmetic operations on a given integer. Then, we will use different instances of this agent class in a SelectorGroupChat to transform a given integer into another integer by applying a sequence of arithmetic operations.
The ArithmeticAgent class takes an operator_func that takes an integer and returns an integer after applying an arithmetic operation to it. In its on_messages method, it applies the operator_func to the integer in the input message and returns a response with the result.
from typing import Callable, List, Sequence

from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.base import Response
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.messages import ChatMessage, TextMessage
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient


class ArithmeticAgent(BaseChatAgent):
    def __init__(self, name: str, description: str, operator_func: Callable[[int], int]) -> None:
        super().__init__(name, description=description)
        self._operator_func = operator_func
        self._message_history: List[ChatMessage] = []

    @property
    def produced_message_types(self) -> Sequence[type[ChatMessage]]:
        return (TextMessage,)

    async def on_messages(self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken) -> Response:
        # Update the message history.
        # NOTE: messages may be an empty list, which means the agent was selected previously.
        self._message_history.extend(messages)
        # Parse the number in the last message.
        assert isinstance(self._message_history[-1], TextMessage)
        number = int(self._message_history[-1].content)
        # Apply the operator function to the number.
        result = self._operator_func(number)
        # Create a new message with the result.
        response_message = TextMessage(content=str(result), source=self.name)
        # Update the message history.
        self._message_history.append(response_message)
        # Return the response.
        return Response(chat_message=response_message)

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        pass
Note
The on_messages method may be called with an empty list of messages. This means the agent was called previously and is now being called again without any new messages from the caller. It is therefore important to keep a history of the previous messages received by the agent, and to use that history to generate the response.
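The effect of keeping a history can be seen in a small synchronous sketch. HistoryKeepingOperator is a hypothetical stand-in that uses plain strings instead of ChatMessage objects; it only illustrates the empty-list case and is not the AgentChat implementation.

```python
from typing import Callable, List, Sequence


class HistoryKeepingOperator:
    """Hypothetical stand-in: keeps a history so an empty call can still respond."""

    def __init__(self, operator_func: Callable[[int], int]) -> None:
        self._operator_func = operator_func
        self._history: List[str] = []

    def on_messages(self, messages: Sequence[str]) -> str:
        # The list of new messages is empty when the agent is selected again.
        self._history.extend(messages)
        if not self._history:
            raise ValueError("No messages received yet; nothing to operate on.")
        # Operate on the most recent entry, which may be this agent's own output.
        result = str(self._operator_func(int(self._history[-1])))
        self._history.append(result)
        return result


doubler = HistoryKeepingOperator(lambda x: x * 2)
first = doubler.on_messages(["10"])  # uses the new message
second = doubler.on_messages([])     # falls back to its own previous output
```

Here the second call receives no new messages, so the agent doubles its own previous result. Without the stored history it would have nothing to parse.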
Now we can create a SelectorGroupChat with 5 instances of ArithmeticAgent:
one that adds 1 to the input integer,
one that subtracts 1 from the input integer,
one that multiplies the input integer by 2,
one that divides the input integer by 2 and rounds down to the nearest integer, and
one that returns the input integer unchanged.
We then create a SelectorGroupChat with these agents, and set the appropriate selector settings:
allow the same agent to be selected consecutively to allow for repeated operations, and
customize the selector prompt to tailor the model’s response to the specific task.
async def run_number_agents() -> None:
    # Create agents for number operations.
    add_agent = ArithmeticAgent("add_agent", "Adds 1 to the number.", lambda x: x + 1)
    multiply_agent = ArithmeticAgent("multiply_agent", "Multiplies the number by 2.", lambda x: x * 2)
    subtract_agent = ArithmeticAgent("subtract_agent", "Subtracts 1 from the number.", lambda x: x - 1)
    divide_agent = ArithmeticAgent("divide_agent", "Divides the number by 2 and rounds down.", lambda x: x // 2)
    identity_agent = ArithmeticAgent("identity_agent", "Returns the number as is.", lambda x: x)

    # The termination condition is to stop after 10 messages.
    termination_condition = MaxMessageTermination(10)

    # Create a selector group chat.
    selector_group_chat = SelectorGroupChat(
        [add_agent, multiply_agent, subtract_agent, divide_agent, identity_agent],
        model_client=OpenAIChatCompletionClient(model="gpt-4o"),
        termination_condition=termination_condition,
        allow_repeated_speaker=True,  # Allow the same agent to speak multiple times, necessary for this task.
        selector_prompt=(
            "Available roles:\n{roles}\nTheir job descriptions:\n{participants}\n"
            "Current conversation history:\n{history}\n"
            "Please select the most appropriate role for the next message, and only return the role name."
        ),
    )

    # Run the selector group chat with a given task and stream the response.
    task: List[ChatMessage] = [
        TextMessage(content="Apply the operations to turn the given number into 25.", source="user"),
        TextMessage(content="10", source="user"),
    ]
    stream = selector_group_chat.run_stream(task=task)
    await Console(stream)


# Use asyncio.run(run_number_agents()) when running in a script.
await run_number_agents()
---------- user ----------
Apply the operations to turn the given number into 25.
---------- user ----------
10
---------- multiply_agent ----------
20
---------- add_agent ----------
21
---------- multiply_agent ----------
42
---------- divide_agent ----------
21
---------- add_agent ----------
22
---------- add_agent ----------
23
---------- add_agent ----------
24
---------- add_agent ----------
25
---------- Summary ----------
Number of messages: 10
Finish reason: Maximum number of messages 10 reached, current message count: 10
Total prompt tokens: 0
Total completion tokens: 0
Duration: 2.40 seconds
From the output, we can see that the agents have successfully transformed the input integer from 10 to 25 by choosing appropriate agents that apply the arithmetic operations in sequence.
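As a quick check of the transcript, the sequence of operations can be replayed without any model calls. This is a minimal sketch: the replay helper and the operations table are hypothetical names introduced here, and the speaker order is copied from the output above.

```python
from typing import Callable, Dict, List

# The same operations that were passed to the ArithmeticAgent instances.
operations: Dict[str, Callable[[int], int]] = {
    "add_agent": lambda x: x + 1,
    "subtract_agent": lambda x: x - 1,
    "multiply_agent": lambda x: x * 2,
    "divide_agent": lambda x: x // 2,
    "identity_agent": lambda x: x,
}

# Speaker order copied from the transcript above.
speakers = [
    "multiply_agent",
    "add_agent",
    "multiply_agent",
    "divide_agent",
    "add_agent",
    "add_agent",
    "add_agent",
    "add_agent",
]


def replay(start: int, order: List[str]) -> List[int]:
    """Apply each speaker's operation in turn and record every intermediate value."""
    values: List[int] = []
    current = start
    for speaker in order:
        current = operations[speaker](current)
        values.append(current)
    return values


trace = replay(10, speakers)
print(trace)  # → [20, 21, 42, 21, 22, 23, 24, 25]

# A shorter route to the same target, for comparison.
shorter = replay(10, ["add_agent", "add_agent", "multiply_agent", "add_agent"])
```

The two user messages plus these eight agent messages account for the 10 messages that trigger MaxMessageTermination. The route the selector chose is valid but not minimal: add, add, multiply, add reaches 25 in four steps.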
Using Custom Model Clients in Custom Agents
One of the key features of the AssistantAgent preset in AgentChat is that it takes a model_client argument and can use it in responding to messages. However, in some cases you may want your agent to use a model client that is not currently supported (see supported model clients), or you may need custom model behaviours.
You can accomplish this with a custom agent that implements your custom model client.
The example below walks through a custom agent that uses the Google Gemini SDK directly to respond to messages.
Note: You will need to install the Google Gemini SDK to run this example. You can install it using the following command:
pip install google-genai
# !pip install google-genai
import os
from typing import AsyncGenerator, Sequence

from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.base import Response
from autogen_agentchat.messages import AgentEvent, ChatMessage, TextMessage
from autogen_agentchat.ui import Console
from autogen_core import CancellationToken
from autogen_core.model_context import UnboundedChatCompletionContext
from autogen_core.models import AssistantMessage, RequestUsage, UserMessage
from google import genai
from google.genai import types


class GeminiAssistantAgent(BaseChatAgent):
    def __init__(
        self,
        name: str,
        description: str = "An agent that provides assistance with ability to use tools.",
        model: str = "gemini-1.5-flash-002",
        api_key: str = os.environ["GEMINI_API_KEY"],
        system_message: str
        | None = "You are a helpful assistant that can respond to messages. Reply with TERMINATE when the task has been completed.",
    ):
        super().__init__(name=name, description=description)
        self._model_context = UnboundedChatCompletionContext()
        self._model_client = genai.Client(api_key=api_key)
        self._system_message = system_message
        self._model = model

    @property
    def produced_message_types(self) -> Sequence[type[ChatMessage]]:
        return (TextMessage,)

    async def on_messages(self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken) -> Response:
        final_response = None
        async for message in self.on_messages_stream(messages, cancellation_token):
            if isinstance(message, Response):
                final_response = message

        if final_response is None:
            raise AssertionError("The stream should have returned the final result.")

        return final_response

    async def on_messages_stream(
        self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken
    ) -> AsyncGenerator[AgentEvent | ChatMessage | Response, None]:
        # Add messages to the model context
        for msg in messages:
            await self._model_context.add_message(UserMessage(content=msg.content, source=msg.source))

        # Get conversation history
        history = [
            (msg.source if hasattr(msg, "source") else "system")
            + ": "
            + (msg.content if isinstance(msg.content, str) else "")
            + "\n"
            for msg in await self._model_context.get_messages()
        ]

        # Generate response using Gemini
        response = self._model_client.models.generate_content(
            model=self._model,
            contents=f"History: {history}\nGiven the history, please provide a response",
            config=types.GenerateContentConfig(
                system_instruction=self._system_message,
                temperature=0.3,
            ),
        )

        # Create usage metadata
        usage = RequestUsage(
            prompt_tokens=response.usage_metadata.prompt_token_count,
            completion_tokens=response.usage_metadata.candidates_token_count,
        )

        # Add response to model context
        await self._model_context.add_message(AssistantMessage(content=response.text, source=self.name))

        # Yield the final response
        yield Response(
            chat_message=TextMessage(content=response.text, source=self.name, models_usage=usage),
            inner_messages=[],
        )

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        """Reset the assistant by clearing the model context."""
        await self._model_context.clear()


gemini_assistant = GeminiAssistantAgent("gemini_assistant")
await Console(gemini_assistant.run_stream(task="What is the capital of New York?"))
---------- user ----------
What is the capital of New York?
---------- gemini_assistant ----------
Albany
TERMINATE
TaskResult(messages=[TextMessage(source='user', models_usage=None, content='What is the capital of New York?', type='TextMessage'), TextMessage(source='gemini_assistant', models_usage=RequestUsage(prompt_tokens=46, completion_tokens=5), content='Albany\nTERMINATE\n', type='TextMessage')], stop_reason=None)
In the example above, we have chosen to provide model, api_key and system_message as arguments. You can choose to provide any other arguments that are required by the model client you are using or that fit your application design.
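One design point worth noting about the api_key argument: a default of os.environ["GEMINI_API_KEY"] is evaluated once, when the class is defined, so the definition itself raises KeyError if the variable is unset. A possible variation, sketched below, is to accept None and resolve the key at construction time. The helper resolve_api_key is a hypothetical name introduced for illustration and is not part of AgentChat or the Gemini SDK.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    var_name: str = "GEMINI_API_KEY",
) -> str:
    """Return the explicit key if given, otherwise look it up in the environment."""
    if api_key:
        return api_key
    try:
        return env[var_name]
    except KeyError:
        raise ValueError(
            f"Pass api_key explicitly or set the {var_name} environment variable."
        ) from None


# An explicit key wins; otherwise the environment mapping is consulted at call time.
explicit = resolve_api_key("abc", env={})
from_env = resolve_api_key(env={"GEMINI_API_KEY": "xyz"})
```

An agent constructor could then declare api_key as str | None = None and call the helper inside __init__, so a missing variable is reported only when an agent is actually created.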
Now, let us explore how to use this custom agent as part of a team in AgentChat.
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Create the primary agent.
primary_agent = AssistantAgent(
    "primary",
    model_client=OpenAIChatCompletionClient(model="gpt-4o-mini"),
    system_message="You are a helpful AI assistant.",
)

# Create a critic agent based on our new GeminiAssistantAgent.
gemini_critic_agent = GeminiAssistantAgent(
    "gemini_critic",
    system_message="Provide constructive feedback. Respond with 'APPROVE' when your feedback has been addressed.",
)

# Define a termination condition that stops the task if the critic approves or after 10 messages.
termination = TextMentionTermination("APPROVE") | MaxMessageTermination(10)

# Create a team with the primary and critic agents.
team = RoundRobinGroupChat([primary_agent, gemini_critic_agent], termination_condition=termination)

await Console(team.run_stream(task="Write a Haiku poem with 4 lines about the fall season."))
---------- user ----------
Write a Haiku poem with 4 lines about the fall season.
---------- primary ----------
Crimson leaves cascade,
Whispering winds sing of change,
Chill wraps the fading,
Nature's quilt, rich and warm.
---------- gemini_critic ----------
The poem is good, but it has four lines instead of three. A haiku must have three lines with a 5-7-5 syllable structure. The content is evocative of autumn, but the form is incorrect. Please revise to adhere to the haiku's syllable structure.
---------- primary ----------
Thank you for your feedback! Here’s a revised haiku that follows the 5-7-5 syllable structure:
Crimson leaves drift down,
Chill winds whisper through the gold,
Autumn’s breath is near.
---------- gemini_critic ----------
The revised haiku is much improved. It correctly follows the 5-7-5 syllable structure and maintains the evocative imagery of autumn. APPROVE
TaskResult(messages=[TextMessage(source='user', models_usage=None, content='Write a Haiku poem with 4 lines about the fall season.', type='TextMessage'), TextMessage(source='primary', models_usage=RequestUsage(prompt_tokens=33, completion_tokens=31), content="Crimson leaves cascade, \nWhispering winds sing of change, \nChill wraps the fading, \nNature's quilt, rich and warm.", type='TextMessage'), TextMessage(source='gemini_critic', models_usage=RequestUsage(prompt_tokens=86, completion_tokens=60), content="The poem is good, but it has four lines instead of three. A haiku must have three lines with a 5-7-5 syllable structure. The content is evocative of autumn, but the form is incorrect. Please revise to adhere to the haiku's syllable structure.\n", type='TextMessage'), TextMessage(source='primary', models_usage=RequestUsage(prompt_tokens=141, completion_tokens=49), content='Thank you for your feedback! Here’s a revised haiku that follows the 5-7-5 syllable structure:\n\nCrimson leaves drift down, \nChill winds whisper through the gold, \nAutumn’s breath is near.', type='TextMessage'), TextMessage(source='gemini_critic', models_usage=RequestUsage(prompt_tokens=211, completion_tokens=32), content='The revised haiku is much improved. It correctly follows the 5-7-5 syllable structure and maintains the evocative imagery of autumn. APPROVE\n', type='TextMessage')], stop_reason="Text 'APPROVE' mentioned")
The section above demonstrates two important points:
We have developed a custom agent that uses the Google Gemini SDK to respond to messages.
This custom agent can be used as part of the broader AgentChat ecosystem, in this case as a participant in a RoundRobinGroupChat, as long as it inherits from BaseChatAgent.
Making the Custom Agent Declarative
AutoGen provides a Component interface for making the configuration of components serializable to a declarative format. This is useful for saving and loading configurations, and for sharing configurations with others.
We accomplish this by inheriting from the Component class and implementing the _from_config and _to_config methods.
The declarative class can be serialized to a JSON format using the dump_component method, and deserialized from a JSON format using the load_component method.
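The round trip that _to_config and _from_config enable can be illustrated with a dependency-free sketch. It uses a dataclass and the json module as stand-ins for the pydantic config model and the Component machinery; StandInConfig and StandInAgent are hypothetical names, not AutoGen classes.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StandInConfig:
    """Hypothetical stand-in for a pydantic config model."""

    name: str
    model: str = "gemini-1.5-flash-002"
    system_message: Optional[str] = None


class StandInAgent:
    """Hypothetical agent exposing the two hooks a declarative component needs."""

    def __init__(
        self,
        name: str,
        model: str = "gemini-1.5-flash-002",
        system_message: Optional[str] = None,
    ) -> None:
        self.name = name
        self._model = model
        self._system_message = system_message

    def _to_config(self) -> StandInConfig:
        # Capture only the state needed to rebuild the agent.
        return StandInConfig(name=self.name, model=self._model, system_message=self._system_message)

    @classmethod
    def _from_config(cls, config: StandInConfig) -> "StandInAgent":
        return cls(name=config.name, model=config.model, system_message=config.system_message)


original = StandInAgent("gemini_assistant", system_message="Be brief.")
serialized = json.dumps(asdict(original._to_config()), indent=2)
restored = StandInAgent._from_config(StandInConfig(**json.loads(serialized)))
```

Only fields that appear in the config survive the round trip. This mirrors the config class in the example that follows, which has no api_key field, so the key never appears in the serialized output.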
import os
from typing import AsyncGenerator, Sequence

from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.base import Response
from autogen_agentchat.messages import AgentEvent, ChatMessage, TextMessage
from autogen_core import CancellationToken, Component
from autogen_core.model_context import UnboundedChatCompletionContext
from autogen_core.models import AssistantMessage, RequestUsage, UserMessage
from google import genai
from google.genai import types
from pydantic import BaseModel
from typing_extensions import Self


class GeminiAssistantAgentConfig(BaseModel):
    name: str
    description: str = "An agent that provides assistance with ability to use tools."
    model: str = "gemini-1.5-flash-002"
    system_message: str | None = None


class GeminiAssistantAgent(BaseChatAgent, Component[GeminiAssistantAgentConfig]):  # type: ignore[no-redef]
    component_config_schema = GeminiAssistantAgentConfig
    # component_provider_override = "mypackage.agents.GeminiAssistantAgent"

    def __init__(
        self,
        name: str,
        description: str = "An agent that provides assistance with ability to use tools.",
        model: str = "gemini-1.5-flash-002",
        api_key: str = os.environ["GEMINI_API_KEY"],
        system_message: str
        | None = "You are a helpful assistant that can respond to messages. Reply with TERMINATE when the task has been completed.",
    ):
        super().__init__(name=name, description=description)
        self._model_context = UnboundedChatCompletionContext()
        self._model_client = genai.Client(api_key=api_key)
        self._system_message = system_message
        self._model = model

    @property
    def produced_message_types(self) -> Sequence[type[ChatMessage]]:
        return (TextMessage,)

    async def on_messages(self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken) -> Response:
        final_response = None
        async for message in self.on_messages_stream(messages, cancellation_token):
            if isinstance(message, Response):
                final_response = message

        if final_response is None:
            raise AssertionError("The stream should have returned the final result.")

        return final_response

    async def on_messages_stream(
        self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken
    ) -> AsyncGenerator[AgentEvent | ChatMessage | Response, None]:
        # Add messages to the model context
        for msg in messages:
            await self._model_context.add_message(UserMessage(content=msg.content, source=msg.source))

        # Get conversation history
        history = [
            (msg.source if hasattr(msg, "source") else "system")
            + ": "
            + (msg.content if isinstance(msg.content, str) else "")
            + "\n"
            for msg in await self._model_context.get_messages()
        ]

        # Generate response using Gemini
        response = self._model_client.models.generate_content(
            model=self._model,
            contents=f"History: {history}\nGiven the history, please provide a response",
            config=types.GenerateContentConfig(
                system_instruction=self._system_message,
                temperature=0.3,
            ),
        )

        # Create usage metadata
        usage = RequestUsage(
            prompt_tokens=response.usage_metadata.prompt_token_count,
            completion_tokens=response.usage_metadata.candidates_token_count,
        )

        # Add response to model context
        await self._model_context.add_message(AssistantMessage(content=response.text, source=self.name))

        # Yield the final response
        yield Response(
            chat_message=TextMessage(content=response.text, source=self.name, models_usage=usage),
            inner_messages=[],
        )

    async def on_reset(self, cancellation_token: CancellationToken) -> None:
        """Reset the assistant by clearing the model context."""
        await self._model_context.clear()

    @classmethod
    def _from_config(cls, config: GeminiAssistantAgentConfig) -> Self:
        return cls(
            name=config.name, description=config.description, model=config.model, system_message=config.system_message
        )

    def _to_config(self) -> GeminiAssistantAgentConfig:
        return GeminiAssistantAgentConfig(
            name=self.name,
            description=self.description,
            model=self._model,
            system_message=self._system_message,
        )
Now that the required methods are implemented, we can dump the custom agent to a JSON format and then load the agent back from that JSON format.
Note: You should set the component_provider_override class variable to the full import path of the custom agent class (module path plus class name), e.g. mypackage.agents.GeminiAssistantAgent. This is used by the load_component method to determine how to instantiate the class.
gemini_assistant = GeminiAssistantAgent("gemini_assistant")
config = gemini_assistant.dump_component()
print(config.model_dump_json(indent=2))
loaded_agent = GeminiAssistantAgent.load_component(config)
print(loaded_agent)
{
"provider": "__main__.GeminiAssistantAgent",
"component_type": "agent",
"version": 1,
"component_version": 1,
"description": null,
"label": "GeminiAssistantAgent",
"config": {
"name": "gemini_assistant",
"description": "An agent that provides assistance with ability to use tools.",
"model": "gemini-1.5-flash-002",
"system_message": "You are a helpful assistant that can respond to messages. Reply with TERMINATE when the task has been completed."
}
}
<__main__.GeminiAssistantAgent object at 0x11a5c5a90>
Next Steps
So far, we have seen how to create custom agents, add custom model clients to agents, and make custom agents declarative. There are a few ways in which this basic sample can be extended:
Extend the Gemini model client to handle function calling similar to the AssistantAgent class. https://ai.google.dev/gemini-api/docs/function-calling
Implement a package with a custom agent and experiment with using its declarative format in a tool like AutoGen Studio.