Agents#

An agent is a software entity that communicates via messages, maintains its own state, and performs actions in response to received messages or changes in its state.

Warning

AgentChat is a work in progress. APIs may change in future releases.

AgentChat provides a set of preset agents, each of which responds to received messages in a different way.

Each agent inherits from a BaseChatAgent class with a few generic properties:

  • name: The name of the agent. The team uses it to identify the agent, so it must be unique within the team.

  • description: The description of the agent. This is used by the team to make decisions about which agents to use. The description should detail the agent’s capabilities and how to interact with it.

Tip

How do agents send and receive messages?

AgentChat is built on the autogen-core package, which handles the details of sending and receiving messages. autogen-core provides a runtime environment, which facilitates communication between agents (message sending and delivery), manages their identities and lifecycles, and enforces security and privacy boundaries. AgentChat handles the details of instantiating a runtime and registering agents with the runtime.

To learn more about the runtime in autogen-core, see the autogen-core documentation on agents and runtime.

Each agent also implements an on_messages() method that defines the behavior of the agent in response to a message.
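
As a rough illustration of this interface, the sketch below models an agent using only the standard library. SketchMessage, SketchAgent, and EchoAgent are hypothetical stand-ins, not AgentChat classes, and the cancellation token that the real on_messages() accepts is left out for brevity.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SketchMessage:
    """Hypothetical stand-in for a chat message: who sent it and what it says."""

    source: str
    content: str


class SketchAgent(ABC):
    """Hypothetical stand-in for a chat agent base class (not the AgentChat API)."""

    def __init__(self, name: str, description: str) -> None:
        self.name = name  # identifies the agent within a team
        self.description = description  # tells a team what the agent can do

    @abstractmethod
    async def on_messages(self, messages: Sequence[SketchMessage]) -> SketchMessage:
        """Return this agent's response to the received messages."""


class EchoAgent(SketchAgent):
    async def on_messages(self, messages: Sequence[SketchMessage]) -> SketchMessage:
        # Respond to the most recent message by repeating its content.
        return SketchMessage(source=self.name, content=f"Echo: {messages[-1].content}")


echo_agent = EchoAgent("echo", "Repeats the last message it receives.")
reply = asyncio.run(echo_agent.on_messages([SketchMessage(source="user", content="hello")]))
print(reply)
```

In a notebook, where an event loop is already running, asyncio.run(...) would be replaced with a plain await, as in the cells of this guide.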

To begin, let us import the required classes and set up a model client that will be used by agents.

import logging

from autogen_agentchat import EVENT_LOGGER_NAME
from autogen_agentchat.agents import ToolUseAssistantAgent
from autogen_agentchat.logging import ConsoleLogHandler
from autogen_agentchat.messages import TextMessage
from autogen_core.base import CancellationToken
from autogen_core.components.models import OpenAIChatCompletionClient
from autogen_core.components.tools import FunctionTool

logger = logging.getLogger(EVENT_LOGGER_NAME)
logger.addHandler(ConsoleLogHandler())
logger.setLevel(logging.INFO)


# Create an OpenAI model client.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o-2024-08-06",
    # api_key="sk-...", # Optional if you have an OPENAI_API_KEY env variable set.
)

ToolUseAssistantAgent#

This agent responds to messages by making appropriate tool or function calls.

Tip

Understanding Tool Calling

Large Language Models (LLMs) are typically limited to generating text or code responses. However, many complex tasks benefit from the ability to use external tools that perform specific actions, such as fetching data from APIs or databases.

To address this limitation, modern LLMs can now accept a list of available tool schemas (descriptions of tools and their arguments) and generate a tool call message. This capability is known as Tool Calling or Function Calling and is becoming a popular pattern in building intelligent agent-based applications.
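
The flow described above can be sketched without any model or agent library. In the example below, describe_tool is a hypothetical helper that builds a simplified schema (real tool schemas are usually richer JSON Schema documents), get_weather is written as a plain synchronous function, and the tool call is hard-coded where a model's reply would normally appear.

```python
import inspect
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    return f"The weather in {city} is 72 degrees and Sunny."


def describe_tool(func: Callable[..., Any], description: str) -> Dict[str, Any]:
    """Build a simplified schema: the tool's name, description, and parameter types."""
    parameters = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in inspect.signature(func).parameters.items()
    }
    return {"name": func.__name__, "description": description, "parameters": parameters}


# Registry that maps tool names to the functions that implement them.
tools = {"get_weather": get_weather}
schema = describe_tool(get_weather, "Get the weather for a city")

# A model that supports tool calling would receive `schema` and could reply with
# a structured call like this one (hard-coded here; no model is involved).
tool_call = json.loads('{"name": "get_weather", "arguments": {"city": "Paris"}}')

# The application, not the model, looks up the tool and runs it.
result = tools[tool_call["name"]](**tool_call["arguments"])
print(schema)
print(result)
```

The point of the split is that the model only produces the name and arguments of the call; executing the function and returning its result remain the application's job.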

For more information on tool calling, refer to the documentation from OpenAI and Anthropic.

To set up a ToolUseAssistantAgent in AgentChat, follow these steps:

  1. Define the tool, typically as a Python function.

  2. Wrap the function in the FunctionTool class from the autogen-core package. This ensures the function schema can be correctly parsed and used for tool calling.

  3. Attach the tool to the agent.

async def get_weather(city: str) -> str:
    return f"The weather in {city} is 72 degrees and Sunny."


get_weather_tool = FunctionTool(get_weather, description="Get the weather for a city")

tool_use_agent = ToolUseAssistantAgent(
    "tool_use_agent",
    system_message="You are a helpful assistant that solves tasks by only using your tools.",
    model_client=model_client,
    registered_tools=[get_weather_tool],
)
tool_result = await tool_use_agent.on_messages(
    messages=[
        TextMessage(content="What is the weather right now in France?", source="user"),
    ],
    cancellation_token=CancellationToken(),
)
print(tool_result)
source='tool_use_agent' content="Could you please specify a city in France for which you'd like to get the current weather?"

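In this run, the ToolUseAssistantAgent did not call the tool. The question names a country, while get_weather expects a city, so the agent replied with a text message asking for one. A request that names a city, such as Paris, gives the model what it needs to call get_weather.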

CodeExecutorAgent#

This agent preset extracts code snippets from received messages, executes them, and returns the output. It is typically used in a team that also includes a CodingAssistantAgent: the CodingAssistantAgent generates code snippets, and the CodeExecutorAgent executes them to make progress on a task.

Note

It is recommended that the CodeExecutorAgent use a Docker container to execute code snippets, so that they run in a safe, isolated environment. To use Docker, your environment must have Docker installed and running. If you do not have Docker installed, you can install it from the Docker website or skip the next cell.

In the code snippet below, we show how to set up a CodeExecutorAgent that uses the DockerCommandLineCodeExecutor class to execute code snippets in a Docker container. The work_dir parameter indicates where all executed files are first saved locally before being executed in the Docker container.

from autogen_agentchat.agents import CodeExecutorAgent
from autogen_ext.code_executors import DockerCommandLineCodeExecutor

async with DockerCommandLineCodeExecutor(work_dir="coding") as code_executor:  # type: ignore[syntax]
    code_executor_agent = CodeExecutorAgent("code_executor", code_executor=code_executor)
    code_execution_result = await code_executor_agent.on_messages(
        messages=[
            TextMessage(content="Here is some code \n ```python print('Hello world') \n``` ", source="user"),
        ],
        cancellation_token=CancellationToken(),
    )
    print(code_execution_result)
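source='code_executor' content='No code blocks found in the thread.'

The agent reports no code blocks here, most likely because the message places print('Hello world') on the same line as the opening ```python fence. Moving the code to its own line after the fence should let the agent extract and run it.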
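
To see how the layout of a fenced block can affect extraction, the sketch below applies a simple regular expression to two versions of the message. The pattern and the extract_code_blocks helper are assumptions made for illustration; the pattern AgentChat actually uses may differ.

```python
import re
from typing import List, Tuple

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Assumed pattern: opening fence, optional language tag, a newline, then the
# code up to the closing fence.
CODE_BLOCK = re.compile(FENCE + r"(\w+)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(text: str) -> List[Tuple[str, str]]:
    """Return (language, code) pairs for each fenced block found in text."""
    return [(lang, code.strip("\n")) for lang, code in CODE_BLOCK.findall(text)]


# Same shape as the message above: the code shares a line with the opening fence.
inline = f"Here is some code \n {FENCE}python print('Hello world') \n{FENCE} "
# Here the code starts on the line after the opening fence.
multiline = f"Here is some code\n{FENCE}python\nprint('Hello world')\n{FENCE}"

print(extract_code_blocks(inline))
print(extract_code_blocks(multiline))
```

Under this assumed pattern, only the second message yields a block, because the pattern requires a newline directly after the language tag.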

Building Custom Agents#

In many cases, you may have agents with custom behaviors that do not fall into any of the preset agent categories. In such cases, you can build custom agents by subclassing the BaseChatAgent class and implementing the on_messages() method.

A common example is an agent that can be part of a team but is primarily driven by human input. Other examples include agents that respond with fixed text or with specific tool or function calls.

In the example below, we show how to implement a UserProxyAgent, an agent that asks the user to enter some text and then returns that text as its response.

import asyncio
from typing import Sequence

from autogen_agentchat.agents import BaseChatAgent
from autogen_agentchat.messages import (
    ChatMessage,
    StopMessage,
    TextMessage,
)


class UserProxyAgent(BaseChatAgent):
    def __init__(self, name: str) -> None:
        super().__init__(name, "A human user.")

    async def on_messages(self, messages: Sequence[ChatMessage], cancellation_token: CancellationToken) -> ChatMessage:
        user_input = await asyncio.get_event_loop().run_in_executor(None, input, "Enter your response: ")
        if "TERMINATE" in user_input:
            return StopMessage(content="User has terminated the conversation.", source=self.name)
        return TextMessage(content=user_input, source=self.name)


user_proxy_agent = UserProxyAgent(name="user_proxy_agent")

user_proxy_agent_result = await user_proxy_agent.on_messages(
    messages=[
        TextMessage(content="What is the weather right now in France?", source="user"),
    ],
    cancellation_token=CancellationToken(),
)
print(user_proxy_agent_result)
source='user_proxy_agent' content='Hello there'
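
The UserProxyAgent above calls input() through run_in_executor rather than directly. Because input() blocks its thread, calling it inside a coroutine would stall the event loop until the user responds. The standard-library sketch below illustrates the difference; slow_input is a hypothetical stand-in for input(), and count_ticks exists only to show that the loop keeps running while the blocking call is in progress.

```python
import asyncio
import time
from typing import Tuple


def slow_input(prompt: str) -> str:
    """Hypothetical stand-in for input(): blocks its thread, then returns text."""
    time.sleep(0.2)
    return "Hello there"


async def count_ticks(stop: asyncio.Event) -> int:
    """Count how many times the event loop gets to run before stop is set."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def main() -> Tuple[str, int]:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    # Run the blocking call in the default thread pool so the loop stays free.
    loop = asyncio.get_running_loop()
    user_input = await loop.run_in_executor(None, slow_input, "Enter your response: ")
    stop.set()
    return user_input, await ticker


user_input, ticks = asyncio.run(main())
print(user_input, ticks)
```

The tick count varies from run to run, but it is greater than one whenever the event loop was able to do other work while slow_input was blocked.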

Summary#

So far, we have learned a few key concepts:

  • How to define agents

  • How to send messages to agents by calling the on_messages() method on the BaseChatAgent class and viewing the agent’s response

  • An overview of the different types of agents available in AgentChat

  • How to build custom agents

However, the ability to address complex tasks is often best served by groups of agents that interact as a team. Let us review how to build these teams.