Agents#

AutoGen AgentChat provides a set of preset Agents, each with variations in how an agent might respond to messages. All agents share the following attributes and methods:

See autogen_agentchat.messages for more information on AgentChat message types.

Assistant Agent#

AssistantAgent is a built-in agent that uses a language model and has the ability to use tools.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient


# Define a tool that searches the web for information.
async def web_search(query: str) -> str:
    """Find information on the web"""
    return "AutoGen is a programming framework for building multi-agent applications."


# Create an agent that uses the OpenAI GPT-4o model.
model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    # api_key="YOUR_API_KEY",
)
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    tools=[web_search],
    system_message="Use tools to solve tasks.",
)

Getting Responses#

We can use the on_messages() method to get the agent response to a given message.

async def assistant_run() -> None:
    response = await agent.on_messages(
        [TextMessage(content="Find information on AutoGen", source="user")],
        cancellation_token=CancellationToken(),
    )
    print(response.inner_messages)
    print(response.chat_message)


# Use asyncio.run(assistant_run()) when running in a script.
await assistant_run()
[ToolCallRequestEvent(source='assistant', models_usage=RequestUsage(prompt_tokens=61, completion_tokens=15), content=[FunctionCall(id='call_hqVC7UJUPhKaiJwgVKkg66ak', arguments='{"query":"AutoGen"}', name='web_search')]), ToolCallExecutionEvent(source='assistant', models_usage=None, content=[FunctionExecutionResult(content='AutoGen is a programming framework for building multi-agent applications.', call_id='call_hqVC7UJUPhKaiJwgVKkg66ak')])]
source='assistant' models_usage=RequestUsage(prompt_tokens=92, completion_tokens=14) content='AutoGen is a programming framework designed for building multi-agent applications.'

The call to the on_messages() method returns a Response that contains the agent’s final response in the chat_message attribute, as well as a list of inner messages in the inner_messages attribute, which stores the agent’s “thought process” that led to the final response.

Note

It is important to note that on_messages() will update the internal state of the agent – it will add the messages to the agent’s history. So you should not repeatedly call this method with the same messages if you want to carry on a conversation with the agent.

User Proxy Agent#

UserProxyAgent is a built-in agent that provides one way for a user to intervene in the process. This agent will put the team in a temporary blocking state, and thus any exceptions or runtime failures while in the blocked state will result in a deadlock. It is strongly advised that this agent be coupled with a timeout mechanic and that all errors and exceptions emanating from it are handled.

from autogen_agentchat.agents import UserProxyAgent


async def user_proxy_run() -> None:
    user_proxy_agent = UserProxyAgent("user_proxy")
    response = await user_proxy_agent.on_messages(
        [TextMessage(content="What is your name? ", source="user")], cancellation_token=CancellationToken()
    )
    print(f"Your name is {response.chat_message.content}")


# Use asyncio.run(user_proxy_run()) when running in a script.
await user_proxy_run()

The User Proxy agent is ideally used for on-demand human-in-the-loop interactions for scenarios such as Just In Time approvals, human feedback, alerts, etc. For slower user interactions, consider terminating the session using a termination condition and start another one from run or run_stream with another message.

Streaming Messages#

We can also stream each message as it is generated by the agent by using the on_messages_stream() method, and use Console to print the messages as they appear to the console.

from autogen_agentchat.ui import Console


async def assistant_run_stream() -> None:
    # Option 1: read each message from the stream (as shown in the previous example).
    # async for message in agent.on_messages_stream(
    #     [TextMessage(content="Find information on AutoGen", source="user")],
    #     cancellation_token=CancellationToken(),
    # ):
    #     print(message)

    # Option 2: use Console to print all messages as they appear.
    await Console(
        agent.on_messages_stream(
            [TextMessage(content="Find information on AutoGen", source="user")],
            cancellation_token=CancellationToken(),
        )
    )


# Use asyncio.run(assistant_run_stream()) when running in a script.
await assistant_run_stream()
---------- assistant ----------
[FunctionCall(id='call_fSp5iTGVm2FKw5NIvfECSqNd', arguments='{"query":"AutoGen information"}', name='web_search')]
[Prompt tokens: 61, Completion tokens: 16]
---------- assistant ----------
[FunctionExecutionResult(content='AutoGen is a programming framework for building multi-agent applications.', call_id='call_fSp5iTGVm2FKw5NIvfECSqNd')]
---------- assistant ----------
AutoGen is a programming framework designed for building multi-agent applications. If you need more detailed information or specific aspects about AutoGen, feel free to ask!
[Prompt tokens: 93, Completion tokens: 32]
---------- Summary ----------
Number of inner messages: 2
Total prompt tokens: 154
Total completion tokens: 48
Duration: 4.30 seconds

The on_messages_stream() method returns an asynchronous generator that yields each inner message generated by the agent, with the final item being the response message in the chat_message attribute.

From the messages, you can observe that the assistant agent utilized the web_search tool to gather information and responded based on the search results.

Understanding Tool Calling#

Large Language Models (LLMs) are typically limited to generating text or code responses. However, many complex tasks benefit from the ability to use external tools that perform specific actions, such as fetching data from APIs or databases.

To address this limitation, modern LLMs can now accept a list of available tool schemas (descriptions of tools and their arguments) and generate a tool call message. This capability is known as Tool Calling or Function Calling and is becoming a popular pattern in building intelligent agent-based applications.

For more information on tool calling, refer to the documentation from OpenAI and Anthropic.

Other Preset Agents#

The following preset agents are available:

  • CodeExecutorAgent: An agent that can execute code.

  • OpenAIAssistantAgent: An agent that is backed by an OpenAI Assistant, with ability to use custom tools.

  • MultimodalWebSurfer: A multi-modal agent that can search the web and visit web pages for information.

  • FileSurfer: An agent that can search and browse local files for information.

  • VideoSurfer: An agent that can watch videos for information.

Next Step#

Having explored the usage of the AssistantAgent, we can now proceed to the next section to learn about the teams feature in AgentChat.