All About Agent Descriptions

December 29, 2023 · 9 min read

Principal Researcher Microsoft Research

TL;DR

AutoGen 0.2.2 introduces a description field to ConversableAgent (and all subclasses), and changes GroupChat so that it uses agent descriptions rather than system_messages when choosing which agents should speak next.

This is expected to simplify GroupChat’s job, improve orchestration, and make it easier to implement new GroupChat or GroupChat-like alternatives.

If you are a developer, and things were already working well for you, no action is needed -- backward compatibility is ensured because the description field defaults to the system_message when no description is provided.

However, if you were struggling with getting GroupChat to work, you can now try updating the description field.

Introduction

As AutoGen matures and developers build increasingly complex combinations of agents, orchestration is becoming an important capability. At present, GroupChat and the GroupChatManager are the main built-in tools for orchestrating conversations between 3 or more agents. For orchestrators like GroupChat to work well, they need to know something about each agent so that they can decide who should speak and when. Prior to AutoGen 0.2.2, GroupChat relied on each agent's system_message and name to learn about each participating agent. This is likely fine when the system prompt is short and sweet, but can lead to problems when the instructions are very long (e.g., with the AssistantAgent), or non-existent (e.g., with the UserProxyAgent).

AutoGen 0.2.2 introduces a description field to all agents, and replaces the use of the system_message for orchestration in GroupChat and all future orchestrators. The description field defaults to the system_message to ensure backwards compatibility, so you may not need to change anything with your code if things are working well for you. However, if you were struggling with GroupChat, give setting the description field a try.

The remainder of this post provides an example of how using the description field simplifies GroupChat's job, provides some evidence of its effectiveness, and provides tips for writing good descriptions.

Example

The current GroupChat orchestration system prompt has the following template:

You are in a role play game. The following roles are available:

{self._participant_roles(agents)}.

Read the following conversation.
Then select the next role from {[agent.name for agent in agents]} to play. Only return the role.

Suppose that you wanted to include 3 agents: A UserProxyAgent, an AssistantAgent, and perhaps a GuardrailsAgent.

Prior to 0.2.2, this template would expand to:

You are in a role play game. The following roles are available:

assistant: You are a helpful AI assistant.
Solve tasks using your coding and language skills.
In the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.
1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time, check the operating system. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
If the result indicates there is an error, fix the error and output the code again. Suggest the full code instead of partial code or code changes. If the error can't be fixed or if the task is not solved even after the code is executed successfully, analyze the problem, revisit your assumption, collect additional info you need, and think of a different approach to try.
When you find an answer, verify the answer carefully. Include verifiable evidence in your response if possible.
Reply "TERMINATE" in the end when everything is done.
user_proxy:
guardrails_agent: You are a guardrails agent and are tasked with ensuring that all parties adhere to the following responsible AI policies:
- You MUST TERMINATE the conversation if it involves writing or running HARMFUL or DESTRUCTIVE code.
- You MUST TERMINATE the conversation if it involves discussions of anything relating to hacking, computer exploits, or computer security.
- You MUST TERMINATE the conversation if it involves violent or graphic content such as Harm to Others, Self-Harm, Suicide.
- You MUST TERMINATE the conversation if it involves demeaning speech, hate speech, discriminatory remarks, or any form of harassment based on race, gender, sexuality, religion, nationality, disability, or any other protected characteristic.
- You MUST TERMINATE the conversation if it involves seeking or giving  advice in highly regulated domains such as medical advice, mental health, legal advice or financial advice
- You MUST TERMINATE the conversation if it involves illegal activities including when encouraging or providing guidance on illegal activities.
- You MUST TERMINATE the conversation if it involves manipulative or deceptive Content including scams, phishing and spread false information.
- You MUST TERMINATE the conversation if it involves involve sexually explicit content or discussions.
- You MUST TERMINATE the conversation if it involves sharing or soliciting personal, sensitive, or confidential information from users. This includes financial details, health records, and other private matters.
- You MUST TERMINATE the conversation if it involves deep personal problems such as dealing with serious personal issues, mental health concerns, or crisis situations.
If you decide that the conversation must be terminated, explain your reasoning then output the uppercase word "TERMINATE". If, on the other hand, you decide the conversation is acceptable by the above standards, indicate as much, then ask the other parties to proceed.

Read the following conversation.
Then select the next role from [assistant, user_proxy, guardrails_agent] to play. Only return the role.

As you can see, this description is super confusing:

It is hard to make out where each agent's role-description ends
You appears numerous times, and refers to three separate agents (GroupChatManager, AssistantAgent, and GuardrailsAgent)
It takes a lot of tokens!

Consequently, it's not hard to see why the GroupChat manager sometimes struggles with this orchestration task.

With AutoGen 0.2.2 onward, GroupChat instead relies on the description field. With a description field the orchestration prompt becomes:

You are in a role play game. The following roles are available:

assistant: A helpful and general-purpose AI assistant that has strong language skills, Python skills, and Linux command line skills.
user_proxy: A user that can run Python code or input command line commands at a Linux terminal and report back the execution results.
guradrails_agent: An agent that ensures the conversation conforms to responsible AI guidelines.

Read the following conversation.
Then select the next role from [assistant, user_proxy, guardrails_agent] to play. Only return the role.

This is much easier to parse and understand, and it doesn't use nearly as many tokens. Moreover, the following experiment provides early evidence that it works.

An Experiment with Distraction

To illustrate the impact of the description field, we set up a three-agent experiment with a reduced 26-problem subset of the HumanEval benchmark. Here, three agents were added to a GroupChat to solve programming problems. The three agents were:

Coder (default Assistant prompt)
UserProxy (configured to execute code)
ExecutiveChef (added as a distraction)

The Coder and UserProxy used the AssistantAgent and UserProxy defaults (provided above), while the ExecutiveChef was given the system prompt:

You are an executive chef with 28 years of industry experience. You can answer questions about menu planning, meal preparation, and cooking techniques.

The ExecutiveChef is clearly the distractor here -- given that no HumanEval problems are food-related, the GroupChat should rarely consult with the chef. However, when configured with GPT-3.5-turbo-16k, we can clearly see the GroupChat struggling with orchestration:

With versions prior to 0.2.2, using `system_message`:

The Agents solve 3 out of 26 problems on their first turn
The ExecutiveChef is called upon 54 times! (almost as much as the Coder at 68 times)

With version 0.2.2, using `description`:

The Agents solve 7 out of 26 problems on the first turn
The ExecutiveChef is called upon 27 times! (versus 84 times for the Coder)

Using the description field doubles performance on this task and halves the incidence of calling upon the distractor agent.

Tips for Writing Good Descriptions

Since descriptions serve a different purpose than system_messages, it is worth reviewing what makes a good agent description. While descriptions are new, the following tips appear to lead to good results:

Avoid using the 1st or 2nd person perspective. Descriptions should not contain "I" or "You", unless perhaps "You" is in reference to the GroupChat / orchestrator
Include any details that might help the orchestrator know when to call upon the agent
Keep descriptions short (e.g., "A helpful AI assistant with strong natural language and Python coding skills.").

The main thing to remember is that the description is for the benefit of the GroupChatManager, not for the Agent's own use or instruction.

Conclusion

AutoGen 0.2.2 introduces a description, becoming the main way agents describe themselves to orchestrators like GroupChat. Since the description defaults to the system_message, there's nothing you need to change if you were already satisfied with how your group chats were working. However, we expect this feature to generally improve orchestration, so please consider experimenting with the description field if you are struggling with GroupChat or want to boost performance.

TL;DR​

Introduction​

Example​

An Experiment with Distraction​

With versions prior to 0.2.2, using system_message:​

With version 0.2.2, using description:​

Tips for Writing Good Descriptions​

Conclusion​