{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Multi-Agent Debate\n", "\n", "Multi-Agent Debate is a multi-agent design pattern that simulates a multi-turn interaction\n", "in which, in each turn, agents exchange their responses with each other and refine\n", "their responses based on the responses from other agents.\n", "\n", "This example shows an implementation of the multi-agent debate pattern for solving\n", "math problems from the [GSM8K benchmark](https://huggingface.co/datasets/openai/gsm8k).\n", "\n", "There are two types of agents in this pattern: solver agents and an aggregator agent.\n", "The solver agents are connected in a sparse manner following the technique described in\n", "[Improving Multi-Agent Debate with Sparse Communication Topology](https://arxiv.org/abs/2406.11776).\n", "The solver agents are responsible for solving math problems and exchanging responses with each other.\n", "The aggregator agent is responsible for distributing math problems to the solver agents,\n", "waiting for their final responses, and aggregating the responses to get the final answer.\n", "\n", "The pattern works as follows:\n", "1. The user sends a math problem to the aggregator agent.\n", "2. The aggregator agent distributes the problem to the solver agents.\n", "3. Each solver agent processes the problem and publishes a response to its neighbors.\n", "4. Each solver agent uses the responses from its neighbors to refine its response, and publishes a new response.\n", "5. Repeat step 4 for a fixed number of rounds. In the final round, each solver agent publishes a final response.\n", "6. 
The aggregator agent uses majority voting to aggregate the final responses from all solver agents into a final answer, and publishes the answer.\n", "\n", "We use the broadcast API, i.e., {py:meth}`~autogen_core.BaseAgent.publish_message`,\n", "together with topics and subscriptions to implement the communication topology.\n", "Read about [Topics and Subscriptions](../core-concepts/topic-and-subscription.md) to understand how they work." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import re\n", "from dataclasses import dataclass\n", "from typing import Dict, List\n", "\n", "from autogen_core import (\n", "    DefaultTopicId,\n", "    MessageContext,\n", "    RoutedAgent,\n", "    SingleThreadedAgentRuntime,\n", "    TypeSubscription,\n", "    default_subscription,\n", "    message_handler,\n", ")\n", "from autogen_core.models import (\n", "    AssistantMessage,\n", "    ChatCompletionClient,\n", "    LLMMessage,\n", "    SystemMessage,\n", "    UserMessage,\n", ")\n", "from autogen_ext.models.openai import OpenAIChatCompletionClient" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Message Protocol\n", "\n", "First, we define the messages used by the agents.\n", "`Question` and `Answer` carry the user's problem and the final result,\n", "and `SolverRequest` is the prompt sent to a solver agent.\n", "`IntermediateSolverResponse` is the message exchanged among the solver agents in each round,\n", "and `FinalSolverResponse` is the message published by the solver agents in the final round." 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "@dataclass\n", "class Question:\n", "    content: str\n", "\n", "\n", "@dataclass\n", "class Answer:\n", "    content: str\n", "\n", "\n", "@dataclass\n", "class SolverRequest:\n", "    content: str\n", "    question: str\n", "\n", "\n", "@dataclass\n", "class IntermediateSolverResponse:\n", "    content: str\n", "    question: str\n", "    answer: str\n", "    round: int\n", "\n", "\n", "@dataclass\n", "class FinalSolverResponse:\n", "    answer: str" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solver Agent\n", "\n", "The solver agent is responsible for solving math problems and exchanging responses with other solver agents.\n", "Upon receiving a `SolverRequest`, the solver agent uses an LLM to generate an answer.\n", "Then, it publishes an `IntermediateSolverResponse`\n", "or a `FinalSolverResponse`, depending on the round number.\n", "\n", "The solver agent is given a topic type, which is used to indicate the topic\n", "to which the agent should publish intermediate responses. This topic is subscribed\n", "to by its neighbors to receive responses from this agent -- we will show\n", "how this is done later.\n", "\n", "We use {py:meth}`~autogen_core.default_subscription` to let\n", "solver agents subscribe to the default topic, on which the aggregator agent\n", "publishes solver requests. The solver agents also publish their final responses\n", "to the default topic, where the aggregator agent collects them." 
] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "@default_subscription\n", "class MathSolver(RoutedAgent):\n", "    def __init__(self, model_client: ChatCompletionClient, topic_type: str, num_neighbors: int, max_round: int) -> None:\n", "        super().__init__(\"A debater.\")\n", "        self._topic_type = topic_type\n", "        self._model_client = model_client\n", "        self._num_neighbors = num_neighbors\n", "        self._history: List[LLMMessage] = []\n", "        self._buffer: Dict[int, List[IntermediateSolverResponse]] = {}\n", "        self._system_messages = [\n", "            SystemMessage(\n", "                content=(\n", "                    \"You are a helpful assistant with expertise in mathematics and reasoning. \"\n", "                    \"Your task is to assist in solving a math reasoning problem by providing \"\n", "                    \"a clear and detailed solution. Limit your output within 100 words, \"\n", "                    \"and your final answer should be a single numerical number, \"\n", "                    \"in the form of {{answer}}, at the end of your response. \"\n", "                    \"For example, 'The answer is {{42}}.'\"\n", "                )\n", "            )\n", "        ]\n", "        self._round = 0\n", "        self._max_round = max_round\n", "\n", "    @message_handler\n", "    async def handle_request(self, message: SolverRequest, ctx: MessageContext) -> None:\n", "        # Add the question to the memory.\n", "        self._history.append(UserMessage(content=message.content, source=\"user\"))\n", "        # Make an inference using the model.\n", "        model_result = await self._model_client.create(self._system_messages + self._history)\n", "        assert isinstance(model_result.content, str)\n", "        # Add the response to the memory.\n", "        self._history.append(AssistantMessage(content=model_result.content, source=self.metadata[\"type\"]))\n", "        print(f\"{'-'*80}\\nSolver {self.id} round {self._round}:\\n{model_result.content}\")\n", "        # Extract the answer from the response.\n", "        match = re.search(r\"\\{\\{(\\-?\\d+(\\.\\d+)?)\\}\\}\", model_result.content)\n", "        if match is None:\n", "            raise ValueError(\"The model response does not 
contain the answer.\")\n", "        answer = match.group(1)\n", "        # Increment the counter.\n", "        self._round += 1\n", "        if self._round == self._max_round:\n", "            # If the counter reaches the maximum round, publish a final response.\n", "            await self.publish_message(FinalSolverResponse(answer=answer), topic_id=DefaultTopicId())\n", "        else:\n", "            # Publish intermediate response to the topic associated with this solver.\n", "            await self.publish_message(\n", "                IntermediateSolverResponse(\n", "                    content=model_result.content,\n", "                    question=message.question,\n", "                    answer=answer,\n", "                    round=self._round,\n", "                ),\n", "                topic_id=DefaultTopicId(type=self._topic_type),\n", "            )\n", "\n", "    @message_handler\n", "    async def handle_response(self, message: IntermediateSolverResponse, ctx: MessageContext) -> None:\n", "        # Add neighbor's response to the buffer.\n", "        self._buffer.setdefault(message.round, []).append(message)\n", "        # Check if all neighbors have responded.\n", "        if len(self._buffer[message.round]) == self._num_neighbors:\n", "            print(\n", "                f\"{'-'*80}\\nSolver {self.id} round {message.round}:\\nReceived all responses from {self._num_neighbors} neighbors.\"\n", "            )\n", "            # Prepare the prompt for the next question.\n", "            prompt = \"These are the solutions to the problem from other agents:\\n\"\n", "            for resp in self._buffer[message.round]:\n", "                prompt += f\"One agent solution: {resp.content}\\n\"\n", "            prompt += (\n", "                \"Using the solutions from other agents as additional information, \"\n", "                \"can you provide your answer to the math problem? \"\n", "                f\"The original math problem is {message.question}. 
\"\n", "                \"Your final answer should be a single numerical number, \"\n", "                \"in the form of {{answer}}, at the end of your response.\"\n", "            )\n", "            # Send the question to the agent itself to solve.\n", "            await self.send_message(SolverRequest(content=prompt, question=message.question), self.id)\n", "            # Clear the buffer.\n", "            self._buffer.pop(message.round)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aggregator Agent\n", "\n", "The aggregator agent is responsible for handling user questions and\n", "distributing math problems to the solver agents.\n", "\n", "The aggregator subscribes to the default topic using\n", "{py:meth}`~autogen_core.default_subscription`. The default topic is used to\n", "receive user questions, receive the final responses from the solver agents,\n", "and publish the final answer back to the user.\n", "\n", "In a more complex application, when you want to isolate the multi-agent debate into a\n", "sub-component, you should use\n", "{py:meth}`~autogen_core.type_subscription` to set a specific topic\n", "type for the aggregator-solver communication,\n", "and have both the solver and the aggregator publish and subscribe to that topic type." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "@default_subscription\n", "class MathAggregator(RoutedAgent):\n", "    def __init__(self, num_solvers: int) -> None:\n", "        super().__init__(\"Math Aggregator\")\n", "        self._num_solvers = num_solvers\n", "        self._buffer: List[FinalSolverResponse] = []\n", "\n", "    @message_handler\n", "    async def handle_question(self, message: Question, ctx: MessageContext) -> None:\n", "        print(f\"{'-'*80}\\nAggregator {self.id} received question:\\n{message.content}\")\n", "        prompt = (\n", "            f\"Can you solve the following math problem?\\n{message.content}\\n\"\n", "            \"Explain your reasoning. 
Your final answer should be a single numerical number, \"\n", "            \"in the form of {{answer}}, at the end of your response.\"\n", "        )\n", "        print(f\"{'-'*80}\\nAggregator {self.id} publishes initial solver request.\")\n", "        await self.publish_message(SolverRequest(content=prompt, question=message.content), topic_id=DefaultTopicId())\n", "\n", "    @message_handler\n", "    async def handle_final_solver_response(self, message: FinalSolverResponse, ctx: MessageContext) -> None:\n", "        self._buffer.append(message)\n", "        if len(self._buffer) == self._num_solvers:\n", "            print(f\"{'-'*80}\\nAggregator {self.id} received all final answers from {self._num_solvers} solvers.\")\n", "            # Find the majority answer.\n", "            answers = [resp.answer for resp in self._buffer]\n", "            majority_answer = max(set(answers), key=answers.count)\n", "            # Publish the aggregated response.\n", "            await self.publish_message(Answer(content=majority_answer), topic_id=DefaultTopicId())\n", "            # Clear the responses.\n", "            self._buffer.clear()\n", "            print(f\"{'-'*80}\\nAggregator {self.id} publishes final answer:\\n{majority_answer}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting Up a Debate\n", "\n", "We will now set up a multi-agent debate with 4 solver agents and 1 aggregator agent.\n", "The solver agents will be connected in a sparse manner as illustrated in the figure\n", "below:\n", "\n", "```\n", "A --- B\n", "|     |\n", "|     |\n", "D --- C\n", "```\n", "\n", "Each solver agent is connected to two other solver agents.\n", "For example, agent A is connected to agents B and D.\n", "\n", "Let's first create a runtime and register the agent types." 
] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "AgentType(type='MathAggregator')" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "runtime = SingleThreadedAgentRuntime()\n", "await MathSolver.register(\n", "    runtime,\n", "    \"MathSolverA\",\n", "    lambda: MathSolver(\n", "        model_client=OpenAIChatCompletionClient(model=\"gpt-4o-mini\"),\n", "        topic_type=\"MathSolverA\",\n", "        num_neighbors=2,\n", "        max_round=3,\n", "    ),\n", ")\n", "await MathSolver.register(\n", "    runtime,\n", "    \"MathSolverB\",\n", "    lambda: MathSolver(\n", "        model_client=OpenAIChatCompletionClient(model=\"gpt-4o-mini\"),\n", "        topic_type=\"MathSolverB\",\n", "        num_neighbors=2,\n", "        max_round=3,\n", "    ),\n", ")\n", "await MathSolver.register(\n", "    runtime,\n", "    \"MathSolverC\",\n", "    lambda: MathSolver(\n", "        model_client=OpenAIChatCompletionClient(model=\"gpt-4o-mini\"),\n", "        topic_type=\"MathSolverC\",\n", "        num_neighbors=2,\n", "        max_round=3,\n", "    ),\n", ")\n", "await MathSolver.register(\n", "    runtime,\n", "    \"MathSolverD\",\n", "    lambda: MathSolver(\n", "        model_client=OpenAIChatCompletionClient(model=\"gpt-4o-mini\"),\n", "        topic_type=\"MathSolverD\",\n", "        num_neighbors=2,\n", "        max_round=3,\n", "    ),\n", ")\n", "await MathAggregator.register(runtime, \"MathAggregator\", lambda: MathAggregator(num_solvers=4))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will create the solver agent topology using {py:class}`~autogen_core.TypeSubscription`,\n", "which maps each solver agent's publishing topic type to its neighbors' agent types." 
] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "# Subscriptions for topic published to by MathSolverA.\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverA\", \"MathSolverD\"))\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverA\", \"MathSolverB\"))\n", "\n", "# Subscriptions for topic published to by MathSolverB.\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverB\", \"MathSolverA\"))\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverB\", \"MathSolverC\"))\n", "\n", "# Subscriptions for topic published to by MathSolverC.\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverC\", \"MathSolverB\"))\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverC\", \"MathSolverD\"))\n", "\n", "# Subscriptions for topic published to by MathSolverD.\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverD\", \"MathSolverC\"))\n", "await runtime.add_subscription(TypeSubscription(\"MathSolverD\", \"MathSolverA\"))\n", "\n", "# All solvers and the aggregator subscribe to the default topic." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solving Math Problems\n", "\n", "Now let's run the debate to solve a math problem.\n", "We publish a `Question` to the default topic,\n", "and the aggregator agent will start the debate." ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--------------------------------------------------------------------------------\n", "Aggregator MathAggregator:default received question:\n", "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. 
How many clips did Natalia sell altogether in April and May?\n", "--------------------------------------------------------------------------------\n", "Aggregator MathAggregator:default publishes initial solver request.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverC:default round 0:\n", "In April, Natalia sold 48 clips. In May, she sold half as many, which is 48 / 2 = 24 clips. To find the total number of clips sold in April and May, we add the amounts: 48 (April) + 24 (May) = 72 clips. \n", "\n", "Thus, the total number of clips sold by Natalia is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverB:default round 0:\n", "In April, Natalia sold 48 clips. In May, she sold half as many clips, which is 48 / 2 = 24 clips. To find the total clips sold in April and May, we add both amounts: \n", "\n", "48 (April) + 24 (May) = 72.\n", "\n", "Thus, the total number of clips sold altogether is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverD:default round 0:\n", "Natalia sold 48 clips in April. In May, she sold half as many, which is \\( \\frac{48}{2} = 24 \\) clips. 
To find the total clips sold in both months, we add the clips sold in April and May together:\n", "\n", "\\[ 48 + 24 = 72 \\]\n", "\n", "Thus, Natalia sold a total of 72 clips.\n", "\n", "The answer is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverC:default round 1:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverA:default round 1:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverA:default round 0:\n", "In April, Natalia sold clips to 48 friends. In May, she sold half as many, which is calculated as follows:\n", "\n", "Half of 48 is \\( 48 \\div 2 = 24 \\).\n", "\n", "Now, to find the total clips sold in April and May, we add the totals from both months:\n", "\n", "\\( 48 + 24 = 72 \\).\n", "\n", "Thus, the total number of clips Natalia sold altogether in April and May is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverD:default round 1:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverB:default round 1:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverC:default round 1:\n", "In April, Natalia sold 48 clips. In May, she sold half as many, which is 48 / 2 = 24 clips. The total number of clips sold in April and May is calculated by adding the two amounts: 48 (April) + 24 (May) = 72 clips. \n", "\n", "Therefore, the answer is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverA:default round 1:\n", "In April, Natalia sold 48 clips. 
In May, she sold half of that amount, which is 48 / 2 = 24 clips. To find the total clips sold in both months, we sum the clips from April and May: \n", "\n", "48 (April) + 24 (May) = 72.\n", "\n", "Thus, Natalia sold a total of {{72}} clips. \n", "\n", "The answer is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverD:default round 2:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverB:default round 2:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverD:default round 1:\n", "Natalia sold 48 clips in April. In May, she sold half of that, which is \\( 48 \\div 2 = 24 \\) clips. To find the total clips sold, we add the clips sold in both months:\n", "\n", "\\[ 48 + 24 = 72 \\]\n", "\n", "Therefore, the total number of clips sold by Natalia is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverB:default round 1:\n", "In April, Natalia sold 48 clips. In May, she sold half that amount, which is 48 / 2 = 24 clips. To find the total clips sold in both months, we add the amounts: \n", "\n", "48 (April) + 24 (May) = 72.\n", "\n", "Therefore, the total number of clips sold altogether by Natalia is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverA:default round 2:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverC:default round 2:\n", "Received all responses from 2 neighbors.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverA:default round 2:\n", "In April, Natalia sold 48 clips. 
In May, she sold half of that amount, which is \\( 48 \\div 2 = 24 \\) clips. To find the total clips sold in both months, we add the amounts from April and May:\n", "\n", "\\( 48 + 24 = 72 \\).\n", "\n", "Thus, the total number of clips sold by Natalia is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverC:default round 2:\n", "In April, Natalia sold 48 clips. In May, she sold half of that amount, which is \\( 48 \\div 2 = 24 \\) clips. To find the total number of clips sold in both months, we add the clips sold in April and May: \n", "\n", "48 (April) + 24 (May) = 72. \n", "\n", "Thus, the total number of clips sold altogether by Natalia is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverB:default round 2:\n", "In April, Natalia sold 48 clips. In May, she sold half as many, calculated as \\( 48 \\div 2 = 24 \\) clips. To find the total clips sold over both months, we sum the totals: \n", "\n", "\\( 48 (April) + 24 (May) = 72 \\).\n", "\n", "Therefore, the total number of clips Natalia sold is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Solver MathSolverD:default round 2:\n", "To solve the problem, we know that Natalia sold 48 clips in April. In May, she sold half that amount, which is calculated as \\( 48 \\div 2 = 24 \\) clips. 
To find the total number of clips sold over both months, we add the two amounts together:\n", "\n", "\\[ 48 + 24 = 72 \\]\n", "\n", "Thus, the total number of clips sold by Natalia is {{72}}.\n", "--------------------------------------------------------------------------------\n", "Aggregator MathAggregator:default received all final answers from 4 solvers.\n", "--------------------------------------------------------------------------------\n", "Aggregator MathAggregator:default publishes final answer:\n", "72\n" ] } ], "source": [ "question = \"Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?\"\n", "runtime.start()\n", "await runtime.publish_message(Question(content=question), DefaultTopicId())\n", "await runtime.stop_when_idle()" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 2 }