Memory#

There are several use cases where it is valuable to maintain a store of useful facts that can be intelligently added to the agent’s context just before a specific step. The typical use case is a RAG pattern, where a query is used to retrieve relevant information from a database that is then added to the agent’s context.

AgentChat provides a Memory protocol that can be extended to provide this functionality. Its key methods are add, query, update_context, clear, and close (a simplified sketch of the protocol follows the list below):

  • add: add new entries to the memory store

  • query: retrieve relevant information from the memory store

  • update_context: mutate an agent’s internal model_context by adding the retrieved information (used in the AssistantAgent class)

  • clear: clear all entries from the memory store

  • close: clean up any resources used by the memory store
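To make the protocol concrete, here is a hypothetical in-memory implementation. The signatures are simplified for illustration (the actual protocol also involves types such as MemoryQueryResult and cancellation tokens), so treat this as a sketch of the protocol’s shape rather than a drop-in class.

from typing import Any, List

from autogen_core.memory import MemoryContent


class SimpleListMemory:
    """Sketch of the Memory protocol's shape (signatures simplified)."""

    def __init__(self) -> None:
        self._contents: List[MemoryContent] = []

    async def add(self, content: MemoryContent) -> None:
        # Append a new entry to the store.
        self._contents.append(content)

    async def query(self, query: str, **kwargs: Any) -> List[MemoryContent]:
        # A real store would rank entries by relevance to the query;
        # this sketch simply returns everything.
        return list(self._contents)

    async def update_context(self, model_context: Any) -> None:
        # Format the retrieved entries and add them to the agent's
        # model context (AssistantAgent calls this before inference).
        ...

    async def clear(self) -> None:
        # Remove all stored entries.
        self._contents.clear()

    async def close(self) -> None:
        # Release any external resources (none here).
        pass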

ListMemory Example#

ListMemory (from autogen_core.memory) is provided as an example implementation of the Memory protocol. It is a simple list-based memory implementation that maintains memories in chronological order, appending the most recent memories to the model’s context. The implementation is designed to be straightforward and predictable, making it easy to understand and debug. In the following example, we will use ListMemory to maintain a memory bank of user preferences and demonstrate how it can be used to provide consistent context for agent responses over time.

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.ui import Console
from autogen_core.memory import ListMemory, MemoryContent, MemoryMimeType
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Initialize user memory
user_memory = ListMemory()

# Add user preferences to memory
await user_memory.add(MemoryContent(content="The weather should be in metric units", mime_type=MemoryMimeType.TEXT))

await user_memory.add(MemoryContent(content="Meal recipe must be vegan", mime_type=MemoryMimeType.TEXT))


async def get_weather(city: str, units: str = "imperial") -> str:
    """Return a mock weather report for the given city in the requested units."""
    if units == "imperial":
        return f"The weather in {city} is 73 °F and Sunny."
    elif units == "metric":
        return f"The weather in {city} is 23 °C and Sunny."
    else:
        return f"Sorry, I don't know the weather in {city}."


assistant_agent = AssistantAgent(
    name="assistant_agent",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4o-2024-08-06",
    ),
    tools=[get_weather],
    memory=[user_memory],
)
# Run the agent with a task.
stream = assistant_agent.run_stream(task="What is the weather in New York?")
await Console(stream)
TaskResult(messages=[TextMessage(source='user', models_usage=None, content='What is the weather in New York?', type='TextMessage'), MemoryQueryEvent(source='assistant_agent', models_usage=None, content=[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None, timestamp=None, source=None, score=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None, timestamp=None, source=None, score=None)], type='MemoryQueryEvent'), ToolCallRequestEvent(source='assistant_agent', models_usage=RequestUsage(prompt_tokens=123, completion_tokens=20), content=[FunctionCall(id='call_pHq4p89gW6oGjGr3VsVETCYX', arguments='{"city":"New York","units":"metric"}', name='get_weather')], type='ToolCallRequestEvent'), ToolCallExecutionEvent(source='assistant_agent', models_usage=None, content=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', call_id='call_pHq4p89gW6oGjGr3VsVETCYX')], type='ToolCallExecutionEvent'), ToolCallSummaryMessage(source='assistant_agent', models_usage=None, content='The weather in New York is 23 °C and Sunny.', type='ToolCallSummaryMessage')], stop_reason=None)

We can inspect the assistant_agent model_context and confirm that it was actually updated with the retrieved memory entries. The update_context method formats the retrieved memory entries into a string that can be used by the agent; in this case, the content of each memory entry is simply concatenated into a single system message.

await assistant_agent._model_context.get_messages()
[UserMessage(content='What is the weather in New York?', source='user', type='UserMessage'),
 SystemMessage(content='\nRelevant memory content (in chronological order):\n1. The weather should be in metric units\n2. Meal recipe must be vegan\n', type='SystemMessage'),
 AssistantMessage(content=[FunctionCall(id='call_pHq4p89gW6oGjGr3VsVETCYX', arguments='{"city":"New York","units":"metric"}', name='get_weather')], source='assistant_agent', type='AssistantMessage'),
 FunctionExecutionResultMessage(content=[FunctionExecutionResult(content='The weather in New York is 23 °C and Sunny.', call_id='call_pHq4p89gW6oGjGr3VsVETCYX')], type='FunctionExecutionResultMessage')]

We see above that the weather is returned in Celsius, as stated in the user preferences.

Similarly, if we ask a separate question about generating a meal plan, the agent is able to retrieve relevant information from the memory store and provide a personalized response.

stream = assistant_agent.run_stream(task="Write brief meal recipe with broth")
await Console(stream)
TaskResult(messages=[TextMessage(source='user', models_usage=None, content='Write brief meal recipe with broth', type='TextMessage'), MemoryQueryEvent(source='assistant_agent', models_usage=None, content=[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None, timestamp=None, source=None, score=None), MemoryContent(content='Meal recipe must be vegan', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None, timestamp=None, source=None, score=None)], type='MemoryQueryEvent'), TextMessage(source='assistant_agent', models_usage=RequestUsage(prompt_tokens=208, completion_tokens=253), content="Here's a brief vegan meal recipe using broth:\n\n**Vegan Mushroom & Herb Broth Soup**\n\n**Ingredients:**\n- 1 tablespoon olive oil\n- 1 onion, diced\n- 2 cloves garlic, minced\n- 250g mushrooms, sliced\n- 1 carrot, diced\n- 1 celery stalk, diced\n- 4 cups vegetable broth\n- 1 teaspoon thyme\n- 1 teaspoon rosemary\n- Salt and pepper to taste\n- Fresh parsley for garnish\n\n**Instructions:**\n1. Heat the olive oil in a large pot over medium heat. Add the diced onion and garlic, and sauté until the onion becomes translucent.\n\n2. Add the sliced mushrooms, carrot, and celery. Continue to sauté until the mushrooms are cooked through and the vegetables begin to soften, about 5 minutes.\n\n3. Pour in the vegetable broth. Stir in the thyme and rosemary, and bring the mixture to a boil.\n\n4. Reduce the heat to low and let the soup simmer for about 15 minutes, allowing the flavors to meld together.\n\n5. Season with salt and pepper to taste.\n\n6. Serve hot, garnished with fresh parsley.\n\nEnjoy your warm and comforting vegan mushroom & herb broth soup! \n\nTERMINATE", type='TextMessage')], stop_reason=None)
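
Finally, the clear method can be used to reset the store, for example between sessions, so that previously stored preferences do not carry over:

# Remove all stored entries; subsequent runs start with an empty memory.
await user_memory.clear()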

Custom Memory Stores (Vector DBs, etc.)#

You can build on the Memory protocol to implement more complex memory stores. For example, you could implement a memory store that uses a vector database to store and retrieve information, or one that uses a machine learning model to generate personalized responses based on the user’s preferences.

Specifically, you will need to override the add, query, and update_context methods to implement the desired functionality and pass the memory store to your agent, as in the sketch below.
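
As a rough illustration, the hypothetical class below extends the SimpleListMemory sketch from earlier with relevance-ranked retrieval. The keyword-overlap score stands in for what would be embedding similarity against a vector database in a production store, and the signatures remain simplified, so adapt it to the actual protocol definitions.

from typing import Any, List

from autogen_core.memory import MemoryContent


class KeywordRankedMemory(SimpleListMemory):
    """Hypothetical store that returns only the entries most relevant to a query."""

    def __init__(self, top_k: int = 3) -> None:
        super().__init__()
        self._top_k = top_k

    @staticmethod
    def _score(query: str, entry: MemoryContent) -> int:
        # Naive relevance: number of words shared between the query and the entry.
        # A vector-database-backed store would compute embedding similarity instead.
        return len(set(query.lower().split()) & set(str(entry.content).lower().split()))

    async def query(self, query: str, **kwargs: Any) -> List[MemoryContent]:
        # Rank all entries by score and keep the top_k most relevant ones.
        ranked = sorted(self._contents, key=lambda entry: self._score(query, entry), reverse=True)
        return ranked[: self._top_k]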