Retrieval Augmentation

Retrieval Augmented Generation (RAG) is a powerful technique that combines language models with external knowledge retrieval to improve the quality and relevance of generated responses.

One way to realize RAG in AutoGen is to construct agent chats with the `AssistantAgent` and `RetrieveUserProxyAgent` classes.
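To make the retrieve-then-generate idea concrete, here is a minimal, library-free sketch. It is not how AutoGen implements retrieval: the bag-of-words cosine similarity stands in for real embeddings, and the names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers introduced only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, documents, n_results=1):
    """Return the n_results documents most similar to the question (toy stand-in for embeddings)."""
    q = Counter(tokenize(question))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:n_results]


def build_prompt(question, context_chunks):
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(context_chunks)
    return f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


documents = [
    "FLAML supports parallel training on Spark clusters.",
    "ChromaDB stores embeddings for similarity search.",
]
question = "How do I run parallel training with Spark?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)
```

In the agent setup described in this page, the retrieval half of this flow is the job of the proxy agent, and the generation half is the job of the assistant.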

Example Setup: RAG with Retrieval Augmented Agents

The following is an example setup demonstrating how to create retrieval augmented agents in AutoGen:

Step 1. Create an instance of `AssistantAgent` and `RetrieveUserProxyAgent`.

Here, the `RetrieveUserProxyAgent` instance acts as a proxy agent that retrieves relevant information based on the user's input.

Refer to the documentation for more information on the detailed configurations.

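import os

import chromadb
from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# config_list is assumed to be defined already, e.g. via autogen.config_list_from_json(...)
assistant = AssistantAgent(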
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
    },
)
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        "docs_path": [
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md",
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md",
            os.path.join(os.path.abspath(""), "..", "website", "docs"),
        ],
        "custom_text_types": ["mdx"],
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "client": chromadb.PersistentClient(path="/tmp/chromadb"),
        "embedding_model": "all-mpnet-base-v2",
        "get_or_create": True,  # set to False if you don't want to reuse an existing collection, but you'll need to remove the collection manually
    },
    code_execution_config=False,  # set to False if you don't want to execute the code
)
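The `chunk_token_size` setting caps how large each indexed piece of a document can be. The sketch below illustrates the general idea only: it counts whitespace-separated words instead of model tokens, which is a simplifying assumption, and `split_into_chunks` is a hypothetical helper rather than an AutoGen function.

```python
def split_into_chunks(text, chunk_token_size):
    """Split text into chunks of at most chunk_token_size whitespace-separated tokens."""
    if chunk_token_size <= 0:
        raise ValueError("chunk_token_size must be positive")
    tokens = text.split()
    # Step through the token list in strides of chunk_token_size.
    return [
        " ".join(tokens[i : i + chunk_token_size])
        for i in range(0, len(tokens), chunk_token_size)
    ]


chunks = split_into_chunks("one two three four five six seven", chunk_token_size=3)
print(chunks)  # ['one two three', 'four five six', 'seven']
```

Smaller chunks tend to give more focused matches, while larger chunks keep more surrounding context in each retrieved piece.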

Step 2. Initiating Agent Chat with Retrieval Augmentation

Once the retrieval augmented agents are set up, you can initiate a chat with retrieval augmentation using the following code:

code_problem = "How can I use FLAML to perform a classification task and use spark to do parallel training. Train 30 seconds and force cancel jobs if time limit is reached."
# search_string is used as an extra filter for the embeddings search;
# in this case, we only want to search documents that contain "spark".
ragproxyagent.initiate_chat(
    assistant, message=ragproxyagent.message_generator, problem=code_problem, search_string="spark"
)

If you run into an issue like #3551, install `chromadb<=0.5.0`.
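The effect of `search_string` in Step 2 can be pictured as a two-stage lookup: first discard documents that do not contain the string, then rank what is left by similarity. The following toy sketch assumes a simple word-overlap score in place of embeddings and a case-insensitive substring match; both choices, and the `search` helper itself, are illustrative assumptions rather than AutoGen or ChromaDB behavior.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query, documents, search_string="", n_results=2):
    """Keep documents containing search_string, then rank them by word overlap with query."""
    candidates = [d for d in documents if search_string.lower() in d.lower()]
    query_tokens = tokenize(query)
    ranked = sorted(candidates, key=lambda d: len(query_tokens & tokenize(d)), reverse=True)
    return ranked[:n_results]


documents = [
    "Use Spark to run FLAML trials in parallel.",
    "FLAML can run a classification task with AutoML.",
    "Spark jobs can be cancelled when the time limit is reached.",
]
results = search("classification task with parallel training", documents, search_string="spark")
print(results)
```

Note that the second document shares the most words with the query but is still excluded, because the filter runs before the ranking.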

Example Setup: RAG with Retrieval Augmented Agents with PGVector

The following is an example setup demonstrating how to create retrieval augmented agents in AutoGen that use PGVector as the vector database:

Step 1. Create an instance of `AssistantAgent` and `RetrieveUserProxyAgent`.

Here, the `RetrieveUserProxyAgent` instance acts as a proxy agent that retrieves relevant information based on the user's input.

Specify either the `connection_string`, or the `host`, `port`, `database`, `username`, and `password`, in the `db_config`.

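import os

from autogen import AssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# config_list is assumed to be defined already, e.g. via autogen.config_list_from_json(...)
assistant = AssistantAgent(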
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
    },
)
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    retrieve_config={
        "task": "code",
        "docs_path": [
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Examples/Integrate%20-%20Spark.md",
            "https://raw.githubusercontent.com/microsoft/FLAML/main/website/docs/Research.md",
            os.path.join(os.path.abspath(""), "..", "website", "docs"),
        ],
        "vector_db": "pgvector",
        "collection_name": "autogen_docs",
        "db_config": {
            "connection_string": "postgresql://testuser:testpwd@localhost:5432/vectordb", # Optional - connect to an external vector database
            # "host": None, # Optional vector database host
            # "port": None, # Optional vector database port
            # "database": None, # Optional vector database name
            # "username": None, # Optional vector database username
            # "password": None, # Optional vector database password
        },
        "custom_text_types": ["mdx"],
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "get_or_create": True,
    },
    code_execution_config=False,
)
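If your database settings are stored as separate fields, one option is to assemble the `connection_string` yourself before passing it in `db_config`. The sketch below uses only the standard library; `build_connection_string` is a hypothetical helper, and the credentials are the placeholder values from the example above.

```python
from urllib.parse import quote


def build_connection_string(host, port, database, username, password):
    """Assemble a PostgreSQL URI, percent-encoding the credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"postgresql://{user}:{pwd}@{host}:{port}/{database}"


db_config = {
    "connection_string": build_connection_string(
        host="localhost",
        port=5432,
        database="vectordb",
        username="testuser",
        password="testpwd",
    )
}
print(db_config["connection_string"])
```

Percent-encoding matters when a password contains characters such as `@` or `/`, which would otherwise be read as part of the URI structure.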

Step 2. Initiating Agent Chat with Retrieval Augmentation

Once the retrieval augmented agents are set up, you can initiate a chat with retrieval augmentation using the following code:

code_problem = "How can I use FLAML to perform a classification task and use spark to do parallel training. Train 30 seconds and force cancel jobs if time limit is reached."
# search_string is used as an extra filter for the embeddings search;
# in this case, we only want to search documents that contain "spark".
ragproxyagent.initiate_chat(
    assistant, message=ragproxyagent.message_generator, problem=code_problem, search_string="spark"
)

Online Demo

Retrieval-Augmented Chat Demo on Hugging Face

More Examples and Notebooks

For more detailed examples and notebooks showcasing the usage of retrieval augmented agents in AutoGen, refer to the following:

Automated Code Generation and Question Answering with Retrieval Augmented Agents - View Notebook
Automated Code Generation and Question Answering with PGVector based Retrieval Augmented Agents - View Notebook
Automated Code Generation and Question Answering with Qdrant based Retrieval Augmented Agents - View Notebook
Automated Code Generation and Question Answering with MongoDB Atlas based Retrieval Augmented Agents - View Notebook
Automated Code Generation and Question Answering with Couchbase based Retrieval Augmented Agents - View Notebook
Chat with OpenAI Assistant with Retrieval Augmentation - View Notebook
RAG: Group Chat with Retrieval Augmented Generation (with 5 group member agents and 1 manager agent) - View Notebook

Roadmap

Explore our detailed roadmap for the planned advancements around RAG. Your contributions, feedback, and use cases are highly appreciated! We invite you to engage with us and help shape the development of this feature.
