Challenge 04 - Retrieval Augmented Generation (RAG)

< Previous Challenge - Home - Next Challenge >

Pre-requisites

Introduction

Knowledge bases are widely used in enterprises and can contain an extensive number of documents across various categories. Retrieving relevant content based on user queries is a challenging task. Traditionally, methods like PageRank have been employed to retrieve information accurately at the document level. However, users still need to search within the document manually to find the specific information they need. Recent advancements in Foundation Models, such as those developed by OpenAI, offer a solution through “Retrieval Augmented Generation” techniques and the encoding of information as “embeddings.” These methods help find the relevant information and then answer questions about it or summarize it, presenting the content to the user in a concise manner.
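To make the embeddings idea concrete, here is a minimal sketch of similarity-based retrieval. The vectors below are tiny hand-made stand-ins for real embeddings (a real embedding model returns vectors with hundreds or thousands of dimensions); the ranking logic is the same.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" standing in for real model output.
doc_embeddings = {
    "doc_about_cats":    np.array([0.9, 0.1, 0.0, 0.1]),
    "doc_about_finance": np.array([0.0, 0.2, 0.9, 0.3]),
}
query_embedding = np.array([0.8, 0.2, 0.1, 0.0])  # e.g. embeds "tell me about cats"

# Rank documents by similarity to the query embedding.
ranked = sorted(doc_embeddings.items(),
                key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                reverse=True)
print(ranked[0][0])  # the most relevant document
```

In practice the query and documents are embedded by the same model, and a vector store (such as Azure Cognitive Search with vector search) performs this nearest-neighbor ranking at scale.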

Retrieval augmented generation (RAG) is an approach that combines the power of retrieval-based knowledge bases, such as Azure Cognitive Search, with generative Large Language Models (LLMs), such as Azure OpenAI ChatGPT, to enhance the quality and relevance of generated outputs. This technique integrates a retrieval component into a generative model, enabling the retrieval of contextual and domain-specific information from the knowledge base. By incorporating this contextual knowledge alongside the original input, the model can generate the desired outputs, such as summaries, extracted information, or answers to questions. In essence, RAG lets you generate domain-specific text outputs by supplying specific external data as part of the context provided to the LLM.

RAG aims to overcome limitations found in purely generative models, including issues of factual accuracy, relevance, and coherence, often seen in the form of “hallucinations”. By integrating retrieval into the generative process, RAG seeks to mitigate these challenges. The incorporation of retrieved information serves to “ground” the large language models (LLMs), ensuring that the generated content better aligns with the intended context, enhances factual correctness, and produces more coherent and meaningful outputs.
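The retrieve-then-generate flow described above can be sketched as a grounded-prompt assembly step. This is a simplified illustration, not the notebooks' exact code: the retrieved chunks are stubbed with a static list, and the final LLM call is left out (in the challenge it would go to your Azure OpenAI deployment).

```python
def build_grounded_prompt(question, retrieved_chunks):
    """Combine retrieved context with the user question so the LLM
    answers only from the supplied sources, reducing hallucinations."""
    context = "\n\n".join(f"[{i + 1}] {chunk}"
                          for i, chunk in enumerate(retrieved_chunks))
    return (
        "Answer the question using ONLY the sources below. "
        "If the answer is not in the sources, say you don't know.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Chunks that a retriever (e.g. Azure Cognitive Search) might return.
chunks = [
    "The warranty period for all devices is 24 months.",
    "Returns are accepted within 30 days of purchase.",
]
prompt = build_grounded_prompt("How long is the warranty?", chunks)
# `prompt` is then sent to the LLM, which can now ground its answer
# in the retrieved sources instead of relying on its training data alone.
```

The instruction to answer only from the sources is what "grounds" the model: the generated answer stays aligned with the retrieved context rather than the model's parametric memory.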

Description

Questions you should be able to answer by the end of the challenge:

Some Considerations:

You will run the following two Jupyter notebooks for this challenge:

These files can be found in your Codespace under the /notebooks folder. If you are working locally or in the Cloud, you can find them in the /notebooks folder of the Resources.zip file.

To run a Jupyter notebook, navigate to it in your Codespace or open it in VS Code on your local workstation. Inside the notebook you will find further instructions, as well as in-line code blocks that you will interact with to complete the challenge tasks. Return here to the student guide after completing all tasks in the Jupyter notebook to validate you have met the success criteria below for this challenge.

Success Criteria

To complete this challenge successfully, you should be able to:

Learning Resources

Advanced Challenges (Optional)

Too comfortable? Eager to do more? Try these additional challenges!