
Prompt Tuning ⚙️

GraphRAG provides the ability to create domain-adapted prompts for the generation of the knowledge graph. This step is optional, but it is highly encouraged to run it, as doing so yields better results when executing an Index Run.

These prompts are generated by loading the inputs, splitting them into chunks (text units), and then running a series of LLM invocations and template substitutions to generate the final prompts. We suggest using the default values provided by the script, but on this page you'll find the details of each option in case you want to explore and tweak the prompt tuning algorithm further.

Figure 1: Auto Tuning Conceptual Diagram.

Prerequisites

Before running auto tuning, make sure you have already initialized your workspace with the graphrag.index --init command. This will create the necessary configuration files and the default prompts. Refer to the Init Documentation for more information about the initialization process.
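
For reference, the initialization step looks like the following; the project path is a placeholder, so point --root at your own workspace:

python -m graphrag.index --init --root /path/to/project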

Usage

You can run the main script from the command line with various options:

python -m graphrag.prompt_tune [--root ROOT] [--domain DOMAIN] [--method METHOD] [--limit LIMIT] [--language LANGUAGE] \
[--max-tokens MAX_TOKENS] [--chunk-size CHUNK_SIZE] [--n-subset-max N_SUBSET_MAX] [--k K] \
[--min-examples-required MIN_EXAMPLES_REQUIRED] [--no-entity-types] [--output OUTPUT]

Command-Line Options

Example Usage

python -m graphrag.prompt_tune --root /path/to/project --config /path/to/settings.yaml --domain "environmental news" \
--method random --limit 10 --language English --max-tokens 2048 --chunk-size 256 --min-examples-required 3 \
--no-entity-types --output /path/to/output

or, with minimal configuration (suggested):

python -m graphrag.prompt_tune --root /path/to/project --config /path/to/settings.yaml --no-entity-types

Document Selection Methods

The auto tuning feature ingests the input data and then divides it into text units whose size is controlled by the chunk size parameter. After that, it applies the selection method specified by the --method parameter to pick a sample of text units to work with for prompt generation, as shown in the sketch below.
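
A minimal sketch of a run that only adjusts the sampling behavior, combining the random method with the --limit and --chunk-size parameters from the usage above (the path is a placeholder):

python -m graphrag.prompt_tune --root /path/to/project --method random --limit 10 --chunk-size 256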

Modify Env Vars

After running auto tuning, you should modify the following environment variables (or config variables) so that your index run picks up the new prompts. Note: make sure to point them at the correct path of the generated prompts; in this example we are using the default prompts path.
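
A sketch of the environment-variable form; the exact variable names below are an assumption based on GraphRAG's legacy settings, so verify them against the configuration documentation for your version:

GRAPHRAG_ENTITY_EXTRACTION_PROMPT_FILE="prompts/entity_extraction.txt"
GRAPHRAG_SUMMARIZE_DESCRIPTIONS_PROMPT_FILE="prompts/summarize_descriptions.txt"
GRAPHRAG_COMMUNITY_REPORT_PROMPT_FILE="prompts/community_report.txt"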

or in your yaml config file:

entity_extraction:
  prompt: "prompts/entity_extraction.txt"

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"

community_reports:
  prompt: "prompts/community_report.txt"