GraphRAG

Default Configuration Mode (using JSON/YAML)

The default configuration mode is configured with a config.json or config.yml file in the data project root. If a .env file is present alongside this config file, it is loaded, and the environment variables it defines are available for token replacement in your configuration document using ${ENV_VAR} syntax.

For example:

# .env
API_KEY=some_api_key

# config.json
{
    "llm": {
        "api_key": "${API_KEY}"
    }
}
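
To make the token replacement concrete, here is a minimal Python sketch of the idea. It is not GraphRAG's own loader: the helper names parse_dotenv and load_config are hypothetical, and the .env parsing is deliberately simplistic (plain KEY=value lines only, with no quoting or escaping rules).

```python
import json
from string import Template


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments.

    Hypothetical helper for illustration; real .env parsers handle more cases.
    """
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def load_config(config_text: str, env: dict[str, str]) -> dict:
    """Replace ${NAME} tokens with values from env, then parse the JSON.

    Hypothetical helper; Template.substitute raises KeyError when a
    referenced variable is not defined.
    """
    resolved = Template(config_text).substitute(env)
    return json.loads(resolved)


dotenv_text = "# .env\nAPI_KEY=some_api_key\n"
config_text = '{"llm": {"api_key": "${API_KEY}"}}'

config = load_config(config_text, parse_dotenv(dotenv_text))
print(config["llm"]["api_key"])
```

Substituting on the raw text before parsing keeps the sketch short, but a value containing quotes or backslashes would produce invalid JSON. A sturdier variant would parse first and substitute inside the string values.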

Config Sections

input

Fields

llm

This is the base LLM configuration section. Other steps may override this configuration with their own LLM configuration.

Fields

parallelization

Fields

async_mode

asyncio|threaded - The async mode to use. Either asyncio or threaded.
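
As a rough illustration of what a two-valued setting like this might select between, the Python sketch below dispatches the same work either through an asyncio event loop or through a thread pool. This is an assumption about the general pattern, not a description of GraphRAG's internals; run_all, square, and square_async are hypothetical names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """Stand-in for a blocking unit of work (hypothetical)."""
    return x * x


async def square_async(x: int) -> int:
    """Stand-in for a coroutine-based unit of work (hypothetical)."""
    await asyncio.sleep(0)  # yield control to the event loop once
    return x * x


def run_all(items: list[int], async_mode: str = "asyncio") -> list[int]:
    """Run the work for every item using the selected async mode."""
    if async_mode == "asyncio":
        async def runner() -> list[int]:
            # gather preserves the order of its arguments in the result
            return list(await asyncio.gather(*(square_async(i) for i in items)))
        return asyncio.run(runner())
    if async_mode == "threaded":
        with ThreadPoolExecutor() as pool:
            # pool.map also returns results in input order
            return list(pool.map(square, items))
    raise ValueError(f"async_mode must be 'asyncio' or 'threaded', got {async_mode!r}")


print(run_all([1, 2, 3], "asyncio"))
print(run_all([1, 2, 3], "threaded"))
```

Rejecting any other value up front mirrors the fact that the setting only accepts the two listed modes.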

embeddings

Fields

chunks

Fields

cache

Fields

storage

Fields

reporting

Fields

entity_extraction

Fields

summarize_descriptions

Fields

claim_extraction

Fields

community_reports

Fields

cluster_graph

Fields

embed_graph

Fields

umap

Fields

snapshots

Fields

encoding_model

str - The text encoding model to use. Default is cl100k_base.

skip_workflows

list[str] - Which workflow names to skip.
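
The effect of a skip list can be sketched in a few lines of Python. The workflow names below are placeholders invented for the example, and select_workflows is a hypothetical helper. How GraphRAG itself applies skip_workflows is not shown here.

```python
def select_workflows(all_workflows: list[str], skip_workflows: list[str]) -> list[str]:
    """Return the workflows to run, preserving order and dropping skipped names."""
    skip = set(skip_workflows)
    return [name for name in all_workflows if name not in skip]


# Placeholder workflow names, not real GraphRAG workflow identifiers.
all_workflows = ["workflow_a", "workflow_b", "workflow_c"]
to_run = select_workflows(all_workflows, skip_workflows=["workflow_b"])
print(to_run)
```

Names in the skip list that match no known workflow are silently ignored in this sketch. A stricter version could report them as likely typos.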