CLI Reference
This page documents the command-line interface of the graphrag library.
graphrag
GraphRAG: A graph-based retrieval-augmented generation (RAG) system.
Usage:
Options:
  --install-completion  Install completion for the current shell.
  --show-completion     Show completion for the current shell, to copy it or
                        customize the installation.
index
Build a knowledge graph index.
Usage:
Options:
  --config PATH                   The configuration to use.
  --root PATH                     The project root directory. [default: .]
  --verbose / --no-verbose        Run the indexing pipeline with verbose
                                  logging. [default: no-verbose]
  --memprofile / --no-memprofile  Run the indexing pipeline with memory
                                  profiling. [default: no-memprofile]
  --resume TEXT                   Resume a given indexing run.
  --reporter [rich|print|none]    The progress reporter to use. [default: rich]
  --emit TEXT                     The data formats to emit, comma-separated.
                                  [default: parquet]
  --dry-run / --no-dry-run        Run the indexing pipeline without executing
                                  any steps to inspect and validate the
                                  configuration. [default: no-dry-run]
  --cache / --no-cache            Use LLM cache. [default: cache]
  --skip-validation / --no-skip-validation
                                  Skip any preflight validation. Useful when
                                  running no LLM steps.
                                  [default: no-skip-validation]
  --output PATH                   Indexing pipeline output directory.
                                  Overrides storage.base_dir in the
                                  configuration file.
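As an illustration of how the index options combine, the following is a minimal sketch that assembles an argument list for a dry run. The helper build_index_command and the path ./my_project are hypothetical names introduced for this example; only the flags come from the option list above, and nothing is executed.

```python
import shlex


def build_index_command(root=".", config=None, dry_run=False, cache=True, output=None):
    """Assemble an argv list for `graphrag index` (hypothetical helper)."""
    argv = ["graphrag", "index", "--root", str(root)]
    if config is not None:
        argv += ["--config", str(config)]
    if dry_run:
        # Inspect and validate the configuration without executing any steps.
        argv.append("--dry-run")
    if not cache:
        # The cache is on by default, so only the negative flag is ever needed.
        argv.append("--no-cache")
    if output is not None:
        # Overrides storage.base_dir from the configuration file.
        argv += ["--output", str(output)]
    return argv


# "./my_project" is a placeholder path, not a path the CLI requires.
command = build_index_command(root="./my_project", dry_run=True, cache=False)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run on a machine where graphrag is installed. Building a list instead of a single string avoids shell quoting problems with paths that contain spaces.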
init
Generate a default configuration file.
Usage:
Options:
prompt-tune
Generate custom graphrag prompts with your own data (i.e. auto templating).
Usage:
Options:
  --root PATH                     The project root directory. [default: .]
  --config PATH                   The configuration to use.
  --domain TEXT                   The domain your input data is related to,
                                  for example 'space science', 'microbiology',
                                  or 'environmental news'. If not defined, a
                                  domain is inferred from the input data.
  --selection-method [all|random|top|auto]
                                  The text chunk selection method.
                                  [default: random]
  --n-subset-max INTEGER          The number of text chunks to embed when
                                  --selection-method=auto. [default: 300]
  --k INTEGER                     The maximum number of documents to select
                                  from each centroid when
                                  --selection-method=auto. [default: 15]
  --limit INTEGER                 The number of documents to load when
                                  --selection-method={random,top}.
                                  [default: 15]
  --max-tokens INTEGER            The max token count for prompt generation.
                                  [default: 2000]
  --min-examples-required INTEGER
                                  The minimum number of examples to
                                  generate/include in the entity extraction
                                  prompt. [default: 2]
  --chunk-size INTEGER            The size of each text chunk, in tokens.
                                  [default: 200]
  --language TEXT                 The primary language used for inputs and
                                  outputs in graphrag prompts.
  --discover-entity-types / --no-discover-entity-types
                                  Discover and extract unspecified entity
                                  types. [default: discover-entity-types]
  --output PATH                   The directory to save prompts to, relative
                                  to the project root directory.
                                  [default: prompts]
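Several prompt-tune options only apply to particular selection methods: --n-subset-max and --k are described for auto, and --limit for random and top. The sketch below encodes that pairing. The helper build_prompt_tune_command is hypothetical, and dropping the inapplicable options is a choice made for this example; the option list does not say whether the CLI itself ignores or rejects them.

```python
def build_prompt_tune_command(selection_method="random", limit=None,
                              n_subset_max=None, k=None, domain=None, root="."):
    """Assemble an argv list for `graphrag prompt-tune` (hypothetical helper)."""
    if selection_method not in ("all", "random", "top", "auto"):
        raise ValueError(f"unknown selection method: {selection_method!r}")
    argv = ["graphrag", "prompt-tune", "--root", str(root),
            "--selection-method", selection_method]
    if domain is not None:
        argv += ["--domain", domain]
    # --limit is documented for the random and top methods only.
    if selection_method in ("random", "top") and limit is not None:
        argv += ["--limit", str(limit)]
    # --n-subset-max and --k are documented for the auto method only.
    if selection_method == "auto":
        if n_subset_max is not None:
            argv += ["--n-subset-max", str(n_subset_max)]
        if k is not None:
            argv += ["--k", str(k)]
    return argv


auto_cmd = build_prompt_tune_command("auto", limit=50, n_subset_max=300, k=15)
random_cmd = build_prompt_tune_command("random", limit=50, k=15, domain="space science")
print(auto_cmd)
print(random_cmd)
```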
query
Query a knowledge graph index.
Usage:
Options:
  --method [local|global|drift]   The query algorithm to use. [required]
  --query TEXT                    The query to execute. [required]
  --config PATH                   The configuration to use.
  --data PATH                     Indexing pipeline output directory (i.e. the
                                  directory that contains the parquet files).
  --root PATH                     The project root directory. [default: .]
  --community-level INTEGER       The community level in the Leiden community
                                  hierarchy from which to load community
                                  reports. Higher values represent reports
                                  from smaller communities. [default: 2]
  --response-type TEXT            Free-form text describing the response type
                                  and format, e.g. Multiple Paragraphs,
                                  Single Paragraph, Single Sentence, List of
                                  3-7 Points, Single Page, Multi-Page Report.
                                  [default: Multiple Paragraphs]
  --streaming / --no-streaming    Print the response in a streaming manner.
                                  [default: no-streaming]
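Because --method and --query are both required, and both --query and --response-type take free-form text, a query invocation is a natural place to check inputs and quoting before running anything. The following is a minimal sketch; build_query_command and the sample question are hypothetical, and only the flags come from the option list above.

```python
import shlex

QUERY_METHODS = ("local", "global", "drift")


def build_query_command(method, query, root=".", community_level=None,
                        response_type=None, streaming=False):
    """Assemble an argv list for `graphrag query` (hypothetical helper)."""
    if method not in QUERY_METHODS:
        raise ValueError(f"method must be one of {QUERY_METHODS}, got {method!r}")
    if not query:
        raise ValueError("query text is required")
    argv = ["graphrag", "query", "--method", method, "--query", query,
            "--root", str(root)]
    if community_level is not None:
        argv += ["--community-level", str(community_level)]
    if response_type is not None:
        argv += ["--response-type", response_type]
    if streaming:
        argv.append("--streaming")
    return argv


command = build_query_command("global", "What are the main themes?",
                              response_type="Single Paragraph")
# shlex.join quotes the arguments that contain spaces.
print(shlex.join(command))
```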
update
Update an existing knowledge graph index.
If the configuration does not define storage settings, a default storage configuration is applied and the new index is saved to the local file system in the update_output folder.
Usage:
Options:
  --config PATH                   The configuration to use.
  --root PATH                     The project root directory. [default: .]
  --verbose / --no-verbose        Run the indexing pipeline with verbose
                                  logging. [default: no-verbose]
  --memprofile / --no-memprofile  Run the indexing pipeline with memory
                                  profiling. [default: no-memprofile]
  --reporter [rich|print|none]    The progress reporter to use. [default: rich]
  --emit TEXT                     The data formats to emit, comma-separated.
                                  [default: parquet]
  --cache / --no-cache            Use LLM cache. [default: cache]
  --skip-validation / --no-skip-validation
                                  Skip any preflight validation. Useful when
                                  running no LLM steps.
                                  [default: no-skip-validation]
  --output PATH                   Indexing pipeline output directory.
                                  Overrides storage.base_dir in the
                                  configuration file.
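The --emit option takes its formats as a single comma-separated value. The sketch below shows one way to produce that value from a list when assembling an update command. The helpers emit_value and build_update_command are hypothetical, and "csv" is an assumed format name used only for illustration; the option list above confirms only parquet.

```python
def emit_value(formats):
    """Join format names into the comma-separated string that --emit takes."""
    cleaned = [name.strip() for name in formats if name.strip()]
    if not cleaned:
        raise ValueError("at least one format is required")
    return ",".join(cleaned)


def build_update_command(root=".", formats=("parquet",), output=None, verbose=False):
    """Assemble an argv list for `graphrag update` (hypothetical helper)."""
    argv = ["graphrag", "update", "--root", str(root), "--emit", emit_value(formats)]
    if verbose:
        argv.append("--verbose")
    if output is not None:
        # Overrides storage.base_dir from the configuration file.
        argv += ["--output", str(output)]
    return argv


# "csv" is an assumption for this example; check the supported formats first.
command = build_update_command(formats=["parquet", "csv"], verbose=True)
print(command)
```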