promptflow.rag module#
- promptflow.rag.build_index(*, name: str, vector_store: str = 'azure_ai_search', input_source: Union[AzureAISearchSource, LocalSource], index_config: Optional[AzureAISearchConfig] = None, embeddings_model_config: EmbeddingsModelConfig, data_source_url: Optional[str] = None, tokens_per_chunk: int = 1024, token_overlap_across_chunks: int = 0, input_glob: str = '**/*', max_sample_files: Optional[int] = None, chunk_prepend_summary: Optional[bool] = None, document_path_replacement_regex: Optional[Dict[str, str]] = None, embeddings_cache_path: Optional[str] = None) → str#
Generates embeddings locally and stores the index reference in memory.
- Parameters:
name (str) – The name of the output index.
vector_store (str) – The vector store to be used for the index.
input_source (Union[AzureAISearchSource, LocalSource]) – The configuration for the input data source.
index_config (Optional[AzureAISearchConfig]) – The configuration for the Azure AI Search output.
embeddings_model_config (EmbeddingsModelConfig) – The configuration for the embedding model.
data_source_url (Optional[str]) – The URL of the data source.
tokens_per_chunk (int) – The maximum number of tokens per chunk.
token_overlap_across_chunks (int) – The number of tokens that overlap between consecutive chunks.
input_glob (str) – The input glob pattern.
max_sample_files (Optional[int]) – The maximum number of sample files.
chunk_prepend_summary (Optional[bool]) – Whether to prepend a summary to each chunk.
document_path_replacement_regex (Optional[Dict[str, str]]) – The regex for document path replacement.
embeddings_cache_path (Optional[str]) – The path to the embeddings cache.
- Returns:
The local path to the created index.
- Return type:
str
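A minimal sketch of building a local index from a folder of files, assuming an Azure OpenAI embedding deployment reachable through a workspace connection. All names, paths, and connection details below are placeholders, and the config-object fields (e.g. input_data, connection_name) follow common promptflow-rag sample usage; verify them against your installed version.

```python
from promptflow.rag import build_index
from promptflow.rag.config import (
    ConnectionConfig,
    EmbeddingsModelConfig,
    LocalSource,
)

# Embedding model served through an Azure OpenAI connection in an
# Azure AI / ML workspace (all identifiers below are placeholders).
embeddings_model_config = EmbeddingsModelConfig(
    model_name="text-embedding-ada-002",
    deployment_name="text-embedding-ada-002",
    connection_config=ConnectionConfig(
        subscription_id="<subscription-id>",
        resource_group_name="<resource-group>",
        workspace_name="<workspace-name>",
        connection_name="<aoai-connection-name>",
    ),
)

# Chunk the local files, embed them, and write the index locally.
index_path = build_index(
    name="product-docs-index",                       # name of the output index
    vector_store="azure_ai_search",
    input_source=LocalSource(input_data="data/product-docs/"),
    embeddings_model_config=embeddings_model_config,
    tokens_per_chunk=1024,
    token_overlap_across_chunks=128,
    input_glob="**/*.md",
)

print(index_path)  # local path to the created index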
- promptflow.rag.get_langchain_retriever_from_index(path: str)#
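A short sketch of loading a built index as a LangChain retriever; the index_path variable is assumed to be the local path returned by build_index above, and the query string is illustrative. The langchain package must be installed for the returned retriever to be usable.

```python
from promptflow.rag import get_langchain_retriever_from_index

# index_path is the local path returned by build_index above.
retriever = get_langchain_retriever_from_index(index_path)

# Use it like any other LangChain retriever.
docs = retriever.get_relevant_documents("How do I configure the product?")
for doc in docs:
    print(doc.page_content[:200])
```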