- Ingest files into R2R
- Search over ingested files
- Use your data as input to RAG (Retrieval-Augmented Generation)
- Perform basic user auth
- Observe and analyze an R2R deployment
Introduction
R2R is an engine for building user-facing Retrieval-Augmented Generation (RAG) applications. At its core, R2R provides this service through an architecture of providers, services, and an integrated RESTful API. This cookbook provides a detailed walkthrough of how to interact with R2R. For a deeper dive, refer to the R2R system architecture documentation.
R2R Application Lifecycle
The following diagram illustrates how R2R assembles a user-facing application:
Hello R2R
R2R gives developers configurable vector search and RAG right out of the box. It also supports direct method calls as an alternative to the client-server architecture seen throughout the docs:
core/examples/hello_r2r.py
Configuring R2R
R2R is highly configurable. To customize your R2R deployment:
- Create a local configuration file named r2r.toml.
- In this file, override default settings as needed.
Then use the config-path argument to specify your custom configuration when launching R2R.
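As a rough illustration of what "override default settings" can mean, the sketch below layers a small set of overrides on top of a defaults dictionary, merging nested tables key by key. The section and key names are hypothetical, and the merge rule is an assumption for illustration; the text above does not describe how R2R itself combines r2r.toml with its defaults.

```python
from copy import deepcopy


def merge_overrides(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` applied, merging nested tables."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: merge them instead of replacing wholesale.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical section and key names, used only for illustration.
DEFAULTS = {
    "completion": {"provider": "litellm", "temperature": 0.1},
    "embedding": {"batch_size": 128},
}
# What a small override file might look like once parsed into a dict.
OVERRIDES = {"completion": {"temperature": 0.5}}

effective = merge_overrides(DEFAULTS, OVERRIDES)
print(effective)
```

Only the keys named in the override change; everything else keeps its default value, which is why a custom configuration file can stay short.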
Document Ingestion and Management
R2R efficiently handles diverse document types using Postgres with pgvector, combining relational data management with vector search capabilities. This approach enables seamless ingestion, storage, and retrieval of multimodal data, while supporting flexible document management and user permissions. Key features include:
- Unique document_id generation for each ingested file
- User and collection permissions through user_id and collection_ids
- Document versioning for tracking changes over time
- Granular access to document content through chunk retrieval
- Flexible deletion and update mechanisms
Note, all document management commands are gated at the user level, with the exception of superusers.
Ingest Data
R2R offers a powerful data ingestion process that handles various file types including html, pdf, png, mp3, and txt. The full list of supported filetypes is available here. The ingestion process parses, chunks, embeds, and stores documents efficiently with a fully asynchronous pipeline. To demonstrate this functionality:
This command initiates the ingestion process, producing output similar to:
Key features of the ingestion process:
- Unique document_id generation for each file
- Metadata association, including user_id and collection_ids for document management
- Efficient parsing, chunking, and embedding of diverse file types
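The features listed above can be pictured with a small, self-contained sketch: derive a stable document ID, split text into overlapping chunks, and attach user and collection metadata. This is not R2R's implementation. Deriving the ID from a source label with uuid5, the character-based chunking, and every function name here are assumptions made for illustration.

```python
import uuid

# Hypothetical namespace; any fixed UUID yields stable, repeatable IDs.
NAMESPACE = uuid.NAMESPACE_DNS


def make_document_id(label: str) -> uuid.UUID:
    """Derive a deterministic ID, so the same label always maps to the same document."""
    return uuid.uuid5(NAMESPACE, label)


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list:
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    chunks, start = [], 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += chunk_size - overlap
    return chunks


def ingest(text: str, source_name: str, user_id: str, collection_ids: list) -> dict:
    """Build the record a store might keep for one ingested file."""
    return {
        "document_id": str(make_document_id(source_name)),
        "user_id": user_id,
        "collection_ids": list(collection_ids),
        "chunks": chunk_text(text),
    }


sample = "".join(str(i % 10) for i in range(100))
record = ingest(sample, "example.txt", "user-1", ["collection-a"])
print(record["document_id"], len(record["chunks"]))
```

In a real pipeline each chunk would also be embedded before storage; that step is omitted here because it depends on the embedding provider.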
Get Documents Overview
R2R allows retrieval of high-level document information stored in a relational table within the Postgres database. To fetch this information:
This command returns document metadata, including:
This overview provides quick access to document versions, sizes, and associated metadata, facilitating efficient document management.
Get Document Chunks
R2R enables retrieval of specific document chunks and associated metadata. To fetch chunks for a particular document by id:
This command returns detailed chunk information:
These features allow for granular access to document content.
Delete Documents
R2R supports flexible document deletion through a method that can run arbitrary deletion filters. To delete a document by its ID:
This command produces output similar to:
Key features of the deletion process:
- Deletion by document ID, extraction ID, fragment ID, or other filters
- Cascading deletion of associated chunks and metadata
- Confirmation of successful deletion
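To make filter-based deletion concrete, here is a minimal in-memory sketch. It treats a filter as a set of field/value pairs that a stored chunk must match exactly, removes every matching chunk (which is how deleting by document ID cascades to that document's chunks), and reports how many were removed. The record layout and function names are hypothetical, and R2R's real filter syntax may differ.

```python
def matches(record: dict, filters: dict) -> bool:
    """A record matches when every filter field equals the record's value."""
    return all(record.get(key) == value for key, value in filters.items())


def delete_by_filter(chunks: list, filters: dict) -> tuple:
    """Return the surviving chunks and the number of chunks deleted."""
    if not filters:
        # Guard against wiping the whole store with an empty filter.
        raise ValueError("refusing to delete with an empty filter")
    kept = [chunk for chunk in chunks if not matches(chunk, filters)]
    return kept, len(chunks) - len(kept)


# Hypothetical stored chunks; the field names follow the identifiers used above.
stored = [
    {"document_id": "doc-1", "extraction_id": "ext-1", "text": "first chunk"},
    {"document_id": "doc-1", "extraction_id": "ext-2", "text": "second chunk"},
    {"document_id": "doc-2", "extraction_id": "ext-3", "text": "other document"},
]

remaining, deleted = delete_by_filter(stored, {"document_id": "doc-1"})
print(f"deleted {deleted} chunk(s), {len(remaining)} remaining")
```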
Update Documents
R2R provides robust document update capabilities through two main endpoints: update_documents and update_files. These endpoints allow for seamless updating of existing documents while maintaining version control.
Key features of the update process:
- Automatic versioning: When updating a document, R2R automatically increments the version (e.g., from “v0” to “v1”).
- Metadata preservation: The update process maintains existing metadata while allowing for updates.
- Content replacement: The new document content completely replaces the old content, in the following order:
  - Ingest the new version of the document
  - Delete the old version
Expected Output:
Behind the scenes, this command utilizes the update_files endpoint. The process involves:
- Reading the new file content
- Incrementing the document version
- Ingesting the new version with updated metadata
- Deleting the old version of the document
The corresponding API endpoint is /update_files. It accepts an R2RUpdateFilesRequest, which includes:
- files: List of UploadFile objects containing the new document content
- document_ids: UUIDs of the documents to update
- metadatas: Optional updated metadata for each document
AI Powered Search
R2R offers powerful and highly configurable search capabilities, including vector search, hybrid search, and knowledge graph-enhanced search. These features allow for more accurate and contextually relevant information retrieval.
Vector Search
Vector search inside of R2R is highly configurable, allowing you to fine-tune your search parameters for optimal results. Here’s how to perform a basic vector search:
Expected Output
Configurable parameters for vector search:
- use_vector_search: Enable or disable vector search.
- index_measure: Choose between “cosine_distance”, “l2_distance”, or “max_inner_product”.
- search_limit: Set the maximum number of results to return.
- include_values: Include search score values in the results.
- include_metadatas: Include element metadata in the results.
- probes: Number of ivfflat index lists to query (higher increases accuracy but decreases speed).
- ef_search: Size of the dynamic candidate list for HNSW index search (higher increases accuracy but decreases speed).
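The choice of index_measure changes which results rank first. The sketch below computes the three measures in plain Python over a few toy vectors and ranks them with a search_limit. It is a brute-force illustration of the math only: the vectors and function names are made up, and no index (ivfflat or HNSW) is involved.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    norms = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / norms


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Each entry maps a measure name to a score where lower means "closer".
# The inner product is negated so that a larger product ranks first.
MEASURES = {
    "cosine_distance": cosine_distance,
    "l2_distance": l2_distance,
    "max_inner_product": lambda a, b: -inner_product(a, b),
}


def rank(query, vectors, index_measure, search_limit):
    """Return the names of the closest vectors under the chosen measure."""
    score = MEASURES[index_measure]
    ordered = sorted(vectors, key=lambda name: score(query, vectors[name]))
    return ordered[:search_limit]


# Toy data chosen so the measures disagree about the best match.
VECTORS = {"a": [3.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.5]}
QUERY = [1.0, 0.0]

for measure in MEASURES:
    print(measure, rank(QUERY, VECTORS, measure, search_limit=3))
```

Vector "a" points in exactly the query's direction but is far away in space, so cosine distance and inner product put it first while L2 distance puts it last.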
Hybrid Search
R2R supports hybrid search, which combines traditional keyword-based search with vector search for improved results. Here’s how to perform a hybrid search:
Knowledge Graph Search
R2R integrates knowledge graph capabilities to enhance search results with structured relationships. Knowledge graph search can be configured to focus on specific entity types, relationships, or search levels. Here’s how to utilize knowledge graph search:
Knowledge graphs are not constructed by default. Refer to the knowledge graph cookbook before attempting to run the command below!
Configurable parameters for knowledge graph search:
- use_kg_search: Enable knowledge graph search.
- kg_search_type: Choose between “global” or “local” search.
- kg_search_level: Specify the level of community to search.
- entity_types: List of entity types to include in the search.
- relationships: List of relationship types to include in the search.
- max_community_description_length: Maximum length of community descriptions.
- max_llm_queries_for_global_search: Limit on the number of LLM queries for global search.
- local_search_limits: Set limits for different types of local searches.
Retrieval-Augmented Generation (RAG)
R2R is built around a comprehensive Retrieval-Augmented Generation (RAG) engine, allowing you to generate contextually relevant responses based on your ingested documents. The RAG process combines all the search functionality shown above with Large Language Models to produce more accurate and informative answers.
Basic RAG
To generate a response using RAG, use the following command:
Example Output:
This command performs a search on the ingested documents and uses the retrieved information to generate a response.
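The two stages described here, search and then generation, can be sketched without any external service. In the example below a simple word-overlap score stands in for R2R's vector search, and the generation stage is reduced to building the prompt an LLM would receive. All names and the prompt wording are hypothetical.

```python
import re


def tokens(text: str) -> set:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, chunks: list, limit: int = 2) -> list:
    """Rank chunks by shared words with the query (a stand-in for vector search)."""
    query_terms = tokens(query)
    scored = [(len(query_terms & tokens(chunk)), chunk) for chunk in chunks]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [chunk for _, chunk in relevant[:limit]]


def build_prompt(query: str, context_chunks: list) -> str:
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1))
    return (
        "Answer the question using only the numbered context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


CHUNKS = [
    "Aristotle was a Greek philosopher.",
    "Paris is the capital of France.",
    "Aristotle tutored Alexander the Great.",
]

question = "Who was Aristotle?"
retrieved = retrieve(question, CHUNKS)
prompt = build_prompt(question, retrieved)
print(prompt)
```

The resulting prompt is what would be sent to the language model; numbering the context lets the model's answer refer back to specific chunks.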
RAG w/ Hybrid Search
R2R also supports hybrid search in RAG, combining the power of vector search and keyword-based search. To use hybrid search in RAG, simply add the use_hybrid_search flag to your search settings input:
Example Output:
This example demonstrates how hybrid search can enhance the RAG process by combining semantic understanding with keyword matching, potentially providing more accurate and comprehensive results.
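One common way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF), sketched below. The text above does not say which fusion method R2R uses, so treat RRF, the constant k=60, and the document IDs as assumptions chosen for illustration.

```python
def reciprocal_rank_fusion(rankings: list, k: int = 60) -> list:
    """Fuse several ranked lists of IDs; each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrieval methods.
vector_ranking = ["doc_a", "doc_b", "doc_c"]
keyword_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

Documents that appear in both lists accumulate two contributions, so they tend to outrank documents found by only one method.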
Streaming RAG
R2R also supports streaming RAG responses, which can be useful for real-time applications. To use streaming RAG:
Example Output:
Streaming allows the response to be generated and sent in real-time, chunk by chunk.
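The chunk-by-chunk behaviour can be illustrated with a plain Python generator. The producer below fakes a model response by slicing a fixed string, and the consumer displays each piece as soon as it arrives. Both functions are hypothetical; a real client would read chunks from the server's streaming response instead.

```python
from typing import Iterable, Iterator


def stream_chunks(text: str, chunk_size: int = 8) -> Iterator[str]:
    """Yield the response in small pieces instead of returning it all at once."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def consume(stream: Iterable[str]) -> str:
    """Display each chunk immediately and return the full response at the end."""
    parts = []
    for chunk in stream:
        print(chunk, end="", flush=True)  # show partial output right away
        parts.append(chunk)
    print()
    return "".join(parts)


full_response = consume(stream_chunks("Aristotle was a Greek philosopher."))
```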
Customizing RAG
R2R offers extensive customization options for its Retrieval-Augmented Generation (RAG) functionality:
- Search Settings: Customize vector and knowledge graph search parameters using VectorSearchSettings and KGSearchSettings.
- Generation Config: Fine-tune the language model’s behavior with GenerationConfig, including:
  - Temperature, top_p, top_k for controlling randomness
  - Max tokens, model selection, and streaming options
  - Advanced settings like beam search and sampling strategies
- Multiple LLM Support: Easily switch between different language models and providers:
  - OpenAI models (default)
  - Anthropic’s Claude models
  - Local models via Ollama
  - Any provider supported by LiteLLM
This flexibility allows you to optimize RAG performance for your specific use case and leverage the strengths of various LLM providers.
User Auth
R2R provides robust user auth and management capabilities. This section briefly covers user authentication features and how they relate to document management.
User Registration
To register a new user:
Example output:
Email Verification
After registration, users need to verify their email:
User Login
To log in and obtain access tokens:
Get Current User Info
To retrieve information about the currently authenticated user:
User-Specific Search
Once authenticated, search results are automatically filtered to include only documents associated with the current user:
Refresh Access Token
To refresh an expired access token:
User Logout
To log out and invalidate the current access token:
Replace YOUR_ACCESS_TOKEN and YOUR_REFRESH_TOKEN with actual tokens obtained during the login process.
Observability and Analytics
R2R provides robust observability and analytics features, allowing superusers to monitor system performance, track usage patterns, and gain insights into the RAG application’s behavior. These advanced features are crucial for maintaining and optimizing your R2R deployment.
Observability and analytics features are restricted to superusers only. By default, R2R is configured to treat unauthenticated users as superusers for quick testing and development. In a production environment, you should disable this setting and properly manage superuser access.
Users Overview
R2R offers high-level user observability for superusers:
This command returns detailed user information. Here’s some example output:
For each user, this summary reports the number of files ingested, the total size of those files, and the corresponding document ids.
Logging
R2R automatically logs various events and metrics during its operation. You can access these logs using the logs command:
This command returns detailed log entries for various operations, including search and RAG requests. Here’s an example of a log entry:
These logs provide detailed information about each operation, including search results, queries, latencies, and LLM responses.
Analytics
R2R offers an analytics feature that allows you to aggregate and analyze log data. You can use the analytics command to retrieve various statistics:
This command returns aggregated statistics based on the specified filters and analysis types. Here’s an example output:
This analytics feature allows you to:
- Filter logs based on specific criteria
- Perform statistical analysis on various metrics (e.g., search latencies)
- Track performance trends over time
- Identify potential bottlenecks or areas for optimization
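As a minimal sketch of the filter-then-aggregate idea, the example below selects log entries by field values and summarizes their latencies with Python's statistics module. The log field names ("event", "latency_s") and the sample values are invented for illustration and do not reflect R2R's actual log schema or the output of its analytics command.

```python
import statistics


def filter_logs(logs: list, **criteria) -> list:
    """Keep the entries whose fields equal every given criterion."""
    return [
        entry for entry in logs
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


def summarize(values: list) -> dict:
    """Basic statistics for a non-empty list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


# Hypothetical log entries; real entries carry many more fields.
LOGS = [
    {"event": "search", "latency_s": 0.2},
    {"event": "search", "latency_s": 0.4},
    {"event": "rag", "latency_s": 1.5},
    {"event": "search", "latency_s": 0.9},
]

search_latencies = [entry["latency_s"] for entry in filter_logs(LOGS, event="search")]
summary = summarize(search_latencies)
print(summary)
```

Comparing the median with the maximum is a quick way to spot outliers: a maximum far above the median suggests occasional slow requests rather than uniformly slow search.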
Custom Analytics
R2R’s analytics system is flexible and allows for custom analysis. You can specify different filters and analysis types to focus on specific aspects of your application’s performance. For example:
- Analyze RAG latencies
- Track usage patterns by user or document type
- Monitor error rates and types
- Assess the effectiveness of different LLM models or configurations
Adjust the filters and analysis_types parameters in the analytics command to suit your specific needs.