Analytics & Observability

Introduction

This guide demonstrates how to leverage R2R’s powerful analytics and logging features. These capabilities allow you to monitor system performance, track usage patterns, and gain valuable insights into your RAG application’s behavior.

The features described in this cookbook are typically restricted to superusers. Ensure you have the necessary permissions before attempting to access these features.

For more information on user roles and permissions, including how to set up and manage superuser accounts, please refer to our User Auth Cookbook.

Setup

Before diving into the authentication features, ensure you have R2R installed and configured as described in the installation guide. For this guide, we’ll use the default configuration. Further, r2r serve must be called to serve R2R in either your local environment or local Docker engine.

Basic Usage

Logging

R2R automatically logs various events and metrics during its operation. To fetch our logs using the client-server architecture, use the following:

r2r logs

Expected Output:

[
    {
        'run_id': UUID('27f124ad-6f70-4641-89ab-f346dc9d1c2f'),
        'run_type': 'rag',
        'entries': [
            {'key': 'search_query', 'value': 'Who is Aristotle?'},
            {'key': 'search_latency', 'value': '0.39'},
            {'key': 'search_results', 'value': '["{\\"id\\":\\"7ed3a01c-88dc-5a58-a68b-6e5d9f292df2\\",...}"]'},
            {'key': 'rag_generation_latency', 'value': '3.79'},
            {'key': 'llm_response', 'value': 'Aristotle (Greek: Ἀριστοτέλης Aristotélēs; 384–322 BC) was...'}
        ]
    },
    # More log entries...
]

These logs provide detailed information about each operation, including search results, queries, latencies, and LLM responses. To fetch the logs directly from an instantiated R2R object:

app = R2R()

# Perform some searches / RAG completions
# ...

# Get the latest logs
logs = app.logs()
print(logs)

Analytics

R2R offers an analytics feature that allows you to aggregate and analyze log data: The relevant command

r2r analytics --filters '{"search_latencies": "search_latency"}' --analysis-types '{"search_latencies": ["basic_statistics", "search_latency"]}'

Expected Output:

{
    'results': {
        'filtered_logs': {
            'search_latencies': [
                {
                    'timestamp': '2024-06-20 21:29:06',
                    'log_id': UUID('0f28063c-8b87-4934-90dc-4cd84dda5f5c'),
                    'key': 'search_latency',
                    'value': '0.66',
                    'rn': 3
                },
                ...
            ]
        },
        'search_latencies': {
            'Mean': 0.734,
            'Median': 0.523,
            'Mode': 0.495,
            'Standard Deviation': 0.213,
            'Variance': 0.0453
        }
    }
}

To fetch the analytics directly from an instantiated R2R object:

from r2r import FilterCriteria, AnalysisTypes

filter_criteria = FilterCriteria(filters={"search_latencies": "search_latency"})
analysis_types = AnalysisTypes(analysis_types={"search_latencies": ["basic_statistics", "search_latency"]})

analytics_results = app.analytics(filter_criteria, analysis_types)
print(analytics_results)

The boilerplate analytics implementation allows you to:

Filter logs based on specific criteria
Perform statistical analysis on various metrics (e.g., search latencies)
Track performance trends over time
Identify potential bottlenecks or areas for optimization

Experimental Features

Advanced analytics features are still in an experimental state - please reach out to the R2R team if you are interested in configuring / using these additional features.

Custom Analytics

R2R’s analytics system is flexible and allows for custom analysis. You can specify different filters and analysis types to focus on specific aspects of your application’s performance.

# Analyze RAG latencies
rag_filter = FilterCriteria(filters={"rag_latencies": "rag_generation_latency", "rag_eval": "rag_eval_metric"})
rag_analysis = AnalysisTypes(analysis_types={"rag_latencies": ["basic_statistics", "rag_generation_latency"]})
rag_analytics = app.analytics(rag_filter, rag_analysis)

# Track usage patterns by user
user_filter = FilterCriteria(filters={"user_patterns": "user_id"})
user_analysis = AnalysisTypes(analysis_types={"user_patterns": ["bar_chart", "user_id"]})
user_analytics = app.analytics(user_filter, user_analysis)

# Monitor error rates
error_filter = FilterCriteria(filters={"error_rates": "error"})
error_analysis = AnalysisTypes(analysis_types={"error_rates": ["basic_statistics", "error"]})
error_analytics = app.analytics(error_filter, error_analysis)

Preloading Data for Analysis

To get meaningful analytics, you need a substantial amount of data. Here’s a script to preload your database with random searches:

import random
from r2r import R2R, GenerationConfig

app = R2R()

# List of sample queries
queries = [
    "What is artificial intelligence?",
    "Explain machine learning.",
    "How does natural language processing work?",
    "What are neural networks?",
    "Describe deep learning.",
    # Add more queries as needed
]

# Perform random searches
for _ in range(1000):
    query = random.choice(queries)
    app.rag(query, GenerationConfig(model="openai/gpt-4o-mini"))

print("Preloading complete. You can now run analytics on this data.")

After running this script, you’ll have a rich dataset to analyze using the analytics features described above.

User-Level Analytics

To get analytics for a specific user:

user_id = "your_user_id_here"

user_filter = FilterCriteria(filters={"user_analytics": "user_id"})
user_analysis = AnalysisTypes(analysis_types={
    "user_analytics": ["basic_statistics", "user_id"],
    "user_search_latencies": ["basic_statistics", "search_latency"]
})

user_analytics = app.analytics(user_filter, user_analysis)
print(f"Analytics for user {user_id}:")
print(user_analytics)

This will give you insights into the behavior and performance of specific users in your system.

Summary

R2R’s logging and analytics features provide powerful tools for understanding and optimizing your RAG application. By leveraging these capabilities, you can:

Monitor system performance in real-time
Analyze trends in search and RAG operations
Identify potential bottlenecks or areas for improvement
Track user behavior and usage patterns
Make data-driven decisions to enhance your application’s performance and user experience

For detailed setup and basic functionality, refer back to the R2R Quickstart. For more advanced usage and customization options, join the R2R Discord community.

General

Auth & Admin

Deployment

Introduction

Setup

Basic Usage

Logging

Analytics

Experimental Features

Custom Analytics

Preloading Data for Analysis

User-Level Analytics

Summary

General

Auth & Admin

Deployment

​Introduction

​Setup

​Basic Usage

​Logging

​Analytics

​Experimental Features

​Custom Analytics

​Preloading Data for Analysis

​User-Level Analytics

​Summary

Introduction

Setup

Basic Usage

Logging

Analytics

Experimental Features

Custom Analytics

Preloading Data for Analysis

User-Level Analytics

Summary