Embedding
Embedding Provider
By default, R2R uses the LiteLLM framework to communicate with various cloud embedding providers. To customize the embedding settings:
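For example, a minimal [embedding] section in r2r.toml might look like the following (the key names mirror the options described below; the values themselves are illustrative):

```toml
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 1536
batch_size = 128
add_title_as_prefix = false
rerank_model = "None"
concurrent_request_limit = 256
```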
Let’s break down the embedding configuration options:
- provider: Choose from ollama, litellm and openai. R2R defaults to using the LiteLLM framework for maximum embedding provider flexibility.
- base_model: Specifies the embedding model to use. The format is typically "provider/model-name" (e.g., "openai/text-embedding-3-small").
- base_dimension: Sets the dimension of the embedding vectors. This should match the output dimension of the chosen model.
- batch_size: Determines the number of texts to embed in a single API call. Larger values can improve throughput but may increase latency.
- add_title_as_prefix: When true, prepends the document title to the text before embedding, providing additional context.
- rerank_model: Specifies a model for reranking results. Set to "None" to disable reranking (note: reranking is not supported by LiteLLMEmbeddingProvider).
- concurrent_request_limit: Sets the maximum number of concurrent embedding requests, to manage load and avoid rate limiting.
Supported LiteLLM Providers
All of the embedding providers listed below are supported through LiteLLM.
Example configuration:
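A sketch of an r2r.toml [embedding] section for OpenAI via LiteLLM (values are illustrative; 1536 is the default output dimension of text-embedding-3-small):

```toml
# Requires OPENAI_API_KEY in the environment
[embedding]
provider = "litellm"
base_model = "openai/text-embedding-3-small"
base_dimension = 1536
```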
Supported models include:
- text-embedding-3-small
- text-embedding-3-large
- text-embedding-ada-002
For detailed usage instructions, refer to the LiteLLM OpenAI Embedding documentation.
Example configuration:
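A sketch for Azure OpenAI via LiteLLM. LiteLLM addresses Azure models by deployment name under the azure/ prefix and reads credentials from the environment; the deployment placeholder and dimension below are illustrative:

```toml
# LiteLLM reads Azure credentials from the environment:
#   AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION
[embedding]
provider = "litellm"
base_model = "azure/<your-deployment-name>"  # e.g. a deployment of text-embedding-ada-002
base_dimension = 1536  # text-embedding-ada-002 outputs 1536-dimensional vectors
```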
Supported models include:
- text-embedding-ada-002
For detailed usage instructions, refer to the LiteLLM Azure Embedding documentation.
Anthropic does not currently offer embedding models. Consider using OpenAI or another provider for embeddings.
Example configuration:
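A sketch for Cohere via LiteLLM (the model choice is illustrative; pick any of the models listed below and set base_dimension to that model's output dimension):

```toml
# Requires COHERE_API_KEY in the environment
[embedding]
provider = "litellm"
base_model = "cohere/embed-english-v3.0"
base_dimension = 1024  # embed-english-v3.0 outputs 1024-dimensional vectors
```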
Supported models include:
- embed-english-v3.0
- embed-english-light-v3.0
- embed-multilingual-v3.0
- embed-multilingual-light-v3.0
- embed-english-v2.0
- embed-english-light-v2.0
- embed-multilingual-v2.0
For detailed usage instructions, refer to the LiteLLM Cohere Embedding documentation.
When running with Ollama, additional changes to the r2r.toml file are recommended. In addition to using the ollama provider directly, we recommend restricting concurrent_request_limit in order to avoid exceeding the throughput of your Ollama server.
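For instance (the model name and limits below are illustrative; use any embedding model you have pulled into your Ollama server):

```toml
[embedding]
provider = "ollama"
base_model = "mxbai-embed-large"
base_dimension = 1024  # match your model's output dimension
batch_size = 32
concurrent_request_limit = 2  # keep low to match local Ollama throughput
```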
Then, deploy your R2R server with r2r serve --config-path=r2r.toml.
Example configuration:
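A sketch, assuming a HuggingFace feature-extraction model addressed through LiteLLM's huggingface/ prefix (the model and dimension below are illustrative):

```toml
# Requires HUGGINGFACE_API_KEY in the environment
[embedding]
provider = "litellm"
base_model = "huggingface/BAAI/bge-large-en-v1.5"
base_dimension = 1024  # bge-large-en-v1.5 outputs 1024-dimensional vectors
```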
LiteLLM supports all Feature-Extraction Embedding models on HuggingFace.
For detailed usage instructions, refer to the LiteLLM HuggingFace Embedding documentation.
Example configuration:
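A sketch for Amazon Bedrock via LiteLLM's bedrock/ prefix, assuming AWS credentials are available in the environment:

```toml
# AWS credentials are read from the environment / AWS config, e.g.
#   AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME
[embedding]
provider = "litellm"
base_model = "bedrock/amazon.titan-embed-text-v1"
base_dimension = 1536  # titan-embed-text-v1 outputs 1536-dimensional vectors
```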
Supported models include:
- amazon.titan-embed-text-v1
- cohere.embed-english-v3
- cohere.embed-multilingual-v3
For detailed usage instructions, refer to the LiteLLM Bedrock Embedding documentation.
Example configuration:
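A sketch for Vertex AI via LiteLLM's vertex_ai/ prefix, assuming Google Cloud credentials and project settings are configured for LiteLLM:

```toml
# Requires Google Cloud credentials, e.g. GOOGLE_APPLICATION_CREDENTIALS,
# plus LiteLLM's Vertex project/location settings
[embedding]
provider = "litellm"
base_model = "vertex_ai/textembedding-gecko"
base_dimension = 768  # textembedding-gecko outputs 768-dimensional vectors
```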
Supported models include:
- textembedding-gecko
- textembedding-gecko-multilingual
- textembedding-gecko@001
- textembedding-gecko@003
- text-embedding-preview-0409
- text-multilingual-embedding-preview-0409
For detailed usage instructions, refer to the LiteLLM Vertex AI Embedding documentation.
Example configuration:
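A sketch for Voyage AI via LiteLLM's voyage/ prefix (model and dimension are illustrative; match base_dimension to your chosen model):

```toml
# Requires VOYAGE_API_KEY in the environment
[embedding]
provider = "litellm"
base_model = "voyage/voyage-01"
base_dimension = 1024  # voyage-01 outputs 1024-dimensional vectors
```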
Supported models include:
- voyage-01
- voyage-lite-01
- voyage-lite-01-instruct
For detailed usage instructions, refer to the LiteLLM Voyage AI Embedding documentation.