r2r.toml
- provider: The LLM provider (defaults to "LiteLLM" for maximum flexibility).
- concurrent_request_limit: Maximum number of concurrent LLM requests.
- model: The language model to use for generation.
- temperature: Controls the randomness of the output (0.0 to 1.0).
- top_p: Nucleus sampling parameter (0.0 to 1.0).
- max_tokens_to_sample: Maximum number of tokens to generate.
- stream: Enable/disable streaming of generated text.
- api_base: The base URL for remote communication, e.g. https://api.openai.com/v1.
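For reference, here is a minimal sketch of how these settings might look in r2r.toml. The section name `[completion]`, the nested `generation_config` table, and the specific values are illustrative assumptions; only the keys themselves come from the list above.

```toml
[completion]
# provider and request throttling for LLM calls (values are illustrative)
provider = "litellm"
concurrent_request_limit = 16

[completion.generation_config]
# generation parameters passed through to the underlying model
model = "openai/gpt-4o"
temperature = 0.1
top_p = 1.0
max_tokens_to_sample = 1024
stream = false
# only needed when targeting a remote, OpenAI-compatible endpoint
api_base = "https://api.openai.com/v1"
```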
Serving select LLM providers
- OpenAI
- Azure
- Anthropic
- Vertex AI
- AWS Bedrock
- Groq
- Ollama
- Cohere
- Anyscale
Models are specified with LiteLLM-style, provider-prefixed identifiers, for example:
- openai/gpt-4o
- openai/gpt-4-turbo
- openai/gpt-4
- openai/gpt-4o-mini
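Switching providers is then a matter of changing the model identifier and, where relevant, the API base. A hypothetical sketch for a local Ollama deployment, reusing the assumed `[completion]` layout from above (the model name is an example, not a required value):

```toml
[completion]
provider = "litellm"

[completion.generation_config]
# LiteLLM routes "ollama/..." identifiers to a local Ollama server
model = "ollama/llama3.1"
api_base = "http://localhost:11434"  # Ollama's default endpoint
```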

