`r2r.toml`:

- `provider`: The LLM provider (defaults to "LiteLLM" for maximum flexibility).
- `concurrent_request_limit`: Maximum number of concurrent LLM requests.
- `model`: The language model to use for generation.
- `temperature`: Controls the randomness of the output (0.0 to 1.0).
- `top_p`: Nucleus sampling parameter (0.0 to 1.0).
- `max_tokens_to_sample`: Maximum number of tokens to generate.
- `stream`: Enable/disable streaming of generated text.
- `api_base`: The base URL for remote communication, e.g. `https://api.openai.com/v1`.
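Taken together, the keys above might look like this in `r2r.toml`. This is a minimal sketch, not the definitive schema: the `[completion]` table name, the nesting under `generation_config`, and all example values shown are assumptions for illustration.

```toml
# Sketch of an LLM configuration block in r2r.toml.
# Table names and values here are illustrative assumptions.
[completion]
provider = "litellm"              # defaults to LiteLLM for maximum flexibility
concurrent_request_limit = 16     # max concurrent LLM requests

  [completion.generation_config]
  model = "openai/gpt-4o"         # language model used for generation
  temperature = 0.1               # randomness of output, 0.0 to 1.0
  top_p = 1.0                     # nucleus sampling parameter, 0.0 to 1.0
  max_tokens_to_sample = 1024     # cap on generated tokens
  stream = false                  # enable/disable streamed generation
  api_base = "https://api.openai.com/v1"  # base URL for remote communication
```

Keeping generation parameters under a nested table like this keeps provider-level settings (such as the concurrency limit) separate from per-request sampling settings.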
Serving select LLM providers, for example:
- openai/gpt-4o
- openai/gpt-4-turbo
- openai/gpt-4
- openai/gpt-4o-mini