External Documentation

To learn more, visit the LiteLLM documentation.
Basic Parameters
| Parameter | Description |
|---|---|
| Completions Number | The number of completions to generate. |
| Frequency Penalty | Penalizes new tokens based on their existing frequency in the text so far. Accepts values from -2.0 to 2.0. |
| Include Log Probabilities | Select to include the log probabilities for the most likely tokens. |
| Include Top Log Probabilities | The number of log probabilities to return for the most likely tokens at each generation step. Accepts an integer value between 0 and 5. This parameter is only available when Include Log Probabilities is selected. |
| Logit Bias | A JSON object mapping token IDs to bias values. This modifies the probability of generating specific tokens. See the example after this table. |
| Max Completion Tokens | The maximum number of tokens to generate in the completion. This includes visible output tokens and reasoning tokens. Note: This parameter can’t be used along with Max Tokens. |
| Max Tokens | The maximum number of tokens to generate in the completion. Note: This parameter can’t be used along with Max Completion Tokens. |
| Messages | The list of messages in the conversation. See the example after this table. |
| Model ID | The ID of the model to use for chat completions. |
| Parallel Tool Calls | Select to allow the model to call multiple tools in parallel. Note: This parameter requires Tools to be set. |
| Presence Penalty | Penalizes tokens based on whether they appear in the text so far. Accepts values from -2.0 to 2.0. |
| Provider | The provider of the chat model. Selecting a provider gives access to the corresponding provider-specific parameters. |
| Response Format | The format that the model must output. Set to { "type": "json_object" } to enable JSON mode, which guarantees the model generates valid JSON. Important Note: When using JSON mode, you must also instruct the model to produce JSON via a system or user message. Without this, the model may generate whitespace indefinitely until reaching the token limit. |
| Safety Identifier | A unique ID to help track and manage safety-related requests. |
| Seed | A seed to make the generated output more deterministic. |
| Stop Tokens | A comma-separated list of tokens that will stop the generation. Example: \n, END, --- |
| Temperature | The degree of randomness to use in the generation. Valid range is 0.0 to 2.0. |
| Tool Choice | Controls which function is called by the model. This parameter accepts either a string or an object. Valid string values are auto, required, and none. To force a specific tool, pass an object; see the example after this table. |
| Tools | A list of JSON objects representing tools that the model can use. See the example after this table. |
| Top P | An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens comprising the top P probability mass. |
| User ID | The ID of the user to associate with the request. |
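For Logit Bias, the keys are token IDs from the model's tokenizer and the values are biases, typically from -100 (effectively bans the token) to 100 (strongly favors it). A minimal sketch, with illustrative token IDs that vary by model:

```json
{
  "50256": -100,
  "15496": 10
}
```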
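Messages follow the standard chat format of role and content pairs, for example:

```json
[
  { "role": "system", "content": "You are a helpful assistant." },
  { "role": "user", "content": "Summarize this report in three bullet points." }
]
```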
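An object value for Tool Choice forces the model to call one specific tool. A sketch, assuming a hypothetical function named get_weather:

```json
{
  "type": "function",
  "function": { "name": "get_weather" }
}
```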
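Tools are declared as function definitions whose parameters are described with JSON Schema. A sketch defining the same hypothetical get_weather function:

```json
[
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city.",
      "parameters": {
        "type": "object",
        "properties": {
          "city": { "type": "string", "description": "The city name, e.g. Berlin." }
        },
        "required": ["city"]
      }
    }
  }
]
```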
Advanced Parameters
| Parameter | Description |
|---|---|
| API Base | The API endpoint you want to call the model with. |
| API Version | The API version for the call. This is an Azure-specific parameter. |
| Additional Parameters | A JSON object for additional body parameters. Values specified here override the equivalent standard parameters. The object must follow the vendor’s structure as defined in its API documentation. See the example after this table. |
| Fallbacks | A list of JSON objects representing fallback models to use in case the primary model fails. See the example after this table. |
| Metadata | Additional metadata to send with the request for logging. See the example after this table. |
| Num Retries | The number of times to retry the request in case of an APIError, TimeoutError, or ServiceUnavailableError. |
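For Additional Parameters, the object is passed through in the request body, so its keys must match the vendor's API. A sketch, assuming the target provider accepts a top_k sampling parameter:

```json
{
  "top_k": 40
}
```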
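A sketch of a Fallbacks value; the model IDs are placeholders, and the exact object shape should follow the fallback format described in the LiteLLM documentation:

```json
[
  { "model": "azure/gpt-4o" },
  { "model": "anthropic/claude-3-5-sonnet-20240620" }
]
```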
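A sketch of a Metadata value; the keys are arbitrary labels you define for your own logging pipeline:

```json
{
  "session_id": "abc-123",
  "request_source": "checkout-service"
}
```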