Generate a chat completion from a model using a conversation history.
External Documentation: To learn more, visit the LiteLLM documentation.
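For orientation, here is a minimal sketch of the kind of LiteLLM call this action wraps, assuming the litellm package is installed and an API key is set in the environment (the model name is illustrative). Most of the parameters below map directly onto keyword arguments of litellm.completion():

from litellm import completion

# Minimal chat completion call from a conversation history.
response = completion(
    model="gpt-4o-mini",  # illustrative model ID
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)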

Basic Parameters

Completions Number: The number of completions to generate.

Frequency Penalty: Penalize new tokens based on their existing frequency. Accepts values from -2.0 to 2.0.

Include Log Probabilities: Select to include the log probabilities for the most likely tokens.

Include Top Log Probabilities: The number of log probabilities to return for the most likely tokens at each generation step. Accepts an integer value between 0 and 5. This parameter is only available when Include Log Probabilities is selected.
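As an illustration, these two parameters correspond to the logprobs and top_logprobs arguments in a direct LiteLLM call (a sketch; the model name and values are illustrative):

from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello."}],
    logprobs=True,    # Include Log Probabilities
    top_logprobs=3,   # Include Top Log Probabilities (0-5)
)
# Each generated token carries its log probability and top alternatives.
print(response.choices[0].logprobs)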
Logit Bias: A JSON object mapping token IDs to bias values. This modifies the probability of generating specific tokens.

Example:
{
  "2683": -100,
  "7211": 5
}
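In a direct LiteLLM call this maps to the logit_bias argument; a bias of -100 effectively bans a token, while positive values make it more likely (token IDs here are illustrative):

from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Pick a word."}],
    logit_bias={"2683": -100, "7211": 5},  # ban token 2683, favor 7211
)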
Max Completion Tokens: The maximum number of tokens to generate in the completion. This includes visible output tokens and reasoning tokens.

Note: This parameter can't be used along with Max Tokens.

Max Tokens: The maximum number of tokens to generate in the completion.

Note: This parameter can't be used along with Max Completion Tokens.
Messages: The list of messages in the conversation.

Example:
[
  {
    "role": "system",
    "content": "You are a helpful assistant that answers questions about programming."
  },
  {
    "role": "user",
    "content": "Can you explain what a JSON map is?"
  }
]
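Because the action generates completions from a conversation history, a common pattern is to append each assistant reply to the message list before the next user turn. A minimal sketch:

from litellm import completion

messages = [
    {"role": "system", "content": "You are a helpful assistant that answers questions about programming."},
    {"role": "user", "content": "Can you explain what a JSON map is?"},
]
response = completion(model="gpt-4o-mini", messages=messages)

# Append the assistant's reply so the next turn keeps the full history.
messages.append({"role": "assistant", "content": response.choices[0].message.content})
messages.append({"role": "user", "content": "How is it different from a list?"})
response = completion(model="gpt-4o-mini", messages=messages)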
Model ID: The ID of the model to use for chat completions.

Parallel Tool Calls: Select to allow the model to call multiple tools in parallel.

Note: To set this parameter to True, the Tools parameter must also be set.

Presence Penalty: Penalize tokens based on whether they appear in the text so far. Valid range is -2.0 to 2.0.

Provider: Select the provider of the chat model. This determines which provider-specific parameters are available.
Response Format: The format that the model must output.

Set to { "type": "json_object" } to enable JSON mode, which guarantees the model generates valid JSON.

Important Note: When using JSON mode, you must also instruct the model to produce JSON via a system or user message. Without this, the model may generate whitespace indefinitely until reaching the token limit.
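A sketch of JSON mode via LiteLLM; note that the system message explicitly asks for JSON, as required by the note above:

import json
from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'city' and 'temperature_c'."},
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    response_format={"type": "json_object"},  # JSON mode
)
data = json.loads(response.choices[0].message.content)  # valid JSON in JSON mode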
Safety Identifier: A unique ID to help track and manage safety-related requests.

Seed: A seed to make the generated output more deterministic.

Stop Tokens: A comma-separated list of tokens that will stop the generation.

Example: \n, END, ---

Temperature: The degree of randomness to use in the generation. Valid range is 0.0 to 2.0.
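Seed, Stop Tokens, and Temperature map onto the seed, stop, and temperature keyword arguments of a direct LiteLLM call. A minimal sketch with illustrative values:

from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "List three colors."}],
    temperature=0.2,             # low randomness
    seed=42,                     # best-effort determinism across runs
    stop=["\n", "END", "---"],   # generation halts at any of these
)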
Tool Choice: Control which function is called by the model. This parameter accepts either a string or an object.

The valid string values are:
* auto
* required
* none

Object Example:
{
  "type": "function",
  "function": {
    "name": "get_weather"
  }
}
Tools: A list of JSON objects representing tools that the model can use. A sketch combining Tools and Tool Choice follows the example below.

Example:
[
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Fetches the current weather for a given location.",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city or location to get the weather for."
          },
          "date": {
            "type": "string",
            "description": "The date for which to get the weather (YYYY-MM-DD)."
          }
        },
        "required": ["location"]
      }
    }
  }
]
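Putting Tools and Tool Choice together, here is a sketch of a direct LiteLLM call that forces the get_weather function and reads back the resulting tool call (get_weather is the hypothetical function from the example above; the model does not execute it, it only returns the arguments):

from litellm import completion

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Fetches the current weather for a given location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city or location to get the weather for."},
                    "date": {"type": "string", "description": "The date for which to get the weather (YYYY-MM-DD)."},
                },
                "required": ["location"],
            },
        },
    }
]

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in New York City?"}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)

# The model returns the name and arguments of the function it wants called.
call = response.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)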
Top P: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with Top P probability mass.

User ID: The ID of the user to associate with the request.

Advanced Parameters

API Base: The API endpoint you want to call the model with.

API Version: The API version for the call. This is an Azure-specific parameter.

Additional Parameters: A JSON object for additional body parameters. Values specified here will override the equivalent standard parameters.

For example:
{
  "first_key": 12345,
  "second_key": "some_value"
}

The object must follow the vendor's structure as defined in the API documentation.
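As an illustration, API Base and API Version correspond to the api_base and api_version arguments of litellm.completion, most commonly used with Azure OpenAI deployments. A sketch; the endpoint, deployment name, and version string are placeholders:

import os
from litellm import completion

response = completion(
    model="azure/my-gpt4o-deployment",                 # placeholder deployment name
    api_base="https://my-resource.openai.azure.com",   # placeholder endpoint
    api_version="2024-02-15-preview",                  # placeholder Azure API version
    api_key=os.environ["AZURE_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}],
)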
Fallbacks: A list of JSON objects representing fallback models to use in case the primary model fails.

Example:
[
  {
    "openai/*": [
      "gpt-4o-mini"
    ]
  }
]
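One way to exercise fallbacks directly is through LiteLLM's Router, which retries a failed request against the listed fallback deployments. A sketch with illustrative model names:

from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-4o", "litellm_params": {"model": "openai/gpt-4o"}},
        {"model_name": "gpt-4o-mini", "litellm_params": {"model": "openai/gpt-4o-mini"}},
    ],
    fallbacks=[{"gpt-4o": ["gpt-4o-mini"]}],  # if gpt-4o fails, retry on gpt-4o-mini
)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)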
Metadata: Additional metadata to send with the request for logging.

Example:
{
  "request_id": "req-8f21c9a2",
  "tenant_id": "acme-prod",
  "workflow": "blink-workflow",
  "feature": "docs-helper",
  "trace_id": "trace-91b7e3",
  "user_id": "user-1234",
  "debug": true
}
Num Retries: The number of times to retry the request in case of an APIError, TimeoutError, or ServiceUnavailableError.
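Both Metadata and Num Retries pass straight through to litellm.completion, where the metadata is forwarded to any configured logging callbacks. A minimal sketch:

from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    metadata={"request_id": "req-8f21c9a2", "workflow": "blink-workflow"},
    num_retries=3,  # retry on APIError, TimeoutError, ServiceUnavailableError
)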

Example Output

{
	"id": "chatcmpl-CzHVeyzZj09MzGSXlmesrcTztrr66",
	"created": 1768721350,
	"model": "gpt-4o-mini-2024-07-18",
	"object": "chat.completion",
	"system_fingerprint": "fp_29330a9688",
	"choices": [
		{
			"finish_reason": "stop",
			"index": 0,
			"message": {
				"content": "The current weather in New York City is cloudy, with a temperature of 12°C.",
				"role": "assistant"
			},
			"logprobs": {
				"content": [
					{
						"token": "The",
						"bytes": [
							84
						],
						"logprob": -0.001632451661862433,
						"top_logprobs": [
							{
								"token": "The",
								"bytes": [
									84
								],
								"logprob": -0.001632451661862433
							},
							{
								"token": "In",
								"bytes": [
									73
								],
								"logprob": -7.001632213592529
							},
							{
								"token": "Currently",
								"bytes": [
									67
								],
								"logprob": -7.376632213592529
							}
						]
					}
				]
			}
		}
	]
}
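A sketch of pulling the useful fields out of a response shaped like the example above. LiteLLM's response object mirrors the OpenAI chat completion schema, so the attribute names below follow that schema:

from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in New York City?"}],
    logprobs=True,
    top_logprobs=3,
)

reply = response.choices[0].message.content   # the assistant's text
reason = response.choices[0].finish_reason    # e.g. "stop"
model_used = response.model                   # e.g. "gpt-4o-mini-2024-07-18"

# When log probabilities were requested, per-token details are available too.
if response.choices[0].logprobs:
    first_token = response.choices[0].logprobs.content[0]
    print(first_token.token, first_token.logprob)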

Workflow Library Example

Generate Chat Completion with Litellm and Send Results Via Email