Optional callbacks
Callbacks for this call and any sub-calls (eg. a Chain calling an LLM). Tags are passed to all callbacks, metadata is passed to handle*Start callbacks.
Optional chatHistory
A list of previous messages between the user and the model, giving the model conversational context for responding to the user's message.
Each item represents a single message in the chat history, excluding the current user turn. It has two properties: role and message. The role identifies the sender (CHATBOT, SYSTEM, or USER), while the message contains the text content.
The chat_history parameter should not be used for SYSTEM messages in most cases. Instead, to add a SYSTEM role message at the beginning of a conversation, the preamble parameter should be used.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
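For illustration, here is a minimal TypeScript sketch of supplying conversational context. The chatHistory call-option name and its pass-through to Cohere's chat_history parameter are assumptions about this wrapper, not confirmed API:

import { ChatCohere } from "@langchain/cohere";

const model = new ChatCohere({ model: "command-r-plus" });

// Prior turns between the user and the model, excluding the current message.
const chatHistory = [
  { role: "USER", message: "Who discovered gravity?" },
  { role: "CHATBOT", message: "Isaac Newton is credited with discovering gravity." },
];

// Assumption: chatHistory is forwarded as Cohere's chat_history parameter.
const reply = await model.invoke("When was he born?", { chatHistory });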
Optional citationQuality
Defaults to "accurate".
Dictates the approach taken to generating citations as part of the RAG flow by allowing the user to specify whether they want "accurate" results or "fast" results.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional configurable
Runtime values for attributes previously made configurable on this Runnable, or sub-Runnables.
Optional connectors
A list of connectors (for example, { id: "web-search" }) that the model can query to enrich its reply with retrieved information (RAG).
Optional conversationId
An alternative to chat_history. Providing a conversation_id creates or resumes a persisted conversation with the specified ID. The ID can be any non-empty string.
Compatible Deployments: Cohere Platform
Optional documents
A list of relevant documents that the model can cite to generate a more accurate reply. Each document is a string-to-string dictionary.
Example:
[
  { "title": "Tall penguins", "text": "Emperor penguins are the tallest." },
  { "title": "Penguin habitats", "text": "Emperor penguins only live in Antarctica." },
]
Keys and values from each document will be serialized to a string and passed to the model. The resulting generation will include citations that reference some of these documents.
Some suggested keys are "text", "author", and "date". For better generation quality, it is recommended to keep the total word count of the strings in the dictionary to under 300 words.
An id field (string) can be optionally supplied to identify the document in the citations. This field will not be passed to the model.
An _excludes field (array of strings) can be optionally supplied to omit some key-value pairs from being shown to the model. The omitted fields will still show up in the citation object. The _excludes field itself will not be passed to the model.
See 'Document Mode' in the guide for more information.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
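A sketch of grounded generation with documents, reusing the model instance from the chatHistory sketch above; the documents call option and its pass-through are assumptions:

// Assumption: documents is forwarded as Cohere's documents parameter.
const answer = await model.invoke("Where do the tallest penguins live?", {
  documents: [
    { id: "doc-1", title: "Tall penguins", text: "Emperor penguins are the tallest." },
    {
      id: "doc-2",
      title: "Penguin habitats",
      text: "Emperor penguins only live in Antarctica.",
      url: "https://example.com/penguins",
      _excludes: ["url"], // "url" still appears in citations but is hidden from the model
    },
  ],
});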
Optional forceSingleStep
Forces the chat to be single step. Defaults to false.
Optional frequencyPenalty
Defaults to 0.0, min value of 0.0, max value of 1.0.
Used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional k
Ensures only the top k most likely tokens are considered for generation at each step.
Defaults to 0, min value of 0, max value of 500.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional maxConcurrency
Maximum number of parallel calls to make.
Optional maxInputTokens
The maximum number of input tokens to send to the model. If not specified, max_input_tokens is the model's context length limit minus a small buffer.
Input will be truncated according to the prompt_truncation parameter.
Compatible Deployments: Cohere Platform
Optional maxTokens
The maximum number of tokens the model will generate as part of the response. Note: Setting a low value may result in incomplete generations.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional metadata
Metadata for this call and any sub-calls (eg. a Chain calling an LLM). Keys should be strings, values should be JSON-serializable.
Optional model
Defaults to command-r-plus.
The name of a compatible Cohere model or the ID of a fine-tuned model.
Compatible Deployments: Cohere Platform, Private Deployments
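For reference, a minimal instantiation with the documented defaults spelled out explicitly:

import { ChatCohere } from "@langchain/cohere";

const model = new ChatCohere({
  model: "command-r-plus", // the documented default model
  temperature: 0.3,        // the documented default temperature
});

const res = await model.invoke("Hello!");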
Optional p
Ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. If both k and p are enabled, p acts after k.
Defaults to 0.75, min value of 0.01, max value of 0.99.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
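A sketch combining the two sampling controls, reusing model from above; whether k and p are accepted per call (rather than on the constructor) is an assumption:

// Assumption: k and p are forwarded to Cohere's top-k / top-p sampling parameters.
const creative = await model.invoke("Write a tagline for a penguin cafe.", {
  k: 50,  // consider only the 50 most likely tokens at each step
  p: 0.9, // then keep the smallest set whose total probability mass reaches 0.9
});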
Optional preamble
When specified, the default Cohere preamble will be replaced with the provided one. Preambles are a part of the prompt used to adjust the model's overall behavior and conversation style, and use the SYSTEM role.
The SYSTEM role is also used for the contents of the optional chat_history parameter. When used with the chat_history parameter it adds content throughout a conversation. Conversely, when used with the preamble parameter it adds content at the start of the conversation only.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
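For example, replacing the default preamble (the preamble call-option name and pass-through are assumptions about this wrapper):

// Assumption: preamble is forwarded as Cohere's preamble parameter.
const styled = await model.invoke("Describe the weather today.", {
  preamble: "You are an old sea captain. Answer every question in pirate speak.",
});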
Optional presencePenalty
Defaults to 0.0, min value of 0.0, max value of 1.0.
Used to reduce repetitiveness of generated tokens. Similar to frequency_penalty, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional promptTruncation
Defaults to AUTO when connectors are specified and OFF in all other cases.
Dictates how the prompt will be constructed.
With prompt_truncation set to "AUTO", some elements from chat_history and documents will be dropped in an attempt to construct a prompt that fits within the model's context length limit. During this process the order of the documents and chat history will be changed and ranked by relevance.
With prompt_truncation set to "AUTO_PRESERVE_ORDER", some elements from chat_history and documents will be dropped in an attempt to construct a prompt that fits within the model's context length limit. During this process the order of the documents and chat history will be preserved as they are inputted into the API.
With prompt_truncation set to "OFF", no elements will be dropped. If the sum of the inputs exceeds the model's context length limit, a TooManyTokens error will be returned.
Compatible Deployments: Cohere Platform (all modes); AUTO_PRESERVE_ORDER only: Azure, AWS Sagemaker, Private Deployments
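A sketch of enabling truncation for a long conversation (the promptTruncation call-option name is an assumption; chatHistory reuses the turns from the earlier sketch):

// Assumption: promptTruncation maps to Cohere's prompt_truncation parameter.
const summary = await model.invoke("Summarize our discussion so far.", {
  chatHistory,              // earlier turns from the chatHistory sketch
  promptTruncation: "AUTO", // drop and re-rank elements to fit the context window
});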
Optional rawPrompting
When enabled, the user's prompt will be sent to the model without any pre-processing.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional recursionLimit
Maximum number of times a call can recurse. If not provided, defaults to 25.
Optional returnPrompt
The prompt is returned in the prompt response field when this is enabled.
Optional runId
Unique identifier for the tracer run for this call. If not provided, a new UUID will be generated.
Optional runName
Name for the tracer run for this call. Defaults to the name of the class.
Optional searchQueriesOnly
Defaults to false.
When true, the response will only contain a list of generated search queries, but no search will take place, and no reply from the model to the user's message will be generated.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
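A sketch (the searchQueriesOnly call-option name is an assumption about this wrapper):

// Assumption: searchQueriesOnly maps to Cohere's search_queries_only parameter.
const queries = await model.invoke(
  "What's the latest on the Mars sample return mission?",
  { searchQueriesOnly: true },
);
// The response contains generated search queries instead of a normal reply.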
Optional seed
If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism cannot be totally guaranteed.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional signal
Abort signal for this call. If provided, the call will be aborted when the signal is aborted.
Optional stop
Stop tokens to use for this call. If not provided, the default stop tokens for the model will be used.
Optional stopSequences
A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any of the strings in the list, it will stop generating tokens and return the generated text up to that point, not including the stop sequence.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional streamUsage
Whether or not to include token usage when streaming.
This will include an extra chunk at the end of the stream with eventType: "stream-end" and the token usage in usage_metadata.
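A sketch of streaming with usage reporting; treating streamUsage as a per-call option (rather than a constructor field) is an assumption:

// Assumption: streamUsage enables the final usage chunk described above.
const stream = await model.stream("Tell me about emperor penguins.", {
  streamUsage: true,
});
for await (const chunk of stream) {
  process.stdout.write(String(chunk.content));
  if (chunk.usage_metadata) {
    // The final chunk carries token counts for the whole call.
    console.log("\nTokens used:", chunk.usage_metadata);
  }
}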
Optional tags
Tags for this call and any sub-calls (eg. a Chain calling an LLM). You can use these to filter calls.
Optional temperature
Defaults to 0.3.
A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations, and higher temperatures mean more random generations.
Randomness can be further maximized by increasing the value of the p parameter.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
Optional timeout
Timeout for this call in milliseconds.
Optional toolResults
A list of results from invoking tools recommended by the model in the previous chat turn. Results are used to produce a text response and will be referenced in citations. When using tool_results, tools must be passed as well.
Each tool_result contains information about how it was invoked, as well as a list of outputs in the form of dictionaries.
Note: outputs must be a list of objects. If your tool returns a single object (eg {"status": 200}), make sure to wrap it in a list.
tool_results = [
  {
    "call": {
      "name": <tool name>,
      "parameters": {
        <param name>: <param value>
      }
    },
    "outputs": [{
      <key>: <value>
    }]
  },
  ...
]
Note: Chat calls with tool_results should not be included in the chat history to avoid duplication of the message text.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker, Private Deployments
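A hedged TypeScript sketch of one turn that feeds tool outputs back to the model, reusing model from above. The toolResults/tools call-option names and the exact tool schema shown are assumptions about this wrapper, and get_weather is a hypothetical tool:

// Hypothetical tool definition, loosely following Cohere's tool format.
const tools = [
  {
    name: "get_weather",
    description: "Returns current weather for a city.",
    parameterDefinitions: {
      city: { type: "str", required: true, description: "City name" },
    },
  },
];

// Assumption: toolResults maps to Cohere's tool_results parameter.
const final = await model.invoke("What's the weather in Toronto?", {
  tools,
  toolResults: [
    {
      call: { name: "get_weather", parameters: { city: "Toronto" } },
      // A single result object, wrapped in a list as required.
      outputs: [{ status: 200, temperatureC: -4, conditions: "Snow" }],
    },
  ],
});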
Optional tool_choice
Specifies how the chat model should use tools.
Possible values:
- "auto": The model may choose to use any of the provided tools, or none.
- "any": The model must use one of the provided tools.
- "none": The model must not use any tools.
- A string (not "auto", "any", or "none"): The name of a specific tool the model must use.
- An object: A custom schema specifying tool choice parameters. Specific to the provider.
Note: Not all providers support tool_choice. An error will be thrown if used with an unsupported model.
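For illustration, forcing tool use (support for tool_choice with Cohere models is not guaranteed, per the note above; tools is the hypothetical list from the toolResults sketch):

// The model must call one of the provided tools; "none" would forbid tool use.
const forced = await model.invoke("Weather in Oslo?", {
  tools,
  tool_choice: "any",
});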
Optional tools
A list of available tools (functions) that the model may suggest invoking before producing a text response (see also toolResults above).