Abstract
Tests the batch processing capability of the chat model. This test ensures that the model can handle multiple inputs simultaneously and return appropriate responses for each.
It verifies that:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
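As a rough illustration of the batch behavior described above, the sketch below uses a hypothetical `EchoModel` class (not part of any library) whose `batch` method runs all inputs concurrently and returns one result per input, in input order.

```typescript
// Hypothetical stand-in for a chat model; not a real library class.
class EchoModel {
  async invoke(input: string): Promise<{ content: string }> {
    return { content: `echo: ${input}` };
  }

  // Runs every input concurrently; Promise.all preserves input order.
  async batch(inputs: string[]): Promise<{ content: string }[]> {
    return Promise.all(inputs.map((input) => this.invoke(input)));
  }
}

// Checks that a batch call yields exactly one response per input.
async function checkBatch(): Promise<string[]> {
  const model = new EchoModel();
  const inputs = ["Hello", "Hey"];
  const results = await model.batch(inputs);
  if (results.length !== inputs.length) {
    throw new Error("expected one result per input");
  }
  return results.map((result) => result.content);
}
```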
Tests the chat model's ability to bind and use OpenAI-formatted tools. This test ensures that the model can correctly process and use tools formatted in the OpenAI function calling style.
It verifies that:
This test is crucial for ensuring compatibility with OpenAI's function calling format, which is a common standard in AI tool integration.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
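For reference, a tool in the OpenAI function calling style is an object with `type: "function"` and a nested `function` entry holding a name, an optional description, and JSON Schema parameters. The sketch below declares a local type for that shape and a structural check; the `mathAdditionTool` example and the `isOpenAIFunctionTool` helper are illustrative assumptions, not the tool or code the test itself uses.

```typescript
// Local type mirroring the OpenAI function-calling tool shape.
interface OpenAIFunctionTool {
  type: "function";
  function: {
    name: string;
    description?: string;
    parameters: Record<string, unknown>; // a JSON Schema object
  };
}

// Illustrative tool; the name and schema are made up for this sketch.
const mathAdditionTool: OpenAIFunctionTool = {
  type: "function",
  function: {
    name: "math_addition",
    description: "Add two numbers together.",
    parameters: {
      type: "object",
      properties: {
        a: { type: "number" },
        b: { type: "number" },
      },
      required: ["a", "b"],
    },
  },
};

// Structural check that a value looks like an OpenAI-formatted tool.
function isOpenAIFunctionTool(value: unknown): value is OpenAIFunctionTool {
  if (typeof value !== "object" || value === null) return false;
  const tool = value as { type?: unknown; function?: unknown };
  if (tool.type !== "function") return false;
  if (typeof tool.function !== "object" || tool.function === null) return false;
  const fn = tool.function as { name?: unknown; parameters?: unknown };
  return (
    typeof fn.name === "string" &&
    typeof fn.parameters === "object" &&
    fn.parameters !== null
  );
}
```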
Tests the chat model's ability to bind and use Runnable-like tools. This test ensures that the model can correctly process and use tools that are created from Runnable objects using the asTool method.
It verifies that:
This test is crucial for ensuring compatibility with tools created from Runnable objects, which provide a flexible way to integrate custom logic into the model's tool-calling capabilities.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to cache and retrieve complex message types. This test ensures that the model can correctly cache and retrieve messages with complex content structures, such as arrays of content objects.
It verifies that:
This test is crucial for ensuring that the caching mechanism works correctly with various message structures, maintaining consistency and efficiency.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
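To make the caching scenario concrete, here is a minimal sketch of a cache keyed on the serialized message list, so that messages whose content is an array of content objects can be stored and retrieved. The `MessageCache` class and the `Message` type are hypothetical and only approximate the idea; the library's own cache is keyed differently.

```typescript
type ContentPart = { type: "text"; text: string };

interface Message {
  role: "human" | "ai";
  content: string | ContentPart[];
}

// Hypothetical in-memory cache; not the library's cache implementation.
class MessageCache {
  private store = new Map<string, string>();

  // Serializing the whole list means array content is part of the key.
  // Note: JSON.stringify is sensitive to property order.
  private key(messages: Message[]): string {
    return JSON.stringify(messages);
  }

  lookup(messages: Message[]): string | undefined {
    return this.store.get(this.key(messages));
  }

  update(messages: Message[], response: string): void {
    this.store.set(this.key(messages), response);
  }
}

const cache = new MessageCache();
const prompt: Message[] = [
  { role: "human", content: [{ type: "text", text: "Hello there!" }] },
];
cache.update(prompt, "Hi! How can I help?");

// A structurally identical prompt built separately hits the same entry.
const samePrompt: Message[] = [
  { role: "human", content: [{ type: "text", text: "Hello there!" }] },
];
const cached = cache.lookup(samePrompt);
```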
Tests the chat model's ability to handle a conversation with multiple messages. This test ensures that the model can process a sequence of messages from different roles (Human and AI) and generate an appropriate response.
It verifies that:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the basic invoke method of the chat model. This test ensures that the model can process a simple input and return a valid response.
It verifies that:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to handle a more complex tool schema. This test verifies that the model can correctly process and use a tool with a schema that includes a z.record(z.unknown()) field, which represents an object with unknown/any fields.
The test performs the following steps:
This test is particularly important for ensuring compatibility with APIs that may not accept JSON schemas with unknown object fields (e.g., Google's API).
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to accept and use a StructuredToolParams schema. This schema contains the same fields as StructuredToolInterface, but does not require a function to be passed when the tool is created.
This test verifies that the model can:
The test uses a simple weather tool to simulate a scenario where the model needs to make a tool call to retrieve weather information.
It ensures that the model can correctly interpret the tool's schema, make the appropriate tool call, and include the required arguments.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to use tool calls in a multi-turn conversation. This test verifies that the model can:
This capability is crucial for building agents or other pipelines that involve tool usage.
The test follows these steps:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
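A multi-turn tool-calling conversation can be pictured as a message history in which each tool call issued by the AI is answered by a tool message carrying the same call ID. The sketch below uses simplified local message types and a hypothetical `unansweredToolCallIds` helper to show that pairing; it is not the test's actual code, and the weather values are invented.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

// Simplified message shapes for illustration only.
type ChatMessage =
  | { role: "human"; content: string }
  | { role: "ai"; content: string; tool_calls: ToolCall[] }
  | { role: "tool"; content: string; tool_call_id: string };

// Returns IDs of tool calls that no later tool message has answered.
function unansweredToolCallIds(history: ChatMessage[]): string[] {
  const pending = new Set<string>();
  for (const message of history) {
    if (message.role === "ai") {
      message.tool_calls.forEach((call) => pending.add(call.id));
    } else if (message.role === "tool") {
      pending.delete(message.tool_call_id);
    }
  }
  return Array.from(pending);
}

const history: ChatMessage[] = [
  { role: "human", content: "What's the weather in San Francisco?" },
  {
    role: "ai",
    content: "",
    tool_calls: [{ id: "call_1", name: "get_weather", args: { location: "San Francisco" } }],
  },
  { role: "tool", content: "Sunny, 18 degrees Celsius", tool_call_id: "call_1" },
];
```

A history like this, with every call answered, is what gets passed back to the model so that it can produce its final reply.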
Tests the chat model's ability to use tool calls in a multi-turn conversation with streaming. This test verifies that the model can:
This test is crucial for ensuring that the model can handle tool usage in a streaming context, which is important for building responsive agents or other AI systems that require real-time interaction.
The test follows these steps:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to handle parallel tool calls in various scenarios. This comprehensive test covers three aspects of parallel tool calling:
The test uses a weather tool and a current time tool to simulate complex, multi-tool scenarios. It ensures that the model can correctly process and respond to prompts requiring multiple tool calls, both in streaming and non-streaming contexts, and can handle message histories with parallel tool calls.
callOptions (any, optional): Call options to pass to the model.
If true, only verifies the message history test.
Tests the streaming capability of the chat model. This test ensures that the model can properly stream responses and that each streamed token is a valid AIMessageChunk.
It verifies that:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
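The streaming check can be sketched with an async generator standing in for the model's stream. `FakeMessageChunk` and `fakeStream` below are hypothetical placeholders for AIMessageChunk and the model's stream method; the point is only to show each chunk being type-checked as it arrives and the pieces being collected.

```typescript
// Hypothetical placeholder for a message chunk class.
class FakeMessageChunk {
  constructor(public content: string) {}
}

// Hypothetical stream: yields one chunk per whitespace-separated token.
async function* fakeStream(text: string): AsyncGenerator<FakeMessageChunk> {
  for (const token of text.split(" ")) {
    yield new FakeMessageChunk(token);
  }
}

// Consumes the stream, validating every chunk before collecting it.
async function collectStream(text: string): Promise<{ count: number; full: string }> {
  const parts: string[] = [];
  for await (const chunk of fakeStream(text)) {
    if (!(chunk instanceof FakeMessageChunk)) {
      throw new Error("unexpected chunk type");
    }
    parts.push(chunk.content);
  }
  return { count: parts.length, full: parts.join(" ") };
}
```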
Tests that the model can properly use the .streamEvents method. This test ensures the .streamEvents method yields at least three event types: on_chat_model_start, on_chat_model_stream, and on_chat_model_end.
It also verifies the first chunk is an on_chat_model_start event, and the last chunk is an on_chat_model_end event. The middle chunk should be an on_chat_model_stream event.
Finally, it verifies the final chunk's event.data.output field matches the concatenated content of all on_chat_model_stream events.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
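The ordering rules described for .streamEvents can be expressed as a small check over a list of events. This is a sketch under assumptions: the `StreamEvent` type is a simplified local shape, and it assumes the streamed chunks and the final output each expose a string `content` field.

```typescript
// Simplified local event shape; the real events carry more fields.
interface StreamEvent {
  event: string;
  data: {
    chunk?: { content: string };
    output?: { content: string };
  };
}

// Returns true when the events follow the start / stream / end pattern
// and the final output equals the concatenated streamed content.
function checkStreamEvents(events: StreamEvent[]): boolean {
  if (events.length < 3) return false;
  const first = events[0];
  const last = events[events.length - 1];
  if (!first || !last) return false;
  if (first.event !== "on_chat_model_start") return false;
  if (last.event !== "on_chat_model_end") return false;

  const middle = events.slice(1, -1);
  if (!middle.every((e) => e.event === "on_chat_model_stream")) return false;

  const streamed = middle.map((e) => e.data.chunk?.content ?? "").join("");
  return last.data.output?.content === streamed;
}

const sampleEvents: StreamEvent[] = [
  { event: "on_chat_model_start", data: {} },
  { event: "on_chat_model_stream", data: { chunk: { content: "Hel" } } },
  { event: "on_chat_model_stream", data: { chunk: { content: "lo" } } },
  { event: "on_chat_model_end", data: { output: { content: "Hello" } } },
];
```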
Tests the chat model's ability to stream tokens while using tool calls. This test ensures that the model can correctly stream responses that include tool calls, and that the streamed response contains the expected information.
It verifies that:
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to stream responses while using tools. This test verifies that the model can:
The test uses a simple weather tool to simulate a scenario where the model needs to make a tool call to retrieve weather information in a streaming context.
It ensures that the model can correctly interpret the tool's schema, make the appropriate tool call, and include the required arguments while streaming the response.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to process few-shot examples with tool calls. This test ensures that the model can correctly handle and respond to a conversation that includes tool calls within the context of few-shot examples.
The test performs the following steps:
This test is crucial for ensuring that the model can learn from and apply the patterns demonstrated in few-shot examples, particularly when those examples involve tool usage.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to handle message histories with list tool contents. This test is specifically designed for models that support tool calling with list-based content, such as Anthropic's Claude.
The test performs the following steps:
This test ensures that the model can correctly process and respond to complex message histories that include tool calls with list-based content structures.
callOptions (any, optional): Call options to pass to the model.
Tests the chat model's ability to handle message histories with string tool contents. This test is specifically designed for models that support tool calling with string-based content, such as OpenAI's GPT models.
The test performs the following steps:
This test ensures that the model can correctly process and respond to complex message histories that include tool calls with string-based content structures.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
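The difference between list-based and string-based tool contents comes down to the shape of the `content` field on a tool message. The sketch below shows both shapes with a simplified `TextBlock` type and a hypothetical `toolContentToText` helper that normalizes either form to plain text; real list contents may include block types other than text.

```typescript
// Simplified content block; only text blocks are modeled here.
type TextBlock = { type: "text"; text: string };

// Tool content is either a plain string or a list of content blocks.
type ToolContent = string | TextBlock[];

// Hypothetical helper: flattens either content form to a single string.
function toolContentToText(content: ToolContent): string {
  if (typeof content === "string") return content;
  return content.map((block) => block.text).join("");
}

const stringContent: ToolContent = "The weather is sunny.";
const listContent: ToolContent = [{ type: "text", text: "The weather is sunny." }];
```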
Tests the usage metadata functionality of the chat model. This test ensures that the model returns proper usage metadata after invoking it with a simple message.
It verifies that:
The result contains a usage_metadata field.
The usage_metadata field contains input_tokens, output_tokens, and total_tokens, all of which are numbers.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
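The usage metadata check amounts to confirming that three numeric fields are present. A minimal sketch, using a hypothetical `hasValidUsageMetadata` helper over a plain object rather than a real model response:

```typescript
// Returns true when usage_metadata exists and all three token counts
// are numbers.
function hasValidUsageMetadata(result: { usage_metadata?: unknown }): boolean {
  const usage = result.usage_metadata;
  if (typeof usage !== "object" || usage === null) return false;
  const record = usage as Record<string, unknown>;
  return ["input_tokens", "output_tokens", "total_tokens"].every(
    (key) => typeof record[key] === "number"
  );
}

// Illustrative response object; the token counts are invented.
const sampleResult = {
  content: "Hello!",
  usage_metadata: { input_tokens: 8, output_tokens: 3, total_tokens: 11 },
};
```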
Tests the usage metadata functionality for streaming responses from the chat model. This test ensures that the model returns proper usage metadata after streaming a response for a simple message.
It verifies that:
The result contains a usage_metadata field.
The usage_metadata field contains input_tokens, output_tokens, and total_tokens, all of which are numbers.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to generate structured output using the withStructuredOutput method. This test ensures that the model can correctly process a prompt and return a response that adheres to a predefined schema (adderSchema).
It verifies that:
This test is crucial for ensuring that the model can generate responses in a specific format, which is useful for tasks requiring structured data output.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
Tests the chat model's ability to generate structured output with raw response included. This test ensures that the model can correctly process a prompt and return a response that adheres to a predefined schema (adderSchema) while also including the raw model output.
It verifies that:
This test is crucial for ensuring that the model can generate responses in a specific format while also providing access to the original, unprocessed model output.
callOptions (any, optional): Call options to pass to the model. These options will be applied to the model at runtime.
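When the raw response is included, the structured result carries both the unprocessed model output and the parsed value. The sketch below illustrates that pairing with `raw` and `parsed` keys. Note the assumptions: the fields of adderSchema are not shown in this document, so `AdderOutput` with two numbers `a` and `b` is a guess for illustration, and `parseAdderWithRaw` is a hypothetical helper that treats the raw content as a JSON string.

```typescript
// Assumed output shape; the real adderSchema fields are not shown here.
interface AdderOutput {
  a: number;
  b: number;
}

// Result shape pairing the raw message with the parsed value.
interface RawAndParsed<T> {
  raw: { content: string };
  parsed: T;
}

// Hypothetical helper: parses raw JSON content and validates it against
// the assumed AdderOutput shape, keeping the raw message alongside.
function parseAdderWithRaw(raw: { content: string }): RawAndParsed<AdderOutput> {
  const candidate = JSON.parse(raw.content) as Record<string, unknown> | null;
  const a = candidate?.["a"];
  const b = candidate?.["b"];
  if (typeof a !== "number" || typeof b !== "number") {
    throw new Error("raw output does not match the assumed adder schema");
  }
  return { raw, parsed: { a, b } };
}

const structured = parseAdderWithRaw({ content: '{"a": 2, "b": 3}' });
```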
Run all unit tests for the chat model. Each test is wrapped in a try/catch block to prevent the entire test suite from failing. If a test fails, the error is logged to the console, and the test suite continues.
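The try/catch behavior described here can be sketched as a simple runner loop. `TestCase` and `runAll` are hypothetical names for this illustration, not the library's implementation; the sketch also collects the names of failed tests, which the description above does not mention.

```typescript
// Hypothetical test descriptor for this sketch.
type TestCase = { name: string; run: () => Promise<void> };

// Runs every test in order; a failure is logged and recorded, and the
// loop continues with the next test instead of aborting.
async function runAll(tests: TestCase[]): Promise<string[]> {
  const failures: string[] = [];
  for (const test of tests) {
    try {
      await test.run();
    } catch (error) {
      console.error(`${test.name} failed:`, error);
      failures.push(test.name);
    }
  }
  return failures;
}
```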