The Perception Layer in Tavus enhances an AI agent with real-time visual understanding.
By using Raven, the AI agent becomes more context-aware, responsive, and capable of triggering actions based on visual input.
Configuring the Perception Layer
To configure the Perception Layer, define the following parameters within the layers.perception object:
1. perception_model
Specifies the perception model to use.
- Options:
  - raven-0 (default and recommended): Advanced visual capabilities, including screen share support, ambient queries, and perception tools.
  - basic: Legacy model with limited features.
  - off: Disables the perception layer.
Screen Share Feature: When using raven-0, the screen share feature is enabled by default without additional configuration.
"layers": {
  "perception": {
    "perception_model": "raven-0"
  }
}
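The model options above can be checked before a persona payload is assembled. The following is a minimal sketch, not part of the Tavus API: the helper name build_perception_layer is hypothetical, and the only values it relies on are the three model names listed above.

```python
from typing import Any, Dict

# Model names taken from the options listed above.
PERCEPTION_MODELS = ("raven-0", "basic", "off")


def build_perception_layer(perception_model: str = "raven-0") -> Dict[str, Any]:
    """Return a {"layers": {"perception": ...}} fragment for a persona payload.

    Hypothetical helper: it only guards against typos in the model name.
    """
    if perception_model not in PERCEPTION_MODELS:
        raise ValueError(
            f"unknown perception_model {perception_model!r}; "
            f"expected one of {PERCEPTION_MODELS}"
        )
    return {"layers": {"perception": {"perception_model": perception_model}}}
```

Calling build_perception_layer() with no arguments yields the same fragment as the JSON above, since raven-0 is the default.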
2. ambient_awareness_queries
An array of custom queries that raven-0 continuously monitors in the visual stream.
"ambient_awareness_queries": [
  "Is the user wearing a bright outfit?"
]
3. perception_analysis_queries
An array of custom queries that raven-0 processes at the end of the call to generate a visual analysis summary for the user.
"perception_analysis_queries": [
  "Is the user wearing an outfit with multiple bright colors?",
  "Is there any indication that more than one person is present?"
]
Best practices for ambient_awareness_queries and perception_analysis_queries:
- Use simple, focused prompts.
- Use queries that support your persona’s purpose.
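One way to apply the "simple, focused" guideline is a small lint pass over the query arrays before they are sent. This is only a heuristic sketch: the function name lint_queries and the 20-word threshold are assumptions for illustration, not documented Tavus limits.

```python
from typing import List

MAX_WORDS = 20  # hypothetical threshold, not a documented limit


def lint_queries(queries: List[str], max_words: int = MAX_WORDS) -> List[str]:
    """Return human-readable warnings for queries that look unfocused."""
    warnings = []
    for i, query in enumerate(queries):
        text = query.strip()
        if not text:
            warnings.append(f"query {i}: empty")
            continue
        if len(text.split()) > max_words:
            warnings.append(f"query {i}: longer than {max_words} words")
        # More than one question mark suggests several questions in one prompt.
        if text.count("?") > 1:
            warnings.append(f"query {i}: asks more than one question")
    return warnings
```

An empty result means no heuristic fired; it does not guarantee that a query suits the persona's purpose.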
4. perception_tool_prompt
Tells raven-0 when and how to trigger tools based on what it sees.
"perception_tool_prompt": "You have a tool to notify the system when a bright outfit is detected, named `notify_if_bright_outfit_shown`. You MUST use this tool when a bright outfit is detected."
5. perception_tools
Defines callable functions that raven-0 can trigger upon detecting specific visual conditions. Each tool must include a type and a function object detailing its schema.
"perception_tools": [
  {
    "type": "function",
    "function": {
      "name": "notify_if_bright_outfit_shown",
      "description": "Use this function when a bright outfit is detected in the image with high confidence",
      "parameters": {
        "type": "object",
        "properties": {
          "outfit_color": {
            "type": "string",
            "description": "Best guess on what color of outfit it is"
          }
        },
        "required": ["outfit_color"]
      }
    }
  }
]
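Because the tool prompt refers to tools by name, it can help to check that every tool definition has the required type and function keys and that every name quoted in the prompt is actually defined. The sketch below assumes tool names appear in backticks in the prompt, as in the example above; both helper names are hypothetical and nothing here calls the Tavus API.

```python
import re
from typing import Any, Dict, List


def validate_perception_tool(tool: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one perception tool definition."""
    problems = []
    if "type" not in tool:
        problems.append('missing "type"')
    function = tool.get("function")
    if not isinstance(function, dict):
        problems.append('missing "function" object')
        return problems
    if not function.get("name"):
        problems.append('"function.name" is missing')
    params = function.get("parameters", {})
    properties = params.get("properties", {})
    # Every required parameter should also be declared under "properties".
    for name in params.get("required", []):
        if name not in properties:
            problems.append(f'required parameter {name!r} is not declared in "properties"')
    return problems


def undefined_tools_in_prompt(prompt: str, tools: List[Dict[str, Any]]) -> List[str]:
    """Return backtick-quoted names in the prompt that no tool defines."""
    defined = {t.get("function", {}).get("name") for t in tools}
    mentioned = re.findall(r"`([^`]+)`", prompt)
    return [name for name in mentioned if name not in defined]
```

For the example prompt and tool above, both checks return empty lists.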
Example Use Case
This example demonstrates a persona designed to identify when a user wears a bright outfit and trigger an internal action accordingly.
{
  "persona_name": "Fashion Advisor",
  "system_prompt": "As a Fashion Advisor, you specialize in offering tailored fashion advice.",
  "pipeline_mode": "full",
  "context": "You're having a video conversation with a client about their outfit.",
  "default_replica_id": "r79e1c033f",
  "layers": {
    "perception": {
      "perception_model": "raven-0",
      "ambient_awareness_queries": [
        "Is the user wearing a bright outfit?"
      ],
      "perception_analysis_queries": [
        "Is the user wearing multiple bright colors?",
        "Is there any indication that more than one person is present?"
      ],
      "perception_tool_prompt": "You have a tool to notify the system when a bright outfit is detected, named `notify_if_bright_outfit_shown`. You MUST use this tool when a bright outfit is detected.",
      "perception_tools": [
        {
          "type": "function",
          "function": {
            "name": "notify_if_bright_outfit_shown",
            "description": "Use this function when a bright outfit is detected in the image with high confidence",
            "parameters": {
              "type": "object",
              "properties": {
                "outfit_color": {
                  "type": "string",
                  "description": "Best guess on what color of outfit it is"
                }
              },
              "required": ["outfit_color"]
            }
          }
        }
      ]
    }
  }
}
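On the application side, the "internal action" needs some code that reacts when the tool fires. How tool calls reach your application is not described in this section, so the sketch below assumes the call has already been decoded into a tool name and an arguments dictionary. Only the tool name and the outfit_color parameter come from the persona above; the registry, decorator, and handler are hypothetical.

```python
from typing import Any, Callable, Dict

# Maps tool names to local handler functions (illustrative registry).
HANDLERS: Dict[str, Callable[..., str]] = {}


def handles(tool_name: str):
    """Decorator that registers a function as the handler for a tool name."""
    def register(fn):
        HANDLERS[tool_name] = fn
        return fn
    return register


@handles("notify_if_bright_outfit_shown")
def on_bright_outfit(outfit_color: str) -> str:
    # Replace with the real internal action (logging, notification, ...).
    return f"bright outfit detected: {outfit_color}"


def dispatch_tool_call(name: str, arguments: Dict[str, Any]) -> str:
    """Route a decoded tool call to its registered handler."""
    handler = HANDLERS.get(name)
    if handler is None:
        raise KeyError(f"no handler registered for tool {name!r}")
    return handler(**arguments)
```

Keeping handler parameter names identical to the names under "properties" lets the arguments dictionary be passed straight through as keyword arguments.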
End-of-call Perception Analysis
raven-0 generates a visual summary at the end of a call. This summary includes all detected visual artifacts and can be sent as:
This feature is exclusive to personas with raven-0 specified in the Perception Layer.
Below is an example of an end-of-call perception analysis payload for the example persona:
{
  "properties": {
    "analysis": "analysis : Here's a summary of the visual observations:\n\n* **Appearance:** The user is an Asian male, likely in his late teens or early twenties, with short dark hair. Throughout \nthe call, he is consistently wearing a bright yellow t-shirt.\n* **Emotional State:** The user's emotional state is generally neutral to slightly subdued. He appears contemplative, thoughtful, and occasionally troubled, but also calm and collected at times. There were no indications of strong positive emotions.\n* **Environment:** The user is indoors in a room with a green curtain and a door visible in the background.\n* **Focus:** The user consistently looks directly at the camera.\n* **Queries**: The user is wearing a bright yellow outfit, as the system was notified.\n\nBased on the provided information:\n\n* There is no indication that the user is wearing more than one bright color.\n* There is no indication that more than one person is present.\n"
  },
  "conversation_id": "c369a8e5c8224453",
  "webhook_url": "<your_webhook_url>",
  "message_type": "application",
  "event_type": "application.perception_analysis",
  "timestamp": "2025-06-20T01:43:33.571534Z"
}
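A receiver for this payload mainly needs to recognize the event type and pull out the analysis text. The following is a minimal sketch that assumes the payload has already been parsed from JSON into a dictionary; the function name is hypothetical, and the field names are the ones shown in the example above.

```python
from typing import Any, Dict, Optional

# Event type string as it appears in the example payload.
PERCEPTION_ANALYSIS_EVENT = "application.perception_analysis"


def extract_perception_analysis(payload: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return the conversation id and analysis text, or None for other events."""
    if payload.get("event_type") != PERCEPTION_ANALYSIS_EVENT:
        return None
    analysis = payload.get("properties", {}).get("analysis")
    if analysis is None:
        return None
    return {
        "conversation_id": payload.get("conversation_id", ""),
        "analysis": analysis,
    }
```

Returning None for unrelated events lets the same endpoint receive other message types without special casing.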