Create Persona
Create and customize a digital replica’s personality for Conversational Video Interface (CVI). A persona defines the replica’s behavior and capabilities through configurable layers including:
Core Components:
- Replica - Choice of audio/visual appearance
- Context - Customizable contextual information, for use by LLM
- System Prompt - Customizable system prompt, for use by LLM
- Layers
  - STT - Transcription, turn taking, and Sparrow-0 settings
  - LLM - Language model settings
  - TTS - Text-to-Speech settings
  - Perception - Multimodal vision and understanding settings (Raven-0)
When creating a conversation, the persona configuration determines how the replica interacts, processes information, and responds to participants. Each layer can be fine-tuned to achieve the desired conversational experience.
curl --request POST \
--url https://tavusapi.com/v2/personas \
--header 'Content-Type: application/json' \
--header 'x-api-key: <api-key>' \
--data '{
"persona_name": "Life Coach",
"system_prompt": "As a Life Coach, you are a dedicated professional who specializes in...",
"context": "Here are a few times that you have helped an individual make a breakthrough in...",
"default_replica_id": "r79e1c033f",
"layers": {
"llm": {
"model": "<string>",
"base_url": "your-base-url",
"api_key": "your-api-key",
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": [
"celsius",
"fahrenheit"
]
}
},
"required": [
"location"
]
}
}
}
]
},
"tts": {
"api_key": "your-api-key",
"tts_engine": "cartesia",
"external_voice_id": "external-voice-id",
"voice_settings": {
"speed": "normal",
"emotion": [
"positivity:high",
"curiosity"
]
},
"playht_user_id": "your-playht-user-id",
"tts_emotion_control": "false",
"tts_model_name": "sonic"
},
"perception": {
"perception_model": "raven-0",
"ambient_awareness_queries": [
"Is the user showing an ID card?",
"Does the user appear distressed or uncomfortable?"
],
"perception_tool_prompt": "You have a tool to notify the system when an ID card is detected, named `notify_if_id_shown`. You MUST use this tool when a form of ID is detected.",
"perception_tools": [
{
"name": "notify_if_id_shown",
"description": "Notify the system when an ID card is detected"
}
]
},
"stt": {
"stt_engine": "tavus-turbo",
"participant_pause_sensitivity": "low",
"participant_interrupt_sensitivity": "low",
"hotwords": "This is a hotword example",
"smart_turn_detection": true
}
}
}'
{
"persona_id": "p5317866",
"persona_name": "Life Coach",
"created_at": "<string>"
}
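The same request can also be issued from Python. The following is a minimal sketch using only the standard library. The endpoint, header names, and field names are taken from the curl example above; the helper names `build_create_persona_request` and `create_persona` are hypothetical, and the response is assumed to be a JSON object like the one shown above.

```python
import json
import urllib.request

API_URL = "https://tavusapi.com/v2/personas"  # endpoint from the curl example


def build_create_persona_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a persona."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def create_persona(api_key: str, payload: dict) -> dict:
    """Send the request and return the parsed JSON response (hypothetical helper)."""
    request = build_create_persona_request(api_key, payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_persona("<api-key>", {"persona_name": "Life Coach", ...})` would perform the network call; building the request separately makes it easy to inspect the body before sending it.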
Authorizations
Body
A name for the persona.
"Life Coach"
This is the system prompt that will be used by the LLM.
"As a Life Coach, you are a dedicated professional who specializes in..."
This is the context that will be used by the LLM.
"Here are a few times that you have helped an individual make a breakthrough in..."
The default replica_id associated with this persona, if one exists. When creating a conversation, a persona_id that has an associated default_replica_id can be used without specifying a replica_id.
"r79e1c033f"
The model name that will be used by the LLM. To use Tavus' LLMs, you may select from the following models: tavus-llama, tavus-gpt-4o, tavus-gpt-4o-mini. If you would like to use your own OpenAI-compatible LLM, you may provide a model, base_url, and api_key.
The base URL for your OpenAI-compatible endpoint.
"your-base-url"
The API key for the OpenAI-compatible endpoint.
"your-api-key"
Optional tools to provide to your custom LLM
[
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
}
]
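The LLM settings above can be assembled programmatically. The following is a rough sketch, not an official client: the helper names `function_tool` and `llm_layer` are hypothetical, and the client-side check that a custom model needs both `base_url` and `api_key` is an assumption drawn from the description of the model field.

```python
from typing import Optional

# Tavus-hosted model names listed in this section.
TAVUS_MODELS = {"tavus-llama", "tavus-gpt-4o", "tavus-gpt-4o-mini"}


def function_tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Build one entry for layers.llm.tools (hypothetical helper)."""
    missing = [key for key in required if key not in properties]
    if missing:
        raise ValueError(f"required parameters not defined: {missing}")
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(required),
            },
        },
    }


def llm_layer(model: str, base_url: Optional[str] = None,
              api_key: Optional[str] = None, tools: Optional[list] = None) -> dict:
    """Build the layers.llm object (hypothetical helper)."""
    if model in TAVUS_MODELS:
        layer = {"model": model}
    else:
        # Assumption: a custom OpenAI-compatible model needs an endpoint and a key.
        if not base_url or not api_key:
            raise ValueError("a custom model needs both base_url and api_key")
        layer = {"model": model, "base_url": base_url, "api_key": api_key}
    if tools:
        layer["tools"] = list(tools)
    return layer


weather_tool = function_tool(
    name="get_current_weather",
    description="Get the current weather in a given location",
    properties={
        "location": {"type": "string",
                     "description": "The city and state, e.g. San Francisco, CA"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    required=["location"],
)
```

Passing `llm_layer("your-model", "your-base-url", "your-api-key", [weather_tool])` yields an object with the same shape as the `llm` layer in the request example.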
The custodial API key to be used to make requests to the chosen TTS provider.
"your-api-key"
The TTS engine that will be used.
Available options: cartesia, elevenlabs, playht
The voice ID used for the TTS engine when you want to customize your replica's voice. Choose from Cartesia's stock voices by referring to their Voice Catalog, or if you want more options you can consider ElevenLabs or PlayHT.
"external-voice-id"
Optional voice settings to be used for the TTS engine. These vary depending on the TTS engine you are using.
{
"speed": "normal",
"emotion": ["positivity:high", "curiosity"]
}
The user ID, required if using PlayHT TTS.
"your-playht-user-id"
If true, the TTS engine will be able to control the emotion of the voice. Only available for Cartesia TTS.
"false"
The model name that will be used by the TTS engine. Check with your TTS provider to confirm which model names are valid.
"sonic"
The perception model to use. Options include raven-0 for advanced multimodal perception, basic for simpler vision capabilities, and off to disable all perception.
Available options: raven-0, basic, off
"raven-0"
Custom queries that Raven will continuously monitor for in the visual stream. These provide ambient context without requiring explicit prompting.
[
"Is the user showing an ID card?",
"Does the user appear distressed or uncomfortable?"
]
A prompt that details how and when to use the tools that are passed to the perception layer. This helps the replica understand the context of the perception tools and grounds it.
"You have a tool to notify the system when an ID card is detected, named `notify_if_id_shown`. You MUST use this tool when a form of ID is detected."
Tools that can be triggered based on visual context, enabling automated actions in response to visual cues.
[
{
"name": "notify_if_id_shown",
"description": "Notify the system when an ID card is detected"
}
]
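The perception fields above can be grouped into a single layer object. The sketch below is one way to do that; the helper name `perception_layer` is hypothetical, and omitting empty optional fields is a design choice of this example rather than documented API behavior.

```python
from typing import Optional

PERCEPTION_MODELS = {"raven-0", "basic", "off"}  # options listed above


def perception_layer(model: str,
                     queries: Optional[list] = None,
                     tool_prompt: Optional[str] = None,
                     tools: Optional[list] = None) -> dict:
    """Build the layers.perception object (hypothetical helper)."""
    if model not in PERCEPTION_MODELS:
        raise ValueError(f"unknown perception model: {model}")
    layer = {"perception_model": model}
    if queries:
        layer["ambient_awareness_queries"] = list(queries)
    if tool_prompt:
        layer["perception_tool_prompt"] = tool_prompt
    if tools:
        layer["perception_tools"] = list(tools)
    return layer


id_check = perception_layer(
    "raven-0",
    queries=["Is the user showing an ID card?"],
    tool_prompt="You MUST use `notify_if_id_shown` when a form of ID is detected.",
    tools=[{"name": "notify_if_id_shown",
            "description": "Notify the system when an ID card is detected"}],
)
```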
The STT engine that will be used. tavus-turbo is our lowest-latency model, but tavus-advanced provides higher transcription accuracy. Please note that non-English languages will default to tavus-advanced if not specified.
Available options: tavus-turbo, tavus-advanced
Use this parameter to control how long a pause you can take before the replica responds to you. The default is medium, but you can adjust this to low or high depending on your needs.
Available options: low, medium, high
Use this parameter to control how long you can speak before the replica is interrupted. The default is medium, but you can adjust this to low or high depending on your needs.
Available options: low, medium, high
The hotwords that will be used for the STT engine.
"This is a hotword example"
Smart Turn Detection enhances the natural flow of conversation between participants and digital replicas. This intelligent system uses lexical and semantic analysis to determine the optimal moment for the digital replica to respond. The default value is set to true.
How it works:
- Continuously evaluates the participant's speech patterns and content
- Assesses the likelihood that the participant has finished speaking
- Multilingual
- Works seamlessly with both speculative and non-speculative inference
- Continuously uses participant speech patterns and content to determine how long to wait to respond
- Works in conjunction with the participant_pause_sensitivity setting, which adjusts the maximum pause when the participant is clearly not done
Key benefits:
- Rapid response: Triggers quick replies when the participant has definitively concluded their statement.
- Extended listening: Allows more time when the participant is clearly in the middle of expressing a thought.
Enabling Smart Turn Detection creates a more natural and engaging conversational experience, allowing the digital replica to interact more seamlessly with human participants.
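The STT settings described above, including Smart Turn Detection, can be collected into one layer object. The following is a minimal sketch: the helper name `stt_layer` is hypothetical, and the defaults mirror the ones stated in this section (medium sensitivity, Smart Turn Detection enabled).

```python
from typing import Optional

STT_ENGINES = {"tavus-turbo", "tavus-advanced"}
SENSITIVITIES = {"low", "medium", "high"}


def stt_layer(engine: str,
              pause_sensitivity: str = "medium",
              interrupt_sensitivity: str = "medium",
              hotwords: Optional[str] = None,
              smart_turn_detection: bool = True) -> dict:
    """Build the layers.stt object (hypothetical helper)."""
    if engine not in STT_ENGINES:
        raise ValueError(f"unknown STT engine: {engine}")
    for value in (pause_sensitivity, interrupt_sensitivity):
        if value not in SENSITIVITIES:
            raise ValueError(f"sensitivity must be one of {sorted(SENSITIVITIES)}")
    layer = {
        "stt_engine": engine,
        "participant_pause_sensitivity": pause_sensitivity,
        "participant_interrupt_sensitivity": interrupt_sensitivity,
        "smart_turn_detection": smart_turn_detection,
    }
    if hotwords:
        layer["hotwords"] = hotwords
    return layer
```

For example, `stt_layer("tavus-turbo", "low", "low", "This is a hotword example")` reproduces the `stt` layer from the request example.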