
Key Concepts
CVI is built around three core concepts that work together to create real-time, humanlike interactions with an AI agent:

Persona
The Persona defines the agent’s behavior, tone, and knowledge. It also configures the CVI layer and pipeline.
Replica
The Replica brings the persona to life visually. It renders a photorealistic human-like avatar using the Phoenix-3 model.
Conversation
A Conversation is a real-time video session that connects the persona and replica through a WebRTC connection.
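To make the Conversation concept concrete, here is a minimal sketch of starting one against the API. The endpoint, headers, and body fields (replica_id, persona_id) come from the checklist later in this document; the error handling and the response field beyond conversation_url are illustrative assumptions.

```typescript
// Build the request for creating a CVI conversation. Endpoint, headers, and
// body fields follow this document's checklist; everything else is a sketch.
function buildConversationRequest(
  apiKey: string,
  replicaId: string,
  personaId: string
) {
  return {
    url: "https://tavusapi.com/v2/conversations",
    method: "POST" as const,
    headers: {
      "Content-Type": "application/json",
      "x-api-key": apiKey, // your Tavus API key
    },
    body: JSON.stringify({ replica_id: replicaId, persona_id: personaId }),
  };
}

// Usage sketch: POST the request, then join the returned conversation_url
// to enter the WebRTC session.
async function createConversation(
  apiKey: string,
  replicaId: string,
  personaId: string
): Promise<string> {
  const { url, ...init } = buildConversationRequest(apiKey, replicaId, personaId);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Tavus API error: ${res.status}`);
  const data: any = await res.json();
  return data.conversation_url;
}
```

The returned conversation_url is what the client (for example, the <Conversation> component shown later) joins to start the session.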
Key Features
Natural Interaction
CVI uses facial cues, body language, and real-time turn-taking to enable natural, human-like conversations.
Modular pipeline
Customize the Perception, STT, LLM and TTS layers to control identity, behavior, and responses.
Lifelike AI replicas
Choose from 100+ hyper-realistic digital twins, or create your own with human-like voice and expression.
Multilingual support
Hold natural conversations in 30+ languages using the supported TTS engines.
World's lowest latency
Experience real-time interactions with ~600ms response time and smooth turn-taking.
Layers
The Conversational Video Interface (CVI) is built on a modular layer system, where each layer handles a specific part of the interaction. Together, they capture input, process it, and generate a real-time, human-like response. Here’s how the layers work together:
1. Transport
Handles real-time audio and video streaming using WebRTC (powered by Daily). This layer captures the user’s microphone and camera input and delivers output back to the user. It is always enabled, and you can configure input/output for audio (mic) and video (camera).
2. Perception
Uses Raven to analyze user expressions, gaze, background, and screen content. This visual context helps the replica understand and respond more naturally. Click here to learn how to configure the Perception layer.
3. Conversational Flow
Controls the natural dynamics of conversation, including turn-taking and interruptibility. Uses Sparrow for intelligent turn detection, enabling the replica to decide when to speak and when to listen. Click here to learn how to configure the Conversational Flow layer.
4. Speech Recognition (STT)
This layer transcribes user speech in real time with lexical and semantic awareness. Click here to learn how to configure the Speech Recognition (STT) layer.
5. Large Language Model (LLM)
Processes the user’s transcribed speech and visual input using a low-latency LLM. Tavus provides ultra-low-latency, optimized models, or you can integrate your own. Click here to learn how to configure the Large Language Model (LLM) layer.
6. Text-to-Speech (TTS)
Converts the LLM response into speech using the supported TTS engines: Cartesia (default) or ElevenLabs. Click here to learn how to configure the Text-to-Speech (TTS) layer.
7. Realtime Replica
Delivers a high-quality, synchronized digital human response using Tavus’s real-time avatar engine, powered by Phoenix. Click here to learn more about the Replica layer.
Most layers are configurable via the Persona.
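Since most layers are configured through the Persona, here is an illustrative sketch of a persona payload that touches the perception, STT, LLM, and TTS layers described above. The layer names come from this document, but the exact field names and values below are assumptions; check the Persona API reference for the authoritative schema.

```typescript
// Illustrative persona payload configuring several CVI layers.
// Field names and values are assumptions, not the authoritative schema.
const persona = {
  persona_name: "Support Agent",                   // display name (assumed field)
  system_prompt: "You are a helpful support agent.",
  layers: {
    perception: { perception_model: "raven-0" },   // visual understanding (Raven)
    stt: { stt_engine: "tavus-advanced" },         // speech recognition (assumed value)
    llm: { model: "tavus-llama" },                 // or point this at your own LLM
    tts: { tts_engine: "cartesia" },               // Cartesia (default) or ElevenLabs
  },
};

// The persona would then be created with a POST to the personas endpoint
// (path assumed), and its ID passed as persona_id when creating a conversation.
```

The Transport layer has no entry here because it is always enabled; the Conversational Flow, Perception, STT, LLM, and TTS guides linked above cover each layer's full set of options.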
Getting Started
You can quickly create a conversation by using the Developer Portal or following the steps in the Quickstart guide. If you use Cursor, use this pre-built prompt to get started faster:
✅ System Prompt for AI: React (Vite) + Tavus CVI Integration
Purpose: Generate React (TypeScript) apps with Tavus CVI using Vite, following the official docs and GitHub examples: https://docs.tavus.io/sections/integrations/embedding-cvi

✅ AI MUST ALWAYS DO THE FOLLOWING:
1. Setup React App Using Vite
2. Install Tavus CVI UI Components
3. Wrap App with CVIProvider
Update src/App.tsx:

4. Create Conversation API Call
Always use this pattern: the conversation_url from the API response is used inside <Conversation>.

5. Render Conversation Inside CVIProvider
6. Add Environment Variable
Create .env in the my-tavus-app folder (if not already there):

- Important: DO NOT create .env outside of the my-tavus-app folder. It must be inside your project directory, where src exists.
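For reference, the .env file only needs the Vite-prefixed API key that the app reads via import.meta.env.VITE_TAVUS_API_KEY (see the checklist below); the value here is a placeholder.

```
VITE_TAVUS_API_KEY=your-tavus-api-key
```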
7. Create cvi-components.json by Default
Ensure that the cvi-components.json file is generated in the my-tavus-app folder with the following content: { "tsx": true }

✅ Expected AI Output Example

src/App.tsx:

✅ Checklist for AI Output
- ✅ Uses Vite + React TypeScript (npm create vite@latest --template react-ts)
- ✅ Includes npx @tavus/cvi-ui init and add conversation
- ✅ Imports CVIProvider and Conversation from src/components/cvi/components/
- ✅ Uses the correct API call to https://tavusapi.com/v2/conversations (headers: Content-Type, x-api-key; body: replica_id & persona_id)
- ✅ Uses import.meta.env.VITE_TAVUS_API_KEY
- ✅ Renders <Conversation> inside <CVIProvider>
- ✅ Purple button is visible in both light and dark modes
- ✅ .env is created inside the correct project folder (my-tavus-app)
- ✅ cvi-components.json is created by default with { "tsx": true }
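The expected src/App.tsx output mentioned above can be sketched as follows. The import path and the API call shape follow the checklist; the exact component file names under src/components/cvi/components/ and the <Conversation> prop names (conversationUrl, onLeave) are assumptions, so verify them against the components generated by the cvi-ui CLI.

```tsx
// Minimal sketch of src/App.tsx, consistent with the checklist above.
// Component file names and <Conversation> props are assumptions.
import { useState } from "react";
import { CVIProvider } from "./components/cvi/components/cvi-provider";
import { Conversation } from "./components/cvi/components/conversation";

function App() {
  const [conversationUrl, setConversationUrl] = useState<string | null>(null);

  const startConversation = async () => {
    const res = await fetch("https://tavusapi.com/v2/conversations", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-api-key": import.meta.env.VITE_TAVUS_API_KEY,
      },
      body: JSON.stringify({
        replica_id: "your-replica-id",   // placeholder
        persona_id: "your-persona-id",   // placeholder
      }),
    });
    const data = await res.json();
    setConversationUrl(data.conversation_url);
  };

  return (
    <CVIProvider>
      {conversationUrl ? (
        <Conversation
          conversationUrl={conversationUrl}
          onLeave={() => setConversationUrl(null)}
        />
      ) : (
        // Purple button, visible in both light and dark modes (per the checklist)
        <button
          style={{ background: "#6d28d9", color: "#fff" }}
          onClick={startConversation}
        >
          Start Conversation
        </button>
      )}
    </CVIProvider>
  );
}

export default App;
```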
Keep these things in mind:

- If you’re already in the my-tavus-app folder, avoid running cd my-tavus-app again. Check your current folder before running commands.
- After running the necessary setup, remember to run npm run dev to start your app.
- Do NOT place the .env file outside of the project folder. It must reside within the my-tavus-app directory.

