The Conversational Video Interface (CVI) is an end-to-end pipeline for creating real-time multimodal video conversations with a digital twin that can see, hear, and respond similarly to how a human would. Developers can deploy video AI agents/digital twins in minutes using CVI.

CVI is the world’s fastest interface of its kind, letting you give your AI agent or personality a human face and conversational ability. With CVI, you can achieve utterance-to-utterance latency of under about 900 ms, measured as the full round-trip time from a participant saying something to the replica speaking back.

CVI provides a complete pipeline to have a conversation while also allowing you to customize and plug in your existing components where necessary.

Key Features

Face-to-face interactions

The first interface that speaks our language. CVI is multimodal: it understands and uses facial expressions and body language, and it has natural conversational awareness, including interrupts and turn-taking.

World’s lowest latency

The world’s fastest interface of its kind, with utterance-to-utterance latency of less than 600 milliseconds.

End-to-end solution

CVI provides a turnkey solution that delivers every component you need to deploy AI video agents, so you do not have to manage WebRTC, ASR, or any other layer yourself.

Focused on naturalness

Easily create high-quality AI replicas of you or your customers, powered by our state-of-the-art replica model, Phoenix-2.

What does a conversation with CVI look like?

Here’s a sample:

Try it out!

You can try chatting with Carter on our website to get a taste of what a conversation with CVI looks like.

Note that Carter can see and hear you.

What components does CVI provide, and what can I customize?

CVI provides a full pipeline allowing you to easily create video conversations. You can immediately jump into a real-time conversation with the generated Daily meeting URL. CVI provides the following layers:

  • WebRTC/video conferencing (using Daily)
  • Vision
  • Speech recognition (ASR), with interrupts
  • Optimized, conversational LLM
  • Text-to-speech (TTS)
  • Replica video output

You can choose to customize or bring your own layers as well. For example, you can:

  • Use the OpenAI Realtime API or another voice-to-voice model, and use Tavus only to drive the replica video.
  • Bring your own LLM/conversation logic or enable function calling for Tavus-optimized LLMs.
  • Customize the TTS or ASR engine.
  • Use text parrot mode to directly drive a replica video.
  • Directly access the video streams and create a custom UI. See Custom UI / Getting Raw Streams.
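
To make the split between provided and custom layers concrete, here is a minimal sketch that models the pipeline as a list of layers, each served by Tavus unless you swap in your own component. It is purely illustrative: the layer keys, the `PipelineSketch` class, and the `"my-own-llm"` label are hypothetical names invented for this example, not part of any Tavus API.

```python
from dataclasses import dataclass, field

# Layer names mirror the list of CVI layers above. Illustrative model only,
# not a Tavus data structure or API.
DEFAULT_LAYERS = (
    "webrtc",
    "vision",
    "asr",
    "llm",
    "tts",
    "replica_video",
)


@dataclass
class PipelineSketch:
    """Hypothetical view of which layers CVI runs and which you supply."""

    custom: dict = field(default_factory=dict)  # layer name -> your component label

    def with_custom(self, layer: str, component: str) -> "PipelineSketch":
        # Reject names that are not one of the known layers.
        if layer not in DEFAULT_LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        return PipelineSketch({**self.custom, layer: component})

    def describe(self) -> list:
        # Pair each layer with its provider; "tavus" is the default.
        return [(name, self.custom.get(name, "tavus")) for name in DEFAULT_LAYERS]


# Example: bring your own LLM and keep every other layer as provided.
pipeline = PipelineSketch().with_custom("llm", "my-own-llm")
for layer, provider in pipeline.describe():
    print(f"{layer}: {provider}")
```

In this model, replacing one layer leaves the rest untouched, which is the idea behind bringing your own LLM or TTS engine while still using the replica video output.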

Learn more about the layers and different modes in CVI Modes and Layers.

Key Concepts

What is a conversation?

A conversation is a single ‘session’ or ‘call’ with a digital twin using CVI. When you create a conversation, you receive a Daily meeting URL. This URL provides a full video conferencing solution, so you do not have to manage WebRTC or WebSockets yourself. Navigating to this URL takes you straight into a prebuilt meeting room UI where you can chat with your digital twin.
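
As a rough illustration of that flow, the sketch below builds a request to create a conversation and reads the meeting URL out of a response. The endpoint URL, the `x-api-key` header, the `replica_id` and `persona_id` request fields, and the `conversation_url` response field are all assumptions that this page does not state, so check them against the API reference. The IDs and the sample response body are placeholders.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the API reference before use.
API_URL = "https://tavusapi.com/v2/conversations"


def build_create_conversation_request(api_key, replica_id, persona_id):
    """Build (but do not send) a POST request that creates a conversation."""
    payload = {"replica_id": replica_id, "persona_id": persona_id}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Header name for the API key is an assumption.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def extract_meeting_url(response_body):
    """Read the meeting URL from a JSON response body (assumed field name)."""
    return json.loads(response_body)["conversation_url"]


request = build_create_conversation_request(
    api_key="YOUR_API_KEY",
    replica_id="YOUR_REPLICA_ID",
    persona_id="YOUR_PERSONA_ID",
)

# To actually send the request:
#     with urllib.request.urlopen(request) as resp:
#         print(extract_meeting_url(resp.read()))

# Placeholder response, used only to show the parsing step.
sample_body = '{"conversation_url": "https://example.daily.co/placeholder"}'
print(extract_meeting_url(sample_body))
```

Opening the returned URL in a browser is then enough to join the call; no WebRTC code is needed on your side.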

Learn more about creating and customizing conversations.

What are personas?

Personas are the ‘character’ or ‘AI agent personality’ and contain all the settings and configuration for that character or agent. For example, you can create a persona for ‘Tim the Sales Agent’ or ‘Rob the Interviewer’. Personas let you customize CVI’s layers and prompt the LLM with personality and context.
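
A persona can be pictured as a small bundle of settings. The sketch below assembles one as a plain dictionary for ‘Tim the Sales Agent’. The field names `persona_name`, `system_prompt`, and `context` are assumptions made for illustration, and `build_persona_payload` is a hypothetical helper; the actual persona schema may differ.

```python
import json


def build_persona_payload(name, system_prompt, context=""):
    """Assemble a persona definition as a plain dict (field names are assumed)."""
    if not name or not system_prompt:
        raise ValueError("a persona needs a name and a system prompt")
    payload = {"persona_name": name, "system_prompt": system_prompt}
    # Context is optional background the LLM can draw on.
    if context:
        payload["context"] = context
    return payload


persona = build_persona_payload(
    name="Tim the Sales Agent",
    system_prompt="You are Tim, a friendly sales agent. Keep answers short.",
    context="The caller is evaluating the product for a small team.",
)
print(json.dumps(persona, indent=2))
```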

Learn more about creating a persona.

What are replicas?

A replica is a talking-head avatar of a human that combines a voice clone and a face clone, and it serves as the video output layer for CVI. You can use stock replicas from Tavus or create your own with a few minutes of training data. A replica is required for both video generation and CVI.

Learn how to create a great replica.

What is a digital twin?

A digital twin is an AI-powered digital version of a human, which looks and sounds like a person and can see and respond similarly to a human.

Getting Started

No Code

You can easily try out CVI using the Tavus dashboard. Note that not all settings and modes are available via the dashboard.

API Quick Start

Check out the Quick Start Guide to learn how to use the APIs to create a persona and conversation. Be sure to grab an API key first!

Visit platform.tavus.io for more information.