Training Overview

To train your personal replica, we first need you to submit a training video. A high-quality training video helps the Phoenix model properly map your face and voice, resulting in a more realistic replica overall. Your training video must be one continuous video containing the following, in order:

  1. Consent statement, required to allow our model to train on your video.
  2. 1 minute of talking.
  3. 1 minute of silence.

You can record your training video in-app on the Developer Portal. Alternatively, you can upload a pre-recorded video on the Developer Portal or through our API endpoint. Around 4-6 hours after you submit, we will notify you through email or API callback that your replica is ready for use.

How do I record 1 minute of talking?

We do not require a predefined script beyond the consent statement. You are welcome to discuss anything that showcases your natural speaking style and expertise.

How do I record 1 minute of silence?

Your training footage should conclude with 1 minute of silence. Our model uses your silent footage to create a natural resting position for your replica’s head. During this period, pretend that you are “actively listening” to someone: incorporate small (but non-repetitive) head movements throughout the minute, and keep your lips closed the entire time.

How do I create a high-quality training video?

To ensure your replica is the best possible quality, follow the guidelines below before recording your training footage.

Step 1: Set up environment 🌞

Ensure that you are in a quiet, well-lit area without background movement.

✔️ Check that your face is evenly lit without any shadows. A large diffuse light works best for neutral lighting.
✔️ Avoid environments with background noise or reverb (e.g. air conditioning, construction).
✔️ Keep your background clear by removing moving objects and people.

Step 2: Set up camera 📷

Check your camera’s settings and position it in front of your setup.

✔️ Use a camera with at least 2K resolution, e.g. DSLR, newer laptops, Google Pixel.
✔️ Set your camera to 30 frames per second (FPS) if possible; anything from 24 to 60 FPS is also fine.
✔️ Position your camera at eye level, facing your setup. You should take up at least 25% of the frame.
✔️ Ensure your lens is clean.
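
As a rough illustration of the frame-rate and framing guidelines above, the sketch below flags a setup that falls outside them. It is not part of any Tavus tooling: the function name and parameters are hypothetical, and it assumes that "25% of the frame" refers to the area of a bounding box around you, measured in pixels.

```python
def camera_setup_issues(fps, frame_w, frame_h, subject_w, subject_h):
    """Return a list of guideline problems; an empty list means none were found."""
    issues = []

    # Guideline: 30 FPS is preferred, 24-60 FPS is acceptable.
    if not 24 <= fps <= 60:
        issues.append(f"frame rate {fps} FPS is outside the 24-60 FPS range")

    # Assumption: coverage is the subject's bounding-box area over the frame area.
    coverage = (subject_w * subject_h) / (frame_w * frame_h)
    if coverage < 0.25:
        issues.append(f"subject fills {coverage:.0%} of the frame; aim for at least 25%")

    return issues


if __name__ == "__main__":
    # A 1920x1080 frame with a 960x810 bounding box around the speaker.
    print(camera_setup_issues(30, 1920, 1080, 960, 810))
```

You would need to estimate the bounding box yourself, for example from a test frame.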

Step 3: Set up microphone 🎙️

Configure your microphone. We recommend starting with your phone’s or computer’s built-in microphone.

✔️ Do not use a high-quality microphone or wireless earbuds (e.g. Apple AirPods). We find that natural audio works best!
✔️ If you use an external USB/XLR mic, ensure that the mic is not blocking your chin or lips.
✔️ Disable software-based audio enhancements, such as compressors, equalizers, and noise suppression.

Step 4: Set up yourself 🙎

Ensure that your full head is clearly visible in the frame.

✔️ If possible, avoid beards, glasses, high-collar shirts (e.g. turtlenecks) and accessories (e.g. hats).
✔️ Tuck back hair blocking the face.

Step 5: Record training video ⏺️

Read the consent script, followed by 1 minute of talking and 1 minute of true silence.

✔️ Aim for an engaging tone and a relaxed pace, while maintaining continuous eye contact with the camera.
✔️ Minimize body movement, such as hand gestures, head movement, jolts, etc.
✔️ Close your lips during pauses and at the end of sentences.
✔️ If you stumble, continue speaking. Perfection is not necessary!

You can record your training video in any language you prefer. Read more about Tavus’s language flexibility here.

Step 6: Check video file requirements ✅

Before sending your file to our API endpoint or uploading it on the Developer Portal, ensure that it meets the following technical requirements.

  1. Format is either:
    • webm
    • mp4 with h264 video codec and aac audio codec
  2. File size is at most 750 MB. If your file is too large, check out our tips for reducing file size.
  3. Resolution is at least 720p.
  4. Video length is at least 2 minutes, containing the consent script, 1 minute of talking and 1 minute of silence in order.
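
If you want to pre-check a file before uploading, the requirements above can be expressed as a small validation function. This is a minimal sketch, not an official validator: the `VideoInfo` type and `check_training_video` function are hypothetical, the metadata is assumed to come from a tool of your choice (such as a media inspector), 750 MB is assumed to mean decimal megabytes, and 720p is assumed to mean a frame height of at least 720 pixels.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_SIZE_BYTES = 750 * 1000 * 1000  # 750 MB (assumption: decimal megabytes)
MIN_HEIGHT_PX = 720                 # 720p (assumption: measured as frame height)
MIN_DURATION_S = 120                # at least 2 minutes


@dataclass
class VideoInfo:
    container: str     # e.g. "mp4" or "webm"
    video_codec: str   # e.g. "h264"
    audio_codec: str   # e.g. "aac"
    size_bytes: int
    height_px: int
    duration_s: float


def check_training_video(info: VideoInfo) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    container = info.container.lower()

    if container == "mp4":
        # mp4 files must use h264 video and aac audio.
        if info.video_codec.lower() != "h264":
            problems.append("mp4 video codec must be h264")
        if info.audio_codec.lower() != "aac":
            problems.append("mp4 audio codec must be aac")
    elif container != "webm":
        problems.append("format must be webm or mp4")

    if info.size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 750 MB")
    if info.height_px < MIN_HEIGHT_PX:
        problems.append("resolution is below 720p")
    if info.duration_s < MIN_DURATION_S:
        problems.append("video is shorter than 2 minutes")

    return problems


if __name__ == "__main__":
    sample = VideoInfo("mp4", "h264", "aac", 200_000_000, 1080, 135.0)
    print(check_training_video(sample))
```

The check covers only the file-level requirements; it cannot tell whether the consent script, talking, and silence appear in the right order.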

Step 7: Submit your training video 🙌

After ensuring that your training video meets our quality requirements, submit your video through the Developer Portal or through our API endpoint. If you are using our API, make sure that the URL is a download link (e.g. a pre-signed Amazon S3 URL).
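
One quick way to catch an obviously wrong URL before calling the API is a heuristic check like the sketch below. It only assumes that a direct download link is an http(s) URL whose path ends in one of the accepted file extensions; the function name is hypothetical, and a valid download link without a file extension would be rejected by this heuristic.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_SUFFIXES = {".mp4", ".webm"}


def looks_like_download_link(url: str) -> bool:
    """Heuristic: http(s) URL whose path ends in .mp4 or .webm."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False
    # Query parameters (such as a pre-signed signature) are ignored here;
    # only the path's file extension is inspected.
    suffix = PurePosixPath(parts.path).suffix.lower()
    return suffix in ALLOWED_SUFFIXES


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(looks_like_download_link(
        "https://example-bucket.s3.amazonaws.com/videos/training.mp4?X-Amz-Signature=abc"
    ))
    print(looks_like_download_link("https://www.example.com/watch?v=abc"))
```

A link to a video-sharing or preview page typically fails this check, since it points at a web page rather than at the file itself.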

Next Steps after Training

Upon submission, your replica will immediately begin training in the background. After around 4-6 hours, you will be notified through email or API callback that your personal replica is ready for use. If you’re not happy with the results, be sure to contact us.
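
To plan around the 4-6 hour training window, you can compute when to expect the notification. This is a small sketch with a hypothetical helper name; the window is approximate, so treat the result as an estimate rather than a guarantee.

```python
from datetime import datetime, timedelta, timezone


def expected_ready_window(submitted_at: datetime) -> tuple:
    """Return (earliest, latest) estimated completion times, 4 and 6 hours after submission."""
    return submitted_at + timedelta(hours=4), submitted_at + timedelta(hours=6)


if __name__ == "__main__":
    submitted = datetime.now(timezone.utc)
    earliest, latest = expected_ready_window(submitted)
    print(f"Expect your replica between {earliest:%H:%M} and {latest:%H:%M} UTC")
```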

Congrats on finishing the training process! Now explore generating videos or starting conversations.