The TTS layer in Tavus enables your persona to generate natural-sounding voice responses. You can configure the TTS layer to use a third-party TTS engine provider. If layers.tts is not specified, Tavus defaults to the cartesia engine.

If you use the default engine, you do not need to specify any parameters within the tts layer.

Configuring the TTS Layer

Define the TTS layer under the layers.tts object. Below are the parameters available:

1. tts_engine

Specifies which supported third-party TTS engine to use.

  • Options: cartesia, elevenlabs, playht

"tts": {
  "tts_engine": "cartesia"
}
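
As a rough illustration of the option list above, the following Python sketch rejects engine names that are not listed before they reach a persona payload. The helper name make_tts_layer is hypothetical and not part of any Tavus SDK.

```python
# Engine names copied from the Options list above.
SUPPORTED_TTS_ENGINES = ("cartesia", "elevenlabs", "playht")


def make_tts_layer(tts_engine: str) -> dict:
    """Return a minimal layers.tts object for the given engine (hypothetical helper)."""
    if tts_engine not in SUPPORTED_TTS_ENGINES:
        raise ValueError(
            f"Unsupported tts_engine {tts_engine!r}; "
            f"expected one of {SUPPORTED_TTS_ENGINES}"
        )
    return {"tts_engine": tts_engine}


print(make_tts_layer("elevenlabs"))  # {'tts_engine': 'elevenlabs'}
```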

2. api_key

Authenticates requests to your selected third-party TTS provider. You can obtain an API key from the provider you selected (Cartesia, ElevenLabs, or PlayHT).

Only required when using private voices.

"tts": {
  "api_key": "your-api-key"
}

3. external_voice_id

Specifies which voice to use with the selected TTS engine. To find supported voice IDs, refer to the provider’s documentation.

You can use any publicly accessible custom voice from ElevenLabs, Cartesia, or PlayHT without the provider’s API key. If the custom voice is private, you must also supply the provider’s API key.

"tts": {
  "external_voice_id": "external-voice-id"
}
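
The relationship between external_voice_id and api_key can be sketched in Python as follows: the key is added only when one is supplied, which is the private-voice case. The helper voice_config is a hypothetical name used for illustration.

```python
from typing import Optional


def voice_config(external_voice_id: str, api_key: Optional[str] = None) -> dict:
    """Build the voice-related part of layers.tts (hypothetical helper).

    api_key is included only when given, i.e. for private voices.
    """
    config = {"external_voice_id": external_voice_id}
    if api_key:
        config["api_key"] = api_key
    return config


# Public voice: no provider key needed.
public_voice = voice_config("external-voice-id")

# Private voice: the provider's key travels with the voice ID.
private_voice = voice_config("external-voice-id", api_key="your-api-key")
```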

4. voice_settings

Optional object containing additional settings specific to the selected TTS engine.

These settings vary per engine:

| Parameter | Cartesia (Sonic-1 only) | ElevenLabs |
|---|---|---|
| speed | Range -1.0 to 1.0 (negative = slower, positive = faster) | Range 0.0 to 1.0 (0.0 = slowest, 1.0 = fastest) |
| emotion | Array of "emotion:level" tags (e.g., "positivity:high") | Not available |
| stability | Not available | Range 0.0 to 1.0 (0.0 = variable, 1.0 = stable) |
| similarity_boost | Not available | Range 0.0 to 1.0 (0.0 = creative, 1.0 = original) |
| style | Not available | Range 0.0 to 1.0 (0.0 = neutral, 1.0 = exaggerated) |
| use_speaker_boost | Not available | Boolean (enhances speaker similarity) |

For more information on each voice setting, see:

  • Cartesia Speed and Emotion Controls
  • ElevenLabs Voice Settings

"tts": {
  "voice_settings": {
    "speed": 0.5,
    "emotion": ["positivity:high", "curiosity"]
  }
}
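
Because the accepted settings differ per engine, a client-side check can catch mismatches early. The Python sketch below encodes only the numeric ranges and types from the table above; the function name check_voice_settings is hypothetical, and non-numeric values for ranged fields are deliberately left for the provider to judge.

```python
# Numeric ranges copied from the table above: (min, max) per engine.
NUMERIC_RANGES = {
    "cartesia": {"speed": (-1.0, 1.0)},
    "elevenlabs": {
        "speed": (0.0, 1.0),
        "stability": (0.0, 1.0),
        "similarity_boost": (0.0, 1.0),
        "style": (0.0, 1.0),
    },
}


def check_voice_settings(engine: str, settings: dict) -> list:
    """Return a list of problems found in voice_settings (empty if none)."""
    if engine not in NUMERIC_RANGES:
        return []  # e.g. playht: the table lists no settings to check against
    problems = []
    ranges = NUMERIC_RANGES[engine]
    for name, value in settings.items():
        if name in ranges:
            low, high = ranges[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            # Only numeric values are range-checked; other forms are not judged here.
            if is_number and not low <= value <= high:
                problems.append(f"{name} is outside [{low}, {high}]")
        elif name == "emotion" and engine == "cartesia":
            if not (isinstance(value, list) and all(isinstance(t, str) for t in value)):
                problems.append("emotion must be a list of strings")
        elif name == "use_speaker_boost" and engine == "elevenlabs":
            if not isinstance(value, bool):
                problems.append("use_speaker_boost must be a boolean")
        else:
            problems.append(f"{name} is not listed for {engine}")
    return problems


print(check_voice_settings("elevenlabs", {"speed": 1.5, "emotion": ["curiosity"]}))
# ['speed is outside [0.0, 1.0]', 'emotion is not listed for elevenlabs']
```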

5. playht_user_id

PlayHT-specific user ID, required if using PlayHT as the TTS engine.

Only available for the playht engine.

"tts": {
  "playht_user_id": "your-playht-user-id"
}

6. tts_emotion_control

If set to true, enables emotion control in speech.

Only available for the cartesia engine.

"tts": {
  "tts_emotion_control": true
}
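
The engine restrictions on playht_user_id and tts_emotion_control can be expressed as a small pre-flight check. This Python sketch assumes the tts object names its engine explicitly; check_engine_specific_fields is a hypothetical helper, not a Tavus function.

```python
def check_engine_specific_fields(tts: dict) -> list:
    """Return problems with engine-specific fields in a layers.tts object."""
    problems = []
    engine = tts.get("tts_engine")
    # playht_user_id: required for playht, not available elsewhere.
    if engine == "playht" and not tts.get("playht_user_id"):
        problems.append("playht_user_id is required when tts_engine is playht")
    if engine != "playht" and "playht_user_id" in tts:
        problems.append("playht_user_id is only available for the playht engine")
    # tts_emotion_control: cartesia only.
    if engine != "cartesia" and "tts_emotion_control" in tts:
        problems.append("tts_emotion_control is only available for the cartesia engine")
    return problems


print(check_engine_specific_fields({"tts_engine": "playht"}))
# ['playht_user_id is required when tts_engine is playht']
```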

7. tts_model_name

Model name used by the TTS engine. Refer to your TTS provider’s documentation for the available model names.

"tts": {
  "tts_model_name": "sonic"
}

Example Configuration

Below is an example persona with a fully configured TTS layer:

{
  "persona_name": "AI Presenter",
  "system_prompt": "You are a friendly and informative video host.",
  "pipeline_mode": "full",
  "context": "You're delivering updates in a conversational tone.",
  "default_replica_id": "r665388ec672",
  "layers": {
    "tts": {
      "tts_engine": "cartesia",
      "api_key": "your-api-key",
      "external_voice_id": "external-voice-id",
      "voice_settings": {
        "speed": "normal",
        "emotion": ["positivity:high", "curiosity"]
      },
      "tts_emotion_control": true,
      "tts_model_name": "sonic"
    }
  }
}

Refer to the Create Persona API for a complete list of supported fields.
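
As a minimal sketch of how such a persona might be submitted, the Python below serializes a payload with an ElevenLabs TTS layer and prepares, but does not send, an HTTP request. The endpoint URL and the x-api-key header name are assumptions that do not appear on this page; confirm both against the Create Persona API reference before use.

```python
import json
import urllib.request

# Assumptions, not taken from this page: the endpoint URL and the
# "x-api-key" header name. Check the Create Persona API reference.
TAVUS_PERSONAS_URL = "https://tavusapi.com/v2/personas"

persona = {
    "persona_name": "AI Presenter",
    "system_prompt": "You are a friendly and informative video host.",
    "pipeline_mode": "full",
    "layers": {
        "tts": {
            "tts_engine": "elevenlabs",
            "external_voice_id": "external-voice-id",
            "voice_settings": {"speed": 0.5, "stability": 0.5},
        }
    },
}


def build_request(payload: dict, tavus_api_key: str) -> urllib.request.Request:
    """Prepare a POST request carrying the persona as JSON (nothing is sent)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        TAVUS_PERSONAS_URL,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": tavus_api_key},
        method="POST",
    )


request = build_request(persona, "your-tavus-api-key")
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the serialized body first.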