Pronunciation dictionaries let you define custom pronunciation rules so your persona says words exactly the way you intend. They are useful for brand names, technical terms, acronyms, and foreign words that text-to-speech (TTS) engines may mispronounce. Tavus automatically syncs your dictionary to both Cartesia and ElevenLabs, so the rules apply regardless of which TTS engine your persona uses.

How it works

  1. You create a pronunciation dictionary with a set of rules
  2. Each rule maps a text (the word to match) to a pronunciation (how it should be spoken)
  3. You attach the dictionary to a persona via the pronunciation_dictionary_id field in the TTS layer
  4. Tavus resolves the correct provider-specific dictionary at save time, so conversations have zero extra latency
When you update a dictionary’s rules, all personas referencing it are automatically updated. When you delete a dictionary, it is cleanly removed from all linked personas.

Automatic voice settings population

When you attach a Tavus pronunciation dictionary to a persona, Tavus denormalizes it into the underlying TTS provider’s native dictionary format and writes it directly into the persona’s voice_settings. This is what enables CVI to apply the rules with zero extra latency at conversation time. As a result, after saving a persona with a pronunciation_dictionary_id, you will see provider-specific fields appear in voice_settings:
  • For Cartesia: a pronunciation_dict_id field
  • For ElevenLabs: a pronunciation_dictionary_locators array
This is expected behavior: the IDs in voice_settings are managed by Tavus and reflect the denormalized form of your Tavus dictionary.
A Tavus pronunciation dictionary overwrites the matching pronunciation fields in voice_settings. Use either a Tavus pronunciation dictionary or a provider dictionary specified directly in voice_settings, not both.

Bring your own TTS API key

If you provide a custodial TTS API key (your own Cartesia or ElevenLabs account), you can skip the Tavus pronunciation dictionary entirely and reference a dictionary you’ve already created in your provider account directly via voice_settings. For Cartesia, supply the dictionary ID:
"voice_settings": {
  "pronunciation_dict_id": "pdict_ScmNEjTeLuwVCuwwomwcPz"
}
For ElevenLabs, supply one or more dictionary locators:
"voice_settings": {
  "pronunciation_dictionary_locators": [
    {
      "version_id": "BJ8NDCDzGweDN5wn4dyD",
      "pronunciation_dictionary_id": "idLBTWkPEMrpJ0eehSuu"
    }
  ]
}
Pick one approach per persona. If you set a Tavus pronunciation_dictionary_id, Tavus will overwrite these voice_settings fields with the denormalized dictionary it manages.
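The "pick one approach" rule can be checked before you save a persona. The sketch below is a hypothetical client-side helper, not part of the Tavus API. It assumes that voice_settings is nested inside the TTS layer next to pronunciation_dictionary_id, and it only reports which provider fields would be overwritten.

```python
# Provider-specific fields named in this guide. Tavus overwrites them when a
# Tavus pronunciation_dictionary_id is set on the persona.
PROVIDER_FIELDS = ("pronunciation_dict_id", "pronunciation_dictionary_locators")


def conflicting_fields(tts_layer: dict) -> list:
    """Return the provider fields that the Tavus dictionary would overwrite.

    Assumption: voice_settings lives inside the TTS layer dict.
    """
    if not tts_layer.get("pronunciation_dictionary_id"):
        return []  # No Tavus dictionary attached, so nothing is overwritten.
    voice_settings = tts_layer.get("voice_settings") or {}
    return [field for field in PROVIDER_FIELDS if field in voice_settings]


layer = {
    "tts_engine": "cartesia",
    "pronunciation_dictionary_id": "pd_abc123def456",
    "voice_settings": {"pronunciation_dict_id": "pdict_ScmNEjTeLuwVCuwwomwcPz"},
}
print(conflicting_fields(layer))  # ['pronunciation_dict_id']
```

An empty list means the layer uses only one of the two approaches.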

Rule types

Each rule requires a type that determines how the pronunciation is interpreted:
| Type | Description | Example |
| --- | --- | --- |
| alias | Replace the matched text with a different spoken phrase | "Tavus" → "TAH-vus" |
| ipa | Use IPA (International Phonetic Alphabet) notation | "bayou" → "ˈbɑju" |

Alias rules

Alias rules perform simple text substitution. The TTS engine speaks the pronunciation value instead of the original text.
{
  "text": "Tavus",
  "pronunciation": "TAH-vus",
  "type": "alias"
}

IPA rules

IPA rules let you specify exact phonetic pronunciation. You can provide IPA in two formats:
  • Raw IPA: Standard IPA string (e.g., "hɛloʊ")
  • Pipe-delimited IPA: Pre-tokenized phonemes separated by | (e.g., "ˈ|b|ɑ|j|u")
{
  "text": "bayou",
  "pronunciation": "ˈ|b|ɑ|j|u",
  "type": "ipa"
}
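The two IPA formats differ only in whether the phonemes are pre-tokenized. As a minimal sketch, the hypothetical helpers below build an ipa rule and join a pipe-delimited value back into a raw string. Going the other way (raw to pipe-delimited) is not shown, because splitting correctly requires knowing which characters form a single phoneme.

```python
def make_ipa_rule(text: str, pronunciation: str) -> dict:
    """Build a rule dict with the three fields shown in this guide."""
    return {"text": text, "pronunciation": pronunciation, "type": "ipa"}


def pipe_to_raw(ipa: str) -> str:
    """Join pipe-delimited phonemes into a raw IPA string."""
    return "".join(token.strip() for token in ipa.split("|"))


rule = make_ipa_rule("bayou", "ˈ|b|ɑ|j|u")
print(pipe_to_raw(rule["pronunciation"]))  # ˈbɑju
```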

Rule options

Each rule supports optional matching parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| case_sensitive | boolean | false | Whether matching is case-sensitive |
| word_boundaries | boolean | true | Whether to match only whole words |
{
  "text": "UN",
  "pronunciation": "United Nations",
  "type": "alias",
  "case_sensitive": false,
  "word_boundaries": false
}
word_boundaries is an ElevenLabs-only feature. When syncing to Cartesia, this option is dropped and Cartesia will apply the rule without word-boundary matching.
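To see what the two options mean in practice, the sketch below approximates alias matching with a regular expression. It is an illustration of the option semantics only, not the matching algorithm used by Tavus, Cartesia, or ElevenLabs. apply_alias is a hypothetical helper, and the defaults mirror the table above.

```python
import re


def apply_alias(text: str, rule: dict) -> str:
    """Approximate how an alias rule rewrites text before it is spoken."""
    pattern = re.escape(rule["text"])
    if rule.get("word_boundaries", True):
        # \b only behaves as a whole-word anchor when the rule text starts
        # and ends with word characters.
        pattern = rf"\b{pattern}\b"
    flags = 0 if rule.get("case_sensitive", False) else re.IGNORECASE
    replacement = rule["pronunciation"]
    # A function replacement avoids backslash interpretation in the alias.
    return re.sub(pattern, lambda _match: replacement, text, flags=flags)


loose = {"text": "UN", "pronunciation": "United Nations", "type": "alias",
         "case_sensitive": False, "word_boundaries": False}
strict = {**loose, "case_sensitive": True, "word_boundaries": True}

sentence = "The UN is funding it"
print(apply_alias(sentence, loose))   # The United Nations is fUnited Nationsding it
print(apply_alias(sentence, strict))  # The United Nations is funding it
```

Under this approximation, disabling both options lets a short rule such as "UN" match inside longer words, so short acronyms are usually safer with case_sensitive set to true.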

Attaching a dictionary to a persona

Set pronunciation_dictionary_id in the TTS layer when creating or updating a persona:
{
  "persona_name": "Sales Agent",
  "system_prompt": "You are a helpful sales agent.",
  "layers": {
    "tts": {
      "tts_engine": "cartesia",
      "pronunciation_dictionary_id": "pd_abc123def456"
    }
  }
}
Each persona supports one pronunciation dictionary at a time. Setting a new pronunciation_dictionary_id replaces the previous one. Setting it to an empty string removes the dictionary.
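The attach, replace, and detach cases differ only in the value of pronunciation_dictionary_id. The sketch below builds the TTS layer fragment for each case. tts_layer is a hypothetical helper, and how the fragment is sent when creating or updating a persona is not covered here.

```python
import json
from typing import Optional


def tts_layer(engine: str, dictionary_id: Optional[str]) -> dict:
    """Build a TTS layer fragment; None becomes "" to remove the dictionary."""
    return {
        "tts_engine": engine,
        "pronunciation_dictionary_id": dictionary_id or "",
    }


attach = tts_layer("cartesia", "pd_abc123def456")  # attach or replace
detach = tts_layer("cartesia", None)               # remove the dictionary

print(json.dumps({"layers": {"tts": attach}}, indent=2))
print(json.dumps({"layers": {"tts": detach}}, indent=2))
```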

Limits

| Limit | Value |
| --- | --- |
| Text field max length | 200 characters |
| Pronunciation field max length | 500 characters |
| Dictionary name max length | 255 characters |
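These limits can be checked on the client before a dictionary is submitted. The sketch below is a hypothetical validator, not part of the Tavus API. It assumes that "characters" corresponds to Python's len() on a string, and it also checks that type is one of the two rule types described above.

```python
MAX_TEXT_LEN = 200
MAX_PRONUNCIATION_LEN = 500
MAX_NAME_LEN = 255
RULE_TYPES = {"alias", "ipa"}


def validate_rule(rule: dict) -> list:
    """Return a list of problems found in one rule (empty if none)."""
    errors = []
    if rule.get("type") not in RULE_TYPES:
        errors.append("type must be 'alias' or 'ipa'")
    if len(rule.get("text", "")) > MAX_TEXT_LEN:
        errors.append(f"text exceeds {MAX_TEXT_LEN} characters")
    if len(rule.get("pronunciation", "")) > MAX_PRONUNCIATION_LEN:
        errors.append(f"pronunciation exceeds {MAX_PRONUNCIATION_LEN} characters")
    return errors


def validate_name(name: str) -> list:
    """Check the dictionary name against its length limit."""
    if len(name) > MAX_NAME_LEN:
        return [f"name exceeds {MAX_NAME_LEN} characters"]
    return []


ok_rule = {"text": "Tavus", "pronunciation": "TAH-vus", "type": "alias"}
bad_rule = {"text": "x" * 201, "pronunciation": "ex", "type": "phonetic"}
print(validate_rule(ok_rule))   # []
print(validate_rule(bad_rule))
```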
