Pronunciation dictionaries let you define custom pronunciation rules so your persona says words exactly how you want. This is useful for brand names, technical terms, acronyms, and foreign words that TTS engines may mispronounce. Tavus automatically syncs your dictionary to both Cartesia and ElevenLabs, so rules work regardless of which TTS engine your persona uses.
How it works
- You create a pronunciation dictionary with a set of rules
- Each rule maps a text (the word to match) to a pronunciation (how it should be spoken)
- You attach the dictionary to a persona via the pronunciation_dictionary_id field in the TTS layer
- Tavus resolves the correct provider-specific dictionary at save time, so conversations have zero extra latency
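The steps above can be sketched as a request body. The Python below builds a dictionary payload with one alias rule and one IPA rule. The rule field names (text, pronunciation, type, case_sensitive, word_boundaries) come from this page; the top-level name and rules keys, the dictionary name, and the overall payload shape are assumptions for illustration, so check the API reference before sending anything like it.

```python
import json


def make_rule(text, pronunciation, rule_type, case_sensitive=False, word_boundaries=True):
    """Build one rule dict. Defaults mirror the rule options documented on this page."""
    if rule_type not in ("alias", "ipa"):
        raise ValueError(f"unsupported rule type: {rule_type!r}")
    return {
        "text": text,                    # the word to match
        "pronunciation": pronunciation,  # how it should be spoken
        "type": rule_type,
        "case_sensitive": case_sensitive,
        "word_boundaries": word_boundaries,
    }


# Assumed payload shape: a name plus a list of rules.
dictionary_payload = {
    "name": "brand-terms",  # hypothetical dictionary name
    "rules": [
        make_rule("Tavus", "TAH-vus", "alias"),
        make_rule("bayou", "ˈbɑju", "ipa"),
    ],
}

print(json.dumps(dictionary_payload, ensure_ascii=False, indent=2))
```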
Automatic voice settings population
When you attach a Tavus pronunciation dictionary to a persona, Tavus denormalizes it into the underlying TTS provider’s native dictionary format and writes it directly into the persona’s voice_settings. This is what enables CVI to apply the rules with zero extra latency at conversation time.
As a result, after saving a persona with a pronunciation_dictionary_id, you will see provider-specific fields appear in voice_settings:
- For Cartesia: a pronunciation_dict_id field
- For ElevenLabs: a pronunciation_dictionary_locators array
These voice_settings fields are managed by Tavus and reflect the denormalized form of your Tavus dictionary.
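To check which provider-specific field Tavus has written, you could inspect the saved voice_settings. The following is a minimal sketch: the two key names come from this page, while the helper name, the sample object, and its ID value are hypothetical.

```python
# Provider-specific keys named on this page.
PROVIDER_FIELDS = {
    "cartesia": "pronunciation_dict_id",
    "elevenlabs": "pronunciation_dictionary_locators",
}


def provider_dictionary_fields(voice_settings):
    """Return {provider: value} for each provider pronunciation field that is set."""
    return {
        provider: voice_settings[key]
        for provider, key in PROVIDER_FIELDS.items()
        if voice_settings.get(key)
    }


# Hypothetical voice_settings as they might look after saving a Cartesia persona.
saved = {"pronunciation_dict_id": "pdict_example"}
print(provider_dictionary_fields(saved))
```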
Tavus pronunciation dictionaries will overwrite the matching voice_settings pronunciation fields. To avoid confusion, choose either the Tavus pronunciation dictionary route or specify a provider dictionary directly in voice_settings, not both.
Bring your own TTS API key
If you provide a custodial TTS API key (your own Cartesia or ElevenLabs account), you can skip the Tavus pronunciation dictionary entirely and reference a dictionary you’ve already created in your provider account directly via voice_settings.
For Cartesia, supply the dictionary ID:
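A minimal sketch of such a TTS layer, written as a Python dict for illustration. pronunciation_dict_id is the Cartesia field named earlier on this page; the tts_engine and api_key key names, their placement in the layer, and all values are placeholders and assumptions rather than confirmed API details.

```python
import json

tts_layer = {
    "tts_engine": "cartesia",            # assumed key name
    "api_key": "YOUR_CARTESIA_API_KEY",  # assumed key name; your own provider key
    "voice_settings": {
        # ID of a dictionary that already exists in your Cartesia account.
        "pronunciation_dict_id": "YOUR_CARTESIA_DICTIONARY_ID",
    },
    # pronunciation_dictionary_id is deliberately absent: this route
    # bypasses the Tavus pronunciation dictionary.
}

print(json.dumps(tts_layer, indent=2))
```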
Rule types
Each rule requires a type that determines how the pronunciation is interpreted:
| Type | Description | Example |
|---|---|---|
| alias | Replace the matched text with a different spoken phrase | "Tavus" → "TAH-vus" |
| ipa | Use IPA (International Phonetic Alphabet) notation | "bayou" → "ˈbɑju" |
Alias rules
Alias rules perform simple text substitution. The TTS engine speaks the pronunciation value instead of the original text.
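The substitution can be approximated locally to preview what text the engine would receive. This is only a sketch: the function name is hypothetical, the regex-based matching is an assumption about how case sensitivity and whole-word matching behave, and the actual matching is done by the TTS provider.

```python
import re


def apply_alias(text, match, spoken, case_sensitive=False, word_boundaries=True):
    """Replace occurrences of `match` in `text` with `spoken` (local approximation)."""
    pattern = re.escape(match)
    if word_boundaries:
        # Only match when not preceded or followed by a word character.
        pattern = rf"(?<!\w){pattern}(?!\w)"
    flags = 0 if case_sensitive else re.IGNORECASE
    # A function replacement keeps `spoken` literal (no backslash escapes).
    return re.sub(pattern, lambda _m: spoken, text, flags=flags)


print(apply_alias("Welcome to tavus!", "Tavus", "TAH-vus"))
# Welcome to TAH-vus!
```

With word_boundaries=False the same rule would also rewrite the prefix of a longer word such as "Tavusify", which is the behavior to expect when whole-word matching is unavailable.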
IPA rules
IPA rules let you specify exact phonetic pronunciation. You can provide IPA in two formats:
- Raw IPA: Standard IPA string (e.g., "hɛloʊ")
- Pipe-delimited IPA: Pre-tokenized phonemes separated by | (e.g., "ˈ|b|ɑ|j|u")
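The two formats carry the same symbols and differ only in the separators. As a small illustration (the helper name is hypothetical, and whether Tavus treats both forms identically is an assumption), joining the pipe-delimited tokens yields the raw string:

```python
def pipe_to_raw(ipa):
    """Join pipe-delimited phonemes into a raw IPA string.

    A string without pipes is returned unchanged.
    """
    return "".join(ipa.split("|"))


print(pipe_to_raw("ˈ|b|ɑ|j|u"))
# ˈbɑju
```

Going the other way (raw to pipe-delimited) requires knowing where one phoneme ends and the next begins, which is why the pre-tokenized form exists.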
Rule options
Each rule supports optional matching parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| case_sensitive | boolean | false | Whether matching is case-sensitive |
| word_boundaries | boolean | true | Whether to match only whole words |
word_boundaries is an ElevenLabs-only feature. When syncing to Cartesia, this option is dropped and Cartesia will apply the rule without word-boundary matching.
Attaching a dictionary to a persona
Set pronunciation_dictionary_id in the TTS layer when creating or updating a persona:
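A hedged sketch of the relevant fragment of a persona body, built in Python. Only pronunciation_dictionary_id and its location in the TTS layer come from this page; the layers and tts key names, the helper name, and the placeholder ID are assumptions.

```python
import json


def tts_layer_patch(dictionary_id):
    """Return a persona body fragment that sets the dictionary on the TTS layer."""
    # Assumed nesting: layers -> tts -> pronunciation_dictionary_id.
    return {"layers": {"tts": {"pronunciation_dictionary_id": dictionary_id}}}


attach = tts_layer_patch("YOUR_DICTIONARY_ID")  # placeholder ID
detach = tts_layer_patch("")  # per this page, an empty string removes the dictionary

print(json.dumps(attach, indent=2))
```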
Each persona supports one pronunciation dictionary at a time. Setting a new pronunciation_dictionary_id replaces the previous one. Setting it to an empty string removes the dictionary.
Limits
| Limit | Value |
|---|---|
| Text field max length | 200 characters |
| Pronunciation field max length | 500 characters |
| Dictionary name max length | 255 characters |
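These limits can be checked client-side before a dictionary is submitted. The sketch below uses the values from the table; the function name and the rule dict shape are hypothetical, and it assumes a "character" corresponds to what Python's len() counts (Unicode code points), which may differ from how the API measures length.

```python
# Maximum lengths from the limits table.
LIMITS = {"text": 200, "pronunciation": 500, "name": 255}


def check_limits(name, rules):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if len(name) > LIMITS["name"]:
        problems.append(f"name exceeds {LIMITS['name']} characters")
    for i, rule in enumerate(rules):
        for field in ("text", "pronunciation"):
            if len(rule[field]) > LIMITS[field]:
                problems.append(f"rule {i}: {field} exceeds {LIMITS[field]} characters")
    return problems


print(check_limits("brand-terms", [{"text": "Tavus", "pronunciation": "TAH-vus"}]))
# []
```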

