Speech to Speech Quickstart
The Speech to Speech pipeline mode allows you to bypass ASR, LLM, and TTS by leveraging an external speech to speech model. You may use Tavus speech to speech model integrations or you may bring your own.
Getting started with Speech to Speech is as simple as configuring the pipeline mode and the speech to speech layer when you create a persona:
- Set `pipeline_mode` to `speech-to-speech`.
- Configure the speech to speech layer by supplying your provider and API key, along with the websocket URL and session settings. The default `provider` is `openai`, and an `api_key` is required to specify a `websocket_url`.
- System prompt and conversational context are not allowed. Instead, you can configure `instructions` as part of the `session_settings` in the speech to speech layer.
- The `instructions` may be updated in real time by sending a `realtime_api` event to the conversation through our Interactions protocol.
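
The settings listed above can be sketched as a request body for the Create Personas call. This is a minimal sketch, not the definitive schema: the field names `pipeline_mode`, `provider`, `api_key`, `websocket_url`, `session_settings`, and `instructions` come from this page, while the `layers` / `s2s` nesting, the helper name, and the placeholder values are assumptions made for illustration. Check the Create Personas reference for the exact shape.

```python
import json


def build_s2s_persona_payload(api_key, instructions, provider="openai", websocket_url=None):
    """Assemble a hypothetical Create Personas body for speech to speech mode."""
    s2s_layer = {
        "provider": provider,  # this page states the default provider is openai
        "api_key": api_key,    # key for the external speech to speech provider
        # instructions take the place of system prompt and conversational context
        "session_settings": {"instructions": instructions},
    }
    if websocket_url is not None:
        # only included when you point at your own model endpoint
        s2s_layer["websocket_url"] = websocket_url
    return {
        "pipeline_mode": "speech-to-speech",
        # ASSUMPTION: the "layers" -> "s2s" nesting is illustrative, not confirmed
        "layers": {"s2s": s2s_layer},
    }


persona_payload = build_s2s_persona_payload(
    api_key="<your-provider-api-key>",
    instructions="You are a concise, friendly assistant.",
)
print(json.dumps(persona_payload, indent=2))
```

The builder deliberately omits any system prompt or conversational context field, since this mode does not allow them.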
From this call to Create Personas, you will receive a response containing a `persona_id`. For example, a response might contain a `persona_id` of `p24293d6`.
Using the above `persona_id`, we can create a conversation using the Create Conversation endpoint. In this request, we will include the `replica_id` of the replica that we want to use for this conversation and the `persona_id` that we created above.
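
As a rough illustration, the Create Conversation request body only needs the two identifiers named above. The helper name and the placeholder replica ID below are hypothetical; `replica_id` and `persona_id` are the fields this page describes, and `p24293d6` is the example persona ID from the previous step.

```python
import json


def build_conversation_payload(replica_id, persona_id):
    """Assemble a Create Conversation body from a replica and a persona."""
    if not replica_id or not persona_id:
        # both identifiers are needed to start a conversation
        raise ValueError("replica_id and persona_id are both required")
    return {"replica_id": replica_id, "persona_id": persona_id}


conversation_payload = build_conversation_payload(
    replica_id="<your-replica-id>",  # placeholder, substitute a real replica
    persona_id="p24293d6",           # example persona ID from this page
)
print(json.dumps(conversation_payload, indent=2))
```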
You can reuse personas when creating conversations. To learn more about creating conversations, see the Create Conversation endpoint reference.
Response:
In the response, you will receive a `conversation_id`. Using this `conversation_id`, you can join the conversation and connect to your speech to speech model.
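
Once you have a `conversation_id`, the real-time `instructions` update described earlier could be shaped roughly as follows. This is a sketch under stated assumptions: this page only says that a `realtime_api` event is sent through the Interactions protocol, so the envelope keys (`event_type`, `properties`), the helper name, and the example ID are all hypothetical. Consult the Interactions protocol reference for the actual event format.

```python
import json


def build_instructions_update(conversation_id, new_instructions):
    """Assemble a hypothetical realtime_api event that replaces the instructions."""
    return {
        "conversation_id": conversation_id,
        # ASSUMPTION: where the event name goes in the envelope is not confirmed
        "event_type": "realtime_api",
        # ASSUMPTION: mirrors the session_settings shape used at persona creation
        "properties": {"session_settings": {"instructions": new_instructions}},
    }


update_event = build_instructions_update(
    conversation_id="<your-conversation-id>",
    new_instructions="Answer in one short sentence.",
)
print(json.dumps(update_event, indent=2))
```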