This is an event broadcast by Tavus.
A conversation.utterance.streaming event is sent each time a new batch of audio is delivered to the user during a replica’s turn. Each event contains the accumulated text that has been spoken so far, allowing you to progressively display what the replica is saying in real time.
The content_index field is a 0-based, monotonically increasing integer per inference that can be used to maintain correct ordering of messages received over the network.
When the replica finishes speaking, a final event is sent with final: true. If the replica was interrupted by the user before completing its response, the final event will also include is_interrupted: true, and the speech field will contain only the words that were actually spoken aloud.
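The ordering and finalization rules above can be sketched as a small transcript builder. This is a minimal illustration, not an official client: the field names speech, content_index, final, and is_interrupted follow the descriptions on this page, but the exact payload nesting under properties is an assumption.

```python
# Minimal sketch of a consumer for conversation.utterance.streaming events.
# Assumes speech, content_index, final, and is_interrupted live under the
# event's properties object (nesting is an assumption).

class StreamingTranscript:
    """Tracks the latest accumulated text for one inference turn, using
    content_index to discard events that arrive out of order."""

    def __init__(self) -> None:
        self._latest_index = -1
        self.text = ""
        self.finished = False
        self.interrupted = False

    def handle(self, event: dict) -> None:
        props = event["properties"]
        # content_index is 0-based and monotonically increasing per inference,
        # so a smaller or equal index means the event arrived late: skip it.
        if props["content_index"] <= self._latest_index:
            return
        self._latest_index = props["content_index"]
        # speech is cumulative, so the newest event replaces prior text.
        self.text = props["speech"]
        if props.get("final"):
            self.finished = True
            self.interrupted = bool(props.get("is_interrupted", False))
```

Because each event carries the accumulated text rather than a delta, rendering the latest speech value (after the content_index check) is enough to progressively display the replica's words.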
This event is distinct from the legacy conversation.utterance event (role=replica), which is sent immediately when the replica begins speaking and contains the full LLM response text. The conversation.utterance.streaming event instead reflects what the replica has actually spoken so far, making it ideal for building accurate transcripts and chat histories.
The inference_id can be used to correlate this event with other events such as conversation.utterance and conversation.replica.started_speaking.
The message type indicates which product this event is used for. In this case, message_type will always be conversation.
"conversation"
This is the type of event that is being sent back. This field will always be conversation.utterance.streaming.
"conversation.utterance.streaming"
The unique identifier for the conversation.
"c123456"
A unique identifier for the replica's current inference turn. Can be used to correlate with other events such as conversation.utterance and conversation.replica.started_speaking.
"83294d9f-8306-491b-a284-791f56c8383f"
Contains the accumulated spoken text and metadata about the streaming state.