{
  "message_type": "conversation",
  "event_type": "conversation.utterance",
  "conversation_id": "c123456",
  "inference_id": "83294d9f-8306-491b-a284-791f56c8383f",
  "properties": {
    "role": "user",
    "speech": "Hello, how are you?",
    "visual_context": "There is a man wearing over-ear headphones in a room that seems to be in an office setting, with monitors in the background. The man seems happy, and is looking at the screen."
  }
}

Interactions Protocol

Utterance Event

This is an event broadcast by Tavus.

An utterance contains the content of what was spoken and an indication of who spoke it (i.e., the user or the replica). Each utterance event includes all of the words spoken by the user or replica, measured from when the speaker started speaking to when they finished. This could include multiple sentences or phrases.

Utterance events can be used to keep track of what the user or the replica has said.
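As an illustration of tracking what was said, here is a minimal Python sketch that collects utterances per conversation. It assumes the event arrives as a JSON string shaped like the sample on this page; how the event is delivered to your application is not shown here. The `Transcript` class and its `handle_event` method are hypothetical names introduced for this example and are not part of any Tavus SDK.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Transcript:
    """Hypothetical helper: collects utterances per conversation, in arrival order."""
    turns: dict = field(default_factory=dict)

    def handle_event(self, raw: str) -> bool:
        """Record the event if it is an utterance; return True when recorded."""
        event = json.loads(raw)
        if event.get("event_type") != "conversation.utterance":
            return False  # some other event type; ignored in this sketch
        props = event.get("properties", {})
        # Store (role, speech) pairs keyed by conversation_id.
        turn = (props.get("role"), props.get("speech"))
        self.turns.setdefault(event.get("conversation_id"), []).append(turn)
        return True


# Sample payload copied from the example event (visual_context omitted for brevity).
sample = json.dumps({
    "message_type": "conversation",
    "event_type": "conversation.utterance",
    "conversation_id": "c123456",
    "inference_id": "83294d9f-8306-491b-a284-791f56c8383f",
    "properties": {"role": "user", "speech": "Hello, how are you?"},
})

transcript = Transcript()
transcript.handle_event(sample)
print(transcript.turns["c123456"])  # [('user', 'Hello, how are you?')]
```

Keying the turns by `conversation_id` keeps separate conversations apart if a single handler receives events from more than one.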

To track how long an utterance lasts, refer to the duration in the “User Started/Stopped Speaking” and “Replica Started/Stopped Speaking” events.

The schema is of type object.
