This guide helps you get started with Echo Mode. We will first walk through setting up a persona and conversation, then show you how to send echo messages.

Part 1: Creating the Persona and Conversation

First, we will use the Create Persona endpoint to create a persona that has pipeline_mode set to echo and has the proper layers configured. You can learn more about creating personas here.

POST /v2/personas

{
    "persona_name": "Echo Mode Persona",
    "pipeline_mode": "echo",
    "system_prompt": "You are a helpful assistant that can answer questions and help with tasks."
}

This call to Create Persona returns a response containing a persona_id. For example, the following response has a persona_id of p24293d6.

{
  "persona_id": "p24293d6"
}
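As a rough sketch, the Create Persona request above could be built in Python with only the standard library. The base URL https://tavusapi.com, the x-api-key header, and the helper name build_create_persona_request are assumptions for illustration and are not stated in this guide; check the API reference for the values that apply to your account. The request is built but not sent.

```python
import json
import urllib.request

# Assumptions: the base URL and the x-api-key header are not given in this guide.
API_BASE = "https://tavusapi.com"
API_KEY = "your-api-key"  # placeholder, replace with a real key


def build_create_persona_request(persona_name, system_prompt):
    """Build (but do not send) a POST /v2/personas request for an echo persona."""
    payload = {
        "persona_name": persona_name,
        "pipeline_mode": "echo",
        "system_prompt": system_prompt,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/v2/personas",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": API_KEY},
        method="POST",
    )


req = build_create_persona_request(
    "Echo Mode Persona",
    "You are a helpful assistant that can answer questions and help with tasks.",
)
# To send it, call urllib.request.urlopen(req) and read persona_id
# from the JSON body of the response.
```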

Using the above persona_id, we can create a conversation with the Create Conversation endpoint. The request includes the replica_id of the replica that we want to use for this conversation and the persona_id that we created above. Personas can be reused across conversations. You can learn more about creating conversations here.

POST /v2/conversations

{
  "replica_id": "re8e740a42",
  "persona_id": "p24293d6",
  "conversation_name": "Music Chat with DJ Kot",
  "conversational_context": "Talk about the greatest hits from my favorite band, Daft Punk, and how their style influenced modern electronic music."
}

Response:

{
  "conversation_id": "c12345",
  "conversation_name": "Music Chat with DJ Kot",
  "status": "active",
  "conversation_url": "https://tavus.daily.co/c12345",
  "replica_id": "re8e740a42",
  "persona_id": "p24293d6",
  "created_at": "2024-08-13T12:34:56Z"
}

In the response, you will receive a conversation_id. Using this conversation_id, we can join the conversation and send echo messages.
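A minimal sketch of pulling the needed fields out of that response is shown here, using the sample response verbatim. Treating conversation_url as the room URL to join is an assumption inferred from the sample script in Part 2, which takes a room URL as its argument.

```python
import json

# Sample Create Conversation response, copied from the example above.
response_body = """
{
  "conversation_id": "c12345",
  "conversation_name": "Music Chat with DJ Kot",
  "status": "active",
  "conversation_url": "https://tavus.daily.co/c12345",
  "replica_id": "re8e740a42",
  "persona_id": "p24293d6",
  "created_at": "2024-08-13T12:34:56Z"
}
"""

conversation = json.loads(response_body)

# conversation_id goes into every echo message.
conversation_id = conversation["conversation_id"]
# Assumption: conversation_url is the room URL passed to the script in Part 2.
conversation_url = conversation["conversation_url"]

print(conversation_id, conversation_url)
```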

Part 2: Using Text and Audio Echo

Once we have a conversation_id, we can join the conversation and send echo messages as either text or audio. Audio must be base64 encoded. We recommend a sample rate of 24000 Hz for higher quality; if no sample rate is specified, it defaults to 16000 Hz for backwards compatibility.
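One way to prepare base64-encoded audio is sketched below. It assumes the audio is raw 16-bit mono PCM and uses an arbitrary chunk size; this guide does not specify the audio format or a chunk size, and the helper name encode_audio_chunks is hypothetical.

```python
import base64


def encode_audio_chunks(pcm_bytes, chunk_size=32000):
    """Split raw audio bytes into fixed-size pieces and base64-encode each one."""
    return [
        base64.b64encode(pcm_bytes[i:i + chunk_size]).decode("ascii")
        for i in range(0, len(pcm_bytes), chunk_size)
    ]


# Assumption: 16-bit mono PCM, so one second at 16000 Hz is 2 * 16000 bytes.
silence = bytes(2 * 16000)

# Keep chunk_size even so a 16-bit sample is never split across chunks.
chunks = encode_audio_chunks(silence, chunk_size=8000)
```

Each element of chunks is a plain ASCII string, so it can be placed directly in a JSON message.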

Here is a simple Python Flask app that joins a conversation and sends audio echo interaction messages.

Learn more about formatting Echo Interactions here.

import sys

from daily import CallClient, Daily, EventHandler
from flask import Flask, jsonify, request

app = Flask(__name__)

# Global variable to store the CallClient instance
call_client = None


class RoomHandler(EventHandler):
    def __init__(self):
        super().__init__()

    def on_app_message(self, message, sender: str) -> None:
        print(f"Incoming app message from {sender}: {message}")


def join_room(url):
    global call_client
    try:
        Daily.init()
        output_handler = RoomHandler()
        call_client = CallClient(event_handler=output_handler)
        call_client.join(url)
        print(f"Joined room: {url}")
    except Exception as e:
        print(f"Error joining room: {e}")
        raise

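# Placeholder base64-encoded audio chunks. The app itself does not use them;
# a client would POST each one in the "audio" field of /send_audio_message.
audio_chunks = ["base64-chunk-1", "base64-chunk-2", "base64-chunk-3"]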


@app.route("/send_audio_message", methods=["POST"])
def send_audio_message():
    global call_client
    if not call_client:
        return jsonify({"error": "Not connected to a room"}), 400
    
    try:
        body = request.json
        conversation_id = body.get("conversation_id")
        modality = body.get("modality")
        base64_audio = body.get("audio")
        sample_rate = body.get("sample_rate", 16000)
        inference_id = body.get("inference_id")
        done = body.get("done")

        message = {
            "message_type": "conversation",
            "event_type": "conversation.echo",
            "conversation_id": conversation_id,
            "properties": {
                "modality": modality,
                "inference_id": inference_id,
                "audio": base64_audio,
                "done": done,
                "sample_rate": sample_rate,
            }
        }

        call_client.send_app_message(message)
        return jsonify({"status": "Message sent successfully"}), 200
    except Exception as e:
        return jsonify({"error": f"Failed to send message: {str(e)}"}), 500


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python script.py <room_url>")
        sys.exit(1)

    room_url = sys.argv[1]

    try:
        join_room(room_url)
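        # Disable the reloader: in debug mode it re-runs this script in a child
        # process, which would join the room a second time.
        app.run(port=8000, debug=True, use_reloader=False)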
    except Exception as e:
        print(f"Failed to start the application: {e}")
        sys.exit(1)

In the above example, a POST request to the /send_audio_message endpoint calls send_app_message, which sends the base64-encoded audio chunk to the replica.
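The sketch below builds one echo message per audio chunk, using the same message fields as the Flask handler. Several details are assumptions rather than facts from this guide: that the modality value for audio is "audio", that done should be true only on the final chunk, and that all chunks of one utterance share an inference_id. The helper name build_audio_echo_messages is hypothetical.

```python
def build_audio_echo_messages(conversation_id, chunks, inference_id, sample_rate=16000):
    """Return one conversation.echo message per base64-encoded audio chunk."""
    messages = []
    for index, chunk in enumerate(chunks):
        messages.append({
            "message_type": "conversation",
            "event_type": "conversation.echo",
            "conversation_id": conversation_id,
            "properties": {
                "modality": "audio",  # assumption: value used for audio echo
                "inference_id": inference_id,  # assumption: shared by all chunks
                "audio": chunk,
                # Assumption: done marks the last chunk of the utterance.
                "done": index == len(chunks) - 1,
                "sample_rate": sample_rate,
            },
        })
    return messages


messages = build_audio_echo_messages(
    "c12345",
    ["base64-chunk-1", "base64-chunk-2", "base64-chunk-3"],
    inference_id="inference-1",
)
```

Each message could then be passed to call_client.send_app_message in order.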

You can learn more about how to send text or audio messages via the Echo Interaction here.
