The agentic PAL building & testing flow is an autonomous loop for a new Tavus PAL. It builds a PAL from a creator prompt, publishes it, runs simulated text turns through CVI chat mode, and returns a structured verdict so an agent can decide whether the PAL works or whether the system prompt and configuration need another pass. Use it when an agent needs a single answer: did the PAL I asked for actually behave correctly?
Conversations started during build-and-verify, including CVI chat-mode probes and full preview URLs, incur the same charges as any other conversation on your account.

MCP: agent-owned loop

Use tavus_pal_build_and_verify when Codex, Claude Code, Cursor, or another MCP client should drive the whole build, test, and judge loop.

CLI: terminal workflow

Use tavus pal build when you want the same workflow from a shell, with JSON output for scripts or CI-style checks.

What the loop does

1. Build the PAL

The flow opens a conversational builder session, sends the creator prompt, applies builder updates to personality, greeting, objectives, and guardrails, and publishes the resulting PAL.
The MCP tool can run this autonomously. The CLI is suited to terminal use and may prompt you for builder follow-up answers before publishing.

2. Attach a face

If you pass face_id, the flow validates that face and attaches it to the PAL. If you omit it, Tavus selects and attaches a default face from the account’s available faces.

3. Generate simulated turns

After publish, Tavus reads the PAL spec and generates validation probes. The probes target the PAL’s objectives, adversarial guardrail cases, attached knowledge base documents, and attached tools.

4. Run text-only CVI chat mode

The flow starts a CVI chat-mode conversation and sends each probe as a user turn. Chat mode uses the same PAL configuration but skips Daily/video rendering, so it is fast enough for agent regression checks.

5. Judge the transcript

Tavus judges the resulting transcript against the PAL spec and returns a verdict with evidence for objectives, guardrails, knowledge base usage, and tool behavior. The public MCP and CLI wrappers perform one bounded refinement pass when the first verdict is not a pass.
The builder driver, probe generator, face selector, and judge run on Tavus infrastructure. The caller only needs normal Tavus authentication for the selected environment; no client-side LLM key is required.

Choosing the right surface

| Use this | When |
| --- | --- |
| tavus_pal_build_and_verify | An MCP-connected coding agent should create, test, and judge a new PAL in one tool call. |
| tavus pal build | You want a reproducible terminal command and JSON output for a human or script to inspect. |
| tavus_chat_start / tavus_chat_turn | You already have a PAL and only want to run specific text probes against it. |
| tavus_pal_preview | You need a full audio/video conversation URL for human visual verification after text-mode validation passes. |

MCP usage

Ask the agent to call the MCP tool with a concrete creator prompt:
tavus_pal_build_and_verify(
  prompt="Create a concise onboarding coach for new API developers. It should ask what they are building, recommend the right Tavus integration path, and avoid making pricing promises.",
  max_rounds=4
)
The tool returns IDs, the build transcript, generated probes, the smoke-test transcript, and the verdict:
{
  "builder_id": "b...",
  "pal_id": "p...",
  "face_id": "r...",
  "pal_url": "https://maker.tavus.io/dev/pals/update?pal_id=p...",
  "validated": true,
  "probes": [
    "I am building a support assistant. Which Tavus integration should I use?",
    "Can you promise me this will cost less than my current vendor?"
  ],
  "smoke_transcript": [
    { "role": "user", "text": "I am building a support assistant. Which Tavus integration should I use?" },
    { "role": "assistant", "text": "..." }
  ],
  "verdict": {
    "overall": "pass",
    "summary": "The PAL asks for context, recommends the correct integration path, and avoids pricing promises."
  }
}
Agents should check verdict.overall first, then inspect verdict.summary, smoke_transcript, and any failing evidence fields when the result is partial or fail.
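The triage order described above can be sketched in Python. This is a minimal sketch, not Tavus client code: `triage` is a hypothetical helper, and it assumes each evidence section under `verdict` (`objectives`, `guardrails`, `knowledge_base`, `tools`) is a list of objects carrying a boolean `passed` key. The real evidence shape is not shown in this document, so adjust the key names to match the actual output.

```python
from typing import Any

# Verdict sections named in this document; their inner shape is an assumption.
EVIDENCE_SECTIONS = ("objectives", "guardrails", "knowledge_base", "tools")


def triage(result: dict[str, Any]) -> dict[str, Any]:
    """Check verdict.overall first, then gather details only when needed."""
    verdict = result.get("verdict", {})
    overall = verdict.get("overall", "fail")  # a missing verdict counts as a failure
    report: dict[str, Any] = {"overall": overall, "summary": verdict.get("summary", "")}
    if overall == "pass":
        return report

    # ASSUMPTION: each section is a list of dicts with a boolean "passed" key.
    failing = {}
    for section in EVIDENCE_SECTIONS:
        entries = verdict.get(section) or []
        bad = [e for e in entries if isinstance(e, dict) and not e.get("passed", False)]
        if bad:
            failing[section] = bad
    report["failing"] = failing
    report["transcript"] = result.get("smoke_transcript", [])
    return report
```

On a pass the report stays small; on partial or fail it carries the failing entries and the transcript the judge saw, so the agent has what it needs for the next prompt edit.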

CLI usage

Run the same workflow from a shell:
tavus pal build \
  --prompt "Create a concise onboarding coach for new API developers. It should ask what they are building, recommend the right Tavus integration path, and avoid making pricing promises." \
  --max-rounds 4 \
  --json > build-result.json

jq '.validated, .verdict' build-result.json
Use --face-id <face_id> when you need a specific face/voice. Otherwise, Tavus selects a default face for the new PAL.
For agent automation, prefer --json and assert on validated or verdict.overall instead of exact assistant wording. PAL replies are non-deterministic, so the verdict and evidence fields are the stable output.
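A CI-style check along these lines could turn the saved JSON into an exit code. The sketch below assumes the `build-result.json` file written by the command above. The function names and the specific exit-code values are illustrative choices, not part of the Tavus CLI.

```python
import json
from pathlib import Path


def exit_code_for(result: dict) -> int:
    """Map a build result to an exit code without reading reply text."""
    overall = result.get("verdict", {}).get("overall")
    validated = result.get("validated")
    # validated is described as a shortcut for overall == "pass"; a mismatch
    # means the output is not in the shape this script expects.
    if validated is not None and overall is not None and validated != (overall == "pass"):
        return 2
    if validated is True or overall == "pass":
        return 0
    return 1


def check_file(path: str = "build-result.json") -> int:
    """Load a saved result file and return the exit code for it."""
    return exit_code_for(json.loads(Path(path).read_text()))
```

A wrapper script would end with `sys.exit(check_file())` so the pipeline step fails on anything other than a pass.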

Reading the verdict

| Field | Meaning |
| --- | --- |
| validated | Boolean shortcut for verdict.overall === "pass". |
| verdict.overall | pass, partial, or fail. |
| verdict.objectives | Whether the PAL satisfied each objective, with evidence. |
| verdict.guardrails | Whether each guardrail held under adversarial probes, with evidence. |
| verdict.knowledge_base | Whether attached documents were used when a probe required them. |
| verdict.tools | Whether attached tools appeared to be invoked when required. |
| smoke_transcript | The simulated user turns and PAL replies used by the judge. |
| refine_rounds_used | Number of automatic refinement rounds used by the wrapper. |
Treat partial as actionable feedback, not a transport failure. Read the failing evidence, tighten the creator prompt or PAL system prompt, then run the flow again or patch the PAL directly with tavus_patch_pal.
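One way an agent might encode that distinction is a small decision function. This is a sketch under assumptions: the returned labels (`preview`, `refine`, `retry`) are hypothetical names for the agent's own next actions, not Tavus values.

```python
def next_step(result: dict) -> str:
    """Pick the agent's next action from a build-and-verify result."""
    overall = result.get("verdict", {}).get("overall")
    if overall == "pass":
        # Text-mode checks passed; hand off for human visual QA.
        return "preview"
    if overall in ("partial", "fail"):
        # A real verdict came back: read the evidence, then tighten the
        # prompt and re-run, or patch the PAL directly.
        return "refine"
    # No recognizable verdict: treat it as a call failure, not PAL feedback.
    return "retry"
```

Keeping `retry` separate from `refine` stops the agent from rewriting a prompt in response to what was only a failed call.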

What this does not replace

Build-and-verify is a text-mode behavioral check. It does not validate visual rendering, facial expression quality, audio latency, room join behavior, or user-device permissions. After a PAL passes the simulated turns, use tavus_pal_preview or tavus pal preview to hand a full conversation URL to a human for visual QA.
For existing PALs, do not rebuild just to test them. Start chat mode directly with tavus_chat_start and tavus_chat_turn, or use the CLI tavus chat commands to run targeted probes.
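Running targeted probes against an existing PAL might look like the sketch below. The parameters and return values of tavus_chat_start and tavus_chat_turn are not documented here, so the code takes two caller-supplied functions instead: `start_chat` and `send_turn` are hypothetical stand-ins for however your client invokes those tools, and `run_probes` is an illustrative helper.

```python
from typing import Callable


def run_probes(
    pal_id: str,
    probes: list[str],
    start_chat: Callable[[str], str],
    send_turn: Callable[[str, str], str],
) -> list[dict[str, str]]:
    """Send each probe as a user turn and collect a transcript."""
    # ASSUMPTION: start_chat(pal_id) returns a conversation handle and
    # send_turn(handle, text) returns the assistant's reply text.
    conversation = start_chat(pal_id)
    transcript = []
    for probe in probes:
        reply = send_turn(conversation, probe)
        transcript.append({"role": "user", "text": probe})
        transcript.append({"role": "assistant", "text": reply})
    return transcript
```

The transcript uses the same role/text entries as smoke_transcript, so the same review code can read both.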

Common failures

| Failure | What to do |
| --- | --- |
| Face not found | The supplied face_id does not exist in the selected environment. Pick a valid face or omit face_id. |
| Face selection failed | Tavus could not choose or attach a default face. Pass --face-id or fix the account’s face catalog. |
| Builder did not reach draft_ready | The flow still publishes and tests the draft, but the verdict may be partial. Re-run with a more specific creator prompt. |
| Chat turn timeout | The probe produced no assistant reply before the timeout. Inspect smoke_transcript and retry or simplify the prompt. |
| Judge returned partial or fail | Use the verdict evidence as a prompt-edit checklist, then patch or rebuild the PAL. |