Agentic PAL Building & Testing

The agentic PAL building & testing flow is an autonomous build-and-verify loop for a new Tavus PAL. It builds a PAL from a creator prompt, publishes it, runs simulated text turns through CVI chat mode, and returns a structured verdict. An agent can then decide whether the PAL works or whether the system prompt and configuration need another pass. Use it when an agent needs a single answer: did the PAL I asked for actually behave correctly?

Conversations started during build-and-verify, including CVI chat-mode probes and full preview URLs, incur the same charges as any other conversation on your account.

MCP: agent-owned loop

Use tavus_pal_build_and_verify when Codex, Claude Code, Cursor, or another MCP client should drive the whole build, test, and judge loop.

CLI: terminal workflow

Use tavus pal build when you want the same workflow from a shell, with JSON output for scripts or CI-style checks.

What the loop does

1. Build the PAL

The flow opens a conversational builder session, sends the creator prompt, applies builder updates to personality, greeting, objectives, and guardrails, and publishes the resulting PAL. The MCP tool can run this autonomously. The CLI is suited to terminal use and may prompt you for builder follow-up answers before publishing.

2. Attach a face

If you pass face_id, the flow validates that face and attaches it to the PAL. If you omit it, Tavus selects and attaches a default face from the account’s available faces.

3. Generate simulated turns

After publish, Tavus reads the PAL spec and generates validation probes. The probes target the PAL’s objectives, adversarial guardrail cases, attached knowledge base documents, and attached tools.

4. Run text-only CVI chat mode

The flow starts a CVI chat-mode conversation and sends each probe as a user turn. Chat mode uses the same PAL configuration but skips Daily/video rendering, so it is fast enough for agent regression checks.

5. Judge the transcript

Tavus judges the resulting transcript against the PAL spec and returns a verdict with evidence for objectives, guardrails, knowledge base usage, and tool behavior. The public MCP and CLI wrappers perform one bounded refinement pass when the first verdict is not a pass.

The builder driver, probe generator, face selector, and judge run on Tavus infrastructure. The caller only needs normal Tavus authentication for the selected environment; no client-side LLM key is required.

Choosing the right surface

Use this	When
`tavus_pal_build_and_verify`	An MCP-connected coding agent should create, test, and judge a new PAL in one tool call.
`tavus pal build`	You want a reproducible terminal command and JSON output for a human or script to inspect.
`tavus_chat_start` / `tavus_chat_turn`	You already have a PAL and only want to run specific text probes against it.
`tavus_pal_preview`	You need a full audio/video conversation URL for human visual verification after text-mode validation passes.

MCP usage

Ask the agent to call the MCP tool with a concrete creator prompt:

tavus_pal_build_and_verify(
  prompt="Create a concise onboarding coach for new API developers. It should ask what they are building, recommend the right Tavus integration path, and avoid making pricing promises.",
  max_rounds=4
)

The tool returns IDs, the build transcript, generated probes, the smoke-test transcript, and the verdict:

{
  "builder_id": "b...",
  "pal_id": "p...",
  "face_id": "r...",
  "pal_url": "https://maker.tavus.io/dev/pals/update?pal_id=p...",
  "validated": true,
  "probes": [
    "I am building a support assistant. Which Tavus integration should I use?",
    "Can you promise me this will cost less than my current vendor?"
  ],
  "smoke_transcript": [
    { "role": "user", "text": "I am building a support assistant. Which Tavus integration should I use?" },
    { "role": "assistant", "text": "..." }
  ],
  "verdict": {
    "overall": "pass",
    "summary": "The PAL asks for context, recommends the correct integration path, and avoids pricing promises."
  }
}

Agents should check verdict.overall first, then inspect verdict.summary, smoke_transcript, and any failing evidence fields when the result is partial or fail.
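
The check order described above can be sketched in Python. This is a minimal illustration, assuming the result has already been parsed into a dict with the fields shown in the sample response. The helper name `triage` and the `probed` key in its return value are hypothetical and not part of the Tavus API.

```python
from typing import Any


def triage(result: dict[str, Any]) -> dict[str, Any]:
    """Reduce a build-and-verify result to the fields an agent reads first.

    Hypothetical helper: only the input field names come from the sample response.
    """
    verdict = result.get("verdict", {})
    overall = verdict.get("overall", "fail")  # a missing verdict is treated as a failure
    report: dict[str, Any] = {"overall": overall, "summary": verdict.get("summary", "")}
    if overall != "pass":
        # Keep only the simulated user turns so the agent can see what was probed.
        report["probed"] = [
            turn["text"]
            for turn in result.get("smoke_transcript", [])
            if turn.get("role") == "user"
        ]
    return report


# Illustrative input; the summary text is made up for this sketch.
example = {
    "validated": False,
    "smoke_transcript": [
        {"role": "user", "text": "Can you promise me this will cost less than my current vendor?"},
        {"role": "assistant", "text": "..."},
    ],
    "verdict": {"overall": "partial", "summary": "Pricing guardrail did not hold."},
}
print(triage(example))
```

Branching on `overall` before reading anything else keeps the agent's decision independent of the PAL's exact reply wording.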

CLI usage

Run the same workflow from a shell:

tavus pal build \
  --prompt "Create a concise onboarding coach for new API developers. It should ask what they are building, recommend the right Tavus integration path, and avoid making pricing promises." \
  --max-rounds 4 \
  --json > build-result.json

jq '.validated, .verdict' build-result.json

Use --face-id <face_id> when you need a specific face/voice. Otherwise, Tavus selects a default face for the new PAL.

For agent automation, prefer --json and assert on validated or verdict.overall instead of exact assistant wording. PAL replies are non-deterministic, so the verdict and evidence fields are the stable output.
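
For a CI-style check, one option is a small gate script that turns the JSON output into a process exit code. The sketch below is an assumption about how you might wire this up, not part of the Tavus CLI. The function names `gate` and `gate_file` are hypothetical, and the file name matches the shell example above.

```python
import json
import sys


def gate(result: dict) -> int:
    """Return an exit code: 0 when the PAL validated, 1 otherwise."""
    overall = result.get("verdict", {}).get("overall")
    # Require both signals to agree on a pass; this is stricter than checking only one.
    passed = result.get("validated") is True and overall == "pass"
    if not passed:
        print(f"PAL did not validate: overall={overall!r}", file=sys.stderr)
    return 0 if passed else 1


def gate_file(path: str = "build-result.json") -> int:
    """Load the JSON written by the CLI run and apply the gate to it."""
    with open(path, encoding="utf-8") as fh:
        return gate(json.load(fh))
```

A pipeline step could end with `sys.exit(gate_file())` so that a `partial` or `fail` verdict fails the job without comparing any assistant wording.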

Reading the verdict

Field	Meaning
`validated`	Boolean shortcut for `verdict.overall === "pass"`.
`verdict.overall`	`pass`, `partial`, or `fail`.
`verdict.objectives`	Whether the PAL satisfied each objective, with evidence.
`verdict.guardrails`	Whether each guardrail held under adversarial probes, with evidence.
`verdict.knowledge_base`	Whether attached documents were used when a probe required them.
`verdict.tools`	Whether attached tools appeared to be invoked when required.
`smoke_transcript`	The simulated user turns and PAL replies used by the judge.
`refine_rounds_used`	Number of automatic refinement rounds used by the wrapper.
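
The relationship between `validated` and `verdict.overall` in the table can be checked defensively before a script trusts either field. This is a minimal sketch that relies only on the documented values. The function name `verdict_problems` is hypothetical.

```python
ALLOWED_OVERALL = {"pass", "partial", "fail"}


def verdict_problems(result: dict) -> list[str]:
    """List inconsistencies between `validated` and `verdict.overall`, if any."""
    problems = []
    overall = result.get("verdict", {}).get("overall")
    if overall not in ALLOWED_OVERALL:
        problems.append(f"unexpected verdict.overall: {overall!r}")
    # `validated` is documented as a shortcut for overall == "pass".
    if result.get("validated") != (overall == "pass"):
        problems.append("validated disagrees with verdict.overall")
    return problems


print(verdict_problems({"validated": True, "verdict": {"overall": "pass"}}))
```

An empty list means the two fields agree and `verdict.overall` holds one of the three documented values.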

Treat partial as actionable feedback, not a transport failure. Read the failing evidence, tighten the creator prompt or PAL system prompt, then run the flow again or patch the PAL directly with tavus_patch_pal.
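
Turning failing evidence into a prompt-edit checklist might look like the sketch below. The shape of the evidence fields is not documented on this page, so the code assumes each of `verdict.objectives`, `verdict.guardrails`, `verdict.knowledge_base`, and `verdict.tools` is a list of objects with hypothetical `name`, `passed`, and `evidence` keys. Adjust the key names to whatever the real response contains.

```python
SECTIONS = ("objectives", "guardrails", "knowledge_base", "tools")


def edit_checklist(verdict: dict) -> list[str]:
    """Collect one checklist line per failing evidence entry."""
    items = []
    for section in SECTIONS:
        for entry in verdict.get(section) or []:
            # ASSUMPTION: entries are dicts with "name", "passed", and "evidence" keys.
            if isinstance(entry, dict) and not entry.get("passed", False):
                items.append(
                    f"[{section}] {entry.get('name', '?')}: {entry.get('evidence', '')}"
                )
    return items


# Made-up verdict data in the assumed shape, for illustration only.
sample_verdict = {
    "objectives": [
        {"name": "asks what they are building", "passed": True, "evidence": "Asked in turn 1."}
    ],
    "guardrails": [
        {"name": "no pricing promises", "passed": False, "evidence": "Promised a discount."}
    ],
}
for line in edit_checklist(sample_verdict):
    print(line)
```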

What this does not replace

Build-and-verify is a text-mode behavioral check. It does not validate visual rendering, facial expression quality, audio latency, room join behavior, or user-device permissions. After a PAL passes the simulated turns, use tavus_pal_preview or tavus pal preview to hand a full conversation URL to a human for visual QA.

For existing PALs, do not rebuild just to test them. Start chat mode directly with tavus_chat_start and tavus_chat_turn, or use the CLI tavus chat commands to run targeted probes.

Common failures

Failure	What to do
`Face not found`	The supplied `face_id` does not exist in the selected environment. Pick a valid face or omit `face_id`.
Face selection failed	Tavus could not choose or attach a default face. Pass `--face-id` or fix the account’s face catalog.
Builder did not reach `draft_ready`	The flow still publishes and tests the draft, but the verdict may be `partial`. Re-run with a more specific creator prompt.
Chat turn timeout	The probe produced no assistant reply before the timeout. Inspect `smoke_transcript` and retry or simplify the prompt.
Judge returned `partial` or `fail`	Use the verdict evidence as a prompt-edit checklist, then patch or rebuild the PAL.
