Learn how to use Tavus-optimized LLMs or integrate your own custom LLM.
model
tavus-llama-4
is the default model and runs an optimized variant of Llama-4-17B.tavus-llama-4
(Recommended)tavus-gpt-4o
tavus-gpt-4o-mini
tools
speculative_inference
true
, the LLM begins processing speech transcriptions before user input ends, improving responsiveness.
/chat/completions
endpointmodel
base_url
base_url
.api_key
base_url
and api_key
are required only when using a custom model.tools
speculative_inference
true
, the LLM begins processing speech transcriptions before user input ends, improving responsiveness.
headers
extra_body
raven-0
perception model with a custom LLM, your LLM will receive system messages containing visual context extracted from the user’s video input.