Upload documents to your knowledge base for personas to reference during conversations.
Direct URL to a file or a website for your Knowledge Base. Submitting this URL starts processing asynchronously; the document can be used in conversations once processing completes, which may take a few minutes depending on file size.
For now, our Knowledge Base only supports documents written in English and works best for conversations in English. We will be expanding our Knowledge Base language support soon.
Maximum file size: 50MB. Supported file formats: .pdf, .txt, .docx, .doc, .png, .jpg, .pptx, .csv, and .xlsx. Website URLs are supported: a snapshot of the page is processed into document content; use the crawl object for multi-page crawling from a starting URL.
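The size and format limits above can be checked client-side before a URL is submitted. The following is a minimal sketch, not part of the API: the helper name `classify_document_url` is hypothetical, "50MB" is assumed to mean 50 * 1024 * 1024 bytes, and any URL whose path does not end in a supported file extension is assumed to be a website.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# File formats listed in the documentation above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".txt", ".docx", ".doc", ".png", ".jpg", ".pptx", ".csv", ".xlsx",
}
# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def classify_document_url(url: str, size_bytes: Optional[int] = None) -> str:
    """Return "file" or "website"; raise ValueError on an obvious violation."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    suffix = PurePosixPath(parsed.path).suffix.lower()
    # Assumption (client-side heuristic): no supported extension means a website.
    if suffix not in SUPPORTED_EXTENSIONS:
        return "website"
    if size_bytes is not None and size_bytes > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"file is larger than 50MB: {size_bytes} bytes")
    return "file"
```

The size check only runs when the caller already knows the file size, for example from a prior HEAD request; the service remains the authority on what it accepts.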
"https://docs.example.com/"
Optional name for the document. If not provided, a default name will be generated.
"Example Docs"
Optional URL that receives status updates while the document processes asynchronously (e.g. started, processing, ready, error).
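A server receiving these status updates might dispatch on the reported status. This is a sketch only: the shape of the callback body is not specified here, so the `status` key is an assumption, and `handle_status_update` is a hypothetical helper.

```python
from typing import Any, Dict

# Statuses the documentation gives as examples of callback updates.
IN_PROGRESS_STATUSES = {"started", "processing"}


def handle_status_update(payload: Dict[str, Any]) -> str:
    """Map a status update to the action the receiving server should take."""
    # Assumption: the callback body is a JSON object with a "status" key.
    status = payload.get("status")
    if status == "ready":
        return "attach"       # the document can now be used in conversations
    if status == "error":
        return "investigate"  # processing failed
    if status in IN_PROGRESS_STATUSES:
        return "wait"         # more updates should follow
    return "ignore"           # unrecognized status: do nothing
```

Returning a plain action string keeps the handler independent of any web framework; the route that receives the POST can parse the JSON body and pass it in.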
"https://your-server.com/webhook"
Optional tags to categorize the document for management and for use with document-based access in conversations. After the document is ready, attach it via document_ids on Create Persona or Create Conversation.
["docs", "website"]
Optional configuration for website crawling. When provided with a website URL, the system follows links from the starting URL and processes multiple pages into a single document. Without this parameter, only the single page at the URL is scraped.
Rate limits: at most 100 crawl documents per user, at most 5 concurrent crawls at any time, and a 1-hour cooldown between recrawls of the same document.
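The one-hour cooldown can be checked client-side before a recrawl is requested. A minimal sketch, assuming the cooldown is measured from the document's last-crawl time and that this time is available as an ISO 8601 string; `seconds_until_recrawl_allowed` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

RECRAWL_COOLDOWN = timedelta(hours=1)  # from the rate limits above


def parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so replace it with an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def seconds_until_recrawl_allowed(last_crawled_at: str, now: datetime) -> float:
    """Return how long to wait before recrawling; 0.0 means it is allowed now."""
    elapsed = now - parse_timestamp(last_crawled_at)
    remaining = RECRAWL_COOLDOWN - elapsed
    return max(remaining.total_seconds(), 0.0)


# Thirty minutes after the last crawl, thirty minutes (1800 seconds) remain.
now = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
wait_seconds = seconds_until_recrawl_allowed("2024-01-01T12:00:00Z", now)
```

The other two limits (100 crawl documents per user, 5 concurrent crawls) depend on account-wide state and are not covered by this sketch.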
To fetch fresh content after a crawled document exists, use Recrawl Document.
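The request parameters described above could be assembled into a JSON body as follows. This is a sketch under stated assumptions: the key names `document_url`, `document_name`, `callback_url`, `tags`, and `crawl` are illustrative and should be checked against the API schema, `build_create_document_body` is a hypothetical helper, and the crawl configuration is passed through as an opaque dictionary because its fields are not listed here.

```python
import json
from typing import Any, Dict, List, Optional


def build_create_document_body(
    url: str,
    name: Optional[str] = None,
    callback_url: Optional[str] = None,
    tags: Optional[List[str]] = None,
    crawl: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the request body, omitting optional fields that were not given."""
    # Assumption: these JSON key names are illustrative, not confirmed here.
    body: Dict[str, Any] = {"document_url": url}
    if name is not None:
        body["document_name"] = name
    if callback_url is not None:
        body["callback_url"] = callback_url
    if tags is not None:
        body["tags"] = list(tags)
    if crawl is not None:
        body["crawl"] = dict(crawl)
    return body


body = build_create_document_body(
    "https://docs.example.com/",
    name="Example Docs",
    callback_url="https://your-server.com/webhook",
    tags=["docs", "website"],
)
print(json.dumps(body, indent=2))
```

Leaving optional keys out rather than sending `null` matches the descriptions above, where an omitted name gets a generated default and an omitted crawl object means only the single page is scraped.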
Document created successfully
Unique identifier for the created document
"d8-5c71baca86fc"
Name of the document
"Example Docs"
URL of the document or website
"https://docs.example.com/"
Current status of the document processing. Possible values: started, processing, ready, error, recrawling.
"started"
Processing progress as a percentage (0-100). Null when processing has not started or is complete.
null
Error code indicating why processing failed. Only present when status is error. Possible values include: file_download_failed, file_format_unsupported, file_size_too_large, file_empty, invalid_file_url, document_processing_failed, website_processing_failed, chunking_failed, embedding_failed, vector_store_failed, contact_support.
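A client polling the document could turn the status, progress, and error code into a short summary. This is a rough sketch: the response keys `status`, `progress`, and `error_code` are assumed names, `summarize_document` is a hypothetical helper, and the split of error codes into input problems and processing problems is an interpretation of the code names rather than documented behavior.

```python
from typing import Any, Dict

# Assumption: grouping inferred from the error code names listed above.
INPUT_ERRORS = {
    "file_download_failed", "file_format_unsupported", "file_size_too_large",
    "file_empty", "invalid_file_url",
}
PROCESSING_ERRORS = {
    "document_processing_failed", "website_processing_failed",
    "chunking_failed", "embedding_failed", "vector_store_failed",
}


def summarize_document(doc: Dict[str, Any]) -> str:
    """Return a one-line, human-readable summary of a document's state."""
    # Assumption: the response exposes "status", "progress", and "error_code".
    status = doc.get("status")
    if status == "ready":
        return "ready"
    if status == "error":
        code = doc.get("error_code")
        if code in INPUT_ERRORS:
            return f"check the source file or URL ({code})"
        if code in PROCESSING_ERRORS:
            return f"processing failed ({code})"
        return f"contact support ({code})"
    progress = doc.get("progress")
    # Progress is null when processing has not started or is complete.
    if progress is None:
        return str(status)
    return f"{status}: {progress}%"
```

Unknown or missing error codes fall through to the support path, so codes added later do not break the client.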
ISO 8601 timestamp of when the document was created
"2024-01-01T12:00:00Z"
ISO 8601 timestamp of when the document was last updated
"2024-01-01T12:00:00Z"
URL that will receive status updates
"https://your-server.com/webhook"
Array of document tags
["docs", "website"]
The crawl configuration used for this document (only present for crawled websites)
List of URLs that were crawled (only present for crawled websites after processing completes)
[
"https://docs.example.com/",
"https://docs.example.com/getting-started",
"https://docs.example.com/api"
]
ISO 8601 timestamp of when the document was last crawled
"2024-01-01T12:00:00Z"
Number of times the document has been crawled
1