# create avatar
Source: https://anam.ai/docs/api-reference/avatars/create-avatar
https://api.anam.ai/swagger.json post /v1/avatars
Create a new one-shot avatar from an image file or image URL. Send either multipart/form-data with an image file, or JSON with an image URL.
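A minimal sketch of the JSON variant, built without sending anything. The `imageUrl` and `name` field names are assumptions for illustration; confirm them against the endpoint schema before use.

```python
import json

def build_create_avatar_request(api_key: str, image_url: str, name: str):
    """Build headers and a JSON body for POST /v1/avatars.

    The imageUrl/name field names are assumptions; check the
    endpoint schema before relying on them.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"imageUrl": image_url, "name": name})
    return headers, body

headers, body = build_create_avatar_request(
    "sk-test", "https://example.com/face.png", "Demo avatar"
)
```

For the multipart variant, send the image file itself instead of a URL, with a `multipart/form-data` content type.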
# delete avatar
Source: https://anam.ai/docs/api-reference/avatars/delete-avatar
https://api.anam.ai/swagger.json delete /v1/avatars/{id}
Delete an avatar by ID
# get avatar
Source: https://anam.ai/docs/api-reference/avatars/get-avatar
https://api.anam.ai/swagger.json get /v1/avatars/{id}
Returns an avatar by ID
# list avatars
Source: https://anam.ai/docs/api-reference/avatars/list-avatars
https://api.anam.ai/swagger.json get /v1/avatars
Returns a list of all avatars
# update avatar
Source: https://anam.ai/docs/api-reference/avatars/update-avatar
https://api.anam.ai/swagger.json put /v1/avatars/{id}
Update an avatar by ID (only display name can be updated)
# create knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/create-knowledge-group
https://api.anam.ai/swagger.json post /v1/knowledge/groups
Create a new knowledge group
# delete knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/delete-knowledge-document
https://api.anam.ai/swagger.json delete /v1/knowledge/documents/{id}
Delete a document from a RAG group
# delete knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/delete-knowledge-group
https://api.anam.ai/swagger.json delete /v1/knowledge/groups/{id}
Delete a RAG group
# get knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-document
https://api.anam.ai/swagger.json get /v1/knowledge/documents/{id}
Get a single document by ID
# get knowledge document download
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-document-download
https://api.anam.ai/swagger.json get /v1/knowledge/documents/{id}/download
Get a presigned download URL for a knowledge document
# get knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-group
https://api.anam.ai/swagger.json get /v1/knowledge/groups/{id}
Get a single RAG group by ID
# list knowledge group documents
Source: https://anam.ai/docs/api-reference/knowledge/list-knowledge-group-documents
https://api.anam.ai/swagger.json get /v1/knowledge/groups/{id}/documents
Get all documents in a RAG group
# list knowledge groups
Source: https://anam.ai/docs/api-reference/knowledge/list-knowledge-groups
https://api.anam.ai/swagger.json get /v1/knowledge/groups
Returns a list of all knowledge groups for the organization
# search knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/search-knowledge-group
https://api.anam.ai/swagger.json post /v1/knowledge/groups/{id}/search
Search for similar content in a RAG group using vector similarity
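A sketch of how a search request might be assembled (prepared but not sent here). The `query` body field is an assumption for illustration; confirm it against the endpoint schema.

```python
import json
import urllib.request

def build_search_request(api_key: str, group_id: str, query: str):
    """Prepare a POST /v1/knowledge/groups/{id}/search request.

    The `query` body field name is an assumption; check the
    endpoint schema before use.
    """
    url = f"https://api.anam.ai/v1/knowledge/groups/{group_id}/search"
    return urllib.request.Request(
        url,
        data=json.dumps({"query": query}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_search_request("sk-test", "grp_123", "refund policy")
```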
# update knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/update-knowledge-document
https://api.anam.ai/swagger.json put /v1/knowledge/documents/{id}
Update a document (rename)
# update knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/update-knowledge-group
https://api.anam.ai/swagger.json put /v1/knowledge/groups/{id}
Update a RAG group
# upload knowledge group document
Source: https://anam.ai/docs/api-reference/knowledge/upload-knowledge-group-document
https://api.anam.ai/swagger.json post /v1/knowledge/groups/{id}/documents
Upload a document to a RAG group (supports PDF, TXT, MD, DOCX, and CSV files up to 50 MB). Authentication can be via API key (Bearer token) or upload token (`X-Upload-Token` header).
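The two documented auth styles and the file constraints can be mirrored client-side before uploading. This is a sketch of pre-flight checks, not the upload itself:

```python
import os

# Limits taken from the endpoint description above.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".md", ".docx", ".csv"}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB

def upload_auth_header(api_key=None, upload_token=None):
    """Pick one of the two documented auth styles for document upload."""
    if api_key:
        return {"Authorization": f"Bearer {api_key}"}
    if upload_token:
        return {"X-Upload-Token": upload_token}
    raise ValueError("provide api_key or upload_token")

def check_upload(filename: str, size_bytes: int) -> bool:
    """Client-side sanity check mirroring the documented limits."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in ALLOWED_EXTENSIONS and size_bytes <= MAX_BYTES
```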
# create llm
Source: https://anam.ai/docs/api-reference/llms/create-llm
https://api.anam.ai/swagger.json post /v1/llms
Create a new LLM configuration
# delete llm
Source: https://anam.ai/docs/api-reference/llms/delete-llm
https://api.anam.ai/swagger.json delete /v1/llms/{id}
Delete an LLM configuration
# get llm
Source: https://anam.ai/docs/api-reference/llms/get-llm
https://api.anam.ai/swagger.json get /v1/llms/{id}
Get a specific LLM by ID
# list llms
Source: https://anam.ai/docs/api-reference/llms/list-llms
https://api.anam.ai/swagger.json get /v1/llms
Returns a list of all LLMs available to the organization
# update llm
Source: https://anam.ai/docs/api-reference/llms/update-llm
https://api.anam.ai/swagger.json put /v1/llms/{id}
Update an LLM configuration
# Introduction
Source: https://anam.ai/docs/api-reference/overview
Quick start for calling the Anam REST API directly.
All endpoints are served under:
```
https://api.anam.ai/v1
```
## Get an API key
Create a key from the [API keys page](https://lab.anam.ai/api-keys) in the Lab. See [Get your API key](/javascript-sdk/api-key) for the full walkthrough. Keep the key on your server. Never ship it to a browser or mobile app.
## Authenticate requests
Every request takes a bearer token in the `Authorization` header:
```bash
curl https://api.anam.ai/v1/personas \
-H "Authorization: Bearer $ANAM_API_KEY"
```
## Connecting clients
Clients don't use your API key. Your backend creates a short-lived session token with [`POST /v1/auth/session-token`](/api-reference/create-session-token) and hands it to the client, which uses it to open a WebRTC stream. See [Authentication](/javascript-sdk/authentication) for a worked server example.
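A server-side sketch of where the API key goes when minting a session token. The request is prepared, not sent, and the body schema is left to the endpoint reference:

```python
import json
import urllib.request

API_KEY = "sk-test"  # keep server-side; never ship to a browser

def build_session_token_request(body: dict):
    """Prepare a POST /v1/auth/session-token request.

    The body schema is left to the endpoint reference; this only
    shows where the API key goes. The returned session token is
    what you hand to the client.
    """
    return urllib.request.Request(
        "https://api.anam.ai/v1/auth/session-token",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_session_token_request({})
```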
## Conventions
* Request and response bodies are JSON. Field names are `camelCase`.
* List endpoints accept `page` and `perPage` query parameters and return `{ data: [...], meta: { total, currentPage, perPage, lastPage, prev, next } }`.
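The pagination convention above can be sketched as follows (the `perPage` default of 25 is illustrative, not documented):

```python
from urllib.parse import urlencode

def list_url(resource: str, page: int = 1, per_page: int = 25):
    """Build a paginated list URL using the documented page/perPage params."""
    return f"https://api.anam.ai/v1/{resource}?" + urlencode(
        {"page": page, "perPage": per_page}
    )

# The response envelope documented above, with illustrative values:
example = {
    "data": [],
    "meta": {"total": 0, "currentPage": 1, "perPage": 25,
             "lastPage": 1, "prev": None, "next": None},
}

url = list_url("personas", page=2, per_page=10)
```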
# create persona
Source: https://anam.ai/docs/api-reference/personas/create-persona
https://api.anam.ai/swagger.json post /v1/personas
Create a new persona
# delete persona
Source: https://anam.ai/docs/api-reference/personas/delete-persona
https://api.anam.ai/swagger.json delete /v1/personas/{id}
Delete a persona by ID
# get persona
Source: https://anam.ai/docs/api-reference/personas/get-persona
https://api.anam.ai/swagger.json get /v1/personas/{id}
Returns a persona by ID
# list personas
Source: https://anam.ai/docs/api-reference/personas/list-personas
https://api.anam.ai/swagger.json get /v1/personas
Returns a list of all personas
# update persona
Source: https://anam.ai/docs/api-reference/personas/update-persona
https://api.anam.ai/swagger.json put /v1/personas/{id}
Update a persona by ID
# create session token
Source: https://anam.ai/docs/api-reference/sessions/create-session-token
https://api.anam.ai/swagger.json post /v1/auth/session-token
Create a new session token used to initialise Anam client-side SDKs
# get session
Source: https://anam.ai/docs/api-reference/sessions/get-session
https://api.anam.ai/swagger.json get /v1/sessions/{id}
Returns a session by ID
# get session recording
Source: https://anam.ai/docs/api-reference/sessions/get-session-recording
https://api.anam.ai/swagger.json get /v1/sessions/{id}/recording
Returns a presigned URL to download the session recording
# get session transcript
Source: https://anam.ai/docs/api-reference/sessions/get-session-transcript
https://api.anam.ai/swagger.json get /v1/sessions/{id}/transcript
Returns the conversation transcript for a session
# list sessions
Source: https://anam.ai/docs/api-reference/sessions/list-sessions
https://api.anam.ai/swagger.json get /v1/sessions
Returns a list of all sessions for the organization
# create share link
Source: https://anam.ai/docs/api-reference/share-links/create-share-link
https://api.anam.ai/swagger.json post /v1/share-links
Create a new share link
# delete share link
Source: https://anam.ai/docs/api-reference/share-links/delete-share-link
https://api.anam.ai/swagger.json delete /v1/share-links/{id}
Delete a share link by ID
# get share link
Source: https://anam.ai/docs/api-reference/share-links/get-share-link
https://api.anam.ai/swagger.json get /v1/share-links/{id}
Returns a share link by ID
# list share links
Source: https://anam.ai/docs/api-reference/share-links/list-share-links
https://api.anam.ai/swagger.json get /v1/share-links
Returns a list of all share links for the organization
# update share link
Source: https://anam.ai/docs/api-reference/share-links/update-share-link
https://api.anam.ai/swagger.json put /v1/share-links/{id}
Update a share link by ID
# create tool
Source: https://anam.ai/docs/api-reference/tools/create-tool
https://api.anam.ai/swagger.json post /v1/tools
Create a new tool for function calling in persona sessions
# delete tool
Source: https://anam.ai/docs/api-reference/tools/delete-tool
https://api.anam.ai/swagger.json delete /v1/tools/{id}
Delete a tool. The tool will be soft-deleted and no longer available.
# get tool
Source: https://anam.ai/docs/api-reference/tools/get-tool
https://api.anam.ai/swagger.json get /v1/tools/{id}
Get a tool by ID
# list tools
Source: https://anam.ai/docs/api-reference/tools/list-tools
https://api.anam.ai/swagger.json get /v1/tools
Returns a list of all tools for the organization
# update tool
Source: https://anam.ai/docs/api-reference/tools/update-tool
https://api.anam.ai/swagger.json put /v1/tools/{id}
Update an existing tool
# create voice
Source: https://anam.ai/docs/api-reference/voices/create-voice
https://api.anam.ai/swagger.json post /v1/voices
Create a new voice by cloning from an audio file
# delete voice
Source: https://anam.ai/docs/api-reference/voices/delete-voice
https://api.anam.ai/swagger.json delete /v1/voices/{id}
Delete a voice by ID
# get voice
Source: https://anam.ai/docs/api-reference/voices/get-voice
https://api.anam.ai/swagger.json get /v1/voices/{id}
Returns a voice by ID
# list voices
Source: https://anam.ai/docs/api-reference/voices/list-voices
https://api.anam.ai/swagger.json get /v1/voices
Returns a list of all voices
# update voice
Source: https://anam.ai/docs/api-reference/voices/update-voice
https://api.anam.ai/swagger.json put /v1/voices/{id}
Update a voice by ID (display name and provider model ID can be updated)
# Changelog
Source: https://anam.ai/docs/changelog
New features, improvements, and fixes
## β‘ More predictable session openings
This release gives builders more control over how sessions begin, especially when a tool-driven turn needs to run cleanly without being interrupted partway through. That makes longer or multi-step tool flows feel more predictable for both builders and end users.
On the media side, you can now pin a session to start in high video quality using `sessionOptions.videoQuality`, which helps sessions reach their intended bitrate faster. We also tightened one-shot avatar refinement so flat or near-solid backgrounds are preserved more reliably in both the Lab and `/v1` avatar creation flow.
***
## Lab Changes
**Improvements**
* **Better default model:** New personas and built-in agent templates now default to GPT OSS 120B instead of GPT OSS 20B, improving reasoning quality and tool use out of the box.
**Fixes**
* **Cleaner avatar refinement:** Fixed a Gemini refinement issue that could replace plain or near-solid avatar backgrounds with invented scenery, textures, or objects during one-shot avatar creation.
## Persona Changes
**Improvements**
* **Protected tool turns:** Tool-driven turns can now optionally suppress interruptions while your app is still handling the action, making longer or multi-step tool flows more predictable.
**Fixes**
* **Protected-turn cleanup:** Interrupt protection is now released cleanly when a greeting or tool turn finishes without spoken output, reducing the chance of sessions getting stuck in a protected state.
## SDK/API Changes
**Improvements**
* **Initial video quality control:** `sessionOptions.videoQuality` now accepts `high` or `auto`, letting you pin a session to start at the maximum video bitrate instead of ramping up from the default profile.
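A minimal sketch of validating this option before starting a session; the surrounding `sessionOptions` shape here is an assumption for illustration:

```python
# "high" pins the maximum bitrate from the start; "auto" ramps up
# from the default profile, as described above.
ALLOWED_VIDEO_QUALITY = {"high", "auto"}

session_options = {"videoQuality": "high"}

def validate_session_options(opts: dict) -> dict:
    quality = opts.get("videoQuality", "auto")
    if quality not in ALLOWED_VIDEO_QUALITY:
        raise ValueError(
            f"videoQuality must be one of {sorted(ALLOWED_VIDEO_QUALITY)}"
        )
    return opts
```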
**Fixes**
* **Avatar API refinement backgrounds:** The same background-preservation fix now applies to the `/v1` avatar creation flow, so refined API-created avatars are less likely to pick up hallucinated scenery.
## π The Anam docs have been overhauled
We redesigned the docs to make it much easier to find the right starting point and drill into the part of the platform you care about. Navigation is now organized around Overview, Embed, JavaScript SDK, Python SDK, Integrations, API Reference, and Changelog, with a rewritten overview page and clearer Learn / Embed / Build entry points.
This overhaul also adds dedicated Python SDK and LiveKit documentation, plus more focused guides for avatars, voices, LLMs, tools, session options, and network configuration.
***
## Docs Changes
**Improvements**
* **New navigation:** The docs now use clearer top-level tabs and reorganized sections so it is faster to jump between concepts, embedding, SDKs, integrations, and API reference.
* **New SDK and integration guides:** Added dedicated Python SDK documentation and a full LiveKit integration section, including overview, quickstart, and configuration guides.
* **Focused concept pages:** Split key setup topics into dedicated pages for available LLMs, creating custom avatars, session controls, voice configuration, and network requirements.
**Fixes**
* **Docs redirects:** Added redirects for renamed and legacy docs URLs so older links and indexed API-reference pages are less likely to land on 404s.
* **Navigation polish:** Improved overview labeling, changelog labeling, and navbar behavior across the docs experience.
## Lab Changes
**Improvements**
* **Sessions page:** Tool calls now appear across session Analytics, Overview, Transcript, and export views, including status, arguments, results, errors, and execution time.
## Persona Changes
**Improvements**
* **Client tool round-trips:** Personas can now continue once your application returns a client tool result, making client-side actions easier to chain into a conversation.
* **Webhook tracing:** Webhook tool requests now include session and correlation IDs, making it easier to trace tool calls across your own backend systems.
**Fixes**
* **Audio preprocessing resilience:** Sessions now fail open if speech-enhancement preprocessing is unavailable, instead of ending unexpectedly.
* **Session startup reliability:** Improved startup and media-timeout handling so transient processing issues are less likely to interrupt an active turn.
## SDK/API Changes
**Improvements**
* **Client tool results:** The JavaScript SDK now sends client tool results and errors back to the engine over the data channel, with session-scoped safeguards.
* **Avatar creation API:** `POST /v1/avatars` now accepts an optional `avatarModel` field during avatar creation.
## π οΈ Tool setup got much easier in the Lab
We redesigned the tool editor so webhook tools can be configured with form-based builders for headers, query params, and body params instead of raw JSON. That makes it much easier to set up tools correctly, especially for non-technical builders or teams collaborating across product and engineering.
This release also includes a few practical fixes around upload limits, session behavior, and API error handling so the platform behaves more clearly when something goes wrong.
***
## Lab Changes
**Improvements**
* **Tool editor:** Rebuilt webhook tool configuration with form-based builders for headers, query params, and body params, so you no longer need to edit raw JSON for common setups.
**Fixes**
* **Connection errors:** Improved LLM URL normalization and connection error messages when custom model endpoints are misconfigured.
* **Avatar uploads:** Reduced the avatar image upload limit to match the real platform file limit and avoid failed uploads.
* **Session cleanup:** Fixed a bug where active sessions could keep running after the player unmounted during tab switches.
## SDK/API Changes
**Improvements**
* **Capacity signaling:** When session capacity is exhausted, the API now returns a clearer `429` response instead of a generic failure.
**Fixes**
* **Knowledge auth:** Fixed knowledge-upload auth and header handling for API callers.
## π― Client-side context injection
You can now inject context into a conversation without triggering a persona response. Call `addContext()` in the JavaScript SDK to silently append information β like CRM data, page navigation events, or real-time application state β to the conversation history. The persona won't respond immediately, but will have that context available the next time the user speaks.
This is useful for building context-aware agents that adapt to what the user is doing in your application without interrupting the conversation flow.
## ποΈ User speech detection events
The SDK now emits `userSpeechStarted` and `userSpeechEnded` events the moment voice activity is detected, before any transcription is available. Use these to build responsive "listening" indicators and other UI feedback that reacts instantly when the user begins or stops speaking.
***
## Lab Changes
**Improvements**
* **Voice cloning for all paid plans:** Custom voice cloning is now available to Explorer and Growth plans, previously limited to Professional and Enterprise.
* **Share and embed redesign:** Share links and embed widgets have been consolidated into a simpler 1-to-1 model with a cleaner management interface.
* **Persona tools via API:** The PUT persona endpoint now accepts a `tool` field, allowing you to attach tools to personas programmatically.
**Fixes**
* Fixed one-shot avatar refinement timing out by making Gemini refinement non-fatal with a 35-second timeout.
* Fixed knowledge upload endpoints not accepting Bearer API key authentication.
* Fixed end-session race conditions with idempotent endpoint and atomic updates.
## Persona Changes
**Improvements**
* **Conversation context accuracy:** A new message history system tracks which text was actually spoken versus interrupted, and records tool call arguments and results. The persona now maintains accurate context after interruptions, leading to more coherent multi-turn conversations.
* **Audio passthrough stability:** Late-arriving audio in BYO TTS sessions no longer causes unintended interruptions. Audio is buffered and played back in order, improving reliability for Pipecat and other audio passthrough integrations.
**Fixes**
* Fixed stale video frames occasionally appearing after a response completes.
## SDK/API Changes
**Improvements**
* **Context injection:** New `addContext()` method lets you inject context into the conversation history without triggering a response ([JS SDK v4.11.0](https://github.com/anam-org/javascript-sdk/releases/tag/v4.11.0)).
* **Speech detection events:** `userSpeechStarted` and `userSpeechEnded` events fire at the VAD level for instant speech detection ([JS SDK v4.12.0](https://github.com/anam-org/javascript-sdk/releases/tag/v4.12.0)).
## π‘ Adaptive bitrate streaming
Anam now dynamically adjusts video quality based on network conditions. When bandwidth drops, the stream adapts in real time to maintain smooth, uninterrupted video rather than freezing or dropping frames. When conditions improve, quality scales back up automatically. This is a significant improvement for users on mobile networks, VPNs, or connections with variable bandwidth.
## π Zero Data Retention mode
Enterprise customers can now enable **Zero Data Retention** on any persona. When enabled, no session data β recordings, transcripts, or conversation logs β is stored after a session ends. This applies across the full pipeline including voice and LLM data.
Toggle it on from persona settings in the Lab, or set it via the API. [Learn more](https://anam.ai/docs/security/privacy).
***
## Lab Changes
**Improvements**
* **System tools:** Personas can now use built-in system tools. `change_language` switches speech recognition to a different language mid-conversation, and `skip_turn` pauses the persona from responding when the user needs a moment to think. Enable them from the Tools tab in Build.
* **Tool validation:** Auto-deduplication of tool names with clearer validation error messages.
* **Share link management:** Migrated share links to a 1-to-1 primary model with a simpler toggle interface.
**Fixes**
* Fixed reasoning model responses getting stuck in "thinking..." state.
* Fixed soft-deleted knowledge folders not restoring on document upload.
* Fixed LiveKit session type classification for snake\_case environment payloads.
## Persona Changes
**Improvements**
* **Agora AV1 support:** Agora integration now supports the AV1 video codec for better compression and quality at lower bitrates.
* **Multi-agent LiveKit:** Audio routing now works correctly in multi-agent LiveKit rooms with multiple Anam avatars.
**Fixes**
* Fixed tool enum type validation.
## π New integrations
Four new ways to use Anam avatars in your stack:
**Pipecat**\
The [`pipecat-anam`](https://pypi.org/project/pipecat-anam/) package brings Anam avatars to [Pipecat](https://github.com/pipecat-ai/pipecat), the open-source framework for voice and multimodal AI agents. `pip install pipecat-anam`, add `AnamVideoService` to your pipeline, and you're streaming. Use audio passthrough for full control over your own orchestration, or let Anam handle the pipeline end-to-end. [GitHub repo](https://github.com/anam-org/pipecat-anam).
**ElevenLabs server-side agents**\
Put a face on any agent you've built in ElevenLabs. Pass in your ElevenLabs agent ID and session token when starting a session, and Anam handles the rest, no changes to your existing ElevenLabs setup needed. [Cookbook](https://anam.ai/cookbook/elevenlabs-server-side-agents).
**VideoSDK**\
Anam is now officially supported on [VideoSDK](https://www.videosdk.live/), a WebRTC platform similar to LiveKit. Built on top of the Python SDK.
**Framer**\
The Anam Avatar plugin is now [on the Framer Marketplace](https://www.framer.com/marketplace/plugins/anam-avatar/). Drop an avatar into any Framer site without writing code.
## π Metaxy: sample-level versioning for ML pipelines
We wrote up a deep dive on [Metaxy](https://anam.ai/blog/metaxy), our open-source metadata versioning framework for multimodal data pipelines. It tracks partial data updates at the field level so teams only reprocess what actually changed. Works with orchestrators like Dagster, agnostic to compute (Ray, DuckDB, etc.). [GitHub](https://github.com/anam-org/metaxy).
***
## Lab Changes
**Improvements**
* **Build page redesign:** Everything lives in Build now. Avatars, Voices, LLMs, Tools, and Knowledge are tabs within a single page. Create custom avatars, clone voices, add LLMs, and upload knowledge files without leaving the page. Knowledge is a file drop on the Prompt tab: upload a document and it's automatically turned into a RAG tool.
* **Smart voice matching:** One-shot avatars now auto-select a voice matching the avatar's detected gender.
* **Mobile improvements:** Tables replaced with cards and lists. Bottom tab bar instead of hamburger menu. Long-press context menus on persona tiles. Touch-friendly tooltips.
* **Knowledge base improvements:** Non-blocking document deletion with pending state and rollback on error. PDF uploads restored. Stuck documents are auto-detected with retry from the UI.
**Fixes**
* Fixed typo in thinking duration display.
* Fixed sticky hover states on touch devices.
## Persona Changes
**Improvements**
* **Video stability:** New TWCC-based frame-drop pacer with GCC congestion control. Smoother video on constrained or variable-bandwidth connections.
* **Network connectivity:** TURN over TLS for ICE, improving session establishment behind corporate firewalls and VPNs.
**Fixes**
* Fixed ElevenLabs pronunciation issues with certain text patterns.
* Fixed text sanitization causing incorrect punctuation in TTS output.
* Fixed silent responses not being detected correctly.
## SDK/API Changes
**Improvements**
* **Tool call event handlers:** `onToolCallStarted`, `onToolCallCompleted`, and `onToolCallFailed` handlers for tracking tool execution on the client.
* **Documents accessed:** `ToolCallCompletedPayload` now includes a `documentsAccessed` field for Knowledge Base tool calls.
**Fixes**
* Fixed duplicate tool call completion events.
## π Anam Python SDK
Anam now has a [Python SDK](https://github.com/anam-org/python-sdk). It handles WebRTC streaming, audio/video frame delivery, and session management.
What's in the box:
* **Media handling** β The SDK manages WebRTC connections and signalling. Connect, and you get synchronized audio and video frames back.
* **Multiple integration modes** β Use the full pipeline (STT, LLM, TTS, Face) or bring your own TTS via audio passthrough.
* **Live transcriptions** β User speech and persona responses stream in as partial transcripts, useful for captions or logging conversations.
* **Async-first** β Built on Python's async/await. Process media frames with async iterators or hook into events with decorators.
People are already building with it — rendering ASCII avatars in the terminal, processing frames with OpenCV, piping audio to custom pipelines. Check the [GitHub repo](https://github.com/anam-org/python-sdk) to get started.
***
## Lab Changes
**Improvements**
* **Visual refresh:** Updated Lab UI with new brand styling, including new typography (Figtree), refreshed color tokens, and consistent component styles across all pages.
## Persona Changes
**Improvements**
* **ICE recovery grace period:** WebRTC sessions now survive brief network disconnections instead of terminating immediately. The engine detects ICE connection drops and holds the session open, allowing the client to reconnect without losing conversation state.
* **Language configuration:** You can now set a language code on your persona, ensuring the STT pipeline uses the correct language from session start.
* **Voice generation options:** Added configurable voice generation parameters for more control over TTS output.
* **ElevenLabs streaming:** Removed input buffering for ElevenLabs TTS, reducing time-to-first-audio for all sessions using ElevenLabs voices.
## π¬ Session recordings
By default, every session is now recorded and saved for 30 days. Watch back any conversation in the Lab (lab.anam.ai/sessions) to see exactly how users interact with your personas, including the full video stream and conversation flow.
Recordings and transcripts are also available via API. Use `GET /v1/sessions/{id}/transcript` to fetch the full conversation programmatically for analytics, QA, or archival. For privacy-sensitive applications, you can disable recording in your persona config.
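The two session-retrieval endpoints mentioned above can be addressed like this (URL construction only; `sess_123` is a placeholder ID):

```python
def transcript_url(session_id: str) -> str:
    """URL for GET /v1/sessions/{id}/transcript, as described above."""
    return f"https://api.anam.ai/v1/sessions/{session_id}/transcript"

def recording_url(session_id: str) -> str:
    """URL for GET /v1/sessions/{id}/recording, which returns a presigned URL."""
    return f"https://api.anam.ai/v1/sessions/{session_id}/recording"
```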
## π¨ Two-pass avatar refinement
One-shot avatar creation now refines images in two passes. Upload an image, and the system generates an initial avatar, then refines it for better likeness and expression. Available to all users.
***
## Lab Changes
**Improvements**
* Added `speechEnhancementLevel` (0-1) to `voiceDetectionOptions` for control over how aggressively background noise is filtered from user audio
* Support for ephemeral tool IDs, so you can configure tools dynamically per session
* Added delete account and organization buttons
**Fixes**
* Fixed terminology on tools tab
* Fixed RAG default parameters not being passed
* Fixed custom LLM default settings
## Persona Changes
**Improvements**
* Support for Gemini thinking/reasoning models
* The `speechEnhancementLevel` parameter now passes through via `voiceDetectionOptions`
* Engine optimizations for lower latency under load
**Fixes**
* Fixed GPT-5 tool calls returning errors
* Fixed audio frame padding that could cause playback issues
* Fixed repeated silence messages
* Fixed silence breaker not responding to typed messages
## π§ User Speech Enhancement
We've integrated [ai-coustics](https://ai-coustics.com/) as a preprocessing layer in our user audio pipeline. It enhances audio quality before it reaches speech detection, cleaning up background noise and improving signal clarity in real-world conditions. This reduces false transcriptions from ambient sounds and improves endpointing accuracy, especially in noisy environments like cafes, offices, or outdoor settings.
## ποΈ Configurable Persona Responsiveness
Control how quickly your persona responds with [voiceDetectionOptions](https://anam.ai/docs/personas/session/voice-detection) in the persona config:
* `endOfSpeechSensitivity` (0-1): How eager the persona is to jump in. 0 waits until it's confident you're done talking, 1 responds sooner.
* `silenceBeforeSkipTurnSeconds`: How long before the persona prompts a quiet user.
* `silenceBeforeSessionEndSeconds`: How long silence ends the session.
* `silenceBeforeAutoEndTurnSeconds`: How long a mid-sentence pause waits before the persona responds.
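Taken together, the knobs above might sit in a persona config like this. The values are illustrative only; tune them to your use case:

```python
# Illustrative values only -- pick numbers that fit your use case.
voice_detection_options = {
    "endOfSpeechSensitivity": 0.7,         # 0-1: higher responds sooner
    "silenceBeforeSkipTurnSeconds": 5,     # prompt a quiet user after 5 s
    "silenceBeforeSessionEndSeconds": 60,  # end the session after 60 s of silence
    "silenceBeforeAutoEndTurnSeconds": 2,  # wait 2 s on a mid-sentence pause
}

def validate(opts: dict) -> dict:
    if not 0 <= opts["endOfSpeechSensitivity"] <= 1:
        raise ValueError("endOfSpeechSensitivity must be between 0 and 1")
    return opts
```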
## π§ Reasoning Model Support
Added support for OpenAI reasoning models and custom Groq LLMs. Reasoning models can think through complex scenarios before responding, while Groq's high-throughput infrastructure makes these typically-slower models respond with conversational latencies suitable for real-time interactions. Add your reasoning model in the lab: [https://lab.anam.ai/llms](https://lab.anam.ai/llms).
## Persona Changes
**Fixes**
* Fixed Knowledge Base (RAG) tool calling with proper default query parameters
* Fixed panic crashes when sessions error during startup
## Lab Changes
**Fixes**
* Fixed `Powered by Anam` text visibility when watermark removal is enabled
* Updated API responses for GET/UPDATE persona endpoints
## SDK/API Changes
**Improvements**
* Introduced agent audio input streaming for BYO audio workflows, allowing you to integrate with arbitrary voice agents, e.g. ElevenLabs agents (see the [ElevenLabs server-side agents recipe](https://anam.ai/cookbook/elevenlabs-server-side-agents) for how to integrate).
* Added WebRTC reasoning event handlers for reasoning model support
## π Introducing Cara 3: our most expressive model yet
The culmination of over six months of research, **Cara 3** is now available. This new model delivers significantly more expressive avatars featuring realistic eye movement, more dynamic head motion, smoother transitions in and out of idling, and improved lip sync.
You can opt in to the new model in your persona config using `avatarModel: 'cara-3'` or by selecting it in the Lab UI. Note that all new custom avatars will use Cara 3 exclusively, while existing personas will continue to use the Cara 2 model by default unless explicitly updated.
## π‘οΈ SOC-2 Type II compliance
Anam has achieved SOC-2 Type II compliance. This milestone validates that our security, availability, and data protection controls have been independently audited and proven over time.
For customers building across learning, enablement, or live production use cases, this provides formal assurance regarding how we handle security, access, and reliability.\
[**Visit the Trust Center**](https://trust.anam.ai/)
## π Integrations
**Model Context Protocol (MCP) server**\
Manage your personas and avatars directly within Claude Desktop, Cursor, and other MCP-compatible clients. Use your favorite LLM-assisted tools to interact with the Anam API.
**Anam x ElevenLabs agents**\
Turn any ElevenLabs conversational AI agent into a visual avatar using Anam's audio passthrough.\
[Watch the demo](https://anam.ai/cookbook/elevenlabs-server-side-agents)
***
## Lab Changes
**Improvements**
* **UI overhaul:** A redesigned Homepage and Build page make persona creation more intuitive. You can now preview voices/avatars without starting a chat and create custom assets directly within the Build flow. Sidebar and Pricing pages have also been refreshed.
* **Performance:** Implemented Tanstack caching to significantly improve Lab responsiveness.
**Fixes**
* Fixed a bug where client tool events were not appearing in the Build page chat.
* Resolved an issue where tool calls and RAG were not passing parameters correctly.
## Persona Changes
**Improvements**
* **More voices:** Added \~100 new Cartesia voices (Sonic-3) and \~180 new ElevenLabs voices (Flash v2.5), covering languages and accents from all over the world.
* **New default LLM:** `kimi-k2-instruct-0905` is now available. This SOTA open-source model offers high intelligence and excellent conversational abilities. (Note: Standard `kimi-k2` remains recommended for heavy tool-use scenarios).
* **Configurable greetings:** Added `skip_greeting` parameter, allowing you to configure whether the persona initiates the conversation or waits for the user.
* **Latency reductions:**
* **STT optimization:** We are now self-hosting Deepgram for Speech-to-Text, resulting in a **\~30ms (p50)** and **\~170ms (p90)** latency improvement.
* **Frame buffering:** Optimized output frame buffer, shaving off an additional **\~40ms** of latency per response.
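For example, the new `skip_greeting` parameter might be set as below. Only the parameter name comes from the release note; the options object around it is an illustrative sketch, not the exact API shape.

```javascript
// Sketch: configuring whether the persona initiates the conversation.
// Only the skip_greeting parameter name comes from the release note;
// the surrounding options object is illustrative.
const sessionOptions = {
  skip_greeting: true, // persona waits for the user to speak first
};
// With skip_greeting set to false, the persona greets the user and
// initiates the conversation, as described above.
```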
**Fixes**
* Corrected header handling to ensure reliable data center failover.
* Fixed a visual artifact where Cara 3 video frames occasionally displayed random noise.
* Resolved a freeze-frame issue affecting \~1% of sessions ([Incident Report](https://status.anam.ai/incidents/01KC7A6Q9Q6H1JDZ83TP1EF1Z1)).
## SDK/API Changes
**Improvements**
* **API gateway guide:** Added documentation and an example repository for routing Anam SDK traffic through your own API Gateway server. [View on GitHub](https://github.com/anam-org/anam-gateway-example).
## LiveKit out of Beta and a new latency record
LiveKit integration is now generally available: drop Anam's expressive real-time avatars into any LiveKit Agents app so your AI can join LiveKit rooms as synchronised voice + video participants.\
It turns voice-only agents into face-and-voice experiences for calls, livestreams, and collaborative WebRTC spaces, with LiveKit handling infra and Anam handling the human layer. Docs
***
## Record-breaking latency: 330 ms reduction for all customers
Server-side optimisations cut average end-to-end latency by 330 ms for all customers. The gains come from cumulative engine optimisations across transcription, frame generation, and frame writing, plus upgraded Deepgram Flux endpointing for faster, best-in-class turn-taking without regressions in voice quality or TTS.
***
## Lab Changes
**Improvements**
* Overhauled the avatar video upload and management system
* Upgraded default Cartesia voices to Sonic 3
* Standardised voice model selection across the platform
**Fixes**
* Enhanced share-link management capabilities
* Corrected LiveKit persona type identification logic
***
## Persona Changes
**Improvements**
* Server-side optimisations to our frame buffering reduce response latency by \~250ms for all personas.
**Fixes**
* Changed timeout behavior: sessions no longer time out based on heartbeats, only when the websocket has been disconnected for 10 seconds or more.
* Fixed an intermittent issue where a persona stopped responding.
* Set pix\_fmt for video output, moving from the yuvj420p (JPEG) to the yuv420p color space to avoid incorrect encoding/output.
* Added a timeout to our silence-breaking logic to prevent hangs.
## Introducing Anam Agents
Build and deploy AI agents in Anam that can engage alongside you.
With Anam Agents, your Personas can now interact with your applications, access your knowledge, and trigger workflows directly through natural conversation. This marks Anam's evolution from conversational Personas to agentic Personas that think, decide, and execute.
## Knowledge Tools
Give your Personas access to your company's knowledge. Upload docs to the Lab, and they'll use semantic retrieval to integrate the right info.\
[Docs for Knowledge Base](https://anam.ai/docs/personas/knowledge/overview)
## Client Tools
Personas can control your interface in real time: open checkout, display modals, navigate UI, and update state by voice.\
[Docs for Client Tools](https://anam.ai/docs/personas/tools/client-tools)
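Conceptually, a client tool is a named UI action the persona's LLM can invoke with parameters. The sketch below is purely illustrative and does not reflect the actual registration API; see the Client Tools docs for the real one.

```javascript
// Purely illustrative sketch of what a client tool conceptually looks
// like: a named UI action the persona's LLM can trigger with
// parameters. The real registration API is in the Client Tools docs.
const openCheckoutTool = {
  name: 'open_checkout',                                 // illustrative
  description: 'Open the checkout view for a given cart',
  handler: ({ cartId }) => {
    // A real handler would update UI state (e.g. open a modal).
    return `checkout opened for cart ${cartId}`;
  },
};

const result = openCheckoutTool.handler({ cartId: 'cart_123' });
```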
## Webhook Tools
Connect your Personas to external APIs and services. Create tickets, fetch status, update records, or fetch live data.\
[Docs for Webhook Tools](https://anam.ai/docs/personas/tools/webhook-tools)
## Intelligent Tool Selection
Each Persona's LLM chooses tools based on intent, not scripts.
You can create/manage tools on the Tools page in the Lab and attach them to any Persona from Build.
**Anam Agents are available in beta for all users:** [https://lab.anam.ai/login](https://lab.anam.ai/login)
***
## Lab Changes
**Improvements**
* Cartesia Sonic-3 voices: the most expressive TTS model.
* Voice modal expanded: 50+ languages, voice samples, Cartesia TTS now default.
* Session reports work for custom LLMs.
**Fixes**
* Prevented auto-logout when switching contexts.
* Fixed race conditions in cookie handling.
* Resolved legacy session token issues.
* Removed problematic voices.
* Corrected player/stream aspect ratios on mobile.
## Persona Changes
**Improvements**
* Deepgram Flux support for turn-taking ([Deepgram Flux Details](https://deepgram.com/learn/introducing-flux-conversational-speech-recognition))
* Server-side optimization: reduced GIL contention and latency, faster connections.
**Fixes**
* Bug-fix for dangling LiveKit connections.
## Research
**Improvements**
* Our first open-source library!\
Metaxy, a metadata layer for ML/data pipelines:\
[Read more](https://anam-org.github.io/metaxy/main/#3-run-user-defined-computation-over-the-metadata-increment) | [GitHub](https://github.com/anam-org/metaxy)
## Anam is now HIPAA compliant
A big milestone for our customers and partners. Anam now meets HIPAA requirements for handling protected health information.
[**Learn more at the Anam Trust Center**](https://trust.anam.ai/)
## Lab Changes
**Improvements**
* Enhanced voice selection: search by use case/conversational style, 50+ languages.
* Product tour update.
* Streamlined One-Shot avatar creation.
* Auto-generated Persona names based on selected avatar.
* Session start now 1.1s faster.
**Fixes**
* Share links: fixed extra concurrency slot usage.
## Persona Changes
**Improvements**
* Improved TTS pronunciation via smarter text chunking.
* Traceability and monitoring for session IDs.
* Increased internal audio sampling rate to 24kHz.
* Increased max websocket size to 16 MB.
**Fixes**
* Concurrency calculation now only considers sessions from last 2 hours.
* Less freezing for slower LLMs.
## Session Analytics
Once a conversation ends, how do you review what happened? To help you understand and improve your Persona's performance, we're launching Session Analytics in the Lab. Now you can access a detailed report for every conversation, complete with a full transcript, performance metrics, and AI-powered analysis.
* **Full Conversation Transcripts.** Review every turn of a conversation with a complete, time-stamped transcript. See what the user said and how your Persona responded, making it easy to diagnose issues and identify successful interaction patterns.
* **Detailed Analytics & Timeline.** Alongside the transcript, a new Analytics tab provides key metrics grouped into "Transcript Metrics" (word count, turns) and "Processing Metrics" (e.g., LLM latency). A visual timeline charts the entire conversation, showing who spoke when and highlighting any technical warnings.
* **AI-Powered Insights.** For a deeper analysis, you can generate an AI-powered summary and review key insights. This feature, currently powered by gpt-5-mini, evaluates the conversation for highlights, adherence to the system prompt, and user interruption rates.
You can find your session history on the Sessions page in the Lab. Click on any past session to explore the new analytics report. This is available today for all session types, except for LiveKit sessions. For privacy-sensitive applications, session logging can be disabled via the SDK.
## Lab Changes
**Improvements**
* Improved Voice Discovery: The Voices page is now more searchable, lets you preview voices with a single click, and shows new details like gender, TTS model, and language.
**Fixes**
* Fixed share-link session bug: share-link sessions no longer take an extra concurrency slot.
## Persona Changes
**Improvements**
* Small improvement to connection time: Tweaks to how we perform WebRTC signalling allow for slightly faster connections (\~900ms faster for p95 connection time).
* Improvement to output audio quality for poor connections: Enabled Opus in-band FEC to improve audio quality under packet loss.
* Small reduction in network latency: Optimisations have been made to our outbound media streams to reduce A/V jitter (and hence jitter buffer delay). Expected latency improvement is modest (\<50ms).
**Fixes**
* Fix for LiveKit sessions with slow TTS audio: Stabilizes LiveKit streaming by pacing output and duplicating frames during slowdowns to prevent underflow.
## Intelligent LLM Routing for Faster Responses
The performance of LLM endpoints can be highly variable, with time-to-first-token latencies sometimes fluctuating by as much as 500ms from one day to the next depending on regional load. To solve this and ensure your personas respond as quickly and reliably as possible, we've rolled out a new intelligent routing system for LLM requests. This is active for both our turnkey customers and for customers using their own server-side **Custom LLMs** if they deploy multiple endpoints.
This new system constantly monitors the health and performance of all configured LLM endpoints by sending lightweight probes at regular intervals. Using a time-aware moving average, it builds a real-time picture of network latency and processing speed for each endpoint. When a request is made, the system uses this data to calculate the optimal route, automatically shedding load from any overloaded or slow endpoints within a region.
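As a rough illustration of the idea described above (not Anam's actual implementation), latency-aware routing can be sketched as an exponential moving average of probe latencies per endpoint plus a pick-the-fastest rule:

```javascript
// Illustrative sketch of latency-aware routing: keep an exponential
// moving average (EWMA) of probe latencies per endpoint and route each
// request to the currently fastest one. Not Anam's implementation.
const ALPHA = 0.3; // weight given to the newest probe

function updateLatency(endpoint, probeMs) {
  endpoint.ewmaMs = endpoint.ewmaMs === undefined
    ? probeMs
    : ALPHA * probeMs + (1 - ALPHA) * endpoint.ewmaMs;
}

function pickEndpoint(endpoints) {
  // Shed load from slow or overloaded endpoints by always preferring
  // the lowest moving-average latency.
  return endpoints.reduce((best, e) => (e.ewmaMs < best.ewmaMs ? e : best));
}

const endpoints = [
  { name: 'eu-west', ewmaMs: undefined },
  { name: 'us-east', ewmaMs: undefined },
];
updateLatency(endpoints[0], 120);
updateLatency(endpoints[1], 80);
updateLatency(endpoints[1], 300); // a slow probe raises the average

console.log(pickEndpoint(endpoints).name); // 'eu-west'
```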
## Lab Changes
**Improvements**
* Generate one-shot avatars from text prompts: You can now generate one-shot avatars from text prompts within the lab, powered by Gemini's new Nano Banana model. The one-shot creation flow has been redesigned for speed and ease-of-use, and is now available to all plans. Image upload and webcam avatars remain exclusive to Pro and Enterprise.
* Improved management of published embed widgets: Published embed widgets can now be configured and monitored from the lab at [https://lab.anam.ai/personas/published](https://lab.anam.ai/personas/published).
## Persona Changes
**Improvements**
* Automatic failover to backup data centres: To ensure maximum uptime and reliability, persona traffic now fails over automatically to a backup data centre.
**Fixes**
* Prevent session crash on long user speech: Previously, unbroken user speech exceeding 30 seconds would trigger a transcription error and crash the session. We now automatically truncate continuous speech to 30 seconds, preventing sessions from failing in these rare cases.
* Allow configurable session lengths of up to 2 hours for Enterprise plans: We had a bug where sessions had a max timeout of 30 mins instead of 2 hours for enterprise plans. This has now been fixed.
* Resolved slow connection times caused by incorrect database region selection: An undocumented issue with our database provider led to incorrect region selection for our databases. Simply refreshing our credentials resolved the problem, resulting in a \~1s improvement in median connection times and \~3s faster p95 times. While our provider works on a permanent fix, we're actively monitoring for any recurrence.
## Embed Widget
Embed personas directly into your website with our new widget. From the **lab's build page**, click Publish, then generate your unique HTML snippet. The snippet works in most common website builders, e.g. WordPress.org or Squarespace.
For added security, we recommend adding a whitelist with your domain URL. This locks the persona down so it only works on your website. You can also cap the number of sessions or give the widget an expiration period.
## Lab Changes
**Improvements**
* One-shot avatars available via API: Professional and Enterprise accounts can now create one-shot avatars via the API. Docs **here**.
* Spend caps: It's now possible to set a spend cap on your account. Available in **profile settings**.
## Persona Changes
**Fixes**
* Prevent Cartesia from timing out when using slow custom LLMs: We've added a safeguard to prevent Cartesia contexts from unexpectedly closing during pauses in text streaming. With slower LLMs, or if there's a break or slow-down in the text being sent, your connection will now stay alive, ensuring smoother, uninterrupted interactions.
For full legal and policy information, see:
* [Trust Center](https://trust.anam.ai/)
* [AI Governance](https://anam.ai/ai-governance)
* [Terms of Service](https://anam.ai/terms-of-service)
* [DPA](https://anam.ai/data-processing)
* [Acceptable Use Policy](https://anam.ai/acceptable-use-policy)
* [Privacy Policy](https://anam.ai/privacy-policy)
# Embed Anam on your website
Source: https://anam.ai/docs/embed/overview
Add an Anam avatar to any website using the Widget, Player, or SDK. Compare options and follow platform-specific setup guides.
There are three ways to add an Anam avatar to your site: the Widget, the Player, or the SDK.
## Which option should I use?
**Widget**: A pre-built Web Component. The avatar loads directly on your page with its own UI. You can listen to events, handle tool calls, and control it from JavaScript.
**Player**: A single iframe. The avatar runs inside a sandboxed frame, completely isolated from your page. The most compatible option across website builders, but you can't interact with it from your own code.
**SDK**: A JavaScript library for building your own interface from scratch. Use this when the pre-built UIs don't fit your design and you want full control over the experience.
## Widget
The Widget loads as a [Web Component](https://developer.mozilla.org/en-US/docs/Web/API/Web_Components) on your page. It handles its own authentication, renders inside a Shadow DOM (so it won't clash with your styles), and dispatches DOM events you can listen to.
* Floating overlay or inline layout modes
* DOM events for analytics, error handling, and custom behavior
* Supports tool calls and text input
* Configure appearance from Lab or HTML attributes
* Your site must allow external JavaScript
* Your domain must be added to the allowed list in Lab's Widget tab
* Microphone access required for voice
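Listening to widget events might look like the sketch below. The element tag and event names here are illustrative, not the widget's actual API; see the Widget documentation for the real ones. A plain `EventTarget` stand-in keeps the sketch self-contained outside the browser.

```javascript
// Illustrative only: the real element tag and event names are in the
// Widget documentation. A plain EventTarget stand-in lets this sketch
// run outside the browser; on a real page you would instead grab the
// widget element with document.querySelector(...).
const widget = new EventTarget();

const log = [];
widget.addEventListener('session-started', () => log.push('started'));
widget.addEventListener('session-ended', () => log.push('ended'));

// Simulate the widget's lifecycle events.
widget.dispatchEvent(new Event('session-started'));
widget.dispatchEvent(new Event('session-ended'));
```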
See the [Widget documentation](/embed/widget/overview) for configuration, events, and framework-specific setup.
## Player
The Player runs inside an iframe. Your page and the avatar are fully isolated from each other: different JavaScript contexts, different stylesheets, no interaction between the two. This makes it the most broadly compatible option, especially on platforms that restrict custom JavaScript.
* Full Anam interface in a sandboxed frame
* Isolated from your site's CSS and JavaScript
* Works on platforms that block custom scripts but allow iframes
* Your site must allow iframe embedding
* HTTPS required
* Microphone access required for voice
## SDK
The SDK gives you a JavaScript client and raw media streams. There's no pre-built UI; you build your own. Use this when you need complete control over how the avatar looks and behaves on your page.
* Full control over layout and appearance
* Direct access to video/audio streams
* Programmatic session management
* Your site must allow external JavaScript
* HTTPS required
* Microphone access required for voice
See the [SDK Reference](/javascript-sdk/reference/basic-usage) for usage details.
## Platform compatibility
Not every website builder supports every embed type. This depends on whether the platform lets you add custom JavaScript, iframes, or both.
| Platform      | Widget | Player | SDK | Notes                                            |
| ------------- | ------ | ------ | --- | ------------------------------------------------ |
| WordPress.org | ✅     | ✅     | ✅  |                                                  |
| Webflow       | ✅     | ✅     | ✅  | May need enterprise plan for script whitelisting |
| Squarespace   | ✅     | ✅     | ✅  | Requires paid plan                               |
| Jimdo Creator | ✅     | ✅     | ✅  | Requires paid plan                               |
| Shopify       | ✅     | ✅     | ❌  |                                                  |
| WordPress.com | ✅     | ✅     | ❌  | Requires paid plan                               |
| GoDaddy       | ✅     | ✅     | ❌  |                                                  |
| Wix           | ✅     | ❌     | ❌  | Requires paid plan                               |
### WordPress.com
Requires Business plan (\$25/month) or higher. Add Widget or Player code via Custom HTML block. SDK is not supported.
### WordPress.org (self-hosted)
All three embed options work. Add code via the Gutenberg editor (Custom HTML block), your theme's `footer.php`, or a plugin like "Insert Headers and Footers".
### Shopify
**Player:** Go to Online Store > Themes > Customize, add a "Custom Liquid" section, paste the iframe code.
**Widget/SDK:** Go to Online Store > Themes > Actions > Edit code, open `theme.liquid`, add the code before `