# create avatar
Source: https://anam.ai/docs/api-reference/avatars/create-avatar
https://api.anam.ai/swagger.json post /v1/avatars

Create a new one-shot avatar from an image file or image URL. You can use either multipart/form-data with an image file, or JSON with an image URL.

# delete avatar
Source: https://anam.ai/docs/api-reference/avatars/delete-avatar
https://api.anam.ai/swagger.json delete /v1/avatars/{id}

Delete an avatar by ID

# get avatar
Source: https://anam.ai/docs/api-reference/avatars/get-avatar
https://api.anam.ai/swagger.json get /v1/avatars/{id}

Returns an avatar by ID

# list avatars
Source: https://anam.ai/docs/api-reference/avatars/list-avatars
https://api.anam.ai/swagger.json get /v1/avatars

Returns a list of all avatars

# update avatar
Source: https://anam.ai/docs/api-reference/avatars/update-avatar
https://api.anam.ai/swagger.json put /v1/avatars/{id}

Update an avatar by ID (only display name can be updated)

# create knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/create-knowledge-group
https://api.anam.ai/swagger.json post /v1/knowledge/groups

Create a new knowledge group

# delete knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/delete-knowledge-document
https://api.anam.ai/swagger.json delete /v1/knowledge/documents/{id}

Delete a document from a RAG group

# delete knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/delete-knowledge-group
https://api.anam.ai/swagger.json delete /v1/knowledge/groups/{id}

Delete a RAG group

# get knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-document
https://api.anam.ai/swagger.json get /v1/knowledge/documents/{id}

Get a single document by ID

# get knowledge document download
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-document-download
https://api.anam.ai/swagger.json get /v1/knowledge/documents/{id}/download

Get a presigned download URL for a knowledge document

# get knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-group
https://api.anam.ai/swagger.json get /v1/knowledge/groups/{id}

Get a single RAG group by ID

# list knowledge group documents
Source: https://anam.ai/docs/api-reference/knowledge/list-knowledge-group-documents
https://api.anam.ai/swagger.json get /v1/knowledge/groups/{id}/documents

Get all documents in a RAG group

# list knowledge groups
Source: https://anam.ai/docs/api-reference/knowledge/list-knowledge-groups
https://api.anam.ai/swagger.json get /v1/knowledge/groups

Returns a list of all knowledge groups for the organization

# search knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/search-knowledge-group
https://api.anam.ai/swagger.json post /v1/knowledge/groups/{id}/search

Search for similar content in a RAG group using vector similarity

# update knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/update-knowledge-document
https://api.anam.ai/swagger.json put /v1/knowledge/documents/{id}

Update a document (rename)

# update knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/update-knowledge-group
https://api.anam.ai/swagger.json put /v1/knowledge/groups/{id}

Update a RAG group

# upload knowledge group document
Source: https://anam.ai/docs/api-reference/knowledge/upload-knowledge-group-document
https://api.anam.ai/swagger.json post /v1/knowledge/groups/{id}/documents

Upload a document to a RAG group (supports PDF, TXT, MD, DOCX, CSV up to 50MB). Authentication can be via API key (Bearer token) OR upload token (X-Upload-Token header).
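The upload endpoint above accepts two auth schemes. Here is a minimal sketch of choosing between them, assuming you hold either a server-side API key or a short-lived upload token; the multipart body itself is defined by the swagger spec and not shown here:

```python
from typing import Optional


def knowledge_upload_headers(
    api_key: Optional[str] = None,
    upload_token: Optional[str] = None,
) -> dict:
    """Build auth headers for POST /v1/knowledge/groups/{id}/documents.

    Per the endpoint description, the call authenticates with either an
    API key (Bearer token) or an upload token (X-Upload-Token header).
    """
    if api_key:
        return {"Authorization": f"Bearer {api_key}"}
    if upload_token:
        return {"X-Upload-Token": upload_token}
    raise ValueError("Provide an API key or an upload token")
```

Prefer the upload-token path when the upload originates from a context that should never see your API key.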
# create llm
Source: https://anam.ai/docs/api-reference/llms/create-llm
https://api.anam.ai/swagger.json post /v1/llms

Create a new LLM configuration

# delete llm
Source: https://anam.ai/docs/api-reference/llms/delete-llm
https://api.anam.ai/swagger.json delete /v1/llms/{id}

Delete an LLM configuration

# get llm
Source: https://anam.ai/docs/api-reference/llms/get-llm
https://api.anam.ai/swagger.json get /v1/llms/{id}

Get a specific LLM by ID

# list llms
Source: https://anam.ai/docs/api-reference/llms/list-llms
https://api.anam.ai/swagger.json get /v1/llms

Returns a list of all LLMs available to the organization

# update llm
Source: https://anam.ai/docs/api-reference/llms/update-llm
https://api.anam.ai/swagger.json put /v1/llms/{id}

Update an LLM configuration

# Introduction
Source: https://anam.ai/docs/api-reference/overview

Quick start for calling the Anam REST API directly.

All endpoints are served under:

```
https://api.anam.ai/v1
```

## Get an API key

Create a key from the [API keys page](https://lab.anam.ai/api-keys) in the Lab. See [Get your API key](/javascript-sdk/api-key) for the full walkthrough.

Keep the key on your server. Never ship it to a browser or mobile app.

## Authenticate requests

Every request takes a bearer token in the `Authorization` header:

```bash theme={"system"}
curl https://api.anam.ai/v1/personas \
  -H "Authorization: Bearer $ANAM_API_KEY"
```

## Connecting clients

Clients don't use your API key. Your backend creates a short-lived session token with [`POST /v1/auth/session-token`](/api-reference/create-session-token) and hands it to the client, which uses it to open a WebRTC stream. See [Authentication](/javascript-sdk/authentication) for a worked server example.

## Conventions

* Request and response bodies are JSON. Field names are `camelCase`.
* List endpoints accept `page` and `perPage` query parameters and return `{ data: [...], meta: { total, currentPage, perPage, lastPage, prev, next } }`.
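The pagination convention above can be wrapped in a small helper. A sketch, with the HTTP call injected so it works with any client; the `{ data, meta }` shape follows the convention stated above, and the `currentPage`/`lastPage` stopping rule is an assumption based on those field names:

```python
from typing import Callable, Iterator


def iter_all(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item from a paginated Anam list endpoint.

    `fetch_page(page)` should GET e.g. /v1/personas?page=N&perPage=...
    and return the parsed JSON body:
    {"data": [...], "meta": {"currentPage": N, "lastPage": M, ...}}.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["data"]          # emit this page's items
        meta = body["meta"]
        if meta["currentPage"] >= meta["lastPage"]:
            return                        # no more pages
        page += 1
```

Injecting `fetch_page` keeps the helper independent of whichever HTTP library your server already uses.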
# create persona
Source: https://anam.ai/docs/api-reference/personas/create-persona
https://api.anam.ai/swagger.json post /v1/personas

Create a new persona

# delete persona
Source: https://anam.ai/docs/api-reference/personas/delete-persona
https://api.anam.ai/swagger.json delete /v1/personas/{id}

Delete a persona by ID

# get persona
Source: https://anam.ai/docs/api-reference/personas/get-persona
https://api.anam.ai/swagger.json get /v1/personas/{id}

Returns a persona by ID

# list personas
Source: https://anam.ai/docs/api-reference/personas/list-personas
https://api.anam.ai/swagger.json get /v1/personas

Returns a list of all personas

# update persona
Source: https://anam.ai/docs/api-reference/personas/update-persona
https://api.anam.ai/swagger.json put /v1/personas/{id}

Update a persona by ID

# create session token
Source: https://anam.ai/docs/api-reference/sessions/create-session-token
https://api.anam.ai/swagger.json post /v1/auth/session-token

Create a new session token used to initialise Anam client-side SDKs

# get session
Source: https://anam.ai/docs/api-reference/sessions/get-session
https://api.anam.ai/swagger.json get /v1/sessions/{id}

Returns a session by ID

# get session recording
Source: https://anam.ai/docs/api-reference/sessions/get-session-recording
https://api.anam.ai/swagger.json get /v1/sessions/{id}/recording

Returns a presigned URL to download the session recording

# get session transcript
Source: https://anam.ai/docs/api-reference/sessions/get-session-transcript
https://api.anam.ai/swagger.json get /v1/sessions/{id}/transcript

Returns the conversation transcript for a session

# list sessions
Source: https://anam.ai/docs/api-reference/sessions/list-sessions
https://api.anam.ai/swagger.json get /v1/sessions

Returns a list of all sessions for the organization

# create share link
Source: https://anam.ai/docs/api-reference/share-links/create-share-link
https://api.anam.ai/swagger.json post /v1/share-links

Create a new share link

# delete share link
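Your backend mints the session token described above before handing it to a client. Here is a hedged sketch of building that request with the standard library; the `personaConfig` body field is illustrative only (the real request schema lives in the swagger spec):

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.anam.ai/v1"


def build_session_token_request(
    api_key: str,
    persona_config: Optional[dict] = None,
) -> urllib.request.Request:
    """Build the POST /v1/auth/session-token request a backend sends.

    The API key stays server-side; only the short-lived token returned
    by this endpoint is handed to the browser or mobile client.
    """
    payload = {"personaConfig": persona_config} if persona_config else {}
    return urllib.request.Request(
        f"{API_BASE}/auth/session-token",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # never shipped to clients
            "Content-Type": "application/json",
        },
    )
```

Send the built request with `urllib.request.urlopen` (or swap in your preferred HTTP client) and return the resulting token to the client that will open the WebRTC stream.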
Source: https://anam.ai/docs/api-reference/share-links/delete-share-link
https://api.anam.ai/swagger.json delete /v1/share-links/{id}

Delete a share link by ID

# get share link
Source: https://anam.ai/docs/api-reference/share-links/get-share-link
https://api.anam.ai/swagger.json get /v1/share-links/{id}

Returns a share link by ID

# list share links
Source: https://anam.ai/docs/api-reference/share-links/list-share-links
https://api.anam.ai/swagger.json get /v1/share-links

Returns a list of all share links for the organization

# update share link
Source: https://anam.ai/docs/api-reference/share-links/update-share-link
https://api.anam.ai/swagger.json put /v1/share-links/{id}

Update a share link by ID

# create tool
Source: https://anam.ai/docs/api-reference/tools/create-tool
https://api.anam.ai/swagger.json post /v1/tools

Create a new tool for function calling in persona sessions

# delete tool
Source: https://anam.ai/docs/api-reference/tools/delete-tool
https://api.anam.ai/swagger.json delete /v1/tools/{id}

Delete a tool. The tool will be soft-deleted and no longer available.
# get tool
Source: https://anam.ai/docs/api-reference/tools/get-tool
https://api.anam.ai/swagger.json get /v1/tools/{id}

Get a tool by ID

# list tools
Source: https://anam.ai/docs/api-reference/tools/list-tools
https://api.anam.ai/swagger.json get /v1/tools

Returns a list of all tools for the organization

# update tool
Source: https://anam.ai/docs/api-reference/tools/update-tool
https://api.anam.ai/swagger.json put /v1/tools/{id}

Update an existing tool

# create voice
Source: https://anam.ai/docs/api-reference/voices/create-voice
https://api.anam.ai/swagger.json post /v1/voices

Create a new voice by cloning from an audio file

# delete voice
Source: https://anam.ai/docs/api-reference/voices/delete-voice
https://api.anam.ai/swagger.json delete /v1/voices/{id}

Delete a voice by ID

# get voice
Source: https://anam.ai/docs/api-reference/voices/get-voice
https://api.anam.ai/swagger.json get /v1/voices/{id}

Returns a voice by ID

# list voices
Source: https://anam.ai/docs/api-reference/voices/list-voices
https://api.anam.ai/swagger.json get /v1/voices

Returns a list of all voices

# update voice
Source: https://anam.ai/docs/api-reference/voices/update-voice
https://api.anam.ai/swagger.json put /v1/voices/{id}

Update a voice by ID (display name and provider model ID can be updated)

# Changelog
Source: https://anam.ai/docs/changelog

New features, improvements, and fixes

## ⚑ More predictable session openings

This release gives builders more control over how sessions begin, especially when a tool-driven turn needs to run cleanly without being interrupted partway through. That makes longer or multi-step tool flows feel more predictable for both builders and end users.

On the media side, you can now pin a session to start in high video quality using `sessionOptions.videoQuality`, which helps sessions reach their intended bitrate faster.
We also tightened one-shot avatar refinement so flat or near-solid backgrounds are preserved more reliably in both the Lab and the `/v1` avatar creation flow.

***

## Lab Changes

**Improvements**

* **Better default model:** New personas and built-in agent templates now default to GPT OSS 120B instead of GPT OSS 20B, improving reasoning quality and tool use out of the box.

**Fixes**

* **Cleaner avatar refinement:** Fixed a Gemini refinement issue that could replace plain or near-solid avatar backgrounds with invented scenery, textures, or objects during one-shot avatar creation.

## Persona Changes

**Improvements**

* **Protected tool turns:** Tool-driven turns can now optionally suppress interruptions while your app is still handling the action, making longer or multi-step tool flows more predictable.

**Fixes**

* **Protected-turn cleanup:** Interrupt protection is now released cleanly when a greeting or tool turn finishes without spoken output, reducing the chance of sessions getting stuck in a protected state.

## SDK/API Changes

**Improvements**

* **Initial video quality control:** `sessionOptions.videoQuality` now accepts `high` or `auto`, letting you pin a session to start at the maximum video bitrate instead of ramping up from the default profile.

**Fixes**

* **Avatar API refinement backgrounds:** The same background-preservation fix now applies to the `/v1` avatar creation flow, so refined API-created avatars are less likely to pick up hallucinated scenery.

## πŸ“š The Anam docs have been overhauled

We redesigned the docs to make it much easier to find the right starting point and drill into the part of the platform you care about. Navigation is now organized around Overview, Embed, JavaScript SDK, Python SDK, Integrations, API Reference, and Changelog, with a rewritten overview page and clearer Learn / Embed / Build entry points.
This overhaul also adds dedicated Python SDK and LiveKit documentation, plus more focused guides for avatars, voices, LLMs, tools, session options, and network configuration.

***

## Docs Changes

**Improvements**

* **New navigation:** The docs now use clearer top-level tabs and reorganized sections so it is faster to jump between concepts, embedding, SDKs, integrations, and API reference.
* **New SDK and integration guides:** Added dedicated Python SDK documentation and a full LiveKit integration section, including overview, quickstart, and configuration guides.
* **Focused concept pages:** Split key setup topics into dedicated pages for available LLMs, creating custom avatars, session controls, voice configuration, and network requirements.

**Fixes**

* **Docs redirects:** Added redirects for renamed and legacy docs URLs so older links and indexed API-reference pages are less likely to land on 404s.
* **Navigation polish:** Improved overview labeling, changelog labeling, and navbar behavior across the docs experience.

## Lab Changes

**Improvements**

* **Sessions page:** Tool calls now appear across session Analytics, Overview, Transcript, and export views, including status, arguments, results, errors, and execution time.

## Persona Changes

**Improvements**

* **Client tool round-trips:** Personas can now continue once your application returns a client tool result, making client-side actions easier to chain into a conversation.
* **Webhook tracing:** Webhook tool requests now include session and correlation IDs, making it easier to trace tool calls across your own backend systems.

**Fixes**

* **Audio preprocessing resilience:** Sessions now fail open if speech-enhancement preprocessing is unavailable, instead of ending unexpectedly.
* **Session startup reliability:** Improved startup and media-timeout handling so transient processing issues are less likely to interrupt an active turn.
## SDK/API Changes

**Improvements**

* **Client tool results:** The JavaScript SDK now sends client tool results and errors back to the engine over the data channel, with session-scoped safeguards.
* **Avatar creation API:** `POST /v1/avatars` now accepts an optional `avatarModel` field during avatar creation.

## πŸ› οΈ Tool setup got much easier in the Lab

We redesigned the tool editor so webhook tools can be configured with form-based builders for headers, query params, and body params instead of raw JSON. That makes it much easier to set up tools correctly, especially for non-technical builders or teams collaborating across product and engineering.

This release also includes a few practical fixes around upload limits, session behavior, and API error handling so the platform behaves more clearly when something goes wrong.

***

## Lab Changes

**Improvements**

* **Tool editor:** Rebuilt webhook tool configuration with form-based builders for headers, query params, and body params, so you no longer need to edit raw JSON for common setups.

**Fixes**

* **Connection errors:** Improved LLM URL normalization and connection error messages when custom model endpoints are misconfigured.
* **Avatar uploads:** Reduced the avatar image upload limit to match the real platform file limit and avoid failed uploads.
* **Session cleanup:** Fixed a bug where active sessions could keep running after the player unmounted during tab switches.

## SDK/API Changes

**Improvements**

* **Capacity signaling:** When session capacity is exhausted, the API now returns a clearer `429` response instead of a generic failure.

**Fixes**

* **Knowledge auth:** Fixed knowledge-upload auth and header handling for API callers.

## 🎯 Client-side context injection

You can now inject context into a conversation without triggering a persona response.
Call `addContext()` in the JavaScript SDK to silently append information β€” like CRM data, page navigation events, or real-time application state β€” to the conversation history. The persona won't respond immediately, but will have that context available the next time the user speaks.

This is useful for building context-aware agents that adapt to what the user is doing in your application without interrupting the conversation flow.

## πŸŽ™οΈ User speech detection events

The SDK now emits `userSpeechStarted` and `userSpeechEnded` events the moment voice activity is detected, before any transcription is available. Use these to build responsive "listening" indicators and other UI feedback that reacts instantly when the user begins or stops speaking.

***

## Lab Changes

**Improvements**

* **Voice cloning for all paid plans:** Custom voice cloning is now available to Explorer and Growth plans, previously limited to Professional and Enterprise.
* **Share and embed redesign:** Share links and embed widgets have been consolidated into a simpler 1-to-1 model with a cleaner management interface.
* **Persona tools via API:** The PUT persona endpoint now accepts a `tool` field, allowing you to attach tools to personas programmatically.

**Fixes**

* Fixed one-shot avatar refinement timing out by making Gemini refinement non-fatal with a 35-second timeout.
* Fixed knowledge upload endpoints not accepting Bearer API key authentication.
* Fixed end-session race conditions with an idempotent endpoint and atomic updates.

## Persona Changes

**Improvements**

* **Conversation context accuracy:** A new message history system tracks which text was actually spoken versus interrupted, and records tool call arguments and results. The persona now maintains accurate context after interruptions, leading to more coherent multi-turn conversations.
* **Audio passthrough stability:** Late-arriving audio in BYO TTS sessions no longer causes unintended interruptions. Audio is buffered and played back in order, improving reliability for Pipecat and other audio passthrough integrations.

**Fixes**

* Fixed stale video frames occasionally appearing after a response completes.

## SDK/API Changes

**Improvements**

* **Context injection:** New `addContext()` method lets you inject context into the conversation history without triggering a response ([JS SDK v4.11.0](https://github.com/anam-org/javascript-sdk/releases/tag/v4.11.0)).
* **Speech detection events:** `userSpeechStarted` and `userSpeechEnded` events fire at the VAD level for instant speech detection ([JS SDK v4.12.0](https://github.com/anam-org/javascript-sdk/releases/tag/v4.12.0)).

## πŸ“‘ Adaptive bitrate streaming

Anam now dynamically adjusts video quality based on network conditions. When bandwidth drops, the stream adapts in real time to maintain smooth, uninterrupted video rather than freezing or dropping frames. When conditions improve, quality scales back up automatically. This is a significant improvement for users on mobile networks, VPNs, or connections with variable bandwidth.

## πŸ”’ Zero Data Retention mode

Enterprise customers can now enable **Zero Data Retention** on any persona. When enabled, no session data β€” recordings, transcripts, or conversation logs β€” is stored after a session ends. This applies across the full pipeline including voice and LLM data.

Toggle it on from persona settings in the Lab, or set it via the API. [Learn more](https://anam.ai/docs/security/privacy).

***

## Lab Changes

**Improvements**

* **System tools:** Personas can now use built-in system tools. `change_language` switches speech recognition to a different language mid-conversation, and `skip_turn` pauses the persona from responding when the user needs a moment to think. Enable them from the Tools tab in Build.
* **Tool validation:** Auto-deduplication of tool names with clearer validation error messages.
* **Share link management:** Migrated share links to a 1-to-1 primary model with a simpler toggle interface.

**Fixes**

* Fixed reasoning model responses getting stuck in a "thinking..." state.
* Fixed soft-deleted knowledge folders not restoring on document upload.
* Fixed LiveKit session type classification for snake_case environment payloads.

## Persona Changes

**Improvements**

* **Agora AV1 support:** The Agora integration now supports the AV1 video codec for better compression and quality at lower bitrates.
* **Multi-agent LiveKit:** Audio routing now works correctly in multi-agent LiveKit rooms with multiple Anam avatars.

**Fixes**

* Fixed tool enum type validation.

## πŸ”Œ New integrations

Four new ways to use Anam avatars in your stack:

**Pipecat**\
The [`pipecat-anam`](https://pypi.org/project/pipecat-anam/) package brings Anam avatars to [Pipecat](https://github.com/pipecat-ai/pipecat), the open-source framework for voice and multimodal AI agents. `pip install pipecat-anam`, add `AnamVideoService` to your pipeline, and you're streaming. Use audio passthrough for full control over your own orchestration, or let Anam handle the pipeline end-to-end. [GitHub repo](https://github.com/anam-org/pipecat-anam).

**ElevenLabs server-side agents**\
Put a face on any agent you've built in ElevenLabs. Pass in your ElevenLabs agent ID and session token when starting a session, and Anam handles the rest, no changes to your existing ElevenLabs setup needed. [Cookbook](https://anam.ai/cookbook/elevenlabs-server-side-agents).

**VideoSDK**\
Anam is now officially supported on [VideoSDK](https://www.videosdk.live/), a WebRTC platform similar to LiveKit. Built on top of the Python SDK.

**Framer**\
The Anam Avatar plugin is now [on the Framer Marketplace](https://www.framer.com/marketplace/plugins/anam-avatar/). Drop an avatar into any Framer site without writing code.
## πŸ“ Metaxy: sample-level versioning for ML pipelines We wrote up a deep dive on [Metaxy](https://anam.ai/blog/metaxy), our open-source metadata versioning framework for multimodal data pipelines. It tracks partial data updates at the field level so teams only reprocess what actually changed. Works with orchestrators like Dagster, agnostic to compute (Ray, DuckDB, etc.). [GitHub](https://github.com/anam-org/metaxy). *** ## Lab Changes **Improvements** * **Build page redesign:** Everything lives in Build now. Avatars, Voices, LLMs, Tools, and Knowledge are tabs within a single page. Create custom avatars, clone voices, add LLMs, and upload knowledge files without leaving the page. Knowledge is a file drop on the Prompt tab: upload a document and it's automatically turned into a RAG tool. * **Smart voice matching:** One-shot avatars now auto-select a voice matching the avatar's detected gender. * **Mobile improvements:** Tables replaced with cards and lists. Bottom tab bar instead of hamburger menu. Long-press context menus on persona tiles. Touch-friendly tooltips. * **Knowledge base improvements:** Non-blocking document deletion with pending state and rollback on error. PDF uploads restored. Stuck documents are auto-detected with retry from the UI. **Fixes** * Fixed typo in thinking duration display. * Fixed sticky hover states on touch devices. ## Persona Changes **Improvements** * **Video stability:** New TWCC-based frame-drop pacer with GCC congestion control. Smoother video on constrained or variable-bandwidth connections. * **Network connectivity:** TURN over TLS for ICE, improving session establishment behind corporate firewalls and VPNs. **Fixes** * Fixed ElevenLabs pronunciation issues with certain text patterns. * Fixed text sanitization causing incorrect punctuation in TTS output. * Fixed silent responses not being detected correctly. 
## SDK/API Changes

**Improvements**

* **Tool call event handlers:** `onToolCallStarted`, `onToolCallCompleted`, and `onToolCallFailed` handlers for tracking tool execution on the client.
* **Documents accessed:** `ToolCallCompletedPayload` now includes a `documentsAccessed` field for Knowledge Base tool calls.

**Fixes**

* Fixed duplicate tool call completion events.

## 🐍 Anam Python SDK

Anam now has a [Python SDK](https://github.com/anam-org/python-sdk). It handles WebRTC streaming, audio/video frame delivery, and session management.

What's in the box:

* **Media handling** β€” The SDK manages WebRTC connections and signalling. Connect, and you get synchronized audio and video frames back.
* **Multiple integration modes** β€” Use the full pipeline (STT, LLM, TTS, Face) or bring your own TTS via audio passthrough.
* **Live transcriptions** β€” User speech and persona responses stream in as partial transcripts, useful for captions or logging conversations.
* **Async-first** β€” Built on Python's async/await. Process media frames with async iterators or hook into events with decorators.

People are already building with it β€” rendering ASCII avatars in the terminal, processing frames with OpenCV, piping audio to custom pipelines. Check the [GitHub repo](https://github.com/anam-org/python-sdk) to get started.

***

## Lab Changes

**Improvements**

* **Visual refresh:** Updated Lab UI with new brand styling, including new typography (Figtree), refreshed color tokens, and consistent component styles across all pages.

## Persona Changes

**Improvements**

* **ICE recovery grace period:** WebRTC sessions now survive brief network disconnections instead of terminating immediately. The engine detects ICE connection drops and holds the session open, allowing the client to reconnect without losing conversation state.
* **Language configuration:** You can now set a language code on your persona, ensuring the STT pipeline uses the correct language from session start.
* **Voice generation options:** Added configurable voice generation parameters for more control over TTS output.
* **ElevenLabs streaming:** Removed input buffering for ElevenLabs TTS, reducing time-to-first-audio for all sessions using ElevenLabs voices.

## 🎬 Session recordings

By default, every session is now recorded and saved for 30 days. Watch back any conversation in the Lab (lab.anam.ai/sessions) to see exactly how users interact with your personas, including the full video stream and conversation flow.

Recordings and transcripts are also available via API. Use `GET /v1/sessions/{id}/transcript` to fetch the full conversation programmatically for analytics, QA, or archival. For privacy-sensitive applications, you can disable recording in your persona config.

## 🎨 Two-pass avatar refinement

One-shot avatar creation now refines images in two passes. Upload an image, and the system generates an initial avatar, then refines it for better likeness and expression. Available to all users.
***

## Lab Changes

**Improvements**

* Added `speechEnhancementLevel` (0-1) to `voiceDetectionOptions` for control over how aggressively background noise is filtered from user audio
* Support for ephemeral tool IDs, so you can configure tools dynamically per session
* Added delete account and organization buttons

**Fixes**

* Fixed terminology on the tools tab
* Fixed RAG default parameters not being passed
* Fixed custom LLM default settings

## Persona Changes

**Improvements**

* Support for Gemini thinking/reasoning models
* The `speechEnhancementLevel` parameter now passes through via `voiceDetectionOptions`
* Engine optimizations for lower latency under load

**Fixes**

* Fixed GPT-5 tool calls returning errors
* Fixed audio frame padding that could cause playback issues
* Fixed repeated silence messages
* Fixed the silence breaker not responding to typed messages

## 🎧 User Speech Enhancement

We've integrated [ai-coustics](https://ai-coustics.com/) as a preprocessing layer in our user audio pipeline. It enhances audio quality before it reaches speech detection, cleaning up background noise and improving signal clarity in real-world conditions. This reduces false transcriptions from ambient sounds and improves endpointing accuracy, especially in noisy environments like cafes, offices, or outdoor settings.

## πŸŽ›οΈ Configurable Persona Responsiveness

Control how quickly your persona responds with [voiceDetectionOptions](https://anam.ai/docs/personas/session/voice-detection) in the persona config:

* `endOfSpeechSensitivity` (0-1): How eager the persona is to jump in. 0 waits until it's confident you're done talking, 1 responds sooner.
* `silenceBeforeSkipTurnSeconds`: How long before the persona prompts a quiet user.
* `silenceBeforeSessionEndSeconds`: How long silence must last before the session ends.
* `silenceBeforeAutoEndTurnSeconds`: How long a mid-sentence pause waits before the persona responds.
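Putting the `voiceDetectionOptions` parameters above together, a persona config fragment might look like the following sketch. The parameter names come from the list above; the numeric values are purely illustrative, not recommendations:

```python
# Illustrative voiceDetectionOptions fragment for a persona config.
# Values here are made up for the example; tune them for your use case.
voice_detection_options = {
    "endOfSpeechSensitivity": 0.7,           # 0-1: higher responds sooner
    "speechEnhancementLevel": 0.5,           # 0-1: background-noise filtering strength
    "silenceBeforeSkipTurnSeconds": 5,       # prompt a quiet user after 5 s
    "silenceBeforeSessionEndSeconds": 120,   # end the session after 2 min of silence
    "silenceBeforeAutoEndTurnSeconds": 1.5,  # respond after a 1.5 s mid-sentence pause
}
```

A lower `endOfSpeechSensitivity` trades responsiveness for fewer interruptions of slow talkers; the three silence thresholds control the skip-turn prompt, session end, and auto end-of-turn respectively.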
## 🧠 Reasoning Model Support

Added support for OpenAI reasoning models and custom Groq LLMs. Reasoning models can think through complex scenarios before responding, while Groq's high-throughput infrastructure makes these typically slower models respond with conversational latencies suitable for real-time interactions. Add your reasoning model in the Lab: [https://lab.anam.ai/llms](https://lab.anam.ai/llms).

## Persona Changes

**Fixes**

* Fixed Knowledge Base (RAG) tool calling with proper default query parameters
* Fixed panic crashes when sessions error during startup

## Lab Changes

**Fixes**

* Fixed `Powered by Anam` text visibility when watermark removal is enabled
* Updated API responses for the GET/UPDATE persona endpoints

## SDK/API Changes

**Improvements**

* Introduced agent audio input streaming for BYO audio workflows, allowing you to integrate with arbitrary voice agents, e.g. ElevenLabs agents (see the [ElevenLabs server-side agents recipe](https://anam.ai/cookbook/elevenlabs-server-side-agents) for how to integrate).
* Added WebRTC reasoning event handlers for reasoning model support

## 🎭 Introducing Cara 3: our most expressive model yet

The culmination of over six months of research, **Cara 3** is now available. This new model delivers significantly more expressive avatars, featuring realistic eye movement, more dynamic head motion, smoother transitions in and out of idling, and improved lip sync.

You can opt in to the new model in your persona config using `avatarModel: 'cara-3'` or by selecting it in the Lab UI. Note that all new custom avatars will use Cara 3 exclusively, while existing personas will continue to use the Cara 2 model by default unless explicitly updated.

## πŸ›‘οΈ SOC 2 Type II compliance

Anam has achieved SOC 2 Type II compliance. This milestone validates that our security, availability, and data protection controls have been independently audited and proven over time.
For customers building across learning, enablement, or live production use cases, this provides formal assurance regarding how we handle security, access, and reliability.\
[**Visit the Trust Center**](https://trust.anam.ai/)

## πŸ”Œ Integrations

**Model Context Protocol (MCP) server**\
Manage your personas and avatars directly within Claude Desktop, Cursor, and other MCP-compatible clients. Use your favorite LLM-assisted tools to interact with the Anam API.

**Anam x ElevenLabs agents**\
Turn any ElevenLabs conversational AI agent into a visual avatar using Anam's audio passthrough.\
[Watch the demo](https://anam.ai/cookbook/elevenlabs-server-side-agents)

***

## Lab Changes

**Improvements**

* **UI overhaul:** A redesigned Homepage and Build page make persona creation more intuitive. You can now preview voices/avatars without starting a chat and create custom assets directly within the Build flow. Sidebar and Pricing pages have also been refreshed.
* **Performance:** Implemented TanStack caching to significantly improve Lab responsiveness.

**Fixes**

* Fixed a bug where client tool events were not appearing in the Build page chat.
* Resolved an issue where tool calls and RAG were not passing parameters correctly.

## Persona Changes

**Improvements**

* **More voices:** Added ~100 new Cartesia voices (Sonic-3) and ~180 new ElevenLabs voices (Flash v2.5), covering languages and accents from all over the world.
* **New default LLM:** `kimi-k2-instruct-0905` is now available. This SOTA open-source model offers high intelligence and excellent conversational abilities. (Note: standard `kimi-k2` remains recommended for heavy tool-use scenarios.)
* **Configurable greetings:** Added a `skip_greeting` parameter, allowing you to configure whether the persona initiates the conversation or waits for the user.
* **Latency reductions:**
  * **STT optimization:** We are now self-hosting Deepgram for Speech-to-Text, resulting in a **\~30ms (p50)** and **\~170ms (p90)** latency improvement.
  * **Frame buffering:** Optimized the output frame buffer, shaving off an additional **\~40ms** of latency per response.

**Fixes**

* Corrected header handling to ensure reliable data center failover.
* Fixed a visual artifact where Cara 3 video frames occasionally displayed random noise.
* Resolved a freeze-frame issue affecting \~1% of sessions ([Incident Report](https://status.anam.ai/incidents/01KC7A6Q9Q6H1JDZ83TP1EF1Z1)).

## SDK/API Changes

**Improvements**

* **API gateway guide:** Added documentation and an example repository for routing Anam SDK traffic through your own API Gateway server. [View on GitHub](https://github.com/anam-org/anam-gateway-example).

## 🎥 LiveKit out of beta and a new latency record

LiveKit integration is now generally available: drop Anam's expressive real-time avatars into any LiveKit Agents app so your AI can join LiveKit rooms as synchronised voice + video participants.\
It turns voice-only agents into face-and-voice experiences for calls, livestreams, and collaborative WebRTC spaces, with LiveKit handling the infrastructure and Anam handling the human layer.

***

## ⚡ Record-breaking latency: a 330 ms decrease for all customers

Server-side optimisations cut average end-to-end latency by 330 ms for all customers, thanks to cumulative engine optimisations across transcription, frame generation, and frame writing, plus upgraded Deepgram Flux endpointing for faster, best-in-class turn-taking without regressions in voice quality or TTS.
***

## Lab Changes

**Improvements**

* Overhauled the avatar video upload and management system
* Upgraded default Cartesia voices to Sonic 3
* Standardised voice model selection across the platform

**Fixes**

* Enhanced share-link management capabilities
* Corrected LiveKit persona type identification logic

***

## Persona Changes

**Improvements**

* Server-side optimisations to our frame buffering to reduce response latency by \~250ms for all personas.

**Fixes**

* Changed timeout behavior to never time out based on heartbeats; only time out when the websocket is disconnected for 10 seconds or more.
* Fixed an intermittent issue where a persona stopped responding
* Set `pix_fmt` for video output, moving from `yuvj420p` (full-range JPEG) to the `yuv420p` color space to avoid incorrect encoding/output.
* Added a timeout to our silence-breaking logic to prevent hangs.

## 🚀 Introducing Anam Agents

Build and deploy AI agents in Anam that can engage alongside you. With Anam Agents, your Personas can now interact with your applications, access your knowledge, and trigger workflows directly through natural conversation. This marks Anam's evolution from conversational Personas to agentic Personas that think, decide, and execute.

## Knowledge Tools

Give your Personas access to your company's knowledge. Upload docs to the Lab, and they'll use semantic retrieval to integrate the right info.\
[Docs for Knowledge Base](https://anam.ai/docs/personas/knowledge/overview)

## Client Tools

Personas can control your interface in real time: open checkout, display modals, navigate UI, and update state by voice.\
[Docs for Client Tools](https://anam.ai/docs/personas/tools/client-tools)

## Webhook Tools

Connect your Personas to external APIs and services. Create tickets, fetch status, update records, or fetch live data.\
[Docs for Webhook Tools](https://anam.ai/docs/personas/tools/webhook-tools)

## Intelligent Tool Selection

Each Persona's LLM chooses tools based on intent, not scripts.
You can create and manage tools on the Tools page in the Lab and attach them to any Persona from Build.

**Anam Agents are available in beta for all users:** [https://lab.anam.ai/login](https://lab.anam.ai/login)

***

## Lab Changes

**Improvements**

* Cartesia Sonic-3 voices: the most expressive TTS model.
* Voice modal expanded: 50+ languages, voice samples, Cartesia TTS now the default.
* Session reports now work for custom LLMs.

**Fixes**

* Prevented auto-logout when switching contexts.
* Fixed race conditions in cookie handling.
* Resolved legacy session token issues.
* Removed problematic voices.
* Corrected player/stream aspect ratios on mobile.

## Persona Changes

**Improvements**

* Deepgram Flux support for turn-taking ([Deepgram Flux details](https://deepgram.com/learn/introducing-flux-conversational-speech-recognition))
* Server-side optimization: reduced GIL contention and latency, faster connections.

**Fixes**

* Bug fix for dangling LiveKit connections.

## Research

**Improvements**

* Our first open-source library!\
  Metaxy, a metadata layer for ML/data pipelines:\
  [Read more](https://anam-org.github.io/metaxy/main/#3-run-user-defined-computation-over-the-metadata-increment) | [GitHub](https://github.com/anam-org/metaxy)

## 🛡️ Anam is now HIPAA compliant

A big milestone for our customers and partners. Anam now meets HIPAA requirements for handling protected health information.

[**Learn more at the Anam Trust Center**](https://trust.anam.ai/)

## Lab Changes

**Improvements**

* Enhanced voice selection: search by use case/conversational style, 50+ languages.
* Product tour update.
* Streamlined one-shot avatar creation.
* Auto-generated Persona names based on the selected avatar.
* Session start is now 1.1s faster.

**Fixes**

* Share links: fixed extra concurrency slot usage.

## Persona Changes

**Improvements**

* Improved TTS pronunciation via smarter text chunking.
* Traceability and monitoring for session IDs.
* Increased internal audio sampling rate to 24kHz.
* Increased max websocket message size to 16 MB.

**Fixes**

* Concurrency calculation now only considers sessions from the last 2 hours.
* Less freezing for slower LLMs.

## 📊 Session Analytics

Once a conversation ends, how do you review what happened? To help you understand and improve your Persona's performance, we're launching Session Analytics in the Lab. Now you can access a detailed report for every conversation, complete with a full transcript, performance metrics, and AI-powered analysis.

* **Full Conversation Transcripts.** Review every turn of a conversation with a complete, time-stamped transcript. See what the user said and how your Persona responded, making it easy to diagnose issues and identify successful interaction patterns.
* **Detailed Analytics & Timeline.** Alongside the transcript, a new Analytics tab provides key metrics grouped into "Transcript Metrics" (word count, turns) and "Processing Metrics" (e.g., LLM latency). A visual timeline charts the entire conversation, showing who spoke when and highlighting any technical warnings.
* **AI-Powered Insights.** For a deeper analysis, you can generate an AI-powered summary and review key insights. This feature, currently powered by gpt-5-mini, evaluates the conversation for highlights, adherence to the system prompt, and user interruption rates.

You can find your session history on the Sessions page in the Lab. Click on any past session to explore the new analytics report. This is available today for all session types except LiveKit sessions. For privacy-sensitive applications, session logging can be disabled via the SDK.

## Lab Changes

**Improvements**

* Improved voice discovery: The Voices page has been updated to be more searchable, allowing you to preview voices with a single click and view new details like gender, TTS model, and language.

**Fixes**

* Fixed a share-link session bug where share-link sessions took an extra concurrency slot.
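As a rough sketch of the transcript metrics described above (word count and turn count), the snippet below shows one way such numbers can be derived. The transcript shape and field names here are illustrative assumptions, not the actual report schema:

```javascript
// Hypothetical transcript shape: [{ role: "user" | "persona", content: string }, ...]
function transcriptMetrics(transcript) {
  // Total words across all messages.
  const wordCount = transcript
    .map((turn) => turn.content.trim().split(/\s+/).filter(Boolean).length)
    .reduce((a, b) => a + b, 0);

  // A "turn" here means a change of speaker, not every individual message.
  let turns = transcript.length ? 1 : 0;
  for (let i = 1; i < transcript.length; i++) {
    if (transcript[i].role !== transcript[i - 1].role) turns++;
  }

  return { wordCount, turns };
}

const metrics = transcriptMetrics([
  { role: "user", content: "Hi there" },
  { role: "persona", content: "Hello! How can I help?" },
  { role: "persona", content: "Are you still there?" },
]);
// metrics.wordCount === 11, metrics.turns === 2
```

Consecutive messages from the same speaker are counted as one turn, which is why the example above yields two turns from three messages.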
## Persona Changes

**Improvements**

* Small improvement to connection time: tweaks to how we perform WebRTC signalling allow for slightly faster connection times (\~900ms faster for p95 connection time).
* Improved output audio quality on poor connections: enabled Opus in-band FEC to improve audio quality under packet loss.
* Small reduction in network latency: optimisations have been made to our outbound media streams to reduce A/V jitter (and hence jitter buffer delay). The expected latency improvement is modest (\<50ms).

**Fixes**

* Fix for LiveKit sessions with slow TTS audio: stabilizes LiveKit streaming by pacing output and duplicating frames during slowdowns to prevent underflow.

## ⚡ Intelligent LLM Routing for Faster Responses

The performance of LLM endpoints can be highly variable, with time-to-first-token latencies sometimes fluctuating by as much as 500ms from one day to the next depending on regional load. To solve this and ensure your personas respond as quickly and reliably as possible, we've rolled out a new intelligent routing system for LLM requests. This is active for both our turnkey customers and for customers using their own server-side **Custom LLMs** if they deploy multiple endpoints.

This new system constantly monitors the health and performance of all configured LLM endpoints by sending lightweight probes at regular intervals. Using a time-aware moving average, it builds a real-time picture of network latency and processing speed for each endpoint. When a request is made, the system uses this data to calculate the optimal route, automatically shedding load from any overloaded or slow endpoints within a region.

## Lab Changes

**Improvements**

* Generate one-shot avatars from text prompts: you can now generate one-shot avatars from text prompts within the Lab, powered by Gemini's new Nano Banana model. The one-shot creation flow has been redesigned for speed and ease of use, and is now available to all plans.
Image upload and webcam avatars remain exclusive to Pro and Enterprise plans.

* Improved management of published embed widgets: published embed widgets can now be configured and monitored from the Lab at [https://lab.anam.ai/personas/published](https://lab.anam.ai/personas/published).

## Persona Changes

**Improvements**

* Automatic failover to backup data centres: to ensure maximum uptime and reliability for our personas, we've implemented automatic failover to backup data centres.

**Fixes**

* Prevent session crash on long user speech: previously, unbroken user speech exceeding 30 seconds would trigger a transcription error and crash the session. We now automatically truncate continuous speech to 30 seconds, preventing sessions from failing in these rare cases.
* Allow configurable session lengths of up to 2 hours on Enterprise plans: we had a bug where sessions had a max timeout of 30 minutes instead of 2 hours on Enterprise plans. This has now been fixed.
* Resolved slow connection times caused by incorrect database region selection: an undocumented issue with our database provider led to incorrect region selection for our databases. Simply refreshing our credentials resolved the problem, resulting in a \~1s improvement in median connection times and \~3s faster p95 times. While our provider works on a permanent fix, we're actively monitoring for any recurrence.

## 🔌 Embed Widget

Embed personas directly into your website with our new widget. Within the **Lab's Build page**, click Publish, then generate your unique HTML snippet. This snippet will work in most common website builders, e.g. WordPress.org or Squarespace.

For added security, we recommend adding a whitelist with your domain URL. This will lock the persona down to only work on your website. You can also cap the number of sessions or give the widget an expiration period.
## Lab Changes

**Improvements**

* One-shot avatars available via API: Professional and Enterprise accounts can now create one-shot avatars via API. Docs **here**.
* Spend caps: it's now possible to set a spend cap on your account. Available in **profile settings**.

## Persona Changes

**Fixes**

* Prevent Cartesia from timing out when using slow custom LLMs: we've added a safeguard to prevent Cartesia contexts from unexpectedly closing during pauses in text streaming. With slower LLMs, or if there's a break or slowdown in the text being sent, your connection will now stay alive, ensuring smoother, uninterrupted interactions.

For full legal and policy information, see:

* [Trust Center](https://trust.anam.ai/)
* [AI Governance](https://anam.ai/ai-governance)
* [Terms of Service](https://anam.ai/terms-of-service)
* [DPA](https://anam.ai/data-processing)
* [Acceptable Use Policy](https://anam.ai/acceptable-use-policy)
* [Privacy Policy](https://anam.ai/privacy-policy)

# Embed Anam on your website

Source: https://anam.ai/docs/embed/overview

Add an Anam avatar to any website using the Widget, Player, or SDK. Compare options and follow platform-specific setup guides.

There are three ways to add an Anam avatar to your site: the Widget, the Player, or the SDK.

## Which option should I use?

**Widget**: A pre-built Web Component. The avatar loads directly on your page with its own UI. You can listen to events, handle tool calls, and control it from JavaScript.

**Player**: A single iframe. The avatar runs inside a sandboxed frame, completely isolated from your page. The most compatible option across website builders, but you can't interact with it from your own code.

**SDK**: A JavaScript library for building your own interface from scratch. Use this when the pre-built UIs don't fit your design and you want full control over the experience.

## Widget

The Widget loads as an `<anam-agent>` [Web Component](https://developer.mozilla.org/en-US/docs/Web/API/Web_Components) on your page.
It handles its own authentication, renders inside a Shadow DOM (so it won't clash with your styles), and dispatches DOM events you can listen to.

* Floating overlay or inline layout modes
* DOM events for analytics, error handling, and custom behavior
* Supports tool calls and text input
* Configure appearance from Lab or HTML attributes
* Your site must allow external JavaScript
* Your domain must be added to the allowed list in Lab's Widget tab
* Microphone access required for voice

```html theme={"system"}
<!-- Illustrative snippet; replace YOUR_AGENT_ID with your persona's ID from Lab -->
<script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

See the [Widget documentation](/embed/widget/overview) for configuration, events, and framework-specific setup.

## Player

The Player runs inside an iframe. Your page and the avatar are fully isolated from each other: different JavaScript contexts, different stylesheets, no interaction between the two. This makes it the most broadly compatible option, especially on platforms that restrict custom JavaScript.

* Full Anam interface in a sandboxed frame
* Isolated from your site's CSS and JavaScript
* Works on platforms that block custom scripts but allow iframes
* Your site must allow iframe embedding
* HTTPS required
* Microphone access required for voice

```html theme={"system"}
<!-- Illustrative snippet; copy the exact iframe code from the Share button in Lab -->
<iframe src="https://lab.anam.ai/..." allow="microphone" width="400" height="600"></iframe>
```

## SDK

The SDK gives you a JavaScript client and raw media streams. There's no pre-built UI: you build your own. Use this when you need complete control over how the avatar looks and behaves on your page.

* Full control over layout and appearance
* Direct access to video/audio streams
* Programmatic session management
* Your site must allow external JavaScript
* HTTPS required
* Microphone access required for voice

See the [SDK Reference](/javascript-sdk/reference/basic-usage) for usage details.

## Platform compatibility

Not every website builder supports every embed type. This depends on whether the platform lets you add custom JavaScript, iframes, or both.
| Platform | Widget | Player | SDK | Notes |
| ------------- | ------ | ------ | --- | ------------------------------------------------ |
| WordPress.org | ✅ | ✅ | ✅ | |
| Webflow | ✅ | ✅ | ✅ | May need an Enterprise plan for script whitelisting |
| Squarespace | ✅ | ✅ | ✅ | Requires paid plan |
| Jimdo Creator | ✅ | ✅ | ✅ | Requires paid plan |
| Shopify | ✅ | ✅ | ❌ | |
| WordPress.com | ✅ | ✅ | ❌ | Requires paid plan |
| GoDaddy | ✅ | ✅ | ❌ | |
| Wix | ✅ | ❌ | ❌ | Requires paid plan |

### WordPress.com

Requires the Business plan (\$25/month) or higher. Add Widget or Player code via a Custom HTML block. The SDK is not supported.

### WordPress.org (self-hosted)

All three embed options work. Add code via the Gutenberg editor (Custom HTML block), your theme's `footer.php`, or a plugin like "Insert Headers and Footers".

### Shopify

**Player:** Go to Online Store > Themes > Customize, add a "Custom Liquid" section, and paste the iframe code.

**Widget/SDK:** Go to Online Store > Themes > Actions > Edit code, open `theme.liquid`, and add the code before ``. You may need to update your theme's Content Security Policy.

### Wix

Requires a paid plan for custom code. Add Widget or Player code via Wix's HTML/Embed elements. The SDK is not supported.

### Squarespace

Requires the Business plan or higher. Use a Code Block (HTML mode) for the Player, or Settings > Advanced > Code Injection for the Widget/SDK.

### Webflow

**Player:** Add an Embed element and paste the iframe code. Check Site Settings > Security > Secure Frame Headers if it doesn't load.

**Widget/SDK:** Go to Project Settings > Custom Code and add the code to Footer Code. Changes aren't visible in the Designer; publish first.

### GoDaddy

**Widget/Player:** Add a "Custom Code" section and paste the embed code. The SDK is not supported.

### Jimdo Creator

Requires "Creator" mode (not "Dolphin"). Add an HTML/Widget element for the Player, or use Settings > Edit Head for Widget/SDK scripts.
## Security

All embed options require HTTPS and microphone access. Users are prompted for microphone permissions the first time they interact with the avatar.

### Content Security Policy (CSP)

If your site has a CSP, add these directives for the option you're using:

```http theme={"system"}
# Widget
Content-Security-Policy: script-src https://unpkg.com/@anam-ai/; connect-src https://api.anam.ai wss://connect.anam.ai wss://connect-us.anam.ai wss://connect-eu.anam.ai;

# Player
Content-Security-Policy: frame-src https://lab.anam.ai;

# SDK
Content-Security-Policy: connect-src https://api.anam.ai wss://connect.anam.ai wss://connect-us.anam.ai wss://connect-eu.anam.ai;
```

### Browser support

Chrome/Edge 80+, Firefox 75+, Safari 14.1+, iOS Safari 14.5+, Chrome Android 80+.

## Troubleshooting

* Check that your share token (Player) or agent ID (Widget) is correct
* Open browser DevTools (F12) and look for errors in the console
* Confirm your site is served over HTTPS, not HTTP
* Check that your platform plan supports custom embeds
* Try disabling ad blockers
* Click the lock icon in the address bar to check permissions
* Microphone only works over HTTPS
* Try Chrome or Edge for best compatibility
* Disable privacy extensions that might block permissions
* Look for "NotAllowedError" or "NotFoundError" in the console
* Add the required domains to your Content Security Policy (see above)
* For Player: check that X-Frame-Options isn't set to DENY
* For Widget: make sure your domain is in the allowed list in Lab
* Some platforms have non-configurable security policies

## FAQ

**Can I embed more than one avatar on a page?**
No. Use one Anam instance per page.

**What credentials does each option use?**
The Player uses a share token (generated from the Share button in Lab). The Widget uses an agent ID (your persona's ID) and handles authentication automatically via domain allowlisting.

**Can I customize the appearance?**
**Widget:** Yes. Layout, position, UI toggles, and more. Configure from Lab or with HTML attributes. See [Widget configuration](/embed/widget/configuration).
**Player:** Limited to sizing the iframe.
**SDK:** Full control; you build the UI yourself.

**What if the user has no microphone, or denies access?**
The Widget and SDK support text input as a fallback (enable with the `ui-text-input` attribute or SDK configuration). The Player shows a message asking users to grant access.

## Next steps

Configuration, events, and framework-specific setup guides. Build your own interface with the JavaScript SDK.

# Configuration

Source: https://anam.ai/docs/embed/widget/configuration

Configure the widget with HTML attributes, Lab settings, and custom positioning

The widget loads its configuration from your published persona settings in [Anam Lab](https://lab.anam.ai). You can override any setting by adding HTML attributes directly to the `<anam-agent>` element.

## Layout modes

### Floating

The default mode. The widget renders as a fixed-position overlay, anchored to a corner of the viewport. Users click to expand it into a conversation panel.

```html theme={"system"}
<!-- Illustrative snippet; replace YOUR_AGENT_ID with your persona's ID -->
<anam-agent agent-id="YOUR_AGENT_ID" layout="floating" initial-state="minimized"></anam-agent>
```

The `initial-state` attribute controls whether the widget starts expanded or minimized:

* `"expanded"` (default): the conversation panel is open immediately
* `"minimized"`: only the floating orb is visible; the user clicks to expand

### Inline

The widget fills its parent container and becomes part of your page layout. There is no floating orb or expand/collapse behavior.

```html theme={"system"}
<!-- Illustrative snippet; the parent element must have explicit dimensions -->
<div style="width: 400px; height: 600px;">
  <anam-agent agent-id="YOUR_AGENT_ID" layout="inline"></anam-agent>
</div>
```

The inline widget inherits the dimensions of its parent. Make sure the parent has explicit width and height (or aspect-ratio) set.

## Custom positioning

For the `floating` layout, the `position` attribute supports both predefined corners and custom CSS values using bracket syntax.

### Predefined positions

| Value | CSS Result |
| ---------------- | --------------------------- |
| `"bottom-right"` | `bottom: 24px; right: 24px` |
| `"bottom-left"` | `bottom: 24px; left: 24px` |
| `"top-right"` | `top: 24px; right: 24px` |
| `"top-left"` | `top: 24px; left: 24px` |

### Bracket syntax

For precise control, use bracket syntax to set arbitrary CSS position values:

```html theme={"system"}
<!-- Illustrative snippets; replace YOUR_AGENT_ID with your persona's ID -->

<!-- 20px from the top, flush to the right edge -->
<anam-agent agent-id="YOUR_AGENT_ID" position="[top-20,right-0]"></anam-agent>

<!-- 100px from the bottom, 40px from the left -->
<anam-agent agent-id="YOUR_AGENT_ID" position="[bottom-100,left-40]"></anam-agent>
```

The format is `[property-value,property-value,...]` where:

* Allowed properties: `top`, `right`, `bottom`, `left`, `margin`, `margin-top`, `margin-right`, `margin-bottom`, `margin-left`
* Bare numbers are treated as pixels (e.g., `top-20` becomes `top: 20px`)
* CSS units can be specified directly (e.g., `top-2rem`)

## Attributes reference

`agent-id` is required. All other attributes are optional and will fall back to your Lab configuration.

| Attribute | Type | Default | Description |
| ------------------ | ----------------------------- | ------------------------- | ---------------------------------------------------------------------------- |
| `agent-id` | `string` | | Persona ID. Required for fetching config and creating sessions. |
| `api-base-url` | `string` | `https://api.anam.ai` | API URL override. Only needed for non-production environments. |
| `layout` | `"floating"` \| `"inline"` | `"floating"` | Widget layout mode. |
| `initial-state` | `"expanded"` \| `"minimized"` | `"expanded"` | Initial state when layout is `floating`. Ignored for `inline`. |
| `position` | `string` | `"bottom-right"` | Corner position or custom bracket syntax. Only applies to `floating` layout.
| `ui-mute-button` | `"true"` \| `"false"` | `"true"` | Show the microphone mute button. |
| `ui-text-input` | `"true"` \| `"false"` | `"true"` | Show the text input field. |
| `call-to-action` | `string` | `"Talk to our assistant"` | Custom text for the start button. |
| `avatar-url` | `string` | | Override avatar thumbnail image URL. |
| `avatar-video-url` | `string` | | Override avatar preview video URL. |
| `agent-name` | `string` | | Override persona name displayed in the default CTA. |

# Events

Source: https://anam.ai/docs/embed/widget/events

Listen to widget events for analytics and custom behavior

The `<anam-agent>` element dispatches standard [DOM Custom Events](https://developer.mozilla.org/en-US/docs/Web/API/CustomEvent) that cross the Shadow DOM boundary, so you can listen to them with `addEventListener` on the element itself or any ancestor. Event data is available on `event.detail`.

## Listening to events

```javascript theme={"system"}
const widget = document.querySelector("anam-agent");

widget.addEventListener("anam-agent:session-started", (e) => {
  console.log("Session started:", e.detail.sessionId);
});

widget.addEventListener("anam-agent:message-received", (e) => {
  console.log(`${e.detail.role}: ${e.detail.content}`);
});

widget.addEventListener("anam-agent:error", (e) => {
  console.error(`Error [${e.detail.code}]: ${e.detail.message}`);
});
```

## Events reference

| Event Name | Payload | Description |
| ----------------------------- | ---------------------------------------------- | ------------------------------------------------------ |
| `anam-agent:session-started` | `{ sessionId: string }` | A WebRTC session has been established. |
| `anam-agent:session-ended` | `{ sessionId: string, reason: string }` | The session has ended (user-initiated or server-side). |
| `anam-agent:message-received` | `{ role: "user" \| "agent", content: string }` | A transcript message was received from either party.
| `anam-agent:message-sent` | `{ content: string }` | The user sent a text message via the input field. |
| `anam-agent:expanded` | `{}` | The widget was expanded (floating layout only). |
| `anam-agent:collapsed` | `{}` | The widget was collapsed (floating layout only). |
| `anam-agent:error` | `{ code: string, message: string }` | An error occurred (auth failure, network issue, etc.). |
| `anam-agent:mic-muted` | `{}` | The user muted their microphone. |
| `anam-agent:mic-unmuted` | `{}` | The user unmuted their microphone. |

## Common patterns

Track session starts, message counts, and engagement duration:

```javascript theme={"system"}
const widget = document.querySelector("anam-agent");
let sessionStart;

widget.addEventListener("anam-agent:session-started", (e) => {
  sessionStart = Date.now();
  analytics.track("avatar_session_started", {
    sessionId: e.detail.sessionId,
  });
});

widget.addEventListener("anam-agent:session-ended", (e) => {
  const duration = Date.now() - sessionStart;
  analytics.track("avatar_session_ended", {
    sessionId: e.detail.sessionId,
    reason: e.detail.reason,
    durationMs: duration,
  });
});

widget.addEventListener("anam-agent:message-received", (e) => {
  if (e.detail.role === "agent") {
    analytics.track("avatar_message", {
      contentLength: e.detail.content.length,
    });
  }
});
```

React to widget expand/collapse to adjust your page layout:

```javascript theme={"system"}
const widget = document.querySelector("anam-agent");
const sidebar = document.getElementById("sidebar");

widget.addEventListener("anam-agent:expanded", () => {
  sidebar.style.marginRight = "420px";
});

widget.addEventListener("anam-agent:collapsed", () => {
  sidebar.style.marginRight = "0";
});
```

Display user-facing messages or trigger fallback behavior:

```javascript theme={"system"}
const widget = document.querySelector("anam-agent");

widget.addEventListener("anam-agent:error", (e) => {
  const { code, message } = e.detail;
  if (message.includes("Origin not allowed")) {
    showBanner("Widget
configuration required. Contact your admin.");
  } else if (message.includes("Too many requests")) {
    showBanner("Please wait a moment before trying again.");
  } else {
    showBanner("Something went wrong. Please try again.");
  }
});
```

# Framer

Source: https://anam.ai/docs/embed/widget/framer

Add an Anam AI avatar widget to your Framer website

Add a conversational AI avatar to your Framer website using the [Anam Widget](/embed/widget/overview) -- no coding required.

## Quick Start

Go to [Anam Lab](https://lab.anam.ai) and configure your persona's avatar, voice, and behavior. In the Widget tab, add your Framer domain to the **Allowed domains** list and **Publish** the avatar. Open this plugin: [https://www.framer.com/marketplace/plugins/anam-avatar/preview/](https://www.framer.com/marketplace/plugins/anam-avatar/preview/) and paste your widget code there.

## Next Steps

Customize the widget's appearance and behavior. Listen for widget events in your Framer website.

# Installation

Source: https://anam.ai/docs/embed/widget/installation

Install the Anam Widget via CDN, npm, or framework-specific methods

Include a script tag and the custom element. No build step required.

```html unpkg theme={"system"}
<!-- Illustrative CDN snippet; confirm the exact script path for your version -->
<script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

```html jsdelivr theme={"system"}
<!-- Illustrative CDN snippet; confirm the exact script path for your version -->
<script src="https://cdn.jsdelivr.net/npm/@anam-ai/agent-widget" async></script>
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

The script auto-registers the `<anam-agent>` custom element. Place it anywhere in your HTML. The `async` attribute ensures it doesn't block page rendering.

For projects with a build system, install the package and register the element in your application code.

```bash theme={"system"}
npm install @anam-ai/agent-widget
```

```javascript theme={"system"}
import { registerWidget } from "@anam-ai/agent-widget";

registerWidget();
```

Then use `<anam-agent>` anywhere in your templates or JSX.
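If you render pages from templates or server-side code, you may prefer to generate the element markup from a config object. The helper below is a hypothetical illustration (it is not part of the `@anam-ai/agent-widget` API); only the attribute names come from the configuration reference:

```javascript
// Hypothetical helper: build an <anam-agent> tag from an attribute map.
// Attribute names follow the widget's documented attributes (agent-id, layout, ...).
function anamAgentTag(attrs) {
  const attrString = Object.entries(attrs)
    .map(([name, value]) => `${name}="${String(value).replace(/"/g, "&quot;")}"`)
    .join(" ");
  return `<anam-agent ${attrString}></anam-agent>`;
}

const tag = anamAgentTag({ "agent-id": "YOUR_AGENT_ID", layout: "inline" });
// tag === '<anam-agent agent-id="YOUR_AGENT_ID" layout="inline"></anam-agent>'
```

Because the widget is a standard custom element, the resulting string can be injected anywhere ordinary HTML is accepted, as long as the widget script is also loaded on the page.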
## Framework-specific setup

Add the snippet before the closing `` tag:

```html theme={"system"}
<!-- Illustrative page; replace YOUR_AGENT_ID with your persona's ID -->
<!DOCTYPE html>
<html>
  <head>
    <title>My Site</title>
  </head>
  <body>
    <!-- Your page content -->
    <script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
    <anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
  </body>
</html>
```

Use the `Script` component and render the custom element in a client component:

```tsx app/components/AnamWidget.tsx theme={"system"}
"use client";

import Script from "next/script";

export function AnamWidget() {
  return (
    <>
      {/* Illustrative reconstruction; replace YOUR_AGENT_ID with your persona's ID */}
      <Script src="https://unpkg.com/@anam-ai/agent-widget" strategy="afterInteractive" />
      <anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
    </>
  );
}
```

## Next steps

CDN, npm, and framework-specific setup guides. All HTML attributes, layout options, and positioning. Add the widget to your Shopify store. Add the widget to your WordPress site.

# Shopify

Source: https://anam.ai/docs/embed/widget/shopify

Add an Anam AI avatar widget to your Shopify store

Add a conversational AI avatar to your Shopify store using the [Anam Widget](/embed/widget/overview) -- no coding required. Step-by-step guide with screenshots for embedding the widget in your Shopify theme

## Overview

The Anam widget can be added to any Shopify store by editing the `theme.liquid` file. Once installed, the avatar appears on every page (or on specific pages using Liquid conditionals).

## Quick Start

Go to [Anam Lab](https://lab.anam.ai), create a persona, and configure its avatar, voice, and behavior. Navigate to the **Widget** tab to get your persona ID and configure the widget appearance. In the Widget tab, add your Shopify domain (e.g., `https://your-store.myshopify.com`) to the **Allowed domains** list. In your Shopify admin, go to **Online Store > Themes > Edit code** and open `theme.liquid`. Add the widget snippet after the opening `` tag:

```html theme={"system"}
<!-- Illustrative snippet; replace YOUR_AGENT_ID with your persona's ID -->
<script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

Click **Publish** in Anam Lab to make the persona live on your store. Use Shopify's Liquid conditionals to show the widget only on specific pages. See the [cookbook](https://anam.ai/cookbook/widget-shopify) for examples.
## Next Steps

Customize the widget's appearance and behavior. Listen for widget events in your Shopify theme.

# Squarespace

Source: https://anam.ai/docs/embed/widget/squarespace

Add an Anam AI avatar widget to your Squarespace site

Add a conversational AI avatar to your Squarespace site using the [Anam Widget](/embed/widget/overview) and Squarespace's code injection feature -- no coding required. Step-by-step guide for embedding the widget using Squarespace's code injection

## Quick Start

Go to [Anam Lab](https://lab.anam.ai), create a persona, and configure its avatar, voice, and behavior. Navigate to the **Widget** tab to get your persona ID. In the Widget tab, add your Squarespace domain to the **Allowed domains** list. In your Squarespace dashboard, go to **Settings > Advanced > Code Injection**. Paste the following into the **Header** section:

```html theme={"system"}
<!-- Illustrative snippet: load the widget script -->
<script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
```

Then paste into the **Footer** section:

```html theme={"system"}
<!-- Illustrative snippet; replace YOUR_AGENT_ID with your persona's ID -->
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

Click **Publish** in Anam Lab and save your Squarespace settings.

## Next Steps

Customize the widget's appearance and behavior. Listen for widget events.

# Wix

Source: https://anam.ai/docs/embed/widget/wix

Add an Anam AI avatar widget to your Wix site

Add a conversational AI avatar to your Wix website using the [Anam Widget](/embed/widget/overview) and Wix's Custom Code feature -- no coding required. Step-by-step guide for embedding the widget on your Wix site

## Quick Start

Go to [Anam Lab](https://lab.anam.ai), create a persona, and configure its avatar, voice, and behavior. Navigate to the **Widget** tab to get your persona ID. In the Widget tab, add your Wix domain to the **Allowed domains** list. In your Wix dashboard, go to **Settings > Custom Code**. Click **Add Custom Code** and paste:

```html theme={"system"}
<!-- Illustrative snippet; replace YOUR_AGENT_ID with your persona's ID -->
<script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

Set placement to **All pages** and loading to **Load code once**. Click **Publish** in Anam Lab and publish your Wix site.
## Next Steps

Customize the widget's appearance and behavior. Listen for widget events.

# WordPress

Source: https://anam.ai/docs/embed/widget/wordpress

Add an Anam AI avatar widget to your WordPress site

Add a conversational AI avatar to your WordPress site using the [Anam Widget](/embed/widget/overview) and the WPCode plugin -- no coding required. Step-by-step guide for embedding the widget using WPCode

## Quick Start

Go to [Anam Lab](https://lab.anam.ai), create a persona, and configure its avatar, voice, and behavior. Navigate to the **Widget** tab to get your persona ID. In the Widget tab, add your WordPress domain to the **Allowed domains** list. In your WordPress admin, go to **Plugins > Add New** and search for **WPCode**. Install and activate it. Go to **Code Snippets > Add Snippet** and create a new **HTML Snippet**. Paste:

```html theme={"system"}
<!-- Illustrative snippet; replace YOUR_AGENT_ID with your persona's ID -->
<script src="https://unpkg.com/@anam-ai/agent-widget" async></script>
<anam-agent agent-id="YOUR_AGENT_ID"></anam-agent>
```

Set the insertion method to **Site Wide Header** and activate the snippet. Click **Publish** in Anam Lab to make the persona live.

## Next Steps

Customize the widget's appearance and behavior. Listen for widget events.

# LiveKit Configuration

Source: https://anam.ai/docs/integrations/livekit/configuration

Configuration options, advanced examples, and API reference for the Anam LiveKit plugin

## Installation

```bash theme={"system"}
pip install livekit-plugins-anam
```

## Environment variables

| Service | Where to get it |
| ----------------- | ---------------------------------------------------- |
| **Anam** | [lab.anam.ai](https://lab.anam.ai) |
| **LiveKit** | [LiveKit Cloud](https://livekit.io) or self-hosted |
| **LLM providers** | Deepgram, ElevenLabs, OpenAI, Google AI Studio, etc.
| ```bash .env theme={"system"} ANAM_API_KEY=your_anam_api_key ANAM_AVATAR_ID=your_avatar_id LIVEKIT_URL=wss://your-project.livekit.cloud LIVEKIT_API_KEY=your_livekit_api_key LIVEKIT_API_SECRET=your_livekit_api_secret OPENAI_API_KEY=your_openai_api_key # or GEMINI_API_KEY=your_gemini_api_key ``` ## PersonaConfig Configure the avatar identity: ```python theme={"system"} persona_config = anam.PersonaConfig( name="Maya", # Display name for the avatar avatarId="uuid-here", # Avatar appearance ID ) ``` Display name for the avatar. Used in logs and debugging. UUID of the avatar to use. Get this from the [Avatar Gallery](/resources/avatar-gallery) or [Anam Lab](https://lab.anam.ai/avatars). ## AvatarSession ```python theme={"system"} avatar = anam.AvatarSession( persona_config=anam.PersonaConfig(...), api_key="your_api_key", api_url="https://api.anam.ai", # Optional ) ``` Configuration for the avatar's identity and appearance. Your Anam API key. Anam API endpoint. Override for staging or self-hosted deployments. ### start() Starts the avatar session and connects it to the LiveKit room. ```python theme={"system"} await avatar.start(session, room=ctx.room) ``` The LiveKit agent session to connect the avatar to. The LiveKit room instance from the job context. 
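Putting the pieces above together, a minimal agent entrypoint looks roughly like the following. This is a sketch, not the only arrangement: it assumes the LiveKit OpenAI plugin (`livekit-plugins-openai`) for the speech-to-speech pipeline, and any other supported STT/LLM/TTS stack can be substituted.

```python
import os

from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import anam, openai


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Voice pipeline: OpenAI Realtime handles listening, reasoning, and speech.
    session = AgentSession(
        llm=openai.realtime.RealtimeModel(voice="alloy"),
    )

    # Visual layer: the Anam avatar renders synchronized video for the
    # session's audio output.
    avatar = anam.AvatarSession(
        persona_config=anam.PersonaConfig(
            name="Maya",
            avatarId=os.getenv("ANAM_AVATAR_ID"),
        ),
        api_key=os.getenv("ANAM_API_KEY"),
    )

    # Start the avatar before the agent session so its video track is
    # published when the agent begins speaking.
    await avatar.start(session, room=ctx.room)
    await session.start(
        agent=Agent(instructions="You are a helpful assistant."),
        room=ctx.room,
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```

Running this requires a reachable LiveKit server plus valid `ANAM_API_KEY`, `ANAM_AVATAR_ID`, and `OPENAI_API_KEY` values in the environment.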
## Advanced examples ### Gemini with Vision Use Gemini Live for multimodal conversations with screen share analysis: ```python theme={"system"} import os from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli from livekit.agents.voice import VoiceActivityVideoSampler, room_io from livekit.plugins import anam, google async def entrypoint(ctx: JobContext): await ctx.connect() llm = google.realtime.RealtimeModel( model="gemini-2.0-flash-exp", api_key=os.getenv("GEMINI_API_KEY"), voice="Aoede", instructions="You are a helpful assistant that can see the user's screen.", ) avatar = anam.AvatarSession( persona_config=anam.PersonaConfig( name="Maya", avatarId=os.getenv("ANAM_AVATAR_ID"), ), api_key=os.getenv("ANAM_API_KEY"), ) session = AgentSession( llm=llm, video_sampler=VoiceActivityVideoSampler( speaking_fps=0.2, silent_fps=0.1, ), ) await avatar.start(session, room=ctx.room) await session.start( agent=Agent(instructions="Help the user with what you see on their screen."), room=ctx.room, room_input_options=room_io.RoomInputOptions(video_enabled=True), ) if __name__ == "__main__": cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint)) ``` ### Function tools Extend your agent with custom tools: ```python theme={"system"} from livekit.agents import function_tool @function_tool async def fill_form_field(field_name: str, value: str) -> str: """Fill in a form field on the user's screen. Args: field_name: The name of the field to fill value: The value to enter Returns: Confirmation message """ await send_command_to_frontend("fill_field", {"field": field_name, "value": value}) return "Field filled successfully" session = AgentSession( llm=llm, tools=[fill_form_field], ) ``` ## Running your agent ```bash theme={"system"} python agent.py dev ``` Connects to your LiveKit server and automatically joins rooms when participants connect. ```bash theme={"system"} python agent.py ``` Deploy using Docker, Kubernetes, or your preferred container platform. 
See the [LiveKit Agents deployment guide](https://docs.livekit.io/agents/deployment) for details.

## Troubleshooting

* Verify `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` are correct
* Check that your LiveKit server is accessible
* Ensure WebSocket connections aren't blocked by a firewall
* Test connectivity at [meet.livekit.io](https://meet.livekit.io)
* Verify your `ANAM_API_KEY` is valid
* Check that `ANAM_AVATAR_ID` matches an existing avatar
* Review agent logs for Anam connection errors
* Ensure the avatar session starts before the agent session
* Check your LLM API key is valid (OpenAI, Gemini, etc.)
* Verify microphone permissions in the browser
* Look for API errors in the agent logs
* Confirm the agent is receiving audio tracks
* Check your network connection stability
* Consider using LiveKit Cloud for optimized routing
* Reduce video sampling frequency if CPU-bound
* Monitor your LLM API response times

# LiveKit Integration

Source: https://anam.ai/docs/integrations/livekit/overview

Add Anam avatars to LiveKit agent applications

The Anam LiveKit plugin adds a visual avatar face to your LiveKit voice agents. Combine Anam's avatar technology with any STT, LLM, or TTS — including OpenAI Realtime, Gemini Live, or your own custom models.

## How it works

LiveKit uses a room-based architecture. Human users and AI agents both connect to rooms as participants. Anam plugs into this as a video layer:

```
User Input (Voice/Video)
        ↓
LiveKit Room (Real-time Communication)
        ↓
Your LLM (OpenAI, Gemini, Claude, etc.)
        ↓
Text Response → Anam Avatar (TTS + Video)
        ↓
User sees and hears the avatar
```

The Anam plugin listens to the audio being sent to users and generates a synchronized video stream of the avatar speaking. The video is published to the room as a separate track that clients display.

**Bring Your Own LLM**: Anam handles only the visual avatar.
You choose the ears, intelligence, and voice — whether that's DeepGram, ElevenLabs, Cartesia, OpenAI, Gemini, Claude, or a custom model.

## Demo

See the integration in action with our onboarding assistant demo:

![Anam LiveKit Demo - AI Onboarding Assistant](https://img.youtube.com/vi/9mxK5HdHzes/0.jpg)

## Use cases

The Anam + LiveKit combination is ideal for scenarios requiring voice interaction with visual presence:

* Guide new hires through forms and processes with screen share analysis. The AI sees what they see and provides contextual help.
* Help students with homework by seeing their work. The avatar can point out errors and explain concepts visually.
* See customer screens and provide step-by-step guidance with a friendly visual presence.
* Assist patients filling out medical forms with a calm, reassuring avatar presence.
* Guide users through account opening, KYC processes, and complex financial forms.

## Resources

* Build a LiveKit voice agent with an Anam avatar from scratch
* Add Gemini Vision to a LiveKit agent for screen share analysis
* Full source code for the onboarding assistant demo
* Official LiveKit documentation

# LiveKit Quickstart

Source: https://anam.ai/docs/integrations/livekit/quickstart

Add an Anam avatar to a LiveKit agent in minutes

This quickstart shows how to add an Anam avatar face to a LiveKit voice agent using OpenAI Realtime for the LLM.
## Prerequisites * [LiveKit CLI](https://docs.livekit.io/home/cli/cli-setup/) installed * A [LiveKit Cloud](https://cloud.livekit.io) account * An OpenAI API key * An Anam API key from [lab.anam.ai](https://lab.anam.ai) ## Set up the agent Clone the LiveKit Node.js agent starter and install dependencies: ```bash theme={"system"} git clone https://github.com/livekit-examples/agent-starter-node.git cd agent-starter-node pnpm install ``` Download the required model files (VAD and turn detection): ```bash theme={"system"} pnpm run download-files ``` Install the Anam plugin: ```bash theme={"system"} pnpm add @livekit/agents-plugin-anam ``` ## Configure credentials Create a `.env.local` file: ```bash .env.local theme={"system"} # LiveKit Cloud credentials (from cloud.livekit.io) LIVEKIT_URL=wss://your-project.livekit.cloud LIVEKIT_API_KEY=your_api_key LIVEKIT_API_SECRET=your_api_secret # OpenAI (for voice + LLM) OPENAI_API_KEY=your_openai_key # Anam (for avatar face) ANAM_API_KEY=your_anam_key ANAM_AVATAR_ID=edf6fdcb-acab-44b8-b974-ded72665ee26 ``` The avatar ID above is "Mia", one of Anam's stock avatars. Browse others in the [Avatar Gallery](/resources/avatar-gallery) or create your own at [lab.anam.ai/avatars](https://lab.anam.ai/avatars). ## Add the avatar to your agent Replace the contents of `src/agent.ts`: ```typescript theme={"system"} import { type JobContext, ServerOptions, cli, defineAgent, voice } from '@livekit/agents'; import * as anam from '@livekit/agents-plugin-anam'; import * as openai from '@livekit/agents-plugin-openai'; import { BackgroundVoiceCancellation } from '@livekit/noise-cancellation-node'; import dotenv from 'dotenv'; import { fileURLToPath } from 'node:url'; dotenv.config({ path: '.env.local' }); class Assistant extends voice.Agent { constructor() { super({ instructions: `You are a helpful voice AI assistant. You eagerly assist users with their questions. 
Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols. You are curious, friendly, and have a sense of humor.`, }); } } export default defineAgent({ entry: async (ctx: JobContext) => { await ctx.connect(); // Start the voice session with OpenAI Realtime const session = new voice.AgentSession({ llm: new openai.realtime.RealtimeModel({ voice: 'alloy' }), }); await session.start({ agent: new Assistant(), room: ctx.room, inputOptions: { noiseCancellation: BackgroundVoiceCancellation(), }, }); // Start the Anam avatar session const avatarId = process.env.ANAM_AVATAR_ID; if (!avatarId) { console.warn('ANAM_AVATAR_ID is not set. Avatar will not start.'); return; } const avatarSession = new anam.AvatarSession({ personaConfig: { name: 'Mia', avatarId, }, }); await avatarSession.start(session, ctx.room); console.log('Agent and avatar session started'); }, }); cli.runApp(new ServerOptions({ agent: fileURLToPath(import.meta.url) })); ``` ## Test locally ```bash theme={"system"} pnpm run dev ``` The agent connects to LiveKit Cloud and waits for rooms. You need a frontend to create a room and connect. ## Set up the frontend In a new terminal, create the React frontend: ```bash theme={"system"} lk app create --template agent-starter-react cd agent-starter-react pnpm install ``` Create a `.env.local` with your LiveKit credentials: ```bash .env.local theme={"system"} LIVEKIT_URL=wss://your-project.livekit.cloud LIVEKIT_API_KEY=your_api_key LIVEKIT_API_SECRET=your_api_secret ``` Start the dev server: ```bash theme={"system"} pnpm dev ``` Open `http://localhost:3000`, click connect, and the avatar appears as the agent speaks. ## Deploy to LiveKit Cloud ```bash theme={"system"} lk agent deploy --secrets-file=.env.local ``` This uploads your agent code and environment variables. The agent will now automatically join any rooms created in your project. 
## Next steps

* [Configuration](/integrations/livekit/configuration) — persona config, advanced examples, and API reference
* [Avatar Gallery](/resources/avatar-gallery) — browse stock avatars
* [Create a custom avatar](https://lab.anam.ai/avatars) — use your own face

# Other SDKs & Integrations

Source: https://anam.ai/docs/integrations/sdks

Official and community-built SDKs, plugins, and integrations for Anam

Beyond the [JavaScript SDK](/javascript-sdk/quickstart) and [Python SDK](/python-sdk/overview), Anam can be integrated through several partner and community-built SDKs.

## Official Integrations

* Build voice AI pipelines with Pipecat and render them with Anam avatars. Official plugin maintained by Anam.
* Use Anam avatars as the visual layer for Agora's Conversational AI platform.
* Add Anam avatars to VideoSDK voice agent pipelines. Works with both RealTimePipeline and CascadingPipeline.

## Community SDKs

These are built and maintained by the community. We can't guarantee their functionality or maintenance, but we're excited to see them.

* Kotlin Multiplatform SDK for Android, iOS, and desktop targets.
* Community-maintained Flutter SDK by Stu Kennedy for mobile applications.

## Built something?

If you've built an SDK, plugin, or integration for Anam, reach out at [info@anam.ai](mailto:info@anam.ai) and we'll add it here.

# Anam Documentation

Source: https://anam.ai/docs/introduction/overview

Explore our docs to start building with Anam

Anam builds real-time AI avatars: photorealistic digital humans that talk, listen, and perform actions based on your conversations. A persona is the combination of a face, a voice, an LLM, and a system prompt. You integrate a persona with your webpage and it holds live conversations with your users.

## Who builds with Anam

Teams use Anam to build interactive agents from scratch, or to put a face on agents they already have.
Common use cases include customer support, sales and lead qualification, language tutoring, skill training, and medical front-desk assistance. In each of these cases, talking with an Anam avatar feels closer to a face-to-face conversation than text or voice alone.

## How it works

Every live conversation with a persona runs through a four-stage pipeline:

1. Speech-to-text (STT) listens to the user
2. An LLM decides what to say back
3. Text-to-speech (TTS) turns the reply into audio
4. Face generation turns the audio into a live video of the persona speaking

The default setup, which we call Turnkey, runs the whole pipeline for you. You can also bring your own LLM, your own STT, your own TTS, or hand us pre-generated audio and have us run face generation only. Whichever setup matches your stack, the persona and the avatar stream stay the same.

## Three ways to start

* Read how personas work. Avatars, voices, LLMs, and how they fit together.
* Add an avatar to a site you already have. Widget, Player, or platforms like Framer and Shopify.
* Use the JavaScript or Python SDK to build your own. The pipeline and the UI are yours.

# Get your API Key

Source: https://anam.ai/docs/javascript-sdk/api-key

Learn how to create and manage your Anam API key from the Lab. Required to authenticate requests and integrate personas into your app.

Your API key is used to authenticate your requests to the Anam API and is required for integrating the Personas you create into your own applications. API keys are managed via the [Anam Lab](https://lab.anam.ai/).

### Create a new API key

From the [API keys page](https://lab.anam.ai/api-keys), click on the "Create API key" button. The label is only used to help you identify the API key. It is not used for authentication.

Click "Create" and your new API key will be shown to you. Remember to save it somewhere safe as you will not be able to access it later.
You will be prevented from closing the dialog until you have clicked the "Copy" button.

Anam encrypts and stores your API keys securely. For this reason, we are unable to recover lost API keys. If you lose your API key, you will need to create a new key via the Anam Lab.

# Authentication

Source: https://anam.ai/docs/javascript-sdk/authentication

Secure your API keys and manage session tokens

Anam uses a two-tier authentication system: API keys for server-side requests and session tokens for client connections.

## Tier 1: API Key

Your API key authenticates server-side requests to the Anam API. **Never expose your API key on the client side**. It should only exist in your server environment.

### Getting Your API Key

See the [API key page](/api-key) for details on how to get your API key from the Anam Lab.

## Tier 2: Session Tokens

Session tokens are temporary credentials (valid for 1 hour) that allow client applications to connect to Anam's streaming infrastructure without exposing your API key.

### How Session Tokens Work

1. Your server requests a session token from Anam using your API key and persona configuration
2. Anam generates a temporary token tied to your specific persona configuration
3. Your client uses the session token with the Anam SDK to establish a direct WebRTC connection
4. Once connected, the client can send messages and receive video/audio streams directly

### Creating Session Tokens

Below is a basic Express server that exposes an endpoint for creating session tokens.
```typescript server.ts theme={"system"} import express, { Request, Response } from "express"; interface PersonaConfig { name: string; avatarId: string; voiceId: string; llmId?: string; systemPrompt?: string; } interface SessionTokenResponse { sessionToken: string; } const app = express(); app.use(express.json()); app.post("/api/session-token", async (req: Request, res: Response) => { try { const response = await fetch("https://api.anam.ai/v1/auth/session-token", { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${process.env.ANAM_API_KEY}`, }, body: JSON.stringify({ personaConfig: { name: "Cara", avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18", voiceId: "6bfbe25a-979d-40f3-a92b-5394170af54b", llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466", systemPrompt: "You are a helpful assistant.", } satisfies PersonaConfig, }), }); if (!response.ok) { const errorData = await response.json(); console.error("Token creation failed:", errorData); return res.status(response.status).json({ error: "Token creation failed" }); } const { sessionToken }: SessionTokenResponse = await response.json(); res.json({ sessionToken }); } catch (error) { console.error("Network error:", error); res.status(500).json({ error: "Failed to create session" }); } }); app.listen(3000, () => console.log("Server running on port 3000")); ``` ```javascript server.js theme={"system"} const express = require("express"); const app = express(); app.use(express.json()); app.post("/api/session-token", async (req, res) => { try { const response = await fetch("https://api.anam.ai/v1/auth/session-token", { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${process.env.ANAM_API_KEY}`, }, body: JSON.stringify({ personaConfig: { name: "Cara", avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18", voiceId: "6bfbe25a-979d-40f3-a92b-5394170af54b", llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466", systemPrompt: "You are a helpful assistant.", }, }), }); if 
(!response.ok) { const errorData = await response.json(); console.error("Token creation failed:", errorData); return res.status(response.status).json({ error: "Token creation failed" }); } const { sessionToken } = await response.json(); res.json({ sessionToken }); } catch (error) { console.error("Network error:", error); res.status(500).json({ error: "Failed to create session" }); } }); app.listen(3000, () => console.log("Server running on port 3000")); ``` ### Using the Session Token (Client-Side) After your server creates a session token, your client fetches it and uses the Anam SDK to start streaming: ```typescript theme={"system"} import { createClient } from "@anam-ai/js-sdk"; async function startPersonaSession() { // Fetch token from your server const response = await fetch("/api/session-token", { method: "POST" }); const { sessionToken } = await response.json(); // Create client with the session token const anamClient = createClient(sessionToken); // Start streaming to a video element await anamClient.streamToVideoElement("persona-video"); } ``` ### Dynamic Persona Configuration Instead of using the same persona for all users, you can customize based on context: #### User-based Personalization ```javascript theme={"system"} app.post("/api/session-token", authenticateUser, async (req, res) => { const user = req.user; const personaConfig = { name: `Persona for user: ${user.id}`, avatarId: user.preferredAvatar || defaultAvatarId, voiceId: user.preferredVoice || defaultVoiceId, llmId: user.preferredllmId || "0934d97d-0c3a-4f33-91b0-5e136a0ef466", systemPrompt: buildPersonalizedPrompt(user), }; const sessionToken = await fetchAnamSessionToken(personaConfig); res.json({ sessionToken }); }); ``` #### Context-aware Sessions ```javascript theme={"system"} app.post("/api/session-token", authenticateUser, async (req, res) => { const { context, metadata } = req.body; let personaConfig; switch (context) { case "customer-support": personaConfig = 
buildSupportPersona(metadata); break; case "sales": personaConfig = buildSalesPersona(metadata); break; case "training": personaConfig = buildTrainingPersona(metadata); break; default: personaConfig = defaultPersonaConfig; } const sessionToken = await fetchAnamSessionToken(personaConfig); res.json({ sessionToken }); }); ``` ## Environment Setup Store your API key securely: ```bash .env theme={"system"} ANAM_API_KEY=your-api-key-here NODE_ENV=production ``` ```javascript config.js theme={"system"} const config = { anamApiKey: process.env.ANAM_API_KEY, anamApiUrl: process.env.ANAM_API_URL || "https://api.anam.ai", }; if (!config.anamApiKey) { throw new Error("ANAM_API_KEY environment variable is required"); } module.exports = config; ``` ## Next Steps # Basic application Source: https://anam.ai/docs/javascript-sdk/examples/basic-app Build a complete web application with an Anam persona This guide creates a minimal but complete web application with an interactive AI persona. You will build a Node.js server that handles authentication and a client that streams the persona to a video element. Looking for a faster start? The [Quickstart](/javascript-sdk/quickstart) guide gets you running in a single HTML file without a server. 
## Prerequisites

* **Node.js** (version 16 or higher) and **npm** installed on your system
* Basic knowledge of JavaScript
* An Anam API key ([get one here](/api-key))
* A microphone and speakers for voice interaction

## Project Setup

This example creates a web application with three main files:

```
my-anam-app/
├── server.js          # Express server for secure API key handling
├── package.json       # Node.js dependencies
├── public/            # Static files served to the browser
│   ├── index.html     # Main HTML page with video element
│   └── script.js      # Client-side JavaScript for persona control
└── .env               # Environment variables (optional)
```

```bash theme={"system"}
mkdir my-anam-app
cd my-anam-app
```

```bash theme={"system"}
npm init -y
```

```bash theme={"system"}
mkdir public
```

```bash theme={"system"}
npm install express dotenv
```

Create a `.env` file in your project root to store your API key securely:

```bash .env theme={"system"}
ANAM_API_KEY=your-api-key-here
```

Replace `your-api-key-here` with your actual Anam API key. Never commit this file to version control.

## Step 1: Set up your server

Create a basic Express server to handle session token generation. In a production application, integrate this into your existing backend service.

```javascript server.js theme={"system"}
require("dotenv").config();
const express = require("express");

const app = express();
app.use(express.json());
app.use(express.static("public"));

app.post("/api/session-token", async (req, res) => {
  try {
    const response = await fetch("https://api.anam.ai/v1/auth/session-token", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      },
      body: JSON.stringify({
        personaConfig: {
          name: "Cara",
          avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18",
          voiceId: "6bfbe25a-979d-40f3-a92b-5394170af54b",
          llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466",
          systemPrompt:
            "You are Cara, a helpful customer service representative. Be friendly and concise in your responses.",
        },
      }),
    });

    if (!response.ok) {
      const errorData = await response.json().catch(() => ({}));
      console.error("Anam API error:", response.status, errorData);
      return res.status(response.status).json({
        error: errorData.message || "Failed to create session token",
      });
    }

    const data = await response.json();
    res.json({ sessionToken: data.sessionToken });
  } catch (error) {
    console.error("Server error:", error);
    res.status(500).json({ error: "Internal server error" });
  }
});

app.listen(3000, () => {
  console.log("Server running on http://localhost:3000");
});
```

The server exchanges your API key for a temporary session token. This token has limited scope and expires, so your API key stays secure on the server.

## Step 2: Set up your HTML

Create an HTML page with a video element for the persona and controls to start/stop the chat:

```html public/index.html theme={"system"}
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>My First Anam Persona</title>
  </head>
  <body>
    <h1>Chat with Cara</h1>
    <video id="persona-video" autoplay playsinline></video>
    <div>
      <button id="start-button">Start Chat</button>
      <button id="stop-button" disabled>Stop Chat</button>
    </div>
    <p id="status"></p>
    <script type="module" src="script.js"></script>
  </body>
</html>
``` ## Step 3: Initialize the Anam client Create the client-side JavaScript to control your persona connection: ```javascript public/script.js theme={"system"} import { createClient } from "https://esm.sh/@anam-ai/js-sdk@latest"; let anamClient = null; const startButton = document.getElementById("start-button"); const stopButton = document.getElementById("stop-button"); const videoElement = document.getElementById("persona-video"); const statusElement = document.getElementById("status"); function setStatus(message) { statusElement.textContent = message; } async function startChat() { try { startButton.disabled = true; setStatus("Creating session..."); // Get session token from your server const response = await fetch("/api/session-token", { method: "POST", }); if (!response.ok) { const error = await response.json(); throw new Error(error.error || "Failed to get session token"); } const { sessionToken } = await response.json(); setStatus("Connecting..."); // Create the Anam client anamClient = createClient(sessionToken); // Start streaming to the video element await anamClient.streamToVideoElement("persona-video"); startButton.disabled = true; stopButton.disabled = false; setStatus("Connected - start speaking!"); } catch (error) { console.error("Failed to start chat:", error); setStatus(`Error: ${error.message}`); startButton.disabled = false; } } function stopChat() { if (anamClient) { anamClient.stopStreaming(); anamClient = null; videoElement.srcObject = null; startButton.disabled = false; stopButton.disabled = true; setStatus("Chat ended"); } } startButton.addEventListener("click", startChat); stopButton.addEventListener("click", stopChat); ``` For details on `createClient`, `streamToVideoElement`, and `stopStreaming`, see the [SDK Reference](/javascript-sdk/reference/basic-usage). ## Step 4: Run your application 1. Start your server: ```bash theme={"system"} node server.js ``` 2. Open [http://localhost:3000](http://localhost:3000) in your browser 3. 
Click "Start Chat" to begin your conversation with Cara You should see Cara appear in the video element, ready to chat through voice interaction. ## How it works 1. **Server-side authentication**: Your server exchanges the API key for a session token, keeping credentials secure 2. **Client connection**: The SDK creates a WebRTC connection for real-time video streaming 3. **Voice interaction**: Cara listens for your voice input and responds with synchronized audio and video 4. **Connection control**: Start and stop buttons control when the persona is active ## Next steps Full tutorial for building a Next.js app with the Anam SDK Learn how personas, tokens, and streaming work React to conversation events and user interactions Deploy your persona securely at scale ## Common issues **Persona not appearing?** * Check that your API key is set correctly in `.env` * Ensure the video element has `autoplay` and `playsinline` attributes * Check the browser console for errors **No audio?** * Make sure your browser allows autoplay with sound * The user must interact with the page first (clicking "Start Chat" satisfies this) **Connection issues?** * Verify your server can reach api.anam.ai * Check network connectivity and firewall settings # Custom LLM (client-side) Source: https://anam.ai/docs/javascript-sdk/examples/custom-llm Build your own AI conversation logic with OpenAI, Anthropic, and other language models Learn how to bypass Anam's built-in language models and integrate your own custom LLM for complete control over conversation logic. This guide uses OpenAI as an example, but the pattern works with any LLM provider (Anthropic, Google Gemini, Groq, Mistral, etc.). Step-by-step tutorial with full source code **New Feature**: Anam now supports [server-side custom LLMs](/concepts/custom-llms) where we handle the LLM calls for you, improving latency and simplifying development. This guide shows the client-side approach where you manage the LLM calls yourself. 
## What You'll Build By the end of this guide, you'll have a persona application featuring: * **Custom AI Brain** using your own language model (OpenAI GPT-4.1-mini) * **Streaming Responses** with real-time text-to-speech conversion * **Turn-taking Management** that handles conversation flow * **Message History Integration** that maintains conversation context * **Error Handling & Recovery** for production use After completing the initial setup (Steps 1-4), you can extend this foundation by adding features like conversation memory, different LLM providers, custom system prompts, or specialized AI behaviors. This guide uses **OpenAI's GPT-4.1-mini** as an example custom LLM for demonstration purposes. In your actual application, you would replace the OpenAI integration with calls to your specific LLM provider. The core integration pattern remains the same regardless of your LLM choice. ## Prerequisites * **Node.js** (version 18 or higher) and **npm** installed * Understanding of modern JavaScript/TypeScript and streaming APIs * An Anam API key ([get one here](/api-key)) * An OpenAI API key ([get one here](https://platform.openai.com/api-keys)) * Basic knowledge of Express.js and modern web development * A microphone and speakers for voice interaction ## Understanding the Custom LLM Flow Before diving into the implementation, here is how custom LLM integration works with Anam personas. Regardless of your custom LLM provider, the implementation pattern follows these steps: The `llmId: "CUSTOMER_CLIENT_V1"` setting in the session token request disables Anam's default AI, allowing you to handle all conversation logic. The `MESSAGE_HISTORY_UPDATED` event fires when the user finishes speaking, providing the complete conversation history including the new user message. Your server endpoint receives the conversation history and generates a streaming response using your chosen LLM (OpenAI in this example). 
The LLM response is streamed back to the client and forwarded to the persona using `createTalkMessageStream()` for text-to-speech conversion. Using these core concepts, we'll build a simple web application that allows you to chat with your custom LLM-powered persona.

## Basic Setup

Let's start by building the foundation with custom LLM integration. This setup creates a web application with four main components:

```
anam-custom-llm-app/
├── server.js          # Express server with streaming LLM endpoint
├── package.json       # Node.js dependencies
├── public/            # Static files served to the browser
│   ├── index.html     # Main HTML page with video element
│   └── script.js      # Client-side JavaScript for persona control
└── .env               # Environment variables
```

```bash theme={"system"}
mkdir anam-custom-llm-app
cd anam-custom-llm-app
```

```bash theme={"system"}
npm init -y
```

This creates a `package.json` file for managing dependencies.

```bash theme={"system"}
mkdir public
```

The `public` folder will contain your HTML and JavaScript files that are served to the browser.

```bash theme={"system"}
npm install express dotenv openai
```

We're installing Express for the server, dotenv for environment variables, and the OpenAI SDK for custom LLM integration. The Anam SDK will be loaded directly from a CDN in the browser.

Create a `.env` file in your project root to store your API keys securely:

```bash .env theme={"system"}
ANAM_API_KEY=your-anam-api-key-here
OPENAI_API_KEY=your-openai-api-key-here
```

Replace the placeholder values with your actual API keys. Never commit this file to version control.
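One implementation detail worth noting before building the server: the `/api/chat-stream` endpoint in this guide emits each LLM chunk as a JSON object followed by a newline, and network reads on the client won't necessarily align with those line boundaries. A small buffering parser handles the reassembly. This is a sketch; `createChunkParser` is an illustrative helper name, not part of the Anam SDK:

```javascript
// Returns a stateful parse function that accepts raw text from the network
// and yields the `content` of every complete JSON line received so far.
// The server writes: JSON.stringify({ content }) + "\n" per chunk.
function createChunkParser() {
  let buffer = "";
  return function parse(raw) {
    buffer += raw;
    const lines = buffer.split("\n");
    // The last element is either "" (the data ended on a newline) or an
    // incomplete line; keep it buffered until more data arrives.
    buffer = lines.pop();
    const contents = [];
    for (const line of lines) {
      if (line.trim()) {
        contents.push(JSON.parse(line).content);
      }
    }
    return contents;
  };
}
```

Each complete string returned can be forwarded to the persona as it arrives (via the SDK's `createTalkMessageStream()` mentioned above), so speech can begin before the full LLM response has finished streaming.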
### Step 1: Set up your server with LLM streaming

Create an Express server that handles both session token generation and LLM streaming:

```javascript server.js theme={"system"}
require('dotenv').config();
const express = require('express');
const OpenAI = require('openai');

const app = express();

// Initialize OpenAI client
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

app.use(express.json());
app.use(express.static('public'));

// Session token endpoint with custom brain configuration
app.post('/api/session-token', async (req, res) => {
  try {
    const response = await fetch('https://api.anam.ai/v1/auth/session-token', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      },
      body: JSON.stringify({
        personaConfig: {
          name: 'Cara',
          avatarId: '30fa96d0-26c4-4e55-94a0-517025942e18',
          voiceId: '6bfbe25a-979d-40f3-a92b-5394170af54b',
          // This disables Anam's default brain and enables custom LLM integration
          llmId: 'CUSTOMER_CLIENT_V1',
        },
      }),
    });

    const data = await response.json();
    res.json({ sessionToken: data.sessionToken });
  } catch (error) {
    console.error('Session token error:', error);
    res.status(500).json({ error: 'Failed to create session' });
  }
});

// Custom LLM streaming endpoint
app.post('/api/chat-stream', async (req, res) => {
  try {
    const { messages } = req.body;

    // Create a streaming response from OpenAI
    const stream = await openai.chat.completions.create({
      model: 'gpt-4.1-mini',
      messages: [
        {
          role: 'system',
          content:
            'You are Cara, a helpful AI assistant. Be friendly, concise, and conversational in your responses. Keep responses under 100 words unless specifically asked for detailed information.',
        },
        ...messages,
      ],
      stream: true,
      temperature: 0.7,
    });

    // Set headers for streaming response
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');

    // Process the OpenAI stream and forward to client
    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        // Send each chunk as JSON
        res.write(JSON.stringify({ content }) + '\n');
      }
    }

    res.end();
  } catch (error) {
    console.error('LLM streaming error:', error);
    res.status(500).json({ error: 'An error occurred while streaming response' });
  }
});

app.listen(8000, () => {
  console.log('Server running on http://localhost:8000');
  console.log('Custom LLM integration ready!');
});
```

The key difference here is setting `llmId: "CUSTOMER_CLIENT_V1"`, which disables Anam's default AI and enables custom LLM integration. The `/api/chat-stream` endpoint handles the actual AI conversation logic.

### Step 2: Set up your HTML

Create a simple HTML page with a video element and conversation display:

```html public/index.html theme={"system"}
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Custom LLM Persona - Anam Integration</title>
  </head>
  <body>
    <h1>Custom LLM Persona</h1>
    <p id="status">Ready to connect</p>
    <video id="persona-video" autoplay playsinline></video>
    <div>
      <button id="start-button">Start Chat</button>
      <button id="stop-button" disabled>Stop Chat</button>
    </div>
    <h2>Conversation</h2>
    <div id="chat-history">Start a conversation to see your chat history...</div>
    <script type="module" src="script.js"></script>
  </body>
</html>
``` ### Step 3: Implement the client-side custom LLM integration Create the client-side JavaScript that handles the custom LLM integration: ```javascript public/script.js theme={"system"} import { createClient } from 'https://esm.sh/@anam-ai/js-sdk@latest'; import { AnamEvent } from 'https://esm.sh/@anam-ai/js-sdk@latest/dist/module/types'; let anamClient = null; // Get DOM elements const startButton = document.getElementById('start-button'); const stopButton = document.getElementById('stop-button'); const videoElement = document.getElementById('persona-video'); const statusElement = document.getElementById('status'); const chatHistory = document.getElementById('chat-history'); // Status management function updateStatus(message, type = 'normal') { statusElement.textContent = message; const colors = { loading: '#f39c12', connected: '#28a745', error: '#dc3545', normal: '#333', }; statusElement.style.color = colors[type] || colors.normal; } // Chat history management function updateChatHistory(messages) { if (!chatHistory) return; chatHistory.innerHTML = ''; if (messages.length === 0) { chatHistory.innerHTML = '
Start a conversation to see your chat history...
'; return; } messages.forEach((message) => { const messageDiv = document.createElement('div'); const isUser = message.role === 'user'; messageDiv.style.cssText = ` margin-bottom: 10px; padding: 8px 12px; border-radius: 8px; max-width: 85%; background: ${isUser ? '#e3f2fd' : '#f1f8e9'}; ${isUser ? 'margin-left: auto; text-align: right;' : ''} `; messageDiv.innerHTML = `${isUser ? 'You' : 'Cara'}: ${message.content}`; chatHistory.appendChild(messageDiv); }); // Scroll to bottom chatHistory.scrollTop = chatHistory.scrollHeight; } // Custom LLM response handler async function handleUserMessage(messageHistory) { // Only respond to user messages if (messageHistory.length === 0 || messageHistory[messageHistory.length - 1].role !== 'user') { return; } if (!anamClient) return; try { console.log('Getting custom LLM response for:', messageHistory); // Convert Anam message format to OpenAI format const openAIMessages = messageHistory.map((msg) => ({ role: msg.role === 'user' ? 'user' : 'assistant', content: msg.content, })); // Create a streaming talk session // You can optionally pass a correlationId to track this specific message stream const talkStream = anamClient.createTalkMessageStream(); // Call our custom LLM streaming endpoint const response = await fetch('/api/chat-stream', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ messages: openAIMessages }), }); if (!response.ok) { throw new Error(`LLM request failed: ${response.status}`); } const reader = response.body?.getReader(); if (!reader) { throw new Error('Failed to get response stream reader'); } const textDecoder = new TextDecoder(); console.log('Streaming LLM response to persona...'); // Stream the response chunks to the persona while (true) { const { done, value } = await reader.read(); if (done) { console.log('LLM streaming complete'); if (talkStream.isActive()) { talkStream.endMessage(); } break; } if (value) { const text = textDecoder.decode(value); const lines = 
text.split('\n').filter((line) => line.trim()); for (const line of lines) { try { const data = JSON.parse(line); if (data.content && talkStream.isActive()) { talkStream.streamMessageChunk(data.content, false); } } catch (parseError) { // Ignore parse errors in streaming } } } } } catch (error) { console.error('Custom LLM error:', error); if (anamClient) { anamClient.talk( "I'm sorry, I encountered an error while processing your request. Please try again." ); } } } async function startConversation() { try { startButton.disabled = true; updateStatus('Connecting...', 'loading'); // Get session token from server const response = await fetch('/api/session-token', { method: 'POST', }); if (!response.ok) { throw new Error('Failed to get session token'); } const { sessionToken } = await response.json(); // Create Anam client anamClient = createClient(sessionToken); // Set up event listeners anamClient.addListener(AnamEvent.SESSION_READY, () => { console.log('Session ready!'); updateStatus('Connected - Custom LLM active', 'connected'); startButton.disabled = true; stopButton.disabled = false; // Send initial greeting anamClient.talk("Hello! I'm Cara, powered by a custom AI brain. 
How can I help you today?"); }); anamClient.addListener(AnamEvent.CONNECTION_CLOSED, () => { console.log('Connection closed'); stopConversation(); }); // This is the key event for custom LLM integration anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, handleUserMessage); // Update chat history in real-time anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, (messages) => { updateChatHistory(messages); }); // Handle stream interruptions (user interrupted the persona while speaking) anamClient.addListener(AnamEvent.TALK_STREAM_INTERRUPTED, () => { console.log('Talk stream interrupted by user'); }); // Start streaming to video element await anamClient.streamToVideoElement('persona-video'); console.log('Custom LLM persona started successfully!'); } catch (error) { console.error('Failed to start conversation:', error); updateStatus(`Error: ${error.message}`, 'error'); startButton.disabled = false; } } function stopConversation() { if (anamClient) { anamClient.stopStreaming(); anamClient = null; } // Reset UI videoElement.srcObject = null; updateChatHistory([]); updateStatus('Disconnected', 'normal'); startButton.disabled = false; stopButton.disabled = true; console.log('Conversation stopped'); } // Add event listeners startButton.addEventListener('click', startConversation); stopButton.addEventListener('click', stopConversation); // Cleanup on page unload window.addEventListener('beforeunload', stopConversation); ``` ### Step 4: Test your custom LLM integration 1. Start your server: ```bash theme={"system"} node server.js ``` 2. Open [http://localhost:8000](http://localhost:8000) in your browser 3. Click "Start Conversation" to begin chatting with your custom LLM-powered persona! You should see Cara appear and greet you, powered by your custom OpenAI integration. Try having a conversation - your voice will be transcribed, sent to OpenAI's GPT-4.1-mini, and the response will be streamed back through the persona's voice and video. 
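One subtlety of the client-side read loop: `fetch` delivers the response body in arbitrary byte chunks, so a newline-delimited JSON line can arrive split across two reads, and the inline `try/catch` silently drops those fragments. A small sketch of a carry-buffer parser that avoids this, assuming the same one-JSON-object-per-line format the server emits (`createNdjsonParser` is an illustrative helper, not part of the Anam SDK):

```javascript
// Accumulates partial NDJSON lines across reads so a JSON object split
// between two network chunks is still parsed once it completes.
function createNdjsonParser(onObject) {
  let buffer = '';
  return {
    push(text) {
      buffer += text;
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep the trailing partial line for the next read
      for (const line of lines) {
        if (line.trim()) onObject(JSON.parse(line));
      }
    },
    flush() {
      if (buffer.trim()) onObject(JSON.parse(buffer));
      buffer = '';
    },
  };
}

// Example: a JSON line arriving split across two chunks is still parsed whole.
const received = [];
const parser = createNdjsonParser((obj) => received.push(obj.content));
parser.push('{"content":"Hel');
parser.push('lo"}\n{"content":" world"}\n');
parser.flush();
console.log(received); // received is ['Hello', ' world']
```

In the read loop you would call `parser.push(textDecoder.decode(value))` per chunk and `parser.flush()` when `done` is true, forwarding each parsed object to `talkStream.streamMessageChunk()` as before.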
## Advanced Features

### Enhanced Error Handling

Add retry logic to improve reliability:

```javascript theme={"system"}
// Add this to your script.js handleUserMessage function
async function handleUserMessage(messageHistory) {
  if (messageHistory.length === 0 || messageHistory[messageHistory.length - 1].role !== 'user') {
    return;
  }

  if (!anamClient) return;

  const maxRetries = 3;
  let retryCount = 0;

  while (retryCount < maxRetries) {
    try {
      // ... existing LLM call code ...
      return; // Success, exit retry loop
    } catch (error) {
      retryCount++;
      console.error(`Custom LLM error (attempt ${retryCount}):`, error);

      if (retryCount >= maxRetries) {
        // Final fallback response
        if (anamClient) {
          anamClient.talk(
            "I'm experiencing some technical difficulties. Please try rephrasing your question or try again in a moment."
          );
        }
      } else {
        // Wait before retry
        await new Promise((resolve) => setTimeout(resolve, 1000 * retryCount));
      }
    }
  }
}
```

## What You've Built

You've integrated a custom language model with Anam's persona system. Your application includes:

* **Custom AI Brain**: Control over your persona's intelligence using OpenAI's GPT-4.1-mini, with the ability to customize personality, knowledge, and behavior.
* **Real-time Streaming**: Responses stream from your LLM through the persona's voice.
* **Conversation Context**: Full conversation history is maintained and provided to your LLM for contextually aware responses.
* **Error Handling**: Retry logic and fallback responses for reliability.
* **Extensible Architecture**: The modular design allows you to swap LLM providers, add custom logic, or integrate with other AI services.
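The "swap LLM providers" point can be made concrete by hiding each provider behind a single async-generator contract, so the streaming endpoint never touches provider-specific types. This is a sketch of that design, not part of the Anam SDK; `mockProvider` stands in for a real adapter you would write per provider:

```javascript
// Each provider is exposed as an async generator yielding plain text
// chunks, so the endpoint logic stays provider-agnostic.
async function* mockProvider(messages) {
  // Stand-in for a real adapter (OpenAI, Anthropic, a local model, ...)
  // that maps the provider's stream format down to bare strings.
  const reply = `You said: ${messages[messages.length - 1].content}`;
  for (const word of reply.split(' ')) {
    yield word + ' ';
  }
}

// The endpoint depends only on the generator contract, so swapping
// providers means swapping the generator passed in, nothing else.
async function streamChat(provider, messages, write) {
  for await (const chunk of provider(messages)) {
    write(JSON.stringify({ content: chunk }) + '\n');
  }
}

// Usage: the call shape is identical regardless of which provider backs it.
(async () => {
  const out = [];
  await streamChat(mockProvider, [{ role: 'user', content: 'hi' }], (s) => out.push(s));
  console.log(out.map((line) => JSON.parse(line).content).join('')); // prints "You said: hi "
})();
```

In the Express handler from Step 1, `write` would be `(s) => res.write(s)`, and choosing a provider becomes a configuration decision rather than an endpoint rewrite.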
## Troubleshooting

**Symptoms**: Persona doesn't speak or responses are delayed

**Solutions**:

* Verify OpenAI API key is correctly configured
* Check that `llmId: "CUSTOMER_CLIENT_V1"` is set in the session token request
* Ensure the `MESSAGE_HISTORY_UPDATED` event listener is properly connected
* Check the browser console for JavaScript errors
* Verify the `/api/chat-stream` endpoint is responding correctly

**Symptoms**: Slow or choppy persona responses

**Solutions**:

* Optimize LLM model parameters (reduce `max_tokens`, adjust temperature)
* Implement response caching for common queries
* Use faster models like `gpt-4.1-mini` instead of `gpt-4`
* Consider chunking large responses for better streaming
* Monitor network latency and server performance

***

# Custom TTS (client-side)

Source: https://anam.ai/docs/javascript-sdk/examples/custom-tts

Use your own text-to-speech provider with Anam avatars via audio passthrough mode.

**Beta Feature**: Audio passthrough mode is currently in beta. APIs may change as we continue to improve the integration.

**Want to use ElevenLabs Agents with Anam?** We recommend the [server-side ElevenLabs integration](https://anam.ai/cookbook/elevenlabs-server-side-agents) instead; it's simpler and has lower latency. This page covers the client-side approach for when you need direct control over the audio pipeline.

This guide shows how to use Anam's **audio passthrough** mode to pipe externally-generated speech audio into an avatar for real-time lip-sync. The example below uses [ElevenLabs Conversational AI](https://elevenlabs.io/conversational-ai) as the TTS source, but the same pattern works with **any TTS provider** (Cartesia, PlayHT, Azure Speech, Google Cloud TTS, etc.); you just need to deliver PCM audio chunks to the Anam SDK.

Your TTS must generate audio *above* realtime speed.
If your TTS provider streams audio slower than 1x realtime, you will experience **stutter and frame drops** because Anam needs extra time to buffer and render the lip-sync animation. Most cloud TTS providers stream well above realtime, but verify this before going to production.
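The "above realtime" requirement can be checked numerically: for raw PCM, the realtime byte rate is `sampleRate × channels × bytesPerSample`, and your provider's measured throughput must exceed it. A quick sketch of that check; the 16 kHz mono 16-bit figures are an example format for illustration, not a statement of what Anam requires:

```javascript
// Realtime byte rate for raw PCM audio.
function pcmBytesPerSecond(sampleRate, channels, bytesPerSample) {
  return sampleRate * channels * bytesPerSample;
}

// Ratio > 1 means the provider delivers audio faster than realtime;
// measure bytesReceived/elapsedSeconds against the format's byte rate.
function realtimeFactor(bytesReceived, elapsedSeconds, format) {
  const required = pcmBytesPerSecond(format.sampleRate, format.channels, format.bytesPerSample);
  return bytesReceived / elapsedSeconds / required;
}

// Example: 16 kHz mono 16-bit PCM needs 32,000 bytes/sec to keep up.
const format = { sampleRate: 16000, channels: 1, bytesPerSample: 2 };
console.log(pcmBytesPerSecond(16000, 1, 2)); // 32000

// 96,000 bytes received in 1.5 s is 2x realtime: comfortable headroom.
console.log(realtimeFactor(96000, 1.5, format)); // 2
```

Logging this factor during a staging run against your TTS provider is a cheap way to catch sub-realtime streaming before it shows up as stutter in production.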