# create avatar
Source: https://anam.ai/docs/api-reference/avatars/create-avatar
https://api.anam.ai/swagger.json post /v1/avatars
Create a new one-shot avatar from an image file or image URL. Send either multipart/form-data with an image file, or JSON with an image URL.
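A minimal sketch of the JSON variant, built without sending anything. The `imageUrl` and `name` field names are assumptions for illustration; confirm them against the endpoint schema before use.

```python
import json

def build_create_avatar_request(api_key: str, image_url: str, name: str):
    """Build headers and a JSON body for POST /v1/avatars.

    The imageUrl/name field names are assumptions; check the
    endpoint schema before relying on them.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"imageUrl": image_url, "name": name})
    return headers, body

headers, body = build_create_avatar_request(
    "sk-test", "https://example.com/face.png", "Demo avatar"
)
```

For the multipart variant, send the image file itself instead of a URL, with a `multipart/form-data` content type.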
# delete avatar
Source: https://anam.ai/docs/api-reference/avatars/delete-avatar
https://api.anam.ai/swagger.json delete /v1/avatars/{id}
Delete an avatar by ID
# get avatar
Source: https://anam.ai/docs/api-reference/avatars/get-avatar
https://api.anam.ai/swagger.json get /v1/avatars/{id}
Returns an avatar by ID
# list avatars
Source: https://anam.ai/docs/api-reference/avatars/list-avatars
https://api.anam.ai/swagger.json get /v1/avatars
Returns a list of all avatars
# update avatar
Source: https://anam.ai/docs/api-reference/avatars/update-avatar
https://api.anam.ai/swagger.json put /v1/avatars/{id}
Update an avatar by ID (only display name can be updated)
# create knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/create-knowledge-group
https://api.anam.ai/swagger.json post /v1/knowledge/groups
Create a new knowledge group
# delete knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/delete-knowledge-document
https://api.anam.ai/swagger.json delete /v1/knowledge/documents/{id}
Delete a document from a RAG group
# delete knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/delete-knowledge-group
https://api.anam.ai/swagger.json delete /v1/knowledge/groups/{id}
Delete a RAG group
# get knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-document
https://api.anam.ai/swagger.json get /v1/knowledge/documents/{id}
Get a single document by ID
# get knowledge document download
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-document-download
https://api.anam.ai/swagger.json get /v1/knowledge/documents/{id}/download
Get a presigned download URL for a knowledge document
# get knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/get-knowledge-group
https://api.anam.ai/swagger.json get /v1/knowledge/groups/{id}
Get a single RAG group by ID
# list knowledge group documents
Source: https://anam.ai/docs/api-reference/knowledge/list-knowledge-group-documents
https://api.anam.ai/swagger.json get /v1/knowledge/groups/{id}/documents
Get all documents in a RAG group
# list knowledge groups
Source: https://anam.ai/docs/api-reference/knowledge/list-knowledge-groups
https://api.anam.ai/swagger.json get /v1/knowledge/groups
Returns a list of all knowledge groups for the organization
# search knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/search-knowledge-group
https://api.anam.ai/swagger.json post /v1/knowledge/groups/{id}/search
Search for similar content in a RAG group using vector similarity
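A sketch of how a search request might be assembled (prepared but not sent here). The `query` body field is an assumption for illustration; confirm it against the endpoint schema.

```python
import json
import urllib.request

def build_search_request(api_key: str, group_id: str, query: str):
    """Prepare a POST /v1/knowledge/groups/{id}/search request.

    The `query` body field name is an assumption; check the
    endpoint schema before use.
    """
    url = f"https://api.anam.ai/v1/knowledge/groups/{group_id}/search"
    return urllib.request.Request(
        url,
        data=json.dumps({"query": query}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_search_request("sk-test", "grp_123", "refund policy")
```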
# update knowledge document
Source: https://anam.ai/docs/api-reference/knowledge/update-knowledge-document
https://api.anam.ai/swagger.json put /v1/knowledge/documents/{id}
Update a document (rename)
# update knowledge group
Source: https://anam.ai/docs/api-reference/knowledge/update-knowledge-group
https://api.anam.ai/swagger.json put /v1/knowledge/groups/{id}
Update a RAG group
# upload knowledge group document
Source: https://anam.ai/docs/api-reference/knowledge/upload-knowledge-group-document
https://api.anam.ai/swagger.json post /v1/knowledge/groups/{id}/documents
Upload a document to a RAG group (supports PDF, TXT, MD, DOCX, and CSV files up to 50 MB). Authentication can be via API key (Bearer token) or upload token (`X-Upload-Token` header).
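The two documented auth styles and the file constraints can be mirrored client-side before uploading. This is a sketch of pre-flight checks, not the upload itself:

```python
import os

# Limits taken from the endpoint description above.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".md", ".docx", ".csv"}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB

def upload_auth_header(api_key=None, upload_token=None):
    """Pick one of the two documented auth styles for document upload."""
    if api_key:
        return {"Authorization": f"Bearer {api_key}"}
    if upload_token:
        return {"X-Upload-Token": upload_token}
    raise ValueError("provide api_key or upload_token")

def check_upload(filename: str, size_bytes: int) -> bool:
    """Client-side sanity check mirroring the documented limits."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in ALLOWED_EXTENSIONS and size_bytes <= MAX_BYTES
```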
# create llm
Source: https://anam.ai/docs/api-reference/llms/create-llm
https://api.anam.ai/swagger.json post /v1/llms
Create a new LLM configuration
# delete llm
Source: https://anam.ai/docs/api-reference/llms/delete-llm
https://api.anam.ai/swagger.json delete /v1/llms/{id}
Delete an LLM configuration
# get llm
Source: https://anam.ai/docs/api-reference/llms/get-llm
https://api.anam.ai/swagger.json get /v1/llms/{id}
Get a specific LLM by ID
# list llms
Source: https://anam.ai/docs/api-reference/llms/list-llms
https://api.anam.ai/swagger.json get /v1/llms
Returns a list of all LLMs available to the organization
# update llm
Source: https://anam.ai/docs/api-reference/llms/update-llm
https://api.anam.ai/swagger.json put /v1/llms/{id}
Update an LLM configuration
# Introduction
Source: https://anam.ai/docs/api-reference/overview
Quick start for calling the Anam REST API directly.
All endpoints are served under:
```
https://api.anam.ai/v1
```
## Get an API key
Create a key from the [API keys page](https://lab.anam.ai/api-keys) in the Lab. See [Get your API key](/javascript-sdk/api-key) for the full walkthrough. Keep the key on your server. Never ship it to a browser or mobile app.
## Authenticate requests
Every request takes a bearer token in the `Authorization` header:
```bash
curl https://api.anam.ai/v1/personas \
-H "Authorization: Bearer $ANAM_API_KEY"
```
## Connecting clients
Clients don't use your API key. Your backend creates a short-lived session token with [`POST /v1/auth/session-token`](/api-reference/create-session-token) and hands it to the client, which uses it to open a WebRTC stream. See [Authentication](/javascript-sdk/authentication) for a worked server example.
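A server-side sketch of where the API key goes when minting a session token. The request is prepared, not sent, and the body schema is left to the endpoint reference:

```python
import json
import urllib.request

API_KEY = "sk-test"  # keep server-side; never ship to a browser

def build_session_token_request(body: dict):
    """Prepare a POST /v1/auth/session-token request.

    The body schema is left to the endpoint reference; this only
    shows where the API key goes. The returned session token is
    what you hand to the client.
    """
    return urllib.request.Request(
        "https://api.anam.ai/v1/auth/session-token",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_session_token_request({})
```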
## Conventions
* Request and response bodies are JSON. Field names are `camelCase`.
* List endpoints accept `page` and `perPage` query parameters and return `{ data: [...], meta: { total, currentPage, perPage, lastPage, prev, next } }`.
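The pagination convention above can be sketched as follows (the `perPage` default of 25 is illustrative, not documented):

```python
from urllib.parse import urlencode

def list_url(resource: str, page: int = 1, per_page: int = 25):
    """Build a paginated list URL using the documented page/perPage params."""
    return f"https://api.anam.ai/v1/{resource}?" + urlencode(
        {"page": page, "perPage": per_page}
    )

# The response envelope documented above, with illustrative values:
example = {
    "data": [],
    "meta": {"total": 0, "currentPage": 1, "perPage": 25,
             "lastPage": 1, "prev": None, "next": None},
}

url = list_url("personas", page=2, per_page=10)
```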
# create persona
Source: https://anam.ai/docs/api-reference/personas/create-persona
https://api.anam.ai/swagger.json post /v1/personas
Create a new persona
# delete persona
Source: https://anam.ai/docs/api-reference/personas/delete-persona
https://api.anam.ai/swagger.json delete /v1/personas/{id}
Delete a persona by ID
# get persona
Source: https://anam.ai/docs/api-reference/personas/get-persona
https://api.anam.ai/swagger.json get /v1/personas/{id}
Returns a persona by ID
# list personas
Source: https://anam.ai/docs/api-reference/personas/list-personas
https://api.anam.ai/swagger.json get /v1/personas
Returns a list of all personas
# update persona
Source: https://anam.ai/docs/api-reference/personas/update-persona
https://api.anam.ai/swagger.json put /v1/personas/{id}
Update a persona by ID
# create session token
Source: https://anam.ai/docs/api-reference/sessions/create-session-token
https://api.anam.ai/swagger.json post /v1/auth/session-token
Create a new session token used to initialise Anam client-side SDKs
# get session
Source: https://anam.ai/docs/api-reference/sessions/get-session
https://api.anam.ai/swagger.json get /v1/sessions/{id}
Returns a session by ID
# get session recording
Source: https://anam.ai/docs/api-reference/sessions/get-session-recording
https://api.anam.ai/swagger.json get /v1/sessions/{id}/recording
Returns a presigned URL to download the session recording
# get session transcript
Source: https://anam.ai/docs/api-reference/sessions/get-session-transcript
https://api.anam.ai/swagger.json get /v1/sessions/{id}/transcript
Returns the conversation transcript for a session
# list sessions
Source: https://anam.ai/docs/api-reference/sessions/list-sessions
https://api.anam.ai/swagger.json get /v1/sessions
Returns a list of all sessions for the organization
# create share link
Source: https://anam.ai/docs/api-reference/share-links/create-share-link
https://api.anam.ai/swagger.json post /v1/share-links
Create a new share link
# delete share link
Source: https://anam.ai/docs/api-reference/share-links/delete-share-link
https://api.anam.ai/swagger.json delete /v1/share-links/{id}
Delete a share link by ID
# get share link
Source: https://anam.ai/docs/api-reference/share-links/get-share-link
https://api.anam.ai/swagger.json get /v1/share-links/{id}
Returns a share link by ID
# list share links
Source: https://anam.ai/docs/api-reference/share-links/list-share-links
https://api.anam.ai/swagger.json get /v1/share-links
Returns a list of all share links for the organization
# update share link
Source: https://anam.ai/docs/api-reference/share-links/update-share-link
https://api.anam.ai/swagger.json put /v1/share-links/{id}
Update a share link by ID
# create tool
Source: https://anam.ai/docs/api-reference/tools/create-tool
https://api.anam.ai/swagger.json post /v1/tools
Create a new tool for function calling in persona sessions
# delete tool
Source: https://anam.ai/docs/api-reference/tools/delete-tool
https://api.anam.ai/swagger.json delete /v1/tools/{id}
Delete a tool. The tool will be soft-deleted and no longer available.
# get tool
Source: https://anam.ai/docs/api-reference/tools/get-tool
https://api.anam.ai/swagger.json get /v1/tools/{id}
Get a tool by ID
# list tools
Source: https://anam.ai/docs/api-reference/tools/list-tools
https://api.anam.ai/swagger.json get /v1/tools
Returns a list of all tools for the organization
# update tool
Source: https://anam.ai/docs/api-reference/tools/update-tool
https://api.anam.ai/swagger.json put /v1/tools/{id}
Update an existing tool
# create voice
Source: https://anam.ai/docs/api-reference/voices/create-voice
https://api.anam.ai/swagger.json post /v1/voices
Create a new voice by cloning from an audio file
# delete voice
Source: https://anam.ai/docs/api-reference/voices/delete-voice
https://api.anam.ai/swagger.json delete /v1/voices/{id}
Delete a voice by ID
# get voice
Source: https://anam.ai/docs/api-reference/voices/get-voice
https://api.anam.ai/swagger.json get /v1/voices/{id}
Returns a voice by ID
# list voices
Source: https://anam.ai/docs/api-reference/voices/list-voices
https://api.anam.ai/swagger.json get /v1/voices
Returns a list of all voices
# update voice
Source: https://anam.ai/docs/api-reference/voices/update-voice
https://api.anam.ai/swagger.json put /v1/voices/{id}
Update a voice by ID (display name and provider model ID can be updated)
# Changelog
Source: https://anam.ai/docs/changelog
New features, improvements, and fixes
## β‘ More predictable session openings
This release gives builders more control over how sessions begin, especially when a tool-driven turn needs to run cleanly without being interrupted partway through. That makes longer or multi-step tool flows feel more predictable for both builders and end users.
On the media side, you can now pin a session to start in high video quality using `sessionOptions.videoQuality`, which helps sessions reach their intended bitrate faster. We also tightened one-shot avatar refinement so flat or near-solid backgrounds are preserved more reliably in both the Lab and `/v1` avatar creation flow.
***
## Lab Changes
**Improvements**
* **Better default model:** New personas and built-in agent templates now default to GPT OSS 120B instead of GPT OSS 20B, improving reasoning quality and tool use out of the box.
**Fixes**
* **Cleaner avatar refinement:** Fixed a Gemini refinement issue that could replace plain or near-solid avatar backgrounds with invented scenery, textures, or objects during one-shot avatar creation.
## Persona Changes
**Improvements**
* **Protected tool turns:** Tool-driven turns can now optionally suppress interruptions while your app is still handling the action, making longer or multi-step tool flows more predictable.
**Fixes**
* **Protected-turn cleanup:** Interrupt protection is now released cleanly when a greeting or tool turn finishes without spoken output, reducing the chance of sessions getting stuck in a protected state.
## SDK/API Changes
**Improvements**
* **Initial video quality control:** `sessionOptions.videoQuality` now accepts `high` or `auto`, letting you pin a session to start at the maximum video bitrate instead of ramping up from the default profile.
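A minimal sketch of validating this option before starting a session; the surrounding `sessionOptions` shape here is an assumption for illustration:

```python
# "high" pins the maximum bitrate from the start; "auto" ramps up
# from the default profile, as described above.
ALLOWED_VIDEO_QUALITY = {"high", "auto"}

session_options = {"videoQuality": "high"}

def validate_session_options(opts: dict) -> dict:
    quality = opts.get("videoQuality", "auto")
    if quality not in ALLOWED_VIDEO_QUALITY:
        raise ValueError(
            f"videoQuality must be one of {sorted(ALLOWED_VIDEO_QUALITY)}"
        )
    return opts
```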
**Fixes**
* **Avatar API refinement backgrounds:** The same background-preservation fix now applies to the `/v1` avatar creation flow, so refined API-created avatars are less likely to pick up hallucinated scenery.
## π The Anam docs have been overhauled
We redesigned the docs to make it much easier to find the right starting point and drill into the part of the platform you care about. Navigation is now organized around Overview, Embed, JavaScript SDK, Python SDK, Integrations, API Reference, and Changelog, with a rewritten overview page and clearer Learn / Embed / Build entry points.
This overhaul also adds dedicated Python SDK and LiveKit documentation, plus more focused guides for avatars, voices, LLMs, tools, session options, and network configuration.
***
## Docs Changes
**Improvements**
* **New navigation:** The docs now use clearer top-level tabs and reorganized sections so it is faster to jump between concepts, embedding, SDKs, integrations, and API reference.
* **New SDK and integration guides:** Added dedicated Python SDK documentation and a full LiveKit integration section, including overview, quickstart, and configuration guides.
* **Focused concept pages:** Split key setup topics into dedicated pages for available LLMs, creating custom avatars, session controls, voice configuration, and network requirements.
**Fixes**
* **Docs redirects:** Added redirects for renamed and legacy docs URLs so older links and indexed API-reference pages are less likely to land on 404s.
* **Navigation polish:** Improved overview labeling, changelog labeling, and navbar behavior across the docs experience.
## Lab Changes
**Improvements**
* **Sessions page:** Tool calls now appear across session Analytics, Overview, Transcript, and export views, including status, arguments, results, errors, and execution time.
## Persona Changes
**Improvements**
* **Client tool round-trips:** Personas can now continue once your application returns a client tool result, making client-side actions easier to chain into a conversation.
* **Webhook tracing:** Webhook tool requests now include session and correlation IDs, making it easier to trace tool calls across your own backend systems.
**Fixes**
* **Audio preprocessing resilience:** Sessions now fail open if speech-enhancement preprocessing is unavailable, instead of ending unexpectedly.
* **Session startup reliability:** Improved startup and media-timeout handling so transient processing issues are less likely to interrupt an active turn.
## SDK/API Changes
**Improvements**
* **Client tool results:** The JavaScript SDK now sends client tool results and errors back to the engine over the data channel, with session-scoped safeguards.
* **Avatar creation API:** `POST /v1/avatars` now accepts an optional `avatarModel` field during avatar creation.
## π οΈ Tool setup got much easier in the Lab
We redesigned the tool editor so webhook tools can be configured with form-based builders for headers, query params, and body params instead of raw JSON. That makes it much easier to set up tools correctly, especially for non-technical builders or teams collaborating across product and engineering.
This release also includes a few practical fixes around upload limits, session behavior, and API error handling so the platform behaves more clearly when something goes wrong.
***
## Lab Changes
**Improvements**
* **Tool editor:** Rebuilt webhook tool configuration with form-based builders for headers, query params, and body params, so you no longer need to edit raw JSON for common setups.
**Fixes**
* **Connection errors:** Improved LLM URL normalization and connection error messages when custom model endpoints are misconfigured.
* **Avatar uploads:** Reduced the avatar image upload limit to match the real platform file limit and avoid failed uploads.
* **Session cleanup:** Fixed a bug where active sessions could keep running after the player unmounted during tab switches.
## SDK/API Changes
**Improvements**
* **Capacity signaling:** When session capacity is exhausted, the API now returns a clearer `429` response instead of a generic failure.
**Fixes**
* **Knowledge auth:** Fixed knowledge-upload auth and header handling for API callers.
## π― Client-side context injection
You can now inject context into a conversation without triggering a persona response. Call `addContext()` in the JavaScript SDK to silently append information β like CRM data, page navigation events, or real-time application state β to the conversation history. The persona won't respond immediately, but will have that context available the next time the user speaks.
This is useful for building context-aware agents that adapt to what the user is doing in your application without interrupting the conversation flow.
## ποΈ User speech detection events
The SDK now emits `userSpeechStarted` and `userSpeechEnded` events the moment voice activity is detected, before any transcription is available. Use these to build responsive "listening" indicators and other UI feedback that reacts instantly when the user begins or stops speaking.
***
## Lab Changes
**Improvements**
* **Voice cloning for all paid plans:** Custom voice cloning is now available to Explorer and Growth plans, previously limited to Professional and Enterprise.
* **Share and embed redesign:** Share links and embed widgets have been consolidated into a simpler 1-to-1 model with a cleaner management interface.
* **Persona tools via API:** The PUT persona endpoint now accepts a `tool` field, allowing you to attach tools to personas programmatically.
**Fixes**
* Fixed one-shot avatar refinement timing out by making Gemini refinement non-fatal with a 35-second timeout.
* Fixed knowledge upload endpoints not accepting Bearer API key authentication.
* Fixed end-session race conditions with idempotent endpoint and atomic updates.
## Persona Changes
**Improvements**
* **Conversation context accuracy:** A new message history system tracks which text was actually spoken versus interrupted, and records tool call arguments and results. The persona now maintains accurate context after interruptions, leading to more coherent multi-turn conversations.
* **Audio passthrough stability:** Late-arriving audio in BYO TTS sessions no longer causes unintended interruptions. Audio is buffered and played back in order, improving reliability for Pipecat and other audio passthrough integrations.
**Fixes**
* Fixed stale video frames occasionally appearing after a response completes.
## SDK/API Changes
**Improvements**
* **Context injection:** New `addContext()` method lets you inject context into the conversation history without triggering a response ([JS SDK v4.11.0](https://github.com/anam-org/javascript-sdk/releases/tag/v4.11.0)).
* **Speech detection events:** `userSpeechStarted` and `userSpeechEnded` events fire at the VAD level for instant speech detection ([JS SDK v4.12.0](https://github.com/anam-org/javascript-sdk/releases/tag/v4.12.0)).
## π‘ Adaptive bitrate streaming
Anam now dynamically adjusts video quality based on network conditions. When bandwidth drops, the stream adapts in real time to maintain smooth, uninterrupted video rather than freezing or dropping frames. When conditions improve, quality scales back up automatically. This is a significant improvement for users on mobile networks, VPNs, or connections with variable bandwidth.
## π Zero Data Retention mode
Enterprise customers can now enable **Zero Data Retention** on any persona. When enabled, no session data β recordings, transcripts, or conversation logs β is stored after a session ends. This applies across the full pipeline including voice and LLM data.
Toggle it on from persona settings in the Lab, or set it via the API. [Learn more](https://anam.ai/docs/security/privacy).
***
## Lab Changes
**Improvements**
* **System tools:** Personas can now use built-in system tools. `change_language` switches speech recognition to a different language mid-conversation, and `skip_turn` pauses the persona from responding when the user needs a moment to think. Enable them from the Tools tab in Build.
* **Tool validation:** Auto-deduplication of tool names with clearer validation error messages.
* **Share link management:** Migrated share links to a 1-to-1 primary model with a simpler toggle interface.
**Fixes**
* Fixed reasoning model responses getting stuck in "thinking..." state.
* Fixed soft-deleted knowledge folders not restoring on document upload.
* Fixed LiveKit session type classification for snake\_case environment payloads.
## Persona Changes
**Improvements**
* **Agora AV1 support:** Agora integration now supports the AV1 video codec for better compression and quality at lower bitrates.
* **Multi-agent LiveKit:** Audio routing now works correctly in multi-agent LiveKit rooms with multiple Anam avatars.
**Fixes**
* Fixed tool enum type validation.
## π New integrations
Four new ways to use Anam avatars in your stack:
**Pipecat**\
The [`pipecat-anam`](https://pypi.org/project/pipecat-anam/) package brings Anam avatars to [Pipecat](https://github.com/pipecat-ai/pipecat), the open-source framework for voice and multimodal AI agents. `pip install pipecat-anam`, add `AnamVideoService` to your pipeline, and you're streaming. Use audio passthrough for full control over your own orchestration, or let Anam handle the pipeline end-to-end. [GitHub repo](https://github.com/anam-org/pipecat-anam).
**ElevenLabs server-side agents**\
Put a face on any agent you've built in ElevenLabs. Pass in your ElevenLabs agent ID and session token when starting a session, and Anam handles the rest, no changes to your existing ElevenLabs setup needed. [Cookbook](https://anam.ai/cookbook/elevenlabs-server-side-agents).
**VideoSDK**\
Anam is now officially supported on [VideoSDK](https://www.videosdk.live/), a WebRTC platform similar to LiveKit. Built on top of the Python SDK.
**Framer**\
The Anam Avatar plugin is now [on the Framer Marketplace](https://www.framer.com/marketplace/plugins/anam-avatar/). Drop an avatar into any Framer site without writing code.
## π Metaxy: sample-level versioning for ML pipelines
We wrote up a deep dive on [Metaxy](https://anam.ai/blog/metaxy), our open-source metadata versioning framework for multimodal data pipelines. It tracks partial data updates at the field level so teams only reprocess what actually changed. Works with orchestrators like Dagster, agnostic to compute (Ray, DuckDB, etc.). [GitHub](https://github.com/anam-org/metaxy).
***
## Lab Changes
**Improvements**
* **Build page redesign:** Everything lives in Build now. Avatars, Voices, LLMs, Tools, and Knowledge are tabs within a single page. Create custom avatars, clone voices, add LLMs, and upload knowledge files without leaving the page. Knowledge is a file drop on the Prompt tab: upload a document and it's automatically turned into a RAG tool.
* **Smart voice matching:** One-shot avatars now auto-select a voice matching the avatar's detected gender.
* **Mobile improvements:** Tables replaced with cards and lists. Bottom tab bar instead of hamburger menu. Long-press context menus on persona tiles. Touch-friendly tooltips.
* **Knowledge base improvements:** Non-blocking document deletion with pending state and rollback on error. PDF uploads restored. Stuck documents are auto-detected with retry from the UI.
**Fixes**
* Fixed typo in thinking duration display.
* Fixed sticky hover states on touch devices.
## Persona Changes
**Improvements**
* **Video stability:** New TWCC-based frame-drop pacer with GCC congestion control. Smoother video on constrained or variable-bandwidth connections.
* **Network connectivity:** TURN over TLS for ICE, improving session establishment behind corporate firewalls and VPNs.
**Fixes**
* Fixed ElevenLabs pronunciation issues with certain text patterns.
* Fixed text sanitization causing incorrect punctuation in TTS output.
* Fixed silent responses not being detected correctly.
## SDK/API Changes
**Improvements**
* **Tool call event handlers:** `onToolCallStarted`, `onToolCallCompleted`, and `onToolCallFailed` handlers for tracking tool execution on the client.
* **Documents accessed:** `ToolCallCompletedPayload` now includes a `documentsAccessed` field for Knowledge Base tool calls.
**Fixes**
* Fixed duplicate tool call completion events.
## π Anam Python SDK
Anam now has a [Python SDK](https://github.com/anam-org/python-sdk). It handles WebRTC streaming, audio/video frame delivery, and session management.
What's in the box:
* **Media handling** β The SDK manages WebRTC connections and signalling. Connect, and you get synchronized audio and video frames back.
* **Multiple integration modes** β Use the full pipeline (STT, LLM, TTS, Face) or bring your own TTS via audio passthrough.
* **Live transcriptions** β User speech and persona responses stream in as partial transcripts, useful for captions or logging conversations.
* **Async-first** β Built on Python's async/await. Process media frames with async iterators or hook into events with decorators.
People are already building with it — rendering ASCII avatars in the terminal, processing frames with OpenCV, piping audio to custom pipelines. Check the [GitHub repo](https://github.com/anam-org/python-sdk) to get started.
***
## Lab Changes
**Improvements**
* **Visual refresh:** Updated Lab UI with new brand styling, including new typography (Figtree), refreshed color tokens, and consistent component styles across all pages.
## Persona Changes
**Improvements**
* **ICE recovery grace period:** WebRTC sessions now survive brief network disconnections instead of terminating immediately. The engine detects ICE connection drops and holds the session open, allowing the client to reconnect without losing conversation state.
* **Language configuration:** You can now set a language code on your persona, ensuring the STT pipeline uses the correct language from session start.
* **Voice generation options:** Added configurable voice generation parameters for more control over TTS output.
* **ElevenLabs streaming:** Removed input buffering for ElevenLabs TTS, reducing time-to-first-audio for all sessions using ElevenLabs voices.
## π¬ Session recordings
By default, every session is now recorded and saved for 30 days. Watch back any conversation in the Lab (lab.anam.ai/sessions) to see exactly how users interact with your personas, including the full video stream and conversation flow.
Recordings and transcripts are also available via API. Use `GET /v1/sessions/{id}/transcript` to fetch the full conversation programmatically for analytics, QA, or archival. For privacy-sensitive applications, you can disable recording in your persona config.
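The two session-retrieval endpoints mentioned above can be addressed like this (URL construction only; `sess_123` is a placeholder ID):

```python
def transcript_url(session_id: str) -> str:
    """URL for GET /v1/sessions/{id}/transcript, as described above."""
    return f"https://api.anam.ai/v1/sessions/{session_id}/transcript"

def recording_url(session_id: str) -> str:
    """URL for GET /v1/sessions/{id}/recording, which returns a presigned URL."""
    return f"https://api.anam.ai/v1/sessions/{session_id}/recording"
```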
## π¨ Two-pass avatar refinement
One-shot avatar creation now refines images in two passes. Upload an image, and the system generates an initial avatar, then refines it for better likeness and expression. Available to all users.
***
## Lab Changes
**Improvements**
* Added `speechEnhancementLevel` (0-1) to `voiceDetectionOptions` for control over how aggressively background noise is filtered from user audio
* Support for ephemeral tool IDs, so you can configure tools dynamically per session
* Added delete account and organization buttons
**Fixes**
* Fixed terminology on tools tab
* Fixed RAG default parameters not being passed
* Fixed custom LLM default settings
## Persona Changes
**Improvements**
* Support for Gemini thinking/reasoning models
* The `speechEnhancementLevel` parameter now passes through via `voiceDetectionOptions`
* Engine optimizations for lower latency under load
**Fixes**
* Fixed GPT-5 tool calls returning errors
* Fixed audio frame padding that could cause playback issues
* Fixed repeated silence messages
* Fixed silence breaker not responding to typed messages
## π§ User Speech Enhancement
We've integrated [ai-coustics](https://ai-coustics.com/) as a preprocessing layer in our user audio pipeline. It enhances audio quality before it reaches speech detection, cleaning up background noise and improving signal clarity in real-world conditions. This reduces false transcriptions from ambient sounds and improves endpointing accuracy, especially in noisy environments like cafes, offices, or outdoor settings.
## ποΈ Configurable Persona Responsiveness
Control how quickly your persona responds with [voiceDetectionOptions](https://anam.ai/docs/personas/session/voice-detection) in the persona config:
* `endOfSpeechSensitivity` (0-1): How eager the persona is to jump in. 0 waits until it's confident you're done talking, 1 responds sooner.
* `silenceBeforeSkipTurnSeconds`: How long before the persona prompts a quiet user.
* `silenceBeforeSessionEndSeconds`: How long silence ends the session.
* `silenceBeforeAutoEndTurnSeconds`: How long a mid-sentence pause waits before the persona responds.
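Taken together, the knobs above might sit in a persona config like this. The values are illustrative only; tune them to your use case:

```python
# Illustrative values only -- pick numbers that fit your use case.
voice_detection_options = {
    "endOfSpeechSensitivity": 0.7,         # 0-1: higher responds sooner
    "silenceBeforeSkipTurnSeconds": 5,     # prompt a quiet user after 5 s
    "silenceBeforeSessionEndSeconds": 60,  # end the session after 60 s of silence
    "silenceBeforeAutoEndTurnSeconds": 2,  # wait 2 s on a mid-sentence pause
}

def validate(opts: dict) -> dict:
    if not 0 <= opts["endOfSpeechSensitivity"] <= 1:
        raise ValueError("endOfSpeechSensitivity must be between 0 and 1")
    return opts
```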
## π§ Reasoning Model Support
Added support for OpenAI reasoning models and custom Groq LLMs. Reasoning models can think through complex scenarios before responding, while Groq's high-throughput infrastructure makes these typically-slower models respond with conversational latencies suitable for real-time interactions. Add your reasoning model in the lab: [https://lab.anam.ai/llms](https://lab.anam.ai/llms).
## Persona Changes
**Fixes**
* Fixed Knowledge Base (RAG) tool calling with proper default query parameters
* Fixed panic crashes when sessions error during startup
## Lab Changes
**Fixes**
* Fixed `Powered by Anam` text visibility when watermark removal is enabled
* Updated API responses for GET/UPDATE persona endpoints
## SDK/API Changes
**Improvements**
* Introduced agent audio input streaming for BYO audio workflows, allowing you to integrate with arbitrary voice agents, e.g. ElevenLabs agents (see the [ElevenLabs server-side agents recipe](https://anam.ai/cookbook/elevenlabs-server-side-agents) for how to integrate).
* Added WebRTC reasoning event handlers for reasoning model support
## π Introducing Cara 3: our most expressive model yet
The culmination of over six months of research, **Cara 3** is now available. This new model delivers significantly more expressive avatars featuring realistic eye movement, more dynamic head motion, smoother transitions in and out of idling, and improved lip sync.
You can opt in to the new model in your persona config using `avatarModel: 'cara-3'` or by selecting it in the Lab UI. Note that all new custom avatars will use Cara 3 exclusively, while existing personas will continue to use the Cara 2 model by default unless explicitly updated.
## π‘οΈ SOC-2 Type II compliance
Anam has achieved SOC-2 Type II compliance. This milestone validates that our security, availability, and data protection controls have been independently audited and proven over time.
For customers building across learning, enablement, or live production use cases, this provides formal assurance regarding how we handle security, access, and reliability.\
[**Visit the Trust Center**](https://trust.anam.ai/)
## π Integrations
**Model Context Protocol (MCP) server**\
Manage your personas and avatars directly within Claude Desktop, Cursor, and other MCP-compatible clients. Use your favorite LLM-assisted tools to interact with the Anam API.
**Anam x ElevenLabs agents**\
Turn any ElevenLabs conversational AI agent into a visual avatar using Anam's audio passthrough.\
[Watch the demo](https://anam.ai/cookbook/elevenlabs-server-side-agents)
***
## Lab Changes
**Improvements**
* **UI overhaul:** A redesigned Homepage and Build page make persona creation more intuitive. You can now preview voices/avatars without starting a chat and create custom assets directly within the Build flow. Sidebar and Pricing pages have also been refreshed.
* **Performance:** Implemented Tanstack caching to significantly improve Lab responsiveness.
**Fixes**
* Fixed a bug where client tool events were not appearing in the Build page chat.
* Resolved an issue where tool calls and RAG were not passing parameters correctly.
## Persona Changes
**Improvements**
* **More voices:** Added \~100 new Cartesia voices (Sonic-3) and \~180 new ElevenLabs voices (Flash v2.5), covering languages and accents from all over the world.
* **New default LLM:** `kimi-k2-instruct-0905` is now available. This SOTA open-source model offers high intelligence and excellent conversational abilities. (Note: Standard `kimi-k2` remains recommended for heavy tool-use scenarios).
* **Configurable greetings:** Added `skip_greeting` parameter, allowing you to configure whether the persona initiates the conversation or waits for the user.
* **Latency reductions:**
* **STT optimization:** We are now self-hosting Deepgram for Speech-to-Text, resulting in a **\~30ms (p50)** and **\~170ms (p90)** latency improvement.
* **Frame buffering:** Optimized output frame buffer, shaving off an additional **\~40ms** of latency per response.
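For example, the new `skip_greeting` parameter might be set as below. Only the parameter name comes from the release note; the options object around it is an illustrative sketch, not the exact API shape.

```javascript
// Sketch: configuring whether the persona initiates the conversation.
// Only the skip_greeting parameter name comes from the release note;
// the surrounding options object is illustrative.
const sessionOptions = {
  skip_greeting: true, // persona waits for the user to speak first
};
// With skip_greeting set to false, the persona greets the user and
// initiates the conversation, as described above.
```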
**Fixes**
* Corrected header handling to ensure reliable data center failover.
* Fixed a visual artifact where Cara 3 video frames occasionally displayed random noise.
* Resolved a freeze-frame issue affecting \~1% of sessions ([Incident Report](https://status.anam.ai/incidents/01KC7A6Q9Q6H1JDZ83TP1EF1Z1)).
## SDK/API Changes
**Improvements**
* **API gateway guide:** Added documentation and an example repository for routing Anam SDK traffic through your own API Gateway server. [View on GitHub](https://github.com/anam-org/anam-gateway-example).
## LiveKit out of Beta and a new latency record
LiveKit integration is now generally available: drop Anam's expressive real-time avatars into any LiveKit Agents app so your AI can join LiveKit rooms as synchronised voice + video participants.\
It turns voice-only agents into face-and-voice experiences for calls, livestreams, and collaborative WebRTC spaces, with LiveKit handling infra and Anam handling the human layer. Docs
***
## Record-breaking latency: 330 ms reduction for all customers
Server-side optimisations cut average end-to-end latency by 330 ms for all customers. The gains come from cumulative engine optimisations across transcription, frame generation, and frame writing, plus upgraded Deepgram Flux endpointing for faster, best-in-class turn-taking without regressions in voice quality or TTS.
***
## Lab Changes
**Improvements**
* Overhauled the avatar video upload and management system
* Upgraded default Cartesia voices to Sonic 3
* Standardised voice model selection across the platform
**Fixes**
* Enhanced share-link management capabilities
* Corrected LiveKit persona type identification logic
***
## Persona Changes
**Improvements**
* Server-side optimisations to our frame buffering reduce response latency by \~250ms for all personas.
**Fixes**
* Changed timeout behavior: sessions no longer time out based on heartbeats, only when the websocket has been disconnected for 10 seconds or more.
* Fixed an intermittent issue where a persona stopped responding.
* Set pix\_fmt for video output, moving from the yuvj420p (JPEG) to the yuv420p color space to avoid incorrect encoding/output.
* Added a timeout to our silence-breaking logic to prevent hangs.
## Introducing Anam Agents
Build and deploy AI agents in Anam that can engage alongside you.
With Anam Agents, your Personas can now interact with your applications, access your knowledge, and trigger workflows directly through natural conversation. This marks Anam's evolution from conversational Personas to agentic Personas that think, decide, and execute.
## Knowledge Tools
Give your Personas access to your company's knowledge. Upload docs to the Lab, and they'll use semantic retrieval to integrate the right info.\
[Docs for Knowledge Base](https://anam.ai/docs/personas/knowledge/overview)
## Client Tools
Personas can control your interface in real time: open checkout, display modals, navigate UI, and update state by voice.\
[Docs for Client Tools](https://anam.ai/docs/personas/tools/client-tools)
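Conceptually, a client tool is a named UI action the persona's LLM can invoke with parameters. The sketch below is purely illustrative and does not reflect the actual registration API; see the Client Tools docs for the real one.

```javascript
// Purely illustrative sketch of what a client tool conceptually looks
// like: a named UI action the persona's LLM can trigger with
// parameters. The real registration API is in the Client Tools docs.
const openCheckoutTool = {
  name: 'open_checkout',                                 // illustrative
  description: 'Open the checkout view for a given cart',
  handler: ({ cartId }) => {
    // A real handler would update UI state (e.g. open a modal).
    return `checkout opened for cart ${cartId}`;
  },
};

const result = openCheckoutTool.handler({ cartId: 'cart_123' });
```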
## Webhook Tools
Connect your Personas to external APIs and services. Create tickets, fetch status, update records, or fetch live data.\
[Docs for Webhook Tools](https://anam.ai/docs/personas/tools/webhook-tools)
## Intelligent Tool Selection
Each Persona's LLM chooses tools based on intent, not scripts.
You can create/manage tools on the Tools page in the Lab and attach them to any Persona from Build.
**Anam Agents are available in beta for all users:** [https://lab.anam.ai/login](https://lab.anam.ai/login)
***
## Lab Changes
**Improvements**
* Cartesia Sonic-3 voices: the most expressive TTS model.
* Voice modal expanded: 50+ languages, voice samples, Cartesia TTS now default.
* Session reports work for custom LLMs.
**Fixes**
* Prevented auto-logout when switching contexts.
* Fixed race conditions in cookie handling.
* Resolved legacy session token issues.
* Removed problematic voices.
* Corrected player/stream aspect ratios on mobile.
## Persona Changes
**Improvements**
* Deepgram Flux support for turn-taking ([Deepgram Flux Details](https://deepgram.com/learn/introducing-flux-conversational-speech-recognition))
* Server-side optimization: reduced GIL contention and latency, faster connections.
**Fixes**
* Bug-fix for dangling LiveKit connections.
## Research
**Improvements**
* Our first open-source library!\
Metaxy, a metadata layer for ML/data pipelines:\
[Read more](https://anam-org.github.io/metaxy/main/#3-run-user-defined-computation-over-the-metadata-increment) | [GitHub](https://github.com/anam-org/metaxy)
## Anam is now HIPAA compliant
A big milestone for our customers and partners. Anam now meets HIPAA requirements for handling protected health information.
[**Learn more at the Anam Trust Center**](https://trust.anam.ai/)
## Lab Changes
**Improvements**
* Enhanced voice selection: search by use case/conversational style, 50+ languages.
* Product tour update.
* Streamlined One-Shot avatar creation.
* Auto-generated Persona names based on selected avatar.
* Session start now 1.1s faster.
**Fixes**
* Share links: fixed extra concurrency slot usage.
## Persona Changes
**Improvements**
* Improved TTS pronunciation via smarter text chunking.
* Traceability and monitoring for session IDs.
* Increased internal audio sampling rate to 24kHz.
* Increased max websocket size to 16 MB.
**Fixes**
* Concurrency calculation now only considers sessions from last 2 hours.
* Less freezing for slower LLMs.
## Session Analytics
Once a conversation ends, how do you review what happened? To help you understand and improve your Persona's performance, we're launching Session Analytics in the Lab. Now you can access a detailed report for every conversation, complete with a full transcript, performance metrics, and AI-powered analysis.
* **Full Conversation Transcripts.** Review every turn of a conversation with a complete, time-stamped transcript. See what the user said and how your Persona responded, making it easy to diagnose issues and identify successful interaction patterns.
* **Detailed Analytics & Timeline.** Alongside the transcript, a new Analytics tab provides key metrics grouped into "Transcript Metrics" (word count, turns) and "Processing Metrics" (e.g., LLM latency). A visual timeline charts the entire conversation, showing who spoke when and highlighting any technical warnings.
* **AI-Powered Insights.** For a deeper analysis, you can generate an AI-powered summary and review key insights. This feature, currently powered by gpt-5-mini, evaluates the conversation for highlights, adherence to the system prompt, and user interruption rates.
You can find your session history on the Sessions page in the Lab. Click on any past session to explore the new analytics report. This is available today for all session types, except for LiveKit sessions. For privacy-sensitive applications, session logging can be disabled via the SDK.
## Lab Changes
**Improvements**
* Improved Voice Discovery: The Voices page is now more searchable, lets you preview voices with a single click, and shows new details like gender, TTS model, and language.
**Fixes**
* Fixed share-link session bug: share-link sessions no longer take an extra concurrency slot.
## Persona Changes
**Improvements**
* Small improvement to connection time: Tweaks to how we perform WebRTC signalling allow for slightly faster connections (\~900ms faster for p95 connection time).
* Improvement to output audio quality for poor connections: Enabled Opus in-band FEC to improve audio quality under packet loss.
* Small reduction in network latency: Optimisations have been made to our outbound media streams to reduce A/V jitter (and hence jitter buffer delay). Expected latency improvement is modest (\<50ms).
**Fixes**
* Fix for LiveKit sessions with slow TTS audio: Stabilizes LiveKit streaming by pacing output and duplicating frames during slowdowns to prevent underflow.
## Intelligent LLM Routing for Faster Responses
The performance of LLM endpoints can be highly variable, with time-to-first-token latencies sometimes fluctuating by as much as 500ms from one day to the next depending on regional load. To solve this and ensure your personas respond as quickly and reliably as possible, we've rolled out a new intelligent routing system for LLM requests. This is active for both our turnkey customers and for customers using their own server-side **Custom LLMs** if they deploy multiple endpoints.
This new system constantly monitors the health and performance of all configured LLM endpoints by sending lightweight probes at regular intervals. Using a time-aware moving average, it builds a real-time picture of network latency and processing speed for each endpoint. When a request is made, the system uses this data to calculate the optimal route, automatically shedding load from any overloaded or slow endpoints within a region.
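As a rough illustration of the idea described above (not Anam's actual implementation), latency-aware routing can be sketched as an exponential moving average of probe latencies per endpoint plus a pick-the-fastest rule:

```javascript
// Illustrative sketch of latency-aware routing: keep an exponential
// moving average (EWMA) of probe latencies per endpoint and route each
// request to the currently fastest one. Not Anam's implementation.
const ALPHA = 0.3; // weight given to the newest probe

function updateLatency(endpoint, probeMs) {
  endpoint.ewmaMs = endpoint.ewmaMs === undefined
    ? probeMs
    : ALPHA * probeMs + (1 - ALPHA) * endpoint.ewmaMs;
}

function pickEndpoint(endpoints) {
  // Shed load from slow or overloaded endpoints by always preferring
  // the lowest moving-average latency.
  return endpoints.reduce((best, e) => (e.ewmaMs < best.ewmaMs ? e : best));
}

const endpoints = [
  { name: 'eu-west', ewmaMs: undefined },
  { name: 'us-east', ewmaMs: undefined },
];
updateLatency(endpoints[0], 120);
updateLatency(endpoints[1], 80);
updateLatency(endpoints[1], 300); // a slow probe raises the average

console.log(pickEndpoint(endpoints).name); // 'eu-west'
```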
## Lab Changes
**Improvements**
* Generate one-shot avatars from text prompts: You can now generate one-shot avatars from text prompts within the lab, powered by Gemini's new Nano Banana model. The one-shot creation flow has been redesigned for speed and ease-of-use, and is now available to all plans. Image upload and webcam avatars remain exclusive to Pro and Enterprise.
* Improved management of published embed widgets: Published embed widgets can now be configured and monitored from the lab at [https://lab.anam.ai/personas/published](https://lab.anam.ai/personas/published).
## Persona Changes
**Improvements**
* Automatic failover to backup data centres: To ensure maximum uptime and reliability, persona traffic now fails over automatically to a backup data centre.
**Fixes**
* Prevent session crash on long user speech: Previously, unbroken user speech exceeding 30 seconds would trigger a transcription error and crash the session. We now automatically truncate continuous speech to 30 seconds, preventing sessions from failing in these rare cases.
* Allow configurable session lengths of up to 2 hours for Enterprise plans: We had a bug where sessions had a max timeout of 30 mins instead of 2 hours for enterprise plans. This has now been fixed.
* Resolved slow connection times caused by incorrect database region selection: An undocumented issue with our database provider led to incorrect region selection for our databases. Simply refreshing our credentials resolved the problem, resulting in a \~1s improvement in median connection times and \~3s faster p95 times. While our provider works on a permanent fix, we're actively monitoring for any recurrence.
## Embed Widget
Embed personas directly into your website with our new widget. From the **lab's build page**, click Publish, then generate your unique HTML snippet. The snippet works in most common website builders, e.g. WordPress.org or Squarespace.
For added security, we recommend adding a whitelist with your domain URL. This locks the persona down so it only works on your website. You can also cap the number of sessions or give the widget an expiration period.
## Lab Changes
**Improvements**
* One-shot avatars available via API: Professional and Enterprise accounts can now create one-shot avatars via the API. Docs **here**.
* Spend caps: It's now possible to set a spend cap on your account. Available in **profile settings**.
## Persona Changes
**Fixes**
* Prevent Cartesia from timing out when using slow custom LLMs: We've added a safeguard to prevent Cartesia contexts from unexpectedly closing during pauses in text streaming. With slower LLMs, or if there's a break or slow-down in the text being sent, your connection will now stay alive, ensuring smoother, uninterrupted interactions.
For full legal and policy information, see:
* [Trust Center](https://trust.anam.ai/)
* [AI Governance](https://anam.ai/ai-governance)
* [Terms of Service](https://anam.ai/terms-of-service)
* [DPA](https://anam.ai/data-processing)
* [Acceptable Use Policy](https://anam.ai/acceptable-use-policy)
* [Privacy Policy](https://anam.ai/privacy-policy)
# Embed Anam on your website
Source: https://anam.ai/docs/embed/overview
Add an Anam avatar to any website using the Widget, Player, or SDK. Compare options and follow platform-specific setup guides.
There are three ways to add an Anam avatar to your site: the Widget, the Player, or the SDK.
## Which option should I use?
**Widget**: A pre-built Web Component. The avatar loads directly on your page with its own UI. You can listen to events, handle tool calls, and control it from JavaScript.
**Player**: A single iframe. The avatar runs inside a sandboxed frame, completely isolated from your page. The most compatible option across website builders, but you can't interact with it from your own code.
**SDK**: A JavaScript library for building your own interface from scratch. Use this when the pre-built UIs don't fit your design and you want full control over the experience.
## Widget
The Widget loads as a [Web Component](https://developer.mozilla.org/en-US/docs/Web/API/Web_Components) on your page. It handles its own authentication, renders inside a Shadow DOM (so it won't clash with your styles), and dispatches DOM events you can listen to.
* Floating overlay or inline layout modes
* DOM events for analytics, error handling, and custom behavior
* Supports tool calls and text input
* Configure appearance from Lab or HTML attributes
* Your site must allow external JavaScript
* Your domain must be added to the allowed list in Lab's Widget tab
* Microphone access required for voice
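Listening to widget events might look like the sketch below. The element tag and event names here are illustrative, not the widget's actual API; see the Widget documentation for the real ones. A plain `EventTarget` stand-in keeps the sketch self-contained outside the browser.

```javascript
// Illustrative only: the real element tag and event names are in the
// Widget documentation. A plain EventTarget stand-in lets this sketch
// run outside the browser; on a real page you would instead grab the
// widget element with document.querySelector(...).
const widget = new EventTarget();

const log = [];
widget.addEventListener('session-started', () => log.push('started'));
widget.addEventListener('session-ended', () => log.push('ended'));

// Simulate the widget's lifecycle events.
widget.dispatchEvent(new Event('session-started'));
widget.dispatchEvent(new Event('session-ended'));
```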
See the [Widget documentation](/embed/widget/overview) for configuration, events, and framework-specific setup.
## Player
The Player runs inside an iframe. Your page and the avatar are fully isolated from each other: different JavaScript contexts, different stylesheets, no interaction between the two. This makes it the most broadly compatible option, especially on platforms that restrict custom JavaScript.
* Full Anam interface in a sandboxed frame
* Isolated from your site's CSS and JavaScript
* Works on platforms that block custom scripts but allow iframes
* Your site must allow iframe embedding
* HTTPS required
* Microphone access required for voice
## SDK
The SDK gives you a JavaScript client and raw media streams. There's no pre-built UI; you build your own. Use this when you need complete control over how the avatar looks and behaves on your page.
* Full control over layout and appearance
* Direct access to video/audio streams
* Programmatic session management
* Your site must allow external JavaScript
* HTTPS required
* Microphone access required for voice
See the [SDK Reference](/javascript-sdk/reference/basic-usage) for usage details.
## Platform compatibility
Not every website builder supports every embed type. This depends on whether the platform lets you add custom JavaScript, iframes, or both.
| Platform      | Widget | Player | SDK | Notes                                            |
| ------------- | ------ | ------ | --- | ------------------------------------------------ |
| WordPress.org | ✅     | ✅     | ✅  |                                                  |
| Webflow       | ✅     | ✅     | ✅  | May need enterprise plan for script whitelisting |
| Squarespace   | ✅     | ✅     | ✅  | Requires paid plan                               |
| Jimdo Creator | ✅     | ✅     | ✅  | Requires paid plan                               |
| Shopify       | ✅     | ✅     | ❌  |                                                  |
| WordPress.com | ✅     | ✅     | ❌  | Requires paid plan                               |
| GoDaddy       | ✅     | ✅     | ❌  |                                                  |
| Wix           | ✅     | ❌     | ❌  | Requires paid plan                               |
### WordPress.com
Requires Business plan (\$25/month) or higher. Add Widget or Player code via Custom HTML block. SDK is not supported.
### WordPress.org (self-hosted)
All three embed options work. Add code via the Gutenberg editor (Custom HTML block), your theme's `footer.php`, or a plugin like "Insert Headers and Footers".
### Shopify
**Player:** Go to Online Store > Themes > Customize, add a "Custom Liquid" section, paste the iframe code.
**Widget/SDK:** Go to Online Store > Themes > Actions > Edit code, open `theme.liquid`, add the code before `