- Simpler client code (no audio bridging, microphone management, or speaker muting)
- Reduced latency through server-to-server audio flow
- Session recordings and transcripts available in Anam Lab
Looking for the client-side approach where you manage the audio pipeline in the browser? See Custom TTS (client-side).
Architecture
Server fetches signed URL and session token
Your API route fetches an ElevenLabs signed URL using your API key, then requests an Anam session token with elevenLabsAgentSettings attached.
Engine connects to ElevenLabs
The Anam engine uses the signed URL to open a WebSocket to ElevenLabs and manages the full voice pipeline—speech-to-text, LLM reasoning, and text-to-speech.
Client streams avatar
The client creates an AnamClient with the session token and calls streamToVideoElement(). Mic audio goes to the engine over WebRTC; the avatar video and speech audio come back over the same connection.
Prerequisites
- Node.js 18+
- Anam account and API key
- ElevenLabs account with a Conversational AI agent configured
Best Practices for ElevenLabs Agent Configuration
Before writing code, configure your ElevenLabs agent for good performance with Anam:
Voice Settings
- Use V3 Conversational as the TTS model for better expressivity
- Enable Expressive mode on V3 voices
- Add audio tags to system prompts for effects like laughter
Audio Configuration
- Set user input audio format to PCM 16000Hz (other formats are not supported with Anam)
- Enable Filter Background Speech in Advanced settings if background noise is problematic
Response Optimization
- As of writing, Qwen3-30B-A3B performs well for low latency — check the ElevenLabs agent UI for current LLM options and their latency characteristics
- Avoid reasoning models unless using high-throughput providers
- Set Eagerness to “Eager” in the Advanced menu for the quickest responses
- Configure soft timeouts (2 seconds) in Advanced settings with filler phrase generation if responses lag
Server-Side Implementation
Environment Variables
.env
| Variable | What it is | Where to find it |
|---|---|---|
| agentId | Your ElevenLabs Agent ID | ElevenLabs dashboard → Agents → select your agent → copy the Agent ID |
| avatarId | An Anam avatar face ID (not a persona ID) | Avatar Gallery, or in Anam Lab click the three-dot menu on an avatar and then click the copy button |
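A sketch of the corresponding .env file. ELEVENLABS_API_KEY is the name used in the troubleshooting section below; the other variable names are assumptions, so match them to whatever your server code reads:

```
# ElevenLabs API key with the Conversational AI (write) permission enabled
ELEVENLABS_API_KEY=<your-elevenlabs-api-key>
# Assumed variable names; align these with your server code
ANAM_API_KEY=<your-anam-api-key>
ELEVENLABS_AGENT_ID=<your-agent-id>
ANAM_AVATAR_ID=<your-avatar-id>
```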
The avatarId is specifically the face model ID, not an overall persona or agent ID. You’re pairing an Anam face with an ElevenLabs agent—the voice, LLM, and STT all come from ElevenLabs.
API Route
Create a Next.js API route (or equivalent server endpoint) that fetches the ElevenLabs signed URL and creates an Anam session token:
app/api/anam-session/route.ts
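A minimal sketch of such a route. The ElevenLabs signed-URL endpoint is the documented one, but the Anam request and response shapes here are assumptions; check the linked source code and API references before relying on them:

```typescript
// app/api/anam-session/route.ts
// Sketch only: Anam request/response field names are assumptions.
export async function POST() {
  const agentId = process.env.ELEVENLABS_AGENT_ID;

  // 1. Fetch a short-lived signed WebSocket URL for the ElevenLabs agent.
  const signedUrlRes = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/get-signed-url?agent_id=${agentId}`,
    { headers: { "xi-api-key": process.env.ELEVENLABS_API_KEY ?? "" } },
  );
  if (!signedUrlRes.ok) {
    return Response.json(
      { error: "Failed to get ElevenLabs signed URL" },
      { status: 502 },
    );
  }
  const { signed_url: signedUrl } = await signedUrlRes.json();

  // 2. Create an Anam session token with elevenLabsAgentSettings attached.
  const tokenRes = await fetch("https://api.anam.ai/v1/auth/session-token", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      personaConfig: {
        avatarId: process.env.ANAM_AVATAR_ID,
        environment: { elevenLabsAgentSettings: { signedUrl, agentId } },
      },
    }),
  });
  const { sessionToken } = await tokenRes.json();
  return Response.json({ sessionToken });
}
```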
The environment.elevenLabsAgentSettings field tells the Anam engine to connect to ElevenLabs instead of running Anam’s built-in STT/LLM/TTS pipeline.
Per-Session Customization
The Anam session token API accepts additional fields in elevenLabsAgentSettings that are passed through to ElevenLabs:
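As a sketch, these fields ride alongside signedUrl and agentId in the same settings object used when creating the session token. All values below are illustrative, and the override schema is an assumption; see the ElevenLabs docs for the full shape:

```typescript
// Illustrative values only; the field names match the subsections below.
const elevenLabsAgentSettings = {
  signedUrl: "<signed-url-from-elevenlabs>",
  agentId: "<your-agent-id>",
  userId: "user_123", // analytics tracking in ElevenLabs
  dynamicVariables: { user_name: "Ada" }, // fills {{user_name}} in the prompt
  conversationConfigOverride: {
    // Assumed override shape; consult the ElevenLabs docs for the schema.
    agent: { first_message: "Hi Ada, welcome back!", language: "en" },
  },
  customLlmExtraBody: { temperature: 0.5 }, // only for custom LLM backends
};

// dynamicVariables, conversationConfigOverride, and customLlmExtraBody are
// each limited to 10KB of serialized JSON.
const oversized =
  JSON.stringify(elevenLabsAgentSettings.dynamicVariables).length > 10 * 1024;
```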
The dynamicVariables, conversationConfigOverride, and customLlmExtraBody fields each have a 10KB size limit.
Dynamic Variables
Define placeholders like {{user_name}} in your ElevenLabs agent system prompt, then populate them at runtime:
Configuration Overrides
Modify per-conversation settings like first message, language, system prompt, or TTS voice:
User Identification
Pass userId for analytics tracking in ElevenLabs:
Custom LLM Parameters
If your ElevenLabs agent uses a custom LLM backend, pass additional parameters:
Client-Side Implementation
The client code is minimal—just fetch a session token and stream:
React Component Example
The MESSAGE_STREAM_EVENT_RECEIVED event fires for each text chunk from the user and the agent. Accumulate chunks by message ID to construct full transcripts. Each event also includes endOfSpeech to indicate when a message is complete.
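For illustration, accumulation can be as simple as keying a map by message ID. The event field names here are assumptions based on the description above:

```typescript
// Assumed event shape: id, role, content chunk, and an endOfSpeech flag.
type MessageStreamEvent = {
  id: string;
  role: string;
  content: string;
  endOfSpeech: boolean;
};

type Transcript = { role: string; text: string; complete: boolean };

// Append each chunk to the transcript for its message ID.
function accumulateChunk(
  transcripts: Map<string, Transcript>,
  event: MessageStreamEvent,
): void {
  const existing = transcripts.get(event.id);
  transcripts.set(event.id, {
    role: event.role,
    text: (existing?.text ?? "") + event.content,
    complete: event.endOfSpeech,
  });
}
```

Register accumulateChunk as the handler for MESSAGE_STREAM_EVENT_RECEIVED; the exact listener-registration API depends on the SDK version.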
Feature Support
Works with server-side integration:
- Voice intelligence (STT, LLM, TTS)
- Expressive V3 voices
- Interruption handling
- Custom knowledge bases
- Server-side tools (webhooks)
- Conversation history
- Session recordings and transcripts in Anam Lab
Not supported: client tools (browser-based tool execution). Use the client-side integration if your agent needs client tools.
Troubleshooting
ElevenLabs API key returns 403 or permission error
- After creating an API key in the ElevenLabs dashboard, you must edit the key and grant write access to Conversational AI (ElevenLabs Agents). This permission is not enabled by default.
- Go to ElevenLabs → API Keys → click the key → enable the Conversational AI permission → save.
Failed to get ElevenLabs signed URL
- Verify ELEVENLABS_API_KEY is valid and has Conversational AI permissions (see above)
- Confirm the agentId exists in your ElevenLabs dashboard
- Verify your ElevenLabs plan includes Conversational AI access
Avatar connects but no conversation starts
- The signed URL may have expired; create the session token immediately before the client needs it
- Verify ElevenLabs agent is active (not paused) in the dashboard
Anam API error 400 when creating session token
- Validate avatarId at lab.anam.ai/avatars
- Ensure elevenLabsAgentSettings includes both signedUrl and agentId
Avatar lips not syncing or no audio
- The server-side integration handles audio format matching automatically
- Check ElevenLabs agent configuration for supported voice models
Session recordings not appearing in Anam Lab
- Recordings are generated after the session ends; allow several minutes for processing
- Verify the session completed cleanly (the client called stopStreaming() or the connection closed normally)
Resources
Source Code
Full source code for the server-side integration
Cookbook: Server-Side Agents
Step-by-step tutorial for this integration
Cookbook: Expressive Voice Agents
Guide to using ElevenLabs V3 expressive voices with Anam
ElevenLabs Docs
Official ElevenLabs Conversational AI documentation
Avatar Gallery
Browse available stock avatars
Client-Side Approach
Use audio passthrough for direct client-side TTS control
