Give your VideoSDK voice agent a face

Anam is now natively supported on VideoSDK, bringing real-time, photorealistic avatars to any WebRTC pipeline.


Add a face to any VideoSDK pipeline

Anam is now natively supported on VideoSDK as a first-class plugin that adds photorealistic, real-time avatars to any voice agent pipeline you've already built.

Built for developers who want higher engagement and user preference without touching their existing stack. Your STT, LLM, and TTS stay exactly as they are. Anam adds the face.

Avatar interactions outperform voice across every deployment: +24% conversion, +44% engagement, 70% user preference over voice-only. This integration is the fastest path to that outcome. Under 10 lines of code, no infrastructure changes.

What adding an Anam avatar to your VideoSDK pipeline solves:

  • Voice isn't enough. Users consistently prefer video: a 70% preference rate across every deployment.

  • Zero infrastructure rebuild. Plugs directly into CascadingPipeline or RealTimePipeline alongside your existing STT, LLM, and TTS. Nothing changes except the output.

  • Native WebRTC. The avatar stream is a standard VideoSDK participant. No proxy layers, no custom delivery.

  • Credibility at scale. Higher engagement, completion, and trust across sales, onboarding, support, and training.


  • 70% user preference over voice-only

  • <1s response latency

  • <10 lines of code to integrate

  • #1 real-time avatar model in market

Integration features

Everything you need to ship production-grade avatar experiences on VideoSDK.


Native WebRTC integration

  • No proxy layers or workarounds

  • Synchronized lip-synced audio/video output

  • Works with VideoSDK's full security and scaling stack

  • Developer benefit: Ship avatars without touching your WebRTC setup

Sub-second responsiveness

  • 180ms median server-side latency

  • Natural turn-taking and interruption support

  • 25fps bespoke rendering model

  • Developer benefit: No latency tuning. Works out of the box

Developer-first API

  • Single pip install, no heavy setup

  • Drop into any pipeline without rebuilding

  • Full Python SDK with typed interfaces

  • Developer benefit: From pip install to live avatar in minutes

Unmatched realism

  • Independent 178-participant benchmark study

  • 24% higher than nearest competitor (p < 0.001)

  • Verified at avatarbenchmark.com

  • Developer benefit: Ship the best avatar quality available, not second-best

Installation guide

1. Install the VideoSDK Anam plugin

Install from PyPI using uv or pip. Everything needed for avatar streaming is included.

uv add "videosdk-plugins-anam"
# or: pip install videosdk-plugins-anam

2. Initialize AnamAvatar

Import and initialize with your Anam API key and avatar ID. Retrieve your key from lab.anam.ai and browse avatars at lab.anam.ai/avatars.

import os
from videosdk.plugins.anam import AnamAvatar

anam_avatar = AnamAvatar(
    api_key=os.getenv("ANAM_API_KEY"),
    avatar_id=os.getenv("ANAM_AVATAR_ID"),
)

3. Add to a CascadingPipeline

Pass AnamAvatar as the avatar parameter alongside your STT, LLM, TTS, VAD, and turn detector.

import os

from videosdk.agents import CascadingPipeline
from videosdk.plugins.deepgram import DeepgramSTT
from videosdk.plugins.openai import OpenAILLM
from videosdk.plugins.elevenlabs import ElevenLabsTTS
from videosdk.plugins.silero import SileroVAD
from videosdk.plugins.turn_detector import TurnDetector

pipeline = CascadingPipeline(
    stt=DeepgramSTT(model="nova-3", api_key=os.getenv("DEEPGRAM_API_KEY")),
    llm=OpenAILLM(model="gpt-4o-mini", api_key=os.getenv("OPENAI_API_KEY")),
    tts=ElevenLabsTTS(api_key=os.getenv("ELEVENLABS_API_KEY"), enable_streaming=True),
    vad=SileroVAD(),
    turn_detector=TurnDetector(threshold=0.8),
    avatar=anam_avatar,
)

Audio input → STT → LLM → TTS → Anam avatar A/V stream

4. Or use a RealTimePipeline (Gemini Live)

For native audio models like Gemini Live, use RealTimePipeline. The model's audio drives the avatar directly.

from videosdk.agents import RealTimePipeline
from videosdk.plugins.google import GeminiRealtime, GeminiLiveConfig

model = GeminiRealtime(
    model="gemini-2.5-flash-native-audio-preview-12-2025",
    config=GeminiLiveConfig(voice="Leda", response_modalities=["AUDIO"]),
)

pipeline = RealTimePipeline(model=model, avatar=anam_avatar)
# avatar video streams to all participants in the VideoSDK room

Setup time: 5–10 mins

Difficulty: Beginner

Category: Infrastructure

Type: Native SDK Plugin

Frequently asked questions

Avatar not rendering in stream

Confirm your avatar_id exists in lab.anam.ai, verify the API key is active, and check that the avatar stage is assigned before calling pipeline.start(). Review VideoSDK session logs for participant stream errors.
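One way to catch the first two causes before they produce a blank stream is a small preflight check run ahead of pipeline.start(). `preflight_problems` is a hypothetical helper sketched here, not part of either SDK; it only checks the two environment variables this guide uses:

```python
import os

def preflight_problems() -> list:
    """Collect common misconfigurations before starting the pipeline.

    A missing API key or avatar ID is the most frequent cause of a
    blank avatar stream, and failing fast beats debugging silence.
    """
    problems = []
    if not os.getenv("ANAM_API_KEY"):
        problems.append("ANAM_API_KEY is not set; generate a key at lab.anam.ai")
    if not os.getenv("ANAM_AVATAR_ID"):
        problems.append("ANAM_AVATAR_ID is not set; browse IDs at lab.anam.ai/avatars")
    return problems
```

Run it at startup and abort with the collected messages instead of letting the session come up without video.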

Lip sync out of alignment

Check TTS response times first — high-latency providers delay avatar rendering. Use streaming TTS output where available, and ensure audio chunk sizes match Anam's expected input format.
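If your TTS provider emits variable-size audio buffers, re-slicing them into uniform chunks before they reach the avatar keeps delivery regular. This is a generic sketch; the 4096-byte default is an illustrative assumption, not a documented Anam requirement:

```python
def rechunk(audio: bytes, chunk_size: int = 4096) -> list:
    """Split a raw PCM byte stream into fixed-size chunks.

    The final chunk may be shorter than chunk_size; pad it if your
    downstream consumer requires uniform frames.
    """
    return [audio[i:i + chunk_size] for i in range(0, len(audio), chunk_size)]
```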

Authentication errors on start

Generate a fresh API key from lab.anam.ai. Confirm the key is passed as a string, not an environment variable reference, and check it is not scoped to a different project.
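A common slip behind this error is passing the literal variable name, or an unexpanded shell reference like "$ANAM_API_KEY", instead of the key's value. A small guard makes that failure obvious at startup; `resolve_anam_key` is a hypothetical helper, not an SDK function:

```python
import os
from typing import Optional

def resolve_anam_key(raw: Optional[str] = None) -> str:
    """Return a usable Anam API key, catching two frequent mistakes:
    the key is missing entirely, or a variable reference was passed
    instead of its value."""
    key = raw if raw is not None else os.getenv("ANAM_API_KEY")
    if not key:
        raise RuntimeError("ANAM_API_KEY is not set; generate one at lab.anam.ai")
    if key.startswith("$") or key == "ANAM_API_KEY":
        raise RuntimeError("Pass the key string itself, not a reference to it")
    return key
```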

Pipeline import errors

Run pip show videosdk-plugins-anam to confirm installation. Check you are in the correct virtual environment and that Python 3.9+ is in use. Reinstall with pip install --upgrade videosdk-plugins-anam.

High latency or frozen video

Check LLM endpoint latency — slow models will bottleneck the pipeline. Review network conditions and consider switching to a faster LLM or TTS provider. Check status.anam.ai for any active incidents.
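To find which stage is the bottleneck, a plain timing decorator around your provider calls is often enough. Nothing below is specific to VideoSDK or Anam; `fake_llm_call` is a stand-in for whatever call you want to measure:

```python
import time
from functools import wraps

def timed(fn):
    """Log how long each call takes so slow LLM or TTS stages stand out."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            print(f"{fn.__name__}: {elapsed_ms:.0f} ms")
    return wrapper

@timed
def fake_llm_call(prompt: str) -> str:
    time.sleep(0.05)  # stand-in for a real model round trip
    return f"echo: {prompt}"
```

Wrap each provider call the same way and compare the logged numbers against the 180ms server-side figure quoted above to see where time is going.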

Which STT, LLM, and TTS providers are supported?

Anam works with any VideoSDK-compatible provider. This includes Deepgram and AssemblyAI for STT, OpenAI and Anthropic for LLM, and ElevenLabs and Cartesia for TTS. You can also bring your own custom endpoints.

What are custom avatars?

Custom avatars (One-Shot Avatars) are AI-powered video avatars you can create and use in your projects. Each plan includes a certain number of avatar slots that determine how many custom avatars you can have active at once.

Is the integration HIPAA compliant?

Yes. Anam is HIPAA compliant and SOC 2 certified. Zero Data Retention mode is available for privacy-sensitive deployments: no session data is stored after a session ends. See the Trust Center for full documentation.

Where can I find the full cookbook?

The full working example with setup, configuration, and testing is at anam.ai/cookbook/videosdk-anam-avatar.