## Installation

```shell
pip install livekit-plugins-anam
```
## Environment variables

| Service | Where to get it |
| --- | --- |
| Anam | lab.anam.ai |
| LiveKit | LiveKit Cloud or self-hosted |
| LLM providers | DeepGram, ElevenLabs, OpenAI, Google AI Studio, etc. |

```shell
ANAM_API_KEY=your_anam_api_key
ANAM_AVATAR_ID=your_avatar_id
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
OPENAI_API_KEY=your_openai_api_key
# or
GEMINI_API_KEY=your_gemini_api_key
```
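A missing variable typically surfaces later as an opaque connection error, so a quick startup check can help. This is a minimal sketch; the `missing_vars` helper and the variable list are illustrative, not part of the plugin (pick the LLM key that matches your provider):

```python
import os

# Variables the examples in this guide rely on (the LLM key depends on your provider).
REQUIRED_VARS = [
    "ANAM_API_KEY",
    "ANAM_AVATAR_ID",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
]

def missing_vars(env=os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_vars()  # e.g. [] when everything is configured
```

Calling this at the top of your entrypoint and failing fast with the list of missing names is usually easier to debug than a mid-session connection failure.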
## PersonaConfig

Configure the avatar identity:

```python
persona_config = anam.PersonaConfig(
    name="Maya",           # Display name for the avatar
    avatarId="uuid-here",  # Avatar appearance ID
)
```

**name** (string): Display name for the avatar. Used in logs and debugging.
## AvatarSession

```python
avatar = anam.AvatarSession(
    persona_config=anam.PersonaConfig(...),
    api_key="your_api_key",
    api_url="https://api.anam.ai",  # Optional
)
```

**persona_config** (PersonaConfig): Configuration for the avatar's identity and appearance.

**api_url** (string, default `"https://api.anam.ai"`): Anam API endpoint. Override for staging or self-hosted deployments.
## start()

Starts the avatar session and connects it to the LiveKit room.

```python
await avatar.start(session, room=ctx.room)
```

**session**: The LiveKit agent session to connect the avatar to.

**room**: The LiveKit room instance from the job context.
## Advanced examples

### Gemini with Vision

Use Gemini Live for multimodal conversations with screen share analysis:

```python
import os

from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.agents.voice import VoiceActivityVideoSampler, room_io
from livekit.plugins import anam, google


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    llm = google.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        api_key=os.getenv("GEMINI_API_KEY"),
        voice="Aoede",
        instructions="You are a helpful assistant that can see the user's screen.",
    )

    avatar = anam.AvatarSession(
        persona_config=anam.PersonaConfig(
            name="Maya",
            avatarId=os.getenv("ANAM_AVATAR_ID"),
        ),
        api_key=os.getenv("ANAM_API_KEY"),
    )

    session = AgentSession(
        llm=llm,
        video_sampler=VoiceActivityVideoSampler(
            speaking_fps=0.2,
            silent_fps=0.1,
        ),
    )

    await avatar.start(session, room=ctx.room)
    await session.start(
        agent=Agent(instructions="Help the user with what you see on their screen."),
        room=ctx.room,
        room_input_options=room_io.RoomInputOptions(video_enabled=True),
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```
### Custom tools

Extend your agent with custom tools:

```python
from livekit.agents import function_tool


@function_tool
async def fill_form_field(field_name: str, value: str) -> str:
    """Fill in a form field on the user's screen.

    Args:
        field_name: The name of the field to fill
        value: The value to enter

    Returns:
        Confirmation message
    """
    await send_command_to_frontend("fill_field", {"field": field_name, "value": value})
    return "Field filled successfully"


session = AgentSession(
    llm=llm,
    tools=[fill_form_field],
)
```
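The `send_command_to_frontend` helper above is not part of the plugin. One way to implement it is over LiveKit's data channel; this is a sketch assuming your frontend subscribes to an `"agent-commands"` topic and that the tool has access to the room (e.g. captured in a closure inside your entrypoint, since the call above doesn't pass it):

```python
import json


def encode_command(command: str, payload: dict) -> bytes:
    """Serialize a command for the frontend as a JSON data-channel message."""
    return json.dumps({"command": command, "payload": payload}).encode("utf-8")


async def send_command_to_frontend(room, command: str, payload: dict) -> None:
    # publish_data sends bytes to the other participants over LiveKit's
    # data channel; the frontend listens on the topic and applies the command.
    await room.local_participant.publish_data(
        encode_command(command, payload),
        topic="agent-commands",
    )
```

Keeping the serialization in a separate pure function makes the wire format easy to unit-test without a live room.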
## Running your agent

The agent connects to your LiveKit server and automatically joins rooms when participants connect. Deploy using Docker, Kubernetes, or your preferred container platform. See the LiveKit Agents deployment guide for details.
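For local runs, the standard LiveKit Agents CLI modes apply (assuming your entrypoint lives in a file named `agent.py`; the filename is illustrative):

```shell
python agent.py dev    # development mode
python agent.py start  # production mode
```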
## Troubleshooting

### Agent won't connect to LiveKit

- Verify `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` are correct
- Check that your LiveKit server is accessible
- Ensure WebSocket connections aren't blocked by a firewall
- Test connectivity at meet.livekit.io

### Avatar doesn't appear

- Verify your `ANAM_API_KEY` is valid
- Check that `ANAM_AVATAR_ID` matches an existing avatar
- Review agent logs for Anam connection errors
- Ensure the avatar session starts before the agent session

### Agent doesn't respond

- Check your LLM API key is valid (OpenAI, Gemini, etc.)
- Verify microphone permissions in the browser
- Look for API errors in the agent logs
- Confirm the agent is receiving audio tracks

### High latency or choppy audio

- Check your network connection stability
- Consider using LiveKit Cloud for optimized routing
- Reduce video sampling frequency if CPU-bound
- Monitor your LLM API response times
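For the CPU-bound case, the sampling rates from the Gemini example can be lowered. The values below are illustrative; fewer frames per second means less vision context reaches the LLM:

```python
from livekit.agents.voice import VoiceActivityVideoSampler

video_sampler = VoiceActivityVideoSampler(
    speaking_fps=0.1,   # roughly one frame every 10 s while the user speaks
    silent_fps=0.05,    # roughly one frame every 20 s while silent
)
```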