Installation

pip install livekit-plugins-anam
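The examples below also use the Google and OpenAI plugins; a typical install for this guide might look like the following (adjust to the LLM providers you actually use):

```shell
# Core agents framework plus the Anam avatar plugin
pip install livekit-agents livekit-plugins-anam

# Plugins for the LLM providers used in the examples below
pip install livekit-plugins-google livekit-plugins-openai
```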

Environment variables

Where to get each credential:
  • Anam: lab.anam.ai
  • LiveKit: LiveKit Cloud or self-hosted
  • LLM providers: Deepgram, ElevenLabs, OpenAI, Google AI Studio, etc.
.env
ANAM_API_KEY=your_anam_api_key
ANAM_AVATAR_ID=your_avatar_id

LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret

OPENAI_API_KEY=your_openai_api_key
# or
GEMINI_API_KEY=your_gemini_api_key
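Missing or empty variables are a common source of startup failures, so it can help to fail fast before connecting. A minimal sketch using only the standard library (`missing_env_vars` is a hypothetical helper, not part of the plugin):

```python
import os

# The variables this guide relies on (swap GEMINI/OPENAI keys as needed)
REQUIRED = ("ANAM_API_KEY", "ANAM_AVATAR_ID",
            "LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")

def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

# Call this before starting the agent, e.g.:
#   if missing := missing_env_vars():
#       raise SystemExit(f"Missing env vars: {', '.join(missing)}")
```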

PersonaConfig

Configure the avatar identity:
persona_config = anam.PersonaConfig(
    name="Maya",           # Display name for the avatar
    avatarId="uuid-here",  # Avatar appearance ID
)
name (string, required)
    Display name for the avatar. Used in logs and debugging.
avatarId (string, required)
    UUID of the avatar to use. Get this from the Avatar Gallery or Anam Lab.

AvatarSession

Create an avatar session with your Anam credentials:
avatar = anam.AvatarSession(
    persona_config=anam.PersonaConfig(...),
    api_key="your_api_key",
    api_url="https://api.anam.ai",  # Optional
)
persona_config (PersonaConfig, required)
    Configuration for the avatar’s identity and appearance.
api_key (string, required)
    Your Anam API key.
api_url (string, default: "https://api.anam.ai")
    Anam API endpoint. Override for staging or self-hosted deployments.

start()

Starts the avatar session and connects it to the LiveKit room.
await avatar.start(session, room=ctx.room)
session (AgentSession, required)
    The LiveKit agent session to connect the avatar to.
room (rtc.Room, required)
    The LiveKit room instance from the job context.

Advanced examples

Gemini with Vision

Use Gemini Live for multimodal conversations with screen share analysis:
import os
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.agents.voice import VoiceActivityVideoSampler, room_io
from livekit.plugins import anam, google

async def entrypoint(ctx: JobContext):
    await ctx.connect()

    llm = google.realtime.RealtimeModel(
        model="gemini-2.0-flash-exp",
        api_key=os.getenv("GEMINI_API_KEY"),
        voice="Aoede",
        instructions="You are a helpful assistant that can see the user's screen.",
    )

    avatar = anam.AvatarSession(
        persona_config=anam.PersonaConfig(
            name="Maya",
            avatarId=os.getenv("ANAM_AVATAR_ID"),
        ),
        api_key=os.getenv("ANAM_API_KEY"),
    )

    session = AgentSession(
        llm=llm,
        video_sampler=VoiceActivityVideoSampler(
            speaking_fps=0.2,
            silent_fps=0.1,
        ),
    )

    await avatar.start(session, room=ctx.room)
    await session.start(
        agent=Agent(instructions="Help the user with what you see on their screen."),
        room=ctx.room,
        room_input_options=room_io.RoomInputOptions(video_enabled=True),
    )

if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))

Function tools

Extend your agent with custom tools:
from livekit.agents import function_tool

@function_tool
async def fill_form_field(field_name: str, value: str) -> str:
    """Fill in a form field on the user's screen.

    Args:
        field_name: The name of the field to fill
        value: The value to enter

    Returns:
        Confirmation message
    """
    # send_command_to_frontend is an app-specific helper you implement yourself
    # (e.g. via LiveKit data messages or RPC); it is not part of the plugin.
    await send_command_to_frontend("fill_field", {"field": field_name, "value": value})
    return "Field filled successfully"

session = AgentSession(
    llm=llm,
    tools=[fill_form_field],
)

Running your agent

python agent.py dev
Connects to your LiveKit server and automatically joins rooms when participants connect.
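The `dev` subcommand shown above is one of several modes provided by the livekit-agents CLI; a sketch of the common ones (verify against your installed version):

```shell
# Development: connects to your LiveKit server with hot reload and verbose logs
python agent.py dev

# Production: registers the agent as a worker with your LiveKit server
python agent.py start
```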

Troubleshooting

Connection issues

  • Verify LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET are correct
  • Check that your LiveKit server is accessible
  • Ensure WebSocket connections aren’t blocked by a firewall
  • Test connectivity at meet.livekit.io

Avatar not appearing

  • Verify your ANAM_API_KEY is valid
  • Check that ANAM_AVATAR_ID matches an existing avatar
  • Review agent logs for Anam connection errors
  • Ensure the avatar session starts before the agent session

Agent not responding

  • Check your LLM API key is valid (OpenAI, Gemini, etc.)
  • Verify microphone permissions in the browser
  • Look for API errors in the agent logs
  • Confirm the agent is receiving audio tracks

Performance issues

  • Check your network connection stability
  • Consider using LiveKit Cloud for optimized routing
  • Reduce video sampling frequency if CPU-bound
  • Monitor your LLM API response times
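For the connectivity items above, a quick check can be sketched with the standard library. This is a hypothetical helper (not part of any LiveKit package) and only verifies TCP reachability of the LIVEKIT_URL host, not the WebSocket upgrade or your credentials:

```python
import socket
from urllib.parse import urlparse

def livekit_host_port(url):
    """Extract the host and TCP port from a LIVEKIT_URL (wss:// or ws://)."""
    parsed = urlparse(url)
    default = 443 if parsed.scheme == "wss" else 80
    return parsed.hostname, parsed.port or default

def can_reach(url, timeout=5.0):
    """True if a plain TCP connection to the LiveKit endpoint succeeds."""
    host, port = livekit_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `can_reach(os.environ["LIVEKIT_URL"])` returns False, look at firewall rules or DNS before debugging the agent itself.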