Built-in models
| LLM ID | Model | Best for |
|---|---|---|
| 0934d97d-0c3a-4f33-91b0-5e136a0ef466 | OpenAI GPT-4.1 Mini | Recommended for most projects |
| a7cf662c-2ace-4de1-a21e-ef0fbf144bb7 | GPT OSS 120B | High throughput reasoning, great at tool calling |
| 27cbd128-f1e6-4b67-8ab3-9123659be08c | Gemini 3 Flash Preview | Fast reasoning with predictable tool calling |
| 9d8900ee-257d-4401-8817-ba9c835e9d36 | Gemini 2.5 Flash | Our fastest model |
| 88190a76-3e87-4935-ab39-f4f73038815a | Kimi K2 | Great at agentic tasks |
| ANAM_LLAMA_v3_3_70B_V1 | Llama 3.3 70B | Open-source preference, larger context |
Using a built-in LLM
Set the `llmId` field in your persona configuration to the ID of the model you want to use.
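A minimal sketch of a persona configuration, assuming a plain JSON-style config object; `llmId` is the documented field, while the other field names (`name`, `systemPrompt`) are illustrative assumptions:

```typescript
// Sketch of a persona configuration. Only `llmId` is documented above;
// `name` and `systemPrompt` are assumed fields for illustration.
const personaConfig = {
  name: "my-assistant", // assumed field
  systemPrompt: "You are a friendly, concise assistant.", // assumed field
  // ID for OpenAI GPT-4.1 Mini, from the built-in models table above
  llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466",
};

export { personaConfig };
```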
Choosing a model
For most use cases, GPT-4.1 Mini is a good starting point — it balances speed, cost, and quality. If your persona uses tools heavily, consider GPT OSS 120B or Gemini 3 Flash Preview for more reliable tool calling. If latency is your top priority, Gemini 2.5 Flash is the fastest option.
Greeting behavior
When using a built-in LLM, the persona greets the user with an opening message when the session starts. The content of this greeting is controlled by the system prompt. To skip the greeting entirely, set `skipGreeting` to `true`.
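For example, a configuration that suppresses the opening message might look like the following sketch; `skipGreeting` is the documented flag, and the remaining fields are assumptions carried over for context:

```typescript
// Sketch: disabling the opening greeting. `skipGreeting` is the documented
// flag; `systemPrompt` is an assumed field for illustration.
const personaConfig = {
  systemPrompt: "You are a friendly, concise assistant.", // assumed field
  llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466", // OpenAI GPT-4.1 Mini
  skipGreeting: true, // persona stays silent until the user speaks first
};

export { personaConfig };
```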
Bring your own LLM
If the built-in models don’t fit your needs, you can connect your own:

- Server-side custom LLMs — Register your model with Anam and we call it from our servers, keeping latency low.
- Client-side custom LLMs — Handle LLM calls yourself in your client code using `CUSTOMER_CLIENT_V1` as the LLM ID.
- LiveKit — Use Anam as a face layer in your existing LiveKit agent pipeline with any LLM.
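For the client-side option, the only documented requirement is the special LLM ID; a minimal sketch, assuming the same JSON-style config shape as above (how you then feed your own model's responses to the persona depends on your client SDK and is not shown here):

```typescript
// Sketch: opting out of built-in LLMs. `CUSTOMER_CLIENT_V1` is the
// documented ID that tells Anam your client code will supply LLM output.
const personaConfig = {
  name: "my-assistant", // assumed field
  // No system prompt needed here: your own LLM pipeline owns the prompting.
  llmId: "CUSTOMER_CLIENT_V1",
};

export { personaConfig };
```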

