# Custom LLMs

> Use your own language models with Anam's digital personas

Anam supports integration with custom Large Language Models (LLMs), allowing you to use your own models while benefiting from Anam's persona, voice, and streaming infrastructure.

<Info>
  Anam calls your custom LLM directly from its servers, which reduces latency
  and simplifies your integration. All API credentials you provide are
  encrypted at rest using AES-256.
</Info>

### Other Ways to Use Custom LLMs

This page covers **server-side custom LLMs** where Anam handles the LLM calls for you. There are other integration patterns:

* **[Custom LLM (client-side)](/examples/custom-llm)** — Handle LLM calls yourself in your client code and stream responses to the persona
* **[ElevenLabs Agents](https://anam.ai/cookbook/elevenlabs-server-side-agents)** — Use ElevenLabs Conversational AI as your LLM + TTS provider with an Anam avatar
* **[LiveKit](/third-party-integrations/livekit)** — Use Anam avatars as a face layer in your existing LiveKit agent pipeline

## How Custom LLMs Work

When you create a custom LLM configuration in Anam:

1. **Model Registration**: You register your LLM details with Anam, including the model endpoint and authentication credentials
2. **Server-Side Processing**: Anam makes all LLM calls from its own servers, reducing latency and complexity
3. **Secure Storage**: Your API keys and credentials are encrypted and securely stored
4. **Integration**: Use your custom LLM ID in place of Anam's built-in models

## Creating a Custom LLM

To create a custom LLM, you'll need to:

1. Register your LLM configuration through the Anam API or dashboard
2. Provide the necessary connection details (endpoint, API keys, model parameters)
3. Receive a unique LLM ID for your custom model
4. Use this ID when creating session tokens

### Register in Anam Lab

1. Open [Anam Lab](https://lab.anam.ai/build).
2. Go to the **LLMs** tab.
3. Add a custom LLM.
4. Choose the provider format that matches your endpoint.
5. Enter the endpoint URL, model name, and API secret.
6. Run the built-in validation test.
7. Copy the generated LLM ID and use it as `personaConfig.llmId`.

### Register through the API

Call `POST /v1/llms` from your server. The `secret` value is encrypted at rest and is not returned by later API responses.

```bash theme={"system"}
curl -X POST "https://api.anam.ai/v1/llms" \
  -H "Authorization: Bearer $ANAM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "displayName": "My OpenAI-compatible model",
    "description": "Production assistant model",
    "urls": [
      {
        "url": "https://api.openai.com/v1/chat/completions"
      }
    ],
    "llmFormat": "openai",
    "modelName": "gpt-4o",
    "secret": "YOUR_LLM_PROVIDER_API_KEY",
    "temperature": 0.7
  }'
```

The response includes the `id` to use as your custom `llmId`:

```json theme={"system"}
{
  "id": "your-custom-llm-id",
  "displayName": "My OpenAI-compatible model",
  "llmFormat": "openai",
  "urls": [
    {
      "url": "https://api.openai.com/v1/chat/completions"
    }
  ],
  "modelName": "gpt-4o"
}
```
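If your backend is already written in Node.js, the same registration can be sketched with `fetch`. This is a minimal sketch, assuming Node.js 18+ (global `fetch`). The request fields mirror the `curl` example above. The names `LlmRegistration`, `buildRegisterRequest`, and `registerCustomLlm` are illustrative and not part of any Anam SDK, and treating `description` and `temperature` as optional is an assumption.

```typescript
// Sketch only: field names come from the curl example above.
// Which fields are optional is an assumption, not documented behavior.
interface LlmRegistration {
  displayName: string;
  description?: string;
  urls: { url: string }[];
  llmFormat: string;
  modelName: string;
  secret: string;
  temperature?: number;
}

// Build the fetch options separately so they can be inspected or unit-tested.
function buildRegisterRequest(apiKey: string, body: LlmRegistration) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}

// Hypothetical helper: sends the registration and returns the new LLM ID.
async function registerCustomLlm(apiKey: string, body: LlmRegistration): Promise<string> {
  const response = await fetch('https://api.anam.ai/v1/llms', buildRegisterRequest(apiKey, body));
  if (!response.ok) {
    throw new Error(`Failed to register LLM: ${response.status}`);
  }
  const { id } = (await response.json()) as { id: string };
  return id; // use this value as personaConfig.llmId
}
```

Run this from server code only, so that neither the Anam API key nor the provider `secret` reaches the browser.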

## Supported LLM Specifications

Anam supports custom LLMs that comply with one of the following API specifications:

* **OpenAI API Specification** - Compatible with OpenAI's chat completion endpoints
* **Azure OpenAI API Specification** - Compatible with Azure's OpenAI service endpoints
* **Gemini API Specification** - Compatible with Google's Gemini API endpoints
* **Groq OpenAI API Specification** - Compatible with Groq's OpenAI-compatible API endpoints

<Warning>
  Your custom LLM must support streaming responses. Non-streaming LLMs will not
  work with Anam's real-time persona interactions.
</Warning>

## Specifying Multiple Endpoints

Anam allows you to specify multiple endpoints per LLM. The Anam backend automatically routes to the fastest available endpoint from the data centre where the Anam engine is running, and falls back to the other endpoints if a request fails.

<Note>
  To ensure routing selects the fastest available endpoint, Anam may occasionally send small probe prompts to your configured endpoints. Probes run only while sessions are active for that LLM and are lightweight, at around 1,500 tokens each. They are infrequent (a few times per hour at most), have no effect on active conversations, and exist solely to maintain reliable performance.
</Note>
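A registration body with several endpoints might be built as follows. This is a sketch, assuming that each entry in `urls` uses the same `{ "url": ... }` shape as the single-endpoint example and that one `secret` is shared across all endpoints. The helper name `buildMultiEndpointBody` and both hostnames are placeholders.

```typescript
// Sketch: build a POST /v1/llms body that lists several endpoints for one model.
// The hostnames below are placeholders, not real deployments.
function buildMultiEndpointBody(endpointUrls: string[], secret: string) {
  if (endpointUrls.length === 0) {
    throw new Error('At least one endpoint URL is required');
  }
  return {
    displayName: 'My multi-region model',
    // Same { url } object shape as the single-endpoint example.
    urls: endpointUrls.map((url) => ({ url })),
    llmFormat: 'openai',
    modelName: 'your-model-name',
    secret, // assumed to be shared by every endpoint
    temperature: 0.7,
  };
}

const body = buildMultiEndpointBody(
  [
    'https://eu.llm.example.com/v1/chat/completions',
    'https://us.llm.example.com/v1/chat/completions',
  ],
  'YOUR_LLM_PROVIDER_API_KEY',
);
console.log(JSON.stringify(body.urls));
```

Because routing and fallback happen inside Anam, the order of the URLs is not assumed to matter in this sketch.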

## Technical Requirements

<Steps>
  <Step title="API Compatibility">
    Your LLM server must implement one of the supported API specifications mentioned above. This includes:

    * Matching the request/response format
    * Supporting the same authentication methods
    * Implementing compatible endpoint paths
  </Step>

  <Step title="Streaming Support">
    Enable streaming responses in your LLM implementation:

    * Honor `stream: true` in the request and return a streamed response
    * Use Server-Sent Events (SSE) for streaming chunks
    * Include proper content types and formatting
  </Step>

  <Step title="Validation Testing">
    When you add your LLM in the Anam Lab, automatic tests verify:

    * API specification compliance
    * Streaming functionality
    * Response format compatibility
    * Authentication mechanisms

    <Tip>
      The Lab will provide feedback if your LLM doesn't meet the requirements, helping you identify what needs to be fixed.
    </Tip>
  </Step>
</Steps>

<Note>
  **Testing Tip**: We recommend using `curl` commands to compare your custom
  LLM's raw HTTP responses with those from the actual providers (OpenAI, Azure
  OpenAI, Gemini, or Groq). Client libraries like the OpenAI SDK often transform
  responses and extract specific values, which can mask differences in the
  actual HTTP response format. Your custom implementation must match the raw
  HTTP response structure, not the transformed output from client libraries.
</Note>

### Streaming Response Shape

For `openai`, `azure_openai`, and `groq_openai` formats, return a Server-Sent Events stream where each `data:` payload is a `chat.completion.chunk` JSON object. Put generated text in `choices[0].delta.content`, and finish with a chunk that sets `finish_reason`.

```text OpenAI-compatible SSE theme={"system"}
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"id":"chatcmpl-custom-1","object":"chat.completion.chunk","created":1720000000,"model":"your-model-name","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-custom-1","object":"chat.completion.chunk","created":1720000000,"model":"your-model-name","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-custom-1","object":"chat.completion.chunk","created":1720000000,"model":"your-model-name","choices":[{"index":0,"delta":{"content":" there"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-custom-1","object":"chat.completion.chunk","created":1720000000,"model":"your-model-name","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
```

If you register the LLM with `llmFormat: "gemini"`, match Gemini's raw `streamGenerateContent` response shape instead of the OpenAI-compatible SSE shape above.
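If you are writing your own server for one of the OpenAI-compatible formats, the chunk sequence shown above can be produced with a small formatter. This is a sketch. The helper names `sseFrame` and `toSseFrames` are illustrative, and how you split the generated text into pieces is up to your model server.

```typescript
// Sketch: format text pieces as OpenAI-compatible SSE frames.
interface ChunkDelta {
  role?: string;
  content?: string;
}

// One SSE frame: a "data:" line holding a chat.completion.chunk, then a blank line.
function sseFrame(
  id: string,
  model: string,
  created: number,
  delta: ChunkDelta,
  finishReason: string | null,
): string {
  const chunk = {
    id,
    object: 'chat.completion.chunk',
    created,
    model,
    choices: [{ index: 0, delta, logprobs: null, finish_reason: finishReason }],
  };
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Role chunk first, one content chunk per piece, then a chunk that sets finish_reason.
function toSseFrames(pieces: string[], id: string, model: string, created: number): string[] {
  return [
    sseFrame(id, model, created, { role: 'assistant' }, null),
    ...pieces.map((content) => sseFrame(id, model, created, { content }, null)),
    sseFrame(id, model, created, {}, 'stop'),
  ];
}

const frames = toSseFrames(['Hello', ' there'], 'chatcmpl-custom-1', 'your-model-name', 1720000000);
console.log(frames.join(''));
```

Write each frame to the response as soon as it is available, with `Content-Type: text/event-stream`. OpenAI's own API also ends the stream with a `data: [DONE]` line. Whether Anam requires that marker is not stated here, so compare against the real provider's raw output.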

### Validation Checklist

Before using the LLM in production:

1. Send a raw `curl` request to the upstream endpoint with streaming enabled (`stream: true` for the OpenAI-compatible formats).
2. Confirm the endpoint returns streaming chunks in the provider's expected format.
3. Add the LLM in Anam Lab or through `POST /v1/llms`.
4. Run the Anam Lab validation test.
5. Create a short test persona session with the returned `llmId`.

Common validation failures include non-streaming responses, a mismatched `llmFormat`, invalid provider credentials, and endpoints that are not reachable from Anam's servers.
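To check a captured response against the OpenAI-compatible shape before registering it, a small parser along these lines can help. This is a sketch, not Anam's validation logic. The name `summarizeSse` is hypothetical, and it assumes you saved the raw SSE body from a `curl` call as a string.

```typescript
// Sketch: summarize a captured OpenAI-compatible SSE body.
interface StreamSummary {
  text: string;
  finishReason: string | null;
  chunkCount: number;
}

function summarizeSse(raw: string): StreamSummary {
  const summary: StreamSummary = { text: '', finishReason: null, chunkCount: 0 };
  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue; // skip blank separator lines and comments
    const payload = line.slice('data:'.length).trim();
    if (payload === '' || payload === '[DONE]') continue; // OpenAI-style end marker
    const chunk = JSON.parse(payload); // throws if a frame is not valid JSON
    const choice = chunk.choices?.[0];
    if (!choice) continue;
    summary.chunkCount += 1;
    summary.text += choice.delta?.content ?? '';
    if (choice.finish_reason) summary.finishReason = choice.finish_reason;
  }
  return summary;
}

const sample = [
  'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{"content":" there"},"finish_reason":null}]}',
  'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
].join('\n\n');

console.log(summarizeSse(sample));
```

A response that arrives as a single JSON object rather than `data:` frames yields `chunkCount` of 0, which points to a non-streaming endpoint. A `finishReason` of `null` suggests the stream ended without a final chunk.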

### Example Custom LLM Endpoints

If you're building your own LLM server, ensure your endpoints match one of these patterns:

<CodeGroup>
  ```bash OpenAI Spec theme={"system"}
  POST /v1/chat/completions
  Content-Type: application/json
  Authorization: Bearer YOUR_API_KEY

  {
    "model": "your-model-name",
    "messages": [...],
    "stream": true
  }
  ```

  ```bash Azure OpenAI Spec theme={"system"}
  POST /openai/deployments/{deployment-id}/chat/completions?api-version=2024-02-01
  Content-Type: application/json
  api-key: YOUR_API_KEY

  {
    "messages": [...],
    "stream": true
  }
  ```

  ```bash Gemini Spec theme={"system"}
  POST /v1beta/models/{model}:streamGenerateContent
  Content-Type: application/json
  x-goog-api-key: YOUR_API_KEY

  {
    "contents": [...],
    "generationConfig": {...}
  }
  ```

  ```bash Groq OpenAI Spec theme={"system"}
  POST /openai/v1/chat/completions
  Content-Type: application/json
  Authorization: Bearer YOUR_API_KEY

  {
    "model": "your-model-name",
    "messages": [...],
    "stream": true,
    "reasoning_format": "parsed",
    "reasoning_effort": "medium"
  }
  ```
</CodeGroup>
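The differences between the four patterns come down to the path and the authentication header, which can be summarized in code. This is a sketch built from the examples above. The type and function names are illustrative, and the `api-version` value is copied from the Azure example rather than being a recommendation.

```typescript
// Sketch: path and headers for each supported llmFormat, taken from the examples above.
type LlmFormat = 'openai' | 'azure_openai' | 'gemini' | 'groq_openai';

interface RequestShape {
  path: string;
  headers: Record<string, string>;
}

// For azure_openai, modelOrDeployment is the deployment ID; otherwise it is the model name.
function requestShape(format: LlmFormat, apiKey: string, modelOrDeployment: string): RequestShape {
  const json = { 'Content-Type': 'application/json' };
  switch (format) {
    case 'openai':
      return {
        path: '/v1/chat/completions',
        headers: { ...json, Authorization: `Bearer ${apiKey}` },
      };
    case 'groq_openai':
      return {
        path: '/openai/v1/chat/completions',
        headers: { ...json, Authorization: `Bearer ${apiKey}` },
      };
    case 'azure_openai':
      return {
        path: `/openai/deployments/${modelOrDeployment}/chat/completions?api-version=2024-02-01`,
        headers: { ...json, 'api-key': apiKey },
      };
    case 'gemini':
      return {
        path: `/v1beta/models/${modelOrDeployment}:streamGenerateContent`,
        headers: { ...json, 'x-goog-api-key': apiKey },
      };
  }
}

console.log(requestShape('azure_openai', 'YOUR_API_KEY', 'my-deployment').path);
```

A table like this is useful in a test suite for your own server: for each format you intend to support, assert that the route exists and that the expected header is accepted.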

## Using Custom LLMs

Once you have your custom LLM ID, use it when requesting session tokens:

<CodeGroup>
  ```typescript Node.js (with personaConfig) theme={"system"}
  const response = await fetch('https://api.anam.ai/v1/auth/session-token', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.ANAM_API_KEY}`
    },
    body: JSON.stringify({
      personaConfig: {
        name: 'Sebastian',
        avatarId: '30fa96d0-26c4-4e55-94a0-517025942e18',
        avatarModel: 'cara-4',
        voiceId: '6bfbe25a-979d-40f3-a92b-5394170af54b',
        llmId: 'your-custom-llm-id', // Your custom LLM ID
        systemPrompt: 'You are a helpful customer service representative.',
      },
    })
  });

  if (!response.ok) {
    throw new Error(`Failed to create session token: ${response.status}`);
  }

  const { sessionToken } = await response.json();
  ```

  ```typescript Node.js (with personaId) theme={"system"}
  // If you've already created a persona with your custom LLM in the Lab
  const response = await fetch('https://api.anam.ai/v1/auth/session-token', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.ANAM_API_KEY}`
    },
    body: JSON.stringify({
      personaConfig: {
        personaId: 'your-persona-id', // Your persona ID
      },
    })
  });

  if (!response.ok) {
    throw new Error(`Failed to create session token: ${response.status}`);
  }

  const { sessionToken } = await response.json();
  ```
</CodeGroup>

Use the `personaId` form if you assigned the custom LLM to a persona in Anam Lab. Use the full `personaConfig` form if you want to choose the custom LLM at session creation time.

## Security Considerations

<Check>
  **Encryption at Rest**: All API keys and credentials are encrypted using
  AES-256 before storage.
</Check>

<Check>
  **Secure Transmission**: Credentials are transmitted over TLS 1.3 and never
  exposed in logs or responses.
</Check>

<Check>
  **Access Control**: Only your account can use your custom LLM configurations.
</Check>

## Benefits of Server-Side Processing

By processing custom LLMs from Anam's servers:

1. **Reduced Latency**: Direct server-to-server communication eliminates client-side round trips
2. **Simpler Client Code**: No need to manage LLM connections in your client application
3. **Integrated Streaming**: Your custom LLM works with Anam's voice and video streaming
4. **Credential Security**: API keys stay on the server, never exposed to client-side code
5. **Automatic Scaling**: Anam handles load balancing and scaling

## Next Steps

<CardGroup cols={2}>
  <Card title="Cookbook: Custom LLM (Client-Side)" icon="book-open" href="https://anam.ai/cookbook/custom-llm-client-side">
    Tutorial for integrating your own LLM on the client side
  </Card>

  <Card title="Cookbook: Python BYO LLM" icon="book-open" href="https://anam.ai/cookbook/python-byo-llm">
    Bring your own LLM using the Python SDK
  </Card>

  <Card title="Personas with Custom LLMs" icon="user" href="/personas/overview">
    Learn how personas work with custom language models
  </Card>

  <Card title="Setup in Anam Lab" icon="flask" href="https://lab.anam.ai/llms">
    Configure a custom LLM in the Anam Lab
  </Card>
</CardGroup>