# Custom LLM (client-side)

> Build your own AI conversation logic with OpenAI, Anthropic, and other language models

Learn how to bypass Anam's built-in language models and integrate your own custom LLM for complete control over conversation logic. This guide uses OpenAI as an example, but the pattern works with any LLM provider (Anthropic, Google Gemini, Groq, Mistral, etc.).

**New Feature**: Anam now supports [server-side custom LLMs](/concepts/custom-llms) where we handle the LLM calls for you, improving latency and simplifying development. This guide shows the client-side approach where you manage the LLM calls yourself.

## What You'll Build

By the end of this guide, you'll have a persona application featuring:

* **Custom AI Brain** using your own language model (OpenAI GPT-4.1-mini)
* **Streaming Responses** with real-time text-to-speech conversion
* **Turn-taking Management** that handles conversation flow
* **Message History Integration** that maintains conversation context
* **Error Handling & Recovery** for production use

After completing the initial setup (Steps 1-4), you can extend this foundation by adding features like conversation memory, different LLM providers, custom system prompts, or specialized AI behaviors.

This guide uses **OpenAI's GPT-4.1-mini** as an example custom LLM for demonstration purposes. In your actual application, you would replace the OpenAI integration with calls to your specific LLM provider. The core integration pattern remains the same regardless of your LLM choice.

## Prerequisites

* **Node.js** (version 18 or higher) and **npm** installed
* Understanding of modern JavaScript/TypeScript and streaming APIs
* An Anam API key ([get one here](/api-key))
* An OpenAI API key ([get one here](https://platform.openai.com/api-keys))
* Basic knowledge of Express.js and modern web development
* A microphone and speakers for voice interaction

## Understanding the Custom LLM Flow

Before diving into the implementation, here is how custom LLM integration works with Anam personas. Regardless of your custom LLM provider, the implementation pattern follows these steps:

1. The `llmId: "CUSTOMER_CLIENT_V1"` setting in the session token request disables Anam's default AI, allowing you to handle all conversation logic.
2. The `MESSAGE_HISTORY_UPDATED` event fires when the user finishes speaking, providing the complete conversation history including the new user message.
3. Your server endpoint receives the conversation history and generates a streaming response using your chosen LLM (OpenAI in this example).
4. The LLM response is streamed back to the client and forwarded to the persona using `createTalkMessageStream()` for text-to-speech conversion.

Using these core concepts, we'll build a simple web application that allows you to chat with your custom LLM-powered persona.

## Basic Setup

Let's start by building the foundation with custom LLM integration. This setup creates a web application with four main components:

```
anam-custom-llm-app/
├── server.js           # Express server with streaming LLM endpoint
├── package.json        # Node.js dependencies
├── public/             # Static files served to the browser
│   ├── index.html      # Main HTML page with video element
│   └── script.js       # Client-side JavaScript for persona control
└── .env                # Environment variables
```

```bash theme={"system"}
mkdir anam-custom-llm-app
cd anam-custom-llm-app
```

```bash theme={"system"}
npm init -y
```

This creates a `package.json` file for managing dependencies.

```bash theme={"system"}
mkdir public
```

The `public` folder will contain your HTML and JavaScript files that are served to the browser.

```bash theme={"system"}
npm install express dotenv openai
```

We're installing Express for the server, dotenv for environment variables, and the OpenAI SDK for custom LLM integration. The Anam SDK will be loaded directly from a CDN in the browser.

Create a `.env` file in your project root to store your API keys securely:

```bash .env theme={"system"}
ANAM_API_KEY=your-anam-api-key-here
OPENAI_API_KEY=your-openai-api-key-here
```

Replace the placeholder values with your actual API keys. Never commit this file to version control.

### Step 1: Set up your server with LLM streaming

Create an Express server that handles both session token generation and LLM streaming:

```javascript server.js theme={"system"}
require('dotenv').config();
const express = require('express');
const OpenAI = require('openai');

const app = express();

// Initialize OpenAI client
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

app.use(express.json());
app.use(express.static('public'));

// Session token endpoint with custom brain configuration
app.post('/api/session-token', async (req, res) => {
  try {
    const response = await fetch('https://api.anam.ai/v1/auth/session-token', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.ANAM_API_KEY}`,
      },
      body: JSON.stringify({
        personaConfig: {
          name: 'Cara',
          avatarId: '30fa96d0-26c4-4e55-94a0-517025942e18',
          avatarModel: 'cara-4',
          voiceId: '6bfbe25a-979d-40f3-a92b-5394170af54b',
          // This disables Anam's default brain and enables custom LLM integration
          llmId: 'CUSTOMER_CLIENT_V1',
        },
      }),
    });

    // Treat a non-2xx reply as a failure instead of returning an undefined token
    if (!response.ok) {
      throw new Error(`Session token request failed: ${response.status}`);
    }

    const data = await response.json();
    res.json({ sessionToken: data.sessionToken });
  } catch (error) {
    console.error('Session token error:', error);
    res.status(500).json({ error: 'Failed to create session' });
  }
});

// Custom LLM streaming endpoint
app.post('/api/chat-stream', async (req, res) => {
  try {
    const { messages } = req.body;

    // Create a streaming response from OpenAI
    const stream = await openai.chat.completions.create({
      model: 'gpt-4.1-mini',
      messages: [
        {
          role: 'system',
          content:
            'You are Cara, a helpful AI assistant. Be friendly, concise, and conversational in your responses. Keep responses under 100 words unless specifically asked for detailed information.',
        },
        ...messages,
      ],
      stream: true,
      temperature: 0.7,
    });

    // Set headers for streaming response
    res.setHeader('Content-Type', 'text/event-stream');
    res.setHeader('Cache-Control', 'no-cache');
    res.setHeader('Connection', 'keep-alive');

    // Process the OpenAI stream and forward to client
    for await (const chunk of stream) {
      const content = chunk.choices[0]?.delta?.content || '';
      if (content) {
        // Send each chunk as one JSON object per line
        res.write(JSON.stringify({ content }) + '\n');
      }
    }

    res.end();
  } catch (error) {
    console.error('LLM streaming error:', error);
    if (res.headersSent) {
      // The stream has already started, so a JSON error response can no longer be sent
      res.end();
    } else {
      res.status(500).json({ error: 'An error occurred while streaming response' });
    }
  }
});

app.listen(8000, () => {
  console.log('Server running on http://localhost:8000');
  console.log('Custom LLM integration ready!');
});
```

The key difference here is setting `llmId: "CUSTOMER_CLIENT_V1"`, which disables Anam's default AI and enables custom LLM integration. The `/api/chat-stream` endpoint handles the actual AI conversation logic.

### Step 2: Set up your HTML

Create a simple HTML page with a video element and a conversation display:

```html public/index.html theme={"system"}
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Custom LLM Persona - Anam Integration</title>
  </head>
  <body>
    <!-- Markup reconstructed around the visible text: the element IDs match those
         used in script.js; layout, attributes, and the Stop button label are assumptions -->
    <h1>Custom LLM Persona</h1>
    <video id="persona-video" autoplay playsinline></video>
    <button id="start-button">Start Conversation</button>
    <button id="stop-button" disabled>Stop Conversation</button>
    <div id="status">Ready to connect</div>
    <h3>Conversation</h3>
    <div id="chat-history">Start a conversation to see your chat history...</div>
    <script type="module" src="script.js"></script>
  </body>
</html>
```

### Step 3: Implement the client-side custom LLM integration

Create the client-side JavaScript that handles the custom LLM integration:

```javascript public/script.js theme={"system"}
import { createClient } from 'https://esm.sh/@anam-ai/js-sdk@latest';
import { AnamEvent } from 'https://esm.sh/@anam-ai/js-sdk@latest/dist/module/types';

let anamClient = null;

// Get DOM elements
const startButton = document.getElementById('start-button');
const stopButton = document.getElementById('stop-button');
const videoElement = document.getElementById('persona-video');
const statusElement = document.getElementById('status');
const chatHistory = document.getElementById('chat-history');

// Status management
function updateStatus(message, type = 'normal') {
  statusElement.textContent = message;
  const colors = {
    loading: '#f39c12',
    connected: '#28a745',
    error: '#dc3545',
    normal: '#333',
  };
  statusElement.style.color = colors[type] || colors.normal;
}

// Chat history management
function updateChatHistory(messages) {
  if (!chatHistory) return;
  chatHistory.innerHTML = '';

  if (messages.length === 0) {
    chatHistory.textContent = 'Start a conversation to see your chat history...';
    return;
  }

  messages.forEach((message) => {
    const messageDiv = document.createElement('div');
    const isUser = message.role === 'user';
    messageDiv.style.cssText = `
      margin-bottom: 10px;
      padding: 8px 12px;
      border-radius: 8px;
      max-width: 85%;
      background: ${isUser ? '#e3f2fd' : '#f1f8e9'};
      ${isUser ? 'margin-left: auto; text-align: right;' : ''}
    `;
    // textContent keeps transcribed or generated text from being interpreted as HTML
    messageDiv.textContent = `${isUser ? 'You' : 'Cara'}: ${message.content}`;
    chatHistory.appendChild(messageDiv);
  });

  // Scroll to bottom
  chatHistory.scrollTop = chatHistory.scrollHeight;
}

// Custom LLM response handler
async function handleUserMessage(messageHistory) {
  // Only respond to user messages
  if (messageHistory.length === 0 || messageHistory[messageHistory.length - 1].role !== 'user') {
    return;
  }

  if (!anamClient) return;

  try {
    console.log('Getting custom LLM response for:', messageHistory);

    // Convert Anam message format to OpenAI format
    const openAIMessages = messageHistory.map((msg) => ({
      role: msg.role === 'user' ? 'user' : 'assistant',
      content: msg.content,
    }));

    // Create a streaming talk session
    // You can optionally pass a correlationId to track this specific message stream
    const talkStream = anamClient.createTalkMessageStream();

    // Call our custom LLM streaming endpoint
    const response = await fetch('/api/chat-stream', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages: openAIMessages }),
    });

    if (!response.ok) {
      throw new Error(`LLM request failed: ${response.status}`);
    }

    const reader = response.body?.getReader();
    if (!reader) {
      throw new Error('Failed to get response stream reader');
    }

    const textDecoder = new TextDecoder();
    let buffer = ''; // Holds a partial line until the rest of it arrives
    console.log('Streaming LLM response to persona...');

    // Stream the response chunks to the persona
    while (true) {
      const { done, value } = await reader.read();

      if (done) {
        console.log('LLM streaming complete');
        if (talkStream.isActive()) {
          talkStream.endMessage();
        }
        break;
      }

      if (value) {
        // A network chunk can end mid-line (or mid-character), so only parse complete lines
        buffer += textDecoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop();

        for (const line of lines.filter((l) => l.trim())) {
          try {
            const data = JSON.parse(line);
            if (data.content && talkStream.isActive()) {
              talkStream.streamMessageChunk(data.content, false);
            }
          } catch (parseError) {
            // Ignore parse errors in streaming
          }
        }
      }
    }
  } catch (error) {
    console.error('Custom LLM error:', error);
    if (anamClient) {
      anamClient.talk(
        "I'm sorry, I encountered an error while processing your request. Please try again."
      );
    }
  }
}

async function startConversation() {
  try {
    startButton.disabled = true;
    updateStatus('Connecting...', 'loading');

    // Get session token from server
    const response = await fetch('/api/session-token', {
      method: 'POST',
    });

    if (!response.ok) {
      throw new Error('Failed to get session token');
    }

    const { sessionToken } = await response.json();

    // Create Anam client
    anamClient = createClient(sessionToken);

    // Set up event listeners
    anamClient.addListener(AnamEvent.SESSION_READY, () => {
      console.log('Session ready!');
      updateStatus('Connected - Custom LLM active', 'connected');
      startButton.disabled = true;
      stopButton.disabled = false;

      // Send initial greeting
      anamClient.talk("Hello! I'm Cara, powered by a custom AI brain. How can I help you today?");
    });

    anamClient.addListener(AnamEvent.CONNECTION_CLOSED, () => {
      console.log('Connection closed');
      stopConversation();
    });

    // This is the key event for custom LLM integration
    anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, handleUserMessage);

    // Update chat history in real-time
    anamClient.addListener(AnamEvent.MESSAGE_HISTORY_UPDATED, (messages) => {
      updateChatHistory(messages);
    });

    // Handle stream interruptions (user interrupted the persona while speaking)
    anamClient.addListener(AnamEvent.TALK_STREAM_INTERRUPTED, () => {
      console.log('Talk stream interrupted by user');
    });

    // Start streaming to video element
    await anamClient.streamToVideoElement('persona-video');

    console.log('Custom LLM persona started successfully!');
  } catch (error) {
    console.error('Failed to start conversation:', error);
    updateStatus(`Error: ${error.message}`, 'error');
    startButton.disabled = false;
  }
}

function stopConversation() {
  if (anamClient) {
    anamClient.stopStreaming();
    anamClient = null;
  }

  // Reset UI
  videoElement.srcObject = null;
  updateChatHistory([]);
  updateStatus('Disconnected', 'normal');
  startButton.disabled = false;
  stopButton.disabled = true;

  console.log('Conversation stopped');
}

// Add event listeners
startButton.addEventListener('click', startConversation);
stopButton.addEventListener('click', stopConversation);

// Cleanup on page unload
window.addEventListener('beforeunload', stopConversation);
```

### Step 4: Test your custom LLM integration

1. Start your server:

   ```bash theme={"system"}
   node server.js
   ```

2. Open [http://localhost:8000](http://localhost:8000) in your browser

3. Click "Start Conversation" to begin chatting with your custom LLM-powered persona!

You should see Cara appear and greet you, powered by your custom OpenAI integration. Try having a conversation - your voice will be transcribed, sent to OpenAI's GPT-4.1-mini, and the response will be streamed back through the persona's voice and video.

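The chunk parsing in `handleUserMessage` can be isolated in a small helper that holds any partial line until the rest of it arrives, which makes the logic testable without a browser or a live session. This is a minimal sketch that assumes the newline-delimited `{ content }` format written by the `/api/chat-stream` endpoint above; `createLineBuffer` is a hypothetical name and not part of the Anam SDK.

```javascript
// Hypothetical helper (not part of the Anam SDK): collects decoded text
// chunks and returns the parsed JSON objects for every complete line.
function createLineBuffer() {
  let pending = ''; // text received after the last newline seen so far

  return {
    push(text) {
      pending += text;
      const lines = pending.split('\n');
      pending = lines.pop(); // the last piece may be an incomplete line

      const parsed = [];
      for (const line of lines) {
        if (!line.trim()) continue;
        try {
          parsed.push(JSON.parse(line));
        } catch (parseError) {
          // Skip malformed lines rather than aborting the stream
        }
      }
      return parsed;
    },
  };
}

// Usage: a JSON line split across two chunks is emitted only once it is complete
const lineBuffer = createLineBuffer();
const first = lineBuffer.push('{"content":"Hel');
const second = lineBuffer.push('lo"}\n{"content":" there"}\n');
console.log(first.length, second.map((item) => item.content).join(''));
// prints: 0 Hello there
```

Inside the read loop, each decoded chunk would be passed to `push()`, and each returned object's `content` forwarded with `talkStream.streamMessageChunk()`.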
## Advanced Features

### Enhanced Error Handling

Add retry logic to improve reliability:

```javascript theme={"system"}
// Add this to your script.js handleUserMessage function
async function handleUserMessage(messageHistory) {
  if (messageHistory.length === 0 || messageHistory[messageHistory.length - 1].role !== 'user') {
    return;
  }

  if (!anamClient) return;

  const maxRetries = 3;
  let retryCount = 0;

  while (retryCount < maxRetries) {
    try {
      // ... existing LLM call code ...
      return; // Success, exit retry loop
    } catch (error) {
      retryCount++;
      console.error(`Custom LLM error (attempt ${retryCount}):`, error);

      if (retryCount >= maxRetries) {
        // Final fallback response
        if (anamClient) {
          anamClient.talk(
            "I'm experiencing some technical difficulties. Please try rephrasing your question or try again in a moment."
          );
        }
      } else {
        // Wait before retry
        await new Promise((resolve) => setTimeout(resolve, 1000 * retryCount));
      }
    }
  }
}
```

## What You've Built

You've integrated a custom language model with Anam's persona system. Your application includes:

* **Custom AI Brain**: Control over your persona's intelligence using OpenAI's GPT-4.1-mini, with the ability to customize personality, knowledge, and behavior.
* **Real-time Streaming**: Responses stream from your LLM through the persona's voice.
* **Conversation Context**: Full conversation history is maintained and provided to your LLM for contextually aware responses.
* **Error Handling**: Retry logic and fallback responses for reliability.
* **Extensible Architecture**: The modular design allows you to swap LLM providers, add custom logic, or integrate with other AI services.

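The retry loop from the Enhanced Error Handling section can also be factored into a reusable wrapper. The following is a minimal sketch that assumes the same linear backoff as that section (a base delay multiplied by the attempt number); `withRetry` and its option names are hypothetical and chosen for illustration.

```javascript
// Hypothetical helper: runs an async task up to maxRetries times,
// waiting baseDelayMs * attempt between failed attempts (linear backoff).
async function withRetry(task, { maxRetries = 3, baseDelayMs = 1000 } = {}) {
  let lastError;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      console.error(`Attempt ${attempt} failed:`, error);

      if (attempt < maxRetries) {
        // Wait before the next attempt
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }

  // Every attempt failed; the caller decides on a fallback
  throw lastError;
}
```

In `handleUserMessage`, the LLM call would go inside `task`, with the fallback `anamClient.talk(...)` in a `catch` around `withRetry`. Because a retry requests the whole response again, retrying after part of a reply has already been streamed to the persona may repeat speech, so it may be safer to retry only failures that occur before the first chunk arrives.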
## Troubleshooting

**Symptoms**: Persona doesn't speak or responses are delayed

**Solutions**:

* Verify OpenAI API key is correctly configured
* Check that `llmId: "CUSTOMER_CLIENT_V1"` is set in session token
* Ensure `MESSAGE_HISTORY_UPDATED` event listener is properly connected
* Check browser console for JavaScript errors
* Verify the `/api/chat-stream` endpoint is responding correctly

**Symptoms**: Slow or choppy persona responses

**Solutions**:

* Optimize LLM model parameters (reduce max\_tokens, adjust temperature)
* Implement response caching for common queries
* Use faster models like `gpt-4.1-mini` instead of `gpt-4`
* Consider chunking large responses for better streaming
* Monitor network latency and server performance
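
As one possible take on response caching for common queries, the sketch below keeps recent replies in memory, keyed by the normalized text of the latest user message. Everything here is an assumption made for illustration: `createResponseCache`, its options, and the normalization rule are hypothetical, and the defaults are arbitrary.

```javascript
// Hypothetical in-memory cache for replies to common queries.
// Keys are the normalized text of the latest user message.
function createResponseCache({ maxEntries = 100, ttlMs = 5 * 60 * 1000, now = Date.now } = {}) {
  const entries = new Map(); // Map preserves insertion order, oldest first

  const normalize = (text) => text.trim().toLowerCase().replace(/\s+/g, ' ');

  return {
    get(userText) {
      const key = normalize(userText);
      const entry = entries.get(key);
      if (!entry) return undefined;

      if (now() - entry.storedAt > ttlMs) {
        entries.delete(key); // entry has expired
        return undefined;
      }
      return entry.reply;
    },

    set(userText, reply) {
      const key = normalize(userText);
      entries.delete(key); // re-inserting moves the key to the newest position
      entries.set(key, { reply, storedAt: now() });

      if (entries.size > maxEntries) {
        // Evict the oldest entry
        entries.delete(entries.keys().next().value);
      }
    },
  };
}
```

A cached reply could be spoken with `anamClient.talk()`, as used elsewhere in this guide, instead of calling the LLM. Keying on the last message alone ignores earlier conversation context, so this approach suits questions whose answers do not depend on what was said before.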