Changelog

New features, improvements, and fixes.

September 11, 2025

Intelligent LLM Routing for Faster Responses

The performance of LLM endpoints can be highly variable, with time-to-first-token latencies sometimes fluctuating by as much as 500 ms from one day to the next depending on regional load. To solve this and ensure your personas respond as quickly and reliably as possible, we've rolled out a new intelligent routing system for LLM requests. It is active for our turnkey customers and for customers using their own server-side Custom LLMs, provided they deploy multiple endpoints.

LLM config options

This new system constantly monitors the health and performance of all configured LLM endpoints by sending lightweight probes at regular intervals. Using a time-aware moving average, it builds a real-time picture of network latency and processing speed for each endpoint. When a request is made, the system uses this data to calculate the optimal route, automatically shedding load from any overloaded or slow endpoints within a region.
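To make the idea of a time-aware moving average concrete, here is a minimal sketch in Python. It is not Anam's implementation: the class and function names, the 30-second half-life, and the "lowest average wins" rule are all assumptions chosen for illustration. The point it shows is that older probe results lose weight according to elapsed time rather than sample count, so a stale measurement cannot dominate the estimate.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EndpointStats:
    """Time-decayed moving average of probe latency for one endpoint (hypothetical)."""

    half_life_s: float = 30.0  # assumption: a sample's weight halves every 30 s
    avg_ms: Optional[float] = None
    last_ts: Optional[float] = None

    def update(self, latency_ms: float, ts: float) -> None:
        """Fold a new probe result, taken at time `ts` (seconds), into the average."""
        if self.avg_ms is None or self.last_ts is None:
            self.avg_ms = latency_ms
        else:
            dt = max(0.0, ts - self.last_ts)
            # History decays with elapsed time, not with the number of samples.
            keep = 0.5 ** (dt / self.half_life_s)
            self.avg_ms = keep * self.avg_ms + (1.0 - keep) * latency_ms
        self.last_ts = ts


def pick_endpoint(stats: Dict[str, EndpointStats]) -> str:
    """Return the name of the endpoint with the lowest current average latency."""
    measured = {name: s.avg_ms for name, s in stats.items() if s.avg_ms is not None}
    if not measured:
        raise ValueError("no endpoint has been probed yet")
    return min(measured, key=lambda name: measured[name])
```

A real router would likely also weigh error rates and regional constraints; this sketch only covers the latency estimate and a naive selection step.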

Improvements

Generate one-shot avatars from text prompts

You can now generate one-shot avatars from text prompts within the lab, powered by Gemini’s new Nano Banana model. The one-shot creation flow has been redesigned for speed and ease of use, and is now available on all plans. Image upload and webcam avatars remain exclusive to Pro and Enterprise.

One shot text to image

Improved management of published embed widgets

Published embed widgets can now be configured and monitored from the lab at https://lab.anam.ai/personas/published.

Improvements

Automatic failover to backup data centres

To ensure maximum uptime and reliability for our personas, we’ve implemented automatic failover to backup data centres.

Fixes

Prevent session crash on long user speech

Previously, unbroken user speech exceeding 30 seconds would trigger a transcription error and crash the session. We now automatically truncate continuous speech to 30 seconds, preventing sessions from failing in these rare cases.
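As a rough illustration of this kind of cap, the sketch below truncates a raw audio buffer to 30 seconds. Only the 30-second limit comes from this entry; the function name, the mono 16-bit PCM format, the 16 kHz sample rate, and the choice to keep the start of the utterance are assumptions made for the example.

```python
MAX_SPEECH_SECONDS = 30  # limit stated in this changelog entry


def truncate_speech(pcm: bytes, sample_rate: int = 16_000, bytes_per_sample: int = 2) -> bytes:
    """Keep at most the first MAX_SPEECH_SECONDS of mono PCM audio (hypothetical helper)."""
    max_bytes = MAX_SPEECH_SECONDS * sample_rate * bytes_per_sample
    # Slicing past the end is safe, so shorter utterances pass through unchanged.
    return pcm[:max_bytes]
```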

Allow configurable session lengths of up to 2 hours for Enterprise plans

A bug capped sessions on Enterprise plans at a maximum timeout of 30 minutes instead of 2 hours. This has now been fixed.

Resolved slow connection times caused by incorrect database region selection

An undocumented issue with our database provider led to incorrect region selection for our databases. Simply refreshing our credentials resolved the problem, resulting in a ~1s improvement in median connection times and ~3s faster p95 times. While our provider works on a permanent fix, we’re actively monitoring for any recurrence.

September 4, 2025

Embed Widget

Embed personas directly into your website with our new widget. On the lab's build page, click Publish, then generate your unique HTML snippet. The snippet works in most common website builders, e.g. WordPress.org or Squarespace.

For added security, we recommend adding a whitelist containing your domain URL. This locks the persona down so that it only works on your website. You can also cap the number of sessions or give the widget an expiration period.
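The three controls above (domain whitelist, session cap, expiration) can be pictured as a single check made before a session starts. The Python sketch below is illustrative only and does not reflect how the widget is actually implemented; `WidgetPolicy`, `may_start_session`, and all field names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Set
from urllib.parse import urlparse


@dataclass
class WidgetPolicy:
    """Hypothetical container for the widget restrictions described above."""

    allowed_hosts: Set[str] = field(default_factory=set)  # empty set = no whitelist
    max_sessions: Optional[int] = None                    # None = no session cap
    expires_at: Optional[datetime] = None                 # None = never expires


def may_start_session(policy: WidgetPolicy, origin: str, sessions_used: int, now: datetime) -> bool:
    """Return True only if the request passes every configured restriction."""
    host = urlparse(origin).hostname
    if policy.allowed_hosts and host not in policy.allowed_hosts:
        return False  # page is not on the whitelisted domain
    if policy.max_sessions is not None and sessions_used >= policy.max_sessions:
        return False  # session cap reached
    if policy.expires_at is not None and now >= policy.expires_at:
        return False  # widget has expired
    return True
```

Checks like these need to run server-side to be meaningful, since anything enforced only in the embedded snippet could be bypassed by the page hosting it.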

Improvements

One-shot avatars available via API

Professional and Enterprise accounts can now create one-shot avatars via API. Docs here.

Improvements

Spend caps

You can now set a spend cap on your account. The option is available in profile settings.

Fixes

Prevent Cartesia from timing out when using slow custom LLMs

We’ve added a safeguard to prevent Cartesia contexts from unexpectedly closing during pauses in text streaming. With slower LLMs, or if there’s a break or slowdown in the text being sent, your connection will now stay alive, ensuring smoother, uninterrupted interactions.