New features, improvements, and fixes.
The performance of LLM endpoints can be highly variable, with time-to-first-token latencies sometimes fluctuating by as much as 500ms from one day to the next depending on regional load. To address this and ensure your personas respond as quickly and reliably as possible, we've rolled out a new intelligent routing system for LLM requests. It is active both for our turnkey customers and for customers using their own server-side Custom LLMs, provided they deploy multiple endpoints.
This new system constantly monitors the health and performance of all configured LLM endpoints by sending lightweight probes at regular intervals. Using a time-aware moving average, it builds a real-time picture of network latency and processing speed for each endpoint. When a request is made, the system uses this data to calculate the optimal route, automatically shedding load from any overloaded or slow endpoints within a region.
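To make the routing idea concrete, here is a minimal sketch of how latency-aware endpoint selection with a time-aware moving average can work. It is not our production code: the endpoint shape, the /health probe path, and the one-minute half-life are illustrative assumptions.

```typescript
// Sketch of latency-aware endpoint selection using a time-aware
// exponentially weighted moving average (EWMA). Values and paths are
// illustrative, not the production configuration.

interface EndpointStats {
  url: string;
  ewmaLatencyMs: number; // smoothed probe latency for this endpoint
  lastProbeAt: number;   // timestamp (ms) of the most recent probe
  healthy: boolean;
}

const HALF_LIFE_MS = 60_000; // older samples lose half their weight per minute (assumed)

function updateEwma(stats: EndpointStats, sampleMs: number, now: number): void {
  // Time-aware weighting: the longer since the last probe, the more
  // the new sample dominates the average.
  const dt = now - stats.lastProbeAt;
  const alpha = 1 - Math.exp(-(dt * Math.LN2) / HALF_LIFE_MS);
  stats.ewmaLatencyMs = alpha * sampleMs + (1 - alpha) * stats.ewmaLatencyMs;
  stats.lastProbeAt = now;
}

async function probe(stats: EndpointStats): Promise<void> {
  // Lightweight health probe; the /health path is an assumption.
  const start = Date.now();
  try {
    const res = await fetch(`${stats.url}/health`);
    stats.healthy = res.ok;
    updateEwma(stats, Date.now() - start, Date.now());
  } catch {
    stats.healthy = false;
  }
}

function pickEndpoint(endpoints: EndpointStats[]): EndpointStats | undefined {
  // Route to the healthy endpoint with the lowest smoothed latency,
  // which naturally sheds load away from slow or overloaded endpoints.
  return endpoints
    .filter((e) => e.healthy)
    .sort((a, b) => a.ewmaLatencyMs - b.ewmaLatencyMs)[0];
}
```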
• Generate one-shot avatars from text prompts
You can now generate one-shot avatars from text prompts within the lab, powered by Gemini’s new Nano Banana model. The one-shot creation flow has been redesigned for speed and ease of use, and is now available on all plans. Image upload and webcam avatars remain exclusive to Pro and Enterprise.
• Improved management of published embed widgets
Published embed widgets can now be configured and monitored from the lab at https://lab.anam.ai/personas/published.
• Automatic failover to backup data centres
To ensure maximum uptime and reliability for our personas, we’ve implemented automatic failover to backup data centres.
• Prevent session crash on long user speech
Previously, unbroken user speech exceeding 30 seconds would trigger a transcription error and crash the session. We now automatically truncate continuous speech to 30 seconds, preventing sessions from failing in these rare cases.
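The sketch below shows the general shape of this safeguard, assuming raw audio samples at a hypothetical 16 kHz sample rate; apart from the 30-second limit, the names and values are illustrative, not our actual pipeline.

```typescript
// Illustrative cap on continuous speech before transcription. The
// 30-second limit matches the changelog; the sample rate and function
// name are assumptions.

const MAX_SPEECH_SECONDS = 30;
const SAMPLE_RATE = 16_000; // samples per second (assumed)
const MAX_SAMPLES = MAX_SPEECH_SECONDS * SAMPLE_RATE;

function truncateSpeech(samples: Float32Array): Float32Array {
  // If the user speaks without pause for longer than the limit, only the
  // first 30 seconds are forwarded, so the transcription service never
  // receives an over-long segment that would previously error.
  return samples.length > MAX_SAMPLES
    ? samples.subarray(0, MAX_SAMPLES)
    : samples;
}
```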
• Allow configurable session lengths of up to 2 hours for Enterprise plans
We had a bug where sessions on Enterprise plans timed out after a maximum of 30 minutes instead of the intended 2 hours. This has now been fixed.
• Resolved slow connection times caused by incorrect database region selection
An undocumented issue with our database provider led to incorrect region selection for our databases. Simply refreshing our credentials resolved the problem, resulting in a ~1s improvement in median connection times and ~3s faster p95 times. While our provider works on a permanent fix, we’re actively monitoring for any recurrence.
Embed personas directly into your website with our new widget. On the lab's build page, click Publish, then generate your unique HTML snippet. This snippet will work in most common website builders, e.g. WordPress.org or Squarespace.
For added security, we recommend adding your domain URL to a whitelist. This locks the persona down so it only works on your website. You can also cap the number of sessions or give the widget an expiration period.
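To illustrate what the whitelist gives you, here is a rough sketch of the kind of origin check it enables when the widget requests a session. The function name and domain list are hypothetical examples, not the widget's actual implementation.

```typescript
// Rough sketch of a domain whitelist check for widget session requests.
// The allowed-origins list and function name are hypothetical.

const allowedOrigins = ["https://www.example.com"]; // your whitelisted domain(s)

function isOriginAllowed(requestOrigin: string | undefined): boolean {
  // A session is only issued when the embedding page's origin matches a
  // whitelisted domain, so the snippet is useless on any other site.
  return requestOrigin !== undefined && allowedOrigins.includes(requestOrigin);
}
```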
• One-shot avatars available via API
Professional and Enterprise accounts can now create one-shot avatars via API. Docs here.
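As a rough illustration only, a request might look something like the sketch below; the endpoint URL, payload fields, and headers are placeholders, so please follow the docs for the real request shape.

```typescript
// Placeholder sketch of creating a one-shot avatar over HTTP.
// The endpoint path, payload shape, and headers are hypothetical;
// refer to the official docs for the actual API.

async function createOneShotAvatar(apiKey: string, prompt: string): Promise<unknown> {
  const res = await fetch("https://api.anam.ai/v1/avatars/one-shot", { // hypothetical URL
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ prompt }), // hypothetical payload
  });
  if (!res.ok) throw new Error(`Avatar creation failed: ${res.status}`);
  return res.json();
}
```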
• Spend caps
It's now possible to set a spend cap on your account. Available in profile settings.
• Prevent Cartesia from timing out when using slow Custom LLMs
We’ve added a safeguard to prevent Cartesia contexts from unexpectedly closing during pauses in text streaming. With slower LLMs, or when there’s a break or slowdown in the text being sent, your connection will now stay alive, ensuring smoother, uninterrupted interactions.
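The sketch below shows the general pattern of such a safeguard: if no text chunk has been forwarded for a while, a keepalive is sent so the TTS context stays open. The sendKeepAlive callback and the 3-second threshold are stand-ins, not Cartesia's actual API.

```typescript
// Generic keepalive guard for a streaming TTS context. `sendKeepAlive`
// and the interval are stand-ins, not Cartesia's actual API.

const KEEPALIVE_INTERVAL_MS = 3_000;

function guardTtsContext(sendKeepAlive: () => void) {
  let lastChunkAt = Date.now();

  const timer = setInterval(() => {
    // If the LLM has paused and nothing has been forwarded recently,
    // ping the context so it isn't closed for inactivity.
    if (Date.now() - lastChunkAt >= KEEPALIVE_INTERVAL_MS) {
      sendKeepAlive();
    }
  }, KEEPALIVE_INTERVAL_MS);

  return {
    // Call whenever a new text chunk is forwarded from the LLM.
    onTextChunk: () => { lastChunkAt = Date.now(); },
    // Call when the session ends to stop the timer.
    stop: () => clearInterval(timer),
  };
}
```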