You can configure how your persona’s voice sounds, including its speed, volume, and emotion, via voiceGenerationOptions in the persona config. The available options depend on the provider and model of the configured voice.
const personaConfig = {
  name: "Cara",
  avatarId: "30fa96d0-26c4-4e55-94a0-517025942e18",
  voiceId: "6bfbe25a-979d-40f3-a92b-5394170af54b",
  llmId: "0934d97d-0c3a-4f33-91b0-5e136a0ef466",
  systemPrompt: "You are Cara, a helpful customer service representative.",
  voiceGenerationOptions: {
    speed: 1.2,
    volume: 0.8,
    emotion: "content"
  }
};
These options can be set when creating or updating a persona, or when generating a session token. See the API Reference for details.

Cartesia voices

The following options are valid for Cartesia sonic-3 voices:

| Option | Range / Type | Description |
| --- | --- | --- |
| volume | 0.5 – 2.0 | Multiplier to decrease or increase the volume of the original voice |
| speed | 0.6 – 1.5 | Multiplier to decrease or increase the speed of the original voice |
| emotion | string | Emotion to apply: neutral, calm, angry, content, sad, scared |
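
As a minimal sketch, the ranges in the table above could be checked on the client before a config is sent. The helper name validateCartesiaOptions and the constants are hypothetical and not part of any SDK, and the sketch assumes the range endpoints are inclusive.

```javascript
// Ranges copied from the Cartesia sonic-3 table above.
// Assumption: both endpoints of each range are inclusive.
const CARTESIA_SONIC3_LIMITS = {
  volume: { min: 0.5, max: 2.0 },
  speed: { min: 0.6, max: 1.5 },
};
const CARTESIA_SONIC3_EMOTIONS = ["neutral", "calm", "angry", "content", "sad", "scared"];

// Hypothetical helper (not an SDK function): returns a list of problems,
// or an empty list when every supplied option is within the documented range.
function validateCartesiaOptions(options) {
  const problems = [];
  for (const [name, { min, max }] of Object.entries(CARTESIA_SONIC3_LIMITS)) {
    const value = options[name];
    if (value === undefined) continue; // option omitted, nothing to check
    if (typeof value !== "number" || Number.isNaN(value) || value < min || value > max) {
      problems.push(`${name} must be a number between ${min} and ${max}`);
    }
  }
  if (options.emotion !== undefined && !CARTESIA_SONIC3_EMOTIONS.includes(options.emotion)) {
    problems.push(`emotion must be one of: ${CARTESIA_SONIC3_EMOTIONS.join(", ")}`);
  }
  return problems;
}

const problems = validateCartesiaOptions({ speed: 1.2, volume: 0.8, emotion: "content" });
console.log(problems.length === 0 ? "options look valid" : problems);
```

Returning a list of problems rather than throwing on the first one lets a caller report every out-of-range option at once.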

ElevenLabs voices

The following options are valid for ElevenLabs v1 and v2 voices:

| Option | Range | Description |
| --- | --- | --- |
| stability | 0 – 1 | How much the voice varies between generations. Lower values introduce more emotional variation |
| similarityBoost | 0 – 1 | How closely the generated voice matches the original reference audio |
| speed | 0.7 – 1.2 | Multiplier to decrease or increase the speed of the original voice |

For v2 voices, these additional options are also available:

| Option | Type / Range | Description |
| --- | --- | --- |
| useSpeakerBoost | boolean | Boost similarity to the original speaker. May increase latency |
| style | 0 – 1 | How much the original speaker’s style is amplified. Values other than 0 may increase latency |
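
The split between v1 and v2 options could be handled with a small filter, sketched below. The function elevenLabsOptionsFor and its modelVersion argument ("v1" or "v2") are illustrative assumptions, not fields or functions of the persona config or any SDK.

```javascript
// Options listed above as available only for v2 voices.
const V2_ONLY_OPTIONS = ["useSpeakerBoost", "style"];

// Hypothetical helper: `modelVersion` is an illustrative label ("v1" or "v2"),
// not a field of the persona config. Returns a new object and leaves the
// input untouched.
function elevenLabsOptionsFor(modelVersion, options) {
  if (modelVersion === "v2") return { ...options };
  const filtered = {};
  for (const [name, value] of Object.entries(options)) {
    // Keep only the options that are also valid for v1 voices.
    if (!V2_ONLY_OPTIONS.includes(name)) filtered[name] = value;
  }
  return filtered;
}

const desired = {
  stability: 0.4,
  similarityBoost: 0.75,
  speed: 1.1,
  useSpeakerBoost: true,
  style: 0.3,
};

console.log(elevenLabsOptionsFor("v1", desired)); // v2-only options removed
console.log(elevenLabsOptionsFor("v2", desired)); // all options kept
```

The result can then be used as the voiceGenerationOptions value in a persona config that references an ElevenLabs voice.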

Changing voices

The voiceGenerationOptions are specific to the provider, model, and voice being used. When you change the voiceId for a persona, these options reset. Use the copy persona feature in Lab when experimenting with different voices to avoid losing your existing config.
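
In client code, one way to mirror this reset is to build a fresh config whenever the voice changes instead of carrying the old options over. The sketch below is an assumption about how that might look: withVoice is a hypothetical helper, and "your-new-voice-id" is a placeholder rather than a real voice.

```javascript
// Hypothetical helper: returns a copy of the persona config that uses a new
// voice. Options from the previous voice are dropped, since they are specific
// to that voice's provider and model. The original config is not modified.
function withVoice(personaConfig, voiceId, voiceGenerationOptions) {
  const { voiceGenerationOptions: _previous, ...rest } = personaConfig;
  const next = { ...rest, voiceId };
  if (voiceGenerationOptions !== undefined) {
    next.voiceGenerationOptions = voiceGenerationOptions;
  }
  return next;
}

const original = {
  name: "Cara",
  voiceId: "6bfbe25a-979d-40f3-a92b-5394170af54b",
  voiceGenerationOptions: { speed: 1.2, volume: 0.8, emotion: "content" },
};

// "your-new-voice-id" is a placeholder, not a real voice ID.
const experiment = withVoice(original, "your-new-voice-id", { stability: 0.5, speed: 1.0 });

console.log(experiment.voiceGenerationOptions); // options chosen for the new voice
console.log(original.voiceGenerationOptions); // the existing options are still intact
```

Because the original object is left untouched, the earlier voice settings remain available if the experiment is abandoned.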