The ElevenLabs Text to Speech node converts written text into spoken audio using the ElevenLabs API. It allows you to select a specific voice and fine-tune various speech characteristics like stability, speed, and style to generate a customized audio output.

Inputs

| Parameter | Description | Data Type | Required | Range / Options |
| --- | --- | --- | --- | --- |
| voice | Voice to use for speech synthesis. Connect from Voice Selector or Instant Voice Clone. | CUSTOM | Yes | N/A |
| text | The text to convert to speech. | STRING | Yes | N/A |
| stability | Voice stability. Lower values give broader emotional range, higher values produce more consistent but potentially monotonous speech (default: 0.5). | FLOAT | No | 0.0 - 1.0 |
| apply_text_normalization | Text normalization mode. ‘auto’ lets the system decide, ‘on’ always applies normalization, ‘off’ skips it. | COMBO | No | "auto", "on", "off" |
| model | Model to use for text-to-speech. Selecting a model reveals its specific parameters. | DYNAMICCOMBO | No | "eleven_multilingual_v2", "eleven_v3" |
| language_code | ISO-639-1 or ISO-639-3 language code (e.g., ‘en’, ‘es’, ‘fra’). Leave empty for automatic detection (default: ""). | STRING | No | N/A |
| seed | Seed for reproducibility (determinism not guaranteed) (default: 1). | INT | No | 0 - 2147483647 |
| output_format | Audio output format. | COMBO | No | "mp3_44100_192", "opus_48000_192" |

Model-Specific Parameters: When the model parameter is set to "eleven_multilingual_v2", the following additional parameters become available:
  • speed: Speech speed. 1.0 is normal, <1.0 slower, >1.0 faster (default: 1.0, range: 0.7 - 1.3).
  • similarity_boost: Similarity boost. Higher values make the voice more similar to the original (default: 0.75, range: 0.0 - 1.0).
  • use_speaker_boost: Boost similarity to the original speaker voice (default: False).
  • style: Style exaggeration. Higher values increase stylistic expression but may reduce stability (default: 0.0, range: 0.0 - 0.2).
When the model parameter is set to "eleven_v3", the following additional parameters become available:
  • speed: Speech speed. 1.0 is normal, <1.0 slower, >1.0 faster (default: 1.0, range: 0.7 - 1.3).
  • similarity_boost: Similarity boost. Higher values make the voice more similar to the original (default: 0.75, range: 0.0 - 1.0).
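
The model-dependent parameters listed above can be summarised in a small lookup table. The following is a minimal sketch, not the node's actual implementation: `MODEL_PARAMS` and `resolve_model_params` are hypothetical names introduced for illustration, and only the defaults and ranges are taken from the lists above.

```python
# Hypothetical lookup of model-specific parameters: name -> (default, low, high).
# Defaults and ranges are copied from the lists above; the names are illustrative.
MODEL_PARAMS = {
    "eleven_multilingual_v2": {
        "speed": (1.0, 0.7, 1.3),
        "similarity_boost": (0.75, 0.0, 1.0),
        "use_speaker_boost": (False, None, None),  # boolean, so no numeric range
        "style": (0.0, 0.0, 0.2),
    },
    "eleven_v3": {
        "speed": (1.0, 0.7, 1.3),
        "similarity_boost": (0.75, 0.0, 1.0),
    },
}


def resolve_model_params(model, overrides=None):
    """Merge overrides into the defaults for `model`, checking documented ranges."""
    if model not in MODEL_PARAMS:
        raise ValueError(f"unknown model: {model!r}")
    spec = MODEL_PARAMS[model]
    overrides = overrides or {}

    # Reject parameters the selected model does not expose (e.g. style on eleven_v3).
    unknown = set(overrides) - set(spec)
    if unknown:
        raise ValueError(f"{model} does not expose: {sorted(unknown)}")

    resolved = {}
    for name, (default, low, high) in spec.items():
        value = overrides.get(name, default)
        if low is not None and not (low <= value <= high):
            raise ValueError(f"{name}={value} is outside {low} - {high}")
        resolved[name] = value
    return resolved


print(resolve_model_params("eleven_multilingual_v2", {"style": 0.1}))
print(resolve_model_params("eleven_v3"))
```

Keeping the per-model parameters in one table makes the difference between the two models explicit: in this sketch, "eleven_v3" simply has no `style` or `use_speaker_boost` entries, so passing them raises an error.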

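The ranges and options for the common inputs can also be checked before a workflow is queued. This is a rough sketch under stated assumptions: `CommonInputs` and `find_problems` are hypothetical helpers, not part of the node, and the defaults shown for the COMBO inputs (the first listed option) are an assumption because the Inputs section does not state them. The `voice` input is omitted because it is a CUSTOM connection rather than a plain value.

```python
from dataclasses import dataclass

# Option lists and limits copied from the Inputs section.
TEXT_NORMALIZATION_MODES = ("auto", "on", "off")
MODELS = ("eleven_multilingual_v2", "eleven_v3")
OUTPUT_FORMATS = ("mp3_44100_192", "opus_48000_192")
SEED_MAX = 2147483647


@dataclass
class CommonInputs:
    """Hypothetical container for the common inputs; not the node's own class."""
    text: str
    stability: float = 0.5
    # Assumption: the first listed option is the default for each COMBO input.
    apply_text_normalization: str = "auto"
    model: str = "eleven_multilingual_v2"
    language_code: str = ""  # empty string means automatic detection
    seed: int = 1
    output_format: str = "mp3_44100_192"


def find_problems(inputs: CommonInputs) -> list[str]:
    """Return one message per value outside the documented range or options."""
    problems = []
    if not 0.0 <= inputs.stability <= 1.0:
        problems.append("stability must be between 0.0 and 1.0")
    if inputs.apply_text_normalization not in TEXT_NORMALIZATION_MODES:
        problems.append("apply_text_normalization must be 'auto', 'on', or 'off'")
    if inputs.model not in MODELS:
        problems.append(f"unknown model: {inputs.model!r}")
    if not 0 <= inputs.seed <= SEED_MAX:
        problems.append(f"seed must be between 0 and {SEED_MAX}")
    if inputs.output_format not in OUTPUT_FORMATS:
        problems.append(f"unknown output_format: {inputs.output_format!r}")
    return problems


# Valid settings yield no messages; out-of-range values yield one message each.
print(find_problems(CommonInputs(text="Hello there.")))
print(find_problems(CommonInputs(text="Hello there.", stability=1.5, seed=-1)))
```
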
Outputs

| Output Name | Description | Data Type |
| --- | --- | --- |
| audio | The generated audio from the text-to-speech conversion. | AUDIO |

Source fingerprint (SHA-256): 0cd570fbb152e07ba028e96df56abc08dde8941d043386fd076f42a1e1dc6016