ElevenLabsTextToDialogue - ComfyUI Built-in Node Documentation

The ElevenLabs Text to Dialogue node generates a multi-speaker audio dialogue from text. It allows you to create a conversation by specifying different text lines and distinct voices for each participant. The node sends the dialogue request to the ElevenLabs API and returns the generated audio.

Inputs

| Parameter | Description | Data Type | Required | Range |
| --- | --- | --- | --- | --- |
| `stability` | Voice stability. Lower values give a broader emotional range; higher values produce more consistent but potentially monotonous speech. (default: 0.5) | FLOAT | No | 0.0 - 1.0 |
| `apply_text_normalization` | Text normalization mode. `auto` lets the system decide, `on` always applies normalization, `off` skips it. | COMBO | No | `"auto"`, `"on"`, `"off"` |
| `model` | Model to use for dialogue generation. | COMBO | No | `"eleven_v3"` |
| `inputs` | Number of dialogue entries. Selecting a number generates that many text and voice input fields. | DYNAMICCOMBO | Yes | `"1"`, `"2"`, `"3"`, `"4"`, `"5"`, `"6"`, `"7"`, `"8"`, `"9"`, `"10"` |
| `language_code` | ISO-639-1 or ISO-639-3 language code (e.g., `en`, `es`, `fra`). Leave empty for automatic detection. (default: empty) | STRING | No | - |
| `seed` | Seed for reproducibility. (default: 1) | INT | No | 0 - 4294967295 |
| `output_format` | Audio output format. | COMBO | No | `"mp3_44100_192"`, `"opus_48000_192"` |

Note: The `inputs` parameter is dynamic. When you select a number (e.g., “3”), the node displays that many pairs of text and voice input fields (e.g., `text1`, `voice1`, `text2`, `voice2`, `text3`, `voice3`). Each text field must contain at least one character.
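
The naming pattern and the non-empty rule from the note can be sketched in a few lines of Python. This is only an illustration: `dialogue_field_names` and `validate_dialogue` are hypothetical helpers, not part of ComfyUI or the ElevenLabs API, and the sketch assumes that any single character (including whitespace) satisfies the "at least one character" rule.

```python
def dialogue_field_names(count: int) -> list[tuple[str, str]]:
    """Return the (text, voice) field-name pairs for a given `inputs` value.

    Hypothetical helper: mirrors the text1/voice1 ... textN/voiceN pattern
    described in the note above.
    """
    if not 1 <= count <= 10:
        raise ValueError("inputs must be between 1 and 10")
    return [(f"text{i}", f"voice{i}") for i in range(1, count + 1)]


def validate_dialogue(fields: dict[str, str], count: int) -> None:
    """Raise ValueError if any expected text field is missing or empty.

    Assumption: one character of any kind is enough; the source does not
    say whether whitespace-only text is accepted.
    """
    for text_key, _voice_key in dialogue_field_names(count):
        if len(fields.get(text_key, "")) < 1:
            raise ValueError(f"{text_key} must contain at least one character")
```

For example, `dialogue_field_names(3)` yields the three pairs listed in the note, and `validate_dialogue({"text1": ""}, 1)` raises a `ValueError` before any request would be sent.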

Outputs

| Output Name | Description | Data Type |
| --- | --- | --- |
| `audio` | The generated multi-speaker dialogue audio in the selected output format. | AUDIO |
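
The two `output_format` values appear to follow a `codec_sampleRate_bitrate` naming pattern (for instance, MP3 at 44100 Hz and 192 kbps). That reading is an assumption, since this page does not spell it out. Under that assumption, a minimal sketch for splitting a value into its parts could look like this; `AudioFormat` and `parse_output_format` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: int


def parse_output_format(value: str) -> AudioFormat:
    """Split an output_format string such as "mp3_44100_192".

    Assumption: the string has exactly three underscore-separated parts,
    interpreted as codec, sample rate in Hz, and bitrate in kbps.
    """
    codec, sample_rate, bitrate = value.split("_")
    return AudioFormat(codec, int(sample_rate), int(bitrate))
```

A value that does not have exactly three parts raises a `ValueError` from the tuple unpacking, which is usually the desired behavior for an unexpected format name.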

