The TextEncodeAceStepAudio node prepares text conditioning for audio generation. It takes a CLIP model, a tags string (text descriptions), and lyrics, tokenizes the tags and lyrics together, and encodes the result into conditioning data. The lyrics_strength parameter controls how strongly the lyrics influence the final output.
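The flow described above (tokenize tags and lyrics together, encode, then carry the lyrics strength along) can be sketched roughly as follows. This is an illustration only, not the node's source: `StubClip`, its `tokenize` and `encode` methods, and `encode_ace_step_audio` are hypothetical stand-ins, and attaching the strength as metadata next to the encoded result is an assumption about how it is applied.

```python
class StubClip:
    """Hypothetical stand-in for a CLIP model; NOT ComfyUI's CLIP class."""

    def tokenize(self, tags: str, lyrics: str) -> dict:
        # Toy tokenizer: whitespace split, with tags and lyrics kept in one token set.
        return {"tags": tags.split(), "lyrics": lyrics.split()}

    def encode(self, tokens: dict) -> dict:
        # Toy "embedding": token counts only, so the sketch needs no dependencies.
        return {"n_tags": len(tokens["tags"]), "n_lyrics": len(tokens["lyrics"])}


def encode_ace_step_audio(clip, tags: str, lyrics: str, lyrics_strength: float = 1.0):
    """Sketch of the node's flow: tokenize together, encode, attach strength."""
    tokens = clip.tokenize(tags, lyrics)
    encoded = clip.encode(tokens)
    # Assumption: the strength travels with the conditioning as metadata.
    return [[encoded, {"lyrics_strength": lyrics_strength}]]


conditioning = encode_ace_step_audio(StubClip(), "pop, upbeat", "la la la", 0.5)
print(conditioning)
```

A real CLIP model would return tensors rather than counts; the point here is only the order of the steps.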

Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| clip | The CLIP model used for tokenization and encoding | CLIP | Yes | - |
| tags | Text tags or descriptions for audio conditioning (supports multiline input and dynamic prompts) | STRING | Yes | - |
| lyrics | Lyrics text for audio conditioning (supports multiline input and dynamic prompts) | STRING | Yes | - |
| lyrics_strength | Controls the strength of lyrics influence on the conditioning output (default: 1.0, step: 0.01) | FLOAT | No | 0.0 - 10.0 |
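As a minimal sketch of how these inputs might be supplied programmatically, the helper below builds a node entry and checks lyrics_strength against the documented range. It assumes the API-style workflow layout in which each node is a dict with `class_type` and `inputs`, and a link to another node is written as `[node_id, output_index]`. The helper name `ace_step_text_encode_node` and the loader node id `"1"` are hypothetical.

```python
def ace_step_text_encode_node(clip_ref, tags: str, lyrics: str,
                              lyrics_strength: float = 1.0) -> dict:
    """Build a node entry for TextEncodeAceStepAudio (hypothetical helper)."""
    # Range taken from the inputs table: 0.0 - 10.0, default 1.0.
    if not 0.0 <= lyrics_strength <= 10.0:
        raise ValueError("lyrics_strength must be between 0.0 and 10.0")
    return {
        "class_type": "TextEncodeAceStepAudio",
        "inputs": {
            # Assumed link format: [source_node_id, output_index].
            "clip": list(clip_ref),
            "tags": tags,
            "lyrics": lyrics,
            "lyrics_strength": float(lyrics_strength),
        },
    }


node = ace_step_text_encode_node(
    ("1", 0),  # hypothetical id of a node that outputs a CLIP model
    tags="acoustic, folk, warm vocals",
    lyrics="[verse]\nWalking down the river road",
    lyrics_strength=0.8,
)
print(node)
```

Validating the range before submitting keeps an out-of-range strength from reaching the node at all.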

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| conditioning | The encoded conditioning data containing processed text tokens with applied lyrics strength | CONDITIONING |
