CLIPTextEncodeControlnet - ComfyUI Built-in Node Documentation

The CLIPTextEncodeControlnet node encodes a text prompt with a CLIP model and attaches the result to existing conditioning data for use with a controlnet. It tokenizes the input text, encodes the tokens through the CLIP model, and stores the resulting embeddings on the provided conditioning data as the controlnet cross-attention parameters `cross_attn_controlnet` and `pooled_output_controlnet`.
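The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the node's actual source: `StubCLIP` and `encode_controlnet_text` are hypothetical names, the stub returns fake numbers instead of real embeddings, and the sketch assumes conditioning is a list of `[tensor, options_dict]` pairs and that the CLIP object exposes `tokenize` and `encode_from_tokens(..., return_pooled=True)`.

```python
class StubCLIP:
    """Hypothetical stand-in for a CLIP object; not the real ComfyUI class."""

    def tokenize(self, text):
        # Fake tokenizer: split on whitespace.
        return text.split()

    def encode_from_tokens(self, tokens, return_pooled=False):
        # Fake embeddings: one number per token, plus their mean as the pooled value.
        cond = [float(len(t)) for t in tokens]
        pooled = sum(cond) / len(cond) if cond else 0.0
        return (cond, pooled) if return_pooled else cond


def encode_controlnet_text(clip, conditioning, text):
    """Attach the text encoding to every conditioning entry (assumed layout)."""
    tokens = clip.tokenize(text)
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
    out = []
    for tensor, options in conditioning:
        new_options = options.copy()  # shallow copy so the input is not mutated
        new_options["cross_attn_controlnet"] = cond
        new_options["pooled_output_controlnet"] = pooled
        out.append([tensor, new_options])
    return out


existing = [["base-embedding", {"pooled_output": "base-pooled"}]]
result = encode_controlnet_text(StubCLIP(), existing, "a red fox")
print(sorted(result[0][1]))
```

In this sketch the original entries keep their existing keys, and the input list is left unchanged because each options dictionary is copied before the two controlnet keys are written.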

Inputs

Parameter	Description	Data Type	Required	Range
`clip`	The CLIP model used for text tokenization and encoding	CLIP	Yes	-
`conditioning`	Existing conditioning data to be enhanced with controlnet parameters	CONDITIONING	Yes	-
`text`	Text input to be processed by the CLIP model. Supports multiline text and dynamic prompts	STRING	Yes	-

Note: All three inputs (`clip`, `conditioning`, and `text`) are required.

Outputs

Output Name	Description	Data Type
`CONDITIONING`	Enhanced conditioning data with added controlnet cross-attention parameters (`cross_attn_controlnet` and `pooled_output_controlnet`) derived from the CLIP text encoding	CONDITIONING
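When debugging a workflow, it can help to confirm that a conditioning value actually carries both keys listed above. The helper below is a hypothetical sketch, not part of ComfyUI; it assumes conditioning is a list of `[tensor, options_dict]` pairs and uses placeholder strings where real tensors would be.

```python
CONTROLNET_KEYS = ("cross_attn_controlnet", "pooled_output_controlnet")


def missing_controlnet_keys(conditioning):
    """Return (index, key) pairs for entries that lack a controlnet text key."""
    missing = []
    for index, (_tensor, options) in enumerate(conditioning):
        for key in CONTROLNET_KEYS:
            if key not in options:
                missing.append((index, key))
    return missing


# Placeholder data: the first entry has both keys, the second has neither.
sample = [
    ["emb-0", {"cross_attn_controlnet": "cond", "pooled_output_controlnet": "pooled"}],
    ["emb-1", {"pooled_output": "base-pooled"}],
]
print(missing_controlnet_keys(sample))
```

An empty result means every entry has both keys under the assumed layout.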

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!
