This node encodes text prompts with a CLIP model configured for the SDXL architecture. SDXL uses a dual-encoder system (CLIP-L and CLIP-G), and the node accepts a separate prompt for each encoder together with the size and crop values listed below, which are passed along as additional conditioning for image generation.

Inputs

| Parameter | Description | Data Type |
| --- | --- | --- |
| clip | CLIP model instance used for text encoding. | CLIP |
| width | Image width in pixels. Default: 1024. | INT |
| height | Image height in pixels. Default: 1024. | INT |
| crop_w | Width of the crop area in pixels. Default: 0. | INT |
| crop_h | Height of the crop area in pixels. Default: 0. | INT |
| target_width | Target width of the output image in pixels. Default: 1024. | INT |
| target_height | Target height of the output image in pixels. Default: 1024. | INT |
| text_g | Text prompt for the CLIP-G encoder, typically the overall scene description. | STRING |
| text_l | Text prompt for the CLIP-L encoder, typically the finer detail description. | STRING |
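
As a rough illustration of how these inputs fit together, the sketch below builds a node entry in the `class_type` plus `inputs` shape used by ComfyUI's API-format workflow JSON. The helper name `sdxl_text_encode_node`, the node ID `"4"`, and the assumption that the CLIP model comes from output index 1 of a checkpoint loader are hypothetical choices for this example, not something this page specifies. The defaults mirror the inputs listed above.

```python
import json


def sdxl_text_encode_node(clip_link, text_g, text_l=None,
                          width=1024, height=1024,
                          crop_w=0, crop_h=0,
                          target_width=1024, target_height=1024):
    """Build a hypothetical API-format entry for a CLIPTextEncodeSDXL node.

    clip_link is a [node_id, output_index] pair pointing at the node that
    supplies the CLIP model (assumed here, not taken from the page).
    """
    sizes = {
        "width": width, "height": height,
        "crop_w": crop_w, "crop_h": crop_h,
        "target_width": target_width, "target_height": target_height,
    }
    # All six size/crop inputs are INT; reject negatives as a sanity check.
    for name, value in sizes.items():
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")

    inputs = {"clip": list(clip_link)}
    inputs.update(sizes)
    inputs["text_g"] = text_g
    # Illustrative choice: reuse text_g when no separate text_l is given.
    inputs["text_l"] = text_g if text_l is None else text_l
    return {"class_type": "CLIPTextEncodeSDXL", "inputs": inputs}


node = sdxl_text_encode_node(
    ["4", 1],
    text_g="a lighthouse on a cliff at dusk",
    text_l="weathered stone, warm window light",
)
print(json.dumps(node, indent=2))
```

Falling back to `text_g` when `text_l` is omitted is a convenience in this sketch only; the node itself takes both strings as separate inputs.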

Outputs

| Parameter | Description | Data Type |
| --- | --- | --- |
| CONDITIONING | Contains the encoded text and the conditioning information needed for image generation. | CONDITIONING |
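
To show where the CONDITIONING output typically goes, here is a minimal sketch that scans an API-format workflow fragment for inputs wired to a given node output. The node IDs, the trimmed-down `inputs`, and the helper `consumers_of` are hypothetical; the `positive` and `negative` input names on the sampler node are assumed from stock ComfyUI workflows rather than taken from this page.

```python
def consumers_of(workflow, node_id, output_index=0):
    """Return (consumer_id, input_name) pairs wired to the given node output."""
    hits = []
    for other_id, node in workflow.items():
        for input_name, value in node.get("inputs", {}).items():
            # In API-format workflows a link is a [node_id, output_index] pair.
            if value == [node_id, output_index]:
                hits.append((other_id, input_name))
    return hits


# Illustrative fragment: only the inputs relevant to the wiring are shown.
workflow = {
    "6": {"class_type": "CLIPTextEncodeSDXL",
          "inputs": {"clip": ["4", 1], "text_g": "a red fox", "text_l": "a red fox"}},
    "7": {"class_type": "CLIPTextEncodeSDXL",
          "inputs": {"clip": ["4", 1], "text_g": "blurry", "text_l": "blurry"}},
    "3": {"class_type": "KSampler",
          "inputs": {"positive": ["6", 0], "negative": ["7", 0]}},
}

print(consumers_of(workflow, "6"))  # [('3', 'positive')]
print(consumers_of(workflow, "7"))  # [('3', 'negative')]
```

Because the node has a single output, its CONDITIONING is always referenced at output index 0 in this representation.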