The TextEncodeQwenImageEditPlus node processes text prompts and optional images to generate conditioning data for image generation or editing tasks. It uses a specialized template to analyze input images and understand how text instructions should modify them, then encodes this information for use in subsequent generation steps. The node can handle up to three input images and optionally generate reference latents when a VAE is provided.
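The optional-input behavior described above can be sketched in plain Python. This is only an illustration under stated assumptions: `plan_encoding`, `encode_latent`, and the returned dictionary keys are hypothetical names chosen for this example, not part of the node's actual code or API.

```python
from typing import Callable, Optional, Sequence


def plan_encoding(
    images: Sequence[Optional[object]],
    encode_latent: Optional[Callable[[object], object]] = None,
) -> dict:
    """Hypothetical helper mirroring how optional images and the VAE interact."""
    if len(images) > 3:
        raise ValueError("at most three input images are supported")

    # Unconnected image slots are represented as None and skipped.
    present = [img for img in images if img is not None]

    # Reference latents are produced only when a VAE-like encoder is supplied.
    if encode_latent is not None:
        reference_latents = [encode_latent(img) for img in present]
    else:
        reference_latents = []

    return {"vision_inputs": present, "reference_latents": reference_latents}
```

Calling `plan_encoding(["a", None, "c"])` keeps the two connected images and returns no reference latents, which corresponds to leaving the vae input unconnected.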

Inputs

| Parameter | Description | Data Type | Required | Range |
|-----------|-------------|-----------|----------|-------|
| clip | The CLIP model used for tokenization and encoding | CLIP | Yes | - |
| prompt | Text instruction describing the desired image modification (supports multiline input and dynamic prompts) | STRING | Yes | - |
| vae | Optional VAE model for generating reference latents from input images | VAE | No | - |
| image1 | First optional input image for analysis and modification | IMAGE | No | - |
| image2 | Second optional input image for analysis and modification | IMAGE | No | - |
| image3 | Third optional input image for analysis and modification | IMAGE | No | - |

Note: The node can process up to three images at once. Each image is automatically resized to 384x384 pixels for vision-language processing. When a VAE is provided, the node also generates a reference latent from every input image, first resizing it to dimensions divisible by 8 with a target area of 1024x1024 pixels.
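
The VAE resize rule in the note (a target area of 1024x1024 pixels, with both sides divisible by 8) can be sketched as a small calculation. This is a minimal sketch, assuming the aspect ratio is preserved and each side is rounded to the nearest multiple of 8; the function name `vae_target_size` and the exact rounding are assumptions for illustration, not the node's confirmed implementation.

```python
import math
from typing import Tuple


def vae_target_size(
    width: int,
    height: int,
    target_area: int = 1024 * 1024,
    multiple: int = 8,
) -> Tuple[int, int]:
    """Scale (width, height) toward target_area while keeping the aspect ratio,
    then snap each side to the nearest multiple of `multiple`."""
    # A uniform scale factor that brings the pixel count close to target_area.
    scale = math.sqrt(target_area / (width * height))
    new_width = round(width * scale / multiple) * multiple
    new_height = round(height * scale / multiple) * multiple
    return new_width, new_height
```

Under these assumptions a 1920x1080 input maps to 1368x768, and a 1024x1024 input is left at 1024x1024.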

Outputs

| Output Name | Description | Data Type |
|-------------|-------------|-----------|
| CONDITIONING | Encoded conditioning data containing text tokens and optional reference latents for image generation | CONDITIONING |

Source fingerprint (SHA-256): 40e0104e1a5fd88afb889948bc43559f99049a91c03c3f9885455b6dbfde343e