Qwen-Image is a 20B-parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license.

| Model Used | VRAM Usage | First Generation | Second Generation |
|---|---|---|---|
| fp8_e4m3fn | 86% | ≈ 94s | ≈ 71s |
| fp8_e4m3fn with lightx2v 8-step LoRA | 86% | ≈ 55s | ≈ 34s |
| Distilled fp8_e4m3fn | 86% | ≈ 69s | ≈ 36s |
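To make the comparison in the table easier to read, the following minimal sketch computes how much faster each variant is than the plain `fp8_e4m3fn` model on the second generation, and how many extra seconds the first generation takes. The timings are copied from the table and are approximate; the helper name `speedup` is hypothetical, and reading the first-run gap as warm-up overhead (for example, model loading) is an assumption, not something the table states.

```python
# Approximate timings in seconds, copied from the table: (first run, second run).
timings = {
    "fp8_e4m3fn": (94, 71),
    "fp8_e4m3fn with lightx2v 8-step LoRA": (55, 34),
    "Distilled fp8_e4m3fn": (69, 36),
}


def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


# Use the plain fp8 model's second run as the reference point.
_, baseline_second = timings["fp8_e4m3fn"]

for name, (first, second) in timings.items():
    ratio = speedup(baseline_second, second)
    extra_first_run = first - second  # assumed to be warm-up overhead
    print(f"{name}: {ratio:.2f}x on second run, first run +{extra_first_run}s")
```

On these numbers, the LoRA and distilled variants both roughly halve the second-generation time relative to the plain fp8 model, while VRAM usage stays the same.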
Download Workflow for Qwen-Image Official Model

Distilled version: Download Workflow for Distilled Model
1. Make sure the `Load Diffusion Model` node has loaded `qwen_image_fp8_e4m3fn.safetensors`
2. Make sure the `Load CLIP` node has loaded `qwen_2.5_vl_7b_fp8_scaled.safetensors`
3. Make sure the `Load VAE` node has loaded `qwen_image_vae.safetensors`
4. Make sure the `EmptySD3LatentImage` node is set to the correct image dimensions
5. Enter your prompt in the `CLIP Text Encoder` node; it currently supports at least English, Chinese, Korean, Japanese, and Italian
6. To enable a bypassed node, select it and press `Ctrl + B`, then modify the `KSampler` settings as described in step 8
7. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter`, to run the workflow

Download JSON Format Workflow
Download the image below as input:

1. Make sure the model file is saved in the `ComfyUI/models/model_patches` folder
2. Make sure `qwen_image_canny_diffsynth_controlnet.safetensors` is loaded
3. If needed, adjust `strength` in the `QwenImageDiffsynthControlnet` node to control the intensity of the line art control
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter`, to run the workflow
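As an alternative to clicking the button, a workflow can also be queued over ComfyUI's HTTP interface. The sketch below is a minimal example under stated assumptions: a local ComfyUI server listening on the default address `http://127.0.0.1:8188`, and a workflow file exported in API format (the regular workflow JSON saved from the UI may not be accepted as-is). The function names `build_prompt_request` and `queue_workflow` and the file name in the usage comment are hypothetical.

```python
import json
import urllib.request

# Assumption: ComfyUI is running locally on its default address.
DEFAULT_SERVER = "http://127.0.0.1:8188"


def build_prompt_request(workflow: dict, server: str = DEFAULT_SERVER) -> urllib.request.Request:
    """Build (but do not send) a POST request that queues `workflow`."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path: str, server: str = DEFAULT_SERVER) -> dict:
    """Load an API-format workflow JSON from `path` and queue it on the server."""
    with open(path, encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())


# Usage (requires a running server and an API-format export):
# print(queue_workflow("qwen_image_canny_api.json"))
```

Separating request construction from sending keeps the payload easy to inspect before anything is submitted to the queue.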
To use `qwen_image_depth_diffsynth_controlnet.safetensors`, preprocess the input image into a depth map and replace the image processing part of the workflow accordingly. For details, refer to the InstantX processing method in this document. The remaining steps are the same as for the Canny model.
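Before loading the workflow, it can help to confirm that the model patch files are where ComfyUI will look for them. The following is a minimal sketch, not part of ComfyUI: `EXPECTED_FILES` and `missing_files` are hypothetical names, the ComfyUI root path is a placeholder, and placing the depth model in the same `model_patches` folder as the Canny model is an assumption based on the folder named above.

```python
from pathlib import Path

# Folder (relative to the ComfyUI root) -> files expected inside it.
# Assumption: the depth patch lives next to the Canny patch.
EXPECTED_FILES = {
    "models/model_patches": [
        "qwen_image_canny_diffsynth_controlnet.safetensors",
        "qwen_image_depth_diffsynth_controlnet.safetensors",
    ],
}


def missing_files(comfy_root, expected=EXPECTED_FILES):
    """Return the expected model paths that do not exist under `comfy_root`."""
    root = Path(comfy_root)
    missing = []
    for folder, names in expected.items():
        for name in names:
            path = root / folder / name
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at your own installation.
    for path in missing_files("ComfyUI"):
        print(f"missing: {path}")
```

The same dictionary can be extended with other folders and file names used by a given workflow.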
Inpaint Model ControlNet Usage Instructions
1. Make sure `ModelPatchLoader` loads the `qwen_image_inpaint_diffsynth_controlnet.safetensors` model
2. Connect the `mask` output of the corresponding `Load Image` node to the `mask` input of `QwenImageDiffsynthControlnet` so that the mask is loaded
3. Use the `Ctrl-B` shortcut to set the original Canny node in the workflow to bypass mode, so that its processing has no effect
4. In `CLIP Text Encoder`, describe what you want the masked area to be changed to
5. If needed, adjust `strength` in the `QwenImageDiffsynthControlnet` node to control the intensity of the control
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter`, to run the workflow

Download JSON Format Workflow
Download the image below as input

1. Make sure the model file is saved in the `ComfyUI/models/loras/` folder
2. Make sure `LoraLoaderModelOnly` correctly loads the `qwen_image_union_diffsynth_lora.safetensors` model
3. Adjust the `Canny` node parameters as needed. Different input images need different settings for good preprocessing results, so try changing the parameter values to get more or fewer details
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter`, to run the workflow

For other types of control, you also need to replace the image processing part.
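To give some intuition for why Canny parameters change the amount of detail, here is a toy sketch of the hysteresis-thresholding stage of Canny edge detection. It is not the ComfyUI node's implementation: the names `hysteresis`, `low`, and `high` are hypothetical, the gradient values are made up rather than taken from a real image, and the sketch assumes the node exposes a low and a high threshold as typical Canny implementations do.

```python
def hysteresis(magnitude, low, high):
    """Keep strong edges (>= high) plus weak edges (>= low) connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    # Seed the search with every strong pixel.
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if magnitude[r][c] >= high]
    for r, c in stack:
        edges[r][c] = True
    # Grow edges into 8-connected neighbours that pass the low threshold.
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc] and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    stack.append((nr, nc))
    return edges


def count_edges(edges):
    """Count the pixels marked as edges."""
    return sum(sum(row) for row in edges)


# Toy gradient-magnitude map with values in 0..1 (illustrative only).
magnitude = [
    [0.9, 0.5, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.35, 0.0],
    [0.0, 0.6, 0.0, 0.0],
]

print(count_edges(hysteresis(magnitude, low=0.4, high=0.8)))  # stricter thresholds
print(count_edges(hysteresis(magnitude, low=0.1, high=0.5)))  # looser thresholds
```

On this toy map the stricter setting keeps 2 edge pixels and the looser one keeps 5, which mirrors the advice above: lowering the thresholds yields more detail, raising them yields less.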