Qwen-Image-Layered is a model developed by Alibaba’s Qwen team that decomposes an image into multiple RGBA layers. This layered representation unlocks inherent editability: each layer can be independently manipulated without affecting other content.

Key Features:
  • Inherent Editability: Each layer can be independently manipulated without affecting other content
  • High-Fidelity Elementary Operations: Supports resizing, repositioning, and recoloring with physical isolation of semantic components
  • Variable-Layer Decomposition: Not limited to a fixed number of layers - decompose into 3, 4, 8, or more layers as needed
  • Recursive Decomposition: Any layer can be further decomposed, enabling infinite decomposition depth
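Because the decomposition yields a stack of RGBA layers, the final image is simply the alpha-composite of that stack. A minimal Pillow sketch of recombining layers after editing one of them (the `flatten_layers` helper and its inputs are illustrative, not part of the model's API):

```python
from PIL import Image

def flatten_layers(layers):
    """Alpha-composite a stack of RGBA layers (bottom layer first)
    into a single flattened image."""
    canvas = Image.new("RGBA", layers[0].size, (0, 0, 0, 0))
    for layer in layers:
        canvas = Image.alpha_composite(canvas, layer)
    return canvas
```

Since each layer is a separate RGBA image, recoloring or repositioning one layer and re-compositing leaves all other content untouched, which is what makes the elementary operations above lossless.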
Qwen-Image-Layered workflow

Download JSON Workflow File

Make sure your ComfyUI is updated. The workflows in this guide can be found in the Workflow Templates; if you can’t find them there, your ComfyUI may be outdated (the Desktop version’s updates may lag behind). If nodes are missing when loading a workflow, possible reasons:
  1. You are not using the latest (nightly) ComfyUI version
  2. Some nodes failed to import at startup
Model Storage Location

Download the model files (text_encoders, diffusion_models, vae) and place them as shown below:
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── qwen_2.5_vl_7b_fp8_scaled.safetensors
│   ├── 📂 diffusion_models/
│   │      └── qwen_image_layered_bf16.safetensors
│   └── 📂 vae/
│          └── qwen_image_layered_vae.safetensors
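Assuming a standard ComfyUI install, the expected subdirectories can be created from the folder that contains `ComfyUI/` before dropping the files in (a sketch; adjust the paths if you map models elsewhere, e.g. via `extra_model_paths.yaml`):

```shell
# Create the model subdirectories the workflow expects
# (run from the directory containing your ComfyUI folder).
mkdir -p ComfyUI/models/text_encoders
mkdir -p ComfyUI/models/diffusion_models
mkdir -p ComfyUI/models/vae
```

`mkdir -p` is safe to re-run: it creates any missing parent directories and does nothing if they already exist.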

FP8 version

By default the workflow uses the bf16 weights, which require high VRAM. For lower VRAM usage, download the fp8 version of the diffusion model, then update the Load Diffusion Model node inside the subgraph to use it.

Workflow settings

Sampler settings

This model is slow. The original sampling settings are steps: 50 and CFG: 4.0, which will at least double the generation time compared to workflows with lower step counts.

Input size

For input size, 640px is recommended. Use 1024px for high-resolution output.
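To match the recommended input size, you can scale the source image so its longest edge hits the target before loading it into the workflow. A minimal Pillow sketch (the helper name is ours; the 640px default comes from the recommendation above):

```python
from PIL import Image

def resize_longest_edge(img, target=640):
    """Scale an image so its longest edge equals `target` pixels,
    preserving the aspect ratio."""
    w, h = img.size
    scale = target / max(w, h)
    return img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
```

Pass `target=1024` instead when you want the high-resolution variant.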

Prompt (optional)

The text prompt is intended to describe the overall content of the input image—including elements that may be partially occluded (e.g., you may specify the text hidden behind a foreground object). It is not designed to control the semantic content of individual layers explicitly.