This guide demonstrates how to complete Wan2.1 text-to-video and image-to-video workflows in ComfyUI.
The Wan2.1 Video series is a family of video generation models open-sourced by Alibaba in February 2025 under the Apache 2.0 license. It is available in two sizes: a 14B version and a lightweight 1.3B version.
If you find missing nodes when loading the workflow files below, your ComfyUI installation is likely out of date. Make sure you have updated ComfyUI to the latest Development (Nightly) version; see the How to Update ComfyUI section to learn how. Please update ComfyUI before starting the examples to make sure you have native Wan Video support.
All models mentioned in this guide can be found here. Below are the common models you’ll need for the examples in this guide, which you can download in advance:
Text encoders (choose one version to download; the examples below use umt5_xxl_fp8_e4m3fn_scaled.safetensors)

VAE: wan_2.1_vae.safetensors

CLIP Vision: clip_vision_h.safetensors
File storage locations:

ComfyUI/
├── models/
│   ├── diffusion_models/        # Wan2.1 diffusion models go here
│   ├── text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── vae/
│   │   └── wan_2.1_vae.safetensors
│   └── clip_vision/
│       └── clip_vision_h.safetensors
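If any of these folders do not exist yet, you can create them up front. A minimal shell sketch, assuming your ComfyUI installation lives in a ComfyUI/ folder under the current directory (adjust COMFYUI_DIR otherwise):

```shell
# Create the model subfolders the Wan2.1 examples expect.
COMFYUI_DIR="${COMFYUI_DIR:-ComfyUI}"
mkdir -p "$COMFYUI_DIR/models/diffusion_models" \
         "$COMFYUI_DIR/models/text_encoders" \
         "$COMFYUI_DIR/models/vae" \
         "$COMFYUI_DIR/models/clip_vision"
# Show what was created.
ls "$COMFYUI_DIR/models"
```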
For diffusion models, we’ll use the fp16 precision models in this guide because we’ve found that they perform better than the bf16 versions. If you need other precision versions, please visit here to download them.
Before starting the workflow, please download wan2.1_t2v_1.3B_fp16.safetensors and save it to the ComfyUI/models/diffusion_models/ directory.
If you need other t2v precision versions, please visit here to download them.
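Once the files are downloaded, a quick sanity check can confirm everything is in the expected place. A sketch, again assuming a ComfyUI/ folder in the current directory (adjust COMFYUI_DIR to your install path):

```shell
# Report which of the T2V example's model files are present.
COMFYUI_DIR="${COMFYUI_DIR:-ComfyUI}"
for f in \
  models/diffusion_models/wan2.1_t2v_1.3B_fp16.safetensors \
  models/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors \
  models/vae/wan_2.1_vae.safetensors
do
  if [ -f "$COMFYUI_DIR/$f" ]; then
    echo "found:   $f"
  else
    echo "missing: $f"
  fi
done
```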
Download the file below and drag it into ComfyUI to load the corresponding workflow:
Follow these steps to complete the workflow:

1. Make sure the Load Diffusion Model node has loaded the wan2.1_t2v_1.3B_fp16.safetensors model
2. Make sure the Load CLIP node has loaded the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
3. Make sure the Load VAE node has loaded the wan_2.1_vae.safetensors model
4. Modify the video dimensions in the EmptyHunyuanLatentVideo node if needed
5. Enter your prompt in the CLIP Text Encoder node at number 5
6. Click the Run button, or use the shortcut Ctrl(cmd) + Enter, to execute the video generation

Since Wan Video separates the 480P and 720P models, we’ll need to provide examples for both resolutions in this guide. In addition to using different models, they also have slight parameter differences.
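Besides clicking Run in the UI, workflows can also be queued programmatically against ComfyUI's HTTP API (the /prompt endpoint on the default 127.0.0.1:8188 address). A sketch, assuming a running ComfyUI instance and a workflow exported via "Save (API Format)"; the file name and node id "5" below are illustrative and depend on your export:

```python
import json
import urllib.request

SERVER = "127.0.0.1:8188"  # default ComfyUI address; adjust if you changed --listen/--port


def build_prompt_payload(workflow: dict) -> bytes:
    """Wrap an API-format workflow in the JSON body that /prompt expects."""
    return json.dumps({"prompt": workflow}).encode("utf-8")


def queue_workflow(workflow: dict) -> dict:
    """POST the workflow to ComfyUI's /prompt endpoint and return its reply."""
    req = urllib.request.Request(
        f"http://{SERVER}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Usage (requires a running ComfyUI instance; file name is hypothetical):
#
#   with open("wan2.1_t2v_workflow_api.json") as f:
#       workflow = json.load(f)
#   workflow["5"]["inputs"]["text"] = "a cat surfing a wave at sunset"
#   print(queue_workflow(workflow))
```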
Download the image below and drag it into ComfyUI to load the corresponding workflow:
We’ll use the following image as input:
Please download wan2.1_i2v_480p_14B_fp16.safetensors and save it to the ComfyUI/models/diffusion_models/ directory.
Follow these steps to complete the workflow:

1. Make sure the Load Diffusion Model node has loaded the wan2.1_i2v_480p_14B_fp16.safetensors model
2. Make sure the Load CLIP node has loaded the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
3. Make sure the Load VAE node has loaded the wan_2.1_vae.safetensors model
4. Make sure the Load CLIP Vision node has loaded the clip_vision_h.safetensors model
5. Upload the input image in the Load Image node
6. Enter your prompt in the CLIP Text Encoder node
7. Modify the video dimensions in the WanImageToVideo node if needed
8. Click the Run button, or use the shortcut Ctrl(cmd) + Enter, to execute the video generation

Download the image below and drag it into ComfyUI to load the corresponding workflow:
We’ll use the following image as input:
Please download wan2.1_i2v_720p_14B_fp16.safetensors and save it to the ComfyUI/models/diffusion_models/ directory.
Follow these steps to complete the workflow:

1. Make sure the Load Diffusion Model node has loaded the wan2.1_i2v_720p_14B_fp16.safetensors model
2. Make sure the Load CLIP node has loaded the umt5_xxl_fp8_e4m3fn_scaled.safetensors model
3. Make sure the Load VAE node has loaded the wan_2.1_vae.safetensors model
4. Make sure the Load CLIP Vision node has loaded the clip_vision_h.safetensors model
5. Upload the input image in the Load Image node
6. Enter your prompt in the CLIP Text Encoder node
7. Modify the video dimensions in the WanImageToVideo node if needed
8. Click the Run button, or use the shortcut Ctrl(cmd) + Enter, to execute the video generation
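When adjusting the width, height, and length values in the EmptyHunyuanLatentVideo or WanImageToVideo node, it can help to know how they map onto the latent tensor the sampler works with. A sketch of that relationship, assuming the Wan VAE's 8x spatial and 4x temporal compression with 16 latent channels (verify against your own node outputs before relying on it):

```python
def wan_latent_shape(width: int, height: int, length: int, batch: int = 1):
    """Approximate latent dimensions for a Wan2.1 video.

    The VAE compresses 8x spatially and 4x temporally; the first frame
    is kept as-is, so `length` frames become (length - 1) // 4 + 1
    latent frames. Width and height should be multiples of 8.
    """
    frames = (length - 1) // 4 + 1
    return (batch, 16, frames, height // 8, width // 8)


# e.g. a 33-frame 832x480 clip (a common 480P setting):
print(wan_latent_shape(832, 480, 33))  # -> (1, 16, 9, 60, 104)
```

Longer lengths therefore grow the latent in steps of 4 frames, which is why lengths of the form 4n + 1 (33, 81, ...) are the natural choices.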