About Wan2.1-Fun-Control

Wan2.1-Fun-Control is an open-source video generation and control project developed by the Alibaba team. It introduces an innovative Control Codes mechanism that combines deep learning with multimodal conditional inputs to generate high-quality videos conforming to preset control conditions, precisely guiding the generated video content through those control signals.

Currently, the Fun Control model supports various control conditions, including Canny (line art), Depth, OpenPose (human pose), MLSD (geometric edges), and trajectory control. It also supports multi-resolution video prediction at 512, 768, and 1024 resolution, generating videos at 16 frames per second with lengths up to 81 frames (81 frames ÷ 16 fps ≈ 5 seconds).

Model versions:

  • 1.3B Lightweight: Suitable for local deployment and quick inference with lower VRAM requirements
  • 14B High-performance: Model size reaches 32GB+, offering better results but requiring more VRAM

Here are the relevant code repositories:

ComfyUI now natively supports the Wan2.1 Fun Control model. Before starting this tutorial, please update your ComfyUI to ensure you’re using a version after this commit.

In this guide, we’ll provide two workflows:

  1. A workflow using only native Comfy Core nodes
  2. A workflow using custom nodes

Due to current limitations in native nodes for video support, the native-only workflow ensures users can complete the process without installing custom nodes. However, we’ve found that providing a good user experience for video generation is challenging without custom nodes, so we’re providing both workflow versions in this guide.

Model Installation

You only need to install these models once. The workflow images also contain model download information, so you can choose your preferred download method.

The following models can be found at Wan_2.1_ComfyUI_repackaged and Wan2.1-Fun.

Click the corresponding links to download. If you’ve used Wan-related workflows before, you only need to download the Diffusion models.

Diffusion models - choose 1.3B or 14B. The 14B version has a larger file size (32GB) and higher VRAM requirements:

Text encoders - choose one of the following models (fp16 precision has a larger size and higher performance requirements):

VAE

CLIP Vision

File storage location:

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── wan2.1_fun_control_1.3B_bf16.safetensors
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── 📂 vae/
│   │   └── wan_2.1_vae.safetensors
│   └── 📂 clip_vision/
│       └── clip_vision_h.safetensors
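If you prefer scripted downloads, here is a minimal, hedged sketch using huggingface_hub. The repo id matches the Wan_2.1_ComfyUI_repackaged repository mentioned above, but the split_files/... paths are assumptions about that repo’s layout; verify the exact file paths on the model pages before running.

```python
import os
import shutil

from huggingface_hub import hf_hub_download

# Remote file paths (assumed split_files/ layout) mapped to local subfolders.
FILES = {
    "split_files/diffusion_models/wan2.1_fun_control_1.3B_bf16.safetensors": "diffusion_models",
    "split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors": "text_encoders",
    "split_files/vae/wan_2.1_vae.safetensors": "vae",
    "split_files/clip_vision/clip_vision_h.safetensors": "clip_vision",
}

for remote_path, subdir in FILES.items():
    cached = hf_hub_download("Comfy-Org/Wan_2.1_ComfyUI_repackaged", remote_path)
    target = os.path.join("ComfyUI", "models", subdir)
    os.makedirs(target, exist_ok=True)
    shutil.copy(cached, target)  # copies using the original file name
```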

ComfyUI Native Workflow

In this workflow, we use a control video converted to WebP format, since the Load Image node doesn’t currently support mp4. We also use Canny Edge to preprocess the original video. Because many users encounter installation failures and environment issues when installing custom nodes, this version of the workflow uses only native nodes to ensure a smoother experience.
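If you want to prepare your own control input, the following is a minimal sketch of what this preprocessing amounts to outside ComfyUI: read an mp4, run Canny edge detection on each frame, and save the result as an animated WebP that the Load Image node can ingest. It assumes opencv-python and Pillow are installed; the file names and Canny thresholds are placeholders.

```python
import cv2
from PIL import Image

cap = cv2.VideoCapture("control_video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 16  # fall back to 16 fps if unknown

frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)  # illustrative low/high thresholds
    frames.append(Image.fromarray(edges).convert("RGB"))
cap.release()

# Save all frames as a single animated WebP for the Load Image node.
frames[0].save(
    "control_canny.webp",
    save_all=True,
    append_images=frames[1:],
    duration=int(1000 / fps),  # per-frame duration in milliseconds
    loop=0,
)
```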

Thanks to the ComfyUI community authors who provide feature-rich custom nodes. If you want to jump straight to that version, see Workflow Using Custom Nodes.

1. Workflow File Download

1.1 Workflow File

Download the image below and drag it into ComfyUI to load the workflow:

1.2 Input Images and Videos Download

Please download the following image and video for input:

2. Complete the Workflow Step by Step

  1. Ensure the Load Diffusion Model node has loaded wan2.1_fun_control_1.3B_bf16.safetensors
  2. Ensure the Load CLIP node has loaded umt5_xxl_fp8_e4m3fn_scaled.safetensors
  3. Ensure the Load VAE node has loaded wan_2.1_vae.safetensors
  4. Ensure the Load CLIP Vision node has loaded clip_vision_h.safetensors
  5. Upload the starting frame to the Load Image node (renamed to Start_image)
  6. Upload the control video to the second Load Image node. Note: This node currently doesn’t support mp4, only WebP videos
  7. (Optional) Modify the prompt (both English and Chinese are supported)
  8. (Optional) Adjust the video size in WanFunControlToVideo, avoiding overly large dimensions
  9. Click the Run button or use the shortcut Ctrl(cmd) + Enter to execute video generation (or queue the workflow programmatically, as sketched below)
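For users who script their runs, here is a minimal, hedged sketch of queueing a workflow through ComfyUI’s HTTP API instead of clicking Run. It assumes ComfyUI is running locally on the default port 8188 and that the workflow was exported via Save (API Format); workflow_api.json is a placeholder name.

```python
import json
import urllib.request

# Load a workflow exported via "Save (API Format)" in ComfyUI.
with open("workflow_api.json") as f:
    workflow = json.load(f)

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # response includes the queued prompt_id
```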

3. Usage Notes

  • Since the WanFunControlToVideo node needs to receive the same number of frames as the control video, any frames you request beyond the control video’s actual length may show scenes that don’t conform to the control conditions (see the frame-counting snippet after this list). We’ll address this issue in the Workflow Using Custom Nodes
  • Avoid setting overly large dimensions, as this can make the sampling process very time-consuming. Try generating smaller images first, then upscale
  • Use your imagination to build upon this workflow by adding text-to-image or other types of workflows to achieve direct text-to-video generation or style transfer
  • Use tools like comfyui_controlnet_aux for richer control options
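To check how many frames your control video actually provides before setting the length, a quick sketch (the file name is a placeholder):

```python
from PIL import Image

with Image.open("control_canny.webp") as im:
    print(getattr(im, "n_frames", 1))  # total frames available for control
```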

Workflow Using Custom Nodes

We’ll need to install the following two custom node packages:

  • ComfyUI-VideoHelperSuite
  • comfyui_controlnet_aux

You can use ComfyUI Manager to install missing nodes or follow the installation instructions for each custom node package.

1. Workflow File Download

1.1 Workflow File

Download the image below and drag it into ComfyUI to load the workflow:

Due to the large size of video files, you can also click here to download the workflow file in JSON format.

1.2 Input Images and Videos Download

Please download the following image and video for input:

2. Complete the Workflow Step by Step

The model setup is essentially the same. If you’ve already run the native-only workflow, you can directly upload the corresponding images and run it.

  1. Ensure the Load Diffusion Model node has loaded wan2.1_fun_control_1.3B_bf16.safetensors
  2. Ensure the Load CLIP node has loaded umt5_xxl_fp8_e4m3fn_scaled.safetensors
  3. Ensure the Load VAE node has loaded wan_2.1_vae.safetensors
  4. Ensure the Load CLIP Vision node has loaded clip_vision_h.safetensors
  5. Upload the starting frame to the Load Image node
  6. Upload an mp4 format video to the Load Video(Upload) custom node. Note that the workflow has adjusted the default frame_load_cap
  7. For the example input, the DWPose Estimator has only the detect_face option enabled
  8. (Optional) Modify the prompt (both English and Chinese are supported)
  9. (Optional) Adjust the video size in WanFunControlToVideo, avoiding overly large dimensions
  10. Click the Run button or use the shortcut Ctrl(cmd) + Enter to execute video generation

3. Workflow Notes

Thanks to the ComfyUI community authors for their custom node packages:

  • This example uses Load Video(Upload) to support mp4 videos
  • The video_info obtained from Load Video(Upload) allows us to maintain the same fps for the output video
  • You can replace the DWPose Estimator with other preprocessors from the comfyui_controlnet_aux node package
  • Prompts support multiple languages

Usage Tips

  • You can combine multiple image preprocessing techniques, then use the Image Blend node to apply multiple control methods simultaneously (see the sketch at the end of this section).

  • You can use the Video Combine node from ComfyUI-VideoHelperSuite to save videos in mp4 format

  • We use the SaveAnimatedWEBP node because ComfyUI doesn’t currently support embedding workflow metadata into mp4 files, and some other custom nodes may not support embedding the workflow either. Saving as animated WebP preserves the workflow inside the output video.

  • In the WanFunControlToVideo node, control_video is not mandatory, so you can skip the control video: first generate at a very small size such as 320x320, then use the result as control video input to achieve consistent output.

Related custom node packages for Wan video workflows:

  • ComfyUI-WanVideoWrapper

  • ComfyUI-KJNodes
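As referenced in the tips above, here is a minimal sketch of blending two control maps outside ComfyUI, mirroring what the Image Blend node does: combine a Canny edge map with a depth map at equal weight. File names and the blend factor are illustrative.

```python
from PIL import Image

canny = Image.open("frame_canny.png").convert("RGB")
depth = Image.open("frame_depth.png").convert("RGB").resize(canny.size)

# alpha=0.5 weighs both control maps equally; tune it to favor one signal.
blended = Image.blend(canny, depth, alpha=0.5)
blended.save("frame_control_blend.png")
```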