About Wan2.1 Fun Camera

Wan2.1 Fun Camera is a video generation project from the Alibaba team that controls the generated video through camera motion.

Model Weights Download:

Code Repository: VideoX-Fun

ComfyUI now natively supports the Wan2.1 Fun Camera model.

If nodes are missing when you load the workflow file below, the likely cause is one of the following:

  1. You are not using the latest Development (Nightly) version of ComfyUI.
  2. You are using the Stable (Release) version or Desktop version of ComfyUI (which does not include the latest feature updates).
  3. You are using the latest Commit version of ComfyUI, but some nodes failed to import during startup.

Please make sure you have successfully updated ComfyUI to the latest Development (Nightly) version. See the How to Update ComfyUI section for instructions.
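To confirm whether the camera node actually registered after an update, you can query the running server for its node list. The sketch below is a minimal example under stated assumptions: it assumes a local ComfyUI server at the default address `http://127.0.0.1:8188`, that the server exposes its node registry at `/object_info`, and that the node's internal class name matches the `WanCameraEmbedding` name used in this guide. The helper names `missing_nodes` and `fetch_object_info` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the internal class name matches the node name shown in this guide.
REQUIRED_NODES = ["WanCameraEmbedding"]


def missing_nodes(object_info, required=REQUIRED_NODES):
    """Return the required node names that are absent from the node registry."""
    return [name for name in required if name not in object_info]


def fetch_object_info(base_url="http://127.0.0.1:8188"):
    """Fetch the node registry from a running ComfyUI server (assumed endpoint)."""
    with urllib.request.urlopen(f"{base_url}/object_info", timeout=10) as resp:
        return json.load(resp)


# Usage with a running server:
#   print(missing_nodes(fetch_object_info()) or "all required nodes are available")
```

An empty result suggests the node is available. If the name is reported as missing, check the startup log for import errors before updating again.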

Model Installation

These models only need to be installed once. Model download information is also embedded in the corresponding workflow images, so you can download the models whichever way you prefer.

All of the following models can be found at Wan_2.1_ComfyUI_repackaged

Diffusion Models (choose either 1.3B or 14B):

If you’ve used Wan2.1 related models before, you should already have the following models. If not, please download them:

Text Encoders (choose one):

VAE

CLIP Vision

File Storage Location:

📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │   ├── wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors # 1.3B version
│ │   └── wan2.1_fun_camera_v1.1_14B_bf16.safetensors # 14B version
│ ├── 📂 text_encoders/
│ │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ ├── 📂 vae/
│ │   └── wan_2.1_vae.safetensors
│ └── 📂 clip_vision/
│     └── clip_vision_h.safetensors
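A quick way to verify the layout above is to check each expected path on disk. This is a small sketch rather than an official tool: the file names are taken from the tree above, while the function name `check_models` and the `ComfyUI` root path in the usage comment are illustrative assumptions. Only one of the two diffusion models is needed, so a missing entry for the version you do not use is expected.

```python
from pathlib import Path

# File names copied from the storage layout above, grouped by models/ subfolder.
EXPECTED_FILES = {
    "diffusion_models": [
        "wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors",
        "wan2.1_fun_camera_v1.1_14B_bf16.safetensors",
    ],
    "text_encoders": ["umt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "vae": ["wan_2.1_vae.safetensors"],
    "clip_vision": ["clip_vision_h.safetensors"],
}


def check_models(comfy_root):
    """Map each expected 'folder/file' to True if it exists under <root>/models."""
    models_dir = Path(comfy_root) / "models"
    report = {}
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            report[f"{folder}/{name}"] = (models_dir / folder / name).is_file()
    return report


# Usage (the root path is an assumption; point it at your own install):
#   for path, found in check_models("ComfyUI").items():
#       print("ok     " if found else "MISSING", path)
```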

ComfyUI Wan2.1 Fun Camera 1.3B Native Workflow Example

1.1 Workflow File

Download the video below and drag it into ComfyUI to load the corresponding workflow:

Download JSON Workflow File

To use the 14B version, select the 14B model file in place of the 1.3B one, and be aware of its higher VRAM requirements.
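If you would rather switch model versions in the workflow file than in the UI, the replacement can be scripted. The sketch below makes no assumption about the workflow's internal structure: it walks the parsed JSON and replaces every string that exactly equals the 1.3B file name with the 14B one. The function name `swap_model` and the file names in the usage comment are hypothetical.

```python
import json

SMALL = "wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors"
LARGE = "wan2.1_fun_camera_v1.1_14B_bf16.safetensors"


def swap_model(value, old=SMALL, new=LARGE):
    """Return a copy of a parsed JSON value with every string equal to `old` replaced by `new`."""
    if isinstance(value, dict):
        return {key: swap_model(item, old, new) for key, item in value.items()}
    if isinstance(value, list):
        return [swap_model(item, old, new) for item in value]
    return new if value == old else value


# Usage (file names are placeholders for your own workflow files):
#   with open("workflow_1.3B.json", encoding="utf-8") as f:
#       workflow = json.load(f)
#   with open("workflow_14B.json", "w", encoding="utf-8") as f:
#       json.dump(swap_model(workflow), f, indent=2)
```

Because the function builds a new structure instead of mutating its input, the original workflow stays available for comparison.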

1.2 Input Image Download

Please download the image below, which we will use as the starting frame:

2. Complete the Workflow Step by Step

  1. Ensure the correct version of the model file is loaded:
    • 1.3B version: wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors
    • 14B version: wan2.1_fun_camera_v1.1_14B_bf16.safetensors
  2. Ensure the Load CLIP node has loaded umt5_xxl_fp8_e4m3fn_scaled.safetensors
  3. Ensure the Load VAE node has loaded wan_2.1_vae.safetensors
  4. Ensure the Load CLIP Vision node has loaded clip_vision_h.safetensors
  5. Upload the starting frame to the Load Image node
  6. Modify the Prompt if you’re using your own input image
  7. Set camera motion in the WanCameraEmbedding node
  8. Click the Run button or use the shortcut Ctrl (Cmd on macOS) + Enter to run the generation

ComfyUI Wan2.1 Fun Camera 14B Workflow and Input Image

Download JSON Workflow File

Input Image

Performance Reference

1.3B Version:

  • Generating 81 frames at 512×512 resolution on an RTX 4090 takes about 72 seconds
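For rough planning, the 1.3B figure works out to a little under 0.9 seconds per frame (72 / 81 ≈ 0.89). The sketch below turns that into a simple estimator. It assumes generation time scales linearly with frame count at the same resolution and on the same hardware, which is a simplification, so treat the results as ballpark values only. The function names are illustrative.

```python
def seconds_per_frame(total_seconds, frames):
    """Average generation time per frame."""
    return total_seconds / frames


def estimate_seconds(frames, ref_seconds=72.0, ref_frames=81):
    """Extrapolate linearly from the reference measurement (an assumption)."""
    return frames * seconds_per_frame(ref_seconds, ref_frames)


# Usage:
#   print(round(seconds_per_frame(72, 81), 2))   # average seconds per frame
#   print(round(estimate_seconds(41), 1))        # rough estimate for a shorter clip
```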

14B Version:

  • An RTX 4090 (24 GB VRAM) may run out of memory when generating at 512×512 resolution, and out-of-memory errors have also occurred on an A100 at larger sizes