This guide shows how to use the Hunyuan Text-to-Video and Image-to-Video workflows in ComfyUI.
The Hunyuan Video series is developed and open-sourced by Tencent. It features a hybrid architecture that supports both Text-to-Video and Image-to-Video generation at a parameter scale of 13B.
Technical features:
You can learn more through the official repositories: Hunyuan Video and Hunyuan Video-I2V.
This guide will walk you through setting up both Text-to-Video and Image-to-Video workflows in ComfyUI.
The workflow images in this tutorial contain metadata with model download information.
Simply drag them into ComfyUI, or use the menu Workflows -> Open (Ctrl+O), to load the corresponding workflow; it will then prompt you to download the required models.
Alternatively, this guide provides direct model links if automatic downloads fail or you are not using the Desktop version. All models are available here for download.
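If you prefer to script the downloads, here is a minimal Python sketch using the huggingface_hub library. The repo id Comfy-Org/HunyuanVideo_repackaged and the remote file paths are assumptions about where the repackaged files are hosted; substitute the actual links from this guide if they differ:

```python
import shutil
from pathlib import Path
from huggingface_hub import hf_hub_download

REPO_ID = "Comfy-Org/HunyuanVideo_repackaged"  # assumed repo id -- verify against the links above

# Map of (assumed) remote paths to the ComfyUI folders where the files belong.
files = {
    "split_files/vae/hunyuan_video_vae_bf16.safetensors": "ComfyUI/models/vae",
    "split_files/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors": "ComfyUI/models/diffusion_models",
}

for remote_path, target_dir in files.items():
    cached = hf_hub_download(repo_id=REPO_ID, filename=remote_path)  # downloads into the HF cache
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy(cached, Path(target_dir) / Path(remote_path).name)   # place it where ComfyUI looks
```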
If you find missing nodes when loading a workflow file below, make sure you have updated ComfyUI to the latest Development (Nightly) version. See the How to Update ComfyUI section for instructions.
The following models are used in both Text-to-Video and Image-to-Video workflows. Please download and save them to the specified directories:
Storage location:
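For reference, the shared files should end up laid out as follows. The two text encoder filenames are assumptions based on the repackaged release, so match them to whatever your workflow's DualCLIPLoader expects:

```
ComfyUI/
└── models/
    ├── text_encoders/
    │   ├── clip_l.safetensors                   # assumed filename
    │   └── llava_llama3_fp8_scaled.safetensors  # assumed filename
    └── vae/
        └── hunyuan_video_vae_bf16.safetensors
```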
Hunyuan Text-to-Video was open-sourced in December 2024. It supports generating 5-second short videos from natural language descriptions in both Chinese and English.
Download the image below and drag it into ComfyUI to load the workflow:
Download hunyuan_video_t2v_720p_bf16.safetensors and save it to the ComfyUI/models/diffusion_models folder.
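After downloading, the layout for the Text-to-Video model should look like this:

```
ComfyUI/
└── models/
    └── diffusion_models/
        └── hunyuan_video_t2v_720p_bf16.safetensors
```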
Ensure all these model files are loaded in the corresponding nodes:
- DualCLIPLoader node: the text encoder models (see the storage layout above)
- Load Diffusion Model node: hunyuan_video_t2v_720p_bf16.safetensors
- Load VAE node: hunyuan_video_vae_bf16.safetensors
Click the Queue button, or use the shortcut Ctrl(Cmd) + Enter, to run the workflow.

When the length parameter in the EmptyHunyuanLatentVideo node is set to 1, the model generates a static image instead of a video.
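If you drive ComfyUI through its HTTP API rather than the UI, the same length tweak can be scripted. A minimal sketch, assuming you have exported this workflow in API format to a hypothetical file hunyuan_t2v_api.json and that ComfyUI is running at the default 127.0.0.1:8188:

```python
import json
import urllib.request

# Hypothetical filename for the API-format export of this workflow.
with open("hunyuan_t2v_api.json", "r", encoding="utf-8") as f:
    workflow = json.load(f)

# Find the EmptyHunyuanLatentVideo node and set its length.
for node in workflow.values():
    if node.get("class_type") == "EmptyHunyuanLatentVideo":
        node["inputs"]["length"] = 1  # 1 frame -> static image; raise it for video

payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",  # ComfyUI's queue endpoint
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # returns a prompt_id on success
```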
The Hunyuan Image-to-Video model was open-sourced on March 6, 2025. Built on the HunyuanVideo framework, it transforms static images into smooth, high-quality videos, and it also provides LoRA training code for customizing special video effects such as hair growth and object transformation.
Currently, the Hunyuan Image-to-Video model has two versions:
- v1 “concat”
- v2 “replace”
Download llava_llama3_vision.safetensors and save it to the ComfyUI/models/clip_vision directory.
Download the workflow image below and drag it into ComfyUI to load the workflow:
Download the image below, which we’ll use as the starting frame for the image-to-video generation:
Ensure all these model files are loaded in the corresponding nodes:
- DualCLIPLoader node: the text encoder models (same as in the Text-to-Video workflow)
- Load CLIP Vision node: llava_llama3_vision.safetensors
- Load Image node: the start image downloaded above
- Load VAE node: hunyuan_video_vae_bf16.safetensors
- Load Diffusion Model node: hunyuan_video_image_to_video_720p_bf16.safetensors
Click the Queue button, or use the shortcut Ctrl(Cmd) + Enter, to run the workflow.

The v2 workflow is essentially the same as the v1 workflow; you just need to download the replace model and select it in the Load Diffusion Model node.
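If you keep API-format exports of your workflows, the v1-to-v2 switch can also be made by editing the JSON directly. A sketch, assuming a hypothetical export file hunyuan_i2v_v1_api.json; UNETLoader is the class name behind the Load Diffusion Model node:

```python
import json

# Hypothetical filename for the exported v1 API-format workflow.
with open("hunyuan_i2v_v1_api.json", "r", encoding="utf-8") as f:
    workflow = json.load(f)

# "Load Diffusion Model" is the UNETLoader node; swap in the v2 "replace" model.
for node in workflow.values():
    if node.get("class_type") == "UNETLoader":
        node["inputs"]["unet_name"] = "hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors"

with open("hunyuan_i2v_v2_api.json", "w", encoding="utf-8") as f:
    json.dump(workflow, f, indent=2)
```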
Download the workflow image below and drag it into ComfyUI to load the workflow:
Download the image below, which we’ll use as the starting frame for the image-to-video generation:
Ensure all these model files are loaded in the corresponding nodes:
- DualCLIPLoader node: the text encoder models (same as in the Text-to-Video workflow)
- Load CLIP Vision node: llava_llama3_vision.safetensors
- Load Image node: the start image downloaded above
- Load VAE node: hunyuan_video_vae_bf16.safetensors
- Load Diffusion Model node: hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors
Click the Queue button, or use the shortcut Ctrl(Cmd) + Enter, to run the workflow.

Below are some sample images and prompts we provide. Use them as-is, or adjust them, to create your own videos.
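To try several prompt variations in one batch, you can queue them over the API. A sketch, assuming a hypothetical API-format export and that the first CLIPTextEncode node found is the positive prompt (verify this against your own workflow):

```python
import json
import urllib.request

def queue(workflow: dict) -> None:
    """POST one workflow to ComfyUI's default queue endpoint."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request("http://127.0.0.1:8188/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req).read()

with open("hunyuan_t2v_api.json", "r", encoding="utf-8") as f:  # hypothetical export
    base = json.load(f)

prompts = [  # placeholders -- replace with the provided prompts or your own
    "a cat walking through a sunlit garden",
    "a paper boat drifting down a rainy street",
]

for text in prompts:
    wf = json.loads(json.dumps(base))  # deep copy so each run starts clean
    for node in wf.values():
        if node.get("class_type") == "CLIPTextEncode":
            node["inputs"]["text"] = text
            break  # assumes the first CLIPTextEncode is the positive prompt
    queue(wf)
```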