WanImageToVideoApi - ComfyUI Built-in Node Documentation

The Wan Image to Video node generates a video from a single input image and a text prompt. It uses the provided image as the first frame and creates a video sequence based on the description, with options for resolution, duration, audio, and other advanced settings.

Inputs

Parameter	Description	Data Type	Required	Range
`model`	Model to use (default: "wan2.6-i2v").	COMBO	Yes	"wan2.5-i2v-preview", "wan2.6-i2v"
`image`	Input image that serves as the first frame for video generation. Exactly one image is required.	IMAGE	Yes	-
`prompt`	Prompt describing the elements and visual features. Supports English and Chinese (default: empty).	STRING	Yes	-
`negative_prompt`	Negative prompt describing what to avoid (default: empty).	STRING	No	-
`resolution`	Video resolution quality (default: "720P"). The Wan 2.6 model does not support 480P.	COMBO	No	"480P", "720P", "1080P"
`duration`	Duration of the generated video in seconds. A 15-second duration is supported only by the Wan 2.6 model (default: 5).	INT	No	5-15 (step: 5)
`audio`	Audio must contain a clear, loud voice, without extraneous noise or background music. When provided, audio duration must be between 3.0 and 29.0 seconds.	AUDIO	No	-
`seed`	Seed to use for generation (default: 0).	INT	No	0-2147483647
`generate_audio`	If no audio input is provided, generate audio automatically (default: False).	BOOLEAN	No	-
`prompt_extend`	Whether to enhance the prompt with AI assistance (default: True).	BOOLEAN	No	-
`watermark`	Whether to add an AI-generated watermark to the result (default: False).	BOOLEAN	No	-
`shot_type`	Shot type for the generated video: a single continuous shot or multiple shots with cuts. This parameter takes effect only when `prompt_extend` is True (default: "single").	COMBO	No	"single", "multi"

Constraints:

Exactly one input image is required for video generation.
The Wan 2.6 model (wan2.6-i2v) does not support 480P resolution.
A 15-second duration is supported only by the Wan 2.6 model (wan2.6-i2v).
When audio is provided, it must be between 3.0 and 29.0 seconds in duration.
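
The constraints above can be checked before a job is queued. The following is a minimal sketch only, not the node's actual implementation: `WanI2VSettings` and `check_constraints` are hypothetical names introduced for illustration, the field names simply mirror the input table, and the audio bounds are assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WanI2VSettings:
    """Hypothetical container mirroring a subset of the node's inputs."""
    model: str = "wan2.6-i2v"
    resolution: str = "720P"
    duration: int = 5                      # seconds; 5, 10, or 15
    image_count: int = 1                   # number of images in the IMAGE input
    audio_seconds: Optional[float] = None  # None means no audio input


def check_constraints(s: WanI2VSettings) -> List[str]:
    """Return human-readable constraint violations (empty list if none)."""
    problems: List[str] = []
    if s.image_count != 1:
        problems.append("exactly one input image is required")
    if s.duration not in (5, 10, 15):
        problems.append("duration must be 5, 10, or 15 seconds")
    if s.model == "wan2.6-i2v" and s.resolution == "480P":
        problems.append("wan2.6-i2v does not support 480P")
    if s.duration == 15 and s.model != "wan2.6-i2v":
        problems.append("a 15-second duration requires wan2.6-i2v")
    # Assumption: the 3.0 and 29.0 second bounds are inclusive.
    if s.audio_seconds is not None and not (3.0 <= s.audio_seconds <= 29.0):
        problems.append("audio must be between 3.0 and 29.0 seconds long")
    return problems


if __name__ == "__main__":
    settings = WanI2VSettings(model="wan2.5-i2v-preview", duration=15)
    for problem in check_constraints(settings):
        print("Invalid settings:", problem)
```

Returning a list rather than raising on the first failure lets a caller report every violated constraint at once.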

Outputs

Output Name	Description	Data Type
`output`	Generated video based on the input image and prompt.	VIDEO

This documentation was AI-generated.
