The Wan Text to Video node generates video content based on text descriptions. It uses AI models to create videos from prompts and supports various video sizes, durations, and optional audio inputs. The node can automatically generate audio when needed and provides options for prompt enhancement and watermarking.

Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| model | Model to use (default: "wan2.6-t2v") | COMBO | Yes | "wan2.5-t2v-preview", "wan2.6-t2v" |
| prompt | Prompt describing the elements and visual features. Supports English and Chinese (default: "") | STRING | Yes | - |
| negative_prompt | Negative prompt describing what to avoid (default: "") | STRING | No | - |
| size | Video resolution and aspect ratio (default: "720p: 1:1 (960x960)") | COMBO | No | "480p: 1:1 (624x624)", "480p: 16:9 (832x480)", "480p: 9:16 (480x832)", "720p: 1:1 (960x960)", "720p: 16:9 (1280x720)", "720p: 9:16 (720x1280)", "720p: 4:3 (1088x832)", "720p: 3:4 (832x1088)", "1080p: 1:1 (1440x1440)", "1080p: 16:9 (1920x1080)", "1080p: 9:16 (1080x1920)", "1080p: 4:3 (1632x1248)", "1080p: 3:4 (1248x1632)" |
| duration | Duration of the video in seconds. A 15-second duration is available only for the Wan 2.6 model (default: 5) | INT | No | 5-15 (in steps of 5) |
| audio | Audio must contain a clear, loud voice, without extraneous noise or background music | AUDIO | No | - |
| seed | Seed to use for generation (default: 0) | INT | No | 0-2147483647 |
| generate_audio | If no audio input is provided, generate audio automatically (default: False) | BOOLEAN | No | - |
| prompt_extend | Whether to enhance the prompt with AI assistance (default: True) | BOOLEAN | No | - |
| watermark | Whether to add an AI-generated watermark to the result (default: False) | BOOLEAN | No | - |
| shot_type | Shot type for the generated video: a single continuous shot or multiple shots with cuts. Takes effect only when prompt_extend is True (default: "single") | COMBO | No | "single", "multi" |

Note: The Wan 2.6 model does not support 480p resolutions, and a 15-second duration is supported only by the Wan 2.6 model. Audio input, when provided, must be between 3.0 and 29.0 seconds long and contain a clear voice without background noise or music.
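
The constraints in the note above can be checked before a job is queued. The following is a minimal sketch, not the node's own validation logic: `parse_size` and `check_wan_t2v_inputs` are hypothetical helper names, the size pattern is inferred from the listed options, and the audio bounds are assumed to be inclusive.

```python
import re

# Hypothetical helpers, not part of the node's API. They mirror the rules
# stated in the documentation so inputs can be checked ahead of time.
# Pattern inferred from options such as "720p: 16:9 (1280x720)".
SIZE_PATTERN = re.compile(r"^(\d+p): (\d+:\d+) \((\d+)x(\d+)\)$")


def parse_size(size):
    """Split a size option into (tier, width, height)."""
    match = SIZE_PATTERN.match(size)
    if match is None:
        raise ValueError(f"unrecognized size option: {size!r}")
    tier, _ratio, width, height = match.groups()
    return tier, int(width), int(height)


def check_wan_t2v_inputs(model, size, duration, audio_seconds=None):
    """Return a list of problems; an empty list means the documented rules are met."""
    problems = []
    tier, _width, _height = parse_size(size)
    is_wan26 = model == "wan2.6-t2v"

    if is_wan26 and tier == "480p":
        problems.append("Wan 2.6 does not support 480p resolutions")

    if duration not in (5, 10, 15):
        problems.append("duration must be 5, 10, or 15 seconds")
    elif duration == 15 and not is_wan26:
        problems.append("a 15-second duration requires the Wan 2.6 model")

    # Assumption: the 3.0 and 29.0 second bounds are inclusive.
    if audio_seconds is not None and not 3.0 <= audio_seconds <= 29.0:
        problems.append("audio input must be 3.0 to 29.0 seconds long")

    return problems


if __name__ == "__main__":
    print(check_wan_t2v_inputs("wan2.6-t2v", "480p: 1:1 (624x624)", 5))
```

Returning a list of messages rather than raising on the first failure lets a caller report every violated rule at once.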

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| output | The generated video based on the input parameters | VIDEO |

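
To illustrate how the documented inputs fit together, the sketch below assembles one node entry in the `class_type` plus `inputs` shape used by ComfyUI's API-format workflows. The defaults are copied from the Inputs table. The `build_node_entry` helper is hypothetical, and the `class_type` value is an assumption that should be verified against your own ComfyUI installation.

```python
import json

# Defaults copied from the Inputs table. The optional audio input has no
# default and is therefore left out unless the caller supplies it.
DEFAULT_INPUTS = {
    "model": "wan2.6-t2v",
    "prompt": "",
    "negative_prompt": "",
    "size": "720p: 1:1 (960x960)",
    "duration": 5,
    "seed": 0,
    "generate_audio": False,
    "prompt_extend": True,
    "watermark": False,
    "shot_type": "single",
}

# Assumption: the registered node identifier. Verify it in your installation.
NODE_CLASS_TYPE = "WanTextToVideoApi"


def build_node_entry(prompt, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    allowed = set(DEFAULT_INPUTS) | {"audio"}
    unknown = set(overrides) - allowed
    if unknown:
        raise KeyError(f"unknown input(s): {sorted(unknown)}")
    inputs = {**DEFAULT_INPUTS, **overrides, "prompt": prompt}
    return {"class_type": NODE_CLASS_TYPE, "inputs": inputs}


if __name__ == "__main__":
    entry = build_node_entry("a red kite over a beach", duration=10)
    print(json.dumps(entry, indent=2))
```

Rejecting unknown keys catches misspelled parameter names early, before the workflow is submitted.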

Source fingerprint (SHA-256): 4fbdb2e06ff15849684de860ca3fdf4eb43e6af1803483b4baa7229e584f6e25