Inputs
| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| positive | Positive conditioning prompts for video generation | CONDITIONING | Yes | - |
| negative | Negative conditioning prompts to avoid in video generation | CONDITIONING | Yes | - |
| vae | VAE model for encoding images to latent space | VAE | Yes | - |
| width | Output video width in pixels (default: 832, step: 16) | INT | Yes | 16 to MAX_RESOLUTION |
| height | Output video height in pixels (default: 480, step: 16) | INT | Yes | 16 to MAX_RESOLUTION |
| length | Number of frames in the video sequence (default: 81, step: 4) | INT | Yes | 1 to MAX_RESOLUTION |
| batch_size | Number of videos to generate simultaneously (default: 1) | INT | Yes | 1 to 4096 |
| clip_vision_output | Optional CLIP vision output for additional conditioning | CLIP_VISION_OUTPUT | No | - |
| start_image | Optional starting image to initialize the video sequence. When provided, the first frames of the video will be based on this image, with a mask applied to blend the starting frames with generated content. The image is resized to match the specified width and height. | IMAGE | No | - |
| camera_conditions | Optional camera embedding conditions for video generation. When provided, these conditions are applied to both positive and negative conditioning. | WAN_CAMERA_EMBEDDING | No | - |

Note: When start_image is provided, the node uses it to initialize the video sequence and applies masking to blend the starting frames with generated content. The camera_conditions and clip_vision_output parameters are optional; when provided, they modify both the positive and negative conditioning.

Outputs
| Output Name | Description | Data Type |
|---|---|---|
| positive | Modified positive conditioning with applied camera conditions, clip vision outputs, and/or starting image data | CONDITIONING |
| negative | Modified negative conditioning with applied camera conditions, clip vision outputs, and/or starting image data | CONDITIONING |
| latent | Generated empty video latent representation for use with video models. The latent tensor has dimensions [batch_size, 16, frames, height/8, width/8] where frames is calculated as ((length - 1) // 4) + 1. | LATENT |
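
The latent dimensions described in the table can be checked with a few lines of arithmetic. The following is a minimal sketch in plain Python, not the node's actual implementation: the helper name `wan_latent_shape` is hypothetical, and it assumes that `height/8` and `width/8` mean integer division.

```python
def wan_latent_shape(width=832, height=480, length=81, batch_size=1):
    """Hypothetical helper: compute the documented latent tensor shape.

    Mirrors the formula from the outputs table:
    [batch_size, 16, ((length - 1) // 4) + 1, height / 8, width / 8]
    Integer division for height and width is an assumption.
    """
    frames = ((length - 1) // 4) + 1  # temporal compression by a factor of 4
    return (batch_size, 16, frames, height // 8, width // 8)


# With the documented defaults (832x480, 81 frames, batch size 1):
print(wan_latent_shape())  # (1, 16, 21, 60, 104)
```

With the default settings, 81 input frames map to 21 latent frames, and the 832x480 frame maps to a 104x60 latent grid.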