Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| positive | Positive conditioning input that guides the video generation toward desired content | CONDITIONING | Yes | - |
| negative | Negative conditioning input that steers the video generation away from unwanted content | CONDITIONING | Yes | - |
| vae | VAE model used for encoding reference images into latent space | VAE | Yes | - |
| width | Width of the output video frames in pixels (default: 832, must be divisible by 16) | INT | Yes | 16 to MAX_RESOLUTION |
| height | Height of the output video frames in pixels (default: 480, must be divisible by 16) | INT | Yes | 16 to MAX_RESOLUTION |
| length | Number of frames in the generated video sequence (default: 97; `length - 1` must be divisible by 4) | INT | Yes | 1 to MAX_RESOLUTION |
| batch_size | Number of video sequences to generate simultaneously (default: 1) | INT | Yes | 1 to 4096 |
| audio_encoder_output | Optional audio encoding data that can influence video generation based on audio content | AUDIOENCODEROUTPUT | No | - |
| ref_image | Optional reference image used to guide the video generation style and content | IMAGE | No | - |
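
The divisibility constraints on `width`, `height`, and `length` can be checked before values are passed to the node. The following is a minimal sketch, not part of the node itself: `snap_video_params` and `is_valid_video_params` are hypothetical helper names, and the upper bound `MAX_RESOLUTION` is not enforced because its value is not stated here.

```python
def snap_video_params(width: int, height: int, length: int) -> tuple[int, int, int]:
    """Round each value down to the nearest setting that satisfies the
    documented constraints (hypothetical helper, for illustration only)."""
    # width and height: multiples of 16, never below the minimum of 16
    width = max(16, (width // 16) * 16)
    height = max(16, (height // 16) * 16)
    # length: (length - 1) must be a multiple of 4, never below 1
    length = max(1, ((length - 1) // 4) * 4 + 1)
    return width, height, length


def is_valid_video_params(width: int, height: int, length: int) -> bool:
    """Return True if the values meet the documented lower bounds and
    divisibility rules (MAX_RESOLUTION is deliberately not checked)."""
    return (
        width >= 16 and width % 16 == 0
        and height >= 16 and height % 16 == 0
        and length >= 1 and (length - 1) % 4 == 0
    )


if __name__ == "__main__":
    # The documented defaults already satisfy the constraints.
    print(is_valid_video_params(832, 480, 97))
    # Arbitrary values are rounded down to the nearest valid setting.
    print(snap_video_params(850, 490, 100))
```

Rounding down rather than up is an arbitrary choice in this sketch; it keeps the snapped values from exceeding what was requested.
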

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| positive | Modified positive conditioning with reference image and/or audio embeddings incorporated | CONDITIONING |
| negative | Modified negative conditioning with reference image and/or audio embeddings incorporated | CONDITIONING |
| latent | Generated latent representation containing the video sequence data | LATENT |