Inputs
| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| `positive` | Positive conditioning for guiding the generation towards desired content | CONDITIONING | Yes | - |
| `negative` | Negative conditioning for steering the generation away from unwanted content | CONDITIONING | Yes | - |
| `vae` | VAE model used for encoding and decoding image data | VAE | Yes | - |
| `width` | Output video width in pixels (default: 832, step: 16) | INT | Yes | 16 to MAX_RESOLUTION |
| `height` | Output video height in pixels (default: 480, step: 16) | INT | Yes | 16 to MAX_RESOLUTION |
| `length` | Number of frames to generate (default: 77, step: 4) | INT | Yes | 1 to MAX_RESOLUTION |
| `batch_size` | Number of videos to generate simultaneously (default: 1) | INT | Yes | 1 to 4096 |
| `clip_vision_output` | Optional CLIP vision model output for additional conditioning | CLIP_VISION_OUTPUT | No | - |
| `reference_image` | Reference image used as starting point for generation | IMAGE | No | - |
| `face_video` | Video input providing facial expression guidance | IMAGE | No | - |
| `pose_video` | Video input providing pose and motion guidance | IMAGE | No | - |
| `continue_motion_max_frames` | Maximum number of frames to continue from previous motion (default: 5, step: 4) | INT | Yes | 1 to MAX_RESOLUTION |
| `background_video` | Background video to composite with generated content | IMAGE | No | - |
| `character_mask` | Mask defining character regions for selective processing | MASK | No | - |
| `continue_motion` | Previous motion sequence to continue from for temporal consistency | IMAGE | No | - |
| `video_frame_offset` | Number of frames to skip at the start of every input video. Used to generate longer videos in chunks: connect it to the `video_frame_offset` output of the previous node to extend a video (default: 0, step: 1) | INT | Yes | 0 to MAX_RESOLUTION |

- When `pose_video` is provided, the output length will be adjusted to match the pose video duration if the `trim_to_pose_video` logic is active (currently set to `False` in the source code)
- `face_video` is automatically resized to 512x512 resolution and normalized to a range of -1.0 to 1.0 when processed
- `continue_motion` frames are limited by the `continue_motion_max_frames` parameter; only the last `continue_motion_max_frames` frames from the input are used
- Input videos (`face_video`, `pose_video`, `background_video`, `character_mask`) are offset by `video_frame_offset` before processing; if the offset exceeds the video length, the input is ignored
- If `character_mask` contains only one frame, it will be repeated across all frames
- When `clip_vision_output` is provided, it's applied to both positive and negative conditioning
- If `reference_image` is not provided, a black image (all zeros) is used as the default reference
- If `continue_motion` is not provided, the initial frames are filled with gray (0.5 intensity) noise

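
The input-handling rules in the notes above can be illustrated with a minimal pure-Python sketch. It is not the node's actual implementation: the helper names are hypothetical, frames are modelled as plain list items instead of tensors, pixel values are assumed to lie in the range 0.0 to 1.0, and treating an offset equal to the video length as "exceeded" is an assumption.

```python
def apply_frame_offset(frames, offset):
    """Skip the first `offset` frames of an input video.

    Returns None when no frames remain, mirroring the rule that an input
    is ignored once the offset runs past its length. Treating
    offset == len(frames) as "exceeded" is an assumption of this sketch.
    """
    if frames is None or offset >= len(frames):
        return None
    return frames[offset:]


def take_last_frames(frames, max_frames):
    """Keep only the last `max_frames` frames (assumes max_frames >= 1)."""
    return frames[-max_frames:]


def expand_single_frame_mask(mask_frames, length):
    """Repeat a one-frame mask so that it covers `length` frames."""
    if len(mask_frames) == 1:
        return mask_frames * length
    return mask_frames


def normalize_pixel(value):
    """Map a pixel value from [0.0, 1.0] to [-1.0, 1.0]."""
    return value * 2.0 - 1.0


# Hypothetical usage with integers standing in for frames.
pose_frames = apply_frame_offset(list(range(10)), 4)   # frames 4..9 remain
motion_frames = take_last_frames(list(range(10)), 5)   # frames 5..9 remain
mask_frames = expand_single_frame_mask(["mask"], 3)    # one mask per frame
```

For example, a 10-frame `continue_motion` input with `continue_motion_max_frames` set to 5 would contribute only its final five frames under this reading.
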
Outputs
| Output Name | Description | Data Type |
|---|---|---|
| `positive` | Modified positive conditioning with additional video context including CLIP vision output, pose video latent, face video pixels, concatenated latent image, and concatenated mask | CONDITIONING |
| `negative` | Modified negative conditioning with additional video context including CLIP vision output, pose video latent, face video pixels (inverted), concatenated latent image, and concatenated mask | CONDITIONING |
| `latent` | Generated video content in latent space format with shape `[batch_size, 16, latent_length + trim_latent, latent_height, latent_width]` | LATENT |
| `trim_latent` | Latent space trimming information indicating the number of latent frames to trim from the beginning (corresponds to reference image latent frames) | INT |
| `trim_image` | Image space trimming information for reference motion frames, indicating the number of image frames to trim from the beginning | INT |
| `video_frame_offset` | Updated frame offset for continuing video generation in chunks, calculated as the previous offset plus the generated length | INT |
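
The `latent` shape and the `video_frame_offset` bookkeeping described in the table can be sketched as follows. This is a rough illustration only: the function names are hypothetical, and it assumes each chunk generates exactly `length` new frames, which the node may adjust in practice.

```python
def latent_shape(batch_size, latent_length, trim_latent, latent_height, latent_width):
    """Build the latent shape listed in the Outputs table (16 channels)."""
    return (batch_size, 16, latent_length + trim_latent, latent_height, latent_width)


def next_frame_offset(previous_offset, generated_length):
    """Updated offset: the previous offset plus the generated length."""
    return previous_offset + generated_length


def chunk_offsets(total_frames, length):
    """List the offset each chained node would receive for a long video.

    Assumes every chunk advances the offset by `length` frames.
    """
    offsets = []
    offset = 0
    while offset < total_frames:
        offsets.append(offset)
        offset = next_frame_offset(offset, length)
    return offsets


# Hypothetical example: a 200-frame driving video processed 77 frames at a time.
offsets = chunk_offsets(200, 77)  # [0, 77, 154]
```

In a workflow, each value in `offsets` corresponds to one node in the chain, with every node's `video_frame_offset` output wired to the next node's `video_frame_offset` input.
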