Inputs
| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| clip_vision | CLIP vision model for encoding the input image | CLIP_VISION | Yes | - |
| init_image | Initial image to use as the starting point for video generation | IMAGE | Yes | - |
| vae | VAE model for encoding the image into latent space | VAE | Yes | - |
| width | Output video width (default: 1024, step: 8) | INT | Yes | 16 to MAX_RESOLUTION |
| height | Output video height (default: 576, step: 8) | INT | Yes | 16 to MAX_RESOLUTION |
| video_frames | Number of frames to generate in the video (default: 14) | INT | Yes | 1 to 4096 |
| motion_bucket_id | Controls the amount of motion in the generated video (default: 127) | INT | Yes | 1 to 1023 |
| fps | Frames per second for the generated video (default: 6) | INT | Yes | 1 to 1024 |
| augmentation_level | Level of noise augmentation to apply to the input image (default: 0.0, step: 0.01) | FLOAT | Yes | 0.0 to 10.0 |
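
The numeric ranges in the table above can be checked before a workflow is queued. The following is a minimal sketch, not the node's own validation code: `validate_inputs`, `RANGES`, and `DEFAULTS` are hypothetical names, and the value given to `MAX_RESOLUTION` is an assumption, since the table only names the constant. The `step` values are not enforced here.

```python
# Assumed value; the table only names the constant MAX_RESOLUTION.
MAX_RESOLUTION = 16384

# Inclusive (min, max) ranges copied from the Inputs table.
RANGES = {
    "width": (16, MAX_RESOLUTION),
    "height": (16, MAX_RESOLUTION),
    "video_frames": (1, 4096),
    "motion_bucket_id": (1, 1023),
    "fps": (1, 1024),
    "augmentation_level": (0.0, 10.0),
}

# Defaults copied from the Inputs table.
DEFAULTS = {
    "width": 1024,
    "height": 576,
    "video_frames": 14,
    "motion_bucket_id": 127,
    "fps": 6,
    "augmentation_level": 0.0,
}


def validate_inputs(**overrides):
    """Merge overrides into the defaults and report out-of-range values."""
    values = {**DEFAULTS, **overrides}
    errors = []
    for name, (low, high) in RANGES.items():
        value = values[name]
        if not low <= value <= high:
            errors.append(f"{name}={value} is outside {low} to {high}")
    return values, errors


if __name__ == "__main__":
    values, errors = validate_inputs(width=8, fps=0)
    for message in errors:
        print(message)
```

Calling `validate_inputs()` with no arguments returns the table defaults and an empty error list.
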
Outputs
| Output Name | Description | Data Type |
|---|---|---|
| positive | Positive conditioning data containing image embeddings and video parameters | CONDITIONING |
| negative | Negative conditioning data with zeroed embeddings and video parameters | CONDITIONING |
| latent | Empty latent space tensor ready for video generation | LATENT |
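
To make the three outputs more concrete, here is a rough sketch in plain Python. It rests on assumptions that the table does not state: that the empty latent holds one entry per frame with 4 channels and an 8x spatial downscale (typical of Stable Diffusion style VAEs), and that each conditioning entry pairs an embedding with a dictionary of video parameters. `empty_latent_shape` and `build_conditioning` are hypothetical helpers, and the real node may attach additional data.

```python
def empty_latent_shape(width, height, video_frames, channels=4, downscale=8):
    """Shape of the zero-filled latent, assuming one latent per frame.

    channels=4 and downscale=8 are assumptions, not values from the table.
    """
    return (video_frames, channels, height // downscale, width // downscale)


def build_conditioning(image_embedding, motion_bucket_id, fps, augmentation_level):
    """Pair an embedding with the video parameters, plus a zeroed counterpart."""
    params = {
        "motion_bucket_id": motion_bucket_id,
        "fps": fps,
        "augmentation_level": augmentation_level,
    }
    # Positive: the image embedding together with the video parameters.
    positive = [[list(image_embedding), dict(params)]]
    # Negative: same parameters, embedding replaced by zeros of equal length.
    negative = [[[0.0] * len(image_embedding), dict(params)]]
    return positive, negative


if __name__ == "__main__":
    print(empty_latent_shape(width=1024, height=576, video_frames=14))
    positive, negative = build_conditioning([0.5, -1.0], 127, 6, 0.0)
    print(positive)
    print(negative)
```

Under these assumptions, the default 1024 x 576 video with 14 frames corresponds to a latent of shape `(14, 4, 72, 128)`.
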