The CosmosPredict2ImageToVideoLatent node creates video latent representations from images for video generation. It can generate a blank video latent or incorporate start and end images to create video sequences with specified dimensions and duration. The node handles the encoding of images into the appropriate latent space format for video processing.

Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| vae | The VAE model used for encoding images into latent space | VAE | Yes | - |
| width | The width of the output video in pixels (default: 848, must be divisible by 16) | INT | Yes | 16 to MAX_RESOLUTION |
| height | The height of the output video in pixels (default: 480, must be divisible by 16) | INT | Yes | 16 to MAX_RESOLUTION |
| length | The number of frames in the video sequence (default: 93, step: 4) | INT | Yes | 1 to MAX_RESOLUTION |
| batch_size | The number of video sequences to generate (default: 1) | INT | Yes | 1 to 4096 |
| start_image | Optional starting image for the video sequence | IMAGE | No | - |
| end_image | Optional ending image for the video sequence | IMAGE | No | - |

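The constraints in the table above can be checked before queuing a workflow. The following is a minimal sketch, not the node's own code: `validate_inputs` is a hypothetical helper, and the value used for `MAX_RESOLUTION` is a placeholder assumption, since the actual limit is defined by the host application.

```python
# Placeholder upper bound (assumption); the real MAX_RESOLUTION comes from the host application.
MAX_RESOLUTION = 16384


def validate_inputs(width=848, height=480, length=93, batch_size=1,
                    max_resolution=MAX_RESOLUTION):
    """Return a list of problems; an empty list means the values fit the documented ranges."""
    problems = []

    # width and height share the same rule: 16..max_resolution and divisible by 16.
    for name, value in (("width", width), ("height", height)):
        if not 16 <= value <= max_resolution:
            problems.append(f"{name}={value} is outside 16..{max_resolution}")
        elif value % 16 != 0:
            problems.append(f"{name}={value} is not divisible by 16")

    # The documented step of 4 for length is treated here as a UI increment,
    # not a hard constraint (an assumption), so only the range is checked.
    if not 1 <= length <= max_resolution:
        problems.append(f"length={length} is outside 1..{max_resolution}")

    if not 1 <= batch_size <= 4096:
        problems.append(f"batch_size={batch_size} is outside 1..4096")

    return problems


if __name__ == "__main__":
    print(validate_inputs())           # documented defaults
    print(validate_inputs(width=850))  # 850 is not a multiple of 16
```

Returning a list of messages rather than raising on the first failure lets a caller report every out-of-range value at once.
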
Note: When neither start_image nor end_image is provided, the node generates a blank video latent. When one or both images are provided, they are encoded and placed at the beginning and/or end of the video sequence, and the mask marks those positions so that they are preserved during generation.
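The masking behavior described in the note can be illustrated at the level of frame indices. This is a simplified sketch under stated assumptions: `build_frame_mask` is a hypothetical helper, it works on plain frame positions rather than on encoded latents, and the convention that 0.0 means "preserve" and 1.0 means "generate" is assumed for illustration rather than taken from the node's source.

```python
def build_frame_mask(length, start_frames=0, end_frames=0):
    """Build a per-frame mask: 0.0 = preserved (assumed convention), 1.0 = to be generated.

    start_frames / end_frames are the number of positions occupied by the
    optional start and end images; both default to 0 (a fully blank sequence).
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if start_frames < 0 or end_frames < 0:
        raise ValueError("frame counts must be non-negative")
    if start_frames + end_frames > length:
        raise ValueError("start and end frames do not fit in the sequence")

    mask = [1.0] * length                      # everything is generated by default
    for i in range(start_frames):              # pin the leading frames
        mask[i] = 0.0
    for i in range(length - end_frames, length):  # pin the trailing frames
        mask[i] = 0.0
    return mask


if __name__ == "__main__":
    print(build_frame_mask(5))        # [1.0, 1.0, 1.0, 1.0, 1.0]
    print(build_frame_mask(5, 1, 1))  # [0.0, 1.0, 1.0, 1.0, 0.0]
```

With no images, the mask is uniform, which corresponds to the blank-latent case. With a start or end image, only the positions those images occupy are pinned.
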

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| samples | The generated video latent representation containing the encoded video sequence | LATENT |
| noise_mask | A mask indicating which parts of the latent should be preserved during generation | LATENT |

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

Source fingerprint (SHA-256): 5c470c35afaa9b994b431f0a8655e108da420182ec5c223fa54ec2c233ba9fd3