The CosmosImageToVideoLatent node creates video latent representations from input images. It generates a blank video latent and optionally encodes a start image, an end image, or both into the first and last frames of the video sequence. When images are provided, it also creates corresponding noise masks that indicate which parts of the latent should be preserved during generation.

Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| vae | The VAE model used for encoding images into latent space | VAE | Yes | - |
| width | The width of the output video in pixels (default: 1280) | INT | Yes | 16 to MAX_RESOLUTION |
| height | The height of the output video in pixels (default: 704) | INT | Yes | 16 to MAX_RESOLUTION |
| length | The number of frames in the video sequence (default: 121) | INT | Yes | 1 to MAX_RESOLUTION |
| batch_size | The number of latent batches to generate (default: 1) | INT | Yes | 1 to 4096 |
| start_image | Optional image to encode at the beginning of the video sequence | IMAGE | No | - |
| end_image | Optional image to encode at the end of the video sequence | IMAGE | No | - |

Note: When neither start_image nor end_image is provided, the node returns a blank latent without any noise mask. When either image is provided, the corresponding section of the latent is encoded and masked accordingly.
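The masking behavior described in the note can be illustrated with a small sketch. This is not the node's actual implementation: the function name `sketch_noise_mask`, the temporal compression factor of 8, and the convention that 0.0 marks a preserved latent frame and 1.0 a frame to be generated are all assumptions made for illustration. It models only the time axis, with one mask value per latent frame.

```python
from typing import List, Optional


def sketch_noise_mask(
    length: int,
    start_frames: int = 0,
    end_frames: int = 0,
    temporal_factor: int = 8,  # assumed temporal compression of the VAE
) -> Optional[List[float]]:
    """Hypothetical per-latent-frame mask: 0.0 = preserve, 1.0 = generate.

    start_frames / end_frames are the number of image frames supplied as
    start_image / end_image (0 means the image was not provided).
    """
    # No images at all: blank latent, no noise mask.
    if start_frames == 0 and end_frames == 0:
        return None

    # Number of latent frames covering `length` video frames (assumption).
    latent_len = (length - 1) // temporal_factor + 1
    mask = [1.0] * latent_len

    def latent_frames(n: int) -> int:
        # Latent frames occupied by n image frames, capped at the video length.
        return (min(n, length) - 1) // temporal_factor + 1

    if start_frames > 0:
        for i in range(latent_frames(start_frames)):
            mask[i] = 0.0  # preserve the encoded start section

    if end_frames > 0:
        k = latent_frames(end_frames)
        for i in range(latent_len - k, latent_len):
            mask[i] = 0.0  # preserve the encoded end section

    return mask


if __name__ == "__main__":
    print(sketch_noise_mask(121))                                # no images
    print(sketch_noise_mask(121, start_frames=1))                # start only
    print(sketch_noise_mask(121, start_frames=1, end_frames=1))  # both ends
```

Under these assumptions, the default length of 121 maps to 16 latent frames, and a single start image marks only the first latent frame as preserved.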

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| latent | The generated video latent representation with optional encoded images and corresponding noise masks | LATENT |
