The LTXVImgToVideo node converts an input image into a video latent representation for video generation models. It takes a single image and extends it into a sequence of frames using the VAE encoder, then applies conditioning with strength control to determine how much of the original image content is preserved versus modified during video generation.

Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| positive | Positive conditioning prompts for guiding the video generation | CONDITIONING | Yes | - |
| negative | Negative conditioning prompts for avoiding certain elements in the video | CONDITIONING | Yes | - |
| vae | VAE model used for encoding the input image into latent space | VAE | Yes | - |
| image | Input image to be converted into video frames | IMAGE | Yes | - |
| width | Output video width in pixels (default: 768, step: 32) | INT | No | 64 to MAX_RESOLUTION |
| height | Output video height in pixels (default: 512, step: 32) | INT | No | 64 to MAX_RESOLUTION |
| length | Number of frames in the generated video (default: 97, step: 8) | INT | No | 9 to MAX_RESOLUTION |
| batch_size | Number of videos to generate simultaneously (default: 1) | INT | No | 1 to 4096 |
| strength | Control over how much of the original image content is preserved in the first frame of the generated video. A value of 1.0 preserves the original image completely, while 0.0 allows maximum modification (default: 1.0) | FLOAT | No | 0.0 to 1.0 |
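
The step sizes above (32 for width and height, 8 for length) suggest how pixel dimensions map onto the latent grid. The following is a minimal sketch of that mapping, not the node's actual source. The function name `latent_shape`, the 128-channel count, and the compression factors (32 spatial, 8 temporal) are assumptions chosen to match the documented steps and defaults.

```python
def latent_shape(width=768, height=512, length=97, batch_size=1,
                 channels=128, spatial_factor=32, temporal_factor=8):
    """Estimate the latent tensor shape for the given output settings.

    channels, spatial_factor and temporal_factor are ASSUMED values,
    inferred from the documented step sizes; they are not confirmed here.
    Returns (batch, channels, latent_frames, latent_height, latent_width).
    """
    # Lower bounds taken from the documented parameter ranges.
    if width < 64 or height < 64:
        raise ValueError("width and height must be at least 64")
    if length < 9:
        raise ValueError("length must be at least 9")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # One latent frame for the first video frame, then one per
    # temporal_factor further frames (assumed layout).
    latent_frames = (length - 1) // temporal_factor + 1
    return (batch_size, channels, latent_frames,
            height // spatial_factor, width // spatial_factor)


# With the documented defaults (768x512, 97 frames):
print(latent_shape())
```

Under these assumptions the default `length` of 97 corresponds to 13 latent frames, which is consistent with `length` values of the form 8k + 1.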

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| positive | Processed positive conditioning with video frame masking applied | CONDITIONING |
| negative | Processed negative conditioning with video frame masking applied | CONDITIONING |
| latent | Video latent representation containing the encoded frames and noise mask for video generation | LATENT |
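
To illustrate how `strength` could relate to the noise mask carried in the `latent` output, here is a hedged sketch. It assumes that the mask holds one value per latent frame, that a value of 1.0 means "fully regenerated" and 0.0 means "kept as encoded", and that the image-conditioned frames receive `1.0 - strength`. The helper `frame_noise_mask` and its `image_latent_frames` parameter are hypothetical names for illustration and are not part of the node's API.

```python
def frame_noise_mask(latent_frames, image_latent_frames=1, strength=1.0):
    """Build a per-latent-frame noise mask as a list of floats.

    ASSUMPTION: frames encoded from the input image get 1.0 - strength,
    all remaining frames get 1.0 (free to be generated).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    if latent_frames < 1:
        raise ValueError("latent_frames must be at least 1")

    mask = [1.0] * latent_frames
    # Only the leading frames that came from the image are conditioned.
    for i in range(min(image_latent_frames, latent_frames)):
        mask[i] = 1.0 - strength
    return mask


# strength = 1.0: the first frame is fully preserved (mask value 0.0).
print(frame_noise_mask(4, strength=1.0))
# strength = 0.0: every frame may be modified (mask value 1.0 throughout).
print(frame_noise_mask(4, strength=0.0))
```

This matches the documented behaviour at the two extremes: 1.0 preserves the original image in the first frame, and 0.0 allows maximum modification.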
This documentation was AI-generated.
