The WanHuMoImageToVideo node converts images to video sequences by generating latent representations for video frames. It processes conditioning inputs and can incorporate reference images and audio embeddings to influence the video generation. The node outputs modified conditioning data and latent representations suitable for video synthesis.

Inputs

| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| positive | Positive conditioning input that guides the video generation toward desired content | CONDITIONING | Yes | - |
| negative | Negative conditioning input that steers the video generation away from unwanted content | CONDITIONING | Yes | - |
| vae | VAE model used for encoding reference images into latent space | VAE | Yes | - |
| width | Width of the output video frames in pixels (default: 832, must be divisible by 16) | INT | Yes | 16 to MAX_RESOLUTION |
| height | Height of the output video frames in pixels (default: 480, must be divisible by 16) | INT | Yes | 16 to MAX_RESOLUTION |
| length | Number of frames in the generated video sequence (default: 97; length - 1 must be divisible by 4) | INT | Yes | 1 to MAX_RESOLUTION |
| batch_size | Number of video sequences to generate simultaneously (default: 1) | INT | Yes | 1 to 4096 |
| audio_encoder_output | Optional audio encoding data that can influence video generation based on audio content | AUDIOENCODEROUTPUT | No | - |
| ref_image | Optional reference image used to guide the video generation style and content | IMAGE | No | - |

Note: When a reference image is provided, it is encoded with the VAE and added to both positive and negative conditioning. When audio encoder output is provided, it is processed and incorporated into the conditioning data. If neither is provided, zero-filled placeholder tensors are used for both reference latents and audio embeddings.
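
The size constraints listed for width, height, and length can be checked before a workflow is queued. The following is a minimal sketch, not the node's own code: `check_humo_dims` and `snap_length` are hypothetical helper names, and the only rules encoded are the ones stated in the parameter descriptions (width and height divisible by 16, length - 1 divisible by 4). The upper bound MAX_RESOLUTION is not checked because its value is not given here.

```python
def check_humo_dims(width: int, height: int, length: int) -> list[str]:
    """Return a list of problems with the requested sizes (empty if all are valid)."""
    problems = []
    if width < 16 or width % 16 != 0:
        problems.append(f"width {width} must be >= 16 and divisible by 16")
    if height < 16 or height % 16 != 0:
        problems.append(f"height {height} must be >= 16 and divisible by 16")
    if length < 1 or (length - 1) % 4 != 0:
        problems.append(f"length {length} must be >= 1 with (length - 1) divisible by 4")
    return problems


def snap_length(length: int) -> int:
    """Round a frame count down to the nearest value where (length - 1) % 4 == 0."""
    return max(1, ((length - 1) // 4) * 4 + 1)
```

With the documented defaults, `check_humo_dims(832, 480, 97)` reports no problems. A frame count such as 100 fails the length rule, and `snap_length(100)` rounds it down to 97.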

Outputs

| Output Name | Description | Data Type |
|---|---|---|
| positive | Modified positive conditioning with reference image and/or audio embeddings incorporated | CONDITIONING |
| negative | Modified negative conditioning with reference image and/or audio embeddings incorporated | CONDITIONING |
| latent | Generated latent representation containing the video sequence data | LATENT |

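
To show how the inputs and outputs line up, here is a hedged sketch of how this node might appear in a ComfyUI API-format workflow, built as a Python dict. It assumes the usual API-format convention in which a link is written as `[source_node_id, output_index]`, and it assumes the output indices follow the order in which the outputs are listed (0 = positive, 1 = negative, 2 = latent). All node IDs and the `output_link` helper are hypothetical placeholders for illustration.

```python
import json

# Hypothetical node IDs; replace them with the IDs used in your own workflow.
POSITIVE_PROMPT_ID = "6"
NEGATIVE_PROMPT_ID = "7"
VAE_LOADER_ID = "39"
HUMO_NODE_ID = "50"

humo_node = {
    "class_type": "WanHuMoImageToVideo",
    "inputs": {
        "positive": [POSITIVE_PROMPT_ID, 0],
        "negative": [NEGATIVE_PROMPT_ID, 0],
        "vae": [VAE_LOADER_ID, 0],
        "width": 832,      # documented default, divisible by 16
        "height": 480,     # documented default, divisible by 16
        "length": 97,      # documented default, (97 - 1) divisible by 4
        "batch_size": 1,
        # The optional inputs ref_image and audio_encoder_output are left out here.
    },
}

# Assumed output slot order, taken from the order of the documented outputs.
OUTPUT_SLOTS = {"positive": 0, "negative": 1, "latent": 2}


def output_link(name: str) -> list:
    """Build a link to one of this node's outputs for a downstream node's inputs."""
    return [HUMO_NODE_ID, OUTPUT_SLOTS[name]]


workflow_fragment = {HUMO_NODE_ID: humo_node}
print(json.dumps(workflow_fragment, indent=2))
```

A downstream sampler node would then reference `output_link("positive")`, `output_link("negative")`, and `output_link("latent")` in its own inputs.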
This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

Source fingerprint (SHA-256): 4d28fe2617f25e72745d34bf2ec19aec2df6e89ad49eabe086ad045690f42d1f