WanMoveTrackToVideo - ComfyUI Built-in Node Documentation

The WanMoveTrackToVideo node prepares conditioning and latent space data for video generation, incorporating optional motion tracking information. It encodes a starting image sequence into a latent representation and can blend in positional data from object tracks to guide the motion in the generated video. The node outputs modified positive and negative conditioning along with an empty latent tensor ready for a video model.

Inputs

Parameter	Description	Data Type	Required	Range
`positive`	The positive conditioning input to be modified.	CONDITIONING	Yes	-
`negative`	The negative conditioning input to be modified.	CONDITIONING	Yes	-
`vae`	The VAE model used to encode the starting image into the latent space.	VAE	Yes	-
`tracks`	Optional motion tracking data containing object paths.	TRACKS	No	-
`strength`	Strength of the track conditioning. (default: 1.0)	FLOAT	No	0.0 - 100.0
`width`	The width of the output video. Must be divisible by 16. (default: 832)	INT	No	16 - MAX_RESOLUTION
`height`	The height of the output video. Must be divisible by 16. (default: 480)	INT	No	16 - MAX_RESOLUTION
`length`	The number of frames in the video sequence. (default: 81)	INT	No	1 - MAX_RESOLUTION
`batch_size`	The batch size for the latent output. (default: 1)	INT	No	1 - 4096
`start_image`	The starting image or image sequence to encode.	IMAGE	Yes	-
`clip_vision_output`	Optional CLIP vision model output to add to the conditioning.	CLIPVISIONOUTPUT	No	-
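The `width` and `height` constraints above (divisible by 16, minimum 16) can be checked before wiring values into the node. The following is a minimal sketch and not part of the node itself: `snap_dimension` is a hypothetical helper, and the value used for `MAX_RESOLUTION` is an assumption that should be checked against your ComfyUI installation.

```python
# Assumed upper bound; verify against the MAX_RESOLUTION constant in your ComfyUI install.
MAX_RESOLUTION = 16384


def snap_dimension(value: int, multiple: int = 16,
                   minimum: int = 16, maximum: int = MAX_RESOLUTION) -> int:
    """Clamp `value` into [minimum, maximum] and round it down to a multiple of `multiple`.

    Hypothetical helper for illustration only; the node does not expose this function.
    """
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    clamped = max(minimum, min(value, maximum))
    snapped = (clamped // multiple) * multiple
    # Rounding down can never go below `minimum` when `minimum` is itself a multiple,
    # but guard anyway in case a caller passes an unusual minimum.
    return max(minimum, snapped)


def is_valid_dimension(value: int, multiple: int = 16,
                       minimum: int = 16, maximum: int = MAX_RESOLUTION) -> bool:
    """Return True when `value` already satisfies the documented constraints."""
    return minimum <= value <= maximum and value % multiple == 0
```

For example, `snap_dimension(850)` rounds down to 848, while the default values 832 and 480 pass `is_valid_dimension` unchanged.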

Note: The `strength` parameter only has an effect when `tracks` is provided. If `tracks` is not provided or `strength` is 0.0, no track conditioning is applied. The `start_image` is used to create the latent image and mask for the conditioning; if it is not provided, the node only passes the conditioning through and outputs an empty latent.
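The rules in the note can be summarized as a small decision table. The sketch below models only the conditions described on this page; `ConditioningPlan` and `plan_conditioning` are hypothetical names introduced for illustration, and the node's real implementation may order or combine these checks differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditioningPlan:
    """Which optional steps run for a given set of inputs (illustrative only)."""
    apply_tracks: bool          # blend track data into the conditioning
    attach_concat_latent: bool  # add concat_latent_image and concat_mask
    attach_clip_vision: bool    # add clip_vision_output


def plan_conditioning(has_tracks: bool, strength: float,
                      has_start_image: bool,
                      has_clip_vision_output: bool) -> ConditioningPlan:
    """Mirror the documented rules; this is not the node's actual code."""
    return ConditioningPlan(
        # Track conditioning needs both a tracks input and a non-zero strength.
        apply_tracks=has_tracks and strength > 0.0,
        # The latent image and mask are derived from start_image.
        attach_concat_latent=has_start_image,
        # CLIP vision output is attached only when supplied.
        attach_clip_vision=has_clip_vision_output,
    )
```

With `tracks` connected but `strength` set to 0.0, `plan_conditioning(True, 0.0, True, False).apply_tracks` is `False`, matching the note.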

Outputs

Output Name	Description	Data Type
`positive`	The modified positive conditioning, potentially containing `concat_latent_image`, `concat_mask`, and `clip_vision_output`.	CONDITIONING
`negative`	The modified negative conditioning, potentially containing `concat_latent_image`, `concat_mask`, and `clip_vision_output`.	CONDITIONING
`latent`	An empty latent tensor with dimensions shaped by the `batch_size`, `length`, `height`, and `width` inputs.	LATENT
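To make the relationship between the inputs and the `latent` output concrete, the sketch below computes a plausible latent shape. It assumes the layout commonly used by Wan video models (16 latent channels, 4x temporal compression, 8x spatial compression); these factors are not stated on this page, so treat them as assumptions and check them against the model you are using. `wan_latent_shape` is a hypothetical helper.

```python
# Assumed compression factors for a Wan-style video VAE; not confirmed by this page.
LATENT_CHANNELS = 16
TEMPORAL_FACTOR = 4
SPATIAL_FACTOR = 8


def wan_latent_shape(batch_size: int = 1, length: int = 81,
                     height: int = 480, width: int = 832) -> tuple:
    """Return (batch, channels, latent_frames, latent_height, latent_width).

    The first frame is kept as its own latent frame, and every further group of
    TEMPORAL_FACTOR frames adds one more (an assumption about the VAE layout).
    """
    if length < 1 or batch_size < 1:
        raise ValueError("length and batch_size must be at least 1")
    latent_frames = (length - 1) // TEMPORAL_FACTOR + 1
    return (
        batch_size,
        LATENT_CHANNELS,
        latent_frames,
        height // SPATIAL_FACTOR,
        width // SPATIAL_FACTOR,
    )
```

Under these assumptions, the default inputs (`batch_size` 1, `length` 81, `height` 480, `width` 832) give a shape of `(1, 16, 21, 60, 104)`.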


Source fingerprint (SHA-256): 9dc861c3616a3d92c9dc647e1d227bc1f94d5c74c58eed41ffa8d28b445c9160