# Wan22ImageToVideoLatent - ComfyUI Built-in Node Documentation

> Complete documentation for the Wan22ImageToVideoLatent node in ComfyUI. Learn its inputs, outputs, parameters and usage.

The Wan22ImageToVideoLatent node creates video latent representations from images. It generates a blank video latent with the specified dimensions and can optionally encode a starting image sequence into its first frames. When a start image is provided, the node encodes it into the latent space and creates a matching noise mask that marks which regions should be denoised during generation.

## Inputs

| Parameter     | Description                                                       | Data Type | Required | Range                 |
| ------------- | ----------------------------------------------------------------- | --------- | -------- | --------------------- |
| `vae`         | The VAE model used for encoding images into latent space          | VAE       | Yes      | -                     |
| `width`       | The width of the output video in pixels (default: 1280, step: 32) | INT       | Yes      | 32 to MAX\_RESOLUTION |
| `height`      | The height of the output video in pixels (default: 704, step: 32) | INT       | Yes      | 32 to MAX\_RESOLUTION |
| `length`      | The number of frames in the video sequence (default: 49, step: 4) | INT       | Yes      | 1 to MAX\_RESOLUTION  |
| `batch_size`  | The number of video latents to generate in one batch (default: 1) | INT       | Yes      | 1 to 4096             |
| `start_image` | Optional starting image sequence to encode into the video latent  | IMAGE     | No       | -                     |

**Note:** When `start_image` is provided, the node encodes the image sequence into the first frames of the latent and generates a matching noise mask. The `width` and `height` values must be divisible by 16 so that they map to whole latent dimensions; the 32-pixel step on both inputs guarantees this. The `length` parameter is the number of video frames, not latent frames: the latent's temporal dimension is calculated as `((length - 1) // 4) + 1`, so the default `length` of 49 yields 13 latent frames.

## Outputs

| Output Name  | Description                                                                  | Data Type |
| ------------ | ---------------------------------------------------------------------------- | --------- |
| `samples`    | The generated video latent representation                                    | LATENT    |
| `noise_mask` | The noise mask indicating which regions should be denoised during generation | LATENT    |
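
To illustrate how the noise mask relates to the encoded start frames, the sketch below builds a simplified per-frame mask in plain Python. It rests on several assumptions that this page does not confirm: a mask value of `1.0` means "denoise this frame" and `0.0` means "keep the encoded content", and a start sequence of `k` frames occupies `((k - 1) // 4) + 1` latent frames, the same temporal compression as the latent itself. The function `frame_mask` is hypothetical, and the real mask also carries batch and spatial dimensions.

```python
def frame_mask(length: int, start_frames: int = 0) -> list[float]:
    """Simplified per-latent-frame noise mask.

    ASSUMPTIONS: 1.0 = frame is denoised, 0.0 = frame is kept from the
    start image; the start sequence compresses in time like the latent.
    """
    total = ((length - 1) // 4) + 1       # latent frames for the whole video
    mask = [1.0] * total                  # by default every frame is generated
    if start_frames > 0:
        used = min(start_frames, length)  # frames beyond `length` are ignored
        encoded = ((used - 1) // 4) + 1   # latent frames taken by the start image
        for i in range(encoded):
            mask[i] = 0.0                 # keep the encoded start frames
    return mask


# A single start image in a 9-frame video pins only the first latent frame.
print(frame_mask(9, start_frames=1))  # [0.0, 1.0, 1.0]
```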

