# WanFunControlToVideo - ComfyUI Built-in Node Documentation

> Complete documentation for the WanFunControlToVideo node in ComfyUI. Learn its inputs, outputs, parameters and usage.

This node supports the Alibaba Wan Fun Control model for video generation. It is available in ComfyUI builds that include [this commit](https://github.com/comfyanonymous/ComfyUI/commit/3661c833bcc41b788a7c9f0e7bc48524f8ee5f82).

* **Purpose:** Prepare the conditioning information needed for video generation, using the Wan 2.1 Fun Control model.

The WanFunControlToVideo node is a ComfyUI addition designed to support Wan Fun Control models for video generation.

This node prepares the essential conditioning information and creates the empty latent that the subsequent video generation process starts from when using the Wan 2.1 Fun model. As its name suggests, it accepts various inputs and converts them into a format suitable for controlling video generation within the WanFun framework.

The node's position in the ComfyUI node hierarchy indicates that it operates in the early stages of the video generation pipeline, focusing on manipulating conditioning signals before actual sampling or decoding of video frames.
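To make the pipeline position concrete, here is a minimal sketch of how the node might be wired in a ComfyUI API-format workflow, written as a Python dict. The node IDs, prompt text, and sampler settings are hypothetical, and the loader nodes referenced by IDs `"37"`, `"38"`, `"39"`, and `"60"` are omitted. The output indices (0, 1, 2) assume the order listed in the Outputs table on this page.

```python
# Sketch of an API-format workflow fragment. Node IDs and prompt text are
# hypothetical; the loader nodes ("37", "38", "39", "60") are not shown.
workflow = {
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a red fox running through snow", "clip": ["38", 0]}},
    "7": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "blurry, low quality", "clip": ["38", 0]}},
    "55": {"class_type": "WanFunControlToVideo",
           "inputs": {
               "positive": ["6", 0],
               "negative": ["7", 0],
               "vae": ["39", 0],
               "width": 832,
               "height": 480,
               "length": 81,
               "batch_size": 1,
               "control_video": ["60", 0],  # optional input
           }},
    "3": {"class_type": "KSampler",
          "inputs": {
              "model": ["37", 0],
              "seed": 0, "steps": 20, "cfg": 6.0,
              "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0,
              "positive": ["55", 0],      # output 0: positive
              "negative": ["55", 1],      # output 1: negative
              "latent_image": ["55", 2],  # output 2: latent
          }},
}


def consumers_of(workflow, source_id):
    """Return {(node_id, input_name): output_index} for inputs fed by source_id."""
    found = {}
    for node_id, node in workflow.items():
        for name, value in node["inputs"].items():
            # A link is encoded as [source_node_id, output_index].
            if isinstance(value, list) and len(value) == 2 and value[0] == source_id:
                found[(node_id, name)] = value[1]
    return found


print(consumers_of(workflow, "55"))
```

The helper `consumers_of` is illustrative only; it simply shows that all three outputs of the node feed the sampler, which is why the node sits before sampling and decoding.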

## Inputs

| Parameter Name       | Required | Data Type            | Description                                                                                                                                                                                                           | Default Value |
| :------------------- | :------- | :------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------ |
| positive             | Yes      | CONDITIONING         | Standard ComfyUI positive conditioning data, typically from a "CLIP Text Encode" node. The positive prompt describes the content, subject matter, and artistic style that the user envisions for the generated video. | N/A           |
| negative             | Yes      | CONDITIONING         | Standard ComfyUI negative conditioning data, typically generated by a "CLIP Text Encode" node. The negative prompt specifies elements, styles, or artifacts that the user wants to avoid in the generated video.      | N/A           |
| vae                  | Yes      | VAE                  | Requires a VAE (Variational Autoencoder) model compatible with the Wan 2.1 Fun model family, used for encoding and decoding image/video data.                                                                         | N/A           |
| width                | Yes      | INT                  | The desired width of output video frames in pixels, with a default value of 832, minimum value of 16, maximum value determined by nodes.MAX\_RESOLUTION, and a step size of 16.                                       | 832           |
| height               | Yes      | INT                  | The desired height of output video frames in pixels, with a default value of 480, minimum value of 16, maximum value determined by nodes.MAX\_RESOLUTION, and a step size of 16.                                      | 480           |
| length               | Yes      | INT                  | The total number of frames in the generated video, with a default value of 81, minimum value of 1, maximum value determined by nodes.MAX\_RESOLUTION, and a step size of 4.                                           | 81            |
| batch\_size          | Yes      | INT                  | The number of videos generated in a single batch, with a default value of 1, minimum value of 1, and maximum value of 4096.                                                                                           | 1             |
| clip\_vision\_output | No       | CLIP\_VISION\_OUTPUT | (Optional) Visual features extracted by a CLIP vision model, allowing for visual style and content guidance.                                                                                                          | None          |
| start\_image         | No       | IMAGE                | (Optional) An initial image that influences the beginning of the generated video.                                                                                                                                     | None          |
| control\_video       | No       | IMAGE                | (Optional) Allows users to provide a preprocessed ControlNet reference video that will guide the motion and potential structure of the generated video.                                                               | None          |
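
The minimum and step values in the table can be checked before a workflow is submitted. The following is a rough sketch, not part of ComfyUI: `LIMITS` and `snap` are hypothetical names, and the sketch assumes that valid values lie on the grid `min + k * step`, which is consistent with the listed defaults (832, 480, 81). The upper bound `nodes.MAX_RESOLUTION` is not checked because its value is not given on this page.

```python
# Hypothetical helper, not part of ComfyUI. Limits are copied from the table above.
LIMITS = {
    "width": {"min": 16, "step": 16, "default": 832},
    "height": {"min": 16, "step": 16, "default": 480},
    "length": {"min": 1, "step": 4, "default": 81},
}


def snap(name, value):
    """Round value down to the nearest min + k*step (assumed grid), never below min."""
    lim = LIMITS[name]
    if value < lim["min"]:
        return lim["min"]
    steps = (value - lim["min"]) // lim["step"]
    return lim["min"] + steps * lim["step"]


print(snap("width", 850))   # largest grid value not above 850
print(snap("length", 100))  # largest grid value not above 100
```

Rounding down is a design choice of this sketch; rounding to the nearest grid value would work equally well for pre-validation.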

## Outputs

| Parameter Name | Data Type    | Description                                                                                                 |
| :------------- | :----------- | :---------------------------------------------------------------------------------------------------------- |
| positive       | CONDITIONING | Positive conditioning with the encoded start\_image and control\_video attached as concat\_latent\_image.   |
| negative       | CONDITIONING | Negative conditioning with the same concat\_latent\_image attached.                                         |
| latent         | LATENT       | A dictionary containing an empty latent tensor with the key "samples".                                      |
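
As a rough guide to the size of the empty latent stored under `"samples"`, the sketch below computes a plausible tensor shape from the node inputs. It rests on assumptions not stated on this page: a Wan 2.1 VAE with 16 latent channels, 8x spatial compression, and 4x temporal compression. The function name `wan_latent_shape` is hypothetical.

```python
def wan_latent_shape(width=832, height=480, length=81, batch_size=1,
                     channels=16, spatial=8, temporal=4):
    """Estimate (batch, channels, frames, height, width) of the empty latent.

    channels, spatial, and temporal are assumed Wan 2.1 VAE properties,
    not values taken from this page.
    """
    latent_frames = (length - 1) // temporal + 1
    return (batch_size, channels, latent_frames, height // spatial, width // spatial)


print(wan_latent_shape())  # shape for the default width, height, and length
```

Under these assumptions the default inputs give a latent of 21 frames at 60 x 104, which also suggests why `length` uses a step of 4 starting from 1.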
