# Get proxyvidutasks creations

Source: https://docs.comfy.org/api-reference/api-nodes/get-proxyvidutasks-creations

https://api.comfy.org/openapi get /proxy/vidu/tasks/{id}/creations

# Post proxyviduimg2video

Source: https://docs.comfy.org/api-reference/api-nodes/post-proxyviduimg2video

https://api.comfy.org/openapi post /proxy/vidu/img2video

# Post proxyvidureference2video

Source: https://docs.comfy.org/api-reference/api-nodes/post-proxyvidureference2video

https://api.comfy.org/openapi post /proxy/vidu/reference2video

# Post proxyvidustart end2video

Source: https://docs.comfy.org/api-reference/api-nodes/post-proxyvidustart-end2video

https://api.comfy.org/openapi post /proxy/vidu/start-end2video

# Post proxyvidutext2video

Source: https://docs.comfy.org/api-reference/api-nodes/post-proxyvidutext2video

https://api.comfy.org/openapi post /proxy/vidu/text2video

# Add review to a specific version of a node

Source: https://docs.comfy.org/api-reference/registry/add-review-to-a-specific-version-of-a-node

https://api.comfy.org/openapi post /nodes/{nodeId}/reviews

# Claim nodeId into publisherId for the authenticated publisher

Source: https://docs.comfy.org/api-reference/registry/claim-nodeid-into-publisherid-for-the-authenticated-publisher

https://api.comfy.org/openapi post /publishers/{publisherId}/nodes/{nodeId}/claim-my-node

This endpoint allows a publisher to claim an unclaimed node, identified by nodeId, into their publisherId. The unclaimed node's repository must be owned by the authenticated user.
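The Vidu endpoints above follow a submit-then-poll pattern: POST an input (image, text, or reference) to create a task, then GET that task's creations. A minimal Python sketch of the URL construction, assuming the API is served from the same `api.comfy.org` host as the OpenAPI document; the request and response bodies are not part of this listing, so they are omitted here:

```python
API_BASE = "https://api.comfy.org"  # assumed host, taken from the OpenAPI URL above

def vidu_img2video_url():
    # Submission endpoint for image-to-video tasks (path from the listing above).
    return f"{API_BASE}/proxy/vidu/img2video"

def vidu_creations_url(task_id):
    # Polling endpoint for a submitted task's generated creations.
    return f"{API_BASE}/proxy/vidu/tasks/{task_id}/creations"
```

A client would POST to the first URL, read a task id from the response, then GET the second URL until the creation is available.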
# Create a new custom node

Source: https://docs.comfy.org/api-reference/registry/create-a-new-custom-node

https://api.comfy.org/openapi post /publishers/{publisherId}/nodes

# Create a new personal access token

Source: https://docs.comfy.org/api-reference/registry/create-a-new-personal-access-token

https://api.comfy.org/openapi post /publishers/{publisherId}/tokens

# Create a new publisher

Source: https://docs.comfy.org/api-reference/registry/create-a-new-publisher

https://api.comfy.org/openapi post /publishers

# Create comfy-nodes for a certain node

Source: https://docs.comfy.org/api-reference/registry/create-comfy-nodes-for-certain-node

https://api.comfy.org/openapi post /nodes/{nodeId}/versions/{version}/comfy-nodes

# Create Node Translations

Source: https://docs.comfy.org/api-reference/registry/create-node-translations

https://api.comfy.org/openapi post /nodes/{nodeId}/translations

# Delete a publisher

Source: https://docs.comfy.org/api-reference/registry/delete-a-publisher

https://api.comfy.org/openapi delete /publishers/{publisherId}

# Delete a specific node

Source: https://docs.comfy.org/api-reference/registry/delete-a-specific-node

https://api.comfy.org/openapi delete /publishers/{publisherId}/nodes/{nodeId}

# Delete a specific personal access token

Source: https://docs.comfy.org/api-reference/registry/delete-a-specific-personal-access-token

https://api.comfy.org/openapi delete /publishers/{publisherId}/tokens/{tokenId}

# Get information about the calling user

Source: https://docs.comfy.org/api-reference/registry/get-information-about-the-calling-user

https://api.comfy.org/openapi get /users

# Get a specific comfy-node based on its ID

Source: https://docs.comfy.org/api-reference/registry/get-specify-comfy-node-based-on-its-id

https://api.comfy.org/openapi get /nodes/{nodeId}/versions/{version}/comfy-nodes/{comfyNodeName}

# List all node versions given some filters
Source: https://docs.comfy.org/api-reference/registry/list-all-node-versions-given-some-filters

https://api.comfy.org/openapi get /versions

# List all versions of a node

Source: https://docs.comfy.org/api-reference/registry/list-all-versions-of-a-node

https://api.comfy.org/openapi get /nodes/{nodeId}/versions

# List comfy-nodes for a node version

Source: https://docs.comfy.org/api-reference/registry/list-comfy-nodes-for-node-version

https://api.comfy.org/openapi get /nodes/{nodeId}/versions/{version}/comfy-nodes

# Publish a new version of a node

Source: https://docs.comfy.org/api-reference/registry/publish-a-new-version-of-a-node

https://api.comfy.org/openapi post /publishers/{publisherId}/nodes/{nodeId}/versions

# Retrieve a node by ComfyUI node name

Source: https://docs.comfy.org/api-reference/registry/retrieve-a-node-by-comfyui-node-name

https://api.comfy.org/openapi get /comfy-nodes/{comfyNodeName}/node

Returns the node that contains a ComfyUI node with the specified name.

# Retrieve a publisher by ID

Source: https://docs.comfy.org/api-reference/registry/retrieve-a-publisher-by-id

https://api.comfy.org/openapi get /publishers/{publisherId}

# Retrieve a specific node by ID

Source: https://docs.comfy.org/api-reference/registry/retrieve-a-specific-node-by-id

https://api.comfy.org/openapi get /nodes/{nodeId}

Returns the details of a specific node.
# Retrieve a specific version of a node

Source: https://docs.comfy.org/api-reference/registry/retrieve-a-specific-version-of-a-node

https://api.comfy.org/openapi get /nodes/{nodeId}/versions/{versionId}

# Retrieve all nodes

Source: https://docs.comfy.org/api-reference/registry/retrieve-all-nodes

https://api.comfy.org/openapi get /publishers/{publisherId}/nodes

# Retrieve all nodes

Source: https://docs.comfy.org/api-reference/registry/retrieve-all-nodes-1

https://api.comfy.org/openapi get /publishers/{publisherId}/nodes/v2

# Retrieve all publishers

Source: https://docs.comfy.org/api-reference/registry/retrieve-all-publishers

https://api.comfy.org/openapi get /publishers

# Retrieve all publishers for a given user

Source: https://docs.comfy.org/api-reference/registry/retrieve-all-publishers-for-a-given-user

https://api.comfy.org/openapi get /users/publishers/

# Retrieve multiple node versions in a single request

Source: https://docs.comfy.org/api-reference/registry/retrieve-multiple-node-versions-in-a-single-request

https://api.comfy.org/openapi post /bulk/nodes/versions

# Retrieve permissions the user has for a given publisher

Source: https://docs.comfy.org/api-reference/registry/retrieve-permissions-the-user-has-for-a-given-publisher

https://api.comfy.org/openapi get /publishers/{publisherId}/nodes/{nodeId}/permissions

# Retrieve permissions the user has for a given publisher

Source: https://docs.comfy.org/api-reference/registry/retrieve-permissions-the-user-has-for-a-given-publisher-1

https://api.comfy.org/openapi get /publishers/{publisherId}/permissions

# Retrieves a list of nodes

Source: https://docs.comfy.org/api-reference/registry/retrieves-a-list-of-nodes

https://api.comfy.org/openapi get /nodes

Returns a paginated list of nodes across all publishers.
# Retrieves a list of nodes

Source: https://docs.comfy.org/api-reference/registry/retrieves-a-list-of-nodes-1

https://api.comfy.org/openapi get /nodes/search

Returns a paginated list of nodes across all publishers.

# Returns a node version to be installed

Source: https://docs.comfy.org/api-reference/registry/returns-a-node-version-to-be-installed

https://api.comfy.org/openapi get /nodes/{nodeId}/install

Retrieves the node data for installation, either the latest or a specific version.

# Unpublish (delete) a specific version of a node

Source: https://docs.comfy.org/api-reference/registry/unpublish-delete-a-specific-version-of-a-node

https://api.comfy.org/openapi delete /publishers/{publisherId}/nodes/{nodeId}/versions/{versionId}

# Update a publisher

Source: https://docs.comfy.org/api-reference/registry/update-a-publisher

https://api.comfy.org/openapi put /publishers/{publisherId}

# Update a specific node

Source: https://docs.comfy.org/api-reference/registry/update-a-specific-node

https://api.comfy.org/openapi put /publishers/{publisherId}/nodes/{nodeId}

# Update changelog and deprecation status of a node version

Source: https://docs.comfy.org/api-reference/registry/update-changelog-and-deprecation-status-of-a-node-version

https://api.comfy.org/openapi put /publishers/{publisherId}/nodes/{nodeId}/versions/{versionId}

Updates only the changelog and deprecated status of a specific version of a node.

# Validate if a publisher username is available

Source: https://docs.comfy.org/api-reference/registry/validate-if-a-publisher-username-is-available

https://api.comfy.org/openapi get /publishers/validate

Checks if the publisher username is already taken.
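The install endpoint above ("Returns a node version to be installed") serves either the latest or a specific version. A minimal URL builder sketch; the `version` query-parameter name is an assumption inferred from the endpoint description, not confirmed by this listing:

```python
from urllib.parse import urlencode

API_BASE = "https://api.comfy.org"  # assumed host, taken from the OpenAPI URL above

def node_install_url(node_id, version=None):
    # GET /nodes/{nodeId}/install: latest version by default, or a
    # specific one via a query parameter (parameter name assumed).
    url = f"{API_BASE}/nodes/{node_id}/install"
    if version is not None:
        url += "?" + urlencode({"version": version})
    return url
```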
# Get release notes

Source: https://docs.comfy.org/api-reference/releases/get-release-notes

https://api.comfy.org/openapi get /releases

Fetch release notes from Strapi with caching.

# BasicScheduler - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/BasicScheduler

The BasicScheduler node is used to compute a sequence of sigma values for diffusion models based on the provided scheduler, model, and denoising parameters.

The `BasicScheduler` node is designed to compute a sequence of sigma values for diffusion models based on the provided scheduler, model, and denoising parameters. It dynamically adjusts the total number of steps based on the denoise factor to fine-tune the diffusion process, providing precise "recipes" for different stages in advanced sampling processes that require fine control (such as multi-stage sampling).

## Inputs

| Parameter | Data Type | Input Type | Default | Range | Metaphor Description | Technical Purpose |
| --- | --- | --- | --- | --- | --- | --- |
| `model` | MODEL | Input | - | - | **Canvas Type**: Different canvas materials need different paint formulas | Diffusion model object, determines sigma calculation basis |
| `scheduler` | COMBO[STRING] | Widget | - | 9 options | **Mixing Technique**: Choose how paint concentration changes | Scheduling algorithm, controls noise decay mode |
| `steps` | INT | Widget | 20 | 1-10000 | **Mixing Count**: 20 mixes vs 50 mixes precision difference | Sampling steps, affects generation quality and speed |
| `denoise` | FLOAT | Widget | 1.0 | 0.0-1.0 | **Creation Intensity**: Control level from fine-tuning to repainting | Denoising strength, supports partial repainting scenarios |

### Scheduler Types

Based on the source code's `comfy.samplers.SCHEDULER_NAMES`, the following 9 schedulers are supported:

| Scheduler Name | Characteristics | Use Cases | Noise Decay Pattern |
| --- | --- | --- | --- |
| **normal** | Standard linear | General scenarios, balanced | Uniform decay |
| **karras** | Smooth transition | High quality, detail-rich | Smooth non-linear decay |
| **exponential** | Exponential decay | Fast generation, efficiency | Exponential rapid decay |
| **sgm_uniform** | SGM uniform | Specific model optimization | SGM optimized decay |
| **simple** | Simple scheduling | Quick testing, basic use | Simplified decay |
| **ddim_uniform** | DDIM uniform | DDIM sampling optimization | DDIM specific decay |
| **beta** | Beta distribution | Special distribution needs | Beta function decay |
| **linear_quadratic** | Linear quadratic | Complex scenario optimization | Quadratic function decay |
| **kl_optimal** | KL optimal | Theoretical optimization | KL divergence optimized decay |

## Outputs

| Parameter | Data Type | Output Type | Metaphor Description | Technical Meaning |
| --- | --- | --- | --- | --- |
| `sigmas` | SIGMAS | Output | **Paint Recipe Chart**: Detailed paint concentration list for step-by-step use | Noise level sequence, guides diffusion model denoising process |

## Node Role: Artist's Color Mixing Assistant

Imagine you are an artist creating a clear image from a chaotic mixture of paint (noise).
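In code terms, the `denoise` handling in the Inputs table amounts to computing a longer schedule and keeping only its tail, so a partial repaint starts from a lower noise level. A minimal sketch modeled on that behavior; the karras formula follows Karras et al., and the sigma bounds here are arbitrary illustrations, not ComfyUI's actual values:

```python
def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    # Karras-style schedule: interpolate in sigma**(1/rho) space, then
    # append a final 0.0 so the last step lands on a clean image.
    ramp = [i / max(n - 1, 1) for i in range(n)]
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return [(hi + t * (lo - hi)) ** rho for t in ramp] + [0.0]

def basic_scheduler_sigmas(steps, denoise=1.0):
    # For denoise < 1.0, compute a longer schedule and keep only its
    # final (steps + 1) values: the low-noise tail of a full schedule.
    total_steps = steps if denoise >= 1.0 else int(steps / denoise)
    return karras_sigmas(total_steps)[-(steps + 1):]

full = basic_scheduler_sigmas(20)                  # 21 sigmas, high noise to 0.0
repaint = basic_scheduler_sigmas(10, denoise=0.5)  # tail of a 20-step schedule
```

The `repaint` list starts well below `full[0]`, which is why lower `denoise` values preserve more of the original image.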
`BasicScheduler` acts like your **professional color mixing assistant**, whose job is to prepare a series of precise paint concentration recipes:

### Workflow

* **Step 1**: Use 90% concentration paint (high noise level)
* **Step 2**: Use 80% concentration paint
* **Step 3**: Use 70% concentration paint
* **...**
* **Final Step**: Use 0% concentration (clean canvas, no noise)

### Color Assistant's Special Skills

**Different mixing methods (scheduler)**:

* **"karras" mixing method**: Paint concentration changes very smoothly, like a professional artist's gradient technique
* **"exponential" mixing method**: Paint concentration decreases rapidly, suitable for quick creation
* **"normal" mixing method**: Paint concentration decreases uniformly (linearly), stable and controllable

**Fine control (steps)**:

* **20 mixes**: Quick painting, efficiency priority
* **50 mixes**: Fine painting, quality priority

**Creation intensity (denoise)**:

* **1.0 = Complete new creation**: Start completely from a blank canvas
* **0.5 = Half transformation**: Keep half of the original painting, transform the other half
* **0.2 = Fine adjustment**: Only make subtle adjustments to the original painting

### Collaboration with Other Nodes

`BasicScheduler` (Color Assistant) → Prepare Recipe → `SamplerCustom` (Artist) → Actual Painting → Completed Work

# Canny - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/Canny

The Canny node is used to extract edge lines from photos.

It extracts all edge lines from a photo, like using a pen to outline it, drawing out the contours and detail boundaries of objects.

## Working Principle

Imagine you are an artist who needs to use a pen to outline a photo. The Canny node acts like an intelligent assistant, helping you decide where to draw lines (edges) and where not to.
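This screening is classic double-threshold hysteresis. A simplified pure-Python sketch of the final Canny stage (gradient computation and non-maximum suppression are omitted; the 2D list of magnitudes stands in for a real gradient image):

```python
from collections import deque

def hysteresis_edges(mag, low, high):
    # Double threshold + hysteresis on a 2D grid of gradient magnitudes
    # (0.0-1.0). Pixels >= high are strong edges and always kept; pixels
    # between low and high are kept only if connected to a strong edge.
    h, w = len(mag), len(mag[0])
    keep = {(y, x) for y in range(h) for x in range(w) if mag[y][x] >= high}
    queue = deque(keep)
    while queue:  # grow outward from strong pixels through weak neighbors
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and (ny, nx) not in keep and mag[ny][nx] >= low):
                    keep.add((ny, nx))
                    queue.append((ny, nx))
    return [[1 if (y, x) in keep else 0 for x in range(w)] for y in range(h)]

mag = [
    [0.9, 0.5, 0.1],  # 0.9 is a strong edge; the 0.5 next to it is kept
    [0.1, 0.1, 0.5],  # this weak 0.5 connects diagonally, so it is kept too
    [0.5, 0.1, 0.1],  # this weak 0.5 is isolated, so it is dropped
]
edges = hysteresis_edges(mag, low=0.4, high=0.8)
```

The connected weak edges survive while the isolated one does not, which is exactly the "middle area" rule described below.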
This process is like a screening job:

* **High threshold** is the "must draw a line" standard: only very obvious, clear contour lines are drawn, such as facial contours and building frames
* **Low threshold** is the "definitely don't draw a line" standard: edges weaker than this are ignored to avoid drawing noise and meaningless lines
* **Middle area**: edges between the two standards are drawn if they connect to "must draw" lines, but not if they are isolated

The final output is a black-and-white image, where white parts are detected edge lines and black parts are areas without edges.

## Inputs

| Parameter Name | Data Type | Input Type | Default | Range | Function Description |
| --- | --- | --- | --- | --- | --- |
| `image` | IMAGE | Input | - | - | Original photo that needs edge extraction |
| `low_threshold` | FLOAT | Widget | 0.4 | 0.01-0.99 | Low threshold, determines how weak an edge can be before it is ignored. Lower values preserve more details but may produce noise |
| `high_threshold` | FLOAT | Widget | 0.8 | 0.01-0.99 | High threshold, determines how strong an edge must be to be preserved. Higher values only keep the most obvious contour lines |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `image` | IMAGE | Black-and-white edge image; white lines are detected edges, black areas are parts without edges |

## Parameter Comparison

*The original documentation shows the source image alongside edge outputs at different threshold settings here.*

**Common Issues:**

* Broken edges: Try lowering the high threshold
* Too much noise: Raise the low threshold
* Missing important details: Lower the low threshold
* Edges too rough: Check input image quality and resolution

# CheckpointLoaderSimple - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/CheckpointLoaderSimple

The CheckpointLoaderSimple node is used to load model files from specified locations and decompose them into three core components: the main model, text encoder, and image encoder/decoder.

This is a model loader node that loads model files from specified locations and decomposes them into three core components: the main model, text encoder, and image encoder/decoder. This node automatically detects all model files in the `ComfyUI/models/checkpoints` folder, as well as additional paths configured in your `extra_model_paths.yaml` file.

1. **Model Compatibility**: Ensure the selected model is compatible with your workflow. Different model types (such as SD1.5, SDXL, Flux, etc.) need to be paired with corresponding samplers and other nodes
2. **File Management**: Place model files in the `ComfyUI/models/checkpoints` folder, or configure other paths through `extra_model_paths.yaml`
3. **Interface Refresh**: If new model files are added while ComfyUI is running, you need to refresh the browser (Ctrl+R) to see the new files in the dropdown list

## Inputs

| Parameter | Data Type | Input Type | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| `ckpt_name` | STRING | Widget | null | All model files in the checkpoints folder | Select the checkpoint model file name to load, which determines the AI model used for subsequent image generation |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| `MODEL` | MODEL | The main diffusion model used for image denoising generation, the core component of AI image creation |
| `CLIP` | CLIP | The model used for encoding text prompts, converting text descriptions into information that the AI can understand |
| `VAE` | VAE | The model used for image encoding and decoding, responsible for converting between pixel space and latent space |

# ClipLoader - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/ClipLoader

The ClipLoader node is used to load CLIP text encoder models independently.

This node is primarily used for loading CLIP text encoder models independently.
The model files can be detected in the following paths:

* `ComfyUI/models/text_encoders/`
* `ComfyUI/models/clip/`

> If you save a model after ComfyUI has started, you'll need to refresh the ComfyUI frontend to get the latest model file path list

Supported model formats:

* `.ckpt`
* `.pt`
* `.pt2`
* `.bin`
* `.pth`
* `.safetensors`
* `.pkl`
* `.sft`

For more details on model file loading, please refer to [folder\_paths](https://github.com/comfyanonymous/ComfyUI/blob/master/folder_paths.py)

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip_name` | COMBO[STRING] | Specifies the name of the CLIP model to be loaded. This name is used to locate the model file within a predefined directory structure. |
| `type` | COMBO[STRING] | Determines the type of CLIP model to load. As ComfyUI supports more models, new types will be added here. Please check the `CLIPLoader` class definition in [nodes.py](https://github.com/comfyanonymous/ComfyUI/blob/master/nodes.py) for details. |
| `device` | COMBO[STRING] | Chooses the device for loading the CLIP model. `default` will run the model on GPU, while selecting `cpu` will force loading on CPU. |

### Device Options Explained

**When to choose "default":**

* You have sufficient GPU memory
* You want the best performance
* You want the system to optimize memory usage automatically

**When to choose "cpu":**

* Insufficient GPU memory
* Need to reserve GPU memory for other models (like the UNet)
* Running in a low-VRAM environment
* Debugging or special-purpose needs

**Performance Impact**

Running on CPU is much slower than on GPU, but it can save valuable GPU memory for other, more important model components.
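The trade-off above can be captured as a simple rule of thumb. The headroom value below is an arbitrary illustration, not a ComfyUI heuristic:

```python
def choose_clip_device(free_vram_gb, clip_size_gb, reserve_gb=2.0):
    # Keep the CLIP model on GPU ("default") only if loading it still
    # leaves headroom for the UNet and other components; otherwise fall
    # back to CPU. The reserve threshold is a made-up example value.
    if free_vram_gb - clip_size_gb >= reserve_gb:
        return "default"
    return "cpu"

choose_clip_device(free_vram_gb=8.0, clip_size_gb=2.0)  # "default"
choose_clip_device(free_vram_gb=3.0, clip_size_gb=2.0)  # "cpu"
```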
In memory-constrained environments, putting the CLIP model on CPU is a common optimization strategy.

### Supported Combinations

| Model Type | Corresponding Encoder |
| --- | --- |
| stable\_diffusion | clip-l |
| stable\_cascade | clip-g |
| sd3 | t5 xxl / clip-g / clip-l |
| stable\_audio | t5 base |
| mochi | t5 xxl |
| cosmos | old t5 xxl |
| lumina2 | gemma 2 2B |
| wan | umt5 xxl |

As ComfyUI updates, these combinations may expand. For details, please refer to the `CLIPLoader` class definition in [nodes.py](https://github.com/comfyanonymous/ComfyUI/blob/master/nodes.py)

## Outputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip` | CLIP | The loaded CLIP model, ready for use in downstream tasks or further processing. |

## Additional Notes

CLIP models play a core role as text encoders in ComfyUI, converting text prompts into numerical representations that diffusion models can understand. You can think of them as translators, translating your text into a language that large models can understand. Different architectures have their own "dialects," so different CLIP encoders are needed to complete the text encoding process for different model families.

# ClipMergeSimple - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/ClipMergeSimple

The ClipMergeSimple node is used to combine two CLIP text encoder models based on a specified ratio.

`CLIPMergeSimple` is an advanced model merging node used to combine two CLIP text encoder models based on a specified ratio, effectively blending their characteristics.
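A minimal sketch of this ratio-based merge, modeled on the weighted-average behavior described for this node. Plain floats stand in for tensors, and the dict stands in for a model state dict; key names are hypothetical:

```python
def merge_clip_weights(sd1, sd2, ratio):
    # Weighted merge: (1 - ratio) * clip1 + ratio * clip2, skipping the
    # position-ID and logit-scale keys the node excludes.
    merged = dict(sd1)
    for key, w2 in sd2.items():
        if key.endswith(".position_ids") or key.endswith(".logit_scale"):
            continue  # excluded components stay as in the base model
        merged[key] = (1.0 - ratio) * sd1[key] + ratio * w2
    return merged

base = {"text_model.weight": 1.0, "text_model.position_ids": 0.0}
other = {"text_model.weight": 3.0, "text_model.position_ids": 5.0}
merged = merge_clip_weights(base, other, 0.5)  # weight -> 2.0, position_ids untouched
```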
The node selectively applies patches from one model to another, excluding specific components like position IDs and logit scale, to create a hybrid model that combines features from both source models.

## Inputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip1` | CLIP | The first CLIP model to be merged. It serves as the base model for the merging process. |
| `clip2` | CLIP | The second CLIP model to be merged. Its key patches, except for position IDs and logit scale, are applied to the first model based on the specified ratio. |
| `ratio` | FLOAT | Range `0.0 - 1.0`; determines the proportion of features from the second model to blend into the first. A ratio of 1.0 means fully adopting the second model's features, while 0.0 retains only the first model's features. |

## Outputs

| Parameter | Data Type | Description |
| --- | --- | --- |
| `clip` | CLIP | The resulting merged CLIP model, incorporating features from both input models according to the specified ratio. |

## Merging Mechanism Explained

### Merging Algorithm

The node uses weighted averaging to merge the two models:

1. **Clone Base Model**: First clones `clip1` as the base model
2. **Get Patches**: Obtains all key patches from `clip2`
3. **Filter Special Keys**: Skips keys ending with `.position_ids` and `.logit_scale`
4. **Apply Weighted Merge**: Uses the formula `(1.0 - ratio) * clip1 + ratio * clip2`

### Ratio Parameter Explained

* **ratio = 0.0**: Fully uses clip1, ignores clip2
* **ratio = 0.5**: 50% contribution from each model
* **ratio = 1.0**: Fully uses clip2, ignores clip1

## Use Cases

1. **Model Style Fusion**: Combine characteristics of CLIP models trained on different data
2. **Performance Optimization**: Balance strengths and weaknesses of different models
3. **Experimental Research**: Explore combinations of different CLIP encoders

# ClipSave - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/ClipSave

The ClipSave node is used to save CLIP text encoder models in SafeTensors format.

The `CLIPSave` node is designed for saving CLIP text encoder models in SafeTensors format. This node is part of advanced model merging workflows and is typically used in conjunction with nodes like `CLIPMergeSimple` and `CLIPMergeAdd`. The saved files use the SafeTensors format to ensure security and compatibility.

## Inputs

| Parameter | Data Type | Required | Default Value | Description |
| --- | --- | --- | --- | --- |
| clip | CLIP | Yes | - | The CLIP model to be saved |
| filename\_prefix | STRING | Yes | "clip/ComfyUI" | The prefix path for the saved file |
| prompt | PROMPT | Hidden | - | Workflow prompt information (for metadata) |
| extra\_pnginfo | EXTRA\_PNGINFO | Hidden | - | Additional PNG information (for metadata) |

## Outputs

This node has no defined output types. It saves the processed files to the `ComfyUI/output/` folder.

### Multi-file Saving Strategy

The node saves different components based on the CLIP model type:

| Prefix Type | File Suffix | Description |
| --- | --- | --- |
| `clip_l.` | `_clip_l` | CLIP-L text encoder |
| `clip_g.` | `_clip_g` | CLIP-G text encoder |
| Empty prefix | No suffix | Other CLIP components |

## Usage Notes

1. **File Location**: All files are saved in the `ComfyUI/output/` directory
2. **File Format**: Models are saved in SafeTensors format for security
3. **Metadata**: Includes workflow information and PNG metadata if available
4. **Naming Convention**: Uses the specified prefix plus an appropriate suffix based on model type

# ClipSetLastLayer - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/ClipSetLastLayer

The ClipSetLastLayer node is used to control the processing depth of CLIP models.

`CLIP Set Last Layer` is a core node in ComfyUI for controlling the processing depth of CLIP models. It allows users to precisely control where the CLIP text encoder stops processing, affecting both the depth of text understanding and the style of generated images.

Imagine the CLIP model as a 24-layer intelligent brain:

* Shallow layers (1-8): Recognize basic letters and words
* Middle layers (9-16): Understand grammar and sentence structure
* Deep layers (17-24): Grasp abstract concepts and complex semantics

`CLIP Set Last Layer` works like a **"thinking depth controller"**:

* -1: Use all 24 layers (complete understanding)
* -2: Stop at layer 23 (slightly simplified)
* -12: Stop at layer 13 (medium understanding)
* -24: Use only layer 1 (basic understanding)

## Inputs

| Parameter | Data Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| `clip` | CLIP | - | - | The CLIP model to be modified |
| `stop_at_clip_layer` | INT | -1 | -24 to -1 | Specifies which layer to stop at; -1 uses all layers, -24 uses only the first layer |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| clip | CLIP | The modified CLIP model with the specified layer set as the last one |

## Why Set the Last Layer

* **Performance Optimization**: Like not needing a PhD to understand simple sentences, sometimes shallow understanding is enough and faster
* **Style Control**: Different levels of understanding produce different artistic styles
* **Compatibility**: Some models might perform better at specific layers

# ClipTextEncode - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/ClipTextEncode

The ClipTextEncode node is used to convert text prompts into an AI-understandable "language" for image generation.

`CLIP Text Encode (CLIPTextEncode)` acts like a translator, converting your creative text prompts into a special "language" that the AI can understand, helping it accurately interpret what kind of image you want to create.

Imagine communicating with a foreign artist: you need a translator to help accurately convey the artwork you want. This node acts as that translator, using the CLIP model (an AI model trained on vast amounts of image-text pairs) to understand your text descriptions and convert them into "instructions" that the AI art model can understand.

## Inputs

| Parameter | Data Type | Input Method | Default | Range | Description |
| --- | --- | --- | --- | --- | --- |
| text | STRING | Text Input | Empty | Any text | Like detailed instructions to an artist, enter your image description here. Supports multi-line text for detailed descriptions. |
| clip | CLIP | Model Selection | None | Loaded CLIP models | Like choosing a specific translator; different CLIP models are like different translators with slightly different understandings of artistic styles. |

## Outputs

| Output Name | Data Type | Description |
| --- | --- | --- |
| CONDITIONING | CONDITIONING | The translated "painting instructions" containing detailed creative guidance that the AI model can understand. These instructions tell the AI model how to create an image matching your description. |

## Usage Tips

1. **Basic Text Prompt Usage**
   * Write detailed descriptions like you're writing a short essay
   * More specific descriptions lead to more accurate results
   * Use English commas to separate different descriptive elements
2. **Special Feature: Using Embedding Models**
   * Embedding models are like preset art style packages that can quickly apply specific artistic effects
   * Currently supports the .safetensors, .pt, and .bin file formats, and you don't necessarily need to use the complete model name
   * How to use:
     1. Place the embedding model file (e.g. in .pt format) in the `ComfyUI/models/embeddings` folder
     2. Use `embedding:model_name` in your text

     Example: If you have a model called `EasyNegative.pt`, you can use it like this:

     ```
     a beautiful landscape, embedding:EasyNegative, high quality
     ```
3. **Prompt Weight Adjustment**
   * Use parentheses to adjust the importance of certain descriptions
   * For example: `(beautiful:1.2)` will make the "beautiful" feature more prominent
   * Regular parentheses `()` have a default weight of 1.1
   * Use the keyboard shortcut `ctrl + up/down arrow` to quickly adjust weights
   * The weight adjustment step size can be modified in settings
4. **Important Notes**
   * Ensure the CLIP model is properly loaded
   * Use positive and clear text descriptions
   * When using embedding models, make sure the file name is correct and compatible with your current main model's architecture

# ClipTextEncodeFlux - ComfyUI Built-in Node Documentation

Source: https://docs.comfy.org/built-in-nodes/ClipTextEncodeFlux

The ClipTextEncodeFlux node is used to encode text prompts into Flux-compatible conditioning embeddings.

`CLIPTextEncodeFlux` is an advanced text encoding node in ComfyUI, specifically designed for the Flux architecture.
It uses a dual-encoder mechanism (CLIP-L and T5XXL) to process both structured keywords and detailed natural language descriptions, providing the Flux model with more accurate and comprehensive text understanding for improved text-to-image generation quality. This node is based on a dual-encoder collaboration mechanism: 1. The `clip_l` input is processed by the CLIP-L encoder, extracting style, theme, and other keyword features—ideal for concise descriptions. 2. The `t5xxl` input is processed by the T5XXL encoder, which excels at understanding complex and detailed natural language scene descriptions. 3. The outputs from both encoders are fused, and combined with the `guidance` parameter to generate unified conditioning embeddings (`CONDITIONING`) for downstream Flux sampler nodes, controlling how closely the generated content matches the text description. ## Inputs | Parameter | Data Type | Input Method | Default | Range | Description | | ---------- | --------- | ------------ | ------- | ---------------- | ---------------------------------------------------------------------------------------------------------------------- | | `clip` | CLIP | Node input | None | - | Must be a CLIP model supporting the Flux architecture, including both CLIP-L and T5XXL encoders | | `clip_l` | STRING | Text box | None | Up to 77 tokens | Suitable for concise keyword descriptions, such as style or theme | | `t5xxl` | STRING | Text box | None | Nearly unlimited | Suitable for detailed natural language descriptions, expressing complex scenes and details | | `guidance` | FLOAT | Slider | 3.5 | 0.0 - 100.0 | Controls the influence of text conditions on the generation process; higher values mean stricter adherence to the text | ## Outputs | Output Name | Data Type | Description | | -------------- | ------------ | ------------------------------------------------------------------------------------------------------------------ | | `CONDITIONING` | CONDITIONING | Contains the fused 
embeddings from both encoders and the guidance parameter, used for conditional image generation | ## Usage Examples ### Prompt Examples * **clip\_l input** (keyword style): * Use structured, concise keyword combinations * Example: `masterpiece, best quality, portrait, oil painting, dramatic lighting` * Focus on style, quality, and main subject * **t5xxl input** (natural language description): * Use complete, fluent scene descriptions * Example: `A highly detailed portrait in oil painting style, featuring dramatic chiaroscuro lighting that creates deep shadows and bright highlights, emphasizing the subject's features with renaissance-inspired composition.` * Focus on scene details, spatial relationships, and lighting effects ### Notes 1. Make sure to use a CLIP model compatible with the Flux architecture 2. It is recommended to fill in both `clip_l` and `t5xxl` to leverage the dual-encoder advantage 3. Note the 77-token limit for `clip_l` 4. Adjust the `guidance` parameter based on the generated results # ClipTextEncodeHunyuanDit - ComfyUI Built-in Node Documentation Source: https://docs.comfy.org/built-in-nodes/ClipTextEncodeHunyuanDit The ClipTextEncodeHunyuanDit node is used to encode text prompts into HunyuanDiT-compatible conditioning embeddings. `CLIPTextEncodeHunyuanDiT` is an advanced conditioning node designed for the dual text encoder architecture of the HunyuanDiT model. Like a translator, it converts text descriptions into a form the model can understand. The `bert` and `mt5xl` inputs each favor a different style of prompt. 
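As a rough illustration of the two styles (these are hypothetical prompts, not official examples), `bert` works well with keyword phrases while `mt5xl` can take a full sentence:

```
bert:  portrait, oil painting, dramatic lighting, best quality
mt5xl: A richly detailed oil-painting portrait of an elderly sailor, lit by warm lamplight, with visible brushstrokes against a dark background.
```

Both inputs are encoded and combined into the single `CONDITIONING` output described below.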
## Inputs | Parameter | Data Type | Description | | --------- | --------- | -------------------------------------------------------------------------------------------------------------------------------------------- | | `clip` | CLIP | A CLIP model instance used for text tokenization and encoding, which is core to generating conditions. | | `bert` | STRING | Text input for encoding, prefers phrases and keywords, supports multiline and dynamic prompts. | | `mt5xl` | STRING | Another text input for encoding, supports multiline and dynamic prompts (multilingual), can use complete sentences and complex descriptions. | ## Outputs | Parameter | Data Type | Description | | -------------- | ------------ | ------------------------------------------------------------------------------- | | `CONDITIONING` | CONDITIONING | The encoded conditional output used for further processing in generation tasks. | # ClipTextEncodeSdxl - ComfyUI Built-in Node Documentation Source: https://docs.comfy.org/built-in-nodes/ClipTextEncodeSdxl The ClipTextEncodeSdxl node is used to encode text prompts into SDXL-compatible conditioning embeddings. This node is designed to encode text input using a CLIP model specifically customized for the SDXL architecture. It uses a dual encoder system (CLIP-L and CLIP-G) to process text descriptions, resulting in more accurate image generation. ## Inputs | Parameter | Data Type | Description | | --------------- | --------- | ------------------------------------------------------ | | `clip` | CLIP | CLIP model instance used for text encoding. | | `width` | INT | Specifies the image width in pixels, default 1024. | | `height` | INT | Specifies the image height in pixels, default 1024. | | `crop_w` | INT | Width of the crop area in pixels, default 0. | | `crop_h` | INT | Height of the crop area in pixels, default 0. | | `target_width` | INT | Target width for the output image, default 1024. 
| | `target_height` | INT | Target height for the output image, default 1024. | | `text_g` | STRING | Global text prompt, describing the overall scene. | | `text_l` | STRING | Local text prompt, describing fine details. | ## Outputs | Parameter | Data Type | Description | | -------------- | ------------ | ------------------------------------------------------------------------------ | | `CONDITIONING` | CONDITIONING | Contains encoded text and conditional information needed for image generation. | # ClipTextEncodeSdxlRefiner - ComfyUI Built-in Node Documentation Source: https://docs.comfy.org/built-in-nodes/ClipTextEncodeSdxlRefiner The ClipTextEncodeSdxlRefiner node is used to encode text prompts into SDXL Refiner-compatible conditioning embeddings. This node is designed specifically for the SDXL Refiner model. It converts text prompts into conditioning information, incorporating aesthetic scores and dimensional information to enhance the conditions for generation tasks and improve the final refinement. It acts like a professional art director, not only conveying your creative intent but also injecting precise aesthetic standards and specification requirements into the work. ## About SDXL Refiner SDXL Refiner is a specialized refinement model that focuses on enhancing image details and quality based on the SDXL base model. This process is like having an art retoucher: 1. First, it receives preliminary images or text descriptions generated by the base model 2. Then, it guides the refinement process through precise aesthetic scoring and dimensional parameters 3. 
Finally, it focuses on processing high-frequency image details to improve overall quality. Refiner can be used in two ways: * As a standalone refinement step for post-processing images generated by the base model * As part of an expert integration system, taking over processing during the low-noise phase of generation ## Inputs | Parameter Name | Data Type | Input Type | Default Value | Value Range | Description | | -------------- | --------- | ---------- | ------------- | ----------- | ----------- | | `clip` | CLIP | Required | - | - | CLIP model instance used for text tokenization and encoding, the core component for converting text into a model-understandable format | | `ascore` | FLOAT | Optional | 6.0 | 0.0-1000.0 | Controls the visual quality and aesthetics of generated images, similar to setting quality standards for artwork: high scores (7.5-8.5) pursue more refined, detail-rich effects; medium scores (6.0-7.0) give balanced quality control; low scores (2.0-3.0) are suitable for negative prompts | | `width` | INT | Required | 1024 | 64-16384 | Specifies output image width (pixels), must be a multiple of 8. SDXL performs best when the total pixel count is close to 1024×1024 (about 1M pixels) | | `height` | INT | Required | 1024 | 64-16384 | Specifies output image height (pixels), must be a multiple of 8. SDXL performs best when the total pixel count is close to 1024×1024 (about 1M pixels) | | `text` | STRING | Required | - | - | Text prompt description, supports multi-line input and dynamic prompt syntax. In Refiner, text prompts should focus on describing the desired visual quality and detail characteristics | ## Outputs | Output Name | Data Type | Description | | -------------- | ------------ | ----------- | | `CONDITIONING` | CONDITIONING | Refined conditional output containing integrated encoding of text semantics, aesthetic standards, and dimensional information, specifically for guiding the SDXL Refiner model in precise image refinement | ## Notes 1. This node is specifically optimized for the SDXL Refiner model and differs from regular CLIPTextEncode nodes 2. An aesthetic score of 7.5 is recommended as the baseline, which is the standard setting used in SDXL training 3. All dimensional parameters must be multiples of 8, and a total pixel count close to 1024×1024 (about 1M pixels) is recommended 4. The Refiner model focuses on enhancing image details and quality, so text prompts should emphasize desired visual effects rather than scene content 5. 
In practical use, Refiner is typically used in the later stages of generation (approximately the last 20% of steps), focusing on detail optimization # ClipVisionEncode - ComfyUI Built-in Node Documentation Source: https://docs.comfy.org/built-in-nodes/ClipVisionEncode The ClipVisionEncode node is used to encode input images into visual feature vectors through the CLIP Vision model. The `CLIP Vision Encode` node is an image encoding node in ComfyUI, used to convert input images into visual feature vectors through the CLIP Vision model. This node is an important bridge connecting image and text understanding, and is widely used in various AI image generation and processing workflows. **Node Functionality** * **Image feature extraction**: Converts input images into high-dimensional feature vectors * **Multimodal bridging**: Provides a foundation for joint processing of images and text * **Conditional generation**: Provides visual conditions for image-based conditional generation ## Inputs | Parameter Name | Data Type | Description | | -------------- | ------------ | -------------------------------------------------------------------- | | `clip_vision` | CLIP\_VISION | CLIP vision model, usually loaded via the CLIPVisionLoader node | | `image` | IMAGE | The input image to be encoded | | `crop` | Dropdown | Image cropping method, options: center (center crop), none (no crop) | ## Outputs | Output Name | Data Type | Description | | -------------------- | -------------------- | ----------------------- | | CLIP\_VISION\_OUTPUT | CLIP\_VISION\_OUTPUT | Encoded visual features | This output object contains: * `last_hidden_state`: The last hidden state * `image_embeds`: Image embedding vector * `penultimate_hidden_states`: The penultimate hidden state * `mm_projected`: Multimodal projection result (if available) # Load CLIP Vision - ComfyUI Built-in Node Documentation Source: https://docs.comfy.org/built-in-nodes/ClipVisionLoader The Load CLIP Vision node is used to load CLIP 
Vision models from the `ComfyUI/models/clip_vision` folder. This node automatically detects models located in the `ComfyUI/models/clip_vision` folder, as well as any additional model paths configured in the `extra_model_paths.yaml` file. If you add models after starting ComfyUI, please **refresh the ComfyUI interface** to ensure the latest model files are listed. ## Inputs | Field | Data Type | Description | | ----------- | -------------- | --------------------------------------------------------------------------- | | `clip_name` | COMBO\[STRING] | Lists all supported model files in the `ComfyUI/models/clip_vision` folder. | ## Outputs | Field | Data Type | Description | | ------------- | ------------ | ---------------------------------------------------------------------------------- | | `clip_vision` | CLIP\_VISION | Loaded CLIP Vision model, ready for encoding images or other vision-related tasks. | # Load3D - ComfyUI Built-in Node Documentation Source: https://docs.comfy.org/built-in-nodes/Load3D The Load3D node is a core node in ComfyUI for loading and previewing various 3D model files, supporting multi-format import and rich three-dimensional view operations. When the node loads, it automatically retrieves available 3D resources from `ComfyUI/input/3d/`. You can also upload supported 3D files for preview using the upload function. **Supported Formats** Currently, this node supports multiple 3D file formats, including `.gltf`, `.glb`, `.obj`, `.fbx`, and `.stl`. **3D Node Preferences** Some related preferences for 3D nodes can be configured in ComfyUI's settings menu. Please refer to the following documentation for corresponding settings: [Settings Menu - 3D](/interface/settings/3d) Besides regular node outputs, Load3D offers many 3D view-related settings in the canvas menu. 
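To make a local model appear in the node's file list, you can place it in the input folder directly instead of using the upload function. A minimal sketch, assuming a default install layout (`example.glb` is a hypothetical placeholder file):

```shell
# Create the folder Load3D reads from (already present in a normal install)
mkdir -p ComfyUI/input/3d
# Copy (or here, simply create) a model file so it appears in the model_file list
touch ComfyUI/input/3d/example.glb
```

As with other loader nodes, files added while ComfyUI is running may only appear after refreshing the interface.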
## Inputs | Parameter Name | Type | Description | Default | Range | | -------------- | -------------- | --------------------------------------------------------------------------------------------- | ------- | ----------------- | | model\_file | File Selection | 3D model file path, supports upload, defaults to reading model files from `ComfyUI/input/3d/` | - | Supported formats | | width | INT | Canvas rendering width | 1024 | 1-4096 | | height | INT | Canvas rendering height | 1024 | 1-4096 | ## Outputs | Parameter Name | Data Type | Description | | ---------------- | -------------- | ---------------------------------------------------------------------------------------------- | | image | IMAGE | Canvas rendered image | | mask | MASK | Mask containing current model position | | mesh\_path | STRING | Model file path | | normal | IMAGE | Normal map | | lineart | IMAGE | Line art image output; the corresponding `edge_threshold` can be adjusted in the canvas model menu | | camera\_info | LOAD3D\_CAMERA | Camera information | | recording\_video | VIDEO | Recorded video (only when a recording exists) | [Image: preview of all corresponding outputs] [Video: view operation demo] ## Canvas Area Description The Load3D node's Canvas area contains numerous view operations, including: * Preview view settings (grid, background color, preview view) * Camera control: Control FOV, camera type * Global illumination intensity: Adjust lighting intensity * Video recording: Record and export videos * Model export: Supports `GLB`, `OBJ`, `STL` formats * And more [Image: Load 3D node UI] 1. Contains multiple menus and hidden menus of the Load 3D node 2. Menu for `resizing preview window` and `canvas video recording` 3. 3D view operation axis 4. Preview thumbnail 5. Preview size settings: scale the preview display by setting dimensions, then resize the window ### 1. View Operations