```
**Requirements:**
* Must be less than 100 characters
* Can only contain alphanumeric characters, hyphens, underscores, and periods
* Cannot have consecutive special characters
* Cannot start with a number or special character
* Case-insensitive comparison
**Best Practices:**
* Use a short, descriptive name
* Don't include "ComfyUI" in the name
* Make it memorable and easy to type
**Examples:**
```toml
name = "image-processor" # ✅ Good: Simple and clear
name = "super-resolution" # ✅ Good: Describes functionality
name = "ComfyUI-enhancer" # ❌ Bad: Includes ComfyUI
name = "123-tool" # ❌ Bad: Starts with number
```
See the official [Python packaging documentation](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#name) for more details.
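The naming rules above can be checked programmatically. Below is a minimal validator sketch (the Registry's actual validation logic may differ; the case-insensitivity rule applies to uniqueness comparison on the Registry side and is not modeled here):

```python
import re

def is_valid_node_name(name: str) -> bool:
    """Check a node name against the Registry naming rules (sketch)."""
    if not name or len(name) >= 100:
        return False
    # Only alphanumerics, hyphens, underscores, and periods
    if not re.fullmatch(r"[A-Za-z0-9._-]+", name):
        return False
    # Must not start with a number or special character
    if not name[0].isalpha():
        return False
    # No consecutive special characters
    if re.search(r"[._-]{2}", name):
        return False
    return True

print(is_valid_node_name("image-processor"))  # True
print(is_valid_node_name("123-tool"))         # False
```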
### version (required)
Uses [semantic versioning](https://semver.org/) with a three-digit version number X.Y.Z:
* X (**MAJOR**): Breaking changes
* Y (**MINOR**): New features (backwards compatible)
* Z (**PATCH**): Bug fixes
**Examples:**
```toml
version = "1.0.0" # Initial release
version = "1.1.0" # Added new features
version = "1.1.1" # Bug fix
version = "2.0.0" # Breaking changes
```
### license (optional)
Specifies the license for your custom node. Can be specified in two ways:
1. **File Reference:**
```toml
license = { file = "LICENSE" } # ✅ Points to LICENSE file
license = { file = "LICENSE.txt" } # ✅ Points to LICENSE.txt
license = "LICENSE" # ❌ Incorrect format
```
2. **License Name:**
```toml
license = { text = "MIT License" } # ✅ Correct format
license = { text = "Apache-2.0" } # ✅ Correct format
license = "MIT LICENSE" # ❌ Incorrect format
```
Common licenses: [MIT](https://opensource.org/license/mit), [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html), [Apache](https://www.apache.org/licenses/LICENSE-2.0)
### description (recommended)
A brief description of what your custom node does.
```toml
description = "A super resolution node for enhancing image quality"
```
### repository (required)
Link to your source code repository:
```toml
[project.urls]
Repository = "https://github.com/username/repository"
```
### urls (recommended)
Links to related resources:
```toml
[project.urls]
Documentation = "https://github.com/username/repository/wiki"
"Bug Tracker" = "https://github.com/username/repository/issues"
```
### requires-python (recommended)
Specifies the Python versions that your node supports:
```toml
requires-python = ">=3.8" # Python 3.8 or higher
requires-python = ">=3.8,<3.11" # Python 3.8 up to (but not including) 3.11
```
### Frontend Version Compatibility (optional)
If your node has specific requirements for which ComfyUI frontend versions it supports, you can specify this using the `comfyui-frontend-package` dependency. This package is published on [PyPI](https://pypi.org/project/comfyui-frontend-package/).
For example, use this field when:
* Your custom node uses frontend APIs that were introduced in a specific version
* You've identified incompatibilities between your node and certain frontend versions
* Your node requires specific UI features only available in newer frontend versions
```toml
[project]
dependencies = [
    # Alternative specifier styles — pick the one that matches your needs:
    "comfyui-frontend-package>=1.20.0",      # Requires frontend 1.20.0 or newer
    "comfyui-frontend-package<=1.21.6",      # Restricts to frontend versions up to 1.21.6
    "comfyui-frontend-package>=1.19,<1.22",  # Works with frontend 1.19 up to (but not including) 1.22
    "comfyui-frontend-package~=1.20.0",      # Compatible with 1.20.x but not 1.21.0
    "comfyui-frontend-package!=1.21.3",      # Works with any version except 1.21.3
]
```
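As a rough illustration of how a range like `>=1.19,<1.22` behaves, here is a simplified check using tuple comparison (real resolvers implement full PEP 440 semantics, including pre-releases and epochs):

```python
def parse(v: str) -> tuple[int, ...]:
    """Split a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def in_range(version: str, low: str, high: str) -> bool:
    """True if low <= version < high (simplified range check)."""
    return parse(low) <= parse(version) < parse(high)

print(in_range("1.21.6", "1.19", "1.22"))  # True
print(in_range("1.22.0", "1.19", "1.22"))  # False
```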
### classifiers (recommended)
Use classifiers to specify operating system compatibility and GPU accelerators. This information is used to help users find the right node for their system.
```toml
[project]
classifiers = [
# For OS-independent nodes (works on all operating systems)
"Operating System :: OS Independent",
# OR for OS-specific nodes, specify the supported systems:
"Operating System :: Microsoft :: Windows", # Windows specific
"Operating System :: POSIX :: Linux", # Linux specific
"Operating System :: MacOS", # macOS specific
# GPU Accelerator support
"Environment :: GPU :: NVIDIA CUDA", # NVIDIA CUDA support
"Environment :: GPU :: AMD ROCm", # AMD ROCm support
"Environment :: GPU :: Intel Arc", # Intel Arc support
"Environment :: NPU :: Huawei Ascend", # Huawei Ascend support
"Environment :: GPU :: Apple Metal", # Apple Metal support
]
```
## [tool.comfy] Section
### PublisherId (required)
Your unique publisher identifier, typically matching your GitHub username.
**Examples:**
```toml
PublisherId = "john-doe" # ✅ Matches GitHub username
PublisherId = "image-wizard" # ✅ Unique identifier
```
### DisplayName (optional)
A user-friendly name for your custom node.
```toml
DisplayName = "Super Resolution Node"
```
### Icon (optional)
URL to your custom node's icon that will be displayed on the ComfyUI Registry and ComfyUI-Manager.
**Requirements:**
* File types: SVG, PNG, JPG, or GIF
* Maximum resolution: 400px × 400px
* Aspect ratio should be square
```toml
Icon = "https://raw.githubusercontent.com/username/repo/main/icon.png"
```
### Banner (optional)
URL to a larger banner image that will be displayed on the ComfyUI Registry and ComfyUI-Manager.
**Requirements:**
* File types: SVG, PNG, JPG, or GIF
* Aspect ratio: 21:9
```toml
Banner = "https://raw.githubusercontent.com/username/repo/main/banner.png"
```
### requires-comfyui (optional)
Specifies which version of ComfyUI your node is compatible with. This helps users ensure they have the correct version of ComfyUI installed.
**Supported operators:** `<`, `>`, `<=`, `>=`, `~=`, `<>`, `!=` and ranges
```toml
requires-comfyui = ">=1.0.0" # ComfyUI 1.0.0 or higher
requires-comfyui = ">=1.0.0,<2.0.0" # ComfyUI 1.0.0 up to (but not including) 2.0.0
requires-comfyui = "~=1.0.0" # Compatible release: 1.0.0 or newer, but below 1.1.0 (equivalent to >=1.0.0,<1.1.0)
requires-comfyui = "!=1.2.3" # Any version except 1.2.3
requires-comfyui = ">0.1.3,<1.0.0" # Greater than 0.1.3 and less than 1.0.0
```
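The compatible-release operator `~=` is the least obvious of these: it pins all but the last stated component. A minimal stdlib sketch of its behavior (simplified; real resolvers implement full PEP 440):

```python
def parse(v: str) -> tuple[int, ...]:
    return tuple(int(p) for p in v.split("."))

def compatible(version: str, spec: str) -> bool:
    """Approximate '~=' semantics: version >= spec, and below spec with
    its second-to-last component incremented (PEP 440 compatible release).
    """
    base = parse(spec)
    upper = base[:-2] + (base[-2] + 1,)
    return base <= parse(version) < upper

print(compatible("1.0.5", "1.0.0"))  # True  (~=1.0.0 allows 1.0.x)
print(compatible("1.1.0", "1.0.0"))  # False (1.1.0 is outside 1.0.x)
```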
### includes (optional)
Forces specific folders to be included in the published package. In some situations, such as custom nodes built from frontend projects, the final build output folder may be listed in `.gitignore`. In such cases, it must be force-included so the Registry can use it.
```toml
includes = ['dist']
```
## Complete Example
```toml
[project]
name = "super-resolution-node"
version = "1.0.0"
description = "Enhance image quality using advanced super resolution techniques"
license = { file = "LICENSE" }
requires-python = ">=3.8"
dependencies = [
    "comfyui-frontend-package<=1.21.6"  # Frontend version compatibility
]
classifiers = [
    "Operating System :: OS Independent"  # Works on all operating systems
]

# Alternatively, load dependencies from requirements.txt. A field cannot be
# both static and dynamic, so remove the `dependencies` list above if you
# uncomment the following:
# dynamic = ["dependencies"]
# [tool.setuptools.dynamic]
# dependencies = { file = ["requirements.txt"] }
[project.urls]
Repository = "https://github.com/username/super-resolution-node"
Documentation = "https://github.com/username/super-resolution-node/wiki"
"Bug Tracker" = "https://github.com/username/super-resolution-node/issues"
[tool.comfy]
PublisherId = "image-wizard"
DisplayName = "Super Resolution Node"
Icon = "https://raw.githubusercontent.com/username/super-resolution-node/main/icon.png"
Banner = "https://raw.githubusercontent.com/username/super-resolution-node/main/banner.png"
requires-comfyui = ">=1.0.0" # ComfyUI version compatibility
```
# Standards
Source: https://docs.comfy.org/registry/standards
Security and other standards for publishing to the Registry
## Base Standards
### 1. Community Value
Custom nodes must provide valuable functionality to the ComfyUI community.
Avoid:
* Excessive self-promotion
* Impersonation or misleading behavior
* Malicious behavior
Additionally:
* Self-promotion is permitted only within your designated settings menu section
* Top and side menus should contain only useful functionality
### 2. Node Compatibility
Do not interfere with other custom nodes' operations (installation, updates, removal).
For dependencies on other custom nodes:
* Display clear warnings when dependent functionality is used
* Provide example workflows demonstrating the required nodes
### 3. Legal Compliance
Must comply with all applicable laws and regulations
### 4. Quality Requirements
Nodes must be fully functional, well documented, and actively maintained.
### 5. Fork Guidelines
Forked nodes must:
* Have clearly distinct names from the original
* Provide significant differences in functionality or code
Below are standards that must be met to publish custom nodes to the registry.
## Security Standards
Custom nodes should be secure. We will work with the authors of custom nodes that violate these standards to have them rewritten. If there is major functionality that should be exposed by core, please request it in the [rfcs repo](https://github.com/comfy-org/rfcs).
### eval/exec Calls
#### Policy
The use of `eval` and `exec` functions is prohibited in custom nodes due to security concerns.
#### Reasoning
These functions can enable arbitrary code execution, creating potential Remote Code Execution (RCE) vulnerabilities when processing user inputs. Workflows containing nodes that pass user inputs into `eval` or `exec` could be exploited for various cyberattacks, including:
* Keylogging
* Ransomware
* Other malicious code execution
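When a node only needs to parse literal values (numbers, strings, lists, dicts) from user-supplied text, `ast.literal_eval` from the standard library is a safe replacement: it rejects anything that is not a plain literal, so no code can run. A sketch (not an official Registry recommendation, just a common pattern):

```python
import ast

def parse_literal(text: str):
    """Safely parse a Python literal from untrusted text.

    Unlike eval(), ast.literal_eval raises ValueError on anything
    that is not a plain literal, so arbitrary code cannot execute.
    """
    return ast.literal_eval(text)

print(parse_literal("[1, 2, 3]"))  # [1, 2, 3]

try:
    parse_literal("__import__('os').system('echo pwned')")
except ValueError:
    print("rejected")  # rejected
```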
### subprocess for pip install
#### Policy
Runtime package installation through subprocess calls is not permitted.
#### Reasoning
* ComfyUI-Manager ships with ComfyUI and lets the user install dependencies
* Centralized dependency management improves security and user experience
* Helps prevent potential supply chain attacks
* Eliminates need for multiple ComfyUI reloads
### Code Obfuscation
#### Policy
Code obfuscation is prohibited in custom nodes.
#### Reasoning
Obfuscated code is impossible to review and is therefore treated as likely malicious.
# Node Definition JSON
Source: https://docs.comfy.org/specs/nodedef_json
JSON schema for a ComfyUI Node.
The node definition JSON is defined using [JSON Schema](https://json-schema.org/). Changes to this schema will be discussed in the [rfcs repo](https://github.com/comfy-org/rfcs).
## v2.0 (Latest)
```json Node Definition v2.0
{
"$ref": "#/definitions/ComfyNodeDefV2",
"definitions": {
"ComfyNodeDefV2": {
"type": "object",
"properties": {
"inputs": {
"type": "object",
"additionalProperties": {
"anyOf": [
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"control_after_generate": {
"type": "boolean"
},
"type": {
"type": "string",
"const": "INT"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"round": {
"anyOf": [
{
"type": "number"
},
{
"type": "boolean",
"const": false
}
]
},
"type": {
"type": "string",
"const": "FLOAT"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {
"type": "boolean"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"label_on": {
"type": "string"
},
"label_off": {
"type": "string"
},
"type": {
"type": "string",
"const": "BOOLEAN"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {
"type": "string"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"multiline": {
"type": "boolean"
},
"dynamicPrompts": {
"type": "boolean"
},
"defaultVal": {
"type": "string"
},
"placeholder": {
"type": "string"
},
"type": {
"type": "string",
"const": "STRING"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
},
"type": {
"type": "string",
"const": "COMBO"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"type": {
"type": "string"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
}
]
}
},
"outputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"index": {
"type": "number"
},
"name": {
"type": "string"
},
"type": {
"type": "string"
},
"is_list": {
"type": "boolean"
},
"options": {
"type": "array"
},
"tooltip": {
"type": "string"
}
},
"required": [
"index",
"name",
"type",
"is_list"
],
"additionalProperties": false
}
},
"hidden": {
"type": "object",
"additionalProperties": {}
},
"name": {
"type": "string"
},
"display_name": {
"type": "string"
},
"description": {
"type": "string"
},
"category": {
"type": "string"
},
"output_node": {
"type": "boolean"
},
"python_module": {
"type": "string"
},
"deprecated": {
"type": "boolean"
},
"experimental": {
"type": "boolean"
}
},
"required": [
"inputs",
"outputs",
"name",
"display_name",
"description",
"category",
"output_node",
"python_module"
],
"additionalProperties": false
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
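For orientation, here is a minimal node definition that satisfies the required fields of the v2.0 schema above (an illustrative example, not taken from a real node):

```json
{
  "inputs": {
    "strength": {
      "type": "FLOAT",
      "name": "strength",
      "default": 1.0,
      "min": 0.0,
      "max": 10.0,
      "step": 0.1,
      "display": "slider"
    }
  },
  "outputs": [
    {
      "index": 0,
      "name": "IMAGE",
      "type": "IMAGE",
      "is_list": false
    }
  ],
  "name": "ExampleEnhance",
  "display_name": "Example Enhance",
  "description": "Illustrative node definition",
  "category": "image/postprocessing",
  "output_node": false,
  "python_module": "custom_nodes.example"
}
```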
# Node Definition JSON 1.0
Source: https://docs.comfy.org/specs/nodedef_json_1_0
JSON schema for a ComfyUI Node.
## v1.0
```json Node Definition v1.0
{
"$ref": "#/definitions/ComfyNodeDefV1",
"definitions": {
"ComfyNodeDefV1": {
"type": "object",
"properties": {
"input": {
"type": "object",
"properties": {
"required": {
"type": "object",
"additionalProperties": {
"anyOf": [
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "INT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"control_after_generate": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "FLOAT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"round": {
"anyOf": [
{
"type": "number"
},
{
"type": "boolean",
"const": false
}
]
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "BOOLEAN"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "boolean"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"label_on": {
"type": "string"
},
"label_off": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "STRING"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "string"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"multiline": {
"type": "boolean"
},
"dynamicPrompts": {
"type": "boolean"
},
"defaultVal": {
"type": "string"
},
"placeholder": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "COMBO"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
}
]
}
},
"optional": {
"type": "object",
"additionalProperties": {
"anyOf": [
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "INT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"control_after_generate": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "FLOAT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"round": {
"anyOf": [
{
"type": "number"
},
{
"type": "boolean",
"const": false
}
]
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "BOOLEAN"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "boolean"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"label_on": {
"type": "string"
},
"label_off": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "STRING"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "string"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"multiline": {
"type": "boolean"
},
"dynamicPrompts": {
"type": "boolean"
},
"defaultVal": {
"type": "string"
},
"placeholder": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "COMBO"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
}
]
}
},
"hidden": {
"type": "object",
"additionalProperties": {}
}
},
"additionalProperties": false
},
"output": {
"type": "array",
"items": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
]
}
},
"output_is_list": {
"type": "array",
"items": {
"type": "boolean"
}
},
"output_name": {
"type": "array",
"items": {
"type": "string"
}
},
"output_tooltips": {
"type": "array",
"items": {
"type": "string"
}
},
"name": {
"type": "string"
},
"display_name": {
"type": "string"
},
"description": {
"type": "string"
},
"category": {
"type": "string"
},
"output_node": {
"type": "boolean"
},
"python_module": {
"type": "string"
},
"deprecated": {
"type": "boolean"
},
"experimental": {
"type": "boolean"
}
},
"required": [
"name",
"display_name",
"description",
"category",
"output_node",
"python_module"
],
"additionalProperties": false
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
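In the v1.0 form, each input is a `[type, options]` tuple under `input.required` rather than a named object, and output metadata is split across parallel arrays. The same illustrative node from the v2.0 section looks like this (hypothetical example):

```json
{
  "input": {
    "required": {
      "strength": ["FLOAT", { "default": 1.0, "min": 0.0, "max": 10.0 }]
    }
  },
  "output": ["IMAGE"],
  "output_is_list": [false],
  "output_name": ["IMAGE"],
  "name": "ExampleEnhance",
  "display_name": "Example Enhance",
  "description": "Illustrative node definition",
  "category": "image/postprocessing",
  "output_node": false,
  "python_module": "custom_nodes.example"
}
```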
# Workflow JSON
Source: https://docs.comfy.org/specs/workflow_json
JSON schema for a ComfyUI workflow.
The workflow JSON is defined using [JSON Schema](https://json-schema.org/). Changes to this schema will be discussed in the [rfcs repo](https://github.com/comfy-org/rfcs).
## Version 1.0 (Latest)
```json ComfyUI Workflow v1.0
{
"$ref": "#/definitions/ComfyWorkflow1_0",
"definitions": {
"ComfyWorkflow1_0": {
"type": "object",
"properties": {
"version": {
"type": "number",
"const": 1
},
"config": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"links_ontop": {
"type": "boolean"
},
"align_to_grid": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"state": {
"type": "object",
"properties": {
"lastGroupid": {
"type": "number"
},
"lastNodeId": {
"type": "number"
},
"lastLinkId": {
"type": "number"
},
"lastRerouteId": {
"type": "number"
}
},
"additionalProperties": true
},
"groups": {
"type": "array",
"items": {
"type": "object",
"properties": {
"title": {
"type": "string"
},
"bounding": {
"type": "array",
"minItems": 4,
"maxItems": 4,
"items": [
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
}
]
},
"color": {
"type": "string"
},
"font_size": {
"type": "number"
},
"locked": {
"type": "boolean"
}
},
"required": [
"title",
"bounding"
],
"additionalProperties": true
}
},
"nodes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"type": {
"type": "string"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"size": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"flags": {
"type": "object",
"properties": {
"collapsed": {
"type": "boolean"
},
"pinned": {
"type": "boolean"
},
"allow_interaction": {
"type": "boolean"
},
"horizontal": {
"type": "boolean"
},
"skip_repeated_outputs": {
"type": "boolean"
}
},
"additionalProperties": true
},
"order": {
"type": "number"
},
"mode": {
"type": "number"
},
"inputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"link": {
"type": [
"number",
"null"
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"outputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"links": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"properties": {
"type": "object",
"properties": {
"Node name for S&R": {
"type": "string"
}
},
"additionalProperties": true
},
"widgets_values": {
"anyOf": [
{
"type": "array"
},
{
"type": "object",
"additionalProperties": {}
}
]
},
"color": {
"type": "string"
},
"bgcolor": {
"type": "string"
}
},
"required": [
"id",
"type",
"pos",
"size",
"flags",
"order",
"mode",
"properties"
],
"additionalProperties": true
}
},
"links": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"origin_id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"origin_slot": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"target_id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"target_slot": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"parentId": {
"type": "number"
}
},
"required": [
"id",
"origin_id",
"origin_slot",
"target_id",
"target_slot",
"type"
],
"additionalProperties": true
}
},
"reroutes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"linkIds": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
}
},
"required": [
"id",
"pos"
],
"additionalProperties": true
}
},
"extra": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"ds": {
"type": "object",
"properties": {
"scale": {
"type": "number"
},
"offset": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
}
},
"required": [
"scale",
"offset"
],
"additionalProperties": true
},
"info": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"author": {
"type": "string"
},
"description": {
"type": "string"
},
"version": {
"type": "string"
},
"created": {
"type": "string"
},
"modified": {
"type": "string"
},
"software": {
"type": "string"
}
},
"required": [
"name",
"author",
"description",
"version",
"created",
"modified",
"software"
],
"additionalProperties": true
},
"linkExtensions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
}
},
"required": [
"id",
"parentId"
],
"additionalProperties": true
}
},
"reroutes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"linkIds": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
}
},
"required": [
"id",
"pos"
],
"additionalProperties": true
}
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"models": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"url": {
"type": "string",
"format": "uri"
},
"hash": {
"type": "string"
},
"hash_type": {
"type": "string"
},
"directory": {
"type": "string"
}
},
"required": [
"name",
"url",
"directory"
],
"additionalProperties": false
}
}
},
"required": [
"version",
"state",
"nodes"
],
"additionalProperties": true
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
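As a quick sanity check, the required top-level fields (`version`, `state`, `nodes`) can be verified without a full JSON Schema validator. Below is an illustrative stdlib-only sketch (the `check_workflow_v1` helper is not part of ComfyUI); for complete validation, feed the schema above to a Draft-07 validator such as the `jsonschema` Python package.

```python
def check_workflow_v1(wf: dict) -> list[str]:
    """Report violations of the required top-level fields in a v1.0 workflow."""
    problems = []
    # "version" is declared as a number with const 1
    if wf.get("version") != 1:
        problems.append("version must be the number 1")
    # "state" must be an object (JSON object -> Python dict)
    if not isinstance(wf.get("state"), dict):
        problems.append("state must be an object")
    # "nodes" must be an array (JSON array -> Python list)
    if not isinstance(wf.get("nodes"), list):
        problems.append("nodes must be an array")
    return problems

# A minimal empty-but-valid workflow passes the check
print(check_workflow_v1({"version": 1, "state": {}, "nodes": []}))  # []
```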
## Older versions
* [0.4](./workflow_json_0.4)
# Workflow JSON 0.4
Source: https://docs.comfy.org/specs/workflow_json_0.4
JSON schema for a ComfyUI workflow.
## v0.4
```json
{
"$ref": "#/definitions/ComfyWorkflow0_4",
"definitions": {
"ComfyWorkflow0_4": {
"type": "object",
"properties": {
"last_node_id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"last_link_id": {
"type": "number"
},
"nodes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"type": {
"type": "string"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"size": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"flags": {
"type": "object",
"properties": {
"collapsed": {
"type": "boolean"
},
"pinned": {
"type": "boolean"
},
"allow_interaction": {
"type": "boolean"
},
"horizontal": {
"type": "boolean"
},
"skip_repeated_outputs": {
"type": "boolean"
}
},
"additionalProperties": true
},
"order": {
"type": "number"
},
"mode": {
"type": "number"
},
"inputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"link": {
"type": [
"number",
"null"
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"outputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"links": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"properties": {
"type": "object",
"properties": {
"Node name for S&R": {
"type": "string"
}
},
"additionalProperties": true
},
"widgets_values": {
"anyOf": [
{
"type": "array"
},
{
"type": "object",
"additionalProperties": {}
}
]
},
"color": {
"type": "string"
},
"bgcolor": {
"type": "string"
}
},
"required": [
"id",
"type",
"pos",
"size",
"flags",
"order",
"mode",
"properties"
],
"additionalProperties": true
}
},
"links": {
"type": "array",
"items": {
"type": "array",
"minItems": 6,
"maxItems": 6,
"items": [
{
"type": "number"
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
}
]
}
},
"groups": {
"type": "array",
"items": {
"type": "object",
"properties": {
"title": {
"type": "string"
},
"bounding": {
"type": "array",
"minItems": 4,
"maxItems": 4,
"items": [
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
}
]
},
"color": {
"type": "string"
},
"font_size": {
"type": "number"
},
"locked": {
"type": "boolean"
}
},
"required": [
"title",
"bounding"
],
"additionalProperties": true
}
},
"config": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"links_ontop": {
"type": "boolean"
},
"align_to_grid": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"extra": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"ds": {
"type": "object",
"properties": {
"scale": {
"type": "number"
},
"offset": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
}
},
"required": [
"scale",
"offset"
],
"additionalProperties": true
},
"info": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"author": {
"type": "string"
},
"description": {
"type": "string"
},
"version": {
"type": "string"
},
"created": {
"type": "string"
},
"modified": {
"type": "string"
},
"software": {
"type": "string"
}
},
"required": [
"name",
"author",
"description",
"version",
"created",
"modified",
"software"
],
"additionalProperties": true
},
"linkExtensions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
}
},
"required": [
"id",
"parentId"
],
"additionalProperties": true
}
},
"reroutes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"linkIds": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
}
},
"required": [
"id",
"pos"
],
"additionalProperties": true
}
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"version": {
"type": "number"
},
"models": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"url": {
"type": "string",
"format": "uri"
},
"hash": {
"type": "string"
},
"hash_type": {
"type": "string"
},
"directory": {
"type": "string"
}
},
"required": [
"name",
"url",
"directory"
],
"additionalProperties": false
}
}
},
"required": [
"last_node_id",
"last_link_id",
"nodes",
"links",
"version"
],
"additionalProperties": true
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
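The main structural difference from v1.0 is that v0.4 stores each link as a fixed 6-element array rather than an object. A hedged sketch of upgrading one link entry, with the field order taken from the tuple schema above (the helper name is illustrative):

```python
def link_tuple_to_object(link):
    """Convert a v0.4 link [id, origin_id, origin_slot, target_id, target_slot, type]
    into the v1.0 object form."""
    lid, origin_id, origin_slot, target_id, target_slot, link_type = link
    return {
        "id": lid,
        "origin_id": origin_id,
        "origin_slot": origin_slot,
        "target_id": target_id,
        "target_slot": target_slot,
        "type": link_type,
    }

print(link_tuple_to_object([9, 4, 0, 7, 1, "IMAGE"]))
```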
# How to Troubleshoot and Solve ComfyUI Issues
Source: https://docs.comfy.org/troubleshooting/custom-node-issues
Troubleshoot and fix problems caused by custom nodes and extensions
Here is the overall approach for troubleshooting custom node issues:
```mermaid
flowchart TD
A[Issue encountered] --> B{Does the issue disappear after disabling all custom nodes?}
B -- Yes --> C[Issue caused by custom nodes]
B -- No --> D[Issue not caused by custom nodes - refer to other troubleshooting docs]
C --> E{Check frontend extensions first?}
E -- Yes --> F[Troubleshoot in the ComfyUI frontend<br>only requires reloading the frontend]
E -- No --> G[Use the general binary search method<br>requires restarting ComfyUI multiple times]
F --> H[Use binary search to locate the problematic node]
G --> H
H --> I[Fix, replace, report, or remove the problematic node]
I --> J[Issue solved]
```
## How to disable all custom nodes?
Start ComfyUI Desktop with custom nodes disabled from the settings menu, or run the server manually:
```bash
cd path/to/your/comfyui
python main.py --disable-all-custom-nodes
```
Open the folder where the portable version is located, and find the `run_nvidia_gpu.bat` or `run_cpu.bat` file
1. Copy `run_nvidia_gpu.bat` or `run_cpu.bat` file and rename it to `run_nvidia_gpu_disable_custom_nodes.bat`
2. Open the copied file with Notepad
3. Add the `--disable-all-custom-nodes` parameter to the file, or copy the parameters below into a `.txt` file and rename the file to `run_nvidia_gpu_disable_custom_nodes.bat`
```bash
.\python_embeded\python.exe -s ComfyUI\main.py --disable-all-custom-nodes --windows-standalone-build
pause
```
4. Save the file and close it
5. Double-click the file to run it. If everything is normal, you should see ComfyUI start and custom nodes disabled
1. Enter the folder where the portable version is located
2. Right-click in the folder and choose "Open in Terminal"
3. Make sure the terminal's current directory is the portable version's folder
4. Run the following command to start ComfyUI with the embedded Python and custom nodes disabled
```bash
.\python_embeded\python.exe -s ComfyUI\main.py --disable-all-custom-nodes
```
**Results:**
* ✅ **Issue disappears**: A custom node is causing the problem → Continue to Step 2
* ❌ **Issue persists**: Not a custom node issue → [Report the issue](#reporting-issues)
## What is Binary Search?
This document uses a binary search approach to troubleshoot custom node issues: check half of the custom nodes at a time until you locate the problematic one.
The flowchart below shows the approach: each round, enable half of the currently disabled nodes and check whether the issue appears, until the offending custom node is identified.
```mermaid
flowchart TD
A[Start] --> B[Split candidate custom nodes in half]
B --> C[Enable first half of custom nodes]
C --> D[Restart ComfyUI and test]
D --> E{Does the issue appear?}
E --> |Yes| F[Issue is in enabled custom nodes]
E --> |No| G[Issue is in disabled custom nodes]
F --> H{Enabled custom nodes > 1?}
G --> I{Disabled custom nodes > 1?}
H --> |Yes| J[Continue binary search on enabled nodes]
I --> |Yes| K[Continue binary search on disabled nodes]
H --> |No| L[Found problematic custom node]
I --> |No| L
J --> B
K --> B
L --> M[End]
```
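Because each round halves the candidate set, the number of restart-and-test cycles grows only logarithmically with the number of installed nodes. A quick way to estimate the effort (illustrative helper, not part of any ComfyUI tooling):

```python
import math

def max_bisect_rounds(n_nodes: int) -> int:
    """Upper bound on test cycles needed to isolate one problematic node."""
    return math.ceil(math.log2(n_nodes)) if n_nodes > 1 else 0

print(max_bisect_rounds(8))    # 3: 8 -> 4 -> 2 -> 1
print(max_bisect_rounds(100))  # 7
```

Even with 100 custom nodes installed, at most seven test cycles are needed.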
## Two Troubleshooting Methods
In this document, we categorize custom nodes into two types for troubleshooting:
* A: Custom nodes with frontend extensions
* B: Regular custom nodes
Let's first understand the potential issues and causes for different types of custom nodes:
For custom nodes, we can prioritize troubleshooting those with frontend extensions, as they cause the most issues. Their main conflicts arise from incompatibilities with ComfyUI frontend version updates.
Common issues include:
* Workflows not executing
* Some nodes can't show preview images (such as the Save Image node)
* Misaligned UI elements
* Unable to access ComfyUI frontend
* Completely broken UI or blank screen
* Unable to communicate normally with ComfyUI backend
* Node connections not working properly
* And more
Common causes for these issues:
* Frontend modifications during updates that custom nodes haven't adapted to yet
* Users updating ComfyUI without synchronously upgrading custom nodes, even though authors have released compatible versions
* Authors stopping maintenance, leading to incompatibility between custom node extensions and the ComfyUI frontend
If the problem isn't caused by custom nodes' frontend extensions, issues often relate to dependencies. Common problems include:
* "Failed to import" errors in console/logs
* Missing nodes still showing as missing after installation and restart
* ComfyUI crashes or fails to start
* And more
Common causes for these errors:
* Custom nodes requiring additional wheels like ComfyUI-Nunchaku
* Custom nodes using strict dependency versions (e.g., `torch==2.4.1`) while other plugins use different versions (e.g., `torch>=2.4.2`), causing conflicts after installation
* Network issues preventing successful dependency installation
When problems involve Python environment interdependencies and versions, troubleshooting becomes more complex and requires knowledge of Python environment management, including how to install and uninstall dependencies.
## Using Binary Search for Troubleshooting
Among these two different types of custom node issues, conflicts between custom node frontend extensions and ComfyUI are more common. We'll prioritize troubleshooting these nodes first. Here's the overall troubleshooting approach:
### 1. Troubleshooting the Custom Nodes' Frontend Extensions
After starting ComfyUI, find the `Extensions` menu in settings and follow the steps shown in the image to disable all third-party extensions
If you can't access the ComfyUI frontend at all, skip the frontend extension troubleshooting and continue to [General Custom Node Troubleshooting Approach](#2-general-custom-node-troubleshooting-approach)
After disabling frontend extensions for the first time, it's recommended to restart ComfyUI to ensure all frontend extensions are properly disabled
* If the problem disappears, then it was caused by custom node frontend extensions, and we can proceed with binary search troubleshooting
* If the problem persists, then it's not caused by frontend extensions - please refer to the other troubleshooting approaches in this document
Use the method mentioned at the beginning of this document to troubleshoot, enabling half of the custom nodes at a time until you find the problematic node
Refer to the image to enable half of the frontend extensions. Note that if extension names are similar, they likely come from the same custom node's frontend extensions
If you find the problematic custom node, please refer to the problem fixing section of this document to resolve the custom node issues
Using this method, you don't need to restart ComfyUI multiple times - just reload ComfyUI after enabling/disabling custom node frontend extensions. Plus, your troubleshooting scope is limited to nodes with frontend extensions, which greatly narrows down the search range.
### 2. General Custom Node Troubleshooting
In addition to manual binary search, you can automate the process with comfy-cli, as detailed below:
Using Comfy CLI requires some command line experience. If you're not comfortable with it, use manual binary search instead.
If you have [Comfy CLI](/comfy-cli/getting-started) installed, you can use the automated bisect tool to find the problematic node:
```bash
# Start a bisect session
comfy-cli node bisect start
# Follow the prompts:
# - Test ComfyUI with the current set of enabled nodes
# - Mark as 'good' if the issue is gone: comfy-cli node bisect good
# - Mark as 'bad' if the issue persists: comfy-cli node bisect bad
# - Repeat until the problematic node is identified
# Reset when done
comfy-cli node bisect reset
```
The bisect tool will automatically enable/disable nodes and guide you through the process.
Before starting, please **create a backup** of your custom\_nodes folder in case something goes wrong.
If you prefer to do the process manually or don't have Comfy CLI installed, follow the steps below:
Before starting, enter your `ComfyUI` folder
* **Backup all custom nodes**: Copy and rename `custom_nodes` to `custom_nodes_backup`
* **Create a temporary folder**: Create a folder named `custom_nodes_temp`
Or use the following commands to back up:
```bash
# Create backup and temporary folder
mkdir "%USERPROFILE%\custom_nodes_backup"
mkdir "%USERPROFILE%\custom_nodes_temp"
# Backup all content
xcopy "custom_nodes\*" "%USERPROFILE%\custom_nodes_backup\" /E /H /Y
```
Manually back up the `custom_nodes` folder, or use the following commands:
```bash
# Create backup and temporary folder
mkdir ~/custom_nodes_backup
mkdir ~/custom_nodes_temp
# Backup all content
cp -r custom_nodes/* ~/custom_nodes_backup/
```
```bash
# Create backup and temporary folder
mkdir /content/custom_nodes_backup
mkdir /content/custom_nodes_temp
# Backup all content
cp -r /content/ComfyUI/custom_nodes/* /content/custom_nodes_backup/
```
On Windows you can view the folder contents in File Explorer and skip this step; otherwise, list your installed custom nodes from the command line:
```bash
dir custom_nodes
```
```bash
ls custom_nodes/
```
```bash
ls /content/ComfyUI/custom_nodes/
```
Let's assume that you have 8 custom nodes. Move the first half to temporary storage:
```bash
# Move first half (nodes 1-4) to temp
move "custom_nodes\node1" "%USERPROFILE%\custom_nodes_temp\"
move "custom_nodes\node2" "%USERPROFILE%\custom_nodes_temp\"
move "custom_nodes\node3" "%USERPROFILE%\custom_nodes_temp\"
move "custom_nodes\node4" "%USERPROFILE%\custom_nodes_temp\"
```
```bash
# Move first half (nodes 1-4) to temp
mv custom_nodes/node1 ~/custom_nodes_temp/
mv custom_nodes/node2 ~/custom_nodes_temp/
mv custom_nodes/node3 ~/custom_nodes_temp/
mv custom_nodes/node4 ~/custom_nodes_temp/
```
```bash
# Move first half (nodes 1-4) to temp
mv /content/ComfyUI/custom_nodes/node1 /content/custom_nodes_temp/
mv /content/ComfyUI/custom_nodes/node2 /content/custom_nodes_temp/
mv /content/ComfyUI/custom_nodes/node3 /content/custom_nodes_temp/
mv /content/ComfyUI/custom_nodes/node4 /content/custom_nodes_temp/
```
Start ComfyUI normally
```bash
python main.py
```
* **Issue persists**: Problem is in the remaining nodes (5-8)
* **Issue disappears**: Problem was in the moved nodes (1-4)
* If issue persists: Move half of remaining nodes (e.g., nodes 7-8) to temp
* If issue gone: Move half of temp nodes (e.g., nodes 3-4) back to custom\_nodes
* Repeat until you find the single problematic node
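The move-half step above can also be scripted. This is a hedged, illustrative sketch (the `move_half` helper and folder names are my own; back up `custom_nodes` first, as advised above):

```python
import shutil
from pathlib import Path

def move_half(src: Path, dst: Path) -> list[str]:
    """Move the first half (alphabetically) of the subfolders in src into dst.

    Returns the names of the moved folders, so you know which half is
    currently disabled."""
    dst.mkdir(parents=True, exist_ok=True)
    nodes = sorted(p for p in src.iterdir() if p.is_dir())
    moved = nodes[: len(nodes) // 2]
    for p in moved:
        shutil.move(str(p), str(dst / p.name))
    return [p.name for p in moved]

# Example: move_half(Path("custom_nodes"), Path("custom_nodes_temp"))
```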
## How to Fix the Issue
Once you've identified the problematic custom node:
### Option 1: Update the Node
1. Check if there's an update available in ComfyUI Manager
2. Update the node and test again
### Option 2: Replace the Node
1. Look for alternative custom nodes with similar functionality
2. Check the [ComfyUI Registry](https://registry.comfy.org) for alternatives
### Option 3: Report the Issue
Contact the custom node developer:
1. Find the node's GitHub repository
2. Create an issue with:
* Your ComfyUI version
* Error messages/logs
* Steps to reproduce
* Your operating system
### Option 4: Remove or Disable the Node
If no fix is available and you don't need the functionality:
1. Remove the problematic node from `custom_nodes/` or disable it in the ComfyUI Manager interface
2. Restart ComfyUI
## Reporting Issues
If the issue isn't caused by custom nodes, refer to the general [troubleshooting overview](/troubleshooting/overview) for other common problems.
### For Custom Node-Specific Issues
Contact the custom node developer:
* Find the node's GitHub repository
* Create an issue with your ComfyUI version, error messages, reproduction steps, and OS
* Check the node's documentation and Issues page for known issues
### For ComfyUI Core Issues
* **GitHub**: [ComfyUI Issues](https://github.com/comfyanonymous/ComfyUI/issues)
* **Forum**: [Official ComfyUI Forum](https://forum.comfy.org/)
### For Desktop App Issues
* **GitHub**: [ComfyUI Desktop Issues](https://github.com/Comfy-Org/desktop/issues)
### For Frontend Issues
* **GitHub**: [ComfyUI Frontend Issues](https://github.com/Comfy-Org/ComfyUI_frontend/issues)
For general installation, model, or performance issues, see our [troubleshooting overview](/troubleshooting/overview) and [model issues](/troubleshooting/model-issues) pages.
# How to Troubleshoot and Solve ComfyUI Model Issues
Source: https://docs.comfy.org/troubleshooting/model-issues
Troubleshooting model-related problems including architecture mismatches, missing models, and loading errors
## Model Architecture Mismatch
**Symptoms:** Tensor dimension errors during generation, especially during VAE decode stage
**Common error messages:**
* `Given groups=1, weight of size [64, 4, 3, 3], expected input[1, 16, 128, 128] to have 4 channels, but got 16 channels instead`
* `Given groups=1, weight of size [4, 4, 1, 1], expected input[1, 16, 144, 112] to have 4 channels, but got 16 channels instead`
* `Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 16, 192, 128] to have 4 channels, but got 16 channels instead`
* `The size of tensor a (49) must match the size of tensor b (16) at non-singleton dimension 1`
* `Tensors must have same number of dimensions: got 2 and 3`
* `mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)`
**Root cause:** Using models from different architecture families together
### Solutions
1. **Verify model family compatibility:**
* **Flux models** use 16-channel latent space with dual text encoder conditioning (CLIP-L + T5-XXL)
* **SD1.5 models** use 4-channel latent space with single CLIP ViT-L/14 text encoder
* **SDXL models** use 4-channel latent space with dual text encoders (CLIP ViT-L/14 + OpenCLIP ViT-bigG/14)
* **SD3 models** use 16-channel latent space with triple text encoder conditioning (CLIP-L + OpenCLIP bigG + T5-XXL)
* **ControlNet models** must match the architecture of the base checkpoint (SD1.5 ControlNets only work with SD1.5 checkpoints, SDXL ControlNets only work with SDXL checkpoints, etc.)
2. **Common mismatch scenarios and fixes:**
**Flux + wrong VAE:**
```
Problem: Using taesd or sdxl_vae.safetensors with Flux checkpoint
Fix: Use ae.safetensors (Flux VAE) from Hugging Face Flux releases
```
**Flux + incorrect CLIP configuration:**
```
Problem: Using t5xxl_fp8_e4m3fn.safetensors in both CLIP slots of DualClipLoader
Fix: Use t5xxl_fp8_e4m3fn.safetensors in one slot and clip_l.safetensors in the other
```
**ControlNet architecture mismatch:**
```
Problem: SD1.5 ControlNet with SDXL checkpoint (or vice versa)
Error: "mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)"
Fix: Use ControlNet models designed for your checkpoint architecture
- SD1.5 checkpoints require SD1.5 ControlNets
- SDXL checkpoints require SDXL ControlNets
```
3. **Quick diagnostics:**
```bash
# Check whether the error occurs at the VAE decode stage
# Look for "expected input[N, C, H, W] to have X channels, but got Y channels"
# The channel count (second value, C) identifies the family: 4 = SD1.5/SDXL, 16 = Flux/SD3
```
4. **Prevention strategies:**
* Keep all workflow models within the same architecture family
* Download complete model packages from same source/release (often all in a Hugging Face repo)
* When trying new models, start with the template workflows or official ComfyUI workflow examples before customizing
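The channel count buried in these error messages is often enough to identify the mismatched family. A small illustrative parser, matched against the error messages listed above (the regex and helper name are my own):

```python
import re

# Latent channel counts per architecture family, from the list above
FAMILY_BY_CHANNELS = {4: "SD1.5 / SDXL", 16: "Flux / SD3"}

def diagnose_channels(error_msg: str):
    """Extract the actual channel count from a mismatch error and guess the family."""
    m = re.search(r"but got (\d+) channels", error_msg)
    if not m:
        return None
    channels = int(m.group(1))
    return channels, FAMILY_BY_CHANNELS.get(channels, "unknown")

msg = ("Given groups=1, weight of size [64, 4, 3, 3], expected "
       "input[1, 16, 128, 128] to have 4 channels, but got 16 channels instead")
print(diagnose_channels(msg))  # (16, 'Flux / SD3')
```

Here the model expected 4-channel latents (an SD-family component) but received 16-channel Flux/SD3 latents.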
## Missing Models Error
**Example error message:**
```
Prompt execution failed
Prompt outputs failed validation:
CheckpointLoaderSimple:
- Value not in list: ckpt_name: 'model-name.safetensors' not in []
```
### Solutions
1. **Download required models:**
* Use ComfyUI Manager to auto-download models
* Verify models are in correct subfolders
2. **Check model paths:**
* **Checkpoints**: `models/checkpoints/`
* **VAE**: `models/vae/`
* **LoRA**: `models/loras/`
* **ControlNet**: `models/controlnet/`
* **Embeddings**: `models/embeddings/`
3. **Share models between UIs or use custom paths:**
* See [ComfyUI Model Sharing and Custom Model Directory Configuration](/installation/comfyui_portable_windows#2-comfyui-model-sharing-and-custom-model-directory-configuration) for detailed instructions
* Edit `extra_model_paths.yaml` file to add custom model directories
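To check quickly whether a model file landed in the expected subfolder, here is an illustrative helper (the folder mapping mirrors the list above; `find_model` is not a ComfyUI API):

```python
from pathlib import Path

# Expected subfolders, mirroring the list above
MODEL_DIRS = {
    "checkpoint": "models/checkpoints",
    "vae": "models/vae",
    "lora": "models/loras",
    "controlnet": "models/controlnet",
    "embedding": "models/embeddings",
}

def find_model(comfy_root: Path, filename: str):
    """Return (kind, path) for the first expected subfolder containing filename, or None."""
    for kind, sub in MODEL_DIRS.items():
        candidate = comfy_root / sub / filename
        if candidate.exists():
            return kind, candidate
    return None

# Example: find_model(Path("/path/to/ComfyUI"), "model-name.safetensors")
```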
### Model Search Path Configuration
If you have models in custom locations, see the detailed guide for [ComfyUI Model Sharing and Custom Model Directory Configuration](/installation/comfyui_portable_windows#2-comfyui-model-sharing-and-custom-model-directory-configuration) to configure ComfyUI to find them.
## Model Loading Errors
**Error message:** "Error while deserializing header"
### Solutions
1. **Re-download the model** - File may be corrupted during download
2. **Check available disk space** - Ensure enough space for model loading (models can be 2-15GB+)
3. **Check file permissions** - Ensure ComfyUI can read the model files
4. **Test with different model** - Verify if issue is model-specific or system-wide
## Model Performance Issues
### Slow Model Loading
**Symptoms:** Long delays when switching models or starting generation
**Solutions:**
1. **Use faster storage:**
* Move models to SSD if using HDD
* Use NVMe SSD for best performance
2. **Adjust caching settings:**
```bash
python main.py --cache-classic # Use the old style (aggressive) caching.
python main.py --cache-lru 10 # Increase size of LRU cache
```
### Memory Issues with Large Models
**"RuntimeError: CUDA out of memory":**
```bash
# Progressive memory reduction
python main.py --lowvram # First try
python main.py --novram # If lowvram insufficient
python main.py --cpu # Last resort
```
**Model-specific memory optimization:**
```bash
# Force lower precision
python main.py --force-fp16
# Reduce attention memory usage
python main.py --use-pytorch-cross-attention
```
For additional model configuration and setup information, see the [Models documentation](/development/core-concepts/models).
# How to Troubleshoot and Solve ComfyUI Issues
Source: https://docs.comfy.org/troubleshooting/overview
Common ComfyUI issues, solutions, and how to report bugs effectively
We receive a lot of issue reports, and most of them turn out to be related to custom nodes. Please read the [custom node troubleshooting guide](/troubleshooting/custom-node-issues) before submitting an error report, to rule out custom nodes as the cause.
## Common Issues & Quick Fixes
Before diving into detailed troubleshooting, try these common solutions:
### ComfyUI Won't Start
**Symptoms:** Application crashes on startup, black screen, or fails to load
**Quick fixes:**
1. **Check system requirements** - Ensure your system meets the [minimum requirements](/installation/system_requirements)
2. **Update GPU drivers** - Download latest drivers from NVIDIA/AMD/Intel
### Generation Fails or Produces Errors
**Symptoms:** "Prompt execution failed" dialog with "Show report" button, workflow stops executing
**Quick fixes:**
1. **Click "Show report"** - Read the detailed error message to identify the specific issue
2. **Check if it's a custom node issue** - [Follow our custom node troubleshooting guide](/troubleshooting/custom-node-issues)
3. **Verify model files** - See [Models documentation](/development/core-concepts/models) for model setup
4. **Check VRAM usage** - Close other applications using GPU memory
### Slow Performance
**Symptoms:** Very slow generation times, system freezing, out of memory errors
**Quick fixes:**
1. **Lower resolution/batch size** - Reduce image size or number of images
2. **Use memory optimization flags** - See performance optimization section below
3. **Close unnecessary applications** - Free up RAM and VRAM
4. **Check CPU/GPU usage** - Use Task Manager to identify bottlenecks
**Performance Optimization Commands:**
For low VRAM systems:
```bash
# Low VRAM mode (uses cpu for text encoder)
python main.py --lowvram
# CPU mode (very slow but works with any hardware, only use as absolute last resort)
python main.py --cpu
```
For better performance:
```bash
# Disable previews (saves VRAM and processing)
python main.py --preview-method none
# Use optimized attention mechanisms
python main.py --use-pytorch-cross-attention
python main.py --use-flash-attention
# Async weight offloading
python main.py --async-offload
```
For memory management:
```bash
# Reserve specific VRAM amount for OS (in GB)
python main.py --reserve-vram 2
# Disable smart memory management
python main.py --disable-smart-memory
# Use different caching strategies
python main.py --cache-none # Less RAM usage, but slower
python main.py --cache-lru 10 # Cache 10 results, faster
python main.py --cache-classic # Use the old style (aggressive) caching.
```
## Installation-Specific Issues
### Desktop App Issues
For comprehensive desktop installation troubleshooting, see the [Desktop Installation Guide](/installation/desktop/windows).
* **Unsupported device**: ComfyUI Desktop Windows only supports NVIDIA GPUs with CUDA. Use [ComfyUI Portable](/installation/comfyui_portable_windows) or [manual installation](/installation/manual_install) for other GPUs
* **Installation fails**: Run installer as administrator, ensure at least 15GB disk space
* **Maintenance page**: Check [mirror settings](/installation/desktop/windows#mirror-settings) if downloads fail
* **Missing models**: Models are not copied during migration, only linked. Verify model paths
* **"App is damaged"**: Allow app in Security & Privacy settings
* **Performance issues**: Grant Full Disk Access in Privacy settings
* **Crashes**: Check Console app for crash reports
* **Missing libraries**: Install dependencies with package manager
* **LD\_LIBRARY\_PATH errors**: PyTorch library path issues (see below)
### Manual Installation Issues
The documentation may be slightly out of date. If an issue occurs, manually check whether a newer stable version of PyTorch or of any of the listed libraries exists. Refer to resources like the [PyTorch installation matrix](https://pytorch.org/get-started/locally/) or the [ROCm website](https://rocm.docs.amd.com/projects/install-on-linux/en/develop/install/3rd-party/pytorch-install.html#using-a-wheels-package).
**Python version conflicts:**
```bash
# Check Python version (3.9+ required, 3.12 recommended)
python --version
# Use virtual environment (recommended)
python -m venv comfyui_env
source comfyui_env/bin/activate # Linux/Mac
comfyui_env\Scripts\activate # Windows
```
**Package installation failures:**
```bash
# Update pip first
python -m pip install --upgrade pip
# Install dependencies
pip install -r requirements.txt
# For NVIDIA GPUs (CUDA 12.8)
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
# For AMD GPUs (Linux only - ROCm 6.3)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3
```
### Linux-Specific Issues
**LD\_LIBRARY\_PATH errors:**
Common symptoms:
* "libcuda.so.1: cannot open shared object file"
* "libnccl.so: cannot open shared object file"
* "ImportError: libnvinfer.so.X: cannot open shared object file"
**Solutions:**
1. **Modern PyTorch installations (most common):**
```bash
# For virtual environments with NVIDIA packages
export LD_LIBRARY_PATH=$VIRTUAL_ENV/lib/python3.12/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# For conda environments
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib/python3.12/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# Or find your Python site-packages automatically
PYTHON_PATH=$(python -c "import site; print(site.getsitepackages()[0])")
export LD_LIBRARY_PATH=$PYTHON_PATH/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# You may also need other NVIDIA libraries
export LD_LIBRARY_PATH=$PYTHON_PATH/nvidia/cuda_runtime/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$PYTHON_PATH/nvidia/cublas/lib:$LD_LIBRARY_PATH
```
2. **Find what libraries you have:**
```bash
# Check installed NVIDIA packages
python -c "import site; import os; nvidia_path=os.path.join(site.getsitepackages()[0], 'nvidia'); print('NVIDIA libs:', [d for d in os.listdir(nvidia_path) if os.path.isdir(os.path.join(nvidia_path, d))] if os.path.exists(nvidia_path) else 'Not found')"
# Find missing libraries that PyTorch needs
python -c "import torch; print(torch.__file__)"
ldd $(python -c "import torch; print(torch.__file__.replace('__init__.py', 'lib/libtorch_cuda.so'))")
```
3. **Set permanently for your environment:**
```bash
# For virtual environments, add to activation script
echo 'export LD_LIBRARY_PATH=$VIRTUAL_ENV/lib/python*/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH' >> $VIRTUAL_ENV/bin/activate
# For conda environments
conda env config vars set LD_LIBRARY_PATH=$CONDA_PREFIX/lib/python*/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# For global bashrc (adjust Python version as needed)
echo 'export LD_LIBRARY_PATH=$(python -c "import site; print(site.getsitepackages()[0])")/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
```
4. **Alternative: Use ldconfig:**
```bash
# Check current library cache
ldconfig -p | grep cuda
ldconfig -p | grep nccl
# If missing, add library paths (requires root; use tee, since with `sudo echo >` the redirection itself runs unprivileged)
echo "/usr/local/cuda/lib64" | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig
```
5. **Debug library loading:**
```bash
# Verbose library loading to see what's missing
LD_DEBUG=libs python main.py 2>&1 | grep "looking for"
# Check PyTorch CUDA availability
python -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('CUDA version:', torch.version.cuda)"
```
## Model-Related Issues
For comprehensive model troubleshooting including architecture mismatches, missing models, and loading errors, see the dedicated [Model Issues](/troubleshooting/model-issues) page.
## Network & API Issues
### API Nodes Not Working
**Symptoms:** API calls fail, timeout errors, quota exceeded
**Solutions:**
1. **Check API key validity** - Verify keys in [user settings](/interface/user)
2. **Check account credits** - Ensure sufficient [API credits](/interface/credits)
3. **Verify internet connection** - Test with other online services
4. **Check service status** - Provider may be experiencing downtime
### Connection Issues
**Symptoms:** "Failed to connect to server", timeout errors
**Solutions:**
1. **Check firewall settings** - Allow ComfyUI through firewall
2. **Try different port** - Default is 8188, try 8189 or 8190
3. **Disable VPN temporarily** - VPN may be blocking connections
4. **Check proxy settings** - Disable proxy if not required
### Frontend Issues
**"Frontend or Templates Package Not Updated":**
```bash
# After updating ComfyUI via Git, update frontend dependencies
pip install -r requirements.txt
```
**"Can't Find Custom Node":**
* Disable node validation in ComfyUI settings
**"Error Toast About Workflow Failing Validation":**
* Disable workflow validation in settings temporarily
* Report the issue to the ComfyUI team
**Login Issues When Not on Localhost:**
* Normal login only works when accessing from localhost
* For LAN/remote access: Generate API key at [platform.comfy.org/login](https://platform.comfy.org/login)
* Use API key in login dialog or with `--api-key` command line argument
## Hardware-Specific Issues
### NVIDIA GPU Issues
**"Torch not compiled with CUDA enabled" error:**
```bash
# First uninstall torch
pip uninstall torch
# Install stable PyTorch with CUDA 12.8
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
# For nightly builds (might have performance improvements)
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
# Verify CUDA support
python -c "import torch; print(torch.cuda.is_available())"
```
**GPU not detected:**
```bash
# Check if GPU is visible
nvidia-smi
# Check driver version and CUDA compatibility
nvidia-smi --query-gpu=driver_version --format=csv
```
### AMD GPU Issues
**ROCm support (Linux only):**
```bash
# Install stable ROCm PyTorch (6.3.1 at the time of writing)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3
# For nightly builds (ROCm 6.4 at the time of writing), which might have performance improvements
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.4
```
**Unsupported AMD GPUs:**
```bash
# For RDNA2 or older (6700, 6600)
HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py
# For RDNA3 cards (7600)
HSA_OVERRIDE_GFX_VERSION=11.0.0 python main.py
```
**Performance optimization:**
```bash
# Enable experimental memory efficient attention (no longer necessary with PyTorch 2.4)
TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 python main.py --use-pytorch-cross-attention
# Enable tunable operations (slow first run, but faster subsequent runs)
PYTORCH_TUNABLEOP_ENABLED=1 python main.py
```
### Apple Silicon (M1/M2/M3) Issues
**MPS backend setup:**
```bash
# Install PyTorch nightly for Apple Silicon
# Follow Apple's guide: https://developer.apple.com/metal/pytorch/
# Check MPS availability
python -c "import torch; print(torch.backends.mps.is_available())"
# Launch ComfyUI
python main.py
```
**If MPS causes issues:**
```bash
# Force CPU mode
python main.py --cpu
# With memory optimization
python main.py --force-fp16 --cpu
```
### Intel GPU Issues
**Option 1: Native PyTorch XPU support (Windows/Linux):**
```bash
# Install PyTorch nightly with XPU support
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/xpu
# Launch ComfyUI
python main.py
```
**Option 2: Intel Extension for PyTorch (IPEX):**
```bash
# For Intel Arc A-Series Graphics
conda install libuv
pip install torch==2.3.1.post0+cxx11.abi torchvision==0.18.1.post0+cxx11.abi torchaudio==2.3.1.post0+cxx11.abi intel-extension-for-pytorch==2.3.110.post0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```
## Getting Help & Reporting Bugs
### Before Reporting a Bug
1. **Check if it's a known issue:**
* Search [GitHub Issues](https://github.com/comfyanonymous/ComfyUI/issues)
* Check [ComfyUI Forum](https://forum.comfy.org/)
* Review [Discord discussions](https://discord.com/invite/comfyorg)
2. **Try basic troubleshooting:**
* Test with [default workflow](/get_started/first_generation)
* Disable all custom nodes (see [custom node troubleshooting](/troubleshooting/custom-node-issues))
* Check console/terminal for error messages
* If using comfy-cli, try updating: `comfy node update all`
### How to Report Bugs Effectively
#### For ComfyUI Core Issues
**Where to report:** [GitHub Issues](https://github.com/comfyanonymous/ComfyUI/issues)
#### For Desktop App Issues
**Where to report:** [Desktop GitHub Issues](https://github.com/Comfy-Org/desktop/issues)
#### For Frontend Issues
**Where to report:** [Frontend GitHub Issues](https://github.com/Comfy-Org/ComfyUI_frontend/issues)
#### For Custom Node Issues
**Where to report:** Contact the specific custom node developer
### Required Information
When reporting any issue, include:
**System Information (can be found in the About page in settings):**
* Operating System (Windows 11, macOS 14.1, Ubuntu 22.04, etc.)
* ComfyUI version (check About page in settings)
* Python version: `python --version`
* PyTorch version: `python -c "import torch; print(torch.__version__)"`
* GPU model and driver version
* Installation method (Desktop, Portable, Manual, comfy-cli)
```bash
# System info (Windows)
systeminfo | findstr /C:"OS Name" /C:"OS Version"
# GPU info
wmic path win32_VideoController get name
# Python & PyTorch info
python --version
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"
```
```bash
# System info (Linux/macOS)
uname -a
# GPU info (Linux)
lspci | grep VGA
# Python & PyTorch info
python --version
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"
```
**For Desktop App issues, also include:**
* Log files from: `C:\Users\<username>\AppData\Roaming\ComfyUI\logs` (Windows)
* Config files from: `C:\Users\<username>\AppData\Roaming\ComfyUI` (Windows)
**Problem Details:**
* Clear description of the issue
* Steps to reproduce the problem
* Expected vs actual behavior
* Screenshots or videos if applicable
**Error Messages:**
* Full error text from console/terminal
* Browser console errors (F12 → Console tab)
* Any crash logs or error dialogs
**Additional Context:**
* List of installed custom nodes
* Workflow file (.json) that reproduces the issue
* Recent changes (new installations, updates, etc.)
## Community Resources
* **Official Forum:** [forum.comfy.org](https://forum.comfy.org/)
* **Discord:** [ComfyUI Discord Server](https://discord.com/invite/comfyorg)
* **Reddit:** [r/comfyui](https://reddit.com/r/comfyui)
* **YouTube:** [ComfyUI Tutorials](https://www.youtube.com/@comfyorg)
# ComfyUI Hunyuan3D-2 Examples
Source: https://docs.comfy.org/tutorials/3d/hunyuan3D-2
This guide will demonstrate how to use Hunyuan3D-2 in ComfyUI to generate 3D assets.
# Hunyuan3D 2.0 Introduction


[Hunyuan3D 2.0](https://github.com/Tencent/Hunyuan3D-2) is an open-source 3D asset generation model released by Tencent, capable of generating high-fidelity 3D models with high-resolution texture maps through text or images.
Hunyuan3D 2.0 adopts a two-stage generation approach, first generating a geometry model without textures, then synthesizing high-resolution texture maps. This effectively separates the complexity of shape and texture generation. Below are the two core components of Hunyuan3D 2.0:
1. **Geometry Generation Model (Hunyuan3D-DiT)**: Based on a flow diffusion Transformer architecture, it generates untextured geometric models that precisely match input conditions.
2. **Texture Generation Model (Hunyuan3D-Paint)**: Combines geometric conditions and multi-view diffusion techniques to add high-resolution textures to models, supporting PBR materials.
**Key Advantages**
* **High-Precision Generation**: Sharp geometric structures, rich texture colors, support for PBR material generation, achieving near-realistic lighting effects.
* **Diverse Usage Methods**: Provides code calls, Blender plugins, Gradio applications, and online experience through the official website, suitable for different user needs.
* **Lightweight and Compatibility**: The Hunyuan3D-2mini model requires only 5GB VRAM, the standard version needs 6GB VRAM for shape generation, and the complete process (shape + texture) requires only 12GB VRAM.
On March 18, 2025, Hunyuan3D 2.0 also introduced a multi-view shape generation model (Hunyuan3D-2mv), which supports generating more detailed geometric structures from inputs captured at different angles.
This example includes three workflows:
* Using Hunyuan3D-2mv with multiple view inputs to generate 3D models
* Using Hunyuan3D-2mv-turbo with multiple view inputs to generate 3D models
* Using Hunyuan3D-2 with a single view input to generate 3D models
ComfyUI now natively supports Hunyuan3D-2mv, but does not yet support texture and material generation. Please make sure you have updated to the latest version of [ComfyUI](https://github.com/comfyanonymous/ComfyUI) before starting.
The workflow example PNG images in this tutorial contain workflow JSON in their metadata:
* You can drag them directly into ComfyUI
* Or use the menu `Workflows` -> `Open (ctrl+o)`
This will load the corresponding workflow and prompt you to download the required models. The generated `.glb` format models will be output to the `ComfyUI/output/mesh` folder.
## ComfyUI Hunyuan3D-2mv Workflow Example
In the Hunyuan3D-2mv workflow, we'll use multi-view images to generate a 3D model. Note that multiple view images are not mandatory in this workflow - you can use only the `front` view image to generate a 3D model.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop releases can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not yet include the latest changes
3. Some nodes failed to import at startup
### 1. Workflow
Please download the images below and drag them into ComfyUI to load the workflow.

Download the images below; we will use them as input images.
In this example, the input images have already been preprocessed to remove the background. In actual use, you can use custom nodes like [ComfyUI\_essentials](https://github.com/cubiq/ComfyUI_essentials) to remove the background automatically.
### 2. Manual Model Installation
Download the model below and save it to the corresponding ComfyUI folder
* hunyuan3d-dit-v2-mv: [model.fp16.safetensors](https://huggingface.co/tencent/Hunyuan3D-2mv/resolve/main/hunyuan3d-dit-v2-mv/model.fp16.safetensors?download=true) - after downloading, you can rename it to `hunyuan3d-dit-v2-mv.safetensors`
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── hunyuan3d-dit-v2-mv.safetensors // renamed file
```
### 3. Steps to Run the Workflow
1. Ensure that the `Image Only Checkpoint Loader (img2vid model)` node has loaded the downloaded and renamed `hunyuan3d-dit-v2-mv.safetensors` model
2. Load the corresponding view images in each of the `Load Image` nodes
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
If you need to add more views, connect the additional view inputs of the `Hunyuan3Dv2ConditioningMultiView` node and load the corresponding view image in each `Load Image` node.
## Hunyuan3D-2mv-turbo Workflow
In the Hunyuan3D-2mv-turbo workflow, we'll use the Hunyuan3D-2mv-turbo model to generate 3D models. This model is a step-distilled version of Hunyuan3D-2mv, allowing for faster 3D model generation. In this version of the workflow, we set `cfg` to 1.0 and add a `flux guidance` node to control the `distilled cfg` value.
### 1. Workflow
Please download the images below and drag them into ComfyUI to load the workflow.

Download the images below; we will use them as input images.
### 2. Manual Model Installation
Download the model below and save it to the corresponding ComfyUI folder
* hunyuan3d-dit-v2-mv-turbo: [model.fp16.safetensors](https://huggingface.co/tencent/Hunyuan3D-2mv/resolve/main/hunyuan3d-dit-v2-mv-turbo/model.fp16.safetensors?download=true) - after downloading, you can rename it to `hunyuan3d-dit-v2-mv-turbo.safetensors`
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── hunyuan3d-dit-v2-mv-turbo.safetensors // renamed file
```
### 3. Steps to Run the Workflow
1. Ensure that the `Image Only Checkpoint Loader(img2vid model)` node has loaded our renamed `hunyuan3d-dit-v2-mv-turbo.safetensors` model
2. Load the corresponding view images in each of the `Load Image` nodes
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Hunyuan3D-2 Single View Workflow
In the Hunyuan3D-2 workflow, we'll use the Hunyuan3D-2 model to generate 3D models. This model is not a multi-view model. In this workflow, we use the `Hunyuan3Dv2Conditioning` node instead of the `Hunyuan3Dv2ConditioningMultiView` node.
### 1. Workflow
Please download the image below and drag it into ComfyUI to load the workflow.

Download the image below; we will use it as the input image.

### 2. Manual Model Installation
Download the model below and save it to the corresponding ComfyUI folder
* hunyuan3d-dit-v2-0: [model.fp16.safetensors](https://huggingface.co/tencent/Hunyuan3D-2/resolve/main/hunyuan3d-dit-v2-0/model.fp16.safetensors?download=true) - after downloading, you can rename it to `hunyuan3d-dit-v2.safetensors`
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── hunyuan3d-dit-v2.safetensors // renamed file
```
### 3. Steps to Run the Workflow
1. Ensure that the `Image Only Checkpoint Loader(img2vid model)` node has loaded our renamed `hunyuan3d-dit-v2.safetensors` model
2. Load the image in the `Load Image` node
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Community Resources
Below are ComfyUI community resources related to Hunyuan3D-2
* [ComfyUI-Hunyuan3DWrapper](https://github.com/kijai/ComfyUI-Hunyuan3DWrapper)
* [Kijai/Hunyuan3D-2\_safetensors](https://huggingface.co/Kijai/Hunyuan3D-2_safetensors/tree/main)
* [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack)
## Hunyuan3D 2.0 Open-Source Model Series
Currently, Hunyuan3D 2.0 has open-sourced multiple models covering the complete 3D generation process. You can visit [Hunyuan3D-2](https://github.com/Tencent/Hunyuan3D-2) for more information.
**Hunyuan3D-2mini Series**
| Model | Description | Date | Parameters | Huggingface |
| --------------------- | ------------------------- | ---------- | ---------- | --------------------------------------------------------------------------------------- |
| Hunyuan3D-DiT-v2-mini | Mini Image to Shape Model | 2025-03-18 | 0.6B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2mini/tree/main/hunyuan3d-dit-v2-mini) |
**Hunyuan3D-2mv Series**
| Model | Description | Date | Parameters | Huggingface |
| ------------------------ | ----------------------------------------------------------------------------------------------------------- | ---------- | ---------- | ---------------------------------------------------------------------------------------- |
| Hunyuan3D-DiT-v2-mv-Fast | Guidance Distillation Version, can halve DiT inference time | 2025-03-18 | 1.1B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2mv/tree/main/hunyuan3d-dit-v2-mv-fast) |
| Hunyuan3D-DiT-v2-mv | Multi-view Image to Shape Model, suitable for 3D creation requiring multiple angles to understand the scene | 2025-03-18 | 1.1B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2mv/tree/main/hunyuan3d-dit-v2-mv) |
**Hunyuan3D-2 Series**
| Model | Description | Date | Parameters | Huggingface |
| ----------------------- | --------------------------- | ---------- | ---------- | ------------------------------------------------------------------------------------- |
| Hunyuan3D-DiT-v2-0-Fast | Guidance Distillation Model | 2025-02-03 | 1.1B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-dit-v2-0-fast) |
| Hunyuan3D-DiT-v2-0 | Image to Shape Model | 2025-01-21 | 1.1B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-dit-v2-0) |
| Hunyuan3D-Paint-v2-0 | Texture Generation Model | 2025-01-21 | 1.3B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-paint-v2-0) |
| Hunyuan3D-Delight-v2-0 | Image Delight Model | 2025-01-21 | 1.3B | [Visit](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-delight-v2-0) |
# Flux 1.1 Pro Ultra Image API Node ComfyUI Official Workflow Examples
Source: https://docs.comfy.org/tutorials/api-nodes/black-forest-labs/flux-1-1-pro-ultra-image
This guide covers how to use the Flux 1.1 Pro Ultra Image API node in ComfyUI
FLUX 1.1 Pro Ultra is a high-performance AI image generation tool by Black Forest Labs, featuring ultra-high resolution and efficient generation. It supports up to 4MP resolution (4x the standard version) while keeping single-image generation time under 10 seconds - 2.5x faster than comparable high-resolution models.
The tool offers two core modes:
* **Ultra Mode**: Designed for high-resolution needs, perfect for advertising and e-commerce where detail magnification is important. It accurately reflects prompts while maintaining generation speed.
* **Raw Mode**: Focuses on natural realism, optimizing skin tones, lighting, and landscape details. Reduces the "AI look" and is ideal for photography and realistic style creation.
We now support the Flux 1.1 Pro Ultra Image node in ComfyUI. This guide will cover:
* Flux 1.1 Pro Text-to-Image
* Flux 1.1 Pro Image-to-Image (Remix)
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop releases can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not yet include the latest changes
3. Some nodes failed to import at startup
## Flux 1.1 Pro Ultra Image Node Documentation
Check the following documentation for detailed node parameter settings:
* [Flux 1.1 Pro Ultra Image](/images/built-in-nodes/api_nodes/bfl/flux-1-1-pro-ultra-image.jpg)
## Flux 1.1 \[pro] Text-to-Image Tutorial
### 1. Download Workflow File
Download and drag the following file into ComfyUI to load the workflow:

### 2. Complete the Workflow Steps
Follow the numbered steps to complete the basic workflow:
1. (Optional) Modify the prompt in the `Flux 1.1 [pro] Ultra Image` node
2. (Optional) Set the `raw` parameter to `true` to enable Raw Mode for more natural, realistic output
3. Click `Run` or use shortcut `Ctrl(cmd) + Enter` to generate the image
4. After the API returns results, view the generated image in the `Save Image` node. Images are saved to the `ComfyUI/output/` directory
## Flux 1.1 \[pro] Image-to-Image Tutorial
When adding an `image_prompt` to the node input, the output will blend features from the input image (Remix). The `image_prompt_strength` value affects the blend ratio: higher values make the output more similar to the input image.
### 1. Download Workflow File
Download and drag the following file into ComfyUI, or right-click the purple node in the Text-to-Image workflow and set `mode` to `always` to enable `image_prompt` input:

We'll use this image as input:

### 2. Complete the Workflow Steps
Follow these numbered steps:
1. Click **Upload** on the `Load Image` node to upload your input image
2. (Optional) Adjust `image_prompt_strength` in `Flux 1.1 [pro] Ultra Image` to change the blend ratio
3. Click `Run` or use shortcut `Ctrl(cmd) + Enter` to generate the image
4. After the API returns results, view the generated image in the `Save Image` node. Images are saved to the `ComfyUI/output/` directory
Here's a comparison of outputs with different `image_prompt_strength` values:
# ComfyUI Flux.1 Kontext Pro Image API Node Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/black-forest-labs/flux-1-kontext
This guide will show you how to use the Flux.1 Kontext Pro Image API node in ComfyUI to perform image editing
FLUX.1 Kontext is a professional image-to-image editing model developed by Black Forest Labs, focusing on intelligent understanding of image context and precise editing.
It can perform various editing tasks without complex descriptions, including object modification, style transfer, background replacement, character consistency editing, and text editing.
The core advantage of Kontext lies in its excellent context understanding ability and character consistency maintenance, ensuring that key elements such as character features and composition layout remain stable even after multiple iterations of editing.
Currently, ComfyUI supports two Flux.1 Kontext models:
* **Kontext Pro** is ideal for editing, composing, and remixing.
* **Kontext Max** pushes the limits on typography, prompt precision, and speed.
In this guide, we will briefly introduce how to use the Flux.1 Kontext API nodes to perform image editing through corresponding workflows.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop releases can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not yet include the latest changes
3. Some nodes failed to import at startup
## Flux.1 Kontext Multiple Image Input Workflow
ComfyUI recently added support for workflows with multiple image inputs. Using the new `Image Stitch` node, you can stitch several images into a single image and edit it with Flux.1 Kontext.
### 1. Workflow File Download
The `metadata` of the images below contains the workflow information. Please download and drag them into ComfyUI to load the corresponding workflow.

Download the following images for input or use your own images:



### 2. Complete the Workflow Step by Step
You can follow the numbered steps in the image to complete the workflow:
1. Upload the provided images in the `Load Image` nodes
2. Modify the necessary parameters in `Flux.1 Kontext Pro Image`:
* `prompt` Enter the prompt for the image you want to edit
* `aspect_ratio` Set the aspect ratio of the original image, which must be between 1:4 and 4:1
* `prompt_upsampling` Set whether to use prompt upsampling. If enabled, it will automatically modify the prompt to get richer results, but the results are not reproducible
3. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image editing
4. After waiting for the API to return results, you can view the edited image in the `Save Image` node, and the corresponding image will also be saved to the `ComfyUI/output/` directory
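The `aspect_ratio` constraint above (between 1:4 and 4:1) can be sketched as a small client-side check; `is_valid_aspect_ratio` is a hypothetical helper for illustration, not part of the node:

```python
def is_valid_aspect_ratio(ratio: str) -> bool:
    """Return True if a 'W:H' string falls within the accepted 1:4 to 4:1 range."""
    w, h = (int(part) for part in ratio.split(":"))
    return 1 / 4 <= w / h <= 4

# "16:9" and "4:1" are accepted; "1:5" is out of range
```

Values outside this range would be rejected by the API, so it can be handy to validate the ratio before queueing a run.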
The two workflows that follow differ only in the API node used; you can adapt the multiple-image input workflow with minimal changes.
## Flux.1 Kontext Pro Image API Node Workflow
### 1. Workflow File Download
The `metadata` of the image below contains the workflow information. Please download and drag it into ComfyUI to load the corresponding workflow.

Download the image below for input or use your own image:

### 2. Complete the Workflow Step by Step
You can follow the numbered steps in the image to complete the workflow:
1. Load the image you want to edit in the `Load Image` node
2. (Optional) Modify the necessary parameters in `Flux.1 Kontext Pro Image`
3. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image editing
4. After waiting for the API to return results, you can view the edited image in the `Save Image` node, and the corresponding image will also be saved to the `ComfyUI/output/` directory
## Flux.1 Kontext Max Image API Node Workflow
### 1. Workflow File Download
The `metadata` of the image below contains the workflow information. Please download and drag it into ComfyUI to load the corresponding workflow.

Download the image below for input or use your own image for demonstration:

### 2. Complete the Workflow Step by Step
You can follow the numbered steps in the image to complete the workflow:
1. Load the image you want to edit in the `Load Image` node
2. (Optional) Modify the necessary parameters in `Flux.1 Kontext Max Image`
3. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image editing
4. After waiting for the API to return results, you can view the edited image in the `Save Image` node, and the corresponding image will also be saved to the `ComfyUI/output/` directory
## Flux Kontext Prompt Techniques
### 1. Basic Modifications
* Simple and direct: `"Change the car color to red"`
* Maintain style: `"Change to daytime while maintaining the same style of the painting"`
### 2. Style Transfer
**Principles:**
* Clearly name style: `"Transform to Bauhaus art style"`
* Describe characteristics: `"Transform to oil painting with visible brushstrokes, thick paint texture"`
* Preserve composition: `"Change to Bauhaus style while maintaining the original composition"`
### 3. Character Consistency
**Framework:**
* Specific description: `"The woman with short black hair"` instead of "she"
* Preserve features: `"while maintaining the same facial features, hairstyle, and expression"`
* Step-by-step modifications: Change background first, then actions
### 4. Text Editing
* Use quotes: `"Replace 'joy' with 'BFL'"`
* Maintain format: `"Replace text while maintaining the same font style"`
## Common Problem Solutions
### Character Changes Too Much
❌ Wrong: `"Transform the person into a Viking"`
✅ Correct: `"Change the clothes to be a viking warrior while preserving facial features"`
### Composition Position Changes
❌ Wrong: `"Put him on a beach"`
✅ Correct: `"Change the background to a beach while keeping the person in the exact same position, scale, and pose"`
### Style Application Inaccuracy
❌ Wrong: `"Make it a sketch"`
✅ Correct: `"Convert to pencil sketch with natural graphite lines, cross-hatching, and visible paper texture"`
## Core Principles
1. **Be Specific and Clear** - Use precise descriptions, avoid vague terms
2. **Step-by-step Editing** - Break complex modifications into multiple simple steps
3. **Explicit Preservation** - State what should remain unchanged
4. **Verb Selection** - Use "change", "replace" rather than "transform"
## Best Practice Templates
**Object Modification:**
`"Change [object] to [new state], keep [content to preserve] unchanged"`
**Style Transfer:**
`"Transform to [specific style], while maintaining [composition/character/other] unchanged"`
**Background Replacement:**
`"Change the background to [new background], keep the subject in the exact same position and pose"`
**Text Editing:**
`"Replace '[original text]' with '[new text]', maintain the same font style"`
> **Remember:** The more specific, the better. Kontext excels at understanding detailed instructions and maintaining consistency.
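When generating edit prompts in bulk, the templates above can be filled programmatically. A minimal sketch; the helper names are ours and not part of ComfyUI or the Kontext API:

```python
# Hypothetical helpers that fill the Kontext best-practice templates above.
def object_modification(obj: str, new_state: str, preserve: str) -> str:
    return f"Change {obj} to {new_state}, keep {preserve} unchanged"

def style_transfer(style: str, preserve: str) -> str:
    return f"Transform to {style}, while maintaining {preserve} unchanged"

def background_replacement(new_background: str) -> str:
    return (f"Change the background to {new_background}, "
            "keep the subject in the exact same position and pose")

def text_edit(original: str, new: str) -> str:
    return f"Replace '{original}' with '{new}', maintain the same font style"
```

For example, `object_modification("the car color", "red", "the lighting")` yields `Change the car color to red, keep the lighting unchanged`.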
# FAQs about API Nodes
Source: https://docs.comfy.org/tutorials/api-nodes/faq
Some FAQs you may encounter when using API Nodes.
This article addresses common questions regarding the use of API nodes.
Please update your ComfyUI to the latest version (the latest commit or the latest [desktop version](https://www.comfy.org/download)).
We may add more API support in the future, and the corresponding nodes will be updated, so please keep your ComfyUI up to date.
Please note that you need to distinguish between the nightly version and the release version.
In some cases, the latest `release` version may not be updated in time compared to the `nightly` version.
Since we are still iterating quickly, please ensure you are using the latest version when you cannot find the corresponding node.
API access requires a secure network environment. The current requirements are as follows:
* Only local access via `127.0.0.1` or `localhost` is supported, which means you may not be able to use the API Nodes in a ComfyUI service started with the `--listen` parameter in a LAN environment.
* You must be able to reach our API service normally (a proxy service may be required in some regions).
* Your account must have enough [credits](/interface/credits).
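The local-access requirement above boils down to a host check. This is an illustrative sketch of the rule, not ComfyUI's actual enforcement code:

```python
ALLOWED_LOCAL_HOSTS = {"127.0.0.1", "localhost"}

def api_nodes_reachable(host: str) -> bool:
    """API Nodes currently require the UI to be accessed via a local host."""
    hostname = host.split(":")[0].lower()  # drop a ":port" suffix if present
    return hostname in ALLOWED_LOCAL_HOSTS
```

So `api_nodes_reachable("127.0.0.1:8188")` passes, while a LAN address such as `192.168.1.5:8188` does not.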
API Nodes require credits for API calls to closed-source models, so they do not support free usage.
Please refer to the following documentation:
1. [Comfy Account](/interface/user): Find the `User` section in the settings menu to log in.
2. [Credits](/interface/credits): After logging in, the settings interface will show the credits menu. You can purchase credits in `Settings` → `Credits`. We use a prepaid system, so there will be no unexpected charges.
3. Complete the payment through Stripe.
4. Check if the credits have been updated. If not, try restarting or refreshing the page.
Currently, we do not support refunds for credits.
If you believe there is an error resulting in unused balance due to technical issues, please [contact support](mailto:support@comfy.org).
Credits cannot go negative, so please ensure you have enough credits before making the corresponding API calls.
Please visit the [Credits](/interface/credits) menu after logging in to check the corresponding credits.
Currently, the API Nodes are still in the testing phase and do not support this feature yet, but we have considered adding it.
No, your credits do not expire.
No, your credits cannot be transferred to other users and are limited to the currently logged-in account, but we do not restrict the number of devices that can log in.
We do not limit the number of devices that can log in; you can use your account anywhere you want.
Email a request to [support@comfy.org](mailto:support@comfy.org) and we will delete your information.
# Google Gemini API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/google/gemini
This article will introduce how to use Google Gemini API nodes in ComfyUI to complete conversational functions
Google Gemini is a powerful AI model developed by Google, supporting conversational and text generation functions. Currently, ComfyUI has integrated the Google Gemini API, allowing you to directly use the related nodes in ComfyUI to complete conversational functions.
In this guide, we will walk you through completing the corresponding conversational functionality.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (desktop version updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Google Gemini Chat Workflow
### 1. Workflow File Download
Please download the Json file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
### 2. Complete the Workflow Execution Step by Step
In the corresponding template, we have built a role prompt for image analysis that interprets your images into corresponding drawing prompts.
You can refer to the numbers in the image to complete the workflow execution:
1. In the `Load Image` node, load the image you need AI to interpret
2. (Optional) If needed, you can modify the prompt in `Google Gemini` to have AI execute specific tasks
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the conversation.
4. After waiting for the API to return results, you can view the corresponding AI returned content in the `Preview Any` node.
### 3. Additional Notes
* Currently, the file input node `Gemini Input Files` requires files to be uploaded to the `ComfyUI/input/` directory first. This node is being improved, and we will update the template afterwards.
* The workflow provides an example of using `Batch Images` for input. If you have multiple images for the AI to interpret, refer to the step diagram and right-click the corresponding node to set its mode to `Always` to enable it.
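Since `Gemini Input Files` reads from `ComfyUI/input/`, you can stage files there with a small script. The `ComfyUI` root path below is an assumption; adjust it to your install:

```python
import shutil
from pathlib import Path

def stage_input_file(src: str, comfyui_root: str = "ComfyUI") -> Path:
    """Copy a file into ComfyUI's input directory so file-input nodes can see it."""
    input_dir = Path(comfyui_root) / "input"
    input_dir.mkdir(parents=True, exist_ok=True)
    dest = input_dir / Path(src).name
    shutil.copy2(src, dest)  # preserves timestamps alongside the contents
    return dest
```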
# ComfyUI Ideogram 3.0 API Node Official Examples
Source: https://docs.comfy.org/tutorials/api-nodes/ideogram/ideogram-v3
This guide covers how to use the Ideogram 3.0 API node in ComfyUI
Ideogram 3.0 is a powerful text-to-image model by Ideogram, known for its photorealistic quality, accurate text rendering, and consistent style control.
The [Ideogram V3](/built-in-nodes/api-node/image/ideogram/ideogram-v3) node currently supports two modes:
* Text-to-Image mode
* Image Editing mode (when both image and mask inputs are provided)
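The node picks its mode from which inputs are wired. A sketch of that dispatch (the error raised when only one of the two inputs is provided is our assumption, not documented node behavior):

```python
def ideogram_v3_mode(image=None, mask=None) -> str:
    """Mirror the mode selection above: editing needs both an image and a mask."""
    if image is not None and mask is not None:
        return "edit"
    if image is not None or mask is not None:
        raise ValueError("Image Editing mode requires both an image and a mask")
    return "text-to-image"
```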
## Ideogram 3.0 Node Documentation
Check the following documentation for detailed node parameter settings:
* [Ideogram V3](/built-in-nodes/api-node/image/ideogram/ideogram-v3)
## Ideogram 3.0 API Node Text-to-Image Mode
When using [Ideogram V3](/built-in-nodes/api-node/image/ideogram/ideogram-v3) without image and mask inputs, the node operates in Text-to-Image mode.
### 1. Download Workflow File
Download and drag the following file into ComfyUI to load the workflow:

### 2. Complete the Workflow Steps
Follow the numbered steps to complete the basic workflow:
1. Enter your image description in the `prompt` field of the `Ideogram V3` node
2. Click `Run` or use shortcut `Ctrl(cmd) + Enter` to generate the image
3. After the API returns results, view the generated image in the `Save Image` node. Images are saved to the `ComfyUI/output/` directory
## Ideogram 3.0 API Node Image Editing Mode
[To be updated]
# Luma Image to Image API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/luma/luma-image-to-image
This guide covers how to use the Luma Image to Image API node in ComfyUI
The [Luma Image to Image](/built-in-nodes/api-node/image/luma/luma-image-to-image) node allows you to modify existing images based on text prompts using Luma AI technology, while preserving certain features and structures from the original image.
In this guide, we'll show you how to set up an image-to-image workflow using this node.
## Luma Image to Image Node Documentation
Check the following documentation for detailed node parameter settings:
* [Luma Image to Image](/built-in-nodes/api-node/image/luma/luma-image-to-image)
## Luma Image to Image API Node Workflow
This feature works well for changing objects and shapes. However, it may not be ideal for color changes. We recommend using lower weight values, around 0.0 to 0.1.
### 1. Download Workflow File
Download and drag the following image into ComfyUI to load the workflow (workflow information is included in the image metadata):

Download this image to use as input:

### 2. Complete the Workflow Steps
Follow these numbered steps:
1. Click **Upload** on the `Load Image` node to upload your input image
2. (Optional) Modify the workflow prompts
3. (Optional) Adjust `image_weight` to change input image influence (lower values stay closer to original)
4. Click `Run` or use shortcut `Ctrl(cmd) + Enter` to generate the image
5. After API returns results, view the generated image in the `Save Image` node. Images are saved to the `ComfyUI/output/` directory
### 3. Results with Different `image_weight` Values
# Luma Image to Video API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/luma/luma-image-to-video
Learn how to use the Luma Image to Video API node in ComfyUI
The [Luma Image to Video](/built-in-nodes/api-node/video/luma/luma-image-to-video) node allows you to convert static images into smooth, dynamic videos using Luma AI's advanced technology, bringing life and motion to your images.
In this guide, we'll show you how to set up a workflow for image-to-video conversion using this node.
## Luma Image to Video Node Documentation
Check out the following documentation to learn more about the node's parameters:
* [Luma Image to Video](/built-in-nodes/api-node/video/luma/luma-image-to-video)
* Luma Concepts API node documentation
## Image to Video Workflow with Luma API Node
The Luma Image to Video node requires at least one image input (`first_image` or `last_image`) along with text prompts to determine the video's motion effects. In this guide, we've created an example using `first_image` and `luma_concepts` to showcase Luma AI's video generation capabilities.
### 1. Download the Workflow
The workflow information is included in the metadata of the video below. Download and drag it into ComfyUI to load the workflow.

Download the following image to use as input:

### 2. Follow the Workflow Steps
Follow these basic steps to run the workflow:
1. Upload your input image in the `first_image` node
2. (Optional) Write prompts in the Luma Image to Video node to describe how you want the image animated
3. (Optional) Modify the `Luma Concepts` node to control camera movement for professional cinematography
4. Click `Run` or use `Ctrl(cmd) + Enter` to generate the video
5. Once the API returns results, view the generated video in the `Save Video` node. The video will also be saved to the `ComfyUI/output/` directory
### 3. Additional Notes
* **Image Input Requirements**: At least one of `first_image` or `last_image` is required, with a maximum of 1 image per input
* **Luma Concepts**: Controls camera movement for professional video effects
* **Seed Parameter**: Only determines if the node should rerun, doesn't affect generation results
* **Enable Input Nodes**: Right-click on purple "Bypass" mode nodes and set "mode" to "always" to enable inputs
* **Model Selection**: Different video generation models have unique characteristics, adjustable via the model parameter
* **Resolution and Duration**: Adjust output video resolution and length using resolution and duration parameters
# Luma Text to Image API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/luma/luma-text-to-image
This guide explains how to use the Luma Text to Image API node in ComfyUI
The [Luma Text to Image](/built-in-nodes/api-node/image/luma/luma-text-to-image) node allows you to generate high-quality images from text prompts using Luma AI's advanced technology, capable of creating photorealistic content and artistic style images.
In this guide, we'll show you how to set up workflows using this node for text-to-image generation.
## Luma Text to Image Node Documentation
You can refer to the following documentation for detailed parameter settings:
* [Luma Text to Image](/built-in-nodes/api-node/image/luma/luma-text-to-image)
* Luma Reference API node documentation
## Luma Text to Image API Node Workflow
When the `Luma Text to Image` node is used without any image inputs, it functions as a text-to-image workflow. In this guide, we've created examples using `style_image` and `image_luma_ref` to showcase Luma AI's excellent image processing capabilities.
### 1. Download Workflow Files
The workflow information is included in the metadata of the image below. Download and drag it into ComfyUI to load the workflow.

Please download these images for input:


### 2. Follow Steps to Run the Workflow
Follow the numbered steps in the image to complete the basic workflow:
1. Upload the reference image in the `Load image` node
2. Upload the style reference image in the `Load image (renamed to styleref)` node
3. (Optional) Modify the prompts in the `Luma Text to Image` node
4. (Optional) Adjust the `style_image_weight` to control the style reference image's influence
5. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to generate the image
6. After the API returns results, view the generated image in the `Save Image` node. Images are saved to the `ComfyUI/output/` directory
### 3. Additional Notes
* The [node](/built-in-nodes/api-node/image/luma/luma-text-to-image) allows up to 4 reference images and character references simultaneously.
* To enable multiple image inputs, right-click on the purple "Bypassed" nodes and set their `mode` to `always`
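Enabling a bypassed node can also be done by editing the exported workflow JSON directly. In exported workflows, bypassed nodes typically carry `"mode": 4` and active ones `"mode": 0`; treat those values as an assumption that may change between ComfyUI versions. A sketch:

```python
ALWAYS, BYPASS = 0, 4  # LiteGraph node modes commonly seen in exported workflows

def enable_node(workflow: dict, node_title: str) -> dict:
    """Set a bypassed node back to 'always' by its title (in-place)."""
    for node in workflow.get("nodes", []):
        if node.get("title") == node_title and node.get("mode") == BYPASS:
            node["mode"] = ALWAYS
    return workflow
```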
# Luma Text to Video API Node ComfyUI Official Guide
Source: https://docs.comfy.org/tutorials/api-nodes/luma/luma-text-to-video
Learn how to use the Luma Text to Video API node in ComfyUI
The [Luma Text to Video](/built-in-nodes/api-node/video/luma/luma-text-to-video) node allows you to create high-quality, smooth videos from text descriptions using Luma AI's innovative video generation technology.
In this guide, we'll show you how to set up a text-to-video workflow using this node.
## Luma Text to Video Node Documentation
Check out the following documentation to learn more about the node parameters:
* [Luma Text to Video](/built-in-nodes/api-node/video/luma/luma-text-to-video)
* Luma Concepts API node documentation
## Text to Video Workflow with Luma API Node
The Luma Text to Video node requires text prompts to describe the video content. In this guide, we've created examples using `prompt` and `luma_concepts` to showcase Luma AI's excellent video generation capabilities.
### 1. Download the Workflow
The workflow information is included in the metadata of the video below. Download and drag it into ComfyUI to load the workflow.

### 2. Follow the Steps
Follow these basic steps to run the workflow:
1. Write your prompt in the `Luma Text to Video` node to describe the video content you want
2. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to generate the video
3. After the API returns results, you can view the generated video in the `Save Video` node. The video will also be saved to the `ComfyUI/output/` directory
> (Optional) Modify the `Luma Concepts` node to control camera movements and add professional cinematography
### 3. Additional Notes
* **Writing Prompts**: Describe scenes, subjects, actions, and mood in detail for best results
* **Luma Concepts**: Mainly used for camera control to create professional video shots
* **Seed Parameter**: Only determines if the node should rerun, doesn't affect generation results
* **Model Selection**: Different video models have different features, adjustable via the model parameter
* **Resolution and Duration**: Adjust output video resolution and length using these parameters
* **Ray 1.6 Model Note**: Duration and resolution parameters don't work when using the Ray 1.6 model
# Moonvalley API Node Official ComfyUI Example
Source: https://docs.comfy.org/tutorials/api-nodes/moonvalley/moonvalley-video-generation
This article introduces how to use Moonvalley API nodes for text-to-video, image-to-video, and video-to-video capabilities in ComfyUI.
Moonvalley Marey Realism v1.5 is an AI video generation model designed for cinematic-level creation. The model is **trained entirely with commercially licensed content**, ensuring **copyright compliance and commercial safety**.
## Product Highlights
* Exceptional prompt comprehension: Accurately interprets complex prompt instructions.
* Native 1080p HD quality: The training dataset is based on **1080P** videos, resulting in fine and detailed output.
* Realistic physics and dynamic performance: Precisely simulates physical motion models and natural dynamics, delivering professional-grade realism.
* Complex scene layering and advanced lighting effects: Supports foreground, midground, and background layering in complex scenes, with intelligent spatial relationship understanding.
* Production-level control features such as motion and pose transfer: Automatically generates realistic lighting for composite scenes.
Currently, Moonvalley-related API nodes are natively supported in ComfyUI. You can use the corresponding text-to-video, image-to-video, and video-to-video capabilities directly in ComfyUI.
## Moonvalley Text-to-Video Workflow
### 1. Download the Workflow File
Download the workflow file in JSON format
### 2. Follow the Steps to Run the Workflow
1. Enter the positive prompt (content you want to appear in the video)
2. Enter the negative prompt (content you do not want to appear in the video)
3. Modify the video output resolution
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to start video generation
5. After the API returns the result, you can view the generated video in the `Save Video` node. The video will also be saved in the `ComfyUI/output/` directory
## Moonvalley Image-to-Video Workflow
### 1. Download the Workflow File
Download the workflow file in JSON format
Download the image below as the input image

### 2. Follow the Steps to Run the Workflow
1. Load the input image in the `Load Image` node
2. Enter the positive prompt (content you want to appear in the video)
3. Enter the negative prompt (content you do not want to appear in the video)
4. Modify the video output resolution
5. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to start video generation
6. After the API returns the result, you can view the generated video in the `Save Video` node. The video will also be saved in the `ComfyUI/output/` directory
## Moonvalley Video-to-Video Workflow
The `Moonvalley Marey Video to Video` node allows you to input a reference video for video re-drawing. You can use the reference video's motion or character poses for video generation.
### 1. Download the Workflow File
Download the workflow file in JSON format
Download the video below as the input video:
### 2. Follow the Steps to Run the Workflow
1. Load the reference video (or your own material) in the `Load Video` node
* If the final video duration is 5s, the input video must be longer than 5s
* If the final video duration is 10s, the input video must be longer than 10s
2. Enter the positive prompt (content you want to appear in the video)
3. Enter the negative prompt (content you do not want to appear in the video)
4. Set the `length` parameter to specify the video duration
5. Set the `control_type` parameter to choose the reference type for video re-drawing
* `Motion Transfer`: Generate based on the motion in the reference video
* `Pose Transfer`: Generate based on the character poses in the reference video
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to start video generation
7. After the API returns the result, you can view the generated video in the `Save Video` node. The video will also be saved in the `ComfyUI/output/` directory
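The input-length rule above (the source video must be strictly longer than the requested output) can be checked up front. A sketch; the supported lengths are taken from the 5s/10s cases described in step 1:

```python
SUPPORTED_LENGTHS_S = (5, 10)  # assumed from the workflow's `length` options above

def validate_v2v_input(input_duration_s: float, target_length_s: int) -> None:
    """Raise ValueError if the reference video is too short for the target length."""
    if target_length_s not in SUPPORTED_LENGTHS_S:
        raise ValueError(f"unsupported target length: {target_length_s}s")
    if input_duration_s <= target_length_s:
        raise ValueError(
            f"input video ({input_duration_s}s) must be longer than {target_length_s}s"
        )
```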
# OpenAI Chat API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/openai/chat
This article will introduce how to use OpenAI Chat API nodes in ComfyUI to complete conversational functions
OpenAI is a company focused on generative AI, providing powerful conversational capabilities. Currently, ComfyUI has integrated the OpenAI API, allowing you to directly use the related nodes in ComfyUI to complete conversational functions.
In this guide, we will walk you through completing the corresponding conversational functionality.
## OpenAI Chat Workflow
### 1. Workflow File Download
Please download the Json file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
### 2. Complete the Workflow Execution Step by Step
In the corresponding template, we have built a role setting for analyzing images and generating prompts.
You can refer to the numbers in the image to complete the workflow execution:
1. In the `Load Image` node, load the image you need AI to interpret
2. (Optional) If needed, you can modify the settings in `OpenAI Chat Advanced Options` to have AI execute specific tasks
3. In the `OpenAI Chat` node, you can modify `Prompt` to set the conversation prompt, or modify `model` to select different models
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the conversation.
5. After waiting for the API to return results, you can view the corresponding AI returned content in the `Preview Any` node.
### 3. Additional Notes
* Currently, the file input node `OpenAI Chat Input Files` requires files to be uploaded to the `ComfyUI/input/` directory first. This node is being improved, and we will update the template afterwards.
* The workflow provides an example of using `Batch Images` for input. If you have multiple images for the AI to interpret, refer to the step diagram and right-click the corresponding node to set its mode to `Always` to enable it.
# OpenAI DALL·E 2 Node
Source: https://docs.comfy.org/tutorials/api-nodes/openai/dall-e-2
Learn how to use the OpenAI DALL·E 2 API node to generate images in ComfyUI
OpenAI DALL·E 2 is part of the ComfyUI API Nodes series, allowing users to generate images through OpenAI's **DALL·E 2** model.
This node supports:
* Text-to-image generation
* Image editing functionality (inpainting through masks)
## Node Overview
The **OpenAI DALL·E 2** node generates images synchronously through OpenAI's image generation API. It receives text prompts and returns images that match the description.
## Parameter Description
### Required Parameters
| Parameter | Description |
| --------- | ------------------------------------------------------------- |
| `prompt` | Text prompt describing the image content you want to generate |
### Widget Parameters
| Parameter | Description | Options/Range | Default Value |
| --------- | -------------------------------------------------------------------------- | --------------------------------- | ------------- |
| `seed` | Seed value for image generation (currently not implemented in the backend) | 0 to 2^31-1 | 0 |
| `size` | Output image dimensions | "256x256", "512x512", "1024x1024" | "1024x1024" |
| `n` | Number of images to generate | 1 to 8 | 1 |
### Optional Parameters
| Parameter | Description | Options/Range | Default Value |
| --------- | ------------------------------------------ | --------------- | ------------- |
| `image` | Optional reference image for image editing | Any image input | None |
| `mask` | Optional mask for local inpainting | Mask input | None |
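The required and widget parameters above correspond to fields in OpenAI's image generation request. Purely as an illustration (the helper below is hypothetical, not ComfyUI internals; the node assembles the real request for you), the widget constraints could be validated like this:

```python
# Hypothetical helper illustrating the DALL·E 2 widget constraints above.
# This is NOT ComfyUI code -- the node performs this work internally.

VALID_SIZES = {"256x256", "512x512", "1024x1024"}

def build_dalle2_request(prompt: str, size: str = "1024x1024", n: int = 1) -> dict:
    """Validate widget values and return a request-body dict."""
    if not prompt:
        raise ValueError("prompt is required")
    if size not in VALID_SIZES:
        raise ValueError(f"size must be one of {sorted(VALID_SIZES)}")
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")
    return {"model": "dall-e-2", "prompt": prompt, "size": size, "n": n}
```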
## Workflow Examples
This API node currently supports two workflows:
* Text to Image
* Inpainting
The image-to-image workflow is not supported.
### Text to Image Example
The image below contains a simple text-to-image workflow. Please download the corresponding image and drag it into ComfyUI to load the workflow.

The corresponding workflow is very simple:
You only need to load the `OpenAI DALL·E 2` node, input the description of the image you want to generate in the `prompt` node, connect a `Save Image` node, and then run the workflow.
### Inpainting Workflow
DALL·E 2 supports image editing functionality, allowing you to use a mask to specify the area to be replaced. Below is a simple inpainting workflow example:
#### 1. Workflow File Download
Download the image below and drag it into ComfyUI to load the corresponding workflow.

We will use the image below as input:

#### 2. Workflow File Usage Instructions
This workflow is relatively simple; if you want to build it manually yourself, follow the steps below:
1. Use the `Load Image` node to load the image
2. Right-click on the load image node and select `MaskEditor`
3. In the mask editor, use the brush to draw the area you want to redraw
4. Connect the loaded image to the `image` input of the **OpenAI DALL·E 2** node
5. Connect the mask to the `mask` input of the **OpenAI DALL·E 2** node
6. Edit the prompt in the `prompt` node
7. Run the workflow
**Notes**
* If you want to use the image editing functionality, you must provide both an image and a mask (both are required)
* The mask and image must be the same size
* When inputting large images, the node will automatically resize the image to an appropriate size
* The URLs returned by the API are only valid for a short period, so please save the results promptly
* Each generation consumes credits, charged according to image size and quantity
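The first two notes can be expressed as a simple pre-flight check. This is an illustrative sketch (the function is hypothetical, not the node's actual validation code):

```python
# Hypothetical pre-flight check mirroring the notes above:
# editing requires BOTH an image and a mask, and their sizes must match.

def check_edit_inputs(image_size, mask_size):
    """Raise ValueError if the image/mask pair is unusable for editing."""
    if image_size is None or mask_size is None:
        raise ValueError("image editing requires both an image and a mask")
    if image_size != mask_size:
        raise ValueError(f"mask {mask_size} must match image {image_size}")
```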
## FAQs
Please update your ComfyUI to the latest version (the latest commit or the latest [desktop version](https://www.comfy.org/download)).
We may add more API support in the future, and the corresponding nodes will be updated, so please keep your ComfyUI up to date.
Please note that you need to distinguish between the nightly version and the release version.
In some cases, the latest `release` version may not be updated in time compared to the `nightly` version.
Since we are still iterating quickly, please ensure you are using the latest version when you cannot find the corresponding node.
API access requires a secure network environment. If an API node fails to run, check the following:
* The local network only allows access from `127.0.0.1` or `localhost`, which may mean that you cannot use the API Nodes in a ComfyUI service started with the `--listen` parameter in a LAN environment.
* You must be able to reach our API service normally (a proxy service may be required in some regions).
* Your account must have enough [credits](/interface/credits).
API Nodes require credits for API calls to closed-source models, so they do not support free usage.
Please refer to the following documentation:
1. [Comfy Account](/interface/user): Find the `User` section in the settings menu to log in.
2. [Credits](/interface/credits): After logging in, the settings interface will show the credits menu. You can purchase credits in `Settings` → `Credits`. We use a prepaid system, so there will be no unexpected charges.
3. Complete the payment through Stripe.
4. Check if the credits have been updated. If not, try restarting or refreshing the page.
Currently, we do not support refunds for credits.
If you believe there is an error resulting in unused balance due to technical issues, please [contact support](mailto:support@comfy.org).
Credits cannot go negative, so please ensure you have enough credits before making the corresponding API calls.
Please visit the [Credits](/interface/credits) menu after logging in to check the corresponding credits.
Currently, the API Nodes are still in the testing phase and do not support this feature yet, but we have considered adding it.
No, your credits do not expire.
No, your credits cannot be transferred to other users and are limited to the currently logged-in account, but we do not restrict the number of devices that can log in.
We do not limit the number of devices that can log in; you can use your account anywhere you want.
Email a request to [support@comfy.org](mailto:support@comfy.org) and we will delete your information.
# OpenAI DALL·E 3 Node
Source: https://docs.comfy.org/tutorials/api-nodes/openai/dall-e-3
Learn how to use the OpenAI DALL·E 3 API node to generate images in ComfyUI
OpenAI DALL·E 3 is part of the ComfyUI API Nodes series, allowing users to generate images through OpenAI's **DALL·E 3** model. This node supports text-to-image generation functionality.
## Node Overview
DALL·E 3 is OpenAI's latest image generation model, capable of creating detailed and high-quality images based on text prompts. Through this node in ComfyUI, you can directly access DALL·E 3's generation capabilities without leaving the ComfyUI interface.
The **OpenAI DALL·E 3** node generates images synchronously through OpenAI's image generation API. It receives text prompts and returns images that match the description.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
## Parameter Details
### Required Parameters
| Parameter | Type | Description |
| --------- | ---- | ---------------------------------------------------------------------------------------------------------------------------- |
| prompt | Text | Text prompt for generating images. Supports multi-line input; you can describe the image content you want to generate in detail. |
### Widget Parameters
| Parameter | Type | Options | Default Value | Description |
| --------- | ------- | ------------------------------- | ------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| seed | Integer | 0-2147483647 | 0 | Random seed used to control the generation result |
| quality | Option | standard, hd | standard | Image quality setting. The "hd" option generates higher quality images but may require more computational resources |
| style | Option | natural, vivid | natural | Image style. "Vivid" tends to generate hyperrealistic and dramatic images, while "natural" produces more natural, less exaggerated images |
| size | Option | 1024x1024, 1024x1792, 1792x1024 | 1024x1024 | Size of the generated image. You can choose square or rectangular images in different orientations |
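As a sketch of how these widget options constrain the request (a hypothetical helper, not node internals; the node exposes no `n` widget, so the sketch fixes the image count at 1):

```python
# Hypothetical validation sketch for the DALL·E 3 widgets above.
# Not ComfyUI code -- the node builds the real request for you.

DALLE3_OPTIONS = {
    "quality": {"standard", "hd"},
    "style": {"natural", "vivid"},
    "size": {"1024x1024", "1024x1792", "1792x1024"},
}

def build_dalle3_request(prompt: str, quality: str = "standard",
                         style: str = "natural", size: str = "1024x1024") -> dict:
    """Validate widget values and return a request-body dict (n fixed at 1)."""
    for name, value in (("quality", quality), ("style", style), ("size", size)):
        if value not in DALLE3_OPTIONS[name]:
            raise ValueError(f"{name} must be one of {sorted(DALLE3_OPTIONS[name])}")
    return {"model": "dall-e-3", "prompt": prompt,
            "quality": quality, "style": style, "size": size, "n": 1}
```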
## Usage Examples
You can download the image below and drag it into ComfyUI to load the corresponding workflow.

Since the corresponding workflow is very simple, you can also directly add the **OpenAI DALL·E 3** node in ComfyUI, enter the description of the image you want to generate, and run the workflow:
1. Add the **OpenAI DALL·E 3** node in ComfyUI
2. Enter the description of the image you want to generate in the prompt text box
3. Adjust optional parameters as needed (quality, style, size, etc.)
4. Run the workflow to generate the image
# OpenAI GPT-Image-1 Node
Source: https://docs.comfy.org/tutorials/api-nodes/openai/gpt-image-1
Learn how to use the OpenAI GPT-Image-1 API node to generate images in ComfyUI
OpenAI GPT-Image-1 is part of the ComfyUI API nodes series that allows users to generate images through OpenAI's **GPT-Image-1** model. This is the same model used for image generation in ChatGPT 4o.
This node supports:
* Text-to-image generation
* Image editing functionality (inpainting through masks)
## Node Overview
The **OpenAI GPT-Image-1** node synchronously generates images through OpenAI's image generation API. It receives text prompts and returns images matching the description. GPT-Image-1 is OpenAI's most advanced image generation model currently available, capable of creating highly detailed and realistic images.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
## Parameter Description
### Required Parameters
| Parameter | Type | Description |
| --------- | ---- | ------------------------------------------------------------- |
| `prompt` | Text | Text prompt describing the image content you want to generate |
### Widget Parameters
| Parameter | Type | Options | Default | Description |
| ------------ | ------- | ------------------------------------- | ------- | ------------------------------------------------------- |
| `seed` | Integer | 0-2147483647 | 0 | Random seed used to control generation results |
| `quality` | Option | low, medium, high | low | Image quality setting, affects cost and generation time |
| `background` | Option | opaque, transparent | opaque | Whether the returned image has a background |
| `size` | Option | auto, 1024x1024, 1024x1536, 1536x1024 | auto | Size of the generated image |
| `n` | Integer | 1-8 | 1 | Number of images to generate |
### Optional Parameters
| Parameter | Type | Options | Default | Description |
| --------- | ----- | --------------- | ------- | ----------------------------------------------------------- |
| `image` | Image | Any image input | None | Optional reference image for image editing |
| `mask` | Mask | Mask input | None | Optional mask for inpainting (white areas will be replaced) |
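The widget constraints above can be sketched the same way as an illustration (the helper is hypothetical, not part of the node):

```python
# Hypothetical validation sketch for the GPT-Image-1 widgets above.
# Not ComfyUI code -- the node assembles the real request internally.

GPT_IMAGE_1_OPTIONS = {
    "quality": {"low", "medium", "high"},
    "background": {"opaque", "transparent"},
    "size": {"auto", "1024x1024", "1024x1536", "1536x1024"},
}

def build_gpt_image_1_request(prompt: str, quality: str = "low",
                              background: str = "opaque",
                              size: str = "auto", n: int = 1) -> dict:
    """Validate widget values and return a request-body dict."""
    for name, value in (("quality", quality), ("background", background),
                        ("size", size)):
        if value not in GPT_IMAGE_1_OPTIONS[name]:
            raise ValueError(
                f"{name} must be one of {sorted(GPT_IMAGE_1_OPTIONS[name])}")
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")
    return {"model": "gpt-image-1", "prompt": prompt, "quality": quality,
            "background": background, "size": size, "n": n}
```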
## Usage Examples
### Text-to-Image Example
The image below contains a simple text-to-image workflow. Please download the image and drag it into ComfyUI to load the corresponding workflow.

The corresponding workflow is very simple:
You only need to load the `OpenAI GPT-Image-1` node, input the description of the image you want to generate in the `prompt` node, connect a `Save Image` node, and then run the workflow.
### Image-to-Image Example
The image below contains a simple image-to-image workflow. Please download the image and drag it into ComfyUI to load the corresponding workflow.

We will use the image below as input:

In this workflow, we use the `OpenAI GPT-Image-1` node to generate images and the `Load Image` node to load the input image, then connect it to the `image` input of the `OpenAI GPT-Image-1` node.
### Multiple Image Input Example
Please download the image below and drag it into ComfyUI to load the corresponding workflow.

Use the hat image below as an additional input image.

In the corresponding workflow, the `Batch Images` node is used to load multiple images into the `OpenAI GPT-Image-1` node.
### Inpainting Workflow
GPT-Image-1 also supports image editing functionality, allowing you to specify areas to replace using a mask. Below is a simple inpainting workflow example:
Download the image below and drag it into ComfyUI to load the corresponding workflow. We will continue to use the input image from the image-to-image workflow section.

Compared to the image-to-image workflow, this workflow opens the MaskEditor from the `Load Image` node's right-click menu to draw a mask, then connects the mask to the `mask` input of the `OpenAI GPT-Image-1` node.
**Notes**
* The mask and image must be the same size
* When inputting large images, the node will automatically resize the image to an appropriate size
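The automatic-resize note above can be illustrated with a simple aspect-preserving fit. The 1536-pixel cap here is an assumption for the example, not the node's documented limit:

```python
def fit_within(width: int, height: int, max_side: int = 1536) -> tuple:
    """Aspect-preserving downscale sketch. The real node resizes large
    inputs automatically; the max_side default is an illustrative
    assumption, not the node's actual limit."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)
```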
# API Nodes
Source: https://docs.comfy.org/tutorials/api-nodes/overview
In this article, we will introduce ComfyUI's API Nodes and related information.
API Nodes are ComfyUI's new way of calling closed-source models through API requests, providing ComfyUI users with access to external state-of-the-art AI models without complex API key setup.
## What are API Nodes?
API Nodes are a set of special nodes that connect to external API services, allowing you to use closed-source or third-party hosted AI models directly in your ComfyUI workflows. These nodes are designed to seamlessly integrate the capabilities of external models while maintaining the open-source nature of ComfyUI's core.
Currently supported models include:
* **Black Forest Labs**: Flux 1.1 \[pro] Ultra, Flux.1 \[pro], Flux.1 Kontext Pro, Flux.1 Kontext Max
* **Google**: Veo2, Gemini 2.5 Pro, Gemini 2.5 Flash
* **Ideogram**: V3, V2, V1
* **Kling**: 2.0, 1.6, 1.5 & Various Effects
* **Luma**: Photon, Ray2, Ray1.6
* **MiniMax**: Text-to-Video, Image-to-Video
* **OpenAI**: o1, o1-pro, o3, gpt-4o, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, DALL·E 2, DALL·E 3, GPT-Image-1
* **PixVerse**: V4 & Effects
* **Pika**: 2.2
* **Recraft**: V3, V2 & Various Tools
* **Rodin**: 3D Generation
* **Stability AI**: Stable Image Ultra, Stable Diffusion 3.5 Large, Image Upscale
* **Tripo**: v1-4, v2.0, v2.5
## Prerequisites for Using API Nodes
To use API Nodes, the following requirements must be met:
### 1. ComfyUI Version Requirements
Please update your ComfyUI to the latest version. We may add more API support in the future, and the corresponding nodes will be updated, so please keep your ComfyUI up to date.
Note the distinction between the nightly (development) and release (stable) versions. We recommend the `nightly` version (the latest code commit), as the release version may not be updated in a timely manner. Since we are still iterating quickly, this document may also lag behind, so please pay attention to version differences.
### 2. Account and Credits Requirements
You need to be logged in to ComfyUI with a [Comfy account](/interface/user) and have a [credits](/interface/credits) balance greater than 0.
Log in via `Settings` → `User`.
Go to `Settings` → `Credits` to purchase credits.
Please refer to the corresponding documentation for account and credits to ensure this requirement:
* [Comfy account](/interface/user): Find the `User` section in the settings menu to log in.
* [Credits](/interface/credits): After logging in, the settings interface will show a credits menu where you can purchase credits. We use a prepaid system, so there will be no unexpected charges.
### 3. Network Environment Requirements
API access requires that your current requests are based on a secure network environment. The current requirements for API access are as follows:
* The local network only allows access from `127.0.0.1` or `localhost`, and you can directly use the login function.
* If you are accessing from a local area network or a website that is not on the whitelist, please log in with an API Key. Please refer to [Log in with an API Key](/interface/user#logging-in-with-an-api-key).
* You should be able to access our API service normally (in some regions, you may need to use a proxy service).
* Access should be carried out in an `https` environment to ensure the security of the requests.
Accessing in an insecure context poses significant risks, which may result in the following consequences:
1. Authentication may be stolen, leading to the leakage of your account information.
2. Your account may be maliciously used, resulting in financial losses.
Even if we open this restriction in the future, we strongly advise against accessing API services through insecure network requests due to the high risks involved.
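The access rules above can be sketched as a small check. The function and its logic below are illustrative assumptions, not the actual implementation:

```python
# Illustrative sketch of the network rules above: local hosts may use the
# normal login flow; any other origin must at least be served over https
# (and, per the docs, non-whitelisted sites should log in with an API Key).
from urllib.parse import urlparse

LOCAL_HOSTS = {"127.0.0.1", "localhost"}

def login_allowed(url: str) -> bool:
    """Return True if the origin satisfies the sketched access rules."""
    parts = urlparse(url)
    if parts.hostname in LOCAL_HOSTS:
        return True
    return parts.scheme == "https"
```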
### 4. Using the Corresponding Nodes
**Add to Workflow**: Add the API node to your workflow just like you would with other nodes.
**Run**: Set the parameters and then run the workflow.
## Log in with API Key on non-whitelisted websites
Currently, we maintain a whitelist restricting the websites from which you can log in to your ComfyUI account.
If you need to log in on a non-whitelisted website, log in with an API Key instead; in that case the website does not need to be on our whitelist. Please refer to [Logging in with an API Key](/interface/user#logging-in-with-an-api-key).
## Use ComfyUI API Key Integration to call paid model API nodes
Currently, we support accessing our services through ComfyUI API Key Integration to call paid model API nodes. Please refer to the API Key Integration section to learn how.
## Advantages of API Nodes
API Nodes provide several important advantages for ComfyUI users:
* **Access to closed-source models**: Use state-of-the-art AI models without having to deploy them yourself
* **Seamless integration**: API nodes are fully compatible with other ComfyUI nodes and can be combined to create complex workflows
* **Simplified experience**: No need to manage API keys or handle complex API requests
* **Controlled costs**: The prepaid system ensures you have complete control over your spending with no unexpected charges
## Pricing
Please refer to the [Pricing](/tutorials/api-nodes/pricing) page for current API pricing.
## About Open Source and Opt-in
It's important to note that **API Nodes are completely optional**. ComfyUI will always remain fully open-source and free for local users. API nodes are designed as an "opt-in" feature, providing convenience for those who want access to external SOTA (state-of-the-art) models.
## Use Cases
A powerful application of API Nodes is combining the output of external models with local nodes. For example:
* Using GPT-Image-1 to generate a base image, then transforming it into video with a local `wan` node
* Combining externally generated images with local upscaling or style transfer nodes
* Creating hybrid workflows that leverage the advantages of both closed-source and open-source models
This flexibility makes ComfyUI a truly universal generative AI interface, integrating various AI capabilities into a unified workflow and opening up more possibilities.
# Pricing
Source: https://docs.comfy.org/tutorials/api-nodes/pricing
This article lists the pricing of the current API Nodes.
The following table lists the pricing of the current API Nodes. All prices are in USD.
## BFL
| Model (Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| --------------------------- | -------- | ---------------------------- | --------------- | -------- |
| Flux 1.1 \[pro] Ultra Image | Image | NA | NA | 0.06 |
| Flux.1 Canny Control Image | Image | NA | NA | 0.05 |
| Flux.1 Depth Control Image | Image | NA | NA | 0.05 |
| Flux.1 Expand Image | Image | NA | NA | 0.05 |
| Flux.1 Fill Image | Image | NA | NA | 0.05 |
| Flux.1 Kontext \[max] Image | Image | NA | NA | 0.08 |
| Flux.1 Kontext \[pro] Image | Image | NA | NA | 0.04 |
| Flux.1 Kontext \[pro] Image | Image | NA | NA | 0.05 |
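Since these prices are fixed per call, estimating a workflow's cost is a simple lookup and multiply. The helper below is illustrative and copies a subset of the unambiguous prices from the table above:

```python
# Prices (USD per call) copied from the BFL table above.
BFL_PRICES = {
    "Flux 1.1 [pro] Ultra Image": 0.06,
    "Flux.1 Canny Control Image": 0.05,
    "Flux.1 Depth Control Image": 0.05,
    "Flux.1 Expand Image": 0.05,
    "Flux.1 Fill Image": 0.05,
    "Flux.1 Kontext [max] Image": 0.08,
}

def estimate_cost(node: str, calls: int = 1) -> float:
    """Estimated USD cost for `calls` invocations of a fixed-price node."""
    return round(BFL_PRICES[node] * calls, 4)
```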
## Kling
| Model (Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------------------------ | -------- | ---------------------------- | ---------------------------------------------- | -------- |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1-5, 1, image to image | 0.028 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1-5, 1, text to image | 0.014 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1, 1, image to image | 0.0035 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1, 1, text to image | 0.0035 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v2, 1, text to image | 0.014 |
| Kling Virtual Try On | Image | NA | NA | 0.07 |
| Kling Text to Video (Camera Control) | Video | NA | NA | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, pro, 5 | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, pro, 10 | 0.98 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, std, 5 | 0.28 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, std, 10 | 0.56 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, pro, 5 | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, pro, 10 | 0.98 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, std, 5 | 0.28 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, std, 10 | 0.56 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, pro, 5 | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, pro, 10 | 0.98 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, std, 5 | 0.14 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, std, 10 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, pro, 5 | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, pro, 10 | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, std, 5 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, std, 10 | 0.56 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, pro, 5 | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, pro, 10 | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, std, 5 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, std, 10 | 0.56 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, pro, 5 | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, pro, 10 | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, std, 5 | 0.14 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, std, 10 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1-master, 5s | 1.4 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1-master, 10s | 2.8 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, pro, 5s | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, pro, 10s | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, std, 5s | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, std, 10s | 0.56 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, pro, 5s | 1.4 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, pro, 10s | 2.8 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, std, 5s | 1.4 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, std, 10s | 2.8 |
| Kling Lip Sync Video with Audio | Video | output\_length | 5s | 0.07 |
| Kling Lip Sync Video with Audio | Video | output\_length | 10s | 0.14 |
| Kling Lip Sync Video with Text | Video | output\_length | 5s | 0.07 |
| Kling Lip Sync Video with Text | Video | output\_length | 10s | 0.14 |
| Kling Start-End Frame to Video | Video | mode | pro mode / 5s duration / kling-v1 | 0.49 |
| Kling Start-End Frame to Video | Video | mode | pro mode / 5s duration / kling-v1-5 | 0.49 |
| Kling Start-End Frame to Video | Video | mode | pro mode / 5s duration / kling-v1-6 | 0.49 |
| Kling Start-End Frame to Video | Video | mode | pro mode / 10s duration / kling-v1-5 | 0.98 |
| Kling Start-End Frame to Video | Video | mode | pro mode / 10s duration / kling-v1-6 | 0.98 |
| Kling Start-End Frame to Video | Video | mode | standard mode / 5s duration / kling-v1 | 0.14 |
| Kling Text to Video | Video | mode | 5s duration / kling-v2-1-master | 1.4 |
| Kling Text to Video | Video | mode | 10s duration / kling-v2-1-master | 2.8 |
| Kling Text to Video | Video | mode | pro mode / 5s duration / kling-v1 | 0.49 |
| Kling Text to Video | Video | mode | pro mode / 5s duration / kling-v2-master | 1.4 |
| Kling Text to Video | Video | mode | pro mode / 10s duration / kling-v1 | 0.98 |
| Kling Text to Video | Video | mode | pro mode / 10s duration / kling-v2-master | 2.8 |
| Kling Text to Video | Video | mode | standard mode / 5s duration / kling-v1 | 0.14 |
| Kling Text to Video | Video | mode | standard mode / 5s duration / kling-v1-6 | 0.28 |
| Kling Text to Video | Video | mode | standard mode / 5s duration / kling-v2-master | 1.4 |
| Kling Text to Video | Video | mode | standard mode / 10s duration / kling-v1 | 0.28 |
| Kling Text to Video | Video | mode | standard mode / 10s duration / kling-v1-6 | 0.56 |
| Kling Text to Video | Video | mode | standard mode / 10s duration / kling-v2-master | 2.8 |
| Kling Video Effects | Video | duration, effect\_scene | dizzydizzy or bloombloom, 5 | 0.49 |
| Kling Video Effects | Video | duration, effect\_scene | fuzzyfuzzy or squish or expansion, 5 | 0.28 |
| Kling Video Extend | Video | NA | NA | 0.28 |
## Luma
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------- | -------- | ---------------------------- | ---------------------- | -------- |
| Luma Text to Image | Image | model | photon-flash-1 | 0.0019 |
| Luma Image to Image | Image | model | photon-1 | 0.0073 |
| Luma Image to Video | Video | duration, model, resolution | ray-1-6, 720p, 5s | 0.35 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 4k, 5s | 6.37 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 4k, 9s | 11.47 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 540p, 5s | 0.4 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 540p, 9s | 0.72 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 720p, 5s | 0.71 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 720p, 9s | 1.27 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 1080p, 5s | 1.59 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 1080p, 9s | 2.87 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 4k, 5s | 2.19 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 4k, 9s | 3.94 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 540p, 5s | 0.14 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 540p, 9s | 0.25 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 720p, 5s | 0.24 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 720p, 9s | 0.44 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 5s | 0.55 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 9s | 0.99 |
| Luma Text to Video | Video | duration, model, resolution | ray-1-6, 720p, 5s | 0.35 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 4k, 5s | 6.37 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 4k, 9s | 11.47 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 540p, 5s | 0.4 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 540p, 9s | 0.72 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 720p, 5s | 0.71 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 720p, 9s | 1.27 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 1080p, 5s | 1.59 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 1080p, 9s | 2.87 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 4k, 5s | 2.19 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 4k, 9s | 3.94 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 540p, 5s | 0.14 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 540p, 9s | 0.25 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 720p, 5s | 0.24 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 720p, 9s | 0.44 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 5s | 0.55 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 9s | 0.99 |
## Google
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ----------------------------- | -------- | ---------------------------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-fast-generate-001, false | 2 |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-fast-generate-001, true | 3.2 |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-generate-001, false | 4 |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-generate-001, true | 6 |
| Google Veo2 Video Generation | Video | duration | 5 | 2.5 |
| Google Veo2 Video Generation | Video | duration | 8 | 4 |
| Google Gemini | Text | model | gemini-2.5-flash-preview-04-17 | $1.25/1M input tokens + $10/1M output tokens (\< 200K tokens) |
| Google Gemini | Text | model | gemini-2.5-pro-preview-05-06 | $0.16/1M input tokens + $0.6/1M output tokens + \$1/1M input audio tokens (\< 200K tokens) |
| Google Gemini | Text | model | gemini-2.5-flash | $0.3/1M input tokens (text, img, vid) + $2.5/1M output text tokens + \$1/1M input audio tokens (\< 200K tokens) |
| Google Gemini | Text | model | gemini-2.5-pro | $1.25/1M input tokens (text, img, vid) + $10/1M output tokens (\< 200K total tokens) |
| Google Gemini | Text | model | gemini-2.5-pro-preview-05-06 | $1.25/1M input tokens(text, img, vid) + $10/1M output tokens (\< 200K total tokens) |
| Google Gemini Image | Image | model | gemini-2.5-flash-image-preview | $0.3/1M input tokens (text, video, image) + $1/1M input tokens (audio) + $2.5/1M output tokens (text) + $30/1M output tokens (image) |
## Minimax
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ---------------------- | -------- | ---------------------------- | ---------------------------------- | -------- |
| Minimax Text to Video | Video | model | Hailuo-02 / 768P / 6s | 0.28 |
| Minimax Text to Video | Video | model | Hailuo-02 / 768P / 10s | 0.56 |
| Minimax Text to Video | Video | model | Hailuo-02 / 1080P / 6s | 0.49 |
| Minimax Text to Video | Video | model | T2V-01 | 0.43 |
| Minimax Text to Video | Video | model | T2V-01-Director | 0.43 |
| Minimax Image to Video | Video | model | Hailuo-02 / 768P / 6s | 0.28 |
| Minimax Image to Video | Video | model | Hailuo-02 / 768P / 10s | 0.56 |
| Minimax Image to Video | Video | model | Hailuo-02 / 1080P / 6s | 0.49 |
| Minimax Image to Video | Video | model | I2V-01 | 0.43 |
| Minimax Image to Video | Video | model | I2V-01-live | 0.43 |
| Minimax Video | Video | model | 768P, 6 (Text or Image to video) | 0.28 |
| Minimax Video | Video | model | 768P, 10 (Text or Image to video) | 0.56 |
| Minimax Video | Video | model | 1080P, 6 (Text or Image to video) | 0.49 |
## Recraft
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------------------ | -------- | ---------------------------- | --------------- | -------- |
| Recraft Creative Upscale Image | Image | NA | NA | 0.25 |
| Recraft Crisp Upscale Image | Image | NA | NA | 0.004 |
| Recraft Image to Image | Image | n | 1 | 0.04 |
| Recraft Remove Background | Image | NA | NA | 0.01 |
| Recraft Replace Background | Image | n | 1 | 0.04 |
| Recraft Text to Image | Image | n | 1 | 0.04 |
| Recraft Vectorize Image | Image | NA | NA | 0.01 |
| Recraft Text to Vector | Image | n | 1 | 0.08 |
| Recraft Image Inpainting | Image | n | 1 | 0.04 |
## Ideogram
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ---------------- | -------- | ------------------------------ | --------------- | -------- |
| Ideogram V1 | Image | num\_images, turbo(true/false) | 1, false | 0.06 |
| Ideogram V1 | Image | num\_images, turbo(true/false) | 1, true | 0.02 |
| Ideogram V2 | Image | num\_images, turbo(true/false) | 1, false | 0.08 |
| Ideogram V2 | Image | num\_images, turbo(true/false) | 1, true | 0.05 |
| Ideogram V3 | Image | num\_images, rendering\_speed | 1, Balanced | 0.06 |
| Ideogram V3 | Image | num\_images, rendering\_speed | 1, Quality | 0.09 |
| Ideogram V3 | Image | num\_images, rendering\_speed | 1, Turbo | 0.03 |
## Runway
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ----------------------------------- | -------- | ---------------------------- | --------------- | -------- |
| Runway Text to Image | Image | NA | NA | 0.08 |
| Runway First-Last-Frame to Video | Video | duration | 5s | 0.25 |
| Runway First-Last-Frame to Video | Video | duration | 10s | 0.5 |
| Runway Image to Video (Gen3a Turbo) | Video | duration | 5s | 0.25 |
| Runway Image to Video (Gen3a Turbo) | Video | duration | 10s | 0.5 |
| Runway Image to Video (Gen4 Turbo) | Video | duration | 5s | 0.25 |
| Runway Image to Video (Gen4 Turbo) | Video | duration | 10s | 0.5 |
## OpenAI
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------------------- | -------- | ---------------------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| GPT-Image-1 - Actual | Image | n, quality, size | input image tokens $10/1M + input text tokens $5/1M + output tokens \$40/1M | ?? |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | high, 1024x1024 | 0.167 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | high, 1024x1536 | 0.25 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | high, 1536x1024 | 0.25 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | low, 1024x1024 | 0.011 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | low, 1024x1536 | 0.016 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | low, 1536x1024 | 0.016 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | medium, 1024x1024 | 0.042 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | medium, 1024x1536 | 0.063 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | medium, 1536x1024 | 0.063 |
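The "Actual" row above is billed per token rather than at a flat per-image price. As a minimal sketch (not official billing code), the charge can be estimated from the three listed rates; the token counts passed in below are hypothetical examples:

```python
# Estimate a token-billed GPT-Image-1 charge from the per-1M-token rates
# listed in the table above. Token counts are illustrative only.
RATE_INPUT_IMAGE = 10 / 1_000_000  # $ per input image token
RATE_INPUT_TEXT = 5 / 1_000_000    # $ per input text token
RATE_OUTPUT = 40 / 1_000_000       # $ per output (image) token

def estimate_cost(input_image_tokens: int,
                  input_text_tokens: int,
                  output_tokens: int) -> float:
    """Return the estimated charge in dollars."""
    return (input_image_tokens * RATE_INPUT_IMAGE
            + input_text_tokens * RATE_INPUT_TEXT
            + output_tokens * RATE_OUTPUT)

# A medium-quality 1024x1024 image uses roughly 1056 output tokens, which at
# $40/1M lands near the ~$0.042 approximate price shown in the table.
print(round(estimate_cost(0, 200, 1056), 4))
```

This also explains why the "Approximate price" rows scale with quality and size: larger, higher-quality outputs consume more output tokens.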
| Image Generation (DALL·E 2) | Image | size | size = 512 \* 512 | 0.018 |
| Image Generation (DALL·E 2) | Image | size | size = 1024 \* 1024 | 0.02 |
| Image Generation (DALL·E 2) | Image | size | size = 256 \* 256 | 0.016 |
| Image Generation (DALL·E 3 HD) | Image | quality, size | size = 1024 \* 1024, hd | 0.08 |
| Image Generation (DALL·E 3 HD) | Image | quality, size | size = 1024 \* 1792, hd | 0.12 |
| Image Generation (DALL·E 3 HD) | Image | quality, size | size = 1792 \* 1024, hd | 0.12 |
| Image Generation (DALL·E 3 Std) | Image | quality, size | size = 1024 \* 1024,std | 0.04 |
| Image Generation (DALL·E 3 Std) | Image | quality, size | size = 1024 \* 1792, std | 0.08 |
| Image Generation (DALL·E 3 Std) | Image | quality, size | size = 1792 \* 1024, std | 0.08 |
| OpenAI Chat | Text | model | gpt-4.1 | Per 1M tokens: Input Text: $2, Cached Input text: $0.5, Output text: \$8 |
| OpenAI Chat | Text | model | gpt-4.1-mini | Per 1M tokens: Input Text: $0.4, Cached Input text: $0.1, Output text: \$1.6 |
| OpenAI Chat | Text | model | gpt-4.1-nano | Per 1M tokens: Input Text: $0.1, Cached Input text: $0.025, Output text: \$0.4 |
| OpenAI Chat | Text | model | gpt-5 | Per 1M tokens: Input Text: $1.25, Cached Input text: $0.125, Output text: \$10 |
| OpenAI Chat | Text | model | gpt-5-mini | Per 1M tokens: Input Text: $0.25, Cached Input text: $0.025, Output text: \$2 |
| OpenAI Chat | Text | model | gpt-5-nano | Per 1M tokens: Input Text: $0.05, Cached Input text: $0.005, Output text: \$0.4 |
| OpenAI Chat | Text | model | gpt-4o | Per 1M tokens: Input Text: $2.5, Cached Input text: $1.25, Output text: \$10 |
| OpenAI Chat | Text | model | o1 | Per 1M tokens: Input Text: $15, Cached Input text: $7.5, Output text: \$60 |
| OpenAI Chat | Text | model | o1-pro | Per 1M tokens: Input Text: $150, Cached Input text: NA, Output text: \$600 |
| OpenAI Chat | Text | model | o3 | Per 1M tokens: Input Text: $2, Cached Input text: $0.5, Output text: \$8 |
| OpenAI Chat | Text | model | o4-mini | Per 1M tokens: Input Text: $1.1, Cached Input text: $0.275, Output text: \$4.4 |
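The chat rows quote three separate per-1M-token rates: fresh input, cached input, and output. A small sketch of how a request's charge follows from those rates; the rate table below mirrors a few rows from above, and the token counts in the example are hypothetical:

```python
# Per-1M-token rates (fresh input, cached input, output) copied from the
# pricing table above for a few models.
CHAT_RATES = {
    "gpt-4.1":      (2.00, 0.50, 8.00),
    "gpt-4.1-mini": (0.40, 0.10, 1.60),
    "gpt-4.1-nano": (0.10, 0.025, 0.40),
    "gpt-5":        (1.25, 0.125, 10.00),
}

def chat_cost(model: str, input_tokens: int,
              cached_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost; cached input tokens bill at the cheaper rate."""
    in_rate, cached_rate, out_rate = CHAT_RATES[model]
    return (input_tokens * in_rate
            + cached_tokens * cached_rate
            + output_tokens * out_rate) / 1_000_000

# e.g. 10k fresh input + 50k cached input + 2k output tokens on gpt-4.1:
print(round(chat_cost("gpt-4.1", 10_000, 50_000, 2_000), 4))
```

Note how caching matters: at gpt-4.1 rates, the 50k cached tokens cost a quarter of what they would as fresh input.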
## Pixverse
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------------- | -------- | ---------------------------- | --------------- | -------- |
| PixVerse Text to Video | Video | duration, quality, resolution | 360p fast 5s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 360p normal 5s | 0.45 |
| PixVerse Text to Video | Video | duration, quality, resolution | 360p normal 8s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 540p fast 5s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 540p normal 5s | 0.45 |
| PixVerse Text to Video | Video | duration, quality, resolution | 540p normal 8s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 720p fast 5s | 1.2 |
| PixVerse Text to Video | Video | duration, quality, resolution | 720p normal 5s | 0.6 |
| PixVerse Text to Video | Video | duration, quality, resolution | 720p normal 8s | 1.2 |
| PixVerse Text to Video | Video | duration, quality, resolution | 1080p normal 5s | 1.2 |
| PixVerse Transition Video | Video | duration, quality, resolution | 360p fast 5s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 360p normal 5s | 0.45 |
| PixVerse Transition Video | Video | duration, quality, resolution | 360p normal 8s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 540p fast 5s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 540p normal 5s | 0.45 |
| PixVerse Transition Video | Video | duration, quality, resolution | 540p normal 8s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 720p fast 5s | 1.2 |
| PixVerse Transition Video | Video | duration, quality, resolution | 720p normal 5s | 0.6 |
| PixVerse Transition Video | Video | duration, quality, resolution | 720p normal 8s | 1.2 |
| PixVerse Transition Video | Video | duration, quality, resolution | 1080p normal 5s | 1.2 |
| PixVerse Image to Video | Video | duration, quality, resolution | 360p fast 5s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 360p normal 5s | 0.45 |
| PixVerse Image to Video | Video | duration, quality, resolution | 360p normal 8s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 540p fast 5s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 540p normal 5s | 0.45 |
| PixVerse Image to Video | Video | duration, quality, resolution | 540p normal 8s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 720p fast 5s | 1.2 |
| PixVerse Image to Video | Video | duration, quality, resolution | 720p normal 5s | 0.6 |
| PixVerse Image to Video | Video | duration, quality, resolution | 720p normal 8s | 1.2 |
| PixVerse Image to Video | Video | duration, quality, resolution | 1080p normal 5s | 1.2 |
## Pika
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------------------------- | -------- | ---------------------------- | --------------- | -------- |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 720p, 5s | 0.3 |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 720p, 10s | 0.4 |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 1080p, 5s | 0.5 |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 1080p, 10s | 1.5 |
| Pika Start and End Frame to Video | Video | duration, resolution | 720p, 5s | 0.2 |
| Pika Start and End Frame to Video | Video | duration, resolution | 720p, 10s | 0.25 |
| Pika Start and End Frame to Video | Video | duration, resolution | 1080p, 5s | 0.3 |
| Pika Start and End Frame to Video | Video | duration, resolution | 1080p, 10s | 1 |
| Pika Text to Video | Video | duration, resolution | 720p, 5s | 0.2 |
| Pika Text to Video | Video | duration, resolution | 720p, 10s | 0.6 |
| Pika Text to Video | Video | duration, resolution | 1080p, 5s | 0.45 |
| Pika Text to Video | Video | duration, resolution | 1080p, 10s | 1 |
| Pika Image to Video | Video | duration, resolution | 720p, 5s | 0.2 |
| Pika Image to Video | Video | duration, resolution | 720p, 10s | 0.6 |
| Pika Image to Video | Video | duration, resolution | 1080p, 5s | 0.45 |
| Pika Image to Video | Video | duration, resolution | 1080p, 10s | 1 |
| Pika Swaps (Video Object Replacement) | Video | NA | NA | 0.3 |
| Pikadditions (Video Object Insertion) | Video | NA | NA | 0.3 |
| Pikaffects (Video Effects) | Video | NA | NA | 0.45 |
## Moonvalley
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------- | -------- | ---------------------------- | --------------- | -------- |
| Image to video - 5s | Video | NA | NA | 1.5 |
| Text to video - 5s | Video | NA | NA | 1.5 |
| Video to video - 5s | Video | NA | NA | 2.25 |
## Rodin
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ------------------------------------ | -------- | ---------------------------- | --------------- | -------- |
| Rodin 3D Generate - Regular Generate | 3D | NA | NA | 0.4 |
| Rodin 3D Generate - Detail Generate | 3D | NA | NA | 0.4 |
| Rodin 3D Generate - Sketch Generate | 3D | NA | NA | 0.4 |
| Rodin 3D Generate - Smooth Generate | 3D | NA | NA | 0.4 |
## Tripo
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ----------------------------------------- | -------- | -------------------------------------- | ------------------------------------ | -------- |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, false | 0.15 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, true | 0.2 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, false | 0.35 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, true | 0.4 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, false | 0.25 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, true | 0.3 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, false | 0.1 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, true | 0.15 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, false | 0.3 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, true | 0.35 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, false | 0.2 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, true | 0.25 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, false | 0.25 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, true | 0.3 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, false | 0.45 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, true | 0.5 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, false | 0.35 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, true | 0.4 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, false | 0.2 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, true | 0.25 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, false | 0.4 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, true | 0.45 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, false | 0.3 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, true | 0.35 |
| Tripo: Convert model | 3D | NA | NA | 0.1 |
| Tripo: Refine Draft model | 3D | NA | NA | 0.3 |
| Tripo: Retarget rigged model | 3D | NA | NA | 0.1 |
| Tripo: Rig model | 3D | NA | NA | 0.25 |
| Tripo: Texture model | 3D | texture\_quality | detailed | 0.2 |
| Tripo: Texture model | 3D | texture\_quality | standard | 0.1 |
## Stability AI
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| --------------------------------------- | -------- | ---------------------------- | --------------- | -------- |
| Stability AI Stable Image Ultra | Image | NA | NA | 0.08 |
| Stability AI Stable Diffusion 3.5 Image | Image | model | sd3.5-large | 0.065 |
| Stability AI Stable Diffusion 3.5 Image | Image | model | sd3.5-medium | 0.035 |
| Stability AI Upscale Conservative | Image | NA | NA | 0.40 |
| Stability AI Upscale Creative | Image | NA | NA | 0.60 |
| Stability AI Upscale Fast | Image | NA | NA | 0.02 |
## Vidu
| Model(Node name) | Category | Parameters that affect price | Parameter combo | Price \$ |
| ----------------------- | -------- | ---------------------------- | --------------- | -------- |
| ViduImageToVideoNode | Video | NA | NA | 0.4 |
| ViduReferenceVideoNode | Video | NA | NA | 0.4 |
| ViduStartEndToVideoNode | Video | NA | NA | 0.4 |
| ViduTextToVideoNode | Video | NA | NA | 0.4 |
# Recraft Text to Image API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/recraft/recraft-text-to-image
Learn how to use the Recraft Text to Image API node in ComfyUI
The [Recraft Text to Image](/built-in-nodes/api-node/image/recraft/recraft-text-to-image) node allows you to create high-quality images in various styles using Recraft AI's image generation technology based on text descriptions.
In this guide, we'll show you how to set up a text-to-image workflow using this node.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. Not using the latest ComfyUI version (Nightly version)
2. Using Stable or Desktop version (Latest changes may not be included)
3. Some nodes failed to import at startup
## Recraft Text to Image API Node Workflow
### 1. Download the Workflow File
The workflow information is included in the metadata of the image below. Download and drag it into ComfyUI to load the workflow.

### 2. Follow the Steps to Run the Workflow
Follow these numbered steps to run the basic workflow:
1. (Optional) Change the `Recraft Color RGB` in the `Color` node to your desired color
2. (Optional) Modify the `Recraft Style` node to control the visual style, such as digital art, realistic photo, or logo design. This group includes other style nodes you can enable as needed
3. (Optional) Edit the `prompt` parameter in the `Recraft Text to Image` node. You can also change the `size` parameter
4. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to generate the image
5. After the API returns the result, you can view the generated image in the `Save Image` node. The image will also be saved to the `ComfyUI/output/` directory
> (Optional) We've included a **Convert to SVG** group in the workflow. Since the `Recraft Vectorize Image` node in this group consumes additional credits, enable it only when you need to convert the generated image to SVG format
### 3. Additional Notes
* **Recraft Style**: Offers various preset styles like realistic photos, digital art, and logo designs
* **Seed Parameter**: Only determines whether the node should run again; the actual generation result is not affected by the seed value
## Related Node Documentation
Check the following documentation for detailed parameter settings of the nodes
Documentation for the Recraft Text to Image API node
Documentation for the Recraft Style - Realistic Image API node
Documentation for the Recraft Controls API node
# Rodin API Node Model Generation ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/rodin/model-generation
This article introduces how to use the Rodin API nodes in ComfyUI for model generation
Hyper3D Rodin (hyper3d.ai) is a platform focused on rapidly generating high-quality, production-ready 3D models and materials through artificial intelligence.
ComfyUI has now natively integrated the corresponding Rodin model generation API, allowing you to conveniently use the related nodes in ComfyUI for model generation.
Currently, ComfyUI's API nodes support the following Rodin model generation capabilities:
* Single-view model generation
* Multi-view model generation
* Model generation with different levels of detail
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. Not using the latest ComfyUI version (Nightly version)
2. Using Stable or Desktop version (Latest changes may not be included)
3. Some nodes failed to import at startup
## Single-view Model Generation Workflow
### 1. Workflow File Download
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the image below as input image

### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic single-view model generation workflow:
1. In the `Load Image` node, load the provided input image
2. (Optional) In `Rodin 3D Generate - Regular Generate` adjust the corresponding parameters
* polygon\_count: Sets the polygon count; the higher the value, the smoother and more detailed the model
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute model generation. After the workflow completes, the corresponding model will be automatically saved to the `ComfyUI/output/Rodin` directory
4. In the `Preview 3D` node, click to expand the menu
5. Select `Export` to directly export the corresponding model
## Multi-view Model Generation Workflow
The corresponding `Rodin 3D Generate - Regular Generate` allows up to 5 image inputs
### 1. Workflow File Download
You can modify the single-view workflow to a multi-view workflow, or directly download the workflow file below
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the images below as input images



### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic multi-view model generation workflow:
1. In the `Load Image` node, load the provided input images
2. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute model generation. After the workflow completes, the corresponding model will be automatically saved to the `ComfyUI/output/Rodin` directory
3. In the `Preview 3D` node, click to expand the menu
4. Select `Export` to directly export the corresponding model
## Other Related Nodes
Rodin currently provides several types of model generation nodes in ComfyUI. Since their input conditions match the workflow introduced in this article, you can enable them as needed. These nodes are also included in the corresponding templates, where you can change a node's mode to enable it.
# Runway API Node Image Generation ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/runway/image-generation
This article will introduce how to use Runway nodes in ComfyUI for text-to-image and reference-to-image generation
Runway is a company focused on generative AI, providing powerful image generation capabilities. Its models support features such as style transfer, image extension, and detail control. Currently, ComfyUI has integrated the Runway API, allowing you to directly use the related nodes in ComfyUI for image generation.
In this guide, we will walk you through the following workflows:
* Text-to-image
* Reference-to-image
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the stable or Desktop release (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Runway Image Text-to-Image Workflow
### 1. Workflow File Download
The image below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.

### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic text-to-image workflow execution:
1. In the `Runway Text to Image` node, input your prompt in the `prompt` field
2. (Optional) Adjust the `ratio` setting to set different output aspect ratios
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute image generation.
4. After waiting for the API to return results, you can view the generated image in the `Save Image` node (right-click to save). The corresponding image will also be saved to the `ComfyUI/output/` directory.
## Runway Image Reference-to-Image Workflow
### 1. Workflow and Input Image Download
The image below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.

Download the image below for input

### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic reference-to-image workflow execution:
1. In the `Load Image` node, load the provided input image
2. In the `Runway Text to Image` node, input your prompt in the `prompt` field and adjust dimensions
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute image generation.
4. After waiting for the API to return results, you can view the generated image in the `Save Image` node (right-click to save). The corresponding image will also be saved to the `ComfyUI/output/` directory.
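When a workflow is queued through ComfyUI's HTTP API rather than the UI, the images that the `Save Image` node writes can be located via the `/history/{prompt_id}` endpoint. A hedged sketch of extracting the saved file names from one history record; the node id and file name below are illustrative, though the record shape mirrors ComfyUI's history format.

```python
def output_images(history_entry: dict) -> list[str]:
    """Collect saved image file paths from one /history record."""
    paths = []
    for node_output in history_entry.get("outputs", {}).values():
        for image in node_output.get("images", []):
            subfolder = image.get("subfolder", "")
            name = image["filename"]
            paths.append(f"{subfolder}/{name}" if subfolder else name)
    return paths

# Illustrative record, mirroring what /history/{prompt_id} returns per prompt:
record = {
    "outputs": {
        "9": {
            "images": [
                {"filename": "ComfyUI_00001_.png", "subfolder": "", "type": "output"}
            ]
        }
    }
}
```

The returned paths are relative to the `ComfyUI/output/` directory mentioned in the steps above.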
# Runway API Node Video Generation ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/runway/video-generation
This article will introduce how to use Runway nodes in ComfyUI for video generation workflows
Runway is a company focused on generative AI, providing powerful video generation capabilities. Currently, ComfyUI has integrated the Runway API, allowing you to directly use the related nodes in ComfyUI for video generation.
Currently, ComfyUI natively integrates the following Runway video generation models:
* Runway Gen3a turbo
* Runway Gen4 turbo
* Runway First Last Frame to video
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the stable or Desktop release (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Gen3a turbo Image-to-Video Workflow
### 1. Workflow File Download
The video below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the image below as input image

### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic image-to-video workflow execution:
1. In the `Load Image` node, load the provided input image
2. In the `Runway Gen3a turbo` node, set the `prompt` to describe the video content, the `duration` parameter to set the video length, and the `ratio` parameter to set the aspect ratio
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
4. After waiting for the API to return results, you can view the generated video in the `Save Video` node (right-click to save). The corresponding video will also be saved to the `ComfyUI/output/` directory.
## Gen4 turbo Image-to-Video Workflow
### 1. Workflow File Download
The video below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the image below as input image

### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic image-to-video workflow execution:
1. In the `Load Image` node, load the provided input image
2. In the `Runway Gen4 turbo` node, set the `prompt` to describe the video content, the `duration` parameter to set the video length, and the `ratio` parameter to set the aspect ratio
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
4. After waiting for the API to return results, you can view the generated video in the `Save Video` node (right-click to save). The corresponding video will also be saved to the `ComfyUI/output/` directory.
## First-Last Frame Video Generation Workflow
### 1. Workflow File Download
The video below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the images below as input images


### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic first-last frame to video workflow execution:
1. In the `Load Image` node, load the starting frame
2. In the `Load Image` node, load the ending frame
3. In the `Runway First-Last-Frame to Video` node, set the `prompt` to describe the video content, the `duration` parameter to set the video length, and the `ratio` parameter to set the aspect ratio
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
5. After waiting for the API to return results, you can view the generated video in the `Save Video` node (right-click to save). The corresponding video will also be saved to the `ComfyUI/output/` directory.
# Stability AI Stable Diffusion 3.5 API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/stability-ai/stable-diffusion-3-5-image
This article will introduce how to use Stability AI Stable Diffusion 3.5 API node's text-to-image and image-to-image capabilities in ComfyUI
The [Stability AI Stable Diffusion 3.5 Image](/built-in-nodes/api-node/image/stability-ai/stability-ai-stable-diffusion-3-5-image) node allows you to use Stability AI's Stable Diffusion 3.5 model to create high-quality, detail-rich image content through text prompts or reference images.
In this guide, we will show you how to set up workflows for both text-to-image and image-to-image generation using this node.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the stable or Desktop release (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Stability AI Stable Diffusion 3.5 Text-to-Image Workflow
### 1. Workflow File Download
The image below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.

### 2. Complete the Workflow Step by Step
You can follow the numbered steps in the image to complete the basic text-to-image workflow:
1. (Optional) Modify the `prompt` parameter in the `Stability AI Stable Diffusion 3.5 Image` node to input your desired image description. More detailed prompts often result in better image quality.
2. (Optional) Select the `model` parameter to choose which SD 3.5 model version to use.
3. (Optional) Select the `style_preset` parameter to control the visual style of the image. Different presets produce images with different stylistic characteristics, such as "cinematic" or "anime". Select "None" to not apply any specific style.
4. (Optional) Edit the `String(Multiline)` node to modify the negative prompt, specifying elements you don't want to appear in the generated image.
5. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation.
6. After the API returns results, you can view the generated image in the `Save Image` node. The image will also be saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
* **Prompt**: The prompt is one of the most important parameters in the generation process. Detailed, clear descriptions lead to better results. Can include elements like scene, subject, colors, lighting, and style.
* **CFG Scale**: Controls how closely the generator follows the prompt. Higher values make the image more closely match the prompt description, but too high may result in oversaturated or unnatural results.
* **Style Preset**: Offers various preset styles for quickly defining the overall style of the image.
* **Negative Prompt**: Used to specify elements you don't want to appear in the generated image.
* **Seed Parameter**: Can be used to reproduce or fine-tune generation results, helpful for iteration during creation.
* The `Load Image` node is currently in "Bypass" mode. To enable it, refer to the step guide: right-click the node and set "Mode" to "Always" to enable the input, which switches the workflow to image-to-image mode.
* `image_denoise` has no effect when there is no input image.
## Stability AI Stable Diffusion 3.5 Image-to-Image Workflow
### 1. Workflow File Download
The image below contains workflow information in its `metadata`. Please download and drag it into ComfyUI to load the corresponding workflow.

Download the image below to use as input
### 2. Complete the Workflow Step by Step
You can follow the numbered steps in the image to complete the image-to-image workflow:
1. Load a reference image through the `Load Image` node, which will serve as the basis for generation.
2. (Optional) Modify the `prompt` parameter in the `Stability AI Stable Diffusion 3.5 Image` node to describe elements you want to change or enhance in the reference image.
3. (Optional) Select the `style_preset` parameter to control the visual style of the image. Different presets produce images with different stylistic characteristics.
4. (Optional|Important) Adjust the `image_denoise` parameter (range 0.0-1.0) to control how much the original image is modified:
* Values closer to 0.0 make the generated image more similar to the input reference image (at 0.0, it's basically identical to the original)
* Values closer to 1.0 make the generated image more like pure text-to-image generation (at 1.0, it's as if no reference image was provided)
5. (Optional) Edit the `String(Multiline)` node to modify the negative prompt, specifying elements you don't want to appear in the generated image.
6. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation.
7. After the API returns results, you can view the generated image in the `Save Image` node. The image will also be saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
The image below shows a comparison of results with and without input image using the same parameter settings:
**Image Denoise**: This parameter determines how much of the original image's features are preserved during generation. It's the most crucial adjustment parameter in image-to-image mode. The image below shows the effects of different denoising strengths:
* **Reference Image Selection**: Choosing images with clear subjects and good composition usually yields better results.
* **Prompt Tips**: In image-to-image mode, prompts should focus more on elements you want to change or enhance, rather than describing everything already present in the image.
* **Mode Switching**: When an input image is provided, the node automatically switches from text-to-image mode to image-to-image mode, and aspect ratio parameters are ignored.
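The `image_denoise` behavior described above can be sketched conceptually. This is an illustration of how denoise typically works in diffusion-based image-to-image (it controls what fraction of the sampling steps actually run on the re-noised input), not Stability AI's actual implementation:

```python
def steps_to_run(total_steps: int, image_denoise: float) -> int:
    # Conceptual: 0.0 keeps the input image untouched (no steps run),
    # 1.0 behaves like pure text-to-image (all steps run from noise).
    if not 0.0 <= image_denoise <= 1.0:
        raise ValueError("image_denoise must be in [0.0, 1.0]")
    return round(total_steps * image_denoise)
```

Intermediate values trade off fidelity to the reference image against freedom to follow the prompt, matching the 0.0/1.0 endpoints described in the steps above.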
## Related Node Details
You can refer to the documentation below to understand detailed parameter settings for the corresponding node
Stability Stable Diffusion 3.5 Image API Node Documentation
# Stability AI Stable Image Ultra API Node ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/stability-ai/stable-image-ultra
This article will introduce how to use the Stability AI Stable Image Ultra API node's text-to-image and image-to-image capabilities in ComfyUI
The [Stability Stable Image Ultra](/images/built-in-nodes/api_nodes/stability-ai/stability-ai-stable-image-ultra.jpg) node allows you to use Stability AI's Stable Image Ultra model to create high-quality, detailed image content through text prompts or reference images.
In this guide, we will show you how to set up workflows for both text-to-image and image-to-image generation using this node.
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the stable or Desktop release (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Stability AI Stable Image Ultra Text-to-Image Workflow
### 1. Workflow File Download
The workflow information is included in the metadata of the image below. Please download and drag it into ComfyUI to load the corresponding workflow.

### 2. Complete the Workflow Execution Step by Step
You can follow the numbered steps in the image to complete the basic text-to-image workflow:
1. (Optional) Modify the `prompt` parameter in the `Stability AI Stable Image Ultra` node to input your desired image description. More detailed prompts often lead to better image quality. You can use the `(word:weight)` format to control specific word weights, for example: `The sky was crisp (blue:0.3) and (green:0.8)` indicates the sky is blue and green, but green is more prominent.
2. (Optional) Select the `style_preset` parameter to control the visual style of the image. Different preset styles will produce images with different stylistic characteristics, such as "cinematic", "anime", etc. Selecting "None" will not apply any specific style.
3. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation.
4. After the API returns the result, you can view the generated image in the `Save Image` node, and the image will also be saved to the `ComfyUI/output/` directory.
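The `(word:weight)` emphasis syntax mentioned in step 1 can be illustrated with a tiny parser. This is a sketch of the syntax only, not Stability AI's actual prompt parser; unweighted text is assumed to default to a weight of 1.0.

```python
import re

def parse_weights(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) pairs based on (word:weight) syntax."""
    tokens = []
    pattern = re.compile(r"\(([^:()]+):([0-9.]+)\)")
    pos = 0
    for m in pattern.finditer(prompt):
        before = prompt[pos:m.start()].strip()
        if before:
            tokens.append((before, 1.0))  # plain text keeps the default weight
        tokens.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip()
    if tail:
        tokens.append((tail, 1.0))
    return tokens

pairs = parse_weights("The sky was crisp (blue:0.3) and (green:0.8)")
```

For the example prompt from step 1, `blue` carries weight 0.3 and `green` carries weight 0.8, so green dominates the result.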
### 3. Additional Notes
* **Prompt**: The prompt is one of the most important parameters in the generation process. Detailed, clear descriptions will lead to better results. It can include elements like scene, subject, colors, lighting, and style.
* **Style Preset**: Provides multiple preset styles such as cinematic, anime, digital art, etc., which can quickly define the overall style of the image.
* **Negative Prompt**: Used to specify elements you don't want to appear in the generated image, helping avoid common issues like extra limbs or distorted faces.
* **Seed Parameter**: Can be used to reproduce or fine-tune generation results, helpful for iteration during the creative process.
* The `Load Image` node is currently in "Bypass" mode. To enable it, refer to the step guide: right-click the node and set "Mode" to "Always" to enable the input, which switches the workflow to image-to-image mode.
## Stability AI Stable Image Ultra Image-to-Image Workflow
### 1. Workflow File Download
The workflow information is included in the metadata of the image below. Please download and drag it into ComfyUI to load the corresponding workflow.

Download the image below, which we will use as input

### 2. Complete the Workflow Execution Step by Step
You can follow the numbered steps in the image to complete the image-to-image workflow:
1. Load a reference image through the `Load Image` node, which will serve as the basis for generation.
2. (Optional) Modify the `prompt` parameter in the `Stability Stable Image Ultra` node to describe elements you want to change or enhance in the reference image.
3. (Optional) Adjust the `image_denoise` parameter (range 0.0-1.0) to control the degree of modification to the original image:
* Values closer to 0.0 will make the generated image more similar to the input reference image
* Values closer to 1.0 will make the generated image more like pure text-to-image generation
4. (Optional) You can also set `style_preset` and other parameters to further control the generation effect.
5. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation.
6. After the API returns the result, you can view the generated image in the `Save Image` node, and the image will also be saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
**Image Denoise**: This parameter determines how much of the original image's features are preserved during generation, and is the most crucial adjustment parameter in image-to-image mode. The image below shows the effects of different denoising strengths.
* **Reference Image Selection**: Choosing images with clear subjects and good composition usually leads to better results.
* **Prompt Tips**: In image-to-image mode, prompts should focus more on what you want to change or enhance, rather than describing all elements already present in the image.
## Related Node Documentation
You can refer to the documentation below for detailed parameter settings and more information about the corresponding nodes
Stability Stable Image Ultra API Node Documentation
# Tripo API Node Model Generation ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/api-nodes/tripo/model-generation
This article will introduce how to use Tripo node's API in ComfyUI for model generation
Tripo AI is a company focused on generative AI 3D modeling. It provides user-friendly platforms and API services that can quickly convert text prompts or 2D images (single or multiple) into high-quality 3D models.
ComfyUI has now natively integrated the corresponding Tripo API, allowing you to conveniently use the related nodes in ComfyUI for model generation.
Currently, ComfyUI's API nodes support the following Tripo model generation capabilities:
* Text-to-model
* Image-to-model
* Multi-view model generation
* Rig model
* Retarget rigged model
To use the API nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the [API Nodes Overview](/tutorials/api-nodes/overview) section of the documentation to understand the specific requirements for using the API nodes.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the stable or Desktop release (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Text-to-Model Workflow
### 1. Workflow File Download
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic text-to-model workflow execution:
1. In the `Tripo: Text to Model` node, input your prompt in the `prompt` field
* model: selects the model version; currently only the v1.4 model supports subsequent refinement with `Tripo: Refine Draft model`
* style: sets the style
* texture\_quality: sets the texture quality
2. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute model generation. After the workflow completes, the corresponding model will be automatically saved to the `ComfyUI/output/` directory
3. In the `Preview 3D` node, click to expand the menu
4. Select `Export` to directly export the corresponding model
## Image-to-Model Workflow
### 1. Workflow File Download
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the image below as input image

### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic image-to-model workflow execution:
1. In the `Load Image` node, load the provided input image
2. In the `Tripo: Image to Model` node, modify the corresponding parameter settings
* model: selects the model version; currently only the v1.4 model supports subsequent refinement with `Tripo: Refine Draft model`
* style: sets the style
* texture\_quality: sets the texture quality
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute model generation. After the workflow completes, the corresponding model will be automatically saved to the `ComfyUI/output/` directory
4. For model download, please refer to the instructions in the text-to-model section
## Multi-view Model Generation Workflow
### 1. Workflow File Download
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download Json Format Workflow File
Download the images below as input images


### 2. Complete the Workflow Execution Step by Step
You can refer to the numbers in the image to complete the basic multi-view to model workflow execution:
1. In the `Load Image` nodes, load the provided input images respectively
2. In the `Tripo: Image to Model` node, modify the corresponding parameter settings
* model: selects the model version; currently only the v1.4 model supports subsequent refinement with `Tripo: Refine Draft model`
* style: sets the style
* texture\_quality: sets the texture quality
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute model generation. After the workflow completes, the corresponding model will be automatically saved to the `ComfyUI/output/` directory
4. For other view inputs, you can refer to the step diagram and set the corresponding node mode to `Always` to enable it
5. For model download, please refer to the instructions in the text-to-model section
## Subsequent Task Processing for the Same Task
Tripo's nodes support follow-up processing of the same task: simply input the corresponding `model_task_id` into the relevant nodes. The related templates also include these nodes; modify a node's mode as needed to enable it.
The `Tripo: Refine Draft model` node only supports the v1.4 model; other models are not supported.
# ComfyUI ACE-Step Native Example
Source: https://docs.comfy.org/tutorials/audio/ace-step/ace-step-v1
This guide will help you create dynamic music using the ACE-Step model in ComfyUI
ACE-Step is an open-source foundational music generation model jointly developed by Chinese team StepFun and ACE Studio, aimed at providing music creators with efficient, flexible and high-quality music generation and editing tools.
The model is released under the [Apache-2.0](https://github.com/ace-step/ACE-Step?tab=readme-ov-file#-license) license and is free for commercial use.
As a powerful music generation foundation, ACE-Step provides rich extensibility. Through fine-tuning techniques like LoRA and ControlNet, developers can customize the model according to their actual needs.
Whether it's audio editing, vocal synthesis, accompaniment production, voice cloning or style transfer applications, ACE-Step provides stable and reliable technical support.
This flexible architecture greatly simplifies the development process of music AI applications, allowing more creators to quickly apply AI technology to music creation.
Currently, ACE-Step has released related training code, including LoRA model training, and the corresponding ControlNet training code will be released in the future.
You can visit their [Github](https://github.com/ace-step/ACE-Step?tab=readme-ov-file#-roadmap) to learn more details.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the stable or Desktop release (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## ACE-Step ComfyUI Text-to-Audio Generation Workflow Example
### 1. Download Workflow and Related Models
Click the button below to download the corresponding workflow file. Drag it into ComfyUI to load the workflow information. The workflow includes model download information.
Download Json Format Workflow File
You can also manually download [ace\_step\_v1\_3.5b.safetensors](https://huggingface.co/Comfy-Org/ACE-Step_ComfyUI_repackaged/blob/main/all_in_one/ace_step_v1_3.5b.safetensors) and save it to the `ComfyUI/models/checkpoints` folder
### 2. Complete the Workflow Step by Step
1. Ensure the `Load Checkpoints` node has loaded the `ace_step_v1_3.5b.safetensors` model
2. (Optional) In the `EmptyAceStepLatentAudio` node, you can set the duration of the music to be generated
3. (Optional) In the `LatentOperationTonemapReinhard` node, you can adjust the `multiplier` to control the volume of the vocals (higher numbers result in more prominent vocals)
4. (Optional) Input corresponding music styles etc. in the `tags` field of `TextEncodeAceStepAudio`
5. (Optional) Input corresponding lyrics in the `lyrics` field of `TextEncodeAceStepAudio`
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the audio generation
7. After the workflow completes, you can preview the generated audio in the `Save Audio` node. You can click to play and listen to it, and the audio will also be saved to `ComfyUI/output/audio` (subdirectory determined by the `Save Audio` node).
## ACE-Step ComfyUI Audio-to-Audio Workflow
Similar to image-to-image workflows, you can input a piece of music and use the workflow below to resample and generate new music. You can also adjust how much the result differs from the original audio via the `denoise` parameter in the `KSampler` node.
### 1. Download Workflow File
Click the button below to download the corresponding workflow file. Drag it into ComfyUI to load the workflow information.
Download Json Format Workflow File
Download the following audio file as the input audio:
Download Example Audio File for Input
### 2. Complete the Workflow Step by Step
1. Ensure the `Load Checkpoints` node has loaded the `ace_step_v1_3.5b.safetensors` model
2. Upload the provided audio file in the `LoadAudio` node
3. (Optional) Input corresponding music styles and lyrics in the `tags` and `lyrics` fields of `TextEncodeAceStepAudio`. Providing lyrics is very important for audio editing
4. (Optional) Modify the `denoise` parameter in the `KSampler` node to adjust the noise added during sampling, which controls similarity with the original audio (smaller values produce output more similar to the original; setting it to `1.00` is approximately equivalent to having no audio input)
5. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the audio generation
6. After the workflow completes, you can preview the generated audio in the `Save Audio` node. You can click to play and listen to it, and the audio will also be saved to `ComfyUI/output/audio` (subdirectory determined by the `Save Audio` node).
You can also implement the lyrics modification and editing functionality from the ACE-Step project page, modifying the original lyrics to change the audio effect.
### 3. Additional Workflow Notes
1. In the example workflow, you can change the `tags` in `TextEncodeAceStepAudio` from `male voice` to `female voice` to generate female vocals.
2. You can also modify the `lyrics` in `TextEncodeAceStepAudio` to change the lyrics and thus the generated audio. Refer to the examples on the ACE-Step project page for more details.
## ACE-Step Prompt Guide
ACE currently uses two types of prompts: `tags` and `lyrics`.
* `tags`: Mainly used to describe music styles, scenes, etc. Similar to prompts we use for other generations, they primarily describe the overall style and requirements of the audio, separated by English commas
* `lyrics`: Mainly used to describe lyrics, supporting lyric structure tags such as \[verse], \[chorus], and \[bridge] to distinguish different parts of the lyrics. You can also input instrument names for purely instrumental music
You can find rich examples of `tags` and `lyrics` on the [ACE-Step model homepage](https://ace-step.github.io/). You can refer to these examples to try corresponding prompts. This document's prompt guide is organized based on the project to help you quickly try combinations to achieve your desired effect.
### Tags (prompt)
#### Mainstream Music Styles
Use short tag combinations to generate specific music styles
* electronic
* rock
* pop
* funk
* soul
* cyberpunk
* acid jazz
* electro
* em (electronic music)
* soft electric drums
* melodic
#### Scene Types
Combine specific usage scenarios and atmospheres to generate music that matches the corresponding mood
* background music for parties
* radio broadcasts
* workout playlists
#### Instrumental Elements
* saxophone
* jazz
* piano, violin
#### Vocal Types
* female voice
* male voice
* clean vocals
#### Professional Terms
Use professional music terms to control the output precisely
* 110 bpm (beats per minute is 110)
* fast tempo
* slow tempo
* loops
* fills
* acoustic guitar
* electric bass
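Tags from the categories above can be freely combined into a single comma-separated `tags` value; one illustrative combination:

```
pop, electronic, female voice, clean vocals, 110 bpm, fast tempo, acoustic guitar, background music for parties
```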
### Lyrics
#### Lyric Structure Tags
* \[outro]
* \[verse]
* \[chorus]
* \[bridge]
#### Multilingual Support
* ACE-Step V1 supports multiple languages. When used, ACE-Step converts different languages into English letters and then generates music.
* In ComfyUI, we haven't fully implemented the conversion of all languages to English letters. Currently, only [Japanese hiragana and katakana characters](https://github.com/comfyanonymous/ComfyUI/commit/5d3cc85e13833aeb6ef9242cdae243083e30c6fc) are implemented.
So if you need other languages for music generation, first convert the text to English letters (romanization), then prefix the lines in `lyrics` with the language code abbreviation, such as `[zh]` for Chinese or `[ko]` for Korean.
For example:
```
[verse]
[zh]wo3zou3guo4shen1ye4de5jie1dao4
[zh]leng3feng1chui1luan4si1nian4de5piao4liang4wai4tao4
[zh]ni3de5wei1xiao4xiang4xing1guang1hen3xuan4yao4
[zh]zhao4liang4le5wo3gu1du2de5mei3fen1mei3miao3
[chorus]
[verse]
[ko]hamkke si-kkeuleo-un sesang-ui sodong-eul pihae
[ko]honja ogsang-eseo dalbich-ui eolyeompus-ileul balaboda
[ko]niga salang-eun lideum-i ganghan eum-ag gatdago malhaess-eo
[ko]han ta han tamada ma-eum-ui ondoga eolmana heojeonhanji ijge hae
[bridge]
[es]cantar mi anhelo por ti sin ocultar
[es]como poesía y pintura, lleno de anhelo indescifrable
[es]tu sombra es tan terca como el viento, inborrable
[es]persiguiéndote en vuelo, brilla como cruzar una mar de nubes
[chorus]
[fr]que tu sois le vent qui souffle sur ma main
[fr]un contact chaud comme la douce pluie printanière
[fr]que tu sois le vent qui s'entoure de mon corps
[fr]un amour profond qui ne s'éloignera jamais
```
Currently, ACE-Step supports 19 languages, but the following ten languages have better support:
* English
* Chinese: \[zh]
* Russian: \[ru]
* Spanish: \[es]
* Japanese: \[ja]
* German: \[de]
* French: \[fr]
* Portuguese: \[pt]
* Italian: \[it]
* Korean: \[ko]
The language tags above have not been fully tested at the time of writing this documentation. If any language tag is incorrect, please [submit an issue to our documentation repository](https://github.com/Comfy-Org/docs/issues) and we will make timely corrections.
## ACE-Step Related Resources
* [Project Page](https://ace-step.github.io/)
* [Hugging Face](https://huggingface.co/ACE-Step/ACE-Step-v1-3.5B)
* [GitHub](https://github.com/ace-step/ACE-Step)
* [Training Scripts](https://github.com/ace-step/ACE-Step?tab=readme-ov-file#-train)
# ComfyUI Image to Image Workflow
Source: https://docs.comfy.org/tutorials/basic/image-to-image
This guide will help you understand and complete an image to image workflow
## What is Image to Image
Image to Image is a workflow in ComfyUI that allows users to input an image and generate a new image based on it.
Image to Image can be used in scenarios such as:
* Converting original image styles, like transforming realistic photos into artistic styles
* Converting line art into realistic images
* Image restoration
* Colorizing old photos
* ... and other scenarios
To explain it with an analogy:
It's like asking an artist to create a specific piece based on your reference image.
If you carefully compare this tutorial with the [Text to Image](/tutorials/basic/text-to-image) tutorial,
you'll notice that the Image to Image process is very similar to Text to Image,
just with an additional input reference image as a condition. In Text to Image, we let the artist (image model) create freely based on our prompts,
while in Image to Image, we let the artist create based on both our reference image and prompts.
## ComfyUI Image to Image Workflow Example Guide
### Model Installation
Download the [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) file and put it in your `ComfyUI/models/checkpoints` folder.
### Image to Image Workflow and Input Image
Download the image below and **drag it into ComfyUI** to load the workflow:

Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
Download the image below and we will use it as the input image:
### Complete the Workflow Step by Step
Follow the steps in the diagram below to ensure the workflow runs correctly.
1. Ensure `Load Checkpoint` loads **v1-5-pruned-emaonly-fp16.safetensors**
2. Upload the input image to the `Load Image` node
3. Click `Queue` or press `Ctrl/Cmd + Enter` to generate
## Key Points of Image to Image Workflow
The key to the Image to Image workflow is the `denoise` parameter in the `KSampler` node, which should be **less than 1**.
If you've adjusted the `denoise` parameter and generated images, you'll notice:
* The smaller the `denoise` value, the smaller the difference between the generated image and the reference image
* The larger the `denoise` value, the larger the difference between the generated image and the reference image
This is because `denoise` determines how much noise is added to the latent-space image encoded from the reference image. At `denoise` = 1, the latent becomes completely random noise, the same as the output of the `Empty Latent Image` node, and all characteristics of the reference image are lost.
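The intuition can be sketched in a few lines of numpy. This is a conceptual illustration only, not ComfyUI's actual sampler math, which follows a noise schedule:

```python
import numpy as np

def add_noise(latent: np.ndarray, denoise: float, seed: int = 0) -> np.ndarray:
    """Conceptual sketch: blend the encoded reference latent with random noise."""
    noise = np.random.default_rng(seed).standard_normal(latent.shape)
    return (1.0 - denoise) * latent + denoise * noise

latent = np.ones((4, 8, 8))          # stand-in for an encoded reference image
unchanged = add_noise(latent, 0.0)   # denoise = 0: reference kept intact
pure_noise = add_noise(latent, 1.0)  # denoise = 1: reference fully discarded
```

At the extremes, `denoise = 0` returns the reference latent unchanged, while `denoise = 1` returns pure noise, matching the behavior described above.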
For the corresponding principles, please refer to the principle explanation in the [Text to Image](/tutorials/basic/text-to-image) tutorial.
## Try It Yourself
1. Try modifying the `denoise` parameter in the **KSampler** node, gradually changing it from 1 to 0, and observe the changes in the generated images
2. Replace with your own prompts and reference images to generate your own image effects
# ComfyUI Inpainting Workflow
Source: https://docs.comfy.org/tutorials/basic/inpaint
This guide will introduce you to the inpainting workflow in ComfyUI, walk you through an inpainting example, and cover topics like using the mask editor
This article will introduce the concept of inpainting in AI image generation and guide you through creating an inpainting workflow in ComfyUI. We'll cover:
* Using inpainting workflows to modify images
* Using the ComfyUI mask editor to draw masks
* `VAE Encode (for Inpainting)` node
## About Inpainting
In AI image generation, we often encounter situations where we're satisfied with the overall image but there are elements we don't want or that contain errors. Simply regenerating might produce a completely different image, so using inpainting to fix specific parts becomes very useful.
It's like having an **artist (AI model)** paint a picture, but we're still not satisfied with the specific details. We need to tell the artist **which areas to adjust (mask)**, and then let them **repaint (inpaint)** according to our requirements.
Common inpainting scenarios include:
* **Defect Repair:** Removing unwanted objects, fixing incorrect AI-generated body parts, etc.
* **Detail Optimization:** Precisely adjusting local elements (like modifying clothing textures, adjusting facial expressions)
* And other scenarios
## ComfyUI Inpainting Workflow Example
### Model and Resource Preparation
#### 1. Model Installation
Download the [512-inpainting-ema.safetensors](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.safetensors) file and put it in your `ComfyUI/models/checkpoints` folder:
#### 2. Inpainting Asset
Please download the following image which we'll use as input:
This image already contains an alpha channel (transparency mask), so you don't need to manually draw a mask. This tutorial will also cover how to use the mask editor to draw masks.
#### 3. Inpainting Workflow
Download the image below and **drag it into ComfyUI** to load the workflow:

Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
### ComfyUI Inpainting Workflow Example Explanation
Follow the steps in the diagram below to ensure the workflow runs correctly.
1. Ensure `Load Checkpoint` loads `512-inpainting-ema.safetensors`
2. Upload the input image to the `Load Image` node
3. Click `Queue` or use `Ctrl + Enter` to generate
For comparison, here's the result using the [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) model:
You will find that the results generated by the [512-inpainting-ema.safetensors](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.safetensors) model have better inpainting effects and more natural transitions.
This is because this model is specifically designed for inpainting, which helps us better control the generation area, resulting in improved inpainting effects.
Do you remember the analogy we've been using? Different models are like artists with varying abilities, but each artist has their own limits. Choosing the right model can help you achieve better generation results.
You can try these approaches to achieve better results:
1. Modify positive and negative prompts with more specific descriptions
2. Try multiple runs using different seeds in the `KSampler` for different generation results
3. After learning about the mask editor in this tutorial, you can re-inpaint the generated results to achieve satisfactory outcomes.
Next, we'll learn about using the **Mask Editor**. Our input image already includes an `alpha` transparency channel (the area we want to edit), so drawing a mask manually isn't necessary here, but in practical applications you'll often use the Mask Editor to create masks.
### Using the Mask Editor
First right-click the `Save Image` node and select `Copy(Clipspace)`:
Then right-click the **Load Image** node and select `Paste(Clipspace)`:
Right-click the **Load Image** node again and select `Open in MaskEditor`:
1. Adjust brush parameters on the right panel
2. Use eraser to correct mistakes
3. Click `Save` when finished
The drawn content will be used as the `mask` input to the `VAE Encode (for Inpainting)` node for encoding
Then try adjusting your prompts and generating again until you achieve satisfactory results.
## VAE Encode (for Inpainting) Node
Comparing this workflow with [Text-to-Image](/tutorials/basic/text-to-image) and [Image-to-Image](/tutorials/basic/image-to-image), you'll notice the main differences are in the VAE section's conditional inputs.
In this workflow, we use the **VAE Encode (for Inpainting)** node, which is designed specifically for inpainting and helps us better control the generation area for better results.
**Input Types**
| Parameter Name | Function |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
| `pixels` | Input image to be encoded into latent space. |
| `vae` | VAE model used to encode the image from pixel space to latent space. |
| `mask` | Image mask specifying which areas need modification. |
| `grow_mask_by` | Pixel value to expand the original mask outward, ensuring a transition area around the mask to avoid hard edges between inpainted and original areas. |
**Output Types**
| Parameter Name | Function |
| -------------- | ------------------------------------------- |
| `latent` | Image encoded into latent space by the VAE. |
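To build intuition for `grow_mask_by`, here is a hedged numpy sketch of binary mask dilation; ComfyUI's actual implementation may differ in kernel shape and edge blending:

```python
import numpy as np

def grow_mask(mask: np.ndarray, pixels: int) -> np.ndarray:
    """Expand a binary mask outward by `pixels`, conceptually what
    grow_mask_by does to soften the boundary of the inpainted region."""
    out = mask.astype(bool)
    for _ in range(pixels):
        padded = np.pad(out, 1)
        # a pixel becomes True if it or any 4-connected neighbor was True
        out = (padded[1:-1, 1:-1] | padded[:-2, 1:-1] | padded[2:, 1:-1]
               | padded[1:-1, :-2] | padded[1:-1, 2:])
    return out.astype(np.float32)

m = np.zeros((5, 5))
m[2, 2] = 1.0
grown = grow_mask(m, 1)  # the single masked pixel gains its 4 neighbors
```

Each unit of `grow_mask_by` pushes the mask boundary out by roughly one pixel, giving the sampler a transition band around the original mask.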
# ComfyUI LoRA Example
Source: https://docs.comfy.org/tutorials/basic/lora
This guide will help you understand and use a single LoRA model
**LoRA (Low-Rank Adaptation)** is an efficient technique for fine-tuning large generative models like Stable Diffusion.
It introduces trainable low-rank matrices to the pre-trained model, adjusting only a portion of parameters rather than retraining the entire model,
thus achieving optimization for specific tasks at a lower computational cost.
Compared to base models like SD1.5, LoRA models are smaller and easier to train.
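The low-rank idea can be sketched in a few lines of numpy. The dimensions, rank, and strength below are illustrative, not SD1.5's actual layer sizes:

```python
import numpy as np

d, k, r = 768, 768, 8                    # illustrative layer dims and LoRA rank
rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))          # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # B starts at zero, so training starts from W

strength = 0.8                           # analogous to strength_model in Load LoRA
W_adapted = W + strength * (B @ A)       # apply the scaled low-rank update

full_params = W.size                     # 589,824 values to retrain without LoRA
lora_params = A.size + B.size            # 12,288 trainable values with LoRA
```

Only `A` and `B` are trained, which is why LoRA files are so much smaller and cheaper to produce than full checkpoints.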
The image above compares generation with the same parameters using [dreamshaper\_8](https://civitai.com/models/4384?modelVersionId=128713) directly versus using the [blindbox\_V1Mix](https://civitai.com/models/25995/blindbox) LoRA model.
As you can see, by using a LoRA model, we can generate images in different styles without adjusting the base model.
We will demonstrate how to use a LoRA model. All LoRA variants (LyCORIS, LoHa, LoKr, LoCon, etc.) are used in the same way.
In this example, we will learn how to load and use a LoRA model in [ComfyUI](https://github.com/comfyanonymous/ComfyUI), covering the following topics:
1. Installing a LoRA model
2. Generating images using a LoRA model
3. A simple introduction to the `Load LoRA` node
## Required Model Installation
Download the [dreamshaper\_8.safetensors](https://civitai.com/api/download/models/128713?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16) file and put it in your `ComfyUI/models/checkpoints` folder.
Download the [blindbox\_V1Mix.safetensors](https://civitai.com/api/download/models/32988?type=Model\&format=SafeTensor\&size=full\&fp=fp16) file and put it in your `ComfyUI/models/loras` folder.
## LoRA Workflow File
Download the image below and **drag it into ComfyUI** to load the workflow.
Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
## Complete the Workflow Step by Step
Follow the steps in the diagram below to ensure the workflow runs correctly.

1. Ensure `Load Checkpoint` loads `dreamshaper_8.safetensors`
2. Ensure `Load LoRA` loads `blindbox_V1Mix.safetensors`
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
## Load LoRA Node Introduction
Models in the `ComfyUI/models/loras` folder will be detected by ComfyUI and can be loaded using this node.
### Input Types
| Parameter Name | Function |
| ---------------- | ------------------------------------------------------------------------------------------------------ |
| `model` | Connect to the base model |
| `clip` | Connect to the CLIP model |
| `lora_name` | Select the LoRA model to load and use |
| `strength_model` | Affects how strongly the LoRA influences the model weights; higher values make the LoRA style stronger |
| `strength_clip` | Affects how strongly the LoRA influences the CLIP text embeddings |
### Output Types
| Parameter Name | Function |
| -------------- | ---------------------------------------------------- |
| `model` | Outputs the model with LoRA adjustments applied |
| `clip` | Outputs the CLIP model with LoRA adjustments applied |
This node supports chain connections, allowing multiple `Load LoRA` nodes to be linked in series to apply multiple LoRA models. For more details, please refer to [ComfyUI Multiple LoRAs Example](/tutorials/basic/multiple-loras)
## Try It Yourself
1. Try modifying the prompt or adjusting different parameters of the `Load LoRA` node, such as `strength_model`, to observe changes in the generated images and become familiar with the `Load LoRA` node.
2. Visit [CivitAI](https://civitai.com/models) to download other kinds of LoRA models and try using them.
# ComfyUI Multiple LoRAs Example
Source: https://docs.comfy.org/tutorials/basic/multiple-loras
This guide demonstrates how to apply multiple LoRA models simultaneously in ComfyUI
In our [ComfyUI LoRA Example](/tutorials/basic/lora), we introduced how to load and use a single LoRA model, mentioning the node's chain connection capability.
This tutorial demonstrates chaining multiple `Load LoRA` nodes to apply two LoRA models simultaneously: [blindbox\_V1Mix](https://civitai.com/models/25995?modelVersionId=32988) and [MoXinV1](https://civitai.com/models/12597?modelVersionId=14856).
The comparison below shows individual effects of these LoRAs using identical parameters:
By chaining multiple LoRA models, we achieve a blended style in the final output:
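Conceptually, chaining two `Load LoRA` nodes amounts to applying two independent low-rank weight updates, each scaled by its own strength. A short numpy sketch of that idea (all dimensions and strengths are illustrative):

```python
import numpy as np

d, r = 64, 4                                  # illustrative dims and rank
rng = np.random.default_rng(1)
W = rng.standard_normal((d, d))               # frozen base model weight

def lora_delta(seed: int) -> np.ndarray:
    """One LoRA's low-rank update B @ A."""
    g = np.random.default_rng(seed)
    return g.standard_normal((d, r)) @ g.standard_normal((r, d))

# Each chained Load LoRA node contributes its own scaled update.
s1, s2 = 0.7, 0.5                             # each node's strength_model
W_out = W + s1 * lora_delta(10) + s2 * lora_delta(20)
```

Lowering either strength shrinks that LoRA's contribution to the blended style, which is exactly what adjusting `strength_model` on each node does.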
## Model Installation
Download the [dreamshaper\_8.safetensors](https://civitai.com/api/download/models/128713?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16) file and put it in your `ComfyUI/models/checkpoints` folder.
Download the [blindbox\_V1Mix.safetensors](https://civitai.com/api/download/models/32988?type=Model\&format=SafeTensor\&size=full\&fp=fp16) file and put it in your `ComfyUI/models/loras` folder.
Download the [MoXinV1.safetensors](https://civitai.com/api/download/models/14856?type=Model\&format=SafeTensor\&size=full\&fp=fp16) file and put it in your `ComfyUI/models/loras` folder.
## Multi-LoRA Workflow
Download the image below and **drag it into ComfyUI** to load the workflow:

Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
## Complete the Workflow Step by Step
Follow the steps in the diagram below to ensure the workflow runs correctly.
1. Ensure `Load Checkpoint` loads **dreamshaper\_8.safetensors**
2. Ensure first `Load LoRA` loads **blindbox\_V1Mix.safetensors**
3. Ensure second `Load LoRA` loads **MoXinV1.safetensors**
4. Click `Queue` or press `Ctrl/Cmd + Enter` to generate
## Try It Yourself
1. Adjust `strength_model` values in both `Load LoRA` nodes to control each LoRA's influence
2. Explore [CivitAI](https://civitai.com/models) for additional LoRAs and create custom combinations
# ComfyUI Outpainting Workflow Example
Source: https://docs.comfy.org/tutorials/basic/outpaint
This guide will introduce you to the outpainting workflow in ComfyUI and walk you through an outpainting example
This guide will introduce you to the concept of outpainting in AI image generation and how to create an outpainting workflow in ComfyUI. We will cover:
* Using outpainting workflow to extend an image
* Understanding and using outpainting-related nodes in ComfyUI
* Mastering the basic outpainting process
## About Outpainting
In AI image generation, we often encounter situations where an existing image has good composition but the canvas area is too small, requiring us to extend the canvas to get a larger scene. This is where outpainting comes in.
The process is essentially the same as [Inpainting](/tutorials/basic/inpaint), but we use different nodes to **build the mask**.
Outpainting applications include:
* **Scene Extension:** Expand the scene range of the original image to show a more complete environment
* **Composition Adjustment:** Optimize the overall composition by extending the canvas
* **Content Addition:** Add more related scene elements to the original image
## ComfyUI Outpainting Workflow Example Explanation
### Preparation
#### 1. Model Installation
Download the following model file and save it to `ComfyUI/models/checkpoints` directory:
* [512-inpainting-ema.safetensors](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.safetensors)
#### 2. Input Image
Prepare an image you want to extend. In this example, we will use the following image:
#### 3. Outpainting Workflow
Download the image below and **drag it into ComfyUI** to load the workflow:

Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
### Outpainting Workflow Usage Explanation
The key steps of the outpainting workflow are as follows:
1. Load the locally installed model file in the `Load Checkpoint` node
2. Click the `Upload` button in the `Load Image` node to upload your image
3. Click the `Queue` button or use the shortcut `Ctrl + Enter` to execute the image generation
In this workflow, we mainly use the `Pad Image for outpainting` node to control the direction and range of image extension. This is actually an [Inpaint](/tutorials/basic/inpaint) workflow, but we use different nodes to build the mask.
### Pad Image for outpainting Node
This node accepts an input image and outputs an extended image with a corresponding mask, where the mask is built based on the node parameters.
#### Input Parameters
| Parameter Name | Function |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `image` | Input image |
| `left` | Left padding amount |
| `top` | Top padding amount |
| `right` | Right padding amount |
| `bottom` | Bottom padding amount |
| `feathering` | Controls the smoothness of the transition between the original image and the added padding, higher values create smoother transitions |
#### Output Parameters
| Parameter Name | Function |
| -------------- | -------------------------------------------------------------------------- |
| `image` | Output `image` represents the padded image |
| `mask` | Output `mask` indicates the original image area and the added padding area |
#### Node Output Content
After processing by the `Pad Image for outpainting` node, the output image and mask preview are as follows:
You can see the corresponding output results:
* The `Image` output is the extended image
* The `Mask` output is the mask marking the extension areas
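A minimal numpy sketch of the node's idea, ignoring `feathering` and not reflecting ComfyUI's actual code:

```python
import numpy as np

def pad_for_outpaint(image: np.ndarray, left=0, top=0, right=0, bottom=0):
    """Extend the canvas by the given amounts and build a matching mask:
    0 over the original pixels (keep), 1 over the new padding (generate)."""
    h, w = image.shape[:2]
    padded = np.pad(image, ((top, bottom), (left, right), (0, 0)), mode="edge")
    mask = np.ones((h + top + bottom, w + left + right), dtype=np.float32)
    mask[top:top + h, left:left + w] = 0.0
    return padded, mask

img = np.ones((4, 4, 3))                      # tiny stand-in image
padded, mask = pad_for_outpaint(img, right=2) # extend 2 pixels to the right
```

The `feathering` parameter would additionally smooth the 0-to-1 transition at the boundary instead of leaving the hard edge this sketch produces.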
# ComfyUI Text to Image Workflow
Source: https://docs.comfy.org/tutorials/basic/text-to-image
This guide will help you understand the concept of text-to-image in AI art generation and complete a text-to-image workflow in ComfyUI
This guide aims to introduce you to ComfyUI's text-to-image workflow and help you understand the functionality and usage of various ComfyUI nodes.
In this document, we will:
* Complete a text-to-image workflow
* Gain a basic understanding of diffusion model principles
* Learn about the functions and roles of workflow nodes
* Get an initial understanding of the SD1.5 model
We'll start by running a text-to-image workflow, followed by explanations of related concepts. Please choose the relevant sections based on your needs.
## About Text to Image
**Text to Image** is a fundamental process in AI art generation that creates images from text descriptions, with **diffusion models** at its core.
The text-to-image process requires the following elements:
* **Artist:** The image generation model
* **Canvas:** The latent space
* **Image Requirements (Prompts):** Including positive prompts (elements you want in the image) and negative prompts (elements you don't want)
This text-to-image generation process can be simply understood as telling your requirements (positive and negative prompts) to an **artist (the image model)**, who then creates what you want based on these requirements.
## ComfyUI Text to Image Workflow Example Guide
### 1. Preparation
Ensure you have at least one SD1.5 model file in your `ComfyUI/models/checkpoints` folder, such as [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors)
If you haven't installed it yet, please refer to the model installation section in [Getting Started with ComfyUI AI Art Generation](/get_started/first_generation).
### 2. Loading the Text to Image Workflow
Download the image below and **drag it into ComfyUI** to load the workflow:
Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
### 3. Loading the Model and Generating Your First Image
After installing the image model, follow the steps in the image below to load the model and generate your first image

Follow these steps according to the image numbers:
1. In the **Load Checkpoint** node, use the arrows or click the text area to ensure **v1-5-pruned-emaonly-fp16.safetensors** is selected and the selector doesn't display **null**
2. Click the `Queue` button or use the shortcut `Ctrl + Enter` to execute image generation
After the process completes, you should see the resulting image in the **Save Image** node interface, which you can right-click to save locally
If you're not satisfied with the result, try running the generation multiple times. Each time you run it, **KSampler** will use a different random seed based on the `seed` parameter, so each generation will produce different results
### 4. Start Experimenting
Try modifying the text in the **CLIP Text Encoder**
The `Positive` connection to the KSampler node represents positive prompts, while the `Negative` connection represents negative prompts
Here are some basic prompting principles for the SD1.5 model:
* Use English whenever possible
* Separate prompts with English commas `,`
* Use phrases rather than long sentences
* Use specific descriptions
* Use expressions like `(golden hour:1.2)` to increase the weight of specific keywords, making them more likely to appear in the image. `1.2` is the weight, `golden hour` is the keyword
* Use keywords like `masterpiece, best quality, 4k` to improve generation quality
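The `(keyword:weight)` emphasis syntax can be illustrated with a tiny parser. This is a sketch that ignores nesting and escaping, which ComfyUI's real parser handles:

```python
import re

def parse_weighted(prompt: str):
    """Extract (keyword:weight) pairs from a prompt string."""
    pairs = re.findall(r"\(([^():]+):([\d.]+)\)", prompt)
    return [(kw.strip(), float(w)) for kw, w in pairs]

weighted = parse_weighted("(golden hour:1.2), sunset, masterpiece")
```

Unweighted terms like `sunset` keep the default weight of 1.0; only parenthesized terms carry an explicit emphasis value.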
Here are several prompt examples you can try, or use your own prompts for generation:
**1. Anime Style**
Positive prompts:
```
anime style, 1girl with long pink hair, cherry blossom background, studio ghibli aesthetic, soft lighting, intricate details
masterpiece, best quality, 4k
```
Negative prompts:
```
low quality, blurry, deformed hands, extra fingers
```
**2. Realistic Style**
Positive prompts:
```
(ultra realistic portrait:1.3), (elegant woman in crimson silk dress:1.2),
full body, soft cinematic lighting, (golden hour:1.2),
(fujifilm XT4:1.1), shallow depth of field,
(skin texture details:1.3), (film grain:1.1),
gentle wind flow, warm color grading, (perfect facial symmetry:1.3)
```
Negative prompts:
```
(deformed, cartoon, anime, doll, plastic skin, overexposed, blurry, extra fingers)
```
**3. Specific Artist Style**
Positive prompts:
```
fantasy elf, detailed character, glowing magic, vibrant colors, long flowing hair, elegant armor, ethereal beauty, mystical forest, magical aura, high detail, soft lighting, fantasy portrait, Artgerm style
```
Negative prompts:
```
blurry, low detail, cartoonish, unrealistic anatomy, out of focus, cluttered, flat lighting
```
## Text to Image Working Principles
The entire text-to-image process can be understood as a **reverse diffusion process**. The [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) model we downloaded is a pre-trained model that can **generate target images from pure Gaussian noise**. We only need to input our prompts, and it generates the target image by denoising random noise.
```mermaid
graph LR
A[Pure Gaussian Noise] --> B[Iterative Denoising]
B --> C[Intermediate Latents]
C --> D[Final Generated Image]
E[Text Prompts] --> F[CLIP Encoder]
F --> G[Semantic Vectors]
G --> B
```
We need to understand two concepts:
1. **Latent Space:** An abstract, compressed data representation used by diffusion models. Converting images from pixel space to latent space reduces storage requirements and makes diffusion models easier to train by lowering the complexity of denoising. It's like architects working on blueprints (latent space) instead of the building itself (pixel space): structural features are preserved while modification costs drop significantly
2. **Pixel Space:** Pixel Space is the storage space for images, which is the final image we see, used to store pixel values.
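A bit of arithmetic shows why latent space is cheaper to work in: SD1.5's VAE downsamples each spatial dimension by a factor of 8 and uses 4 latent channels:

```python
h, w = 512, 512                                   # SD1.5's native resolution
latent_h, latent_w, latent_c = h // 8, w // 8, 4  # VAE: 8x downsample, 4 channels
pixel_values = h * w * 3                          # RGB pixel space: 786,432 values
latent_values = latent_h * latent_w * latent_c    # latent space: 16,384 values
ratio = pixel_values // latent_values             # 48x fewer values to denoise
```

So the sampler denoises a 64×64×4 tensor rather than a 512×512×3 image, roughly a 48-fold reduction in the data handled per step.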
If you want to learn more about diffusion models, you can read these papers:
* [Denoising Diffusion Probabilistic Models (DDPM)](https://arxiv.org/pdf/2006.11239)
* [Denoising Diffusion Implicit Models (DDIM)](https://arxiv.org/pdf/2010.02502)
* [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/pdf/2112.10752)
## ComfyUI Text to Image Workflow Node Explanation
### A. Load Checkpoint Node
This node is typically used to load the image generation model. A `checkpoint` usually contains three components: `MODEL (UNet)`, `CLIP`, and `VAE`
* `MODEL (UNet)`: The UNet model responsible for noise prediction and image generation during the diffusion process
* `CLIP`: The text encoder that converts our text prompts into vectors that the model can understand, as the model cannot directly understand text prompts
* `VAE`: The Variational AutoEncoder that converts images between pixel space and latent space, as diffusion models work in latent space while our images are in pixel space
### B. Empty Latent Image Node
Defines a latent space that outputs to the KSampler node. The Empty Latent Image node constructs a **pure noise latent space**
You can think of its function as defining the canvas size, which determines the dimensions of our final generated image
### C. CLIP Text Encoder Node
Used to encode prompts, which are your requirements for the image
* The `Positive` condition input connected to the KSampler node represents positive prompts (elements you want in the image)
* The `Negative` condition input connected to the KSampler node represents negative prompts (elements you don't want in the image)
The prompts are encoded into semantic vectors by the `CLIP` component from the `Load Checkpoint` node and output as conditions to the KSampler node
### D. KSampler Node
The **KSampler** is the core of the entire workflow, where the entire noise denoising process occurs, ultimately outputting a latent space image
```mermaid
graph LR
A[Diffusion Model] --> B{KSampler}
C[Random Noise Latent Space] --> B
D[CLIP Semantic Vectors] --> B
B --> E[Denoised Latent]
```
Here's an explanation of the KSampler node parameters:
| Parameter Name | Description | Function |
| ---------------------------- | ---------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| **model** | Diffusion model used for denoising | Determines the style and quality of generated images |
| **positive** | Positive prompt condition encoding | Guides generation to include specified elements |
| **negative** | Negative prompt condition encoding | Suppresses unwanted content |
| **latent\_image** | Latent space image to be denoised | Serves as the input carrier for noise initialization |
| **seed** | Random seed for noise generation | Controls generation randomness |
| **control\_after\_generate** | Seed control mode after generation | Determines seed variation pattern in batch generation |
| **steps** | Number of denoising iterations | More steps mean finer details but longer processing time |
| **cfg** | Classifier-free guidance scale | Controls prompt constraint strength (too high leads to overfitting) |
| **sampler\_name** | Sampling algorithm name | Determines the mathematical method for denoising path |
| **scheduler** | Scheduler type | Controls noise decay rate and step size allocation |
| **denoise** | Denoising strength coefficient | Controls noise strength added to latent space, 0.0 preserves original input features, 1.0 is complete noise |
In the KSampler node, the latent space uses `seed` as an initialization parameter to construct random noise, and semantic vectors `Positive` and `Negative` are input as conditions to the diffusion model
Denoising is then performed over the number of iterations set by the `steps` parameter: at each step the sampler removes part of the noise, with the overall strength of the process scaled by the `denoise` coefficient, ultimately producing a new latent space image
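The interplay of these parameters can be sketched in a few lines of Python. This is a deliberately toy model, not ComfyUI's actual sampler: `toy_predict` is a made-up stand-in for the diffusion model's noise prediction, and the prompts are reduced to scalar strengths.

```python
import random

def toy_predict(latent, cond):
    # Toy stand-in for the diffusion model's noise prediction.
    return [x * cond for x in latent]

def ksampler_sketch(seed, steps, denoise, positive=1.0, negative=0.0, cfg=7.0):
    rng = random.Random(seed)                 # seed controls the initial noise
    latent = [rng.gauss(0.0, 1.0) for _ in range(4)]  # toy 4-value latent
    run_steps = max(1, int(steps * denoise))  # denoise < 1.0 => weaker pass
    for _ in range(run_steps):
        uncond = toy_predict(latent, negative)
        cond = toy_predict(latent, positive)
        # Classifier-free guidance: cfg scales how hard the positive
        # prompt pulls the prediction away from the unconditional one.
        guided = [u + cfg * (c - u) for u, c in zip(uncond, cond)]
        latent = [x - g / run_steps for x, g in zip(latent, guided)]
    return latent
```

Note that the same `seed` always reproduces the same initial noise, which is why fixing the seed makes generations repeatable.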
### E. VAE Decode Node
Converts the latent space image output from the **KSampler** into a pixel space image
### F. Save Image Node
Previews and saves the decoded image from latent space to the local `ComfyUI/output` folder
## Introduction to SD1.5 Model
**SD1.5 (Stable Diffusion 1.5)** is an AI image generation model developed by [Stability AI](https://stability.ai/). It's the foundational version of the Stable Diffusion series, trained on **512×512** resolution images, making it particularly good at generating images at this resolution. With a size of about 4GB, it runs smoothly on **consumer-grade GPUs (e.g., 6GB VRAM)**. Currently, SD1.5 has a rich ecosystem, supporting various plugins (like ControlNet, LoRA) and optimization tools.
As a milestone model in AI art generation, SD1.5 remains the best entry-level choice thanks to its open-source nature, lightweight architecture, and rich ecosystem. Although newer versions like SDXL/SD3 have been released, its value for consumer-grade hardware remains unmatched.
### Basic Information
* **Release Date**: October 2022
* **Core Architecture**: Based on Latent Diffusion Model (LDM)
* **Training Data**: LAION-Aesthetics v2 5+ dataset (fine-tuned for approximately 595k steps at 512×512)
* **Open Source Features**: Fully open-source model/code/training data
### Advantages and Limitations
Model Advantages:
* Lightweight: only about 4GB in size, runs smoothly on consumer GPUs
* Low Entry Barrier: simple to set up and forgiving of modest hardware
* Mature Ecosystem: extensive plugin and tool support (e.g., ControlNet, LoRA)
* Fast Generation: short inference times, even on consumer GPUs
Model Limitations:
* Detail Handling: Hands/complex lighting prone to distortion
* Resolution Limits: Quality degrades for direct 1024×1024 generation
* Prompt Dependency: Requires precise English descriptions for control
# ComfyUI Image Upscale Workflow
Source: https://docs.comfy.org/tutorials/basic/upscale
This guide explains the concept of image upscaling in AI drawing and demonstrates how to implement an image upscaling workflow in ComfyUI
## What is Image Upscaling?
Image Upscaling is the process of converting low-resolution images to high-resolution using algorithms.
Unlike traditional interpolation methods, AI upscaling models (like ESRGAN) can intelligently reconstruct details while maintaining image quality.
For instance, the default SD1.5 model often struggles with large-size image generation.
To achieve high-resolution results, we typically generate smaller images first and then use upscaling techniques.
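To make the contrast with AI upscaling concrete, here is a minimal traditional upscaler using NumPy's nearest-neighbor repetition. Unlike an ESRGAN-style model, it can only duplicate existing pixels and never reconstructs detail:

```python
import numpy as np

def nearest_upscale(img, factor):
    """Naive nearest-neighbor upscaling: every pixel is repeated `factor`
    times along both axes. No new detail is invented."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

small = np.arange(16, dtype=np.uint8).reshape(4, 4)  # toy 4x4 "image"
big = nearest_upscale(small, 4)                      # 4x, like 4x-ESRGAN
print(big.shape)  # (16, 16)
```

An AI upscaling model produces output of the same shape, but hallucinates plausible texture and edges instead of repeating blocks.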
This article covers one of many upscaling methods in ComfyUI. In this tutorial, we'll guide you through:
1. Downloading and installing upscaling models
2. Performing basic image upscaling
3. Combining text-to-image workflows with upscaling
## Upscaling Workflow
### Model Installation
Required ESRGAN models download:
Visit [OpenModelDB](https://openmodeldb.info/) to search and download upscaling models (e.g., RealESRGAN)
As shown:
1. Filter models by image type using the category selector
2. The model's magnification factor is indicated in the top-right corner (e.g., 2x in the screenshot)
We'll use the [4x-ESRGAN](https://openmodeldb.info/models/4x-ESRGAN) model for this tutorial. Click the `Download` button on the model detail page.
Save the model file (.pth) in `ComfyUI/models/upscale_models` directory
### Workflow and Assets
Download and drag the following image into ComfyUI to load the basic upscaling workflow:

Images containing workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
Use this smaller-sized image as input:
### Complete the Workflow Step by Step
Follow the steps in the diagram below to ensure the workflow runs correctly.
1. Ensure `Load Upscale Model` loads `4x-ESRGAN.pth`
2. Upload the input image to the `Load Image` node
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
The core components are the `Load Upscale Model` and `Upscale Image (Using Model)` nodes, which receive an image input and upscale it using the selected model.
## Text-to-Image Combined Workflow
After mastering basic upscaling, we can combine it with the [text-to-image](/tutorials/basic/text-to-image) workflow; refer to that tutorial if you need a refresher on the basics.
Download and drag this image into ComfyUI to load the combined workflow:
This workflow connects the text-to-image output image directly to the upscaling nodes for final processing.
## Additional Tips
Model characteristics:
* **RealESRGAN**: General-purpose upscaling
* **BSRGAN**: Excels with text and sharp edges
* **SwinIR**: Preserves natural textures, ideal for landscapes
1. **Chained Upscaling**: Combine multiple upscale nodes (e.g., 2x → 4x) for ultra-high magnification
2. **Hybrid Workflow**: Connect upscale nodes after generation for "generate+enhance" pipelines
3. **Comparative Testing**: Different models perform better on specific image types - test multiple options
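For chained upscaling (tip 1), the overall magnification is simply the product of the individual node factors; a small sketch:

```python
def chained_resolution(width, height, factors):
    # Each upscale node multiplies both dimensions by its model's factor,
    # so chained factors multiply together (2x then 4x => 8x overall).
    for f in factors:
        width, height = width * f, height * f
    return width, height

print(chained_resolution(512, 512, [2, 4]))  # (4096, 4096)
```

Keep the final resolution in mind: very large outputs can exhaust VRAM during decoding.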
# ComfyUI ControlNet Usage Example
Source: https://docs.comfy.org/tutorials/controlnet/controlnet
This guide will introduce you to the basic concepts of ControlNet and demonstrate how to generate corresponding images in ComfyUI
Precise control over image creation in AI image generation is hard to achieve in a single attempt;
it typically takes numerous generations to produce a satisfactory image. The emergence of **ControlNet** has effectively addressed this challenge.
ControlNet is a conditional control generation model based on diffusion models (such as Stable Diffusion),
first proposed by [Lvmin Zhang](https://lllyasviel.github.io/) et al. in the 2023 paper [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543).
ControlNet models significantly enhance the controllability of image generation and the ability to reproduce details by introducing multimodal input conditions,
such as edge detection maps, depth maps, and pose keypoints.
These conditioning constraints make image generation more controllable, allowing multiple ControlNet models to be used simultaneously during the drawing process for better results.
Before ControlNet, we could only rely on the model to generate images repeatedly until we were satisfied with the results, which involved a lot of randomness.
With the advent of ControlNet, we can control image generation by introducing additional conditions.
For example, we can use a simple sketch to guide the image generation process, producing images that closely align with our sketch.
In this example, we will guide you through installing and using ControlNet models in [ComfyUI](https://github.com/comfyanonymous/ComfyUI), and complete a sketch-controlled image generation example.

The workflows for other types of ControlNet V1.1 models are similar to this example. You only need to select the appropriate model and upload the corresponding reference image based on your needs.
## ControlNet Image Preprocessing Information
Different types of ControlNet models typically require different types of reference images:

> Image source: [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
Since the current **Comfy Core** nodes do not include all types of **preprocessors**, in the actual examples in this documentation, we will provide pre-processed images.
However, in practical use, you may need to use custom nodes to preprocess images to meet the requirements of different ControlNet models. Below are some relevant custom nodes:
* [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet)
* [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
## ComfyUI ControlNet Workflow Example Explanation
### 1. ControlNet Workflow Assets
Please download the workflow image below and drag it into ComfyUI to load the workflow:

Images with workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
This image already includes download links for the corresponding models, and dragging it into ComfyUI will automatically prompt for downloads.
Please download the image below, which we will use as input:

### 2. Manual Model Installation
If your network cannot successfully complete the automatic download of the corresponding models, please try manually downloading the models below and placing them in the specified directories:
* [dreamCreationVirtual3DECommerce\_v10.safetensors](https://civitai.com/api/download/models/731340?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true)
* [control\_v11p\_sd15\_scribble\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_scribble_fp16.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── dreamCreationVirtual3DECommerce_v10.safetensors
│ ├── vae/
│ │ └── vae-ft-mse-840000-ema-pruned.safetensors
│ └── controlnet/
│ └── control_v11p_sd15_scribble_fp16.safetensors
```
In this example, you could also use the VAE model embedded in dreamCreationVirtual3DECommerce\_v10.safetensors, but we're following the model author's recommendation to use a separate VAE model.
### 3. Step-by-Step Workflow Execution
1. Ensure that `Load Checkpoint` can load **dreamCreationVirtual3DECommerce\_v10.safetensors**
2. Ensure that `Load VAE` can load **vae-ft-mse-840000-ema-pruned.safetensors**
3. Click `Upload` in the `Load Image` node to upload the input image provided earlier
4. Ensure that `Load ControlNet` can load **control\_v11p\_sd15\_scribble\_fp16.safetensors**
5. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
## Related Node Explanations
### Load ControlNet Node Explanation
Models located in `ComfyUI/models/controlnet` will be detected by ComfyUI and can be loaded through this node.
### Apply ControlNet Node Explanation
This node accepts the ControlNet model loaded by the `Load ControlNet` node and generates corresponding control conditions based on the input image.
**Input Types**
| Parameter Name | Function |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `positive` | Positive conditioning |
| `negative` | Negative conditioning |
| `control_net` | The ControlNet model to be applied |
| `image` | Preprocessed image used as reference for ControlNet application |
| `vae` | VAE model input |
| `strength` | Strength of ControlNet application; higher values increase ControlNet's influence on the generated image |
| `start_percent` | Determines when to start applying ControlNet as a percentage; e.g., 0.2 means ControlNet guidance begins when 20% of diffusion is complete |
| `end_percent` | Determines when to stop applying ControlNet as a percentage; e.g., 0.8 means ControlNet guidance stops when 80% of diffusion is complete |
**Output Types**
| Parameter Name | Function |
| -------------- | -------------------------------------------------- |
| `positive` | Positive conditioning data processed by ControlNet |
| `negative` | Negative conditioning data processed by ControlNet |
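One way to picture `start_percent` and `end_percent` is as a gating window over sampling progress. The sketch below illustrates the semantics described in the table above; it is not ComfyUI's internal implementation:

```python
def controlnet_weight(progress, strength, start_percent, end_percent):
    """ControlNet only contributes while the sampling progress (0.0-1.0)
    lies inside the [start_percent, end_percent] window."""
    return strength if start_percent <= progress <= end_percent else 0.0

# With end_percent = 0.8, guidance switches off for the final 20% of steps,
# letting the model refine details without the control constraint:
weights = [controlnet_weight(p / 10, 1.0, 0.0, 0.8) for p in range(11)]
```

Releasing the constraint near the end of sampling is a common way to keep composition controlled while letting fine detail emerge freely.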
You can use chain connections to apply multiple ControlNet models, as shown in the image below. You can also refer to the [Mixing ControlNet Models](/tutorials/controlnet/mixing-controlnets) guide to learn more about combining multiple ControlNet models.
You might see the `Apply ControlNet(Old)` node in some early workflows, which is an early version of the ControlNet node. It is currently deprecated and not visible by default in search and node lists.
To enable it, go to **Settings** --> **comfy** --> **Node** and enable the `Show deprecated nodes in search` option. However, it's recommended to use the new node.
## Start Your Exploration
1. Try creating similar sketches, or even draw your own, and use ControlNet models to generate images to experience the benefits of ControlNet.
2. Adjust the `Control Strength` parameter in the Apply ControlNet node to control the influence of the ControlNet model on the generated image.
3. Visit the [ControlNet-v1-1\_fp16\_safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/tree/main) repository to download other types of ControlNet models and try using them to generate images.
# ComfyUI Depth ControlNet Usage Example
Source: https://docs.comfy.org/tutorials/controlnet/depth-controlnet
This guide will introduce you to the basic concepts of Depth ControlNet and demonstrate how to generate corresponding images in ComfyUI
## Introduction to Depth Maps and Depth ControlNet
A depth map is a special type of image that uses grayscale values to represent the distance between objects in a scene and the observer or camera. In a depth map, the grayscale value is inversely proportional to distance: brighter areas (closer to white) indicate objects that are closer, while darker areas (closer to black) indicate objects that are farther away.
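The grayscale-to-distance convention can be illustrated with a small synthetic depth map in NumPy, where pixel values fall from 255 (near) to 0 (far) across the image:

```python
import numpy as np

# Synthetic depth map: brightness decreases left to right, so the left
# edge reads as nearest (white, 255) and the right edge as farthest (black, 0).
width, height = 8, 4
depth = np.tile(np.linspace(255, 0, width, dtype=np.uint8), (height, 1))
print(depth[0, 0], depth[0, -1])  # 255 0
```

Real depth maps are produced by depth-estimation preprocessors (e.g., MiDaS-style nodes) rather than constructed by hand.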

Depth ControlNet is a ControlNet model specifically trained to understand and utilize depth map information. It helps AI correctly interpret spatial relationships, ensuring that generated images conform to the spatial structure specified by the depth map, thereby enabling precise control over three-dimensional spatial layouts.
### Application Scenarios for Depth Maps with ControlNet
Depth maps have numerous applications in various scenarios:
1. **Portrait Scenes**: Control the spatial relationship between subjects and backgrounds, avoiding distortion in critical areas such as faces
2. **Landscape Scenes**: Control the hierarchical relationships between foreground, middle ground, and background
3. **Architectural Scenes**: Control the spatial structure and perspective relationships of buildings
4. **Product Showcase**: Control the separation and spatial positioning of products against their backgrounds
In this example, we will use a depth map to generate an architectural visualization scene.
## ComfyUI ControlNet Workflow Example Explanation
### 1. ControlNet Workflow Assets
Please download the workflow image below and drag it into ComfyUI to load the workflow:

Images with workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
This image already includes download links for the corresponding models, and dragging it into ComfyUI will automatically prompt for downloads.
Please download the image below, which we will use as input:

### 2. Model Installation
If your network cannot successfully complete the automatic download of the corresponding models, please try manually downloading the models below and placing them in the specified directories:
* [architecturerealmix\_v11.safetensors](https://civitai.com/api/download/models/431755?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [control\_v11f1p\_sd15\_depth\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11f1p_sd15_depth_fp16.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── architecturerealmix_v11.safetensors
│ └── controlnet/
│ └── control_v11f1p_sd15_depth_fp16.safetensors
```
### 3. Step-by-Step Workflow Execution
1. Ensure that `Load Checkpoint` can load **architecturerealmix\_v11.safetensors**
2. Ensure that `Load ControlNet` can load **control\_v11f1p\_sd15\_depth\_fp16.safetensors**
3. Click `Upload` in the `Load Image` node to upload the depth image provided earlier
4. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
## Combining Depth Control with Other Techniques
Based on different creative needs, you can combine Depth ControlNet with other types of ControlNet to achieve better results:
1. **Depth + Lineart**: Maintain spatial relationships while reinforcing outlines, suitable for architecture, products, and character design
2. **Depth + Pose**: Control character posture while maintaining correct spatial relationships, suitable for character scenes
For more information on using multiple ControlNet models together, please refer to the [Mixing ControlNet](/tutorials/controlnet/mixing-controlnets) example.
# ComfyUI Depth T2I Adapter Usage Example
Source: https://docs.comfy.org/tutorials/controlnet/depth-t2i-adapter
This guide will introduce you to the basic concepts of Depth T2I Adapter and demonstrate how to generate corresponding images in ComfyUI
## Introduction to T2I Adapter
[T2I-Adapter](https://huggingface.co/TencentARC/T2I-Adapter) is a lightweight adapter developed by [Tencent ARC Lab](https://github.com/TencentARC) designed to enhance the structural, color, and style control capabilities of text-to-image generation models (such as Stable Diffusion).
It works by aligning external conditions (such as edge detection maps, depth maps, sketches, or color reference images) with the model's internal features, achieving high-precision control without modifying the original model structure. With only about 77M parameters (approximately 300MB in size), its inference speed is about 3 times faster than [ControlNet](https://github.com/lllyasviel/ControlNet-v1-1-nightly), and it supports multiple condition combinations (such as sketch + color grid). Application scenarios include line art to image conversion, color style transfer, multi-element scene generation, and more.
### Comparison Between T2I Adapter and ControlNet
Although their functions are similar, there are notable differences in implementation and application:
1. **Lightweight Design**: T2I Adapter has fewer parameters and a smaller memory footprint
2. **Inference Speed**: T2I Adapter is typically about 3 times faster than ControlNet
3. **Control Precision**: ControlNet offers more precise control in certain scenarios, while T2I Adapter is more suitable for lightweight control
4. **Multi-condition Combination**: T2I Adapter shows more significant resource advantages when combining multiple conditions
### Main Types of T2I Adapter
T2I Adapter provides various types to control different aspects:
* **Depth**: Controls the spatial structure and depth relationships in images
* **Line Art (Canny/Sketch)**: Controls image edges and lines
* **Keypose**: Controls character poses and actions
* **Segmentation (Seg)**: Controls scene layout through semantic segmentation
* **Color**: Controls the overall color scheme of images
In ComfyUI, using T2I Adapter is similar to [ControlNet](/tutorials/controlnet/controlnet) in terms of interface and workflow. In this example, we will demonstrate how to use a depth T2I Adapter to control an interior scene.

## Value of Depth T2I Adapter Applications
Depth maps have several important applications in image generation:
1. **Spatial Layout Control**: Accurately describes three-dimensional spatial structures, suitable for interior design and architectural visualization
2. **Object Positioning**: Controls the relative position and size of objects in a scene, suitable for product showcases and scene construction
3. **Perspective Relationships**: Maintains reasonable perspective and proportions, suitable for landscape and urban scene generation
4. **Light and Shadow Layout**: Natural light and shadow distribution based on depth information, enhancing realism
We will use interior design as an example to demonstrate how to use the depth T2I Adapter, but these techniques are applicable to other scenarios as well.
## ComfyUI Depth T2I Adapter Workflow Example Explanation
### 1. Depth T2I Adapter Workflow Assets
Please download the workflow image below and drag it into ComfyUI to load the workflow:

Images with workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
This image already includes download links for the corresponding models, and dragging it into ComfyUI will automatically prompt for downloads.
Please download the image below, which we will use as input:

### 2. Model Installation
If your network cannot successfully complete the automatic download of the corresponding models, please try manually downloading the models below and placing them in the specified directories:
* [interiordesignsuperm\_v2.safetensors](https://civitai.com/api/download/models/93152?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [t2iadapter\_depth\_sd15v2.pth](https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_depth_sd15v2.pth?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── interiordesignsuperm_v2.safetensors
│ └── controlnet/
│ └── t2iadapter_depth_sd15v2.pth
```
### 3. Step-by-Step Workflow Execution
1. Ensure that `Load Checkpoint` can load **interiordesignsuperm\_v2.safetensors**
2. Ensure that `Load ControlNet` can load **t2iadapter\_depth\_sd15v2.pth**
3. Click `Upload` in the `Load Image` node to upload the input image provided earlier
4. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
## General Tips for Using T2I Adapter
### Input Image Quality Optimization
Regardless of the application scenario, high-quality input images are key to successfully using T2I Adapter:
1. **Moderate Contrast**: Control images (such as depth maps, line art) should have clear contrast, but not excessively extreme
2. **Clear Boundaries**: Ensure that major structures and element boundaries are clearly distinguishable in the control image
3. **Noise Control**: Try to avoid excessive noise in control images, especially for depth maps and line art
4. **Reasonable Layout**: Control images should have a reasonable spatial layout and element distribution
## Characteristics of T2I Adapter Usage
One major advantage of T2I Adapter is its ability to easily combine multiple conditions for complex control effects:
1. **Depth + Edge**: Control spatial layout while maintaining clear structural edges, suitable for architecture and interior design
2. **Line Art + Color**: Control shapes while specifying color schemes, suitable for character design and illustrations
3. **Pose + Segmentation**: Control character actions while defining scene areas, suitable for complex narrative scenes
Mixing different T2I Adapters, or combining them with other control methods (such as ControlNet, regional prompts, etc.), can further expand creative possibilities. To achieve mixing, simply chain multiple `Apply ControlNet` nodes together in the same way as described in [Mixing ControlNet](/tutorials/controlnet/mixing-controlnets).
# ComfyUI Mixing ControlNet Examples
Source: https://docs.comfy.org/tutorials/controlnet/mixing-controlnets
In this example, we will demonstrate how to mix multiple ControlNets and learn to use multiple ControlNet models to control image generation
In AI image generation, a single control condition often fails to meet the requirements of complex scenes. Mixing multiple ControlNets allows you to control different regions or aspects of an image simultaneously, achieving more precise control over image generation.
In certain scenarios, mixing ControlNets can leverage the characteristics of different control conditions to achieve more refined conditional control:
1. **Scene Complexity**: Complex scenes require multiple control conditions working together
2. **Fine-grained Control**: By adjusting the strength parameter of each ControlNet, you can precisely control the degree of influence for each part
3. **Complementary Effects**: Different types of ControlNets can complement each other, compensating for the limitations of single controls
4. **Creative Expression**: Combining different controls can produce unique creative effects
### How to Mix ControlNets
When mixing multiple ControlNets, each ControlNet influences the image generation process according to its applied area. ComfyUI enables multiple ControlNet conditions to be applied sequentially in a layered manner through chain connections in the `Apply ControlNet` node:
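The chaining described above can be sketched conceptually in Python. The hypothetical `apply_controlnet` function below models one `Apply ControlNet` node taking the previous node's conditioning output and layering its own hint on top; it mirrors the shape of the node graph, not ComfyUI's real data types:

```python
def apply_controlnet(conditioning, control_hint, strength):
    """Each chained node receives (positive, negative) conditioning and
    returns it with one more control hint attached to both branches."""
    positive, negative = conditioning
    hint = (control_hint, strength)
    return (positive + [hint], negative + [hint])

cond = (["prompt"], ["negative prompt"])
cond = apply_controlnet(cond, "openpose_image", 1.0)  # first node in the chain
cond = apply_controlnet(cond, "scribble_image", 1.0)  # second node chains on
```

Because each node's output feeds the next node's input, the order of the chain determines the order in which controls are layered.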
## ComfyUI ControlNet Regional Division Mixing Example
In this example, we will use a combination of **Pose ControlNet** and **Scribble ControlNet** to generate a scene containing multiple elements: a character on the left controlled by Pose ControlNet and a cat on a scooter on the right controlled by Scribble ControlNet.
### 1. ControlNet Mixing Workflow Assets
Please download the workflow image below and drag it into ComfyUI to load the workflow:

This workflow image contains metadata and can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`. The system will automatically detect and prompt to download the required models.
Input pose image (controls the character pose on the left):

Input scribble image (controls the cat and scooter on the right):

### 2. Manual Model Installation
If your network cannot successfully complete the automatic download of the corresponding models, please try manually downloading the models below and placing them in the specified directories:
* [awpainting\_v14.safetensors](https://civitai.com/api/download/models/624939?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [control\_v11p\_sd15\_scribble\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_scribble_fp16.safetensors?download=true)
* [control\_v11p\_sd15\_openpose\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_openpose_fp16.safetensors?download=true)
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── awpainting_v14.safetensors
│ ├── controlnet/
│ │ ├── control_v11p_sd15_scribble_fp16.safetensors
│ │ └── control_v11p_sd15_openpose_fp16.safetensors
│ └── vae/
│ └── vae-ft-mse-840000-ema-pruned.safetensors
```
### 3. Step-by-Step Workflow Execution
Follow these steps according to the numbered markers in the image:
1. Ensure that `Load Checkpoint` can load **awpainting\_v14.safetensors**
2. Ensure that `Load VAE` can load **vae-ft-mse-840000-ema-pruned.safetensors**
First ControlNet group using the Openpose model:
3\. Ensure that `Load ControlNet Model` loads **control\_v11p\_sd15\_openpose\_fp16.safetensors**
4\. Click `Upload` in the `Load Image` node to upload the pose image provided earlier
Second ControlNet group using the Scribble model:
5\. Ensure that `Load ControlNet Model` loads **control\_v11p\_sd15\_scribble\_fp16.safetensors**
6\. Click `Upload` in the `Load Image` node to upload the scribble image provided earlier
7\. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
## Workflow Explanation
#### Strength Balance
When controlling different regions of an image, balancing the strength parameters is particularly important:
* If the ControlNet strength for one region is significantly higher than another, it may cause that region's control effect to overpower and suppress the other region
* It's recommended to set similar strength values for ControlNets controlling different regions, for example, both set to 1.0
#### Prompt Techniques
For regional division mixing, the prompt needs to include descriptions of both regions:
```
"A woman in red dress, a cat riding a scooter, detailed background, high quality"
```
Such a prompt covers both the character and the cat on the scooter, ensuring the model pays attention to both control regions.
## Multi-dimensional Control Applications for a Single Subject
In addition to the regional division mixing shown in this example, another common mixing approach is to apply multi-dimensional control to the same subject. For example:
* **Pose + Depth**: Control character posture and spatial sense
* **Pose + Canny**: Control character posture and edge details
* **Pose + Reference**: Control character posture while referencing a specific style
In this type of application, reference images for multiple ControlNets should be aligned to the same subject, and their strengths should be adjusted to ensure proper balance.
By combining different types of ControlNets and specifying their control regions, you can achieve precise control over elements in your image.
# ComfyUI Pose ControlNet Usage Example
Source: https://docs.comfy.org/tutorials/controlnet/pose-controlnet-2-pass
This guide will introduce you to the basic concepts of Pose ControlNet, and demonstrate how to generate large-sized images in ComfyUI using a two-pass generation approach
## Introduction to OpenPose
[OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) is an open-source real-time multi-person pose estimation system developed by Carnegie Mellon University (CMU), representing a significant breakthrough in the field of computer vision. The system can simultaneously detect multiple people in an image, capturing:
* **Body skeleton**: 18 keypoints, including head, shoulders, elbows, wrists, hips, knees, and ankles
* **Facial expressions**: 70 facial keypoints for capturing micro-expressions and facial contours
* **Hand details**: 21 hand keypoints for precisely expressing finger positions and gestures
* **Foot posture**: 6 foot keypoints, recording standing postures and movement details
In AI image generation, skeleton structure maps generated by OpenPose serve as conditional inputs for ControlNet, enabling precise control over the posture, actions, and expressions of generated characters. This allows us to generate realistic human figures with expected poses and actions, greatly improving the controllability and practical value of AI-generated content.
Particularly for early Stable Diffusion 1.5 series models, skeletal maps generated by OpenPose can effectively prevent issues with distorted character actions, limbs, and expressions.
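As a rough illustration of the kind of data a pose map encodes, each keypoint can be thought of as an (x, y, confidence) triple. The dictionary below is hypothetical and does not match OpenPose's actual output format:

```python
# Hypothetical representation: normalized (x, y) coordinates plus a
# detection confidence for a few of the 18 body keypoints.
person = {
    "nose": (0.52, 0.18, 0.97),
    "neck": (0.51, 0.30, 0.95),
    "right_shoulder": (0.42, 0.31, 0.93),
    "right_elbow": (0.40, 0.45, 0.88),
    "left_wrist": (0.00, 0.00, 0.12),  # occluded: very low confidence
}
# Downstream tools usually drop low-confidence detections before drawing bones:
visible = {name: kp for name, kp in person.items() if kp[2] > 0.5}
```

The skeleton image that ControlNet consumes is simply these keypoints rendered as colored bones on a black canvas.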
## ComfyUI 2-Pass Pose ControlNet Usage Example
### 1. Pose ControlNet Workflow Assets
Please download the workflow image below and drag it into ComfyUI to load the workflow:

Images with workflow JSON in their metadata can be directly dragged into ComfyUI or loaded using the menu `Workflows` -> `Open (ctrl+o)`.
This image already includes download links for the corresponding models, and dragging it into ComfyUI will automatically prompt for downloads.
Please download the image below, which we will use as input:

### 2. Manual Model Installation
If your network cannot successfully complete the automatic download of the corresponding models, please try manually downloading the models below and placing them in the specified directories:
* [control\_v11p\_sd15\_openpose\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_openpose_fp16.safetensors?download=true)
* [majicmixRealistic\_v7.safetensors](https://civitai.com/api/download/models/176425?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16)
* [japaneseStyleRealistic\_v20.safetensors](https://civitai.com/api/download/models/85426?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16)
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ ├── majicmixRealistic_v7.safetensors
│ │ └── japaneseStyleRealistic_v20.safetensors
│ ├── vae/
│ │ └── vae-ft-mse-840000-ema-pruned.safetensors
│ └── controlnet/
│ └── control_v11p_sd15_openpose_fp16.safetensors
```
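If you prefer to script the manual download, here is a minimal sketch using only Python's standard library. The URLs and destination folders are the ones listed above; `COMFYUI_ROOT` is an assumption you should adjust to your installation (the Civitai checkpoints may require a logged-in browser or API token, so they are left out):

```python
import os
import urllib.request

# Assumption: ComfyUI lives in ./ComfyUI; change this for your setup.
COMFYUI_ROOT = "ComfyUI"

# URL -> relative destination path, taken from the list above.
MODELS = {
    "https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_openpose_fp16.safetensors?download=true":
        "models/controlnet/control_v11p_sd15_openpose_fp16.safetensors",
    "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true":
        "models/vae/vae-ft-mse-840000-ema-pruned.safetensors",
}

def target_path(rel_path: str) -> str:
    """Map a relative model path to its location under the ComfyUI root."""
    return os.path.join(COMFYUI_ROOT, *rel_path.split("/"))

def download_all() -> None:
    """Fetch each model that is not already present."""
    for url, rel in MODELS.items():
        dest = target_path(rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        if not os.path.exists(dest):
            urllib.request.urlretrieve(url, dest)

# Call download_all() to start the downloads.
```
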
### 3. Step-by-Step Workflow Execution
Follow these steps according to the numbered markers in the image:
1. Ensure that `Load Checkpoint` can load **majicmixRealistic\_v7.safetensors**
2. Ensure that `Load VAE` can load **vae-ft-mse-840000-ema-pruned.safetensors**
3. Ensure that `Load ControlNet Model` can load **control\_v11p\_sd15\_openpose\_fp16.safetensors**
4. Click the select button in the `Load Image` node to upload the pose input image provided earlier, or use your own OpenPose skeleton map
5. Ensure that `Load Checkpoint` can load **japaneseStyleRealistic\_v20.safetensors**
6. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
## Explanation of the Pose ControlNet 2-Pass Workflow
This workflow uses a two-pass image generation approach, dividing the image creation process into two phases:
### First Phase: Basic Pose Image Generation
In the first phase, the **majicmixRealistic\_v7** model is combined with Pose ControlNet to generate an initial character pose image:
1. First, load the majicmixRealistic\_v7 model via the `Load Checkpoint` node
2. Load the pose control model through the `Load ControlNet Model` node
3. The input pose image is fed into the `Apply ControlNet` node and combined with positive and negative prompt conditions
4. The first `KSampler` node (typically using 20-30 steps) generates a basic character pose image
5. The pixel-space image for the first phase is obtained through `VAE Decode`
This phase primarily focuses on correct character posture, pose, and basic structure, ensuring that the generated character conforms to the input skeletal pose.
### Second Phase: Style Optimization and Detail Enhancement
In the second phase, the output image from the first phase is used as a reference, with the **japaneseStyleRealistic\_v20** model performing stylization and detail enhancement:
1. The output of the first phase is upscaled to a higher-resolution latent through the `Upscale latent` node
2. The second `Load Checkpoint` loads the japaneseStyleRealistic\_v20 model, which focuses on details and style
3. The second `KSampler` node uses a lower `denoise` strength (typically 0.4-0.6) for refinement, preserving the basic structure from the first phase
4. Finally, a higher quality, larger resolution image is output through the second `VAE Decode` and `Save Image` nodes
This phase primarily focuses on style consistency, detail richness, and enhancing overall image quality.
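The `denoise` setting on the second `KSampler` controls how much of the noise schedule is re-run: with `steps` total steps, roughly `steps * denoise` steps are actually executed, which is why a low denoise preserves the first phase's structure. A rough sketch of that arithmetic (an approximation of sampler behavior, not ComfyUI's exact scheduler code):

```python
def effective_steps(steps: int, denoise: float) -> int:
    """Approximate number of sampling steps actually run for a given denoise."""
    return round(steps * denoise)

# With 20 steps and denoise 0.5, only about 10 steps of noise are added
# and removed, so most of the first pass's structure survives.
print(effective_steps(20, 0.5))  # 10
print(effective_steps(20, 1.0))  # 20 (full re-generation from noise)
```
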
## Advantages of 2-Pass Image Generation
Compared to single-pass generation, the two-pass image generation method offers the following advantages:
1. **Higher Resolution**: Two-pass processing can generate high-resolution images beyond the capabilities of single-pass generation
2. **Style Blending**: Can combine advantages of different models, such as using a realistic model in the first phase and a stylized model in the second phase
3. **Better Details**: The second phase can focus on optimizing details without having to worry about overall structure
4. **Precise Control**: Once pose control is completed in the first phase, the second phase can focus on refining style and details
5. **Reduced GPU Load**: Generating in two passes allows for high-quality large images with limited GPU resources
To learn more about techniques for mixing multiple ControlNets, please refer to the [Mixing ControlNet Models](/tutorials/controlnet/mixing-controlnets) tutorial.
# ComfyUI Flux.1 ControlNet Examples
Source: https://docs.comfy.org/tutorials/flux/flux-1-controlnet
This guide will demonstrate workflow examples using Flux.1 ControlNet.
## FLUX.1 ControlNet Model Introduction
FLUX.1 Canny and Depth are two powerful models from the [FLUX.1 Tools](https://blackforestlabs.ai/flux-1-tools/) launched by [Black Forest Labs](https://blackforestlabs.ai/). This toolkit is designed to add control and guidance capabilities to FLUX.1, enabling users to modify and recreate real or generated images.
**FLUX.1-Depth-dev** and **FLUX.1-Canny-dev** are both 12B parameter Rectified Flow Transformer models that can generate images based on text descriptions while maintaining the structural features of the input image.
The Depth version maintains the spatial structure of the source image through depth map extraction techniques, while the Canny version uses edge detection techniques to preserve the structural features of the source image, allowing users to choose the appropriate control method based on different needs.
Both models have the following features:
* Top-tier output quality and detail representation
* Excellent prompt following ability while maintaining consistency with the original image
* Trained using guided distillation techniques for improved efficiency
* Open weights for the research community
* API interfaces (pro version) and open-source weights (dev version)
Additionally, Black Forest Labs also provides **FLUX.1-Depth-dev-lora** and **FLUX.1-Canny-dev-lora** adapter versions extracted from the complete models.
These can be applied to the FLUX.1 \[dev] base model to provide similar functionality with smaller file size, especially suitable for resource-constrained environments.
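The size advantage of the LoRA adapters comes from low-rank factorization: instead of storing a full `d_out × d_in` weight update, a LoRA stores two thin matrices of rank `r`. The arithmetic below is purely illustrative (the dimensions are made up, not FLUX's actual layer shapes):

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameter count of a dense weight delta."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameter count of a LoRA: B (d_out x r) plus A (r x d_in)."""
    return d_out * rank + rank * d_in

d = 4096  # hypothetical hidden size
r = 16    # hypothetical LoRA rank
print(full_params(d, d))     # 16777216
print(lora_params(d, d, r))  # 131072 (~128x smaller)
```
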
We will use the full version of **FLUX.1-Canny-dev** and **FLUX.1-Depth-dev-lora** to complete the workflow examples.
The metadata of each workflow image contains the corresponding model download information. You can load the workflows by:
* Dragging them directly into ComfyUI
* Or using the menu `Workflows` -> `Open(ctrl+o)`
If you're not using the Desktop Version or some models can't be downloaded automatically, please refer to the manual installation sections to save the model files to the corresponding folder.
For image preprocessors, you can use the following custom nodes to complete image preprocessing. In this example, we will provide processed images as input.
* [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet)
* [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
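These preprocessor nodes produce control images such as Canny edge maps. As a rough illustration of what an edge map is, here is a toy gradient-magnitude detector in pure Python (a real Canny pipeline, as implemented by the nodes above, adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding):

```python
def edge_map(gray, threshold=1.0):
    """Toy edge detector over a 2D list of pixel intensities."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradients in x and y.
            gx = gray[y][x + 1] - gray[y][x - 1]
            gy = gray[y + 1][x] - gray[y - 1][x]
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges

# A vertical boundary between a dark (0) and bright (9) region:
img = [[0, 0, 9, 9]] * 4
print(edge_map(img))
```
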
## FLUX.1-Canny-dev Complete Version Workflow
### 1. Workflow and Asset
Please download the workflow image below and drag it into ComfyUI to load the workflow

Please download the image below, which we will use as the input image

### 2. Manual Models Installation
If you have previously used the [complete version of Flux related workflows](/tutorials/flux/flux-1-text-to-image), then you only need to download the **flux1-canny-dev.safetensors** model file.
Downloading this model requires agreeing to the repository's terms, so please visit the [black-forest-labs/FLUX.1-Canny-dev](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev) page and make sure you have agreed to them as shown in the image below.
Complete model list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-canny-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev/resolve/main/flux1-canny-dev.safetensors?download=true) (Please ensure you have agreed to the corresponding repo's terms)
File storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ └── diffusion_models/
│ └── flux1-canny-dev.safetensors
```
### 3. Step-by-Step Workflow Execution
1. Make sure `ae.safetensors` is loaded in the `Load VAE` node
2. Make sure `flux1-canny-dev.safetensors` is loaded in the `Load Diffusion Model` node
3. Make sure the following models are loaded in the `DualCLIPLoader` node:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
4. Upload the provided input image in the `Load Image` node
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
### 4. Start Your Experimentation
Try using the [FLUX.1-Depth-dev](https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev) model to complete the Depth version of the workflow
You can use the image below as input

Or use the following custom nodes to complete image preprocessing:
* [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet)
* [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
## FLUX.1-Depth-dev-lora Workflow
The LoRA version workflow builds on the complete version by adding the LoRA model. Compared to the [complete version of the Flux workflow](/tutorials/flux/flux-1-text-to-image), it adds nodes for loading and using the corresponding LoRA model.
### 1. Workflow and Asset
Please download the workflow image below and drag it into ComfyUI to load the workflow

Please download the image below, which we will use as the input image

### 2. Manual Model Download
If you have previously used the [complete version of Flux related workflows](/tutorials/flux/flux-1-text-to-image), then you only need to download the **flux1-depth-dev-lora.safetensors** model file.
Complete model list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors?download=true)
* [flux1-depth-dev-lora.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev-lora/resolve/main/flux1-depth-dev-lora.safetensors?download=true)
File storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ ├── diffusion_models/
│ │ └── flux1-dev.safetensors
│ └── loras/
│ └── flux1-depth-dev-lora.safetensors
```
### 3. Step-by-Step Workflow Execution
1. Make sure `flux1-dev.safetensors` is loaded in the `Load Diffusion Model` node
2. Make sure `flux1-depth-dev-lora.safetensors` is loaded in the `LoraLoaderModelOnly` node
3. Make sure the following models are loaded in the `DualCLIPLoader` node:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
4. Upload the provided input image in the `Load Image` node
5. Make sure `ae.safetensors` is loaded in the `Load VAE` node
6. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
### 4. Start Your Experimentation
Try using the [FLUX.1-Canny-dev-lora](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev-lora) model to complete the Canny version of the workflow
Use [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet) or [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux) to complete image preprocessing
## Community Versions of Flux ControlNets
XLab and InstantX + Shakker Labs have released ControlNets for Flux.
**InstantX + Shakker Labs:**
* [FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny/blob/main/diffusion_pytorch_model.safetensors)
* [FLUX.1-dev-ControlNet-Depth](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Depth/blob/main/diffusion_pytorch_model.safetensors)
* [FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro/blob/main/diffusion_pytorch_model.safetensors)
**XLab**: [flux-controlnet-collections](https://huggingface.co/XLabs-AI/flux-controlnet-collections)
Place these files in the `ComfyUI/models/controlnet` directory.
You can visit [Flux Controlnet Example](https://raw.githubusercontent.com/comfyanonymous/ComfyUI_examples/refs/heads/master/flux/flux_controlnet_example.png) to get the corresponding workflow image, and use the image from [here](https://raw.githubusercontent.com/comfyanonymous/ComfyUI_examples/refs/heads/master/flux/girl_in_field.png) as the input image.
# ComfyUI Flux.1 fill dev Example
Source: https://docs.comfy.org/tutorials/flux/flux-1-fill-dev
This guide demonstrates how to use Flux.1 fill dev to create Inpainting and Outpainting workflows.
## Introduction to Flux.1 fill dev Model
Flux.1 fill dev is one of the core tools in the [FLUX.1 Tools suite](https://blackforestlabs.ai/flux-1-tools/) launched by [Black Forest Labs](https://blackforestlabs.ai/), specifically designed for image inpainting and outpainting.
Key features of Flux.1 fill dev:
* Powerful image inpainting and outpainting capabilities, with results second only to the commercial version FLUX.1 Fill \[pro].
* Excellent prompt understanding and following ability, precisely capturing user intent while maintaining high consistency with the original image.
* Advanced guided distillation training technology, making the model more efficient while maintaining high-quality output.
* Friendly licensing terms: generated outputs can be used for personal, scientific, and commercial purposes; see the [FLUX.1 \[dev\] non-commercial license](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md) for details.
Open Source Repository: [FLUX.1 \[dev\]](https://huggingface.co/black-forest-labs/FLUX.1-dev)
This guide will demonstrate inpainting and outpainting workflows based on the Flux.1 fill dev model.
If you're not familiar with inpainting and outpainting workflows, you can refer to [ComfyUI Layout Inpainting Example](/tutorials/basic/inpaint) and [ComfyUI Image Extension Example](/tutorials/basic/outpaint) for some related explanations.
## Flux.1 Fill dev and related models installation
Before we begin, let's complete the installation of the Flux.1 Fill dev model files. The inpainting and outpainting workflows will use exactly the same model files.
If you've previously used the full version of the [Flux.1 Text-to-Image workflow](/tutorials/flux/flux-1-text-to-image),
then you only need to download the **flux1-fill-dev.safetensors** model file in this section.
However, since downloading the corresponding model requires agreeing to the corresponding usage agreement, please visit the [black-forest-labs/FLUX.1-Fill-dev](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev) page and make sure you have agreed to the corresponding agreement as shown in the image below.
Complete model list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-fill-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev/resolve/main/flux1-fill-dev.safetensors?download=true)
File storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ └── diffusion_models/
│ └── flux1-fill-dev.safetensors
```
## Flux.1 Fill dev inpainting workflow
### 1. Inpainting workflow and asset
Please download the image below and drag it into ComfyUI to load the corresponding workflow

Please download the image below, we will use it as the input image

The corresponding image already contains an alpha channel, so you don't need to draw a mask separately.
If you want to draw your own mask, please [click here](https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/flux/inpaint/flux_fill_inpaint_input_original.png) to get the image without a mask, and refer to the MaskEditor usage section in the [ComfyUI Layout Inpainting Example](/tutorials/basic/inpaint#using-the-mask-editor) to learn how to draw a mask in the `Load Image` node.
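Conceptually, ComfyUI derives the inpainting mask from the alpha channel (the mask is taken as `1 - alpha`, so transparent pixels are the ones repainted). The toy sketch below treats any transparency as fully masked, which is a simplification of that soft mask; real images go through the `Load Image` node:

```python
def mask_from_alpha(rgba_pixels):
    """Return 1 where a pixel should be repainted (any transparency), else 0."""
    return [[1 if a < 255 else 0 for (_r, _g, _b, a) in row]
            for row in rgba_pixels]

# One row: opaque red, fully transparent green, half-transparent blue.
row = [(255, 0, 0, 255), (0, 255, 0, 0), (0, 0, 255, 128)]
print(mask_from_alpha([row]))  # [[0, 1, 1]]
```
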
### 2. Steps to run the workflow
1. Ensure the `Load Diffusion Model` node has `flux1-fill-dev.safetensors` loaded.
2. Ensure the `DualCLIPLoader` node has the following models loaded:
* clip\_name1: `t5xxl_fp16.safetensors`
* clip\_name2: `clip_l.safetensors`
3. Ensure the `Load VAE` node has `ae.safetensors` loaded.
4. Upload the input image provided in the document to the `Load Image` node; if you're using the version without a mask, remember to complete the mask drawing using the mask editor
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Flux.1 Fill dev Outpainting Workflow
### 1. Outpainting workflow and asset
Please download the image below and drag it into ComfyUI to load the corresponding workflow

Please download the image below, we will use it as the input image

### 2. Steps to run the workflow
1. Ensure the `Load Diffusion Model` node has `flux1-fill-dev.safetensors` loaded.
2. Ensure the `DualCLIPLoader` node has the following models loaded:
* clip\_name1: `t5xxl_fp16.safetensors`
* clip\_name2: `clip_l.safetensors`
3. Ensure the `Load VAE` node has `ae.safetensors` loaded.
4. Upload the input image provided in the document to the `Load Image` node
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
# ComfyUI Flux Kontext Dev Native Workflow Example
Source: https://docs.comfy.org/tutorials/flux/flux-1-kontext-dev
ComfyUI Flux Kontext Dev Native Workflow Example.
VIDEO
## About FLUX.1 Kontext Dev
FLUX.1 Kontext is a breakthrough multimodal image editing model from Black Forest Labs that supports simultaneous text and image input, intelligently understanding image context and performing precise editing. Its development version is an open-source diffusion transformer model with 12 billion parameters, featuring excellent context understanding and character consistency maintenance, ensuring that key elements such as character features and composition layout remain stable even after multiple iterative edits.
It shares the same core capabilities as the FLUX.1 Kontext suite:
* Character Consistency: Preserves unique elements in images across multiple scenes and environments, such as reference characters or objects in the image.
* Editing: Makes targeted modifications to specific elements in the image without affecting other parts.
* Style Reference: Generates novel scenes while preserving the unique style of the reference image according to text prompts.
* Interactive Speed: Minimal latency in image generation and editing.
While the previously released API version offers the highest fidelity and speed, FLUX.1 Kontext \[Dev] runs entirely on local machines, providing unparalleled flexibility for developers, researchers, and advanced users who wish to experiment.
### Version Information
* **FLUX.1 Kontext \[pro]** - Commercial version, focused on rapid iterative editing
* **FLUX.1 Kontext \[max]** - Experimental version with stronger prompt adherence
* **FLUX.1 Kontext \[dev]** - Open source version (used in this tutorial), 12B parameters, mainly for research
All of these versions are currently usable in ComfyUI: the [Pro and Max versions](/tutorials/api-nodes/black-forest-labs/flux-1-kontext) can be called through API nodes, while this guide covers the open-source Dev version.
## Workflow Description
In this tutorial, we cover two types of workflows, which are essentially the same:
* A workflow using the **FLUX.1 Kontext Image Edit** group node, making the interface and workflow reuse simpler
* Another workflow without using group nodes, showing the complete original workflow.
The main advantage of group nodes is conciseness: you can reuse them to build complex workflows and quickly drop in whole node groups. Additionally, in the new version of the frontend, we've added a quick group-node addition feature for Flux.1 Kontext Dev:
This feature is currently experimental and may be adjusted in future versions.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not on the latest ComfyUI version (Nightly)
2. You are on the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Model Download
To run the workflows in this guide successfully, you first need to download the following model files. You can also directly get the model download links from the corresponding workflows, which already contain the model file download information.
**Diffusion Model**
* [flux1-dev-kontext\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/flux1-kontext-dev_ComfyUI/resolve/main/split_files/diffusion_models/flux1-dev-kontext_fp8_scaled.safetensors)
If you want to use the original weights, you can visit Black Forest Labs' related repository to obtain and use the original model weights.
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/blob/main/split_files/vae/ae.safetensors)
**Text Encoder**
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors) or [t5xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn_scaled.safetensors)
Model save location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── flux1-dev-kontext_fp8_scaled.safetensors
│ ├── 📂 vae/
│ │ └── ae.safetensors
│ └── 📂 text_encoders/
│ ├── clip_l.safetensors
│ └── t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors
```
## Flux.1 Kontext Dev Workflow
This is a standard workflow, except that it loads the image to be edited with the `Load Image(from output)` node, which makes it easier to pick up the edited result for further rounds of editing.
### 1. Workflow and Input Image Download
Download the following files and drag them into ComfyUI to load the corresponding workflow

**Input Image**

### 2. Complete the workflow step by step
You can refer to the numbers in the image to complete the workflow run:
1. In the `Load Diffusion Model` node, load the `flux1-dev-kontext_fp8_scaled.safetensors` model
2. In the `DualCLIPLoader` node, ensure that `clip_l.safetensors` and either `t5xxl_fp16.safetensors` or `t5xxl_fp8_e4m3fn_scaled.safetensors` are loaded
3. In the `Load VAE` node, ensure that `ae.safetensors` model is loaded
4. In the `Load Image(from output)` node, load the provided input image
5. In the `CLIP Text Encode` node, modify the prompt (only English is supported)
6. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Flux Kontext Prompt Techniques
### 1. Basic Modifications
* Simple and direct: `"Change the car color to red"`
* Maintain style: `"Change to daytime while maintaining the same style of the painting"`
### 2. Style Transfer
**Principles:**
* Clearly name style: `"Transform to Bauhaus art style"`
* Describe characteristics: `"Transform to oil painting with visible brushstrokes, thick paint texture"`
* Preserve composition: `"Change to Bauhaus style while maintaining the original composition"`
### 3. Character Consistency
**Framework:**
* Specific description: `"The woman with short black hair"` instead of "she"
* Preserve features: `"while maintaining the same facial features, hairstyle, and expression"`
* Step-by-step modifications: Change background first, then actions
### 4. Text Editing
* Use quotes: `"Replace 'joy' with 'BFL'"`
* Maintain format: `"Replace text while maintaining the same font style"`
## Common Problem Solutions
### Character Changes Too Much
❌ Wrong: `"Transform the person into a Viking"`
✅ Correct: `"Change the clothes to be a viking warrior while preserving facial features"`
### Composition Position Changes
❌ Wrong: `"Put him on a beach"`
✅ Correct: `"Change the background to a beach while keeping the person in the exact same position, scale, and pose"`
### Style Application Inaccuracy
❌ Wrong: `"Make it a sketch"`
✅ Correct: `"Convert to pencil sketch with natural graphite lines, cross-hatching, and visible paper texture"`
## Core Principles
1. **Be Specific and Clear** - Use precise descriptions, avoid vague terms
2. **Step-by-step Editing** - Break complex modifications into multiple simple steps
3. **Explicit Preservation** - State what should remain unchanged
4. **Verb Selection** - Use "change", "replace" rather than "transform"
## Best Practice Templates
**Object Modification:**
`"Change [object] to [new state], keep [content to preserve] unchanged"`
**Style Transfer:**
`"Transform to [specific style], while maintaining [composition/character/other] unchanged"`
**Background Replacement:**
`"Change the background to [new background], keep the subject in the exact same position and pose"`
**Text Editing:**
`"Replace '[original text]' with '[new text]', maintain the same font style"`
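If you generate prompts programmatically, the templates above are easy to wrap in a small helper. This is hypothetical convenience code for building prompt strings, not a ComfyUI API:

```python
# Best-practice templates from the sections above.
TEMPLATES = {
    "object": "Change {object} to {new_state}, keep {preserve} unchanged",
    "style": "Transform to {style}, while maintaining {preserve} unchanged",
    "background": ("Change the background to {background}, keep the subject "
                   "in the exact same position and pose"),
    "text": "Replace '{old}' with '{new}', maintain the same font style",
}

def kontext_prompt(kind: str, **fields) -> str:
    """Fill one of the templates with concrete values."""
    return TEMPLATES[kind].format(**fields)

print(kontext_prompt("object", object="the car", new_state="red",
                     preserve="the background"))
# Change the car to red, keep the background unchanged
```
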
> **Remember:** The more specific, the better. Kontext excels at understanding detailed instructions and maintaining consistency.
# ComfyUI Flux.1 Text-to-Image Workflow Example
Source: https://docs.comfy.org/tutorials/flux/flux-1-text-to-image
This guide provides a brief introduction to the Flux.1 model and guides you through using the Flux.1 model for text-to-image generation with examples including the full version and the FP8 Checkpoint version.
Flux is one of the largest open-source text-to-image generation models, with 12B parameters and an original file size of approximately 23GB. It was developed by [Black Forest Labs](https://blackforestlabs.ai/), a team founded by former Stable Diffusion team members.
Flux is known for its excellent image quality and flexibility, capable of generating high-quality, diverse images.
Currently, the Flux.1 model has several main versions:
* **Flux.1 Pro:** The best performing model, closed-source, only available through API calls.
* **[Flux.1 \[dev\]:](https://huggingface.co/black-forest-labs/FLUX.1-dev)** Open-source but limited to non-commercial use, distilled from the Pro version, with performance close to the Pro version.
* **[Flux.1 \[schnell\]:](https://huggingface.co/black-forest-labs/FLUX.1-schnell)** Uses the Apache 2.0 license, requires only 4 steps to generate images, suitable for low-spec hardware.
**Flux.1 Model Features**
* **Hybrid Architecture:** Combines the advantages of Transformer networks and diffusion models, effectively integrating text and image information, improving the alignment accuracy between generated images and prompts, with excellent fidelity to complex prompts.
* **Parameter Scale:** Flux has 12B parameters, capturing more complex pattern relationships and generating more realistic, diverse images.
* **Supports Multiple Styles:** Supports diverse styles, with excellent performance for various types of images.
In this example, we'll introduce text-to-image examples using both Flux.1 Dev and Flux.1 Schnell versions, including the full version model and the simplified FP8 Checkpoint version.
* **Flux Full Version:** Best performance, but requires larger VRAM resources and installation of multiple model files.
* **Flux FP8 Checkpoint:** Requires only one fp8 version of the model, but quality is slightly reduced compared to the full version.
The metadata of each workflow image contains the corresponding model download information. You can load the workflows by:
* Dragging them directly into ComfyUI
* Or using the menu `Workflows` -> `Open(ctrl+o)`
If you're not using the Desktop Version or some models can't be downloaded automatically, please refer to the manual installation sections to save the model files to the corresponding folder.
Make sure your ComfyUI is updated to the latest version before starting.
## Flux.1 Full Version Text-to-Image Example
If you can't download models from [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev), make sure you've logged into Huggingface and agreed to the corresponding repository's license agreement.
### Flux.1 Dev
#### 1. Workflow File
Please download the image below and drag it into ComfyUI to load the workflow.

#### 2. Manual Model Installation
* The `flux1-dev.safetensors` file requires agreeing to the [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) agreement before downloading via browser.
* If your VRAM is low, you can try using [t5xxl\_fp8\_e4m3fn.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors?download=true) to replace the `t5xxl_fp16.safetensors` file.
Please download the following model files:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true) Recommended when your VRAM is greater than 32GB.
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors)
Storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ └── diffusion_models/
│ └── flux1-dev.safetensors
```
#### 3. Steps to Run the Workflow
Please refer to the image below to ensure all model files are loaded correctly
1. Ensure the `DualCLIPLoader` node has the following models loaded:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
2. Ensure the `Load Diffusion Model` node has `flux1-dev.safetensors` loaded
3. Make sure the `Load VAE` node has `ae.safetensors` loaded
4. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
Thanks to Flux's excellent prompt following capability, we don't need any negative prompts.
### Flux.1 Schnell
#### 1. Workflow File
Please download the image below and drag it into ComfyUI to load the workflow.

#### 2. Manual Models Installation
In this workflow, only two model files differ from the Flux.1 Dev workflow. For t5xxl, you can still use the fp16 version for better results.
* **t5xxl\_fp16.safetensors** -> **t5xxl\_fp8\_e4m3fn.safetensors**
* **flux1-dev.safetensors** -> **flux1-schnell.safetensors**
Complete model file list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp8\_e4m3fn.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-schnell.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors)
File storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp8_e4m3fn.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ └── diffusion_models/
│ └── flux1-schnell.safetensors
```
#### 3. Steps to Run the Workflow
1. Ensure the `DualCLIPLoader` node has the following models loaded:
* clip\_name1: t5xxl\_fp8\_e4m3fn.safetensors
* clip\_name2: clip\_l.safetensors
2. Ensure the `Load Diffusion Model` node has `flux1-schnell.safetensors` loaded
3. Ensure the `Load VAE` node has `ae.safetensors` loaded
4. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Flux.1 FP8 Checkpoint Version Text-to-Image Example
The fp8 version is a quantized version of the original Flux.1 fp16 model.
Its quality is somewhat lower than the fp16 version,
but it requires less VRAM, and you only need to install a single model file to try it.
### Flux.1 Dev
Please download the image below and drag it into ComfyUI to load the workflow.

Please download [flux1-dev-fp8.safetensors](https://huggingface.co/Comfy-Org/flux1-dev/resolve/main/flux1-dev-fp8.safetensors?download=true) and save it to the `ComfyUI/models/checkpoints/` directory.
Ensure that the corresponding `Load Checkpoint` node loads `flux1-dev-fp8.safetensors`, and you can try to run the workflow.
### Flux.1 Schnell
Please download the image below and drag it into ComfyUI to load the workflow.

Please download [flux1-schnell-fp8.safetensors](https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell-fp8.safetensors?download=true) and save it to the `ComfyUI/models/checkpoints/` directory.
Ensure that the corresponding `Load Checkpoint` node loads `flux1-schnell-fp8.safetensors`, and you can try to run the workflow.
# Flux.1 Krea Dev ComfyUI Workflow Tutorial
Source: https://docs.comfy.org/tutorials/flux/flux1-krea-dev
The best open-source FLUX model, developed by Black Forest Labs in collaboration with Krea, focuses on a unique aesthetic style and natural details, avoiding the typical "AI look" while providing exceptional realism and image quality.
[Flux.1 Krea Dev](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev) is an advanced text-to-image generation model developed in collaboration between Black Forest Labs (BFL) and Krea. This is currently the best open-source FLUX model, specifically designed for text-to-image generation.
**Model Features**
* **Unique Aesthetic Style**: Focuses on generating images with unique aesthetics, avoiding common "AI look" appearance
* **Natural Details**: Does not produce blown-out highlights, maintaining natural detail representation
* **Exceptional Realism**: Provides outstanding realism and image quality
* **Fully Compatible Architecture**: Fully compatible architecture design with FLUX.1 \[dev]
**Model License**
This model is released under the [flux-1-dev-non-commercial-license](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/blob/main/LICENSE.md)
## Flux.1 Krea Dev ComfyUI Workflow
#### 1. Workflow Files
Download the image or JSON below and drag it into ComfyUI to load the corresponding workflow

Download JSON Workflow
#### 2. Manual Model Installation
Please download the following model files:
**Diffusion model**
* [flux1-krea-dev\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/FLUX.1-Krea-dev_ComfyUI/blob/main/split_files/diffusion_models/flux1-krea-dev_fp8_scaled.safetensors)
If you want to pursue higher quality and have enough VRAM, you can try the original model weights
* [flux1-krea-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/resolve/main/flux1-krea-dev.safetensors)
The `flux1-krea-dev.safetensors` file requires agreeing to the [black-forest-labs/FLUX.1-Krea-dev](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/) agreement before it can be downloaded via browser.
If you have used Flux-related workflows before, the following models are the same and don't need to be downloaded again:
**Text encoders**
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true) Recommended when your VRAM is greater than 32GB.
* [t5xxl\_fp8\_e4m3fn.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors) For Low VRAM
**VAE**
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
File save location:
```
ComfyUI/
├── models/
│ ├── diffusion_models/
│ │ └── flux1-krea-dev_fp8_scaled.safetensors or flux1-krea-dev.safetensors
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn.safetensors
│ └── vae/
│ │ └── ae.safetensors
```
#### 3. Step-by-step Verification to Ensure Workflow Runs Properly
If you have low VRAM, this model may not run smoothly on your device; you can wait for the community to provide FP8 or GGUF versions.
Please refer to the image below to ensure all model files have been loaded correctly
1. Ensure that `flux1-krea-dev_fp8_scaled.safetensors` or `flux1-krea-dev.safetensors` is loaded in the `Load Diffusion Model` node
* `flux1-krea-dev_fp8_scaled.safetensors` is recommended for low VRAM users
* `flux1-krea-dev.safetensors` contains the original weights; if you have enough VRAM (around 24GB), you can use it for better quality
2. Ensure the following models are loaded in the `DualCLIPLoader` node:
* clip\_name1: t5xxl\_fp16.safetensors or t5xxl\_fp8\_e4m3fn.safetensors
* clip\_name2: clip\_l.safetensors
3. Ensure that `ae.safetensors` is loaded in the `Load VAE` node
4. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
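The VRAM guidance above can be condensed into a small helper. The thresholds (24GB for the original weights, 32GB for the fp16 text encoder) are the document's rough recommendations, not hard limits, and the function itself is just an illustrative sketch.

```python
def pick_krea_models(vram_gb: float) -> dict:
    """Pick Flux.1 Krea Dev model files based on available VRAM,
    following the rough recommendations above."""
    diffusion = ("flux1-krea-dev.safetensors" if vram_gb >= 24
                 else "flux1-krea-dev_fp8_scaled.safetensors")
    t5 = ("t5xxl_fp16.safetensors" if vram_gb > 32
          else "t5xxl_fp8_e4m3fn.safetensors")
    return {
        "diffusion_model": diffusion,   # Load Diffusion Model node
        "clip_name1": t5,               # DualCLIPLoader, slot 1
        "clip_name2": "clip_l.safetensors",  # DualCLIPLoader, slot 2
        "vae": "ae.safetensors",        # Load VAE node
    }
```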
# Cosmos Predict2 Text-to-Image ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/image/cosmos/cosmos-predict2-t2i
This guide demonstrates how to complete the Cosmos-Predict2 text-to-image workflow in ComfyUI
Cosmos-Predict2 is NVIDIA's next-generation physical world foundation model, specifically designed for high-quality visual generation and prediction tasks in physical AI scenarios.
The model features exceptional physical accuracy, environmental interactivity, and detail reproduction capabilities, enabling realistic simulation of complex physical phenomena and dynamic scenes.
Cosmos-Predict2 supports various generation methods including Text-to-Image (Text2Image) and Video-to-World (Video2World), and is widely used in industrial simulation, autonomous driving, urban planning, scientific research, and other fields.
GitHub: [Cosmos-Predict2](https://github.com/nvidia-cosmos/cosmos-predict2)
Hugging Face: [Cosmos-Predict2](https://huggingface.co/collections/nvidia/cosmos-predict2-68028efc052239369a0f2959)
This guide will walk you through completing the **text-to-image** workflow in ComfyUI.
For the video generation section, please refer to the following part:
Using Cosmos-Predict2 for video generation
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. Not using the latest ComfyUI version (Nightly version)
2. Using Stable or Desktop version (Latest changes may not be included)
3. Some nodes failed to import at startup
{/*
## Cosmos Predict2 Video2World Workflow
Testing showed that the 2B version uses around 16GB of VRAM.
### 1. Workflow File
Please download the image below and drag it into ComfyUI to load the workflow. The workflow already has embedded model download links.

### 2. Manual Model Installation
If the model download wasn't successful, you can download the models manually in this section.
**Diffusion model**
- [cosmos_predict2_2B_t2i.safetensors](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged/resolve/main/cosmos_predict2_2B_t2i.safetensors)
For other weights, please visit [Cosmos_Predict2_repackaged](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged) to download
**Text encoder**
[oldt5_xxl_fp8_e4m3fn_scaled.safetensors](https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/resolve/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors)
**VAE**
[wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
File Storage Location
```
📂 ComfyUI/
├──📂 models/
│ ├── 📂 diffusion_models/
│ │ └─── cosmos_predict2_2B_t2i.safetensors
│ ├── 📂 text_encoders/
│ │ └─── oldt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └── 📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Complete Workflow Step by Step

Please follow the steps in the image to run the workflow:
1. Ensure the `Load Diffusion Model` node has loaded `cosmos_predict2_2B_t2i.safetensors`
2. Ensure the `Load CLIP` node has loaded `oldt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Set the image size in `EmptySD3LatentImage`
5. Modify the prompts in the `ClipTextEncode` node
6. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
7. Once generation is complete, the image will automatically save to the `ComfyUI/output/` directory. You can also preview it in the `save image` node. */}
# ComfyUI Native HiDream-E1, E1.1 Workflow Example
Source: https://docs.comfy.org/tutorials/image/hidream/hidream-e1
This guide will help you understand and complete the ComfyUI native HiDream-E1 and E1.1 image editing workflow examples

HiDream-E1 is an interactive image editing large model officially open-sourced by HiDream-ai, built based on HiDream-I1.
It allows you to edit images using natural language. The model is released under the [MIT License](https://github.com/HiDream-ai/HiDream-E1?tab=MIT-1-ov-file), supporting use in personal projects, scientific research, and commercial applications.
In combination with the previously released [hidream-i1](/tutorials/image/hidream/hidream-i1), it enables **creative capabilities from image generation to editing**.
| Name | Update Date | Inference Steps | Resolution | HuggingFace Repository |
| --------------- | ----------- | --------------- | --------------------- | ----------------------------------------------------------------------- |
| HiDream-E1-Full | 2025-4-28 | 28 | 768x768 | 🤗 [HiDream-E1-Full](https://huggingface.co/HiDream-ai/HiDream-E1-Full) |
| HiDream-E1.1 | 2025-7-16 | 28 | Dynamic (1 Megapixel) | 🤗 [HiDream-E1.1](https://huggingface.co/HiDream-ai/HiDream-E1-1) |
[HiDream E1 - Github](https://github.com/HiDream-ai/HiDream-E1)
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. Not using the latest ComfyUI version (Nightly version)
2. Using Stable or Desktop version (Latest changes may not be included)
3. Some nodes failed to import at startup
## HiDream E1 and E1.1 Workflow Related Models
All the models involved in this guide can be found [here](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/tree/main/split_files). Except for the Diffusion model, E1 and E1.1 use the same models.
The corresponding workflow files also include the relevant model information. You can choose to manually download and save the models, or follow the workflow prompts to download them after loading the workflow. It is recommended to use E1.1.
This model requires a large amount of VRAM to run. Please refer to the relevant sections for specific VRAM requirements.
**Diffusion Model**
You do not need to download both models. Since E1.1 is an iterative version based on E1, our tests show that its quality and performance are significantly improved compared to E1.
* [hidream\_e1\_1\_bf16.safetensors (Recommended)](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_e1_1_bf16.safetensors) 34.2GB
* [hidream\_e1\_full\_bf16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_e1_full_bf16.safetensors) 34.2GB
**Text Encoder**:
* [clip\_l\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_l_hidream.safetensors) 236.12MB
* [clip\_g\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_g_hidream.safetensors) 1.29GB
* [t5xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/t5xxl_fp8_e4m3fn_scaled.safetensors) 4.8GB
* [llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/llama_3.1_8b_instruct_fp8_scaled.safetensors) 8.46GB
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/vae/ae.safetensors) 319.77MB
> This is the VAE model for Flux. If you have used the Flux workflow before, you may have already downloaded this file.
Model Save Location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 text_encoders/
│ │ ├─── clip_l_hidream.safetensors
│ │ ├─── clip_g_hidream.safetensors
│ │ ├─── t5xxl_fp8_e4m3fn_scaled.safetensors
│ │ └─── llama_3.1_8b_instruct_fp8_scaled.safetensors
│ ├── 📂 vae/
│ │ └── ae.safetensors
│ └── 📂 diffusion_models/
│ ├── hidream_e1_1_bf16.safetensors
│ └── hidream_e1_full_bf16.safetensors
```
## HiDream E1.1 ComfyUI Native Workflow Example
E1.1 is an updated version released on July 16, 2025. This version supports dynamic 1-megapixel resolution, and the workflow uses the `Scale Image to Total Pixels` node to dynamically adjust the input image to 1 million pixels.
Here are the VRAM usage references during testing:
1. A100 40GB (VRAM usage 95%): First generation: 211s, second generation: 73s
2. 4090D 24GB (VRAM usage 98%)
* Full version: Out of memory
* FP8\_e4m3fn\_fast (VRAM usage 98%): first generation: 120s, second generation: 91s
### 1. HiDream E1.1 Workflow and Related Materials
Download the image below and drag it into ComfyUI to load the corresponding workflow and model information:

Download the image below as input:

### 2. Step-by-step Guide to Running the HiDream-e1 Workflow
Follow these steps to run the workflow:
1. Make sure the `Load Diffusion Model` node loads the `hidream_e1_1_bf16.safetensors` model.
2. Make sure the four corresponding text encoders in `QuadrupleCLIPLoader` are loaded correctly:
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. Make sure the `Load VAE` node uses the `ae.safetensors` file.
4. In the `Load Image` node, load the provided input or your desired image.
5. In the `Empty Text Encoder(Positive)` node, enter **the modifications you want to make to the image**.
6. In the `Empty Text Encoder(Negative)` node, enter **the content you do not want to appear in the image**.
7. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute image generation.
### 3. Additional Notes on the Workflow
* Since HiDream E1.1 supports dynamic input with a total of 1 million pixels, the workflow uses `Scale Image to Total Pixels` to process and convert all input images, which may cause the aspect ratio to differ from the original input image.
* When using the fp16 version of the model, in actual tests, the full version ran out of memory on both A100 40GB and 4090D 24GB, so the workflow is set by default to use `fp8_e4m3fn_fast` for inference.
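For intuition, the 1-megapixel rescaling described above can be sketched as follows. This is only an approximation of what ComfyUI's `Scale Image to Total Pixels` node computes; the node's exact rounding and snapping behavior may differ, which is why the output aspect ratio can deviate slightly from the input.

```python
import math

def scale_to_total_pixels(width: int, height: int, megapixels: float = 1.0) -> tuple[int, int]:
    """Scale (width, height) so the area is about `megapixels` million
    pixels while preserving the aspect ratio. Rounding to integers is
    what introduces the small aspect-ratio drift mentioned above."""
    target = megapixels * 1_000_000
    scale = math.sqrt(target / (width * height))
    return round(width * scale), round(height * scale)
```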
## HiDream E1 ComfyUI Native Workflow Example
E1 is a model released on April 28, 2025. This model only supports 768\*768 resolution.
For reference, this workflow takes about 500s for the first run and 370s for the second run with 28 sampling steps on Google Colab L4 with 22.5GB VRAM.
### 1. HiDream-e1 workflow
Please download the image below and drag it into ComfyUI. The workflow already contains model download information, and after loading, it will prompt you to download the corresponding models.

Download this image below as input:

### 2. Complete the HiDream-e1 Workflow Step by Step
Follow these steps to complete the workflow:
1. Make sure the `Load Diffusion Model` node has loaded the `hidream_e1_full_bf16.safetensors` model
2. Ensure that the four corresponding text encoders are correctly loaded in the `QuadrupleCLIPLoader`
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. Make sure the `Load VAE` node is using the `ae.safetensors` file
4. Load the input image we downloaded earlier in the `Load Image` node
5. (Important) Enter **the prompt for how you want to modify the image** in the `Empty Text Encoder(Positive)` node
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
### Additional Notes on ComfyUI HiDream-e1 Workflow
* You may need to modify the prompt multiple times or generate multiple times to get better results
* This model has difficulty maintaining consistency when changing image styles, so try to make your prompts as complete as possible
* Since the model only supports a resolution of 768\*768, actual testing at other dimensions produced poor or significantly degraded results
# ComfyUI Native HiDream-I1 Text-to-Image Workflow Example
Source: https://docs.comfy.org/tutorials/image/hidream/hidream-i1
This guide will walk you through completing a ComfyUI native HiDream-I1 text-to-image workflow example

HiDream-I1 is a text-to-image model officially open-sourced by HiDream-ai on April 7, 2025. The model has 17B parameters and is released under the [MIT license](https://github.com/HiDream-ai/HiDream-I1/blob/main/LICENSE), supporting personal projects, scientific research, and commercial use.
It currently performs excellently in multiple benchmark tests.
## Model Features
**Hybrid Architecture Design**
A combination of Diffusion Transformer (DiT) and Mixture of Experts (MoE) architecture:
* Based on Diffusion Transformer (DiT), with dual-stream MMDiT modules processing multimodal information and single-stream DiT modules optimizing global consistency.
* Dynamic routing mechanism flexibly allocates computing resources, enhancing complex scene processing capabilities and delivering excellent performance in color restoration, edge processing, and other details.
**Multimodal Text Encoder Integration**
Integrates four text encoders:
* OpenCLIP ViT-bigG, OpenAI CLIP ViT-L (visual semantic alignment)
* T5-XXL (long text parsing)
* Llama-3.1-8B-Instruct (instruction understanding)
This combination achieves SOTA performance in complex semantic parsing of colors, quantities, spatial relationships, etc., with Chinese prompt support significantly outperforming similar open-source models.
**Original Model Versions**
HiDream-ai provides three versions of the HiDream-I1 model to meet different needs. Below are the links to the original model repositories:
| Model Name | Description | Inference Steps | Repository Link |
| --------------- | -------------- | --------------- | ----------------------------------------------------------------------- |
| HiDream-I1-Full | Full version | 50 | [🤗 HiDream-I1-Full](https://huggingface.co/HiDream-ai/HiDream-I1-Full) |
| HiDream-I1-Dev | Distilled dev | 28 | [🤗 HiDream-I1-Dev](https://huggingface.co/HiDream-ai/HiDream-I1-Dev) |
| HiDream-I1-Fast | Distilled fast | 16 | [🤗 HiDream-I1-Fast](https://huggingface.co/HiDream-ai/HiDream-I1-Fast) |
## About This Workflow Example
In this example, we will use the repackaged version from ComfyOrg. You can find all the model files we'll use in this example in the [HiDream-I1\_ComfyUI](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/) repository.
Before starting, please update your ComfyUI version to ensure it's at least after this [commit](https://github.com/comfyanonymous/ComfyUI/commit/9ad792f92706e2179c58b2e5348164acafa69288) to make sure your ComfyUI has native support for HiDream
## HiDream-I1 Workflow
The model requirements for different ComfyUI native HiDream-I1 workflows are basically the same, with only the [diffusion models](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/tree/main/split_files/diffusion_models) files being different.
If you don't know which version to choose, please refer to the following suggestions:
* **HiDream-I1-Full** can generate the highest quality images
* **HiDream-I1-Dev** balances high-quality image generation with speed
* **HiDream-I1-Fast** can generate images in just 16 steps, suitable for scenarios requiring real-time iteration
For the **dev** and **fast** versions, negative prompts are not needed, so please set the `cfg` parameter to `1.0` during sampling. We have noted the corresponding parameter settings in the relevant workflows.
The full-precision weights of all three versions require a lot of VRAM: you may need more than 27GB to run them smoothly. In the corresponding workflow tutorials,
we will use the **fp8** version as the demonstration example so that most users can run it smoothly.
However, we will still provide download links for the different model versions in each example, and you can choose the appropriate file based on your available VRAM.
### Model Installation
The following model files are common files that we will use.
Please click on the corresponding links to download and save them according to the model file save location.
We will guide you to download the corresponding **diffusion models** in the corresponding workflows.
**text\_encoders**:
* [clip\_l\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_l_hidream.safetensors)
* [clip\_g\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_g_hidream.safetensors)
* [t5xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/t5xxl_fp8_e4m3fn_scaled.safetensors) This model has been used in many workflows; you may have already downloaded it.
* [llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/llama_3.1_8b_instruct_fp8_scaled.safetensors)
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/vae/ae.safetensors) This is Flux's VAE model; if you have used a Flux workflow before, you may have already downloaded this file.
**diffusion models**
We will guide you to download the corresponding model files in the corresponding workflows.
Model file save location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 text_encoders/
│ │ ├─── clip_l_hidream.safetensors
│ │ ├─── clip_g_hidream.safetensors
│ │ ├─── t5xxl_fp8_e4m3fn_scaled.safetensors
│ │ └─── llama_3.1_8b_instruct_fp8_scaled.safetensors
│ ├── 📂 vae/
│ │ └── ae.safetensors
│ └── 📂 diffusion_models/
│ └── ... # We will guide you to install in the corresponding version workflow
```
### HiDream-I1 Full Version Workflow
#### 1. Model File Download
Please select the appropriate version based on your hardware. Click the link and download the corresponding model file to save it to the `ComfyUI/models/diffusion_models/` folder.
* FP8 version: [hidream\_i1\_full\_fp8.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_full_fp8.safetensors?download=true) requires more than 16GB of VRAM
* Full version: [hidream\_i1\_full\_fp16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_full_fp16.safetensors?download=true) requires more than 27GB of VRAM
#### 2. Workflow File Download
Please download the image below and drag it into ComfyUI to load the corresponding workflow

#### 3. Complete the Workflow Step by Step
Complete the workflow execution step by step
1. Make sure the `Load Diffusion Model` node is using the `hidream_i1_full_fp8.safetensors` file
2. Make sure the four corresponding text encoders in `QuadrupleCLIPLoader` are loaded correctly
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. Make sure the `Load VAE` node is using the `ae.safetensors` file
4. For the **full** version, you need to set the `shift` parameter in `ModelSamplingSD3` to `3.0`
5. For the `KSampler` node, you need to configure the following settings
* Set `steps` to `50`
* Set `cfg` to `5.0`
* (Optional) Set `sampler` to `lcm`
* (Optional) Set `scheduler` to `normal`
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
### HiDream-I1 Dev Version Workflow
#### 1. Model File Download
Please select the appropriate version based on your hardware, click the link and download the corresponding model file to save to the `ComfyUI/models/diffusion_models/` folder.
* FP8 version: [hidream\_i1\_dev\_fp8.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_dev_fp8.safetensors?download=true) requires more than 16GB of VRAM
* Full version: [hidream\_i1\_dev\_bf16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_dev_bf16.safetensors?download=true) requires more than 27GB of VRAM
#### 2. Workflow File Download
Please download the image below and drag it into ComfyUI to load the corresponding workflow

#### 3. Complete the Workflow Step by Step
Complete the workflow execution step by step
1. Make sure the `Load Diffusion Model` node is using the `hidream_i1_dev_fp8.safetensors` file
2. Make sure the four corresponding text encoders in `QuadrupleCLIPLoader` are loaded correctly
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. Make sure the `Load VAE` node is using the `ae.safetensors` file
4. For the **dev** version, you need to set the `shift` parameter in `ModelSamplingSD3` to `6.0`
5. For the `KSampler` node, you need to configure the following settings
* Set `steps` to `28`
* (Important) Set `cfg` to `1.0`
* (Optional) Set `sampler` to `lcm`
* (Optional) Set `scheduler` to `normal`
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
### HiDream-I1 Fast Version Workflow
#### 1. Model File Download
Please select the appropriate version based on your hardware, click the link and download the corresponding model file to save to the `ComfyUI/models/diffusion_models/` folder.
* FP8 version: [hidream\_i1\_fast\_fp8.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_fast_fp8.safetensors?download=true) requires more than 16GB of VRAM
* Full version: [hidream\_i1\_fast\_bf16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_fast_bf16.safetensors?download=true) requires more than 27GB of VRAM
#### 2. Workflow File Download
Please download the image below and drag it into ComfyUI to load the corresponding workflow

#### 3. Complete the Workflow Step by Step
Complete the workflow execution step by step
1. Make sure the `Load Diffusion Model` node is using the `hidream_i1_fast_fp8.safetensors` file
2. Make sure the four corresponding text encoders in `QuadrupleCLIPLoader` are loaded correctly
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. Make sure the `Load VAE` node is using the `ae.safetensors` file
4. For the **fast** version, you need to set the `shift` parameter in `ModelSamplingSD3` to `3.0`
5. For the `KSampler` node, you need to configure the following settings
* Set `steps` to `16`
* (Important) Set `cfg` to `1.0`
* (Optional) Set `sampler` to `lcm`
* (Optional) Set `scheduler` to `normal`
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute the image generation
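The per-version sampler settings from the three workflows above can be summarized in one lookup table. Values are exactly as stated in the steps above (the `sampler`/`scheduler` choices were optional suggestions and are omitted here); the helper function is just an illustrative sketch.

```python
# KSampler / ModelSamplingSD3 settings per HiDream-I1 version,
# as listed in the workflow steps above. Note that the distilled
# dev and fast versions use cfg 1.0 (no negative prompt needed).
HIDREAM_I1_SETTINGS = {
    "full": {"steps": 50, "cfg": 5.0, "shift": 3.0},
    "dev":  {"steps": 28, "cfg": 1.0, "shift": 6.0},
    "fast": {"steps": 16, "cfg": 1.0, "shift": 3.0},
}

def sampler_settings(version: str) -> dict:
    """Look up the recommended settings for a HiDream-I1 version."""
    try:
        return HIDREAM_I1_SETTINGS[version]
    except KeyError:
        raise ValueError(f"unknown version: {version!r}") from None
```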
## Other Related Resources
### GGUF Version Models
* [HiDream-I1-Full-gguf](https://huggingface.co/city96/HiDream-I1-Full-gguf)
* [HiDream-I1-Dev-gguf](https://huggingface.co/city96/HiDream-I1-Dev-gguf)
You need to use the `Unet Loader (GGUF)` node from City96's [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) to replace the `Load Diffusion Model` node.
### NF4 Version Models
* [HiDream-I1-nf4](https://github.com/hykilpikonna/HiDream-I1-nf4)
* Use the [ComfyUI-HiDream-Sampler](https://github.com/SanDiegoDude/ComfyUI-HiDream-Sampler) node to run the NF4 version models.
# ComfyUI OmniGen2 Native Workflow Examples
Source: https://docs.comfy.org/tutorials/image/omnigen/omnigen2
ComfyUI OmniGen2 Native Workflow Examples - Unified text-to-image, image editing, and multi-image composition model.
## About OmniGen2
OmniGen2 is a powerful and efficient unified multimodal generation model with approximately **7B** total parameters (3B text model + 4B image generation model). Unlike OmniGen v1, OmniGen2 adopts an innovative dual-path Transformer architecture with completely independent text autoregressive model and image diffusion model, achieving parameter decoupling and specialized optimization.
### Model Highlights
* **Visual Understanding**: Inherits the powerful image content interpretation and analysis capabilities of the Qwen-VL-2.5 foundation model
* **Text-to-Image Generation**: Creates high-fidelity and aesthetically pleasing images from text prompts
* **Instruction-guided Image Editing**: Performs complex, instruction-based image modifications, achieving state-of-the-art performance among open-source models
* **Contextual Generation**: Versatile capabilities to process and flexibly combine diverse inputs (including people, reference objects, and scenes), producing novel and coherent visual outputs
### Technical Features
* **Dual-path Architecture**: Based on Qwen 2.5 VL (3B) text encoder + independent diffusion Transformer (4B)
* **Omni-RoPE Position Encoding**: Supports multi-image spatial positioning and identity distinction
* **Parameter Decoupling Design**: Avoids negative impact of text generation on image quality
* Support for complex text understanding and image understanding
* Controllable image generation and editing
* Excellent detail preservation capabilities
* Unified architecture supporting multiple image generation tasks
* Text generation capability: Can generate clear text content within images
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the template, your ComfyUI may be outdated (the Desktop version's updates may lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. Not using the latest ComfyUI version (Nightly version)
2. Using Stable or Desktop version (Latest changes may not be included)
3. Some nodes failed to import at startup
## OmniGen2 Model Download
Since this article involves different workflows, the corresponding model files and installation locations are as follows. The download information for model files is also included in the corresponding workflows:
**Diffusion Models**
* [omnigen2\_fp16.safetensors](https://huggingface.co/Comfy-Org/Omnigen2_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/omnigen2_fp16.safetensors)
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/resolve/main/split_files/vae/ae.safetensors)
**Text Encoders**
* [qwen\_2.5\_vl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Omnigen2_ComfyUI_repackaged/resolve/main/split_files/text_encoders/qwen_2.5_vl_fp16.safetensors)
File save location:
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── omnigen2_fp16.safetensors
│ ├── 📂 vae/
│ │ └── ae.safetensors
│ └── 📂 text_encoders/
│ └── qwen_2.5_vl_fp16.safetensors
```
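Before loading the workflows, you can sanity-check that each file landed in the right folder. A minimal sketch, assuming the default `ComfyUI` directory layout shown above (adjust the base path to your install):

```python
from pathlib import Path

# Expected model files, relative to the ComfyUI install directory
# (paths follow the layout shown above).
EXPECTED = [
    "models/diffusion_models/omnigen2_fp16.safetensors",
    "models/vae/ae.safetensors",
    "models/text_encoders/qwen_2.5_vl_fp16.safetensors",
]

def missing_models(base_dir):
    """Return the expected files that are not present under base_dir."""
    base = Path(base_dir)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]

# Example: list anything still missing from a local install.
# print(missing_models("ComfyUI"))
```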
## ComfyUI OmniGen2 Text-to-Image Workflow
### 1. Download Workflow File

### 2. Complete Workflow Step by Step
Follow the numbered steps in the image to confirm each setting and ensure the workflow runs smoothly:
1. **Load Main Model**: Ensure the `Load Diffusion Model` node loads `omnigen2_fp16.safetensors`
2. **Load Text Encoder**: Ensure the `Load CLIP` node loads `qwen_2.5_vl_fp16.safetensors`
3. **Load VAE**: Ensure the `Load VAE` node loads `ae.safetensors`
4. **Set Image Dimensions**: Set the generated image dimensions in the `EmptySD3LatentImage` node (recommended 1024x1024)
5. **Input Prompts**:
* Input positive prompts in the first `CLIPTextEncode` node (content you want to appear in the image)
* Input negative prompts in the second `CLIPTextEncode` node (content you don't want to appear in the image)
6. **Start Generation**: Click the `Queue Prompt` button, or use the shortcut `Ctrl(cmd) + Enter` to execute text-to-image generation
7. **View Results**: After generation is complete, the corresponding images will be automatically saved to the `ComfyUI/output/` directory, and you can also preview them in the `SaveImage` node
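Once the workflow runs from the UI, the same generation can also be triggered programmatically: ComfyUI exposes a small HTTP API on port 8188 by default. A minimal sketch, assuming a local server and a workflow exported via the menu's "Export (API)" option (the filename is a placeholder):

```python
import json
import urllib.request

def build_payload(workflow):
    """Wrap an API-format workflow dict in the body that /prompt expects."""
    return json.dumps({"prompt": workflow}).encode("utf-8")

def queue_prompt(workflow, server="127.0.0.1:8188"):
    """Submit the workflow to a running ComfyUI server.

    Returns the response dict, which contains a "prompt_id" on success.
    """
    req = urllib.request.Request(
        f"http://{server}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage with a workflow saved via "Export (API)":
# with open("workflow_api.json") as f:
#     queue_prompt(json.load(f))
```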
## ComfyUI OmniGen2 Image Editing Workflow
OmniGen2 has rich image editing capabilities and supports adding text to images.
### 1. Download Workflow File

Download the image below, which we will use as the input image.

### 2. Complete Workflow Step by Step
1. **Load Main Model**: Ensure the `Load Diffusion Model` node loads `omnigen2_fp16.safetensors`
2. **Load Text Encoder**: Ensure the `Load CLIP` node loads `qwen_2.5_vl_fp16.safetensors`
3. **Load VAE**: Ensure the `Load VAE` node loads `ae.safetensors`
4. **Upload Image**: Upload the provided image in the `Load Image` node
5. **Input Prompts**:
* Input positive prompts in the first `CLIPTextEncode` node (content you want to appear in the image)
* Input negative prompts in the second `CLIPTextEncode` node (content you don't want to appear in the image)
6. **Start Generation**: Click the `Queue Prompt` button, or use the shortcut `Ctrl(cmd) + Enter` to run the image editing workflow
7. **View Results**: After generation is complete, the corresponding images will be automatically saved to the `ComfyUI/output/` directory, and you can also preview them in the `SaveImage` node
### 3. Additional Workflow Instructions
* If you want to enable the second image input, you can use the shortcut **Ctrl + B** to enable the corresponding node inputs for nodes that are in pink/purple state in the workflow
* If you want to customize dimensions, you can delete the `Get image size` node linked to the `EmptySD3LatentImage` node and input custom dimensions
# Qwen-Image ComfyUI Native Workflow Example
Source: https://docs.comfy.org/tutorials/image/qwen/qwen-image
Qwen-Image is a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license.
**Qwen-Image** is the first image generation foundation model released by Alibaba's Qwen team. It's a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license. The model has made significant advances in **complex text rendering** and **precise image editing**, achieving high-fidelity output for multiple languages including English and Chinese.
**Model Highlights**:
* **Excellent Multilingual Text Rendering**: Supports high-precision text generation in multiple languages including English, Chinese, Korean, Japanese, maintaining font details and layout consistency
* **Diverse Artistic Styles**: From photorealistic scenes to impressionist paintings, from anime aesthetics to minimalist design, fluidly adapting to various creative prompts
**Related Links**:
* [GitHub](https://github.com/QwenLM/Qwen-Image)
* [Hugging Face](https://huggingface.co/Qwen/Qwen-Image)
* [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image)
Currently Qwen-Image has multiple ControlNet support options available:
* [Qwen-Image-DiffSynth-ControlNets/model\_patches](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/tree/main/split_files/model_patches): Includes canny, depth, and inpaint models
* [qwen\_image\_union\_diffsynth\_lora.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/blob/main/split_files/loras/qwen_image_union_diffsynth_lora.safetensors): Image structure control LoRA supporting canny, depth, pose, lineart, softedge, normal, openpose
* InstantX ControlNet: To be updated
## ComfyOrg Qwen-Image live stream
**Qwen-Image in ComfyUI - Lightning & LoRAs**
VIDEO
**Qwen-Image ControlNet in ComfyUI - DiffSynth**
VIDEO
## Qwen-Image Native Workflow Example
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated. (Desktop version updates may lag behind the nightly release.)
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
There are three different models used in the workflow attached to this document:
1. Qwen-Image original model fp8\_e4m3fn
2. 8-step accelerated version: Qwen-Image original model fp8\_e4m3fn with lightx2v 8-step LoRA
3. Distilled version: Qwen-Image distilled model fp8\_e4m3fn
**VRAM Usage Reference**
GPU: RTX4090D 24GB
| Model Used | VRAM Usage | First Generation | Second Generation |
| ------------------------------------- | ---------- | ---------------- | ----------------- |
| fp8\_e4m3fn | 86% | ≈ 94s | ≈ 71s |
| fp8\_e4m3fn with lightx2v 8-step LoRA | 86% | ≈ 55s | ≈ 34s |
| Distilled fp8\_e4m3fn | 86% | ≈ 69s | ≈ 36s |
### 1. Workflow File
After updating ComfyUI, you can find the workflow file in the templates, or drag the workflow below into ComfyUI to load it.

Download Workflow for Qwen-Image Official Model
Distilled version
Download Workflow for Distilled Model
### 2. Model Download
**Available Models in ComfyUI**
* Qwen-Image\_bf16 (40.9 GB)
* Qwen-Image\_fp8 (20.4 GB)
* Distilled versions (non-official, require only 15 steps)
All models are available at [Huggingface](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) and [Modelscope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files)
**Diffusion model**
* [qwen\_image\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors)
**Qwen\_image\_distill**
* [qwen\_image\_distill\_full\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/non_official/diffusion_models/qwen_image_distill_full_fp8_e4m3fn.safetensors)
* [qwen\_image\_distill\_full\_bf16.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/non_official/diffusion_models/qwen_image_distill_full_bf16.safetensors)
- The original author of the distilled version recommends using 15 steps with cfg 1.0.
- According to tests, this distilled version also performs well at 10 steps with cfg 1.0. You can choose either euler or res\_multistep based on the type of image you want.
**LoRA**
* [Qwen-Image-Lightning-8steps-V1.0.safetensors](https://huggingface.co/lightx2v/Qwen-Image-Lightning/resolve/main/Qwen-Image-Lightning-8steps-V1.0.safetensors)
**Text encoder**
* [qwen\_2.5\_vl\_7b\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
**VAE**
* [qwen\_image\_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)
**Model Storage Location**
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ ├── qwen_image_fp8_e4m3fn.safetensors
│ │ └── qwen_image_distill_full_fp8_e4m3fn.safetensors ## distilled version
│ ├── 📂 loras/
│ │ └── Qwen-Image-Lightning-8steps-V1.0.safetensors ## 8-step acceleration LoRA
│ ├── 📂 vae/
│ │ └── qwen_image_vae.safetensors
│ └── 📂 text_encoders/
│ └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
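If you prefer scripting the downloads, the resolve URLs above can be fetched with the standard library alone. A sketch, assuming the default `ComfyUI` directory layout shown above (it skips files that already exist):

```python
import os
import urllib.request

BASE = "https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main"

# repo path -> ComfyUI models subdirectory (matches the tree above)
FILES = {
    "split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors": "diffusion_models",
    "split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors": "text_encoders",
    "split_files/vae/qwen_image_vae.safetensors": "vae",
}

def plan(comfy_dir="ComfyUI"):
    """Yield (url, destination) pairs without downloading anything."""
    for repo_path, subdir in FILES.items():
        dest = os.path.join(comfy_dir, "models", subdir, os.path.basename(repo_path))
        yield f"{BASE}/{repo_path}", dest

def download_all(comfy_dir="ComfyUI"):
    for url, dest in plan(comfy_dir):
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        if not os.path.exists(dest):
            urllib.request.urlretrieve(url, dest)
```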
### 3. Workflow Instructions
1. Make sure the `Load Diffusion Model` node has loaded `qwen_image_fp8_e4m3fn.safetensors`
2. Make sure the `Load CLIP` node has loaded `qwen_2.5_vl_7b_fp8_scaled.safetensors`
3. Make sure the `Load VAE` node has loaded `qwen_image_vae.safetensors`
4. Make sure the `EmptySD3LatentImage` node is set with the correct image dimensions
5. Set your prompt in the `CLIP Text Encoder` node; it supports multiple languages, including English, Chinese, Korean, Japanese, and Italian
6. If you want to enable the 8-step acceleration LoRA by lightx2v, select the node and use `Ctrl + B` to enable it, then adjust the KSampler settings as described in step 8
7. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
8. For different model versions and workflows, adjust the KSampler parameters accordingly
The distilled model and the lightx2v 8-step acceleration LoRA do not appear to be compatible with simultaneous use; you can experiment with different combinations to verify whether they work together.
## Qwen Image InstantX ControlNet Workflow
This is a ControlNet model, so you can use it like a standard ControlNet.
### 1. Workflow and Input Images
Download the image below and drag it into ComfyUI to load the workflow

Download JSON Format Workflow
Download the image below as input

### 2. Model Links
1. InstantX Controlnet
Download [Qwen-Image-InstantX-ControlNet-Union.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-InstantX-ControlNets/resolve/main/split_files/controlnet/Qwen-Image-InstantX-ControlNet-Union.safetensors) and save it to the `ComfyUI/models/controlnet/` folder
2. **Lotus Depth model**
We will use this model to generate the depth map of the image. The following two models need to be downloaded:
**Diffusion Model**
* [lotus-depth-d-v1-1.safetensors](https://huggingface.co/Comfy-Org/lotus/resolve/main/lotus-depth-d-v1-1.safetensors)
**VAE Model**
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors) or any SD1.5 VAE
```
ComfyUI/
├── models/
│ ├── diffusion_models/
│ │ └── lotus-depth-d-v1-1.safetensors
│ └── vae/
│ └── vae-ft-mse-840000-ema-pruned.safetensors
```
> You can also use custom nodes like [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) to generate the depth map.
### 3. Workflow Instructions
1. Ensure that the `Load ControlNet Model` node correctly loads the `Qwen-Image-InstantX-ControlNet-Union.safetensors` model
2. Upload input image
3. This subgraph uses the Lotus Depth model. You can find it in the templates or edit the subgraph to learn more; make sure all of its models are loaded correctly
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Qwen Image ControlNet DiffSynth-ControlNets Model Patches Workflow
This model is actually not a ControlNet but a model patch that supports three different control modes: canny, depth, and inpaint.
Original model address: [DiffSynth-Studio/Qwen-Image ControlNet](https://www.modelscope.cn/collections/Qwen-Image-ControlNet-6157b44e89d444)
Comfy Org rehost address: [Qwen-Image-DiffSynth-ControlNets/model\_patches](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/tree/main/split_files/model_patches)
### 1. Workflow and Input Images
Download the image below and drag it into ComfyUI to load the corresponding workflow

Download JSON Format Workflow
Download the image below as input:

### 2. Model Links
Other models are the same as the Qwen-Image basic workflow. You only need to download the models below and save them to the `ComfyUI/models/model_patches` folder
* [qwen\_image\_canny\_diffsynth\_controlnet.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/resolve/main/split_files/model_patches/qwen_image_canny_diffsynth_controlnet.safetensors)
* [qwen\_image\_depth\_diffsynth\_controlnet.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/resolve/main/split_files/model_patches/qwen_image_depth_diffsynth_controlnet.safetensors)
* [qwen\_image\_inpaint\_diffsynth\_controlnet.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/resolve/main/split_files/model_patches/qwen_image_inpaint_diffsynth_controlnet.safetensors)
### 3. Workflow Usage Instructions
Currently, diffsynth has three patch models: Canny, Depth, and Inpaint.
If this is your first time using ControlNet-style workflows, note that control images must be preprocessed into the formats a given model supports before the model can recognize and use them:
* Canny: Processed canny edge, line art contours
* Depth: Preprocessed depth map showing spatial relationships
* Inpaint: Requires using Mask to mark areas that need to be repainted
Since this patch model is divided into three different models, you need to select the correct preprocessing type when inputting to ensure proper image preprocessing.
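To make the preprocessing requirement concrete, here is a tiny pure-Python illustration of gradient-based edge extraction, the kind of operation the Canny preprocessor performs in a far more sophisticated form (an illustration only, not the node's actual algorithm):

```python
def edge_map(pixels, threshold=30):
    """Naive edge extraction: mark pixels whose horizontal or vertical
    intensity gradient exceeds the threshold.

    `pixels` is a 2D list of grayscale values (0-255); returns a map of
    the same size containing 0 (background) or 255 (edge).
    """
    h, w = len(pixels), len(pixels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = abs(pixels[y][x + 1] - pixels[y][x - 1])  # horizontal gradient
            gy = abs(pixels[y + 1][x] - pixels[y - 1][x])  # vertical gradient
            if gx + gy > threshold:
                out[y][x] = 255
    return out

# A flat region produces no edges; a sharp vertical boundary does.
flat = [[100] * 5 for _ in range(5)]
split = [[0, 0, 0, 255, 255] for _ in range(5)]
```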
**Canny Model ControlNet Usage Instructions**
1. Ensure that `qwen_image_canny_diffsynth_controlnet.safetensors` is loaded
2. Upload input image for subsequent processing
3. The Canny node is a native preprocessing node that will preprocess the input image according to your set parameters to control generation
4. If needed, you can modify the `strength` in the `QwenImageDiffsynthControlnet` node to control the intensity of line art control
5. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
> For using qwen\_image\_depth\_diffsynth\_controlnet.safetensors, you need to preprocess the image into a depth map and replace the `image processing` part. For this usage, please refer to the InstantX processing method in this document. Other parts are similar to using the Canny model.
**Inpaint Model ControlNet Usage Instructions**
For the Inpaint model, it requires using the [Mask Editor](/interface/maskeditor) to draw a mask and use it as input control condition.
1. Ensure that `ModelPatchLoader` loads the `qwen_image_inpaint_diffsynth_controlnet.safetensors` model
2. Upload image and use the [Mask Editor](/interface/maskeditor) to draw a mask. You need to connect the `mask` output of the corresponding `Load Image` node to the `mask` input of `QwenImageDiffsynthControlnet` to ensure the corresponding mask is loaded
3. Use the `Ctrl + B` shortcut to set the original Canny node in the workflow to bypass mode, so the Canny preprocessing is not applied
4. In `CLIP Text Encoder`, input what you want to change the masked area to
5. If needed, you can modify the `strength` in the `QwenImageDiffsynthControlnet` node to control the corresponding control intensity
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Qwen Image Union ControlNet LoRA Workflow
Original model address: [DiffSynth-Studio/Qwen-Image-In-Context-Control-Union](https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-In-Context-Control-Union/)
Comfy Org rehost address: [qwen\_image\_union\_diffsynth\_lora.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/blob/main/split_files/loras/qwen_image_union_diffsynth_lora.safetensors): Image structure control LoRA supporting canny, depth, pose, lineart, softedge, normal, openpose
### 1. Workflow and Input Images
Download the image below and drag it into ComfyUI to load the workflow

Download JSON Format Workflow
Download the image below as input

### 2. Model Links
Download the model below. Since this is a LoRA model, it needs to be saved to the `ComfyUI/models/loras/` folder
* [qwen\_image\_union\_diffsynth\_lora.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/blob/main/split_files/loras/qwen_image_union_diffsynth_lora.safetensors): Image structure control LoRA supporting canny, depth, pose, lineart, softedge, normal, openpose
### 3. Workflow Instructions
This model is a unified control LoRA that supports canny, depth, pose, lineart, softedge, normal, and openpose controls. Since the native preprocessing nodes do not cover all of these modes, you may need a custom node pack such as [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for the remaining image preprocessing.
1. Ensure that `LoraLoaderModelOnly` correctly loads the `qwen_image_union_diffsynth_lora.safetensors` model
2. Upload input image
3. If needed, you can adjust the `Canny` node parameters. Since different input images require different parameter settings to get better image preprocessing results, you can try adjusting the corresponding parameter values to get more/fewer details
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
> For other types of control, you also need to replace the image processing part.
# Qwen-Image-Edit ComfyUI Native Workflow Example
Source: https://docs.comfy.org/tutorials/image/qwen/qwen-image-edit
Qwen-Image-Edit is the image editing version of Qwen-Image, further trained based on the 20B model, supporting precise text editing and dual semantic/appearance editing capabilities.
**Qwen-Image-Edit** is the image editing version of Qwen-Image. It is further trained based on the 20B Qwen-Image model, successfully extending Qwen-Image's unique text rendering capabilities to editing tasks, enabling precise text editing. In addition, Qwen-Image-Edit feeds the input image into both Qwen2.5-VL (for visual semantic control) and the VAE Encoder (for visual appearance control), thus achieving dual semantic and appearance editing capabilities.
**Model Features**
Features include:
* Precise Text Editing: Qwen-Image-Edit supports bilingual (Chinese and English) text editing, allowing direct addition, deletion, and modification of text in images while preserving the original text size, font, and style.
* Dual Semantic/Appearance Editing: Qwen-Image-Edit supports not only low-level visual appearance editing (such as style transfer, addition, deletion, modification, etc.) but also high-level visual semantic editing (such as IP creation, object rotation, etc.).
* Strong Cross-Benchmark Performance: Evaluations on multiple public benchmarks show that Qwen-Image-Edit achieves SOTA in editing tasks, making it a powerful foundational model for image generation.
**Official Links**:
* [GitHub Repository](https://github.com/QwenLM/Qwen-Image)
* [Hugging Face](https://huggingface.co/Qwen/Qwen-Image-Edit)
* [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image-Edit)
## ComfyOrg Qwen-Image-Edit Live Stream
VIDEO
## Qwen-Image-Edit ComfyUI Native Workflow Example
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated. (Desktop version updates may lag behind the nightly release.)
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
### 1. Workflow File
After updating ComfyUI, you can find the workflow file from the templates, or drag the workflow below into ComfyUI to load it.

Download JSON Workflow
Download the image below as input

### 2. Model Download
All models can be found at [Comfy-Org/Qwen-Image\_ComfyUI](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [Comfy-Org/Qwen-Image-Edit\_ComfyUI](https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI)
**Diffusion model**
* [qwen\_image\_edit\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors)
**LoRA**
* [Qwen-Image-Lightning-4steps-V1.0.safetensors](https://huggingface.co/lightx2v/Qwen-Image-Lightning/resolve/main/Qwen-Image-Lightning-4steps-V1.0.safetensors)
**Text encoder**
* [qwen\_2.5\_vl\_7b\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
**VAE**
* [qwen\_image\_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)
Model Storage Location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── qwen_image_edit_fp8_e4m3fn.safetensors
│ ├── 📂 loras/
│ │ └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│ ├── 📂 vae/
│ │ └── qwen_image_vae.safetensors
│ └── 📂 text_encoders/
│ └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
### 3. Follow the Steps to Complete the Workflow
1. Model Loading
* Ensure the `Load Diffusion Model` node loads `qwen_image_edit_fp8_e4m3fn.safetensors`
* Ensure the `Load CLIP` node loads `qwen_2.5_vl_7b_fp8_scaled.safetensors`
* Ensure the `Load VAE` node loads `qwen_image_vae.safetensors`
2. Image Loading
* Ensure the `Load Image` node uploads the image to be edited
3. Prompt Setting
* Set the prompt in the `CLIP Text Encoder` node
4. The `Scale Image to Total Pixels` node scales your input image to a total of one megapixel
* This mainly avoids output quality loss caused by oversized input images such as 2048x2048
* If you are confident about your input image size, you can bypass this node with `Ctrl+B`
5. If you want to use the 4-step Lighting LoRA to speed up image generation, you can select the `LoraLoaderModelOnly` node and press `Ctrl+B` to enable it
6. For the `steps` and `cfg` settings of the KSampler node, we've added a note below the node where you can test for optimal parameter settings
7. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
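The one-megapixel rescale in step 4 is simple to reason about: the node resizes the image so that width × height is about one megapixel while preserving the aspect ratio. A sketch of the arithmetic (snapping each side to a multiple of 8 is an assumption here; the actual node may round differently):

```python
import math

def scale_to_megapixels(width, height, megapixels=1.0, multiple=8):
    """Scale (width, height) so the area is ~`megapixels` * 1024*1024 px,
    keeping the aspect ratio and snapping each side to `multiple`."""
    target_area = megapixels * 1024 * 1024
    scale = math.sqrt(target_area / (width * height))

    def snap(v):
        return max(multiple, round(v * scale / multiple) * multiple)

    return snap(width), snap(height)

# An oversized 2048x2048 input comes down to 1024x1024:
# scale_to_megapixels(2048, 2048) -> (1024, 1024)
```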
# Cosmos Predict2 Video2World ComfyUI Official Example
Source: https://docs.comfy.org/tutorials/video/cosmos/cosmos-predict2-video2world
This guide demonstrates how to complete Cosmos-Predict2 Video2World workflows in ComfyUI
Cosmos-Predict2 is NVIDIA's next-generation physical world foundation model, specifically designed for high-quality visual generation and prediction tasks in physical AI scenarios.
The model features exceptional physical accuracy, environmental interactivity, and detail reproduction capabilities, enabling realistic simulation of complex physical phenomena and dynamic scenes.
Cosmos-Predict2 supports various generation methods including Text-to-Image (Text2Image) and Video-to-World (Video2World),
and is widely used in industrial simulation, autonomous driving, urban planning, scientific research, and other fields.
It serves as a crucial foundational tool for promoting deep integration of intelligent vision and the physical world.
GitHub:[Cosmos-predict2](https://github.com/nvidia-cosmos/cosmos-predict2)
huggingface: [Cosmos-Predict2](https://huggingface.co/collections/nvidia/cosmos-predict2-68028efc052239369a0f2959)
This guide will walk you through completing **Video2World** generation in ComfyUI.
For the text-to-image section, please refer to the following part:
Using Cosmos-Predict2 for text-to-image generation
{/*
## Cosmos Predict2 Video2World Workflow
When testing the 2B version, it takes around 16GB VRAM.
### 1. Workflow File
Please download the video below and drag it into ComfyUI to load the workflow. The workflow already has embedded model download links.
Download Json Format Workflow File
Please download the following image as input:

### 2. Manual Model Installation
If the model download wasn't successful, you can try to download them manually by yourself in this section.
**Diffusion model**
- [cosmos_predict2_2B_video2world_480p_16fps.safetensors](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged/resolve/main/cosmos_predict2_2B_video2world_480p_16fps.safetensors)
For other weights, please visit [Cosmos_Predict2_repackaged](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged) to download
**Text encoder**
[oldt5_xxl_fp8_e4m3fn_scaled.safetensors](https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/resolve/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors)
**VAE**
[wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
File Storage Location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── cosmos_predict2_2B_video2world_480p_16fps.safetensors
│ ├── 📂 text_encoders/
│ │ └── oldt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └── 📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Complete Workflow Step by Step

Please follow the steps in the image to run the workflow:
1. Ensure the `Load Diffusion Model` node has loaded `cosmos_predict2_2B_video2world_480p_16fps.safetensors`
2. Ensure the `Load CLIP` node has loaded `oldt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Upload the provided input image in the `Load Image` node
5. (Optional) If you need first and last frame control, use the shortcut `Ctrl(cmd) + B` to enable last frame input
6. (Optional) You can modify the prompts in the `ClipTextEncode` node
7. (Optional) Modify the size and frame count in the `CosmosPredict2ImageToVideoLatent` node
8. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
9. Once generation is complete, the video will automatically save to the `ComfyUI/output/` directory; you can also preview it in the `save video` node */}
# ComfyUI Hunyuan Video Examples
Source: https://docs.comfy.org/tutorials/video/hunyuan-video
This guide shows how to use Hunyuan Text-to-Video and Image-to-Video workflows in ComfyUI
Hunyuan Video series is developed and open-sourced by [Tencent](https://huggingface.co/tencent), featuring a hybrid architecture that supports both [Text-to-Video](https://github.com/Tencent/HunyuanVideo) and [Image-to-Video](https://github.com/Tencent/HunyuanVideo-I2V) generation with a parameter scale of 13B.
Technical features:
* **Core Architecture:** Uses a DiT (Diffusion Transformer) architecture similar to Sora, effectively fusing text, image, and motion information to improve consistency, quality, and alignment between generated video frames. A unified full-attention mechanism enables multi-view camera transitions while ensuring subject consistency.
* **3D VAE:** The custom 3D VAE compresses videos into a compact latent space, making image-to-video generation more efficient.
* **Superior Image-Video-Text Alignment:** Utilizing MLLM text encoders that excel in both image and video generation, better following text instructions, capturing details, and performing complex reasoning.
You can learn more through the official repositories: [Hunyuan Video](https://github.com/Tencent/HunyuanVideo) and [Hunyuan Video-I2V](https://github.com/Tencent/HunyuanVideo-I2V).
This guide will walk you through setting up both **Text-to-Video** and **Image-to-Video** workflows in ComfyUI.
The workflow images in this tutorial contain metadata with model download information.
Simply drag them into ComfyUI or use the menu `Workflows` -> `Open (ctrl+o)` to load the corresponding workflow, which will prompt you to download the required models.
Alternatively, this guide provides direct model links if automatic downloads fail or you are not using the Desktop version. All models are available [here](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files) for download.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated. (Desktop version updates may lag behind the nightly release.)
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Common Models for All Workflows
The following models are used in both Text-to-Video and Image-to-Video workflows. Please download and save them to the specified directories:
* [clip\_l.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/clip_l.safetensors?download=true)
* [llava\_llama3\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/llava_llama3_fp8_scaled.safetensors?download=true)
* [hunyuan\_video\_vae\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/vae/hunyuan_video_vae_bf16.safetensors?download=true)
Storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── llava_llama3_fp8_scaled.safetensors
│ ├── vae/
│ │ └── hunyuan_video_vae_bf16.safetensors
```
## Hunyuan Text-to-Video Workflow
Hunyuan Text-to-Video was open-sourced in December 2024, supporting 5-second short video generation through natural language descriptions in both Chinese and English.
### 1. Workflow
Download the image below and drag it into ComfyUI to load the workflow:

### 2. Manual Models Installation
Download [hunyuan\_video\_t2v\_720p\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models` folder.
Ensure you have all these model files in the correct locations:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors // Shared model
│ │ └── llava_llama3_fp8_scaled.safetensors // Shared model
│ ├── vae/
│ │ └── hunyuan_video_vae_bf16.safetensors // Shared model
│ └── diffusion_models/
│ └── hunyuan_video_t2v_720p_bf16.safetensors // T2V model
```
### 3. Steps to Run the Workflow
1. Ensure the `DualCLIPLoader` node has loaded these models:
* clip\_name1: clip\_l.safetensors
* clip\_name2: llava\_llama3\_fp8\_scaled.safetensors
2. Ensure the `Load Diffusion Model` node has loaded `hunyuan_video_t2v_720p_bf16.safetensors`
3. Ensure the `Load VAE` node has loaded `hunyuan_video_vae_bf16.safetensors`
4. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
When the `length` parameter in the `EmptyHunyuanLatentVideo` node is set to 1, the model can generate a static image.
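In API-format workflow JSON, that tip corresponds to setting the node's `length` input to 1. A hedged fragment (the input names mirror the node's UI fields, and the width/height values are placeholders):

```python
# Fragment of an API-format workflow: an EmptyHunyuanLatentVideo node
# configured to emit a single frame, i.e. a static image.
node = {
    "class_type": "EmptyHunyuanLatentVideo",
    "inputs": {"width": 848, "height": 480, "length": 1, "batch_size": 1},
}
```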
## Hunyuan Image-to-Video Workflow
Hunyuan Image-to-Video model was open-sourced on March 6, 2025, based on the HunyuanVideo framework. It transforms static images into smooth, high-quality videos and also provides LoRA training code to customize special video effects like hair growth, object transformation, etc.
Currently, the Hunyuan Image-to-Video model has two versions:
* v1 "concat": Better motion fluidity but less adherence to the image guidance
* v2 "replace": Updated the day after v1, with better image guidance but seemingly less dynamic compared to v1
v1 "concat"
v2 "replace"
### Shared Model for v1 and v2 Versions
Download the following file and save it to the `ComfyUI/models/clip_vision` directory:
* [llava\_llama3\_vision.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/clip_vision/llava_llama3_vision.safetensors?download=true)
### V1 "concat" Image-to-Video Workflow
#### 1. Workflow and Asset
Download the workflow image below and drag it into ComfyUI to load the workflow:

Download the image below, which we'll use as the starting frame for the image-to-video generation:

#### 2. Manual Model Installation
Download [hunyuan\_video\_image\_to\_video\_720p\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_image_to_video_720p_bf16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models` folder.
Ensure you have all these model files in the correct locations:
```
ComfyUI/
├── models/
│   ├── clip_vision/
│   │   └── llava_llama3_vision.safetensors                    // I2V shared model
│   ├── text_encoders/
│   │   ├── clip_l.safetensors                                 // Shared model
│   │   └── llava_llama3_fp8_scaled.safetensors                // Shared model
│   ├── vae/
│   │   └── hunyuan_video_vae_bf16.safetensors                 // Shared model
│   └── diffusion_models/
│       └── hunyuan_video_image_to_video_720p_bf16.safetensors // I2V v1 "concat" version model
```
#### 3. Steps to Run the Workflow
1. Ensure the `DualCLIPLoader` node has loaded these models:
   * clip\_name1: clip\_l.safetensors
   * clip\_name2: llava\_llama3\_fp8\_scaled.safetensors
2. Ensure the `Load CLIP Vision` node has loaded `llava_llama3_vision.safetensors`
3. Ensure the `Load Diffusion Model` node has loaded `hunyuan_video_image_to_video_720p_bf16.safetensors`
4. Ensure the `Load VAE` node has loaded `hunyuan_video_vae_bf16.safetensors`
5. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
### v2 "replace" Image-to-Video Workflow
The v2 workflow is essentially the same as the v1 workflow. You just need to download the **replace** model and use it in the `Load Diffusion Model` node.
#### 1. Workflow and Asset
Download the workflow image below and drag it into ComfyUI to load the workflow:

Download the image below, which we'll use as the starting frame for the image-to-video generation:

#### 2. Manual Model Installation
Download [hunyuan\_video\_v2\_replace\_image\_to\_video\_720p\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models` folder.
Ensure you have all these model files in the correct locations:
```
ComfyUI/
├── models/
│   ├── clip_vision/
│   │   └── llava_llama3_vision.safetensors                               // I2V shared model
│   ├── text_encoders/
│   │   ├── clip_l.safetensors                                            // Shared model
│   │   └── llava_llama3_fp8_scaled.safetensors                           // Shared model
│   ├── vae/
│   │   └── hunyuan_video_vae_bf16.safetensors                            // Shared model
│   └── diffusion_models/
│       └── hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors // V2 "replace" version model
```
#### 3. Steps to Run the Workflow
1. Ensure the `DualCLIPLoader` node has loaded these models:
   * clip\_name1: clip\_l.safetensors
   * clip\_name2: llava\_llama3\_fp8\_scaled.safetensors
2. Ensure the `Load CLIP Vision` node has loaded `llava_llama3_vision.safetensors`
3. Ensure the `Load Diffusion Model` node has loaded `hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors`
4. Ensure the `Load VAE` node has loaded `hunyuan_video_vae_bf16.safetensors`
5. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Try it yourself
Here are some prompts we provide. Use them as they are, or adjust them to create your own videos.
```
Futuristic robot dancing ballet, dynamic motion, fast motion, fast shot, moving scene
```
***
```
Samurai waving sword and hitting the camera. camera angle movement, zoom in, fast scene, super fast, dynamic
```
***
```
flying car fastly moving and flying through the city
```
***
```
cyberpunk car race in night city, dynamic, super fast, fast shot
```
# LTX-Video
Source: https://docs.comfy.org/tutorials/video/ltxv
[LTX-Video](https://huggingface.co/Lightricks/LTX-Video) is a very efficient video model by Lightricks. The important thing with this model is to give it long, descriptive prompts.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates can lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included)
3. Some nodes failed to import at startup
## Multi Frame Control
Allows you to control the video with a series of images. You can download the input images: [starting frame](https://raw.githubusercontent.com/Comfy-Org/example_workflows/refs/heads/main/ltxv/multi-frame/house1.png) and [ending frame](https://raw.githubusercontent.com/Comfy-Org/example_workflows/refs/heads/main/ltxv/multi-frame/house2.png).
Drag the video directly into ComfyUI to run the workflow.
## Image to Video
Allows you to control the video with a first [frame image](https://raw.githubusercontent.com/Comfy-Org/example_workflows/refs/heads/main/ltxv/i2v/girl1.png).
Drag the video directly into ComfyUI to run the workflow.
## Text to Video
Drag the video directly into ComfyUI to run the workflow.
## Requirements
Download the following models and place them in the locations specified below:
* [ltx-video-2b-v0.9.5.safetensors](https://huggingface.co/Lightricks/LTX-Video/resolve/main/ltx-video-2b-v0.9.5.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/mochi_preview_repackaged/resolve/main/split_files/text_encoders/t5xxl_fp16.safetensors?download=true)
```
ComfyUI/
├── models/
│   ├── checkpoints/
│   │   └── ltx-video-2b-v0.9.5.safetensors
│   └── text_encoders/
│       └── t5xxl_fp16.safetensors
```
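The download links above follow Hugging Face's standard `resolve/main` URL pattern. As a sketch (assuming you prefer scripted downloads over the browser), you can build the direct-download URLs like this and fetch them with any HTTP client:

```python
def hf_resolve_url(repo_id, path):
    """Build a direct-download URL for a file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{path}?download=true"

# Destination (relative to ComfyUI/) -> download URL, from the links above
downloads = {
    "models/checkpoints/ltx-video-2b-v0.9.5.safetensors":
        hf_resolve_url("Lightricks/LTX-Video", "ltx-video-2b-v0.9.5.safetensors"),
    "models/text_encoders/t5xxl_fp16.safetensors":
        hf_resolve_url("Comfy-Org/mochi_preview_repackaged",
                       "split_files/text_encoders/t5xxl_fp16.safetensors"),
}

for dest, url in downloads.items():
    print(dest, "<-", url)
```

The same pattern applies to the other repackaged model links used throughout this guide.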
# ComfyUI Wan2.1 Fun Camera Official Examples
Source: https://docs.comfy.org/tutorials/video/wan/fun-camera
This guide demonstrates how to use Wan2.1 Fun Camera in ComfyUI for video generation
## About Wan2.1 Fun Camera
**Wan2.1 Fun Camera** is a video generation project launched by the Alibaba team, focusing on controlling video generation effects through camera motion.
**Model Weights Download**:
* [14B Version](https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-14B-Control-Camera)
* [1.3B Version](https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-1.3B-Control-Camera)
**Code Repository**: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
**ComfyUI now natively supports the Wan2.1 Fun Camera model**.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates can lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included)
3. Some nodes failed to import at startup
## Model Installation
These models only need to be installed once. Additionally, model download information is included in the corresponding workflow images, so you can choose your preferred way to download the models.
All of the following models can be found at [Wan\_2.1\_ComfyUI\_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged)
**Diffusion Models** choose either 1.3B or 14B:
* [wan2.1\_fun\_camera\_v1.1\_1.3B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors)
* [wan2.1\_fun\_camera\_v1.1\_14B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_camera_v1.1_14B_bf16.safetensors)
If you've used Wan2.1 related models before, you should already have the following models. If not, please download them:
**Text Encoders** choose one:
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors)
File Storage Location:
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   ├── wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors # 1.3B version
│   │   └── wan2.1_fun_camera_v1.1_14B_bf16.safetensors  # 14B version
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── 📂 vae/
│   │   └── wan_2.1_vae.safetensors
│   └── 📂 clip_vision/
│       └── clip_vision_h.safetensors
```
## ComfyUI Wan2.1 Fun Camera 1.3B Native Workflow Example
### 1. Workflow Related Files Download
#### 1.1 Workflow File
Download the video below and drag it into ComfyUI to load the corresponding workflow:
Download JSON Workflow File
If you want to use the 14B version, simply replace the model file with the 14B version, but please be aware of the VRAM requirements.
#### 1.2 Input Image Download
Please download the image below, which we will use as the starting frame:

### 2. Complete the Workflow Step by Step
1. Ensure the correct version of the model file is loaded:
* 1.3B version: `wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors`
* 14B version: `wan2.1_fun_camera_v1.1_14B_bf16.safetensors`
2. Ensure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Ensure the `Load CLIP Vision` node has loaded `clip_vision_h.safetensors`
5. Upload the starting frame to the `Load Image` node
6. Modify the Prompt if you're using your own input image
7. Set camera motion in the `WanCameraEmbedding` node
8. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute generation
## ComfyUI Wan2.1 Fun Camera 14B Workflow and Input Image
Download JSON Workflow File
**Input Image**

## Performance Reference
**1.3B Version**:
* 512×512 resolution on RTX 4090 takes about 72 seconds to generate 81 frames
**14B Version**:
* An RTX 4090 with 24GB of VRAM may run out of memory when generating at 512×512 resolution, and memory issues have also occurred on an A100 at larger sizes
# ComfyUI Wan2.1 Fun Control Video Examples
Source: https://docs.comfy.org/tutorials/video/wan/fun-control
This guide demonstrates how to use Wan2.1 Fun Control in ComfyUI to generate videos with control videos
## About Wan2.1-Fun-Control
**Wan2.1-Fun-Control** is an open-source video generation and control project developed by the Alibaba team.
It introduces innovative Control Codes mechanisms combined with deep learning and multimodal conditional inputs to generate high-quality videos that conform to preset control conditions. The project focuses on precisely guiding generated video content through multimodal control conditions.
Currently, the Fun Control model supports various control conditions, including **Canny (line art), Depth, OpenPose (human posture), MLSD (geometric edges), and trajectory control.**
The model also supports multi-resolution video prediction with options for 512, 768, and 1024 resolutions at 16 frames per second, generating videos up to 81 frames (approximately 5 seconds) in length.
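The stated duration follows directly from the frame count and frame rate:

```python
# 81 frames at 16 frames per second
frames, fps = 81, 16
duration = frames / fps  # seconds
print(f"{frames} frames at {fps} fps ≈ {duration:.2f} s")  # ≈ 5.06 s
```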
Model versions:
* **1.3B** Lightweight: Suitable for local deployment and quick inference with **lower VRAM requirements**
* **14B** High-performance: Model size reaches 32GB+, offering better results but **requiring higher VRAM**
Here are the relevant model weights and code repositories:
* [Wan2.1-Fun-1.3B-Control](https://huggingface.co/alibaba-pai/Wan2.1-Fun-1.3B-Control)
* [Wan2.1-Fun-14B-Control](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-Control)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
ComfyUI now **natively supports** the Wan2.1 Fun Control model. Before starting this tutorial, please update your ComfyUI to ensure you're using a version after [this commit](https://github.com/comfyanonymous/ComfyUI/commit/3661c833bcc41b788a7c9f0e7bc48524f8ee5f82).
In this guide, we'll provide two workflows:
1. A workflow using only native Comfy Core nodes
2. A workflow using custom nodes
Due to current limitations in native nodes for video support, the native-only workflow ensures users can complete the process without installing custom nodes.
However, we've found that providing a good user experience for video generation is challenging without custom nodes, so we're providing both workflow versions in this guide.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates can lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included)
3. Some nodes failed to import at startup
## Model Installation
You only need to install these models once. The workflow images also contain model download information, so you can choose your preferred download method.
The following models can be found at [Wan\_2.1\_ComfyUI\_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged) and [Wan2.1-Fun](https://huggingface.co/collections/alibaba-pai/wan21-fun-67e4fb3b76ca01241eb7e334)
Click the corresponding links to download. If you've used Wan-related workflows before, you only need to download the **Diffusion models**.
**Diffusion models** - choose 1.3B or 14B. The 14B version has a larger file size (32GB) and higher VRAM requirements:
* [wan2.1\_fun\_control\_1.3B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_control_1.3B_bf16.safetensors?download=true)
* [Wan2.1-Fun-14B-Control](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-Control/blob/main/diffusion_pytorch_model.safetensors?download=true): Rename to `Wan2.1-Fun-14B-Control.safetensors` after downloading
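Since the 14B checkpoint downloads under the generic name `diffusion_pytorch_model.safetensors`, it needs to be renamed when you place it in the models folder. A minimal sketch of that step (the source path is a hypothetical download location; adjust to your setup):

```python
import os
import shutil

def install_checkpoint(src_path, dest_dir, new_name):
    """Move a downloaded checkpoint into dest_dir under a clearer file name."""
    os.makedirs(dest_dir, exist_ok=True)
    dst = os.path.join(dest_dir, new_name)
    shutil.move(src_path, dst)
    return dst

# Example usage (hypothetical paths):
# install_checkpoint("Downloads/diffusion_pytorch_model.safetensors",
#                    "ComfyUI/models/diffusion_models",
#                    "Wan2.1-Fun-14B-Control.safetensors")
```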
**Text encoders** - choose one of the following models (fp16 precision has a larger size and higher performance requirements):
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File storage location:
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── wan2.1_fun_control_1.3B_bf16.safetensors
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── 📂 vae/
│   │   └── wan_2.1_vae.safetensors
│   └── 📂 clip_vision/
│       └── clip_vision_h.safetensors
```
## ComfyUI Native Workflow
In this workflow, we use videos converted to **WebP format** since the `Load Image` node doesn't currently support mp4 format. We also use **Canny Edge** to preprocess the original video.
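If your control video is an mp4, you'll need to convert it to animated WebP first. A sketch using `ffmpeg` (an assumption: `ffmpeg` with the `libwebp` encoder is installed; the file names are examples):

```python
import os
import shutil
import subprocess

def mp4_to_webp_cmd(src, dst, fps=16):
    """Build an ffmpeg command that converts an mp4 to an animated WebP."""
    return ["ffmpeg", "-y", "-i", src,
            "-vcodec", "libwebp", "-loop", "0",  # libwebp encoder, loop forever
            "-r", str(fps),                      # match the workflow's frame rate
            dst]

cmd = mp4_to_webp_cmd("control.mp4", "control.webp")
print(" ".join(cmd))
if shutil.which("ffmpeg") and os.path.exists("control.mp4"):
    subprocess.run(cmd, check=True)
```

You can also run the printed command directly in a terminal.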
Because many users encounter installation failures and environment issues when installing custom nodes, this version of the workflow uses only native nodes to ensure a smoother experience.
Thanks to the ComfyUI community authors for providing feature-rich nodes. If you want to jump straight to that version, see [Workflow Using Custom Nodes](#workflow-using-custom-nodes).
### 1. Workflow File Download
#### 1.1 Workflow File
Download the image below and drag it into ComfyUI to load the workflow:

#### 1.2 Input Images and Videos Download
Please download the following image and video for input:


### 2. Complete the Workflow Step by Step
1. Ensure the `Load Diffusion Model` node has loaded `wan2.1_fun_control_1.3B_bf16.safetensors`
2. Ensure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Ensure the `Load CLIP Vision` node has loaded `clip_vision_h.safetensors`
5. Upload the starting frame to the `Load Image` node (renamed to `Start_image`)
6. Upload the control video to the second `Load Image` node. Note: This node currently doesn't support mp4, only WebP videos
7. (Optional) Modify the prompt (both English and Chinese are supported)
8. (Optional) Adjust the video size in `WanFunControlToVideo`, avoiding overly large dimensions
9. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
### 3. Usage Notes
* The number of frames fed into the `WanFunControlToVideo` node should match the control video. If the specified frame count exceeds the actual number of control video frames, the excess frames may not follow the control conditions. We address this issue in the [Workflow Using Custom Nodes](#workflow-using-custom-nodes)
* Avoid setting overly large dimensions, as this can make the sampling process very time-consuming. Try generating smaller images first, then upscale
* Use your imagination to build upon this workflow by adding text-to-image or other types of workflows to achieve direct text-to-video generation or style transfer
* Use tools like [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for richer control options
## Workflow Using Custom Nodes
We'll need to install the following two custom nodes:
* [ComfyUI-VideoHelperSuite](https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite)
* [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
You can use [ComfyUI Manager](https://github.com/Comfy-Org/ComfyUI-Manager) to install missing nodes or follow the installation instructions for each custom node package.
### 1. Workflow File Download
#### 1.1 Workflow File
Download the image below and drag it into ComfyUI to load the workflow:

Due to the large size of video files, you can also click [here](https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/wan2.1_fun_control/wan2.1_fun_control_use_custom_nodes.json) to download the workflow file in JSON format.
#### 1.2 Input Images and Videos Download
Please download the following image and video for input:

### 2. Complete the Workflow Step by Step
> The model part is essentially the same. If you've already experienced the native-only workflow, you can directly upload the corresponding images and run it.
1. Ensure the `Load Diffusion Model` node has loaded `wan2.1_fun_control_1.3B_bf16.safetensors`
2. Ensure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Ensure the `Load CLIP Vision` node has loaded `clip_vision_h.safetensors`
5. Upload the starting frame to the `Load Image` node
6. Upload an mp4 format video to the `Load Video(Upload)` custom node. Note that the workflow has adjusted the default `frame_load_cap`
7. For the current image, the `DWPose Estimator` only uses the `detect_face` option
8. (Optional) Modify the prompt (both English and Chinese are supported)
9. (Optional) Adjust the video size in `WanFunControlToVideo`, avoiding overly large dimensions
10. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
### 3. Workflow Notes
Thanks to the ComfyUI community authors for their custom node packages:
* This example uses `Load Video(Upload)` to support mp4 videos
* The `video_info` obtained from `Load Video(Upload)` allows us to maintain the same `fps` for the output video
* You can replace `DWPose Estimator` with other preprocessors from the `comfyui_controlnet_aux` node package
* Prompts support multiple languages
## Usage Tips
* You can combine multiple image preprocessing techniques and then use the `Image Blend` node to apply multiple control methods simultaneously
* You can use the `Video Combine` node from `ComfyUI-VideoHelperSuite` to save videos in mp4 format
* We use the `SaveAnimatedWEBP` node because ComfyUI does not currently support embedding the workflow into **mp4** files, and some custom nodes may not support workflow embedding either; saving as animated WebP preserves the workflow in the output video
* In the `WanFunControlToVideo` node, the `control_video` input is not mandatory. You can skip the control video, first generate a very small video (e.g. 320×320), and then use it as the control video input to achieve consistent results

Other video-related custom node packages:
* [ComfyUI-WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)
* [ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes)
# ComfyUI Wan2.1 Fun InP Video Examples
Source: https://docs.comfy.org/tutorials/video/wan/fun-inp
This guide demonstrates how to use Wan2.1 Fun InP in ComfyUI to generate videos with first and last frame control
## About Wan2.1-Fun-InP
**Wan-Fun InP** is an open-source video generation model released by Alibaba, part of the Wan2.1-Fun series, focusing on generating videos from images with first and last frame control.
**Key features**:
* **First and last frame control**: Supports inputting both first and last frame images to generate transitional video between them, enhancing video coherence and creative freedom. Compared to earlier community versions, Alibaba's official model produces more stable and significantly higher quality results.
* **Multi-resolution support**: Supports generating videos at 512×512, 768×768, 1024×1024 and other resolutions to accommodate different scenario requirements.
**Model versions**:
* **1.3B** Lightweight: Suitable for local deployment and quick inference with **lower VRAM requirements**
* **14B** High-performance: Model size reaches 32GB+, offering better results but requiring **higher VRAM**
Below are the relevant model weights and code repositories:
* [Wan2.1-Fun-1.3B-Input](https://huggingface.co/alibaba-pai/Wan2.1-Fun-1.3B-Input)
* [Wan2.1-Fun-14B-Input](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-Input)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
Currently, ComfyUI natively supports the Wan2.1 Fun InP model. Before starting this tutorial, please update your ComfyUI to ensure your version is after [this commit](https://github.com/comfyanonymous/ComfyUI/commit/0a1f8869c9998bbfcfeb2e97aa96a6d3e0a2b5df).
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates can lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included)
3. Some nodes failed to import at startup
## Wan2.1 Fun InP Workflow
### 1. Workflow File Download
Download the image below and drag it into ComfyUI to load the workflow:

### 2. Manual Model Installation
If automatic model downloading fails, please download the models manually and save them to the corresponding folders.
The following models can be found at [Wan\_2.1\_ComfyUI\_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged) and [Wan2.1-Fun](https://huggingface.co/collections/alibaba-pai/wan21-fun-67e4fb3b76ca01241eb7e334)
**Diffusion models** - choose 1.3B or 14B. The 14B version has a larger file size (32GB) and higher VRAM requirements:
* [wan2.1\_fun\_inp\_1.3B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_inp_1.3B_bf16.safetensors?download=true)
* [Wan2.1-Fun-14B-InP](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-InP/resolve/main/diffusion_pytorch_model.safetensors?download=true): Rename to `Wan2.1-Fun-14B-InP.safetensors` after downloading
**Text encoders** - choose one of the following models (fp16 precision has a larger size and higher performance requirements):
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File storage location:
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── wan2.1_fun_inp_1.3B_bf16.safetensors
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── 📂 vae/
│   │   └── wan_2.1_vae.safetensors
│   └── 📂 clip_vision/
│       └── clip_vision_h.safetensors
```
### 3. Complete the Workflow Step by Step
1. Ensure the `Load Diffusion Model` node has loaded `wan2.1_fun_inp_1.3B_bf16.safetensors`
2. Ensure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Ensure the `Load CLIP Vision` node has loaded `clip_vision_h.safetensors`
5. Upload the starting frame to the `Load Image` node (renamed to `Start_image`)
6. Upload the ending frame to the second `Load Image` node
7. (Optional) Modify the prompt (both English and Chinese are supported)
8. (Optional) Adjust the video size in `WanFunInpaintToVideo`, avoiding overly large dimensions
9. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
### 4. Workflow Notes
Please make sure to use the correct model: `wan2.1_fun_inp_1.3B_bf16.safetensors` and `wan2.1_fun_control_1.3B_bf16.safetensors` are stored in the same folder and have very similar names.
* When using Wan Fun InP, you may need to frequently adjust prompts to ensure accurate scene transitions.
## Other Wan2.1 Fun InP or video-related custom node packages
* [ComfyUI-VideoHelperSuite](https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite)
* [ComfyUI-WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)
* [ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes)
# ComfyUI Wan2.1 VACE Video Examples
Source: https://docs.comfy.org/tutorials/video/wan/vace
This article introduces how to complete Wan VACE video generation examples in ComfyUI
We have adjusted the template and added usage instructions for CausVid LoRA, so this document is pending an update. Until then, please refer to the notes in the template.
## About VACE
VACE 14B is an open-source unified video editing model launched by the Alibaba Tongyi Wanxiang team. Through integrating multi-task capabilities, supporting high-resolution processing and flexible multi-modal input mechanisms, this model significantly improves the efficiency and quality of video creation.
The model is open-sourced under the [Apache-2.0](https://github.com/ali-vilab/VACE?tab=Apache-2.0-1-ov-file) license and can be used for personal or commercial purposes.
Here is a comprehensive analysis of its core features and technical highlights:
* Multi-modal input: Supports multiple input forms including text, images, video, masks, and control signals
* Unified architecture: Single model supports multiple tasks with freely combinable functions
* Motion transfer: Generates coherent actions based on reference videos
* Local replacement: Replaces specific areas in videos through masks
* Video extension: Completes actions or extends backgrounds
* Background replacement: Preserves subjects while changing environmental backgrounds
VACE is currently available in two versions: 1.3B and 14B. Compared to the 1.3B version, the 14B version supports 720P resolution output with better image detail and stability.
| Model | 480P | 720P |
| ----------------------------------------------------------- | ---- | ---- |
| [VACE-1.3B](https://huggingface.co/Wan-AI/Wan2.1-VACE-1.3B) | ✅ | ❌ |
| [VACE-14B](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B) | ✅ | ✅ |
Related model weights and code repositories:
* [VACE-1.3B](https://huggingface.co/Wan-AI/Wan2.1-VACE-1.3B)
* [VACE-14B](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B)
* [Github](https://github.com/ali-vilab/VACE)
* [VACE Project Homepage](https://ali-vilab.github.io/VACE-Page/)
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (Desktop version updates can lag behind).
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included)
3. Some nodes failed to import at startup
## Model Download and Loading in Workflows
Since all workflows in this document share the same template, we first cover model download and loading, and then switch between workflows by bypassing different nodes to enable or disable inputs.
Model download information is also embedded in each example workflow, so you can download the models when you download a specific example workflow.
### Model Download
**diffusion\_models**
[wan2.1\_vace\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_vace_14B_fp16.safetensors)
[wan2.1\_vace\_1.3B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_vace_1.3B_fp16.safetensors)
If you have used Wan Video related workflows before, you have already downloaded the following model files.
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
Choose one version from **Text encoders** to download
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
File save location
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── wan2.1_vace_14B_fp16.safetensors
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors # or umt5_xxl_fp16.safetensors
│   └── 📂 vae/
│       └── wan_2.1_vae.safetensors
```
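If you prefer to script the placement step, the sketch below maps each downloaded file to the subfolder shown in the tree above. The mapping is just the layout from this guide, and the helper is hypothetical, not a ComfyUI API; adjust the filenames if you chose a different precision.

```python
from pathlib import Path

# Map each model file to the ComfyUI subfolder shown in the tree above.
# Adjust the entries if you downloaded a different text-encoder precision.
MODEL_DIRS = {
    "wan2.1_vace_14B_fp16.safetensors": "diffusion_models",
    "umt5_xxl_fp8_e4m3fn_scaled.safetensors": "text_encoders",
    "wan_2.1_vae.safetensors": "vae",
}

def target_path(comfyui_root: str, filename: str) -> Path:
    """Return the full destination path for a downloaded model file."""
    subdir = MODEL_DIRS[filename]
    return Path(comfyui_root) / "models" / subdir / filename

print(target_path("ComfyUI", "wan_2.1_vae.safetensors"))
```

Combine this with your downloader of choice, moving each file to the path the helper returns.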
### Model Loading
Since the workflows in this guide use the same models and the same workflow file, differing only in which nodes are bypassed, refer to the image below to make sure the corresponding models are loaded correctly in each workflow.
1. Make sure the `Load Diffusion Model` node has loaded `wan2.1_vace_14B_fp16.safetensors`
2. Make sure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors` or `umt5_xxl_fp16.safetensors`
3. Make sure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
### How to toggle Node Bypass Status
When a node is set to Bypass, data passing through it is output unchanged, as if the node were not there. We typically set nodes to Bypass when we don't need them.
Here are three ways to toggle a node's Bypass status:
1. After selecting the node, click the arrow in the indicator section of the selection toolbox to quickly toggle the node's Bypass status
2. After selecting the node, right-click it and select `Mode` -> `Always` to restore a bypassed node to Always mode
3. After selecting the node, right-click the node and select the `Bypass` option to toggle the Bypass status
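For reference, the Bypass state is also visible in exported workflow JSON: each node carries a `mode` field, where `0` is Always and `4` is Bypass (these numeric values come from the LiteGraph node modes ComfyUI uses; treat them as an implementation detail that may change). A minimal sketch for flipping a node programmatically:

```python
ALWAYS, BYPASS = 0, 4  # LiteGraph node modes as used by ComfyUI (assumed)

def toggle_bypass(workflow: dict, node_id: int) -> None:
    """Flip a node between Always (0) and Bypass (4) in a workflow dict."""
    for node in workflow["nodes"]:
        if node["id"] == node_id:
            node["mode"] = BYPASS if node.get("mode", ALWAYS) == ALWAYS else ALWAYS

wf = {"nodes": [{"id": 7, "type": "LoadImage", "mode": ALWAYS}]}
toggle_bypass(wf, 7)
print(wf["nodes"][0]["mode"])  # 4
```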
## VACE Text-to-Video Workflow
If you cannot load the workflow from the MP4 file, make sure your ComfyUI frontend is at least the version pinned in [requirements.txt](https://github.com/comfyanonymous/ComfyUI/blob/master/requirements.txt).
At the time of writing, the pinned frontend version is 1.19.9.
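A quick way to check whether an installed frontend version meets the pinned one is a numeric tuple comparison (this assumes plain `X.Y.Z` version strings, as above; a naive string comparison would wrongly rank `1.9` above `1.19`):

```python
def at_least(installed: str, required: str) -> bool:
    """Compare dotted version strings numerically, e.g. '1.19.9'."""
    to_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return to_tuple(installed) >= to_tuple(required)

print(at_least("1.19.10", "1.19.9"))  # True
print(at_least("1.9.9", "1.19.9"))    # False
```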
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow
### 2. Complete the Workflow Step by Step
Please follow the numbered steps in the image to ensure smooth workflow execution
1. Enter positive prompts in the `CLIP Text Encode (Positive Prompt)` node
2. Enter negative prompts in the `CLIP Text Encode (Negative Prompt)` node
3. Set the image dimensions (640x640 resolution recommended for first run) and frame count (video duration) in `WanVaceToVideo`
4. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
5. Once generated, the video will automatically save to `ComfyUI/output/video` directory (subfolder location depends on `save video` node settings)
During testing with a 4090 GPU:
* 720x1280 resolution, generating 81 frames takes about 40 minutes
* 640x640 resolution, generating 49 frames takes about 7 minutes
However, 720P video quality is better.
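When choosing the frame count, note that duration scales linearly with frames. The conversion below assumes the 16 fps output that Wan 2.1 models commonly target; confirm the frame rate against your `save video` node settings.

```python
FPS = 16  # assumed output frame rate; check your Save Video node

def duration_seconds(frames: int, fps: int = FPS) -> float:
    """Convert the `length` (frame count) input into clip duration."""
    return frames / fps

print(duration_seconds(81))  # 5.0625 -> roughly a 5-second clip
print(duration_seconds(49))  # 3.0625
```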
## VACE Image-to-Video Workflow
You can continue using the workflow above: unbypass the `Load image` node in **Load reference image** and input your image. Alternatively, use the file below, in which the corresponding parameters are already set up.
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow
Please download the image below as input

### 2. Complete the Workflow Step by Step
Please follow the numbered steps in the image to ensure smooth workflow execution
1. Input the corresponding image in the `Load image` node
2. You can modify and edit prompts like in the text-to-video workflow
3. Set the image dimensions (640x640 resolution recommended for first run) and frame count (video duration) in `WanVaceToVideo`
4. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
5. Once generated, the video will automatically save to `ComfyUI/output/video` directory (subfolder location depends on `save video` node settings)
You may want to use a node that reads the input image's dimensions to set the resolution, but because the width and height inputs of the relevant nodes have step-size requirements, you will get errors if your image dimensions are not divisible by 16.
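If you do compute the resolution from the input image, you can snap the dimensions to multiples of 16 first to avoid those errors. This hypothetical helper simply encodes the divisible-by-16 rule mentioned above:

```python
def snap16(value: int) -> int:
    """Round a dimension to the nearest multiple of 16 (minimum 16)."""
    return max(16, round(value / 16) * 16)

width, height = 900, 563  # e.g. dimensions read from an input image
print(snap16(width), snap16(height))  # 896 560
```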
### 3. Additional Workflow Notes
VACE also supports inputting multiple reference images in a single image to generate corresponding videos. See related examples on the VACE project [page](https://ali-vilab.github.io/VACE-Page/).
## VACE Video-to-Video Workflow
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow
We will use the following materials as input:
1. Input image for reference

2. The video below has been preprocessed and will be used to control video generation
3. The video below is the original video. You can download these materials and use preprocessing nodes like [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) to preprocess the images
### 2. Complete the Workflow Step by Step
Please follow the numbered steps in the image to ensure smooth workflow execution
1. Input the reference image in the `Load Image` node under `Load reference image`
2. Input the control video in the `Load Video` node under `Load control video`. Since the provided video is preprocessed, no additional processing is needed
3. If you need to preprocess the original video yourself, you can modify the `Image preprocessing` group or use `comfyui_controlnet_aux` nodes to complete the preprocessing
4. Modify prompts
5. Set the image dimensions (640x640 resolution recommended for first run) and frame count (video duration) in `WanVaceToVideo`
6. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
7. Once generated, the video will automatically save to `ComfyUI/output/video` directory (subfolder location depends on `save video` node settings)
## VACE Video Outpainting Workflow
\[To be updated]
## VACE First-Last Frame Video Generation
\[To be updated]
To ensure that the first and last frames take effect, the video `length` setting must satisfy that `length - 1` is divisible by 4.
The corresponding `Batch_size` must be set to `length - 2`.
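The two constraints above can be checked before running the workflow; the sketch below simply encodes the rules as stated (the `Batch_size = length - 2` relation is taken directly from the note above):

```python
def valid_length(length: int) -> bool:
    """First/last-frame control requires (length - 1) divisible by 4."""
    return length >= 1 and (length - 1) % 4 == 0

def batch_size_for(length: int) -> int:
    """Per the note above, Batch_size must equal length - 2."""
    assert valid_length(length), f"invalid length {length}"
    return length - 2

print(valid_length(81), batch_size_for(81))  # True 79
```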
## Related Node Documentation
Please refer to the documentation below to learn about related nodes
WanVaceToVideo Node Documentation
ComfyUI TrimVideoLatent Node Documentation
# Wan ATI ComfyUI Native Workflow Tutorial
Source: https://docs.comfy.org/tutorials/video/wan/wan-ati
Using trajectory control for video generation.
**ATI (Any Trajectory Instruction)** is a controllable video generation framework proposed by the ByteDance team. ATI is implemented based on Wan2.1 and supports unified control of objects, local regions, and camera motion in videos through arbitrary trajectory instructions.
Project URL: [https://github.com/bytedance/ATI](https://github.com/bytedance/ATI)
## Key Features
* **Unified Motion Control**: Supports trajectory control for multiple motion types including objects, local regions, and camera movements.
* **Interactive Trajectory Editor**: Visual tool that allows users to freely draw and edit motion trajectories on images.
* **Wan2.1 Compatible**: Based on the official Wan2.1 implementation, compatible with environments and model structures.
* **Rich Visualization Tools**: Supports visualization of input trajectories, output videos, and trajectory overlays.
## WAN ATI Trajectory Control Workflow Example
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (the Desktop version can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not running the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not include the latest changes
3. Some nodes failed to import at startup
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow
We will use the following image as input:

### 2. Model Download
If you haven't successfully downloaded the model files from the workflow, you can try downloading them manually using the links below
**Diffusion Model**
* [Wan2\_1-I2V-ATI-14B\_fp8\_e4m3fn.safetensors](https://huggingface.co/Kijai/WanVideo_comfy/resolve/main/Wan2_1-I2V-ATI-14B_fp8_e4m3fn.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**Text encoders** (choose one of the following models)
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**clip\_vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors)
File save location
```
ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── Wan2_1-I2V-ATI-14B_fp8_e4m3fn.safetensors
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors # or other version
│   ├── 📂 clip_vision/
│   │   └── clip_vision_h.safetensors
│   └── 📂 vae/
│       └── wan_2.1_vae.safetensors
```
### 3. Complete the workflow execution step by step
Please follow the numbered steps in the image to ensure smooth execution of the corresponding workflow
1. Ensure the `Load Diffusion Model` node has loaded the `Wan2_1-I2V-ATI-14B_fp8_e4m3fn.safetensors` model
2. Ensure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Ensure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. Ensure the `Load CLIP Vision` node has loaded the `clip_vision_h.safetensors` model
5. Upload the provided input image in the `Load Image` node
6. Trajectory editing: ComfyUI does not yet include a trajectory editor, so you can use the following tool to create trajectories
* [Online Trajectory Editing Tool](https://comfyui-wiki.github.io/Trajectory-Annotation-Tool/)
7. If you need to modify the prompts (positive and negative), please make changes in the `CLIP Text Encoder` node numbered `5`
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
# ComfyUI Wan2.1 FLF2V Native Example
Source: https://docs.comfy.org/tutorials/video/wan/wan-flf
This guide explains how to complete Wan2.1 FLF2V video generation examples in ComfyUI
Wan FLF2V (First-Last Frame Video Generation) is an open-source video generation model developed by the Alibaba Tongyi Wanxiang team. Its open-source license is [Apache 2.0](https://github.com/Wan-Video/Wan2.1?tab=Apache-2.0-1-ov-file).
Users only need to provide two images as the starting and ending frames, and the model automatically generates intermediate transition frames, outputting a logically coherent and naturally flowing 720p high-definition video.
**Core Technical Highlights**
1. **Precise First-Last Frame Control**: The matching rate of first and last frames reaches 98%, defining video boundaries through starting and ending scenes, intelligently filling intermediate dynamic changes to achieve scene transitions and object morphing effects.
2. **Stable and Smooth Video Generation**: Using CLIP semantic features and cross-attention mechanisms, the video jitter rate is reduced by 37% compared to similar models, ensuring natural and smooth transitions.
3. **Multi-functional Creative Capabilities**: Supports dynamic embedding of Chinese and English subtitles, generation of anime/realistic/fantasy and other styles, adapting to different creative needs.
4. **720p HD Output**: Directly generates 1280×720 resolution videos without post-processing, suitable for social media and commercial applications.
5. **Open-source Ecosystem Support**: Model weights, code, and training framework are fully open-sourced, supporting deployment on mainstream AI platforms.
**Technical Principles and Architecture**
1. **DiT Architecture**: Based on diffusion models and Diffusion Transformer architecture, combined with Full Attention mechanism to optimize spatiotemporal dependency modeling, ensuring video coherence.
2. **3D Causal Variational Encoder**: Wan-VAE technology compresses HD frames to 1/128 size while retaining subtle dynamic details, significantly reducing memory requirements.
3. **Three-stage Training Strategy**: Starting from 480P resolution pre-training, gradually upgrading to 720P, balancing generation quality and computational efficiency through phased optimization.
**Related Links**
* **GitHub Repository**: [GitHub](https://github.com/Wan-Video/Wan2.1)
* **Hugging Face Model Page**: [Hugging Face](https://huggingface.co/Wan-AI/Wan2.1-FLF2V-14B-720P)
* **ModelScope Community**: [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-FLF2V-14B-720P)
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (the Desktop version can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not running the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not include the latest changes
3. Some nodes failed to import at startup
## Wan2.1 FLF2V 720P ComfyUI Native Workflow Example
### 1. Download Workflow Files and Related Input Files
Since this model is trained on high-resolution images, smaller sizes may not yield good results. The example uses a size of 720 \* 1280, which can be hard to run smoothly with lower VRAM and takes a long time to generate.
If needed, adjust the generation size for testing, but note that a small size may not produce good output with this model.
Please download the WebP file below, and drag it into ComfyUI to load the corresponding workflow. The workflow has embedded the corresponding model download file information.

Please download the two images below, which we will use as the starting and ending frames of the video


### 2. Manual Model Installation
If the models were not downloaded automatically with the workflow, you can install them manually.
All models involved in this guide can be found [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files).
**diffusion\_models** Choose one version based on your hardware
* FP16:[wan2.1\_flf2v\_720p\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_flf2v_720p_14B_fp16.safetensors?download=true)
* FP8:[wan2.1\_flf2v\_720p\_14B\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors)
If you have previously tried Wan Video related workflows, you may already have the following files.
Choose one version from **Text encoders** to download:
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File Storage Location
```
ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── wan2.1_flf2v_720p_14B_fp16.safetensors # or FP8 version
│   ├── text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors # or your chosen version
│   ├── vae/
│   │   └── wan_2.1_vae.safetensors
│   └── clip_vision/
│       └── clip_vision_h.safetensors
```
### 3. Complete Workflow Execution Step by Step
1. Ensure the `Load Diffusion Model` node has loaded `wan2.1_flf2v_720p_14B_fp16.safetensors` or `wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors`
2. Ensure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Ensure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Ensure the `Load CLIP Vision` node has loaded `clip_vision_h.safetensors`
5. Upload the starting frame to the `Start_image` node
6. Upload the ending frame to the `End_image` node
7. (Optional) Modify the positive and negative prompts, both Chinese and English are supported
8. (**Important**) `WanFirstLastFrameToVideo` defaults to 720×1280 because this is a 720P model; a small size will not yield good output. Use a size around 720×1280 for good generation.
9. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
# ComfyUI Wan2.1 Video Examples
Source: https://docs.comfy.org/tutorials/video/wan/wan-video
This guide demonstrates how to generate videos with first and last frames using Wan2.1 Video in ComfyUI
Wan2.1 Video series is a video generation model open-sourced by Alibaba in February 2025 under the [Apache 2.0 license](https://github.com/Wan-Video/Wan2.1?tab=Apache-2.0-1-ov-file).
It offers two versions:
* 14B (14 billion parameters)
* 1.3B (1.3 billion parameters)
Covering multiple tasks including text-to-video (T2V) and image-to-video (I2V).
The model not only outperforms existing open-source models in performance but more importantly, its lightweight version requires only 8GB of VRAM to run, significantly lowering the barrier to entry.
* [Wan2.1 Code Repository](https://github.com/Wan-Video/Wan2.1)
* [Wan2.1 Model Repository](https://huggingface.co/Wan-AI)
## Wan2.1 ComfyUI Native Workflow Examples
Please update ComfyUI to the latest version before starting the examples to make sure you have native Wan Video support.
## Model Installation
All models mentioned in this guide can be found [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files). Below are the common models you'll need for the examples in this guide, which you can download in advance:
Choose one version from **Text encoders** to download:
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File storage locations:
```
ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── ... # downloaded in each workflow section below
│   ├── text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── vae/
│   │   └── wan_2.1_vae.safetensors
│   └── clip_vision/
│       └── clip_vision_h.safetensors
```
For diffusion models, we'll use the fp16 precision models in this guide because we've found that they perform better than the bf16 versions. If you need other precision versions, please visit [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models) to download them.
## Wan2.1 Text-to-Video Workflow
Before starting the workflow, please download [wan2.1\_t2v\_1.3B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_fp16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models/` directory.
> If you need other t2v precision versions, please visit [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models) to download them.
### 1. Workflow File Download
Download the file below and drag it into ComfyUI to load the corresponding workflow:

### 2. Complete the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded the `wan2.1_t2v_1.3B_fp16.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. (Optional) You can modify the video dimensions in the `EmptyHunyuanLatentVideo` node if needed
5. (Optional) If you need to modify the prompts (positive and negative), make changes in the `CLIP Text Encoder` node at number `5`
6. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the video generation
## Wan2.1 Image-to-Video Workflow
**Since Wan Video separates the 480P and 720P models**, we'll need to provide examples for both resolutions in this guide. In addition to using different models, they also have slight parameter differences.
### 480P Version
#### 1. Workflow and Input Image
Download the image below and drag it into ComfyUI to load the corresponding workflow:

We'll use the following image as input:

#### 2. Model Download
Please download [wan2.1\_i2v\_480p\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_i2v_480p_14B_fp16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models/` directory.
#### 3. Complete the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded the `wan2.1_i2v_480p_14B_fp16.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. Make sure the `Load CLIP Vision` node has loaded the `clip_vision_h.safetensors` model
5. Upload the provided input image in the `Load Image` node
6. (Optional) Enter the video description content you want to generate in the `CLIP Text Encoder` node
7. (Optional) You can modify the video dimensions in the `WanImageToVideo` node if needed
8. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the video generation
### 720P Version
#### 1. Workflow and Input Image
Download the image below and drag it into ComfyUI to load the corresponding workflow:

We'll use the following image as input:

#### 2. Model Download
Please download [wan2.1\_i2v\_720p\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models/` directory.
#### 3. Complete the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded the `wan2.1_i2v_720p_14B_fp16.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. Make sure the `Load CLIP Vision` node has loaded the `clip_vision_h.safetensors` model
5. Upload the provided input image in the `Load Image` node
6. (Optional) Enter the video description content you want to generate in the `CLIP Text Encoder` node
7. (Optional) You can modify the video dimensions in the `WanImageToVideo` node if needed
8. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute the video generation
# ComfyUI Wan2.2 Fun Camera Control: Video Generation Workflow Example
Source: https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-camera
This article demonstrates how to use camera control for video generation with Wan2.2 Fun Camera Control in ComfyUI.
**Wan2.2-Fun-Camera-Control** is a next-generation video generation and camera control model developed by Alibaba PAI. By introducing innovative Camera Control Codes and combining deep learning with multimodal conditional inputs, it generates high-quality videos that adhere to predefined camera motion conditions. The model is released under the **Apache 2.0 license**, allowing for commercial use.
**Key Features**:
* **Camera Motion Control**: Supports various camera motion modes, including **Pan Up**, **Pan Down**, **Pan Left**, **Pan Right**, **Zoom In**, **Zoom Out**, and combinations thereof.
* **High-Quality Video Generation**: Based on the Wan2.2 architecture, it outputs cinematic-quality videos.
Here are the relevant model weights and code repository:
* [🤗Wan2.2-Fun-A14B-Control-Camera](https://huggingface.co/alibaba-pai/Wan2.2-Fun-A14B-Control-Camera)
* Code Repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## Wan2.2 Fun Camera Control: Video Generation Workflow Example
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (the Desktop version can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not running the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not include the latest changes
3. Some nodes failed to import at startup
The workflow provided includes two versions:
1. Using the [Wan2.2-Lightning](https://huggingface.co/lightx2v/Wan2.2-Lightning) 4-step LoRA via lightx2v: This may result in reduced video dynamics but offers faster generation.
2. The fp8\_scaled version without the acceleration LoRA.
Below are the timing results tested on an RTX4090D 24GB GPU for 640\*640 resolution and 81-frame length:
| Model Type | Resolution | VRAM Usage | First Generation Time | Second Generation Time |
| ------------------------- | ---------- | ---------- | --------------------- | ---------------------- |
| fp8\_scaled | 640×640 | 84% | ≈ 536 seconds | ≈ 513 seconds |
| fp8\_scaled + 4-step LoRA | 640×640 | 89% | ≈ 108 seconds | ≈ 71 seconds |
While the 4-step LoRA improves initial user experience, it may reduce video dynamism. By default, the accelerated LoRA version is enabled. To switch workflows, select the nodes and press **Ctrl+B**.
### 1. Workflow and Asset Download
Download the video or JSON file below and drag it into ComfyUI to load the corresponding workflow. The workflow will prompt you to download the models.
Download JSON Workflow
Please download the image below, which we will use as input.

### 2. Model Links
The following models can be found in [Wan\_2.2\_ComfyUI\_Repackaged](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged):
**Diffusion Model**
* [wan2.2\_fun\_camera\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_fun\_camera\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_camera_low_noise_14B_fp8_scaled.safetensors)
**Wan2.2-Lightning LoRA (Optional, for acceleration)**
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_high\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors)
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_low\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
File save location
```
ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   ├── wan2.2_fun_camera_low_noise_14B_fp8_scaled.safetensors
│   │   └── wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors
│   ├── 📂 loras/
│   │   ├── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│   │   └── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│   ├── 📂 text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   └── 📂 vae/
│       └── wan_2.1_vae.safetensors
```
### 3. Complete the Workflow Step-by-Step
This workflow uses LoRA. Make sure the diffusion model and LoRA match: the high-noise model must be paired with the high-noise LoRA, and the low-noise model with the low-noise LoRA.
1. **High noise** model and **LoRA** loading
* Ensure the `Load Diffusion Model` node loads `wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors`
* Ensure the `LoraLoaderModelOnly` node loads `wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors`
2. **Low noise** model and **LoRA** loading
* Ensure the `Load Diffusion Model` node loads `wan2.2_fun_camera_low_noise_14B_fp8_scaled.safetensors`
* Ensure the `LoraLoaderModelOnly` node loads `wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors`
3. Ensure the `Load CLIP` node loads `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
4. Ensure the `Load VAE` node loads `wan_2.1_vae.safetensors`
5. Upload the starting frame in the `Load Image` node
6. Modify the Prompt (both Chinese and English are acceptable)
7. Set camera control parameters in the `WanCameraEmbedding` node:
* **Camera Motion**: Select the camera motion type (Zoom In, Zoom Out, Pan Up, Pan Down, Pan Left, Pan Right, Static, etc.)
* **Width/Height**: Set video resolution
* **Length**: Set the number of video frames (default is 81 frames)
* **Speed**: Set video speed (default is 1.0)
8. Click the `Run` button or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
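To double-check the high/low-noise pairing rule from the note above, you can compare noise levels parsed from the filenames listed in this guide. This is only a string check on the names used here, not a ComfyUI API:

```python
def noise_level(filename: str) -> str:
    """Extract 'high' or 'low' from a Wan2.2 model/LoRA filename."""
    for level in ("high", "low"):
        if f"{level}_noise" in filename:
            return level
    raise ValueError(f"no noise level in {filename!r}")

model = "wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors"
lora = "wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors"
print(noise_level(model) == noise_level(lora))  # True
```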
# ComfyUI Wan2.2 Fun Control Video Generation Example
Source: https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-control
This article introduces how to use ComfyUI to complete the Wan2.2 Fun Control video generation using control videos
**Wan2.2-Fun-Control** is a next-generation video generation and control model launched by Alibaba PAI team. Through innovative Control Codes mechanism combined with deep learning and multi-modal conditional inputs, it can generate high-quality videos that comply with preset control conditions. The model is released under the **Apache 2.0 license** and supports commercial use.
**Key Features**:
* **Multi-modal Control**: Supports multiple control conditions including **Canny (line art)**, **Depth**, **OpenPose (human pose)**, **MLSD (geometric edges)**, and **trajectory control**
* **High-Quality Video Generation**: Based on the Wan2.2 architecture, outputs film-level quality videos
* **Multi-language Support**: Supports multi-language prompts including Chinese and English
Below are the relevant model weights and code repositories:
* [🤗Wan2.2-Fun-A14B-Control](https://huggingface.co/alibaba-pai/Wan2.2-Fun-A14B-Control)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## ComfyOrg Wan2.2 Fun InP & Control Youtube Live Stream Replay
VIDEO
## Wan2.2 Fun Control Video Generation Workflow Example
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated (the Desktop version can lag behind the latest changes).
If nodes are missing when loading a workflow, possible reasons:
1. You are not running the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version, which may not include the latest changes
3. Some nodes failed to import at startup
This workflow provides two versions:
1. A version using [Wan2.2-Lightning](https://huggingface.co/lightx2v/Wan2.2-Lightning) 4-step LoRA from lightx2v: may cause some loss in video dynamics but offers faster speed
2. A fp8\_scaled version without acceleration LoRA
Below are test results using an RTX 4090D with 24GB VRAM at 640×640 resolution and 81 frames:
| Model Type | VRAM Usage | First Generation Time | Second Generation Time |
| ------------------------- | ---------- | --------------------- | ---------------------- |
| fp8\_scaled | 83% | ≈ 524s | ≈ 520s |
| fp8\_scaled + 4-step LoRA | 89% | ≈ 138s | ≈ 79s |
The 4-step LoRA may cause some loss in video dynamics, but it provides a much better experience for first-time users, so the accelerated LoRA version is enabled by default. If you want to use the other version instead, select it and press **Ctrl+B** to activate it.
### 1. Download Workflow and Materials
Download the video or JSON file below and drag it into ComfyUI to load the workflow
Download JSON Workflow
Please download the following images and videos as input materials.

> We use a preprocessed video here.
### 2. Models
You can find the models below at [Wan\_2.2\_ComfyUI\_Repackaged](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged)
**Diffusion Model**
* [wan2.2\_fun\_control\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_fun\_control\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors)
**Wan2.2-Lightning LoRA (Optional, for acceleration)**
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_high\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors)
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_low\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors
│ ├───📂 loras/
│ │ ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│ │ └─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Workflow Guide
This workflow uses LoRA. Please make sure each diffusion model is paired with its matching LoRA: the high noise model with the high noise LoRA, and the low noise model with the low noise LoRA.
1. **High noise** model and **LoRA** loading
* Ensure the `Load Diffusion Model` node loads the `wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors` model
* Ensure the `LoraLoaderModelOnly` node loads the `wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors`
2. **Low noise** model and **LoRA** loading
* Ensure the `Load Diffusion Model` node loads the `wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors` model
* Ensure the `LoraLoaderModelOnly` node loads the `wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors`
3. Ensure the `Load CLIP` node loads the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
4. Ensure the `Load VAE` node loads the `wan_2.1_vae.safetensors` model
5. Upload the start frame in the `Load Image` node
6. In the second `Load video` node, load the pose control video. The provided video has been preprocessed and can be used directly
7. Since we provide a preprocessed pose video, the corresponding video image preprocessing node needs to be disabled. You can select it and use `Ctrl + B` to disable it
8. Modify the Prompt - you can use both Chinese and English
9. In `Wan22FunControlToVideo`, modify the video dimensions. The default is set to 640×640 resolution to avoid excessive processing time for users with low VRAM
10. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
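The model/LoRA pairing rule in steps 1-2 can be sanity-checked with a tiny helper (a hypothetical function, not part of ComfyUI; the filenames match the downloads listed above):

```python
def noise_level(filename: str):
    """Extract the noise level ('high' or 'low') encoded in a Wan2.2 filename."""
    if "high_noise" in filename:
        return "high"
    if "low_noise" in filename:
        return "low"
    return None

def lora_matches_model(model: str, lora: str) -> bool:
    """True when the LoRA and diffusion model target the same noise level."""
    level = noise_level(model)
    return level is not None and level == noise_level(lora)

print(lora_matches_model(
    "wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors",
    "wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors",
))  # True: both are the high-noise variants
```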
### Additional Notes
Since ComfyUI's built-in nodes only include a Canny preprocessor, you can use custom node packs such as [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for other types of image preprocessing
# ComfyUI Wan2.2 Fun Inp Start-End Frame Video Generation Example
Source: https://docs.comfy.org/tutorials/video/wan/wan2-2-fun-inp
This article explains how to generate start-end frame videos with Wan2.2 Fun Inp in ComfyUI
**Wan2.2-Fun-Inp** is a start-end frame controlled video generation model launched by Alibaba PAI team. It supports inputting **start and end frame images** to generate intermediate transition videos, providing creators with greater creative control. The model is released under the **Apache 2.0 license** and supports commercial use.
**Key Features**:
* **Start-End Frame Control**: Supports inputting start and end frame images to generate intermediate transition videos, enhancing video coherence and creative freedom
* **High-Quality Video Generation**: Based on the Wan2.2 architecture, outputs film-level quality videos
* **Multi-Resolution Support**: Supports generating videos at 512×512, 768×768, 1024×1024 and other resolutions to suit different scenarios
**Model Version**:
* **14B High-Performance Version**: Model size exceeds 32GB, with better results but requires high VRAM
Below are the relevant model weights and code repositories:
* [🤗Wan2.2-Fun-Inp-14B](https://huggingface.co/alibaba-pai/Wan2.2-Fun-A14B-InP)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## ComfyOrg Wan2.2 Fun InP & Control Youtube Live Stream Replay
VIDEO
## Wan2.2 Fun Inp Start-End Frame Video Generation Workflow Example
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated. (Desktop version updates may lag behind.)
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
This workflow provides two versions:
1. A version using [Wan2.2-Lightning](https://huggingface.co/lightx2v/Wan2.2-Lightning) 4-step LoRA from lightx2v for accelerated video generation
2. A fp8\_scaled version without acceleration LoRA
Below are test results using an RTX 4090D with 24GB VRAM at 640×640 resolution and 81 frames:
| Model Type | VRAM Usage | First Generation Time | Second Generation Time |
| ------------------------- | ---------- | --------------------- | ---------------------- |
| fp8\_scaled | 83% | ≈ 524s | ≈ 520s |
| fp8\_scaled + 4-step LoRA | 89% | ≈ 138s | ≈ 79s |
The LoRA acceleration is significant, though it reduces video dynamics, so the provided workflows enable the accelerated LoRA version by default. If you want to use the other version instead, select it and press **Ctrl+B** to activate it.
### 1. Download Workflow File
Please update your ComfyUI to the latest version, and find "**Wan2.2 Fun Inp**" under the menu `Workflow` -> `Browse Templates` -> `Video` to load the workflow.
Or, after updating ComfyUI to the latest version, download the workflow below and drag it into ComfyUI to load.
Download JSON Workflow
Use the following materials as the start and end frames


### 2. Models
**Diffusion Model**
* [wan2.2\_fun\_inpaint\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_fun\_inpaint\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors)
**Lightning LoRA (Optional, for acceleration)**
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_high\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors)
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_low\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors
│ ├───📂 loras/
│ │ ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│ │ └─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Workflow Guide
This workflow uses LoRA. Please make sure the corresponding Diffusion model and LoRA are matched.
1. **High noise** model and **LoRA** loading
* Ensure the `Load Diffusion Model` node loads the `wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors` model
* Ensure the `LoraLoaderModelOnly` node loads the `wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors`
2. **Low noise** model and **LoRA** loading
* Ensure the `Load Diffusion Model` node loads the `wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors` model
* Ensure the `LoraLoaderModelOnly` node loads the `wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors`
3. Ensure the `Load CLIP` node loads the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
4. Ensure the `Load VAE` node loads the `wan_2.1_vae.safetensors` model
5. Upload the start and end frame images as materials
6. Enter your prompt in the Prompt group
7. Adjust the size and video length in the `WanFunInpaintToVideo` node
* Adjust the `width` and `height` parameters. The default is `640`. We set a smaller size, but you can modify it as needed.
* Adjust the `length`, which is the total number of frames. The current workflow fps is 16. For example, if you want to generate a 5-second video, you should set it to 5\*16 = 80.
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation
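The `length` calculation in step 7 can be sketched as a small helper (a hypothetical function; 16 fps as stated above):

```python
def video_length(seconds: float, fps: int = 16) -> int:
    """Total number of frames (`length`) for a target duration at a given fps."""
    return int(seconds * fps)

print(video_length(5))  # 5 s x 16 fps = 80 frames
```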
# Wan2.2-S2V Audio-Driven Video Generation ComfyUI Native Workflow Example
Source: https://docs.comfy.org/tutorials/video/wan/wan2-2-s2v
This is a native workflow example for Wan2.2-S2V audio-driven video generation in ComfyUI.
We're excited to announce that Wan2.2-S2V, the advanced audio-driven video generation model, is now natively supported in ComfyUI! This powerful AI model can transform static images and audio inputs into dynamic video content, supporting dialogue, singing, performance, and various creative content needs.
**Model Highlights**
* **Audio-Driven Video Generation**: Transforms static images and audio into synchronized videos
* **Cinematic-Grade Quality**: Generates film-quality videos with natural expressions and movements
* **Minute-Level Generation**: Supports long-form video creation
* **Multi-Format Support**: Works with full-body and half-body characters
* **Enhanced Motion Control**: Generates actions and environments from text instructions
Wan2.2 S2V Code: [GitHub](https://github.com/aigc-apps/VideoX-Fun)
Wan2.2 S2V Model: [Hugging Face](https://huggingface.co/Wan-AI/Wan2.2-S2V-14B)
## Wan2.2 S2V ComfyUI Native Workflow
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated. (Desktop version updates may lag behind.)
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
### 1. Download Workflow File
Download the following workflow file and drag it into ComfyUI to load the workflow.
Download JSON Workflow
Download the following image and audio as input:

Download Input Audio
### 2. Model Links
You can find the models in [our repo](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged)
**diffusion\_models**
* [wan2.2\_s2v\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_fp8_scaled.safetensors)
* [wan2.2\_s2v\_14B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_bf16.safetensors)
**audio\_encoders**
* [wav2vec2\_large\_english\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/audio_encoders/wav2vec2_large_english_fp16.safetensors)
**vae**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**text\_encoders**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_s2v_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_s2v_14B_bf16.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ ├───📂 audio_encoders/ # Create one if you can't find this folder
│ │ └─── wav2vec2_large_english_fp16.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Workflow Instructions
#### 3.1 About Lightning LoRA
#### 3.2 About fp8\_scaled and bf16 Models
You can find both models [here](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/diffusion_models):
* [wan2.2\_s2v\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_fp8_scaled.safetensors)
* [wan2.2\_s2v\_14B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_bf16.safetensors)
This template uses `wan2.2_s2v_14B_fp8_scaled.safetensors`, which requires less VRAM. But you can try `wan2.2_s2v_14B_bf16.safetensors` to reduce quality degradation.
#### 3.3 Step-by-Step Operation Instructions
**Step 1: Load Models**
1. **Load Diffusion Model**: Load `wan2.2_s2v_14B_fp8_scaled.safetensors` or `wan2.2_s2v_14B_bf16.safetensors`
* The provided workflow uses `wan2.2_s2v_14B_fp8_scaled.safetensors`, which requires less VRAM
* But you can try `wan2.2_s2v_14B_bf16.safetensors` to reduce quality degradation
2. **Load CLIP**: Load `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. **Load VAE**: Load `wan_2.1_vae.safetensors`
4. **AudioEncoderLoader**: Load `wav2vec2_large_english_fp16.safetensors`
5. **LoraLoaderModelOnly**: Load `wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise.safetensors` (Lightning LoRA)
* We tested all Wan2.2 Lightning LoRAs. Since none of them were trained specifically for Wan2.2 S2V, many key values don't match, but we added this one because it significantly reduces generation time. We will continue to optimize this template
* Using it causes a significant loss of dynamics and quality
* If the output quality is too poor, try the original 20-step workflow
6. **LoadAudio**: Upload our provided audio file or your own audio
7. **Load Image**: Upload reference image
8. **Batch sizes**: Set according to the number of Video S2V Extend subgraph nodes you add
* Each Video S2V Extend subgraph adds 77 frames to the final output
* For example: if you added 2 Video S2V Extend subgraphs, the batch size should be 3, i.e. the total number of sampling iterations
* **Chunk Length**: Keep the default value of 77
9. **Sampler Settings**: Choose different settings based on whether you use Lightning LoRA
* With 4-step Lightning LoRA: steps: 4, cfg: 1.0
* Without 4-step Lightning LoRA: steps: 20, cfg: 6.0
10. **Size Settings**: Set the output video dimensions
11. **Video S2V Extend**: video extension subgraph nodes. Since the default number of frames per sampling pass is 77 and this is a 16fps model, each extension generates 77 / 16 = 4.8125 seconds of video
* Some calculation is needed to match the number of extension subgraphs to the input audio length. For example: if the input audio is 14s, the total frames needed are 14 × 16 = 224; each extension is 77 frames, so 224 / 77 ≈ 2.9, rounded up to 3 extension subgraph nodes
12. Use Ctrl-Enter or click the Run button to execute the workflow
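The frame math in steps 8 and 11 can be sketched as follows (a hypothetical helper that follows the rules of thumb above: 16 fps, 77 frames per sampling pass, extension count rounded up, batch size = extension count + 1):

```python
import math

FPS = 16           # Wan2.2 S2V is a 16fps model
CHUNK_FRAMES = 77  # frames produced per sampling pass

def s2v_plan(audio_seconds: float) -> dict:
    """Estimate extension-subgraph count and batch size for an audio clip."""
    total_frames = math.ceil(audio_seconds * FPS)
    extend_nodes = math.ceil(total_frames / CHUNK_FRAMES)  # rounded up, per step 11
    return {
        "total_frames": total_frames,
        "extend_subgraphs": extend_nodes,
        "batch_size": extend_nodes + 1,  # per step 8
    }

print(s2v_plan(14))  # 14 s of audio -> 224 frames -> 3 extension subgraphs
```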
# Wan2.2 Video Generation ComfyUI Official Native Workflow Example
Source: https://docs.comfy.org/tutorials/video/wan/wan2_2
Official usage guide for Alibaba Cloud Tongyi Wanxiang 2.2 video generation model in ComfyUI
VIDEO
Wan 2.2 is a new-generation multimodal generative model launched by WAN AI. It adopts an innovative MoE (Mixture of Experts) architecture consisting of high-noise and low-noise expert models, which are assigned to different denoising timesteps to generate higher-quality video content.
Wan 2.2 has three core features: cinematic-level aesthetic control, which deeply integrates professional film-industry aesthetic standards and supports multi-dimensional visual control over lighting, color, and composition; large-scale complex motion, which smoothly reproduces a wide range of complex movements with improved controllability; and precise semantic compliance, which excels at complex scenes and multi-object generation, better realizing the user's creative intent.
The model supports multiple generation modes such as text-to-video and image-to-video, suitable for content creation, artistic creation, education and training, and other application scenarios.
[Wan2.2 Prompt Guide](https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y)
## Model Highlights
* **Cinematic-level Aesthetic Control**: Professional camera language, supports multi-dimensional visual control such as lighting, color, and composition
* **Large-scale Complex Motion**: Smoothly restores various complex motions, enhances motion controllability and naturalness
* **Precise Semantic Compliance**: Complex scene understanding, multi-object generation, better restoring creative intentions
* **Efficient Compression Technology**: 5B version with high compression ratio VAE, memory optimization, supports mixed training
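The high-noise/low-noise expert split described above can be sketched as a simple router (illustrative only; in the ComfyUI workflows the split is configured on the two sampler nodes, and the 50% boundary here is an assumption):

```python
def pick_expert(step: int, total_steps: int, boundary: float = 0.5) -> str:
    """Route a denoising step to the matching expert model.

    Early, high-noise steps go to the high-noise expert; later,
    low-noise steps go to the low-noise expert.
    """
    return "high_noise" if step / total_steps < boundary else "low_noise"

schedule = [pick_expert(s, 20) for s in range(20)]
print(schedule.count("high_noise"), schedule.count("low_noise"))  # 10 10
```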
## Wan2.2 Open Source Model Versions
The Wan2.2 series models are based on the Apache 2.0 open source license and support commercial use. The Apache 2.0 license allows you to freely use, modify, and distribute these models, including for commercial purposes, as long as you retain the original copyright notice and license text.
| Model Type | Model Name | Parameters | Main Function | Model Repository |
| -------------- | --------------- | ---------- | ---------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------- |
| Hybrid Model | Wan2.2-TI2V-5B | 5B | Hybrid version supporting both text-to-video and image-to-video, a single model meets two core task requirements | 🤗 [Wan2.2-TI2V-5B](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B) |
| Image-to-Video | Wan2.2-I2V-A14B | 14B | Converts static images into dynamic videos, maintaining content consistency and smooth dynamic process | 🤗 [Wan2.2-I2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B) |
| Text-to-Video | Wan2.2-T2V-A14B | 14B | Generates high-quality videos from text descriptions, with cinematic-level aesthetic control and precise semantic compliance | 🤗 [Wan2.2-T2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B) |
## ComfyOrg Wan2.2 Live Streams
For ComfyUI Wan2.2 usage, we have conducted live streams, which you can view to learn how to use them.
VIDEO
VIDEO
VIDEO
This tutorial will use the [🤗 Comfy-Org/Wan\_2.2\_ComfyUI\_Repackaged](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged) version.
Make sure your ComfyUI is updated.
* [Download ComfyUI](https://www.comfy.org/download)
* [Update Guide](/installation/update_comfyui)
Workflows in this guide can be found in the [Workflow Templates](/interface/features/template).
If you can't find them in the templates, your ComfyUI may be outdated. (Desktop version updates may lag behind.)
If nodes are missing when loading a workflow, possible reasons:
1. You are not using the latest (nightly) ComfyUI version
2. You are using the Stable or Desktop version (the latest changes may not be included yet)
3. Some nodes failed to import at startup
## Wan2.2 TI2V 5B Hybrid Version Workflow Example
The Wan2.2 5B version should fit well in 8GB of VRAM with ComfyUI's native offloading.
### 1. Download Workflow File
Please update your ComfyUI to the latest version, and through the menu `Workflow` -> `Browse Templates` -> `Video`, find "Wan2.2 5B video generation" to load the workflow.
Download JSON Workflow File
### 2. Manually Download Models
**Diffusion Model**
* [wan2.2\_ti2v\_5B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)
**VAE**
* [wan2.2\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan2.2_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ └───wan2.2_ti2v_5B_fp16.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan2.2_vae.safetensors
```
### 3. Follow the Steps
1. Ensure the `Load Diffusion Model` node loads the `wan2.2_ti2v_5B_fp16.safetensors` model.
2. Ensure the `Load CLIP` node loads the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model.
3. Ensure the `Load VAE` node loads the `wan2.2_vae.safetensors` model.
4. (Optional) If you need to perform image-to-video generation, you can use the shortcut Ctrl+B to enable the `Load image` node to upload an image.
5. (Optional) In the `Wan22ImageToVideoLatent` node, you can adjust the size settings and the total number of video frames (`length`).
6. (Optional) If you need to modify the prompts (positive and negative), please do so in the `CLIP Text Encoder` node at step 5.
7. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
## Wan2.2 14B T2V Text-to-Video Workflow Example
### 1. Workflow File
Please update your ComfyUI to the latest version, and through the menu `Workflow` -> `Browse Templates` -> `Video`, find "Wan2.2 14B T2V" to load the workflow.
Or update your ComfyUI to the latest version, then download the following video and drag it into ComfyUI to load the workflow.
Download JSON Workflow File
### 2. Manually Download Models
**Diffusion Model**
* [wan2.2\_t2v\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_t2v\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Follow the Steps
1. Ensure the first `Load Diffusion Model` node loads the `wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors` model.
2. Ensure the second `Load Diffusion Model` node loads the `wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors` model.
3. Ensure the `Load CLIP` node loads the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model.
4. Ensure the `Load VAE` node loads the `wan_2.1_vae.safetensors` model.
5. (Optional) In the `EmptyHunyuanLatentVideo` node, you can adjust the size settings and the total number of video frames (`length`).
6. (Optional) If you need to modify the prompts (positive and negative), please do so in the `CLIP Text Encoder` node at step 5.
7. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
## Wan2.2 14B I2V Image-to-Video Workflow Example
### 1. Workflow File
Please update your ComfyUI to the latest version, and through the menu `Workflow` -> `Browse Templates` -> `Video`, find "Wan2.2 14B I2V" to load the workflow.
Or update your ComfyUI to the latest version, then download the following video and drag it into ComfyUI to load the workflow.
Download JSON Workflow File
You can use the following image as input:

### 2. Manually Download Models
**Diffusion Model**
* [wan2.2\_i2v\_high\_noise\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_i2v_high_noise_14B_fp16.safetensors)
* [wan2.2\_i2v\_low\_noise\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_i2v_low_noise_14B_fp16.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_i2v_low_noise_14B_fp16.safetensors
│ │ └─── wan2.2_i2v_high_noise_14B_fp16.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Follow the Steps
1. Make sure the first `Load Diffusion Model` node loads the `wan2.2_i2v_high_noise_14B_fp16.safetensors` model.
2. Make sure the second `Load Diffusion Model` node loads the `wan2.2_i2v_low_noise_14B_fp16.safetensors` model.
3. Make sure the `Load CLIP` node loads the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model.
4. Make sure the `Load VAE` node loads the `wan_2.1_vae.safetensors` model.
5. In the `Load Image` node, upload the image to be used as the initial frame.
6. If you need to modify the prompts (positive and negative), do so in the `CLIP Text Encoder` node at step 6.
7. (Optional) In `EmptyHunyuanLatentVideo`, you can adjust the size settings and the total number of video frames (`length`).
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
## Wan2.2 14B FLF2V Workflow Example
The first and last frame workflow uses the same model locations as the I2V section.
### 1. Workflow and Input Material Preparation
Download the video or the JSON workflow below and open it in ComfyUI.
Download JSON Workflow
Download the following images as input materials:


### 2. Follow the Steps
1. Upload the image to be used as the starting frame in the first `Load Image` node.
2. Upload the image to be used as the ending frame in the second `Load Image` node.
3. Adjust the size settings in the `WanFirstLastFrameToVideo` node.
* By default, a relatively small size is set to prevent low VRAM users from consuming too many resources.
* If you have enough VRAM, you can try a resolution around 720P.
4. Write appropriate prompts according to your first and last frames.
5. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to execute video generation.
## Community Resources
### GGUF Versions
* [bullerwins/Wan2.2-I2V-A14B-GGUF/](https://huggingface.co/bullerwins/Wan2.2-I2V-A14B-GGUF/)
* [bullerwins/Wan2.2-T2V-A14B-GGUF](https://huggingface.co/bullerwins/Wan2.2-T2V-A14B-GGUF)
* [QuantStack/Wan2.2 GGUFs](https://huggingface.co/collections/QuantStack/wan22-ggufs-6887ec891bdea453a35b95f3)
**Custom Node**
[City96/ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF)
### WanVideoWrapper
[Kijai/ComfyUI-WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)
**Wan2.2 models**
[Kijai/WanVideo\_comfy\_fp8\_scaled](https://hf-mirror.com/Kijai/WanVideo_comfy_fp8_scaled)
**Wan2.1 models**
[Kijai/WanVideo\_comfy/Lightx2v](https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Lightx2v)
**Lightx2v 4steps LoRA**
* [Wan2.2-T2V-A14B-4steps-lora-rank64-V1](https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-V1)
# BasicScheduler - ComfyUI Native Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/BasicScheduler
The `BasicScheduler` node computes a sequence of sigma values for a diffusion model based on the given scheduler, model, and denoise parameter. It dynamically adjusts the total number of steps according to the denoise factor to fine-tune the diffusion process, providing a precise per-stage "recipe" for advanced sampling setups that need fine control, such as multi-stage (split-step) sampling.
## Inputs
| Parameter | Data Type | Input Type | Default | Range | Analogy | Technical Role |
| ----------- | -------------- | ---------- | ------- | --------- | ------------------------------------------------------------------------ | -------------------------------------------------------------- |
| `model` | MODEL | Input | - | - | **Canvas type**: different canvas materials need different paint recipes | Diffusion model object; the basis for computing sigma values |
| `scheduler` | COMBO\[STRING] | Widget | - | 9 options | **Mixing technique**: how the paint concentration changes | Scheduling algorithm that controls the noise-decay pattern |
| `steps` | INT | Widget | 20 | 1-10000 | **Mixing passes**: the difference in refinement between 20 and 50 passes | Number of sampling steps; affects generation quality and speed |
| `denoise` | FLOAT | Widget | 1.0 | 0.0-1.0 | **Creative intensity**: from light touch-up to full repaint | Denoising strength; supports partial-repaint scenarios |
### Scheduler Types in Detail
Based on `comfy.samplers.SCHEDULER_NAMES` in the source code, the following 9 schedulers are supported:
| Scheduler | Characteristics | Typical Scenarios | Noise-Decay Behavior |
| --------------------- | -------------------------- | ------------------------------------ | ----------------------------- |
| **normal** | Standard linear schedule | General use, balanced results | Uniform decay |
| **karras** | Smooth-transition schedule | High-quality generation, rich detail | Smooth nonlinear decay |
| **exponential** | Exponential-decay schedule | Fast generation, efficiency first | Rapid exponential decay |
| **sgm\_uniform** | SGM uniform schedule | Optimization for specific models | SGM-optimized decay |
| **simple** | Simple schedule | Quick tests, basic use | Simplified decay |
| **ddim\_uniform** | DDIM uniform schedule | Optimized for DDIM sampling | DDIM-specific decay |
| **beta** | Beta-distribution schedule | Special distribution needs | Beta-function decay |
| **linear\_quadratic** | Linear-quadratic schedule | Complex-scene optimization | Quadratic decay |
| **kl\_optimal** | KL-optimal schedule | Theoretically optimal | KL-divergence-optimized decay |
## Outputs
| Parameter | Data Type | Output Type | Analogy | Technical Meaning |
| -------- | --------- | ----------- | ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------- |
| `sigmas` | SIGMAS | Output | **Recipe sheet**: a detailed list of paint concentrations for the painter to follow step by step | Sequence of noise levels that guides the diffusion model's denoising |
## Node Role: The Painter's Mixing Assistant
Imagine you are a painter creating a clear image out of a chaotic blob of paint (noise). `BasicScheduler` is like your **professional paint-mixing assistant**: its job is to prepare a series of precise paint-concentration recipes for you:
### Workflow
* **Step 1**: use paint at 90% concentration (high noise level)
* **Step 2**: use paint at 80% concentration
* **Step 3**: use paint at 70% concentration
* **...**
* **Final step**: use 0% concentration (a clean canvas, no noise)
### The Assistant's Special Skills
**Different mixing methods (scheduler)**:
* **"karras" mixing**: paint concentration changes very smoothly, like a professional painter's gradient technique
* **"exponential" mixing**: paint concentration drops off quickly, suited to fast work
* **"normal" mixing**: paint concentration decreases uniformly, steady and controllable
**Fine control (steps)**:
* **20 mixing passes**: quick painting, efficiency first
* **50 mixing passes**: detailed painting, quality first
**Creative intensity (denoise)**:
* **1.0 = brand-new creation**: start entirely from a blank canvas
* **0.5 = half makeover**: keep half of the original painting, rework the other half
* **0.2 = light touch-up**: make only subtle adjustments to the original
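The denoise behavior above can be sketched in plain Python. This is a simplified illustration, not ComfyUI's actual implementation (the real node calls `comfy.samplers.calculate_sigmas` with the chosen scheduler; the linear ramp and the sigma bounds here are placeholder assumptions): with denoise below 1.0, a longer schedule is computed first and only its low-noise tail is kept.

```python
def basic_scheduler_sigmas(steps, denoise, sigma_max=14.6, sigma_min=0.03):
    """Simplified sketch of BasicScheduler's denoise handling.

    Uses a plain linear ramp in place of the real scheduler
    algorithms (normal, karras, ...); assumes steps >= 2.
    """
    if denoise <= 0.0:
        return []
    # With partial denoise, compute a longer schedule first ...
    total_steps = int(steps / denoise)
    sigmas = [sigma_max - (sigma_max - sigma_min) * i / (total_steps - 1)
              for i in range(total_steps)] + [0.0]
    # ... then keep only the tail: the low-noise portion of the schedule.
    return sigmas[-(steps + 1):]

full = basic_scheduler_sigmas(20, 1.0)
partial = basic_scheduler_sigmas(20, 0.5)
print(len(full), len(partial))  # both return steps + 1 values
print(full[0], partial[0])      # partial repaint starts at a lower noise level
```

This is why denoise = 0.5 behaves like "half makeover": the sampler never sees the high-noise part of the schedule, so much of the original image survives.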
### Working with Other Nodes
`BasicScheduler` (mixing assistant) → prepares the recipe → `SamplerCustom` (painter) → does the actual painting → finished work
# Canny - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/Canny
The Canny node is the ComfyUI node for extracting edges from an image.
It extracts all the edge lines from a photo, like tracing the photo with a pen, drawing out the outlines of objects and the boundaries of their details.
## How It Works
Imagine you are a painter tracing a photo with a pen. The Canny node is like a smart assistant that helps you decide where to draw lines (edges) and where not to.
The process works like a filter:
* **The high threshold** is the "must draw a line" bar: only very distinct, clear contours get drawn, such as facial outlines or building frames
* **The low threshold** is the "never draw a line" bar: edges that are too weak are ignored, avoiding noise and meaningless strokes
* **The in-between zone**: edges between the two thresholds are drawn only if they connect to a "must draw" line; isolated ones are dropped
The final output is a black-and-white image: white pixels are detected edges, black pixels are edge-free regions.
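The double-threshold filtering described above can be sketched in pure Python. This is an illustrative simplification of Canny's hysteresis step only (the real detector also does Gaussian smoothing, gradient computation, and non-maximum suppression); the `strengths` values are hypothetical gradient magnitudes in the node's 0-1 range.

```python
def hysteresis(strengths, low, high):
    """Classify edge strengths with Canny-style double thresholding.

    strengths: dict mapping (x, y) pixel -> gradient magnitude in 0..1.
    Returns the set of pixels kept as edges: strong edges (>= high)
    plus weak edges (>= low) connected to a strong edge.
    """
    strong = {p for p, s in strengths.items() if s >= high}
    weak = {p for p, s in strengths.items() if low <= s < high}
    # Grow strong edges into connected weak pixels (4-neighbourhood).
    edges = set(strong)
    frontier = list(strong)
    while frontier:
        x, y = frontier.pop()
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if n in weak and n not in edges:
                edges.add(n)
                frontier.append(n)
    return edges

# A short horizontal edge whose middle pixel is only "weak":
strengths = {(0, 0): 0.9, (1, 0): 0.5, (2, 0): 0.85, (5, 5): 0.5}
print(sorted(hysteresis(strengths, low=0.4, high=0.8)))
# (1, 0) survives because it touches a strong edge; isolated (5, 5) is dropped
```

This also explains the troubleshooting advice later in this page: raising the low threshold shrinks the "weak" set (less noise), while lowering the high threshold admits more seeds for edges to grow from (fewer broken lines).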
## Inputs
| Parameter | Data Type | Input Method | Default | Range | Description |
| ---- | ----- | ---- | --- | --------- | --------------------------------- |
| image | IMAGE | Connection | - | - | The source photo to extract edges from |
| low\_threshold | FLOAT | Manual input | 0.4 | 0.01-0.99 | Lower threshold; decides how weak an edge may be before it is ignored. Smaller values keep more detail but may introduce noise |
| high\_threshold | FLOAT | Manual input | 0.8 | 0.01-0.99 | Upper threshold; decides how strong an edge must be to be kept. Larger values keep only the most prominent contours |
## Outputs
| Output | Data Type | Description |
| ---- | ----- | ----------------------------- |
| image | IMAGE | Black-and-white edge image; white lines are detected edges, black regions have no edges |
## Parameter Tuning
**Common problems:**
* Broken, fragmented edges: try lowering the high threshold
* Lots of noise: raise the low threshold
* Important details missing: lower the low threshold
* Edges too rough: check the quality and resolution of the input image
# CheckpointLoaderSimple - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/CheckpointLoaderSimple
The CheckpointLoaderSimple node is the ComfyUI node for loading models.
It is a model loader that reads a model file from a given location and splits it into three core components: the main model, the text encoder, and the image encoder/decoder.
The node automatically detects all model files under the `ComfyUI/models/checkpoints` folder, plus any extra paths you have configured in extra\_model\_paths.yaml.
1. **Model compatibility**: make sure the selected model is compatible with your workflow; different model types (SD1.5, SDXL, Flux, etc.) need matching samplers and other nodes
2. **File management**: place model files in the `ComfyUI/models/checkpoints` folder, or configure other paths via extra\_model\_paths.yaml
3. **Interface refresh**: if you add new model files while ComfyUI is running, refresh the browser (Ctrl+R) to see them in the dropdown list
## Inputs
| Parameter | Data Type | Input Method | Default | Range | Description |
| ------------ | ------ | ---- | ---- | ---------------------- | ------------------------------ |
| ckpt\_name | STRING | Dropdown | null | All model files in the checkpoints folder | The checkpoint model file to load; determines the AI model used for subsequent image generation |
## Outputs
| Output | Data Type | Description |
| ---- | ----- | ----------------------------- |
| MODEL | MODEL | The main diffusion model used for denoising-based image generation; the core component of AI painting |
| CLIP | CLIP | The model that encodes text prompts, converting text descriptions into information the AI can understand |
| VAE | VAE | The model that encodes/decodes images, converting between pixel space and latent space |
# Load CLIP - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipLoader
The `Load CLIP` node is used to load a standalone CLIP text encoder model.
It detects model files in the following paths:
* `ComfyUI/models/text_encoders/`
* `ComfyUI/models/clip/`
> If you saved a model after ComfyUI started, refresh the ComfyUI frontend to pick up the latest list of model file paths.
Supported model formats:
* `.ckpt`
* `.pt`
* `.pt2`
* `.bin`
* `.pth`
* `.safetensors`
* `.pkl`
* `.sft`
For the latest details on model file loading, see [folder\_paths](https://github.com/comfyanonymous/ComfyUI/blob/master/folder_paths.py).
## Inputs
| Parameter | Data Type | Role |
| -------- | -------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `clip_name` | COMBO\[STRING] | The name of the CLIP model to load. This name is used to locate the model file within the predefined directory structure. |
| `type` | COMBO\[STRING] | The type of CLIP model to load. New types are added here as ComfyUI supports more models; see the `CLIPLoader` class definition in the source at [nodes.py](https://github.com/comfyanonymous/ComfyUI/blob/master/nodes.py) |
| `device` | COMBO\[STRING] | The device to load the CLIP model on. `default` runs the model on the GPU; choosing `cpu` forces it to load on the CPU |
### Notes on the `device` Options
**Choose "default" when**:
* You have enough GPU memory
* You want the best performance
* You want the system to optimize memory usage automatically
**Choose "cpu" when**:
* GPU memory is insufficient
* You need to reserve GPU memory for other models (such as the UNet)
* You are running in a low-VRAM environment
* You need it for debugging or special purposes
**Performance impact**
Running on the CPU is much slower than on the GPU, but it frees up precious GPU memory for more important model components. In memory-constrained environments, placing the CLIP model on the CPU is a common optimization strategy.
### Supported Pairings
| Model Type | Corresponding Encoder |
| ----------------- | ----------------------- |
| stable\_diffusion | clip-l |
| stable\_cascade | clip-g |
| sd3 | t5 xxl / clip-g / clip-l |
| stable\_audio | t5 base |
| mochi | t5 xxl |
| cosmos | old t5 xxl |
| lumina2 | gemma 2 2B |
| wan | umt5 xxl |
As ComfyUI is updated, new pairings may be added; for details see the `CLIPLoader` class definition in [nodes.py](https://github.com/comfyanonymous/ComfyUI/blob/master/nodes.py).
## Outputs
| Parameter | Data Type | Role |
| ------ | ---- | --------------------------- |
| `clip` | CLIP | The loaded CLIP model, ready for downstream tasks or further processing. |
## Further Notes
The CLIP model plays the central role of text encoder in ComfyUI, converting text prompts into numerical representations that the diffusion model can understand. You can think of it as a translator that renders your text into a language the main model speaks. Different models have their own "dialects," so models with different architectures need different CLIP models to perform this text-encoding step.
# CLIP Merge Simple - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipMergeSimple
`CLIP Merge Simple` is an advanced model-merging node that blends two CLIP text encoder models at a specified ratio.
It merges the two CLIP models according to the given ratio, effectively blending their characteristics. It selectively applies one model's patches onto the other, excluding specific components such as position IDs and the logit scale, to create a hybrid model that combines features of both source models.
## Inputs
| Parameter | Data Type | Role |
| ------- | ----- | ------------------------------------------------------------------------------- |
| `clip1` | CLIP | The first CLIP model to merge. It serves as the base model for the merge. |
| `clip2` | CLIP | The second CLIP model to merge. Its key patches (except position IDs and logit scale) are applied onto the first model according to the specified ratio. |
| `ratio` | FLOAT | Range `0.0 - 1.0`. Determines the proportion of the second model's features blended into the first. A ratio of 1.0 adopts the second model's features entirely, while 0.0 keeps only the first model's. |
## Outputs
| Parameter | Data Type | Role |
| ------ | ---- | -------------------------------- |
| `clip` | CLIP | The merged CLIP model, combining features of both input models according to the specified ratio. |
## How the Merge Works
### Merge Algorithm
The node merges the two models with a weighted average:
1. **Clone the base model**: first clone `clip1` as the base
2. **Collect patches**: get all key patches from `clip2`
3. **Filter special keys**: skip keys ending in `.position_ids` and `.logit_scale`
4. **Apply the weighted merge**: use the formula `(1.0 - ratio) * clip1 + ratio * clip2`
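The four steps above can be illustrated with plain Python dicts of scalars standing in for model weight tensors. This is a sketch of the merge rule only, not ComfyUI's actual `model_patcher` API; the keys and values are hypothetical.

```python
def merge_clip_weights(clip1, clip2, ratio):
    """Weighted merge of two weight dicts, skipping special keys."""
    merged = dict(clip1)  # step 1: clone the base model
    for key, w2 in clip2.items():  # step 2: walk clip2's keys
        if key.endswith(".position_ids") or key.endswith(".logit_scale"):
            continue  # step 3: these keys are never patched
        # step 4: weighted average of the two models' weights
        merged[key] = (1.0 - ratio) * clip1[key] + ratio * w2
    return merged

clip1 = {"text.weight": 1.0, "text.position_ids": 0.0}
clip2 = {"text.weight": 3.0, "text.position_ids": 99.0}
print(merge_clip_weights(clip1, clip2, 0.5))
# {'text.weight': 2.0, 'text.position_ids': 0.0}
```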
### The `ratio` Parameter
* **ratio = 0.0**: use clip1 entirely, ignore clip2
* **ratio = 0.5**: each model contributes 50%
* **ratio = 1.0**: use clip2 entirely, ignore clip1
## Use Cases
1. **Style blending**: combine the traits of CLIP models trained on different data
2. **Performance tuning**: balance the strengths and weaknesses of different models
3. **Experimentation**: explore the combined effects of different CLIP encoders
# Save CLIP - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipSave
The `Save CLIP` node saves a CLIP text encoder model to a SafeTensors file. It belongs to the advanced model-merging workflow and is typically used together with nodes such as `CLIPMergeSimple` and `CLIPMergeAdd`. Files are saved in the SafeTensors format for safety and compatibility.
## Inputs
| Parameter | Type | Required | Default | Description |
| --------------- | -------------- | ---- | ---------------- | --------------- |
| `clip` | CLIP | Required | - | The CLIP model to save |
| `filename_prefix` | STRING | Required | `"clip/ComfyUI"` | Prefix path for the saved file |
| `prompt` | PROMPT | Hidden | - | Workflow prompt information (for metadata) |
| `extra_pnginfo` | EXTRA\_PNGINFO | Hidden | - | Extra PNG information (for metadata) |
## Outputs
This node defines no output types; it saves the processed files into the `ComfyUI/output/` folder.
### Multi-file Saving Strategy
The node saves different components separately depending on the CLIP model type:
| Key Prefix | Filename Suffix | Description |
| --------- | --------- | ------------ |
| `clip_l.` | `_clip_l` | CLIP-L text encoder |
| `clip_g.` | `_clip_g` | CLIP-G text encoder |
| (no prefix) | (no suffix) | Other CLIP components |
# Set CLIP Last Layer - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipSetLastLayer
`Set CLIP Last Layer` is the core ComfyUI node for controlling how deep the CLIP model processes text. It lets you specify exactly at which layer the CLIP text encoder stops, which affects the depth of text understanding and the style of the generated image.
Imagine the CLIP model as a smart, 24-layer brain:
* Shallow layers (1-8): recognize basic letters and words
* Middle layers (9-16): understand grammar and sentence structure
* Deep layers (17-24): grasp abstract concepts and complex semantics
`Set CLIP Last Layer` acts as a **"thinking-depth dial"**:
-1: use all 24 layers (fullest understanding)
-2: stop at layer 23 (slightly simplified)
-12: stop at layer 13 (moderate understanding)
-24: use only layer 1 (most basic understanding)
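The mapping from the negative index to the number of layers actually used can be written as a one-line formula. This is a sketch assuming a 24-layer text encoder (as in the analogy above); actual layer counts vary by CLIP architecture.

```python
def layers_used(stop_at_clip_layer, total_layers=24):
    """Map the node's negative index to the number of layers kept.

    -1 keeps all layers, -2 drops the last one, ..., -24 keeps only
    layer 1 (sketch; assumes a 24-layer text encoder).
    """
    return total_layers + stop_at_clip_layer + 1

print(layers_used(-1), layers_used(-12), layers_used(-24))  # 24 13 1
```

Many SD1.5-era checkpoints were trained with "clip skip 2", which corresponds to setting this node to -2.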
## Inputs
| Parameter | Data Type | Default | Range | Description |
| ----------- | ---- | --- | -------- | ------------------------------- |
| `clip` | CLIP | - | - | The CLIP model to modify |
| `stop_at_clip_layer` | INT | -1 | -24 to -1 | The layer at which processing stops; -1 uses all layers, -24 uses only the first |
## Outputs
| Output | Data Type | Description |
| ---- | ---- | ----------------------- |
| clip | CLIP | The modified CLIP model, with the specified layer set as its last |
## Why Set the Last Layer
* **Performance**: just as you don't need a PhD to understand a simple sentence, a shallow reading is sometimes enough, and it is faster
* **Style control**: different depths of understanding produce different artistic styles
* **Compatibility**: some models may perform better at a specific layer
# CLIP Text Encode - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipTextEncode
The `CLIP Text Encode (CLIPTextEncode)` node works like a translator: it converts the creative ideas you describe in words into a special "language" the AI understands, helping it grasp exactly the image you want to create.
Imagine communicating with a foreign painter: you need a translator to convey precisely the painting you want. This node is that translator. It uses a CLIP model (an AI model trained on large amounts of paired images and text) to understand your written description and convert it into "language instructions" the AI painting model can follow.
## Inputs
| Parameter | Data Type | Input Method | Default | Range | Description |
| ---- | ------ | ----- | --- | ---------- | ----------------------------------------------------- |
| text | STRING | Text box | Empty | Any text | Like detailed instructions for the painter: enter the text description of the image you want to generate. Multi-line text is supported, so you can describe the desired effect in great detail. |
| clip | CLIP | Model input | None | A loaded CLIP model | Like choosing a particular translator: different CLIP models are like different translators with slightly different understandings of artistic style. |
## Outputs
| Output | Data Type | Description |
| ---- | ------------ | -------------------------------------------------------- |
| CONDITIONING | CONDITIONING | The converted "painting instructions": detailed creative guidance the AI can understand, telling the model how to draw an image matching your description. |
## Usage Tips
1. **Basic text prompting**
* Describe the image you want in as much detail as you would in an essay
* The more specific the description, the closer the result will match your expectations
* Separate different descriptive elements with English commas
2. **Special feature: Embedding models**
* Embedding models are like preset style packs that quickly apply a particular artistic effect
* The supported file formats are .safetensors, .pt, and .bin; you do not necessarily need to type the full model name when using one
* How to use:
1. Put the embedding model file (e.g. .pt format) into the `ComfyUI/models/embeddings` folder
2. Call it in your text with `embedding:model_name`
For example, if you have a model called `EasyNegative.pt`, you can use it like this:
```
a beautiful landscape, embedding:EasyNegative, high quality
```
3. **Prompt weight adjustment**
* Use parentheses to adjust how important certain descriptions are
* For example, `(beautiful:1.2)` makes the "beautiful" feature more prominent
* Plain parentheses `()` apply a default weight of 1.1
* The keyboard shortcut `Ctrl + Up/Down arrow keys` quickly adjusts weights
* The step size for these quick adjustments can be changed in the settings
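The weight syntax above can be sketched with a small parser. This is an illustration of the `(text:weight)` convention only, not ComfyUI's actual tokenizer (which also handles nesting and escaping); plain parentheses fall back to the default weight 1.1.

```python
import re

def parse_weights(prompt):
    """Extract (text) and (text:weight) spans from a prompt.

    Sketch only: no nesting or escaped parentheses, unlike the
    real ComfyUI prompt parser.
    """
    out = []
    for text, weight in re.findall(r"\(([^():]+)(?::([\d.]+))?\)", prompt):
        out.append((text, float(weight) if weight else 1.1))
    return out

print(parse_weights("(beautiful:1.2), a (sunny) day"))
# [('beautiful', 1.2), ('sunny', 1.1)]
```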
4. **Notes**
* Make sure a CLIP model is loaded correctly
* Prefer positive, unambiguous wording in your descriptions
* If you use an embedding model, make sure the file name is typed correctly and matches the architecture of the current base model
# CLIP Text Encode Flux - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipTextEncodeFlux
`CLIP Text Encode Flux` is an advanced text-encoding node in ComfyUI designed specifically for the Flux architecture. It uses a dual text encoder (CLIP-L plus T5XXL) working in tandem, handling structured keywords and detailed natural-language descriptions at the same time, giving the Flux model more precise and richer text understanding and improving text-to-image quality.
The node is built on a dual-encoder mechanism:
1. The `clip_l` input is processed by the CLIP-L encoder, which extracts keyword features such as style and subject; it suits concise descriptions.
2. The `t5xxl` input is processed by the T5XXL encoder, which excels at understanding complex, detailed natural-language scene descriptions.
3. The two encodings are fused and, together with the "guidance" parameter, produce a unified conditioning embedding (CONDITIONING) for downstream Flux sampler nodes, controlling how closely the output follows the text description.
## Inputs
| Parameter | Data Type | Input Method | Default | Range | Description |
| -------- | ------ | ---- | --- | ----------- | ---------------------------------------------- |
| `clip` | CLIP | Node input | None | - | Must be a CLIP model that supports the Flux architecture, containing both CLIP-L and T5XXL encoders |
| `clip_l` | STRING | Text box | None | Up to 77 tokens | Suited to concise keyword descriptions such as style and subject |
| `t5xxl` | STRING | Text box | None | Virtually unlimited | Suited to detailed natural-language descriptions expressing complex scenes and details |
| `guidance` | FLOAT | Slider | 3.5 | 0.0 - 100.0 | Controls how strongly the text conditioning influences generation; higher values follow the text more strictly |
## Outputs
| Output | Data Type | Description |
| ---- | ------------ | ---------------------------- |
| `CONDITIONING` | CONDITIONING | The conditioning embedding from both encoders plus the guidance parameter, used for conditional image generation |
## Usage Examples
### Prompt Examples
* **clip\_l field** (keyword style):
* Use structured, concise keyword combinations
* Example: `masterpiece, best quality, portrait, oil painting, dramatic lighting`
* Focus on core elements such as style, quality, and subject
* **t5xxl field** (natural-language description):
* Use complete, fluent scene descriptions
* Example: `A highly detailed portrait in oil painting style, featuring dramatic chiaroscuro lighting that creates deep shadows and bright highlights, emphasizing the subject's features with renaissance-inspired composition.`
* Focus on scene details, spatial relationships, and lighting
### Notes
1. Make sure you use a CLIP model compatible with the Flux architecture
2. Fill in both clip\_l and t5xxl to take full advantage of the dual encoders
3. Mind the token limit on clip\_l (77 tokens)
4. Adjust the "guidance" parameter based on the results
# CLIP Text Encode Hunyuan DiT - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipTextEncodeHunyuanDit
The CLIP Text Encode Hunyuan DiT node encodes text prompts into HunyuanDiT-compatible conditioning embeddings.
Its main job is to convert input text into a form the model can understand. It is an advanced conditioning node built specifically for HunyuanDiT's dual text-encoder architecture.
It works like a translator, converting your written description into the "machine language" the AI model understands. The `bert` and `mt5xl` inputs favor different kinds of prompts.
## Inputs
| Parameter | Data Type | Description |
| ------- | ------ | -------------------------------------------- |
| `clip` | CLIP | A CLIP model instance used for tokenizing and encoding the text; the core of conditioning generation. |
| `bert` | STRING | Text input to encode; favors phrases and keywords, supports multi-line and dynamic prompts. |
| `mt5xl` | STRING | A second text input to encode, supporting multi-line and dynamic prompts (multilingual); full sentences and complex descriptions work well here. |
## Outputs
| Parameter | Data Type | Description |
| ---- | ------------ | ----------------------- |
| `CONDITIONING` | CONDITIONING | The encoded conditioning output, used for further processing in generation tasks. |
# CLIP Text Encode SDXL - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipTextEncodeSdxl
The CLIP Text Encode SDXL node encodes text prompts into SDXL-compatible conditioning embeddings.
This node encodes text input using a CLIP model tailored specifically to the SDXL architecture. It uses a dual-encoder system (CLIP-L and CLIP-G) to process text descriptions, producing more accurate images.
## Inputs
| Parameter | Data Type | Role |
| -------- | -------- | ----------------------- |
| `clip` | `CLIP` | The CLIP model instance used to encode the text. |
| `width` | `INT` | The image width in pixels; default 1024. |
| `height` | `INT` | The image height in pixels; default 1024. |
| `crop_w` | `INT` | Width of the crop region in pixels; default 0. |
| `crop_h` | `INT` | Height of the crop region in pixels; default 0. |
| `target_width` | `INT` | Target width of the output image; default 1024. |
| `target_height` | `INT` | Target height of the output image; default 1024. |
| `text_g` | `STRING` | Global text description, for the overall scene. |
| `text_l` | `STRING` | Local text description, for the details. |
## Outputs
| Parameter | Data Type | Role |
| ---- | -------------- | --------------------- |
| `CONDITIONING` | `CONDITIONING` | Contains the encoded text and the conditioning information needed for image generation. |
# CLIP Text Encode SDXL Refiner - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipTextEncodeSdxlRefiner
The CLIP Text Encode SDXL Refiner node encodes text prompts into conditioning embeddings compatible with the SDXL Refiner.
This node is designed specifically for the SDXL Refiner model. It converts text prompts into conditioning information, enriching the conditioning for generation tasks with an aesthetic score and dimension information to improve the final refinement. Think of it as a professional art director: it not only conveys your creative intent but also injects precise aesthetic standards and specifications into the work.
## How It Works
The SDXL Refiner is a specialized refinement model that builds on the SDXL base model, focusing on improving image detail and quality. The process resembles an art retoucher's:
1. First, it receives the preliminary image or text description produced by the base model
2. Then, it guides refinement with a precise aesthetic score and size parameters
3. Finally, it focuses on the image's high-frequency details to raise overall quality
The Refiner can be used in two ways:
* As a standalone refinement pass that post-processes images generated by the base model
* As part of an expert-ensemble setup, taking over during the low-noise phase of generation
## Inputs
| Parameter | Data Type | Input Method | Default | Range | Description |
| ------ | ------ | ---- | ---- | ---------- | --------------------------------------------------------------------------------------------------------------------------- |
| `clip` | CLIP | Required | - | - | The CLIP model instance used for tokenizing and encoding the text; the core component that converts text into a form the model understands |
| `ascore` | FLOAT | Optional | 6.0 | 0.0-1000.0 | The aesthetic score: controls the visual quality and attractiveness of the generated image, like setting a quality bar for an artwork. High values (7.5-8.5) pursue finer, more detailed results; medium values (6.0-7.0) give balanced quality control; low values (2.0-3.0) suit negative prompts |
| `width` | INT | Required | 1024 | 64-16384 | Output image width in pixels; must be a multiple of 8. SDXL performs best when the total pixel count is near 1024×1024 (about 1 megapixel) |
| `height` | INT | Required | 1024 | 64-16384 | Output image height in pixels; must be a multiple of 8. SDXL performs best when the total pixel count is near 1024×1024 (about 1 megapixel) |
| `text` | STRING | Required | - | - | The text prompt; supports multi-line input and dynamic-prompt syntax. For the Refiner, prompts should emphasize the desired visual quality and detail characteristics |
## Outputs
| Output | Data Type | Description |
| ---- | ------------ | ---------------------------------------------------------------- |
| `CONDITIONING` | CONDITIONING | The refined conditioning output: a combined encoding of text semantics, aesthetic standard, and size information, used to guide the SDXL Refiner model through precise image refinement |
## Notes
1. This node is optimized specifically for the SDXL Refiner and differs from the regular CLIPTextEncode node
2. An aesthetic score of 7.5 is the recommended baseline; it matches the standard setting used when SDXL was trained
3. All size parameters must be multiples of 8, and the total pixel count should be close to 1024×1024 (about 1 megapixel)
4. Since the Refiner focuses on improving image detail and quality, prompts should describe the desired visual qualities rather than scene content
5. In practice, the Refiner is usually applied in the late phase of generation (roughly the final 20% of steps), focusing on detail optimization
# CLIP Vision Encode - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipVisionEncode
The CLIP Vision Encode node converts an input image into a visual feature vector using a CLIP Vision model.
`CLIP Vision Encode` is ComfyUI's image-encoding node. It runs the input image through a CLIP Vision model to produce visual feature vectors. The node is an important bridge between image and text understanding and is widely used in AI image generation and processing workflows.
**Node functions**
* **Image feature extraction**: converts the input image into high-dimensional feature vectors
* **Multimodal bridging**: provides the basis for joint image-and-text processing
* **Conditional generation**: supplies visual conditioning for image-based conditional generation
## Inputs
| Parameter | Type | Description |
| -------- | ------------ | ----------------------------------- |
| `clip_vision` | CLIP\_VISION | The CLIP Vision model, usually loaded via the CLIPVisionLoader node |
| `image` | IMAGE | The input image to encode |
| `crop` | Dropdown | Image cropping mode; options: center (center crop), none (no cropping) |
## Outputs
| Parameter | Type | Description |
| -------- | -------------------- | -------- |
| CLIP\_VISION\_OUTPUT | CLIP\_VISION\_OUTPUT | The encoded visual features |
The output object contains:
* `last_hidden_state`: the hidden state of the last layer
* `image_embeds`: the image embedding vector
* `penultimate_hidden_states`: the hidden states of the second-to-last layer
* `mm_projected`: the multimodal projection result (if available)
# Load CLIP Vision - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/ClipVisionLoader
The Load CLIP Vision node loads CLIP Vision models from the `ComfyUI/models/clip_vision` folder.
The node detects models in the `ComfyUI/models/clip_vision` folder and also reads models from the extra paths configured in `extra_model_paths.yaml`. If you added a model after ComfyUI started, **refresh the ComfyUI interface** so the frontend picks up the latest model file list.
## Inputs
| Parameter | Data Type | Role |
| -------- | -------------- | ------------------------------------------------ |
| `clip_name` | COMBO\[STRING] | Lists model files in supported formats from the `ComfyUI/models/clip_vision` folder |
## Outputs
| Parameter | Data Type | Role |
| -------- | ------------ | -------------------------------- |
| `CLIP_VISION` | CLIP\_VISION | The loaded CLIP Vision model, ready to encode images or perform other vision-related tasks. |
# Load 3D - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/Load3D
The Load3D node is ComfyUI's core node for loading and previewing 3D model files, with multi-format import and a rich set of 3D view operations.
The Load3D node loads and processes 3D model files. When the node loads, it automatically picks up the 3D assets available under `ComfyUI/input/3d/`; you can also upload supported 3D files through the upload function and preview them.
**Supported formats**
The node currently supports several 3D file formats, including `.gltf`, `.glb`, `.obj`, `.fbx`, and `.stl`.
**3D node preferences**
Some preferences related to 3D nodes can be configured in ComfyUI's settings menu; see the documentation below for the corresponding settings:
[Settings menu - 3D](/zh-CN/interface/settings/3d)
Beyond the regular node outputs, many of Load3D's 3D-view operations live in the preview-area menus.
## Inputs
| Parameter | Type | Description | Default | Range |
| ---- | ---- | ----------------------------------------------- | ---- | ------- |
| model\_file | File select | Path of the 3D model file; supports upload; reads model files under `ComfyUI/input/3d/` by default | - | Files in supported formats |
| width | INT | Canvas render width | 1024 | 1-4096 |
| height | INT | Canvas render height | 1024 | 1-4096 |
## Outputs
| Parameter | Data Type | Description |
| ---------------- | -------------- | ------------------------------------------ |
| image | IMAGE | The rendered canvas image |
| mask | MASK | A mask covering the current model's position |
| mesh\_path | STRING | Path of the model file under the `ComfyUI/input` folder |
| normal | IMAGE | Normal map |
| lineart | IMAGE | Line-art image output; the corresponding `edge_threshold` can be adjusted in the canvas's model menu |
| camera\_info | LOAD3D\_CAMERA | Camera information |
| recording\_video | VIDEO | The recorded video (only when a recording exists) |
## The Model Canvas Area
The Canvas area of the Load 3D node hosts numerous view operations, including:
* Preview view settings (grid, background color, preview view)
* Camera control: FOV, camera type
* Global illumination intensity: adjust the lighting strength
* Video recording: record and export videos
* Model export: supports `GLB`, `OBJ`, and `STL` formats
* And more
1. Contains several Load 3D menus as well as hidden menus
2. Menus for `resizing the preview window` and `recording the canvas as video`
3. The 3D view manipulation axes
4. Preview thumbnail
5. Preview size settings: set the size, then rescale the window to scale the preview view
### 1. View Operations
View controls:
* Left-click + drag: rotate the view
* Right-click + drag: pan the view
* Middle mouse button: zoom
* Coordinate axes: switch views
### 2. Left Menu Functions
In the preview area, some view-related menus are tucked away; click the menu button to expand the corresponding menus:
* 1. Scene: grid, background color, and thumbnail settings for the preview window
* 2. Model: model render mode, texture materials, up-direction settings
* 3. Camera: switching between axonometric and perspective views, perspective FOV settings
* 4. Light: global illumination intensity of the scene
* 5. Export: export the model to other formats (GLB, OBJ, STL)
#### Scene
The Scene menu offers basic scene settings:
1. Show / hide the grid
2. Set the background color
3. Click to upload a background image
4. Hide the preview thumbnail
#### Model
The Model menu offers model-related functions:
1. **Up direction**: determine which axis of the model points up
2. **Material mode**: switch the model's render mode among Original, Normal, Wireframe, and Lineart
#### Camera
This menu provides switching between axonometric and perspective views plus FOV settings:
1. **Camera**: switch quickly between axonometric and orthographic views
2. **FOV**: adjust the FOV angle
#### Light
This menu quickly adjusts the global illumination intensity of the model scene.
#### Export
This menu provides a quick way to convert the model to another format and export it.
### 3. Right Menu Functions
The right menu has two main functions:
1. **Reset view ratio**: after clicking the button, the view resizes the canvas render area proportionally to the configured width and height
2. **Video recording**: record your current 3D view manipulation as a video, allow importing it, and pass it to downstream nodes via the `recording_video` output
# Flux 1.1 [pro] Ultra Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/bfl/flux-1-1-pro-ultra-image
Generate images with Black Forest Labs' high-resolution image generation API
The Flux 1.1 \[pro] Ultra Image node lets you generate ultra-high-resolution images from text prompts, connecting directly to Black Forest Labs' latest image generation API.
The node supports two main modes of use:
1. **Text-to-image**: generate high-quality images from a text prompt (when no image input is used)
2. **Image-to-image**: combine an existing image with a prompt to create a new image that blends both (Remix mode)
The node calls the API's Ultra mode, which generates images at 4 times the resolution of standard Flux 1.1 \[pro] (up to 4MP) without sacrificing prompt adherence, while keeping generation time to just 10 seconds. Compared with other high-resolution models, it is more than 2.5x faster.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------------------ | --- | ------ | ------------------------------------------------------------------ |
| prompt | String | "" | Text description of the image to generate |
| prompt\_upsampling | Boolean | False | Whether to enhance detail with prompt upsampling. When enabled, the prompt is automatically modified for more creative generation, but results become nondeterministic (the same seed will not reproduce exactly the same result) |
| seed | Integer | 0 | Random seed controlling generation randomness |
| aspect\_ratio | String | "16:9" | Aspect ratio of the image; must be between 1:4 and 4:1 |
| raw | Boolean | False | When True, generates less processed, more natural-looking images |
### Optional Parameters
| Parameter | Type | Default | Description |
| ----------------------- | --- | --- | ----------------------------------------------------------------- |
| image\_prompt | Image | None | Optional input, used for image-to-image (Remix) mode |
| image\_prompt\_strength | Float | 0.1 | Takes effect when `image_prompt` is supplied; blends between the prompt and the image prompt. Higher values make the output closer to the input image; range 0.0-1.0 |
### Outputs
| Output | Type | Description |
| ----- | -- | ----------- |
| IMAGE | Image | The generated high-resolution image |
## Usage Example
See the tutorial below for a usage example:
* [Flux 1.1 Pro Ultra Image API node, official ComfyUI example workflow](/zh-CN/tutorials/api-nodes/black-forest-labs/flux-1-1-pro-ultra-image)
## How It Works
Flux 1.1 \[pro] Ultra mode achieves high-resolution image generation without sacrificing speed through an optimized deep-learning architecture and efficient GPU acceleration. When a request reaches the API, the system parses the prompt, applies the appropriate parameters, computes the image in parallel, and returns the high-resolution result.
Compared with regular models, Ultra mode pays particular attention to detail preservation and consistency at large sizes, maintaining impressive quality even at the 4MP resolution.
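The 1:4-to-4:1 aspect-ratio constraint can be sketched as a small validator. This is an assumption about how the `validate_aspect_ratio` helper called in the node source might behave, inferred from its call signature; it is not the actual implementation.

```python
def validate_aspect_ratio(ratio_str, minimum_ratio=1 / 4, maximum_ratio=4 / 1,
                          minimum_ratio_str="1:4", maximum_ratio_str="4:1"):
    """Parse 'W:H' and reject ratios outside [minimum, maximum].

    Sketch of the helper used by FluxProUltraImageNode; the real
    function may differ in error handling and return value.
    """
    width, height = (int(part) for part in ratio_str.split(":"))
    ratio = width / height
    if not (minimum_ratio <= ratio <= maximum_ratio):
        raise ValueError(
            f"Aspect ratio must be between {minimum_ratio_str} "
            f"and {maximum_ratio_str}, got {ratio_str}"
        )
    return ratio_str

print(validate_aspect_ratio("16:9"))  # accepted: 16/9 is within [0.25, 4.0]
```

In the node, `VALIDATE_INPUTS` runs this check before the API call, so an out-of-range ratio fails fast in the UI rather than producing a server-side error.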
## Source Reference
\[Node source code (updated 2025-05-03)]
```python
class FluxProUltraImageNode(ComfyNodeABC):
"""
Generates images synchronously based on prompt and resolution.
"""
MINIMUM_RATIO = 1 / 4
MAXIMUM_RATIO = 4 / 1
MINIMUM_RATIO_STR = "1:4"
MAXIMUM_RATIO_STR = "4:1"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the image generation",
},
),
"prompt_upsampling": (
IO.BOOLEAN,
{
"default": False,
"tooltip": "Whether to perform upsampling on the prompt. If active, automatically modifies the prompt for more creative generation, but results are nondeterministic (same seed will not produce exactly the same result).",
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "The random seed used for creating the noise.",
},
),
"aspect_ratio": (
IO.STRING,
{
"default": "16:9",
"tooltip": "Aspect ratio of image; must be between 1:4 and 4:1.",
},
),
"raw": (
IO.BOOLEAN,
{
"default": False,
"tooltip": "When True, generate less processed, more natural-looking images.",
},
),
},
"optional": {
"image_prompt": (IO.IMAGE,),
"image_prompt_strength": (
IO.FLOAT,
{
"default": 0.1,
"min": 0.0,
"max": 1.0,
"step": 0.01,
"tooltip": "Blend between the prompt and the image prompt.",
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
@classmethod
def VALIDATE_INPUTS(cls, aspect_ratio: str):
try:
validate_aspect_ratio(
aspect_ratio,
minimum_ratio=cls.MINIMUM_RATIO,
maximum_ratio=cls.MAXIMUM_RATIO,
minimum_ratio_str=cls.MINIMUM_RATIO_STR,
maximum_ratio_str=cls.MAXIMUM_RATIO_STR,
)
except Exception as e:
return str(e)
return True
RETURN_TYPES = (IO.IMAGE,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/image/bfl"
def api_call(
self,
prompt: str,
aspect_ratio: str,
prompt_upsampling=False,
raw=False,
seed=0,
image_prompt=None,
image_prompt_strength=0.1,
auth_token=None,
**kwargs,
):
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/bfl/flux-pro-1.1-ultra/generate",
method=HttpMethod.POST,
request_model=BFLFluxProUltraGenerateRequest,
response_model=BFLFluxProGenerateResponse,
),
request=BFLFluxProUltraGenerateRequest(
prompt=prompt,
prompt_upsampling=prompt_upsampling,
seed=seed,
aspect_ratio=validate_aspect_ratio(
aspect_ratio,
minimum_ratio=self.MINIMUM_RATIO,
maximum_ratio=self.MAXIMUM_RATIO,
minimum_ratio_str=self.MINIMUM_RATIO_STR,
maximum_ratio_str=self.MAXIMUM_RATIO_STR,
),
raw=raw,
image_prompt=(
image_prompt
if image_prompt is None
else convert_image_to_base64(image_prompt)
),
image_prompt_strength=(
None if image_prompt is None else round(image_prompt_strength, 2)
),
),
auth_token=auth_token,
)
output_image = handle_bfl_synchronous_operation(operation)
return (output_image,)
```
# Ideogram V1 - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/ideogram/ideogram-v1
A node that creates images with accurate text rendering via the Ideogram API
The Ideogram V1 node lets you generate images with high-quality text rendering using Ideogram's text-to-image API.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| --------------------- | --- | ------ | -------------------------------------- |
| prompt | String | "" | Text prompt describing what to generate |
| turbo | Boolean | False | Whether to use turbo mode (faster generation, potentially lower quality) |
| aspect\_ratio | Choice | "1:1" | Image aspect ratio |
| magic\_prompt\_option | Choice | "AUTO" | Whether MagicPrompt should be used in generation; options: AUTO, ON, OFF |
| seed | Integer | 0 | Random seed (0-2147483647) |
| negative\_prompt | String | "" | Elements you do not want in the image |
| num\_images | Integer | 1 | Number of images to generate (1-8) |
### Outputs
| Output | Type | Description |
| ----- | -- | ------- |
| IMAGE | Image | The generated image |
## Source Reference
\[Node source code (updated 2025-05-03)]
```python
class IdeogramV1(ComfyNodeABC):
"""
Generates images synchronously using the Ideogram V1 model.
Images links are available for a limited period of time; if you would like to keep the image, you must download it.
"""
def __init__(self):
pass
@classmethod
def INPUT_TYPES(cls) -> InputTypeDict:
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the image generation",
},
),
"turbo": (
IO.BOOLEAN,
{
"default": False,
"tooltip": "Whether to use turbo mode (faster generation, potentially lower quality)",
}
),
},
"optional": {
"aspect_ratio": (
IO.COMBO,
{
"options": list(V1_V2_RATIO_MAP.keys()),
"default": "1:1",
"tooltip": "The aspect ratio for image generation.",
},
),
"magic_prompt_option": (
IO.COMBO,
{
"options": ["AUTO", "ON", "OFF"],
"default": "AUTO",
"tooltip": "Determine if MagicPrompt should be used in generation",
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2147483647,
"step": 1,
"control_after_generate": True,
"display": "number",
},
),
"negative_prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Description of what to exclude from the image",
},
),
"num_images": (
IO.INT,
{"default": 1, "min": 1, "max": 8, "step": 1, "display": "number"},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = (IO.IMAGE,)
FUNCTION = "api_call"
CATEGORY = "api node/image/ideogram/v1"
DESCRIPTION = cleandoc(__doc__ or "")
API_NODE = True
def api_call(
self,
prompt,
turbo=False,
aspect_ratio="1:1",
magic_prompt_option="AUTO",
seed=0,
negative_prompt="",
num_images=1,
auth_token=None,
):
# Determine the model based on turbo setting
aspect_ratio = V1_V2_RATIO_MAP.get(aspect_ratio, None)
model = "V_1_TURBO" if turbo else "V_1"
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/ideogram/generate",
method=HttpMethod.POST,
request_model=IdeogramGenerateRequest,
response_model=IdeogramGenerateResponse,
),
request=IdeogramGenerateRequest(
image_request=ImageRequest(
prompt=prompt,
model=model,
num_images=num_images,
seed=seed,
aspect_ratio=aspect_ratio if aspect_ratio != "ASPECT_1_1" else None,
magic_prompt_option=(
magic_prompt_option if magic_prompt_option != "AUTO" else None
),
negative_prompt=negative_prompt if negative_prompt else None,
)
),
auth_token=auth_token,
)
response = operation.execute()
if not response.data or len(response.data) == 0:
raise Exception("No images were generated in the response")
image_urls = [image_data.url for image_data in response.data if image_data.url]
if not image_urls:
raise Exception("No image URLs were generated in the response")
return (download_and_process_images(image_urls),)
```
# Ideogram V2 - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/ideogram/ideogram-v2
A node that creates high-quality images and text rendering with the Ideogram V2 API
The Ideogram V2 node lets you use Ideogram's second-generation model to generate more refined images, with particular improvements in text rendering, image quality, and overall aesthetics.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ------ | --- | ----- | --------------------------- |
| prompt | String | "" | Text prompt describing what to generate |
| turbo | Boolean | False | Whether to use turbo mode (faster generation, potentially lower quality) |
### Optional Parameters
| Parameter | Type | Default | Description |
| --------------------- | --- | ------ | ----------------------------------------------------------------------------------- |
| aspect\_ratio | Choice | "1:1" | Image aspect ratio; effective when resolution is set to "Auto" |
| resolution | Choice | "Auto" | Output image resolution; if not set to "Auto", it overrides the aspect\_ratio setting |
| magic\_prompt\_option | Choice | "AUTO" | Whether MagicPrompt should be used during generation; options: \["AUTO", "ON", "OFF"] |
| seed | Integer | 0 | Random seed, range 0-2147483647 |
| style\_type | Choice | "NONE" | Generation style type (V2 only); options: \["AUTO", "GENERAL", "REALISTIC", "DESIGN", "RENDER\_3D", "ANIME"] |
| negative\_prompt | String | "" | Elements you do not want in the image |
| num\_images | Integer | 1 | Number of images to generate, range 1-8 |
### Outputs
| Output | Type | Description |
| ----- | -- | ------- |
| IMAGE | Image | The generated image |
## Source Reference
\[Node source code (updated 2025-05-03)]
```python
class IdeogramV2(ComfyNodeABC):
"""
Generates images synchronously using the Ideogram V2 model.
Images links are available for a limited period of time; if you would like to keep the image, you must download it.
"""
def __init__(self):
pass
@classmethod
def INPUT_TYPES(cls) -> InputTypeDict:
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the image generation",
},
),
"turbo": (
IO.BOOLEAN,
{
"default": False,
"tooltip": "Whether to use turbo mode (faster generation, potentially lower quality)",
}
),
},
"optional": {
"aspect_ratio": (
IO.COMBO,
{
"options": list(V1_V2_RATIO_MAP.keys()),
"default": "1:1",
"tooltip": "The aspect ratio for image generation. Ignored if resolution is not set to AUTO.",
},
),
"resolution": (
IO.COMBO,
{
"options": list(V1_V1_RES_MAP.keys()),
"default": "Auto",
"tooltip": "The resolution for image generation. If not set to AUTO, this overrides the aspect_ratio setting.",
},
),
"magic_prompt_option": (
IO.COMBO,
{
"options": ["AUTO", "ON", "OFF"],
"default": "AUTO",
"tooltip": "Determine if MagicPrompt should be used in generation",
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2147483647,
"step": 1,
"control_after_generate": True,
"display": "number",
},
),
"style_type": (
IO.COMBO,
{
"options": ["AUTO", "GENERAL", "REALISTIC", "DESIGN", "RENDER_3D", "ANIME"],
"default": "NONE",
"tooltip": "Style type for generation (V2 only)",
},
),
"negative_prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Description of what to exclude from the image",
},
),
"num_images": (
IO.INT,
{"default": 1, "min": 1, "max": 8, "step": 1, "display": "number"},
),
#"color_palette": (
# IO.STRING,
# {
# "multiline": False,
# "default": "",
# "tooltip": "Color palette preset name or hex colors with weights",
# },
#),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = (IO.IMAGE,)
FUNCTION = "api_call"
CATEGORY = "api node/image/ideogram/v2"
DESCRIPTION = cleandoc(__doc__ or "")
API_NODE = True
def api_call(
self,
prompt,
turbo=False,
aspect_ratio="1:1",
resolution="Auto",
magic_prompt_option="AUTO",
seed=0,
style_type="NONE",
negative_prompt="",
num_images=1,
color_palette="",
auth_token=None,
):
aspect_ratio = V1_V2_RATIO_MAP.get(aspect_ratio, None)
resolution = V1_V1_RES_MAP.get(resolution, None)
# Determine the model based on turbo setting
model = "V_2_TURBO" if turbo else "V_2"
# Handle resolution vs aspect_ratio logic
# If resolution is not AUTO, it overrides aspect_ratio
final_resolution = None
final_aspect_ratio = None
if resolution != "AUTO":
final_resolution = resolution
else:
final_aspect_ratio = aspect_ratio if aspect_ratio != "ASPECT_1_1" else None
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/ideogram/generate",
method=HttpMethod.POST,
request_model=IdeogramGenerateRequest,
response_model=IdeogramGenerateResponse,
),
request=IdeogramGenerateRequest(
image_request=ImageRequest(
prompt=prompt,
model=model,
num_images=num_images,
seed=seed,
aspect_ratio=final_aspect_ratio,
resolution=final_resolution,
magic_prompt_option=(
magic_prompt_option if magic_prompt_option != "AUTO" else None
),
style_type=style_type if style_type != "NONE" else None,
negative_prompt=negative_prompt if negative_prompt else None,
color_palette=color_palette if color_palette else None,
)
),
auth_token=auth_token,
)
response = operation.execute()
if not response.data or len(response.data) == 0:
raise Exception("No images were generated in the response")
image_urls = [image_data.url for image_data in response.data if image_data.url]
if not image_urls:
raise Exception("No image URLs were generated in the response")
return (download_and_process_images(image_urls),)
```
# Ideogram V3 - ComfyUI 原生节点文档
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/ideogram/ideogram-v3
使用Ideogram最新V3 API创建顶级质量图像和文字渲染的节点
此节点连接到Ideogram V3 API,来完成对应的图像生成任务。
目前此节点支持两种图像生成模式:
* **文生图模式** - 从纯文本提示词生成全新图像
* **局部重绘模式(Inpainting)** - 通过提供原始图像和遮罩来重新生成特定区域
### 文生图模式
这是默认模式,当没有提供图像和遮罩输入时激活。只需提供提示词和所需的参数:
1. 在提示词字段中描述你想要的图像
2. 选择适当的宽高比或分辨率
3. 调整其他参数如魔法提示、种子和渲染质量
4. 运行节点生成图像
### 局部重绘模式
**重要提示**:此模式要求同时提供图像和遮罩输入。如果只提供其中一个,节点将抛出错误。
1. 将原始图像连接到`image`输入端口
2. 创建一个与原图相同尺寸的遮罩,白色区域表示要重新生成的部分
3. 将遮罩连接到`mask`输入端口
4. 在提示词中描述要在遮罩区域生成的内容
5. 运行节点执行局部编辑
## 参数说明
### 基本参数
| 参数 | 类型 | 默认值 | 说明 |
| --------------------- | --- | ---------- | --------------------------- |
| prompt | 字符串 | "" | 描述要生成内容的文本提示词 |
| aspect\_ratio | 选择项 | "1:1" | 图像宽高比(仅文生图模式有效) |
| resolution | 选择项 | "Auto" | 图像分辨率,设置后会覆盖宽高比设置 |
| magic\_prompt\_option | 选择项 | "AUTO" | 魔法提示增强选项:AUTO、ON或OFF |
| seed | 整数 | 0 | 随机种子值,设为0则随机生成 |
| num\_images | 整数 | 1 | 生成图像数量(1-8) |
| rendering\_speed | 选择项 | "BALANCED" | 渲染速度:BALANCED、TURBO或QUALITY |
### 可选参数
| 参数 | 类型 | 说明 |
| ----- | -- | ------------------------------------ |
| image | 图像 | 用于局部重绘模式的输入图像(**必须与mask一起提供**) |
| mask | 遮罩 | 用于局部重绘的遮罩,白色区域将被替换(**必须与image一起提供**) |
### 输出
| 输出 | 类型 | 说明 |
| ----- | -- | ------- |
| IMAGE | 图像 | 生成的图像结果 |
## How It Works
The Ideogram V3 node processes user input with a state-of-the-art AI model that understands complex design intent and typography requirements. It supports two main modes:
1. **Generation mode**: creates a brand-new image from the text prompt
2. **Edit mode**: uses the original image plus a mask and replaces only the masked region
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class IdeogramV3(ComfyNodeABC):
"""
Generates images synchronously using the Ideogram V3 model.
Supports both regular image generation from text prompts and image editing with mask.
Images links are available for a limited period of time; if you would like to keep the image, you must download it.
"""
def __init__(self):
pass
@classmethod
def INPUT_TYPES(cls) -> InputTypeDict:
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the image generation or editing",
},
),
},
"optional": {
"image": (
IO.IMAGE,
{
"default": None,
"tooltip": "Optional reference image for image editing.",
},
),
"mask": (
IO.MASK,
{
"default": None,
"tooltip": "Optional mask for inpainting (white areas will be replaced)",
},
),
"aspect_ratio": (
IO.COMBO,
{
"options": list(V3_RATIO_MAP.keys()),
"default": "1:1",
"tooltip": "The aspect ratio for image generation. Ignored if resolution is not set to Auto.",
},
),
"resolution": (
IO.COMBO,
{
"options": V3_RESOLUTIONS,
"default": "Auto",
"tooltip": "The resolution for image generation. If not set to Auto, this overrides the aspect_ratio setting.",
},
),
"magic_prompt_option": (
IO.COMBO,
{
"options": ["AUTO", "ON", "OFF"],
"default": "AUTO",
"tooltip": "Determine if MagicPrompt should be used in generation",
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2147483647,
"step": 1,
"control_after_generate": True,
"display": "number",
},
),
"num_images": (
IO.INT,
{"default": 1, "min": 1, "max": 8, "step": 1, "display": "number"},
),
"rendering_speed": (
IO.COMBO,
{
"options": ["BALANCED", "TURBO", "QUALITY"],
"default": "BALANCED",
"tooltip": "Controls the trade-off between generation speed and quality",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = (IO.IMAGE,)
FUNCTION = "api_call"
CATEGORY = "api node/image/ideogram/v3"
DESCRIPTION = cleandoc(__doc__ or "")
API_NODE = True
def api_call(
self,
prompt,
image=None,
mask=None,
resolution="Auto",
aspect_ratio="1:1",
magic_prompt_option="AUTO",
seed=0,
num_images=1,
rendering_speed="BALANCED",
auth_token=None,
):
# Check if both image and mask are provided for editing mode
if image is not None and mask is not None:
# Edit mode
path = "/proxy/ideogram/ideogram-v3/edit"
# Process image and mask
input_tensor = image.squeeze().cpu()
# Validate mask dimensions match image
if mask.shape[1:] != image.shape[1:-1]:
raise Exception("Mask and Image must be the same size")
# Process image
img_np = (input_tensor.numpy() * 255).astype(np.uint8)
img = Image.fromarray(img_np)
img_byte_arr = io.BytesIO()
img.save(img_byte_arr, format="PNG")
img_byte_arr.seek(0)
img_binary = img_byte_arr
img_binary.name = "image.png"
# Process mask - white areas will be replaced
mask_np = (mask.squeeze().cpu().numpy() * 255).astype(np.uint8)
mask_img = Image.fromarray(mask_np)
mask_byte_arr = io.BytesIO()
mask_img.save(mask_byte_arr, format="PNG")
mask_byte_arr.seek(0)
mask_binary = mask_byte_arr
mask_binary.name = "mask.png"
# Create edit request
edit_request = IdeogramV3EditRequest(
prompt=prompt,
rendering_speed=rendering_speed,
)
# Add optional parameters
if magic_prompt_option != "AUTO":
edit_request.magic_prompt = magic_prompt_option
if seed != 0:
edit_request.seed = seed
if num_images > 1:
edit_request.num_images = num_images
# Execute the operation for edit mode
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=path,
method=HttpMethod.POST,
request_model=IdeogramV3EditRequest,
response_model=IdeogramGenerateResponse,
),
request=edit_request,
files={
"image": img_binary,
"mask": mask_binary,
},
content_type="multipart/form-data",
auth_token=auth_token,
)
elif image is not None or mask is not None:
# If only one of image or mask is provided, raise an error
raise Exception("Ideogram V3 image editing requires both an image AND a mask")
else:
# Generation mode
path = "/proxy/ideogram/ideogram-v3/generate"
# Create generation request
gen_request = IdeogramV3Request(
prompt=prompt,
rendering_speed=rendering_speed,
)
# Handle resolution vs aspect ratio
if resolution != "Auto":
gen_request.resolution = resolution
elif aspect_ratio != "1:1":
v3_aspect = V3_RATIO_MAP.get(aspect_ratio)
if v3_aspect:
gen_request.aspect_ratio = v3_aspect
# Add optional parameters
if magic_prompt_option != "AUTO":
gen_request.magic_prompt = magic_prompt_option
if seed != 0:
gen_request.seed = seed
if num_images > 1:
gen_request.num_images = num_images
# Execute the operation for generation mode
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=path,
method=HttpMethod.POST,
request_model=IdeogramV3Request,
response_model=IdeogramGenerateResponse,
),
request=gen_request,
auth_token=auth_token,
)
# Execute the operation and process response
response = operation.execute()
if not response.data or len(response.data) == 0:
raise Exception("No images were generated in the response")
image_urls = [image_data.url for image_data in response.data if image_data.url]
if not image_urls:
raise Exception("No image URLs were generated in the response")
return (download_and_process_images(image_urls),)
```
# Luma Image to Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/luma/luma-image-to-image
A node that modifies images using Luma AI.
The Luma Image to Image node lets you modify an existing image based on a text prompt using Luma AI's technology, while preserving certain features and structure of the original image.
## Node Functionality
This node connects to Luma AI's image modification API, letting users reshape an existing image with a detailed text prompt. Luma AI is known for its excellent realism and detail, and is particularly good at photorealistic content and artistic styles.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------------ | ------- | --- | ------------------------------------------------------------------------------------------------------------ |
| image | image | - | The input image to modify |
| prompt | string | "" | Text prompt describing the desired modification |
| image_weight | float | 1.0 | Weight of the input image (0.02-1.0); the closer to 0, the less the image is modified |
| model | combo | - | The generation model to use |
| seed | integer | 0 | Seed used to decide whether the node should re-run; the actual result is nondeterministic regardless of seed |
### Output
| Output | Type | Description |
| ----- | ----- | ----------------------- |
| IMAGE | image | The generated image(s) |
## Example Usage
## How It Works
The Luma Image to Image node analyzes the input image and combines it with the text prompt to guide the modification. It uses Luma AI's generative model to produce creative variations of the image driven by the prompt.
Node workflow:
1. Upload the input image to ComfyAPI
2. Send the image URL together with the prompt to the Luma API
3. Wait for Luma AI to finish processing
4. Download and return the generated image
The image_weight parameter controls how strongly the original image shapes the result: values close to 0 preserve more of the original image's features, while values close to 1 allow larger modifications.
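Before sending the request, the node rounds the weight to two decimals (`round(image_weight, 2)` in the source). A minimal sketch of that preprocessing, with the clamping to the UI range 0.02-1.0 added here as an assumption (the widget normally enforces it):

```python
def prepare_image_weight(image_weight: float) -> float:
    """Clamp the weight to the UI range (0.02-1.0) and round to two
    decimals, mirroring what the node sends to the Luma API."""
    clamped = min(max(image_weight, 0.02), 1.0)
    return round(clamped, 2)
```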
## Source Code Reference
[Node source code (updated 2025-05-05)]
```python
class LumaImageModifyNode(ComfyNodeABC):
"""
Modifies images synchronously based on prompt and aspect ratio.
"""
RETURN_TYPES = (IO.IMAGE,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/image/Luma"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"image": (IO.IMAGE,),
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the image generation",
},
),
"image_weight": (
IO.FLOAT,
{
"default": 1.0,
"min": 0.02,
"max": 1.0,
"step": 0.01,
"tooltip": "Weight of the image; the closer to 0.0, the less the image will be modified.",
},
),
"model": ([model.value for model in LumaImageModel],),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
},
),
},
"optional": {},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(
self,
prompt: str,
model: str,
image: torch.Tensor,
image_weight: float,
seed,
auth_token=None,
**kwargs,
):
# first, upload image
download_urls = upload_images_to_comfyapi(
image, max_images=1, auth_token=auth_token
)
image_url = download_urls[0]
# next, make Luma call with download url provided
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/luma/generations/image",
method=HttpMethod.POST,
request_model=LumaImageGenerationRequest,
response_model=LumaGeneration,
),
request=LumaImageGenerationRequest(
prompt=prompt,
model=model,
modify_image_ref=LumaModifyImageRef(
url=image_url, weight=round(image_weight, 2)
),
),
auth_token=auth_token,
)
response_api: LumaGeneration = operation.execute()
operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"/proxy/luma/generations/{response_api.id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=LumaGeneration,
),
completed_statuses=[LumaState.completed],
failed_statuses=[LumaState.failed],
status_extractor=lambda x: x.state,
auth_token=auth_token,
)
response_poll = operation.execute()
img_response = requests.get(response_poll.assets.image)
img = process_image_response(img_response)
return (img,)
```
# Luma Reference - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/luma/luma-reference
A helper node that provides reference images for Luma image generation.
The Luma Reference node lets you set a reference image and its weight to guide the creative process of Luma image generation nodes, pulling the generated image toward specific features of the reference.
## Node Functionality
This node is a helper for Luma generation nodes, letting users provide a reference image that influences the generated result. The weight setting controls how strongly the reference image affects the final output.
Multiple Luma Reference nodes can be chained; per the corresponding API limits, at most 4 can be chained at a time.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------ | ----- | --- | ------------------------------------------------ |
| image | image | - | The input image used as a reference |
| weight | float | 1.0 | Strength of the reference image's influence (0-1) |
### Output
| Output | Type | Description |
| -------- | -------- | ------------------------------------------------ |
| luma_ref | LUMA_REF | Reference object containing the image and weight |
## Example Usage
Luma Text to Image workflow example
## How It Works
The Luma Reference node takes an image input and lets you set a weight value. It does not generate or modify images itself; instead it creates a reference object containing the image data and weight, which is then passed on to a Luma generation node.
During generation, Luma AI analyzes the reference image's features and blends them into the result according to the configured weight. Higher weights make the generated image follow the reference more closely, while lower weights mean the reference only slightly influences the final result.
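The chaining behavior in the source (`clone()` then `add(...)`) can be sketched with a minimal stand-in for `LumaReferenceChain`. Class and function names here are illustrative, not the real API; the point is that each node copies the incoming chain before appending, so upstream nodes are never mutated.

```python
import copy


class ReferenceChain:
    """Minimal stand-in for LumaReferenceChain (names illustrative)."""

    def __init__(self):
        self.refs = []

    def clone(self):
        # Deep copy so downstream nodes cannot mutate upstream chains.
        return copy.deepcopy(self)

    def add(self, image, weight):
        # The real node rounds the weight to two decimals before storing.
        self.refs.append({"image": image, "weight": round(weight, 2)})


def create_reference(image, weight, chain=None):
    """Mirror of create_luma_reference: clone the incoming chain (or
    start a new one) and append this node's (image, weight) entry."""
    chain = chain.clone() if chain is not None else ReferenceChain()
    chain.add(image, weight)
    return chain
```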
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class LumaReferenceNode(ComfyNodeABC):
"""
Holds an image and weight for use with Luma Generate Image node.
"""
RETURN_TYPES = (LumaIO.LUMA_REF,)
RETURN_NAMES = ("luma_ref",)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "create_luma_reference"
CATEGORY = "api node/image/Luma"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"image": (
IO.IMAGE,
{
"tooltip": "Image to use as reference.",
},
),
"weight": (
IO.FLOAT,
{
"default": 1.0,
"min": 0.0,
"max": 1.0,
"step": 0.01,
"tooltip": "Weight of image reference.",
},
),
},
"optional": {"luma_ref": (LumaIO.LUMA_REF,)},
}
def create_luma_reference(
self, image: torch.Tensor, weight: float, luma_ref: LumaReferenceChain = None
):
if luma_ref is not None:
luma_ref = luma_ref.clone()
else:
luma_ref = LumaReferenceChain()
luma_ref.add(LumaReference(image=image, weight=round(weight, 2)))
return (luma_ref,)
```
# Luma Text to Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/luma/luma-text-to-image
A node that turns text descriptions into high-quality images using Luma AI.
The Luma Text to Image node lets you use Luma AI's advanced image generation to create highly realistic and artistic images from text descriptions.
## Node Functionality
This node connects to Luma AI's text-to-image API, letting users generate images from detailed text prompts. Luma AI is known for its excellent realism and detail, and is particularly good at photorealistic content and artistic styles.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------------------ | ------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------ |
| prompt | string | "" | Text prompt describing what to generate |
| model | combo | - | The generation model to use |
| aspect_ratio | combo | 16:9 | Aspect ratio of the output image |
| seed | integer | 0 | Seed used to decide whether the node should re-run; the actual result is nondeterministic regardless of seed |
| style_image_weight | float | 1.0 | Weight of the style image (0.0-1.0); only takes effect when style_image is provided; higher values follow the style reference more strongly |
### Optional Parameters
| Parameter | Type | Description |
| --------------- | -------- | ---------------------------------------------------------------------------------------------------------- |
| image_luma_ref | LUMA_REF | Luma Reference node connection that influences generation with input images; up to 4 images are considered |
| style_image | image | Style reference image; only 1 image is used |
| character_image | image | Character reference images; can be a batch of multiple, up to 4 images are considered |
### Output
| Output | Type | Description |
| ----- | ----- | ----------------------- |
| IMAGE | image | The generated image(s) |
## Example Usage
Luma Text to Image detailed workflow guide
## How It Works
The Luma Text to Image node analyzes the user's text prompt and creates a matching image with Luma AI's generative model, using deep learning to translate the text description into a visual result. Users can fine-tune the generation process through the node's parameters, including the model choice, aspect ratio, and reference weights.
In addition, the node supports reference, style, and character images to further steer the result, letting creators realize their vision more precisely.
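Like the other Luma nodes, this one submits a generation request and then polls until the job reaches a terminal state (see `PollingOperation` with `completed_statuses` / `failed_statuses` in the source). A minimal sketch of that loop, with generic state names standing in for `LumaState` and a callable standing in for the HTTP poll:

```python
import time


def poll_generation(fetch_state, interval=0.0, max_attempts=10):
    """Poll a generation job until it completes or fails.

    fetch_state is any callable returning one of "queued", "completed",
    or "failed" (state names assumed to mirror LumaState). interval is
    the delay between polls in seconds.
    """
    for _ in range(max_attempts):
        state = fetch_state()
        if state == "completed":
            return "completed"
        if state == "failed":
            raise RuntimeError("generation failed")
        time.sleep(interval)  # job still running; wait and retry
    raise TimeoutError("generation did not finish in time")
```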
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class LumaImageGenerationNode(ComfyNodeABC):
"""
Generates images synchronously based on prompt and aspect ratio.
"""
RETURN_TYPES = (IO.IMAGE,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/image/Luma"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the image generation",
},
),
"model": ([model.value for model in LumaImageModel],),
"aspect_ratio": (
[ratio.value for ratio in LumaAspectRatio],
{
"default": LumaAspectRatio.ratio_16_9,
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
},
),
"style_image_weight": (
IO.FLOAT,
{
"default": 1.0,
"min": 0.0,
"max": 1.0,
"step": 0.01,
"tooltip": "Weight of style image. Ignored if no style_image provided.",
},
),
},
"optional": {
"image_luma_ref": (
LumaIO.LUMA_REF,
{
"tooltip": "Luma Reference node connection to influence generation with input images; up to 4 images can be considered."
},
),
"style_image": (
IO.IMAGE,
{"tooltip": "Style reference image; only 1 image will be used."},
),
"character_image": (
IO.IMAGE,
{
"tooltip": "Character reference images; can be a batch of multiple, up to 4 images can be considered."
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(
self,
prompt: str,
model: str,
aspect_ratio: str,
seed,
style_image_weight: float,
image_luma_ref: LumaReferenceChain = None,
style_image: torch.Tensor = None,
character_image: torch.Tensor = None,
auth_token=None,
**kwargs,
):
# handle image_luma_ref
api_image_ref = None
if image_luma_ref is not None:
api_image_ref = self._convert_luma_refs(
image_luma_ref, max_refs=4, auth_token=auth_token
)
# handle style_luma_ref
api_style_ref = None
if style_image is not None:
api_style_ref = self._convert_style_image(
style_image, weight=style_image_weight, auth_token=auth_token
)
# handle character_ref images
character_ref = None
if character_image is not None:
download_urls = upload_images_to_comfyapi(
character_image, max_images=4, auth_token=auth_token
)
character_ref = LumaCharacterRef(
identity0=LumaImageIdentity(images=download_urls)
)
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/luma/generations/image",
method=HttpMethod.POST,
request_model=LumaImageGenerationRequest,
response_model=LumaGeneration,
),
request=LumaImageGenerationRequest(
prompt=prompt,
model=model,
aspect_ratio=aspect_ratio,
image_ref=api_image_ref,
style_ref=api_style_ref,
character_ref=character_ref,
),
auth_token=auth_token,
)
response_api: LumaGeneration = operation.execute()
operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"/proxy/luma/generations/{response_api.id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=LumaGeneration,
),
completed_statuses=[LumaState.completed],
failed_statuses=[LumaState.failed],
status_extractor=lambda x: x.state,
auth_token=auth_token,
)
response_poll = operation.execute()
img_response = requests.get(response_poll.assets.image)
img = process_image_response(img_response)
return (img,)
def _convert_luma_refs(
self, luma_ref: LumaReferenceChain, max_refs: int, auth_token=None
):
luma_urls = []
ref_count = 0
for ref in luma_ref.refs:
download_urls = upload_images_to_comfyapi(
ref.image, max_images=1, auth_token=auth_token
)
luma_urls.append(download_urls[0])
ref_count += 1
if ref_count >= max_refs:
break
return luma_ref.create_api_model(download_urls=luma_urls, max_refs=max_refs)
def _convert_style_image(
self, style_image: torch.Tensor, weight: float, auth_token=None
):
chain = LumaReferenceChain(
first_ref=LumaReference(image=style_image, weight=weight)
)
return self._convert_luma_refs(chain, max_refs=1, auth_token=auth_token)
```
# OpenAI DALL·E 2 - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/openai/openai-dalle2
A node that generates images with OpenAI's DALL·E 2 model.
The OpenAI DALL·E 2 node lets you generate creative images from text descriptions through OpenAI's DALL·E 2 API.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------ | ------- | ----------- | ------------------------------------------------------------------------------------------------ |
| prompt | string | "" | Text prompt used by DALL·E for image generation; supports multiline input |
| seed | integer | 0 | The actual result is unrelated to the seed; this parameter only decides whether the node re-runs |
| size | combo | "1024x1024" | Output image size; options: 256x256, 512x512, 1024x1024 |
| n | integer | 1 | Number of images to generate (1-8) |
### Optional Parameters
| Parameter | Type | Default | Description |
| ----- | ----- | ---- | ----------------------------------------------------------- |
| image | image | None | Optional reference image for image editing |
| mask | mask | None | Optional mask for inpainting (white areas will be replaced) |
### Output
| Output | Type | Description |
| ----- | ----- | ----------------------- |
| IMAGE | image | The generated image(s) |
## Functionality
* Basic: generates images from a text prompt
* Image editing: when both image and mask are provided, the node performs an edit operation (white mask areas are replaced)
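In edit mode, the node builds an RGBA upload where the alpha channel is `1 - mask`, so white (1.0) mask pixels become transparent, and transparent pixels are what DALL·E 2 repaints. A NumPy sketch of that preparation (the node itself works on torch tensors; the function name is illustrative):

```python
import numpy as np


def build_rgba_for_edit(image_rgb, mask):
    """Extend an RGB image with an alpha channel derived from the mask.

    image_rgb: (H, W, 3) float array in [0, 1].
    mask:      (H, W) float array in [0, 1]; white (1.0) marks regions
               to repaint, which end up transparent (alpha 0).
    """
    if mask.shape != image_rgb.shape[:2]:
        raise ValueError("Mask and Image must be the same size")
    h, w, _ = image_rgb.shape
    rgba = np.ones((h, w, 4), dtype=np.float32)
    rgba[:, :, :3] = image_rgb     # copy the color channels
    rgba[:, :, 3] = 1.0 - mask     # invert mask into the alpha channel
    return rgba
```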
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class OpenAIDalle2(ComfyNodeABC):
"""
Generates images synchronously via OpenAI's DALL·E 2 endpoint.
Uses the proxy at /proxy/openai/images/generations. Returned URLs are short‑lived,
so download or cache results if you need to keep them.
"""
def __init__(self):
pass
@classmethod
def INPUT_TYPES(cls) -> InputTypeDict:
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Text prompt for DALL·E",
},
),
},
"optional": {
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2**31 - 1,
"step": 1,
"display": "number",
"control_after_generate": True,
"tooltip": "not implemented yet in backend",
},
),
"size": (
IO.COMBO,
{
"options": ["256x256", "512x512", "1024x1024"],
"default": "1024x1024",
"tooltip": "Image size",
},
),
"n": (
IO.INT,
{
"default": 1,
"min": 1,
"max": 8,
"step": 1,
"display": "number",
"tooltip": "How many images to generate",
},
),
"image": (
IO.IMAGE,
{
"default": None,
"tooltip": "Optional reference image for image editing.",
},
),
"mask": (
IO.MASK,
{
"default": None,
"tooltip": "Optional mask for inpainting (white areas will be replaced)",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = (IO.IMAGE,)
FUNCTION = "api_call"
CATEGORY = "api node/image/openai"
DESCRIPTION = cleandoc(__doc__ or "")
API_NODE = True
def api_call(
self,
prompt,
seed=0,
image=None,
mask=None,
n=1,
size="1024x1024",
auth_token=None,
):
model = "dall-e-2"
path = "/proxy/openai/images/generations"
content_type = "application/json"
request_class = OpenAIImageGenerationRequest
img_binary = None
if image is not None and mask is not None:
path = "/proxy/openai/images/edits"
content_type = "multipart/form-data"
request_class = OpenAIImageEditRequest
input_tensor = image.squeeze().cpu()
height, width, channels = input_tensor.shape
rgba_tensor = torch.ones(height, width, 4, device="cpu")
rgba_tensor[:, :, :channels] = input_tensor
if mask.shape[1:] != image.shape[1:-1]:
raise Exception("Mask and Image must be the same size")
rgba_tensor[:, :, 3] = 1 - mask.squeeze().cpu()
rgba_tensor = downscale_image_tensor(rgba_tensor.unsqueeze(0)).squeeze()
image_np = (rgba_tensor.numpy() * 255).astype(np.uint8)
img = Image.fromarray(image_np)
img_byte_arr = io.BytesIO()
img.save(img_byte_arr, format="PNG")
img_byte_arr.seek(0)
img_binary = img_byte_arr # .getvalue()
img_binary.name = "image.png"
elif image is not None or mask is not None:
raise Exception("Dall-E 2 image editing requires an image AND a mask")
# Build the operation
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=path,
method=HttpMethod.POST,
request_model=request_class,
response_model=OpenAIImageGenerationResponse,
),
request=request_class(
model=model,
prompt=prompt,
n=n,
size=size,
seed=seed,
),
files=(
{
"image": img_binary,
}
if img_binary
else None
),
content_type=content_type,
auth_token=auth_token,
)
response = operation.execute()
img_tensor = validate_and_cast_response(response)
return (img_tensor,)
```
# OpenAI DALL·E 3 - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/openai/openai-dalle3
A node that generates high-quality images with OpenAI's DALL·E 3 model.
This node connects to OpenAI's DALL·E 3 API, enabling users to generate high-quality images from detailed text prompts. DALL·E 3 is OpenAI's image generation model, offering significantly improved image quality, more accurate prompt understanding, and better detail than its predecessors.
## Parameters
### Input Parameters
| Parameter | Type | Default | Description |
| ------- | ------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| prompt | string | "" | Text prompt describing in detail what to generate |
| seed | integer | 0 | The final result is unrelated to the seed; this parameter only decides whether the node re-runs |
| quality | combo | "standard" | Image quality; options: "standard" or "hd" |
| style | combo | "natural" | Visual style; options: "natural" or "vivid". "vivid" leans toward hyper-real, dramatic images; "natural" produces more natural, less hyper-real results |
| size | combo | "1024x1024" | Output image size; options: "1024x1024", "1024x1792", or "1792x1024" |
### Output
| Output | Type | Description |
| ----- | ----- | ------------------- |
| IMAGE | image | The generated image |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class OpenAIDalle3(ComfyNodeABC):
"""
Generates images synchronously via OpenAI's DALL·E 3 endpoint.
Uses the proxy at /proxy/openai/images/generations. Returned URLs are short‑lived,
so download or cache results if you need to keep them.
"""
def __init__(self):
pass
@classmethod
def INPUT_TYPES(cls) -> InputTypeDict:
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Text prompt for DALL·E",
},
),
},
"optional": {
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2**31 - 1,
"step": 1,
"display": "number",
"control_after_generate": True,
"tooltip": "not implemented yet in backend",
},
),
"quality": (
IO.COMBO,
{
"options": ["standard", "hd"],
"default": "standard",
"tooltip": "Image quality",
},
),
"style": (
IO.COMBO,
{
"options": ["natural", "vivid"],
"default": "natural",
"tooltip": "Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images.",
},
),
"size": (
IO.COMBO,
{
"options": ["1024x1024", "1024x1792", "1792x1024"],
"default": "1024x1024",
"tooltip": "Image size",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = (IO.IMAGE,)
FUNCTION = "api_call"
CATEGORY = "api node/image/openai"
DESCRIPTION = cleandoc(__doc__ or "")
API_NODE = True
def api_call(
self,
prompt,
seed=0,
style="natural",
quality="standard",
size="1024x1024",
auth_token=None,
):
model = "dall-e-3"
# build the operation
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/openai/images/generations",
method=HttpMethod.POST,
request_model=OpenAIImageGenerationRequest,
response_model=OpenAIImageGenerationResponse,
),
request=OpenAIImageGenerationRequest(
model=model,
prompt=prompt,
quality=quality,
size=size,
style=style,
seed=seed,
),
auth_token=auth_token,
)
response = operation.execute()
img_tensor = validate_and_cast_response(response)
return (img_tensor,)
```
# OpenAI GPT Image 1 - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/openai/openai-gpt-image1
A node that generates images with OpenAI's GPT Image 1 model.
This node connects to OpenAI's GPT Image 1 API, letting users generate images from detailed text prompts. Unlike the traditional DALL·E models, GPT Image 1 draws on GPT-4-level language understanding, so it can handle more complex, context-rich prompts and produce images that better match user intent.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------- | ------ | ------ | ------------------------------------------------------------------------- |
| prompt | string | "" | Text prompt describing in detail what to generate |
| quality | combo | "low" | Image quality level; options: "low", "medium", "high" |
| size | combo | "auto" | Output image size; options: "auto", "1024x1024", "1024x1536", "1536x1024" |
### Image Editing Parameters
| Parameter | Type | Description |
| ----- | ----- | ---------------------------------------------------------------------------------------------------- |
| image | image | Input image(s) for image editing; supports a batch of multiple images |
| mask | mask | Optional mask marking the region to modify; when a mask is used, only a single input image is allowed |
### Optional Parameters
| Parameter | Type | Description |
| ---------- | ------- | -------------------------------------------------------------- |
| background | combo | Background handling; options: "opaque" or "transparent" |
| seed | integer | Random seed for generation; not yet implemented in the backend |
| n | integer | Number of images to generate (1-8) |
### Output
| Output | Type | Description |
| ----- | ----- | ----------------------- |
| IMAGE | image | The generated image(s) |
## How It Works
The OpenAI GPT Image 1 node combines GPT-4-level language understanding with image generation. It first analyzes the user's text prompt to understand its semantics and intent, then generates an image matching the description.
When an input image is provided, the node can work in image-editing mode, modifying the existing image. By additionally supplying a mask, users can control precisely which regions should change and which should stay intact. Note that when a mask is used, only a single input image may be provided.
Users can steer the result through parameters such as quality level, size, background handling, and the number of generated images.
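The batch and mask rules above show up in how the node assembles its multipart form fields: a single image is sent under the key `"image"`, a batch under repeated `"image[]"` keys, and a mask only alongside a single-image batch. A sketch using filenames in place of the actual byte streams (the function name is illustrative):

```python
def build_edit_files(image_names, mask_name=None):
    """Assemble (field_name, file) pairs for a GPT Image 1 edit call.

    image_names: list of file names standing in for PNG byte streams.
    mask_name:   optional mask file; only valid with exactly one image.
    """
    if mask_name is not None and len(image_names) != 1:
        raise ValueError("Cannot use a mask with multiple images")
    # Single image -> "image"; batch -> repeated "image[]" fields.
    key = "image" if len(image_names) == 1 else "image[]"
    files = [(key, name) for name in image_names]
    if mask_name is not None:
        files.append(("mask", mask_name))
    return files
```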
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class OpenAIGPTImage1(ComfyNodeABC):
"""
Generates images synchronously via OpenAI's GPT Image 1 endpoint.
Uses the proxy at /proxy/openai/images/generations. Returned URLs are short‑lived,
so download or cache results if you need to keep them.
"""
def __init__(self):
pass
@classmethod
def INPUT_TYPES(cls) -> InputTypeDict:
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Text prompt for GPT Image 1",
},
),
},
"optional": {
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2**31 - 1,
"step": 1,
"display": "number",
"control_after_generate": True,
"tooltip": "not implemented yet in backend",
},
),
"quality": (
IO.COMBO,
{
"options": ["low", "medium", "high"],
"default": "low",
"tooltip": "Image quality, affects cost and generation time.",
},
),
"background": (
IO.COMBO,
{
"options": ["opaque", "transparent"],
"default": "opaque",
"tooltip": "Return image with or without background",
},
),
"size": (
IO.COMBO,
{
"options": ["auto", "1024x1024", "1024x1536", "1536x1024"],
"default": "auto",
"tooltip": "Image size",
},
),
"n": (
IO.INT,
{
"default": 1,
"min": 1,
"max": 8,
"step": 1,
"display": "number",
"tooltip": "How many images to generate",
},
),
"image": (
IO.IMAGE,
{
"default": None,
"tooltip": "Optional reference image for image editing.",
},
),
"mask": (
IO.MASK,
{
"default": None,
"tooltip": "Optional mask for inpainting (white areas will be replaced)",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = (IO.IMAGE,)
FUNCTION = "api_call"
CATEGORY = "api node/image/openai"
DESCRIPTION = cleandoc(__doc__ or "")
API_NODE = True
def api_call(
self,
prompt,
seed=0,
quality="low",
background="opaque",
image=None,
mask=None,
n=1,
size="1024x1024",
auth_token=None,
):
model = "gpt-image-1"
path = "/proxy/openai/images/generations"
content_type="application/json"
request_class = OpenAIImageGenerationRequest
img_binaries = []
mask_binary = None
files = []
if image is not None:
path = "/proxy/openai/images/edits"
request_class = OpenAIImageEditRequest
content_type ="multipart/form-data"
batch_size = image.shape[0]
for i in range(batch_size):
single_image = image[i : i + 1]
scaled_image = downscale_image_tensor(single_image).squeeze()
image_np = (scaled_image.numpy() * 255).astype(np.uint8)
img = Image.fromarray(image_np)
img_byte_arr = io.BytesIO()
img.save(img_byte_arr, format="PNG")
img_byte_arr.seek(0)
img_binary = img_byte_arr
img_binary.name = f"image_{i}.png"
img_binaries.append(img_binary)
if batch_size == 1:
files.append(("image", img_binary))
else:
files.append(("image[]", img_binary))
if mask is not None:
if image.shape[0] != 1:
raise Exception("Cannot use a mask with multiple image")
if image is None:
raise Exception("Cannot use a mask without an input image")
if mask.shape[1:] != image.shape[1:-1]:
raise Exception("Mask and Image must be the same size")
batch, height, width = mask.shape
rgba_mask = torch.zeros(height, width, 4, device="cpu")
rgba_mask[:, :, 3] = 1 - mask.squeeze().cpu()
scaled_mask = downscale_image_tensor(rgba_mask.unsqueeze(0)).squeeze()
mask_np = (scaled_mask.numpy() * 255).astype(np.uint8)
mask_img = Image.fromarray(mask_np)
mask_img_byte_arr = io.BytesIO()
mask_img.save(mask_img_byte_arr, format="PNG")
mask_img_byte_arr.seek(0)
mask_binary = mask_img_byte_arr
mask_binary.name = "mask.png"
files.append(("mask", mask_binary))
# Build the operation
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=path,
method=HttpMethod.POST,
request_model=request_class,
response_model=OpenAIImageGenerationResponse,
),
request=request_class(
model=model,
prompt=prompt,
quality=quality,
background=background,
n=n,
seed=seed,
size=size,
),
files=files if files else None,
content_type=content_type,
auth_token=auth_token,
)
response = operation.execute()
img_tensor = validate_and_cast_response(response)
return (img_tensor,)
```
# Recraft Color RGB - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-color-rgb
A helper node that defines color controls for Recraft image generation.
The Recraft Color RGB node lets you define precise RGB color values to control which colors are used during Recraft image generation.
## Node Functionality
This node creates a color configuration object that can be connected to the Recraft Controls node to specify the colors the generated image should use.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| --- | ------- | --- | --------------------------- |
| r | integer | 0 | Red channel value (0-255) |
| g | integer | 0 | Green channel value (0-255) |
| b | integer | 0 | Blue channel value (0-255) |
### Output
| Output | Type | Description |
| ------------- | ------------- | ------------------------------------------------------------------ |
| recraft_color | Recraft Color | Color configuration object; connects to the Recraft Controls node |
## Example Usage
Recraft Text to Image workflow example
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class RecraftColorRGBNode:
"""
Create Recraft Color by choosing specific RGB values.
"""
RETURN_TYPES = (RecraftIO.COLOR,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
RETURN_NAMES = ("recraft_color",)
FUNCTION = "create_color"
CATEGORY = "api node/image/Recraft"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"r": (IO.INT, {
"default": 0,
"min": 0,
"max": 255,
"tooltip": "Red value of color."
}),
"g": (IO.INT, {
"default": 0,
"min": 0,
"max": 255,
"tooltip": "Green value of color."
}),
"b": (IO.INT, {
"default": 0,
"min": 0,
"max": 255,
"tooltip": "Blue value of color."
}),
},
"optional": {
"recraft_color": (RecraftIO.COLOR,),
}
}
def create_color(self, r: int, g: int, b: int, recraft_color: RecraftColorChain=None):
recraft_color = recraft_color.clone() if recraft_color else RecraftColorChain()
recraft_color.add(RecraftColor(r, g, b))
return (recraft_color, )
```
# Recraft Controls - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-controls
A node that provides advanced control parameters for Recraft image generation.
The Recraft Controls node lets you define a set of control parameters (such as color and background-color guidance) to precisely steer Recraft's image generation. It bundles multiple control inputs into one unified control object.
## Parameters
### Optional Parameters
| Parameter | Type | Description |
| ---------------- | ------------- | -------------------------------------- |
| colors | Recraft Color | Color controls for the generated image |
| background_color | Recraft Color | Background color control |
### Output
| Output | Type | Description |
| ---------------- | ---------------- | ------------------------------------------------------------------ |
| recraft_controls | Recraft Controls | Control configuration object; connects to Recraft generation nodes |
## Example Usage
Recraft Text to Image workflow example
## How It Works
Node processing flow:
1. Collect the control inputs (colors and background_color)
2. Bundle these parameters into one structured control object
3. Output the control object, which can be connected to any Recraft generation node
When connected to a Recraft generation node, these controls influence the AI's generation process, letting it take more than just the semantics of the text prompt into account. If color inputs are configured, the AI tries to use those colors sensibly in the generated image.
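The flow above is intentionally simple: as the node source shows, `create_controls` just bundles its inputs into one object. A dict-based sketch of the same behavior (the real node returns a `RecraftControls` model, not a dict, and the tuple color values here are illustrative):

```python
def create_controls(colors=None, background_color=None):
    """Bundle optional color inputs into a single control object,
    mirroring the Recraft Controls node. Unconnected inputs stay None
    and are simply ignored downstream."""
    return {"colors": colors, "background_color": background_color}
```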
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class RecraftControlsNode:
"""
Create Recraft Controls for customizing Recraft generation.
"""
RETURN_TYPES = (RecraftIO.CONTROLS,)
RETURN_NAMES = ("recraft_controls",)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "create_controls"
CATEGORY = "api node/image/Recraft"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
},
"optional": {
"colors": (RecraftIO.COLOR,),
"background_color": (RecraftIO.COLOR,),
}
}
def create_controls(self, colors: RecraftColorChain=None, background_color: RecraftColorChain=None):
return (RecraftControls(colors=colors, background_color=background_color), )
```
# Recraft Creative Upscale - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-creative-upscale
A Recraft API node that creatively enhances image detail and resolution with AI.
The Recraft Creative Upscale node uses the Recraft API to increase image resolution while creatively enhancing and enriching image detail.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ----- | ----- | --- | ------------------------------------- |
| image | image | - | The input image to creatively upscale |
### Output
| Output | Type | Description |
| ----- | ----- | ---------------------------------------------- |
| IMAGE | image | The creatively upscaled, high-resolution image |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class RecraftCreativeUpscaleNode(RecraftCrispUpscaleNode):
"""
Upscale image synchronously.
Enhances a given raster image using ‘creative upscale’ tool, boosting resolution with a focus on refining small details and faces.
"""
RETURN_TYPES = (IO.IMAGE,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/image/Recraft"
RECRAFT_PATH = "/proxy/recraft/images/creativeUpscale"
```
# Recraft Crisp Upscale - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-crisp-upscale
A Recraft API node that increases image sharpness and resolution with AI.
The Recraft Crisp Upscale node uses the Recraft API to enhance image resolution and sharpness.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ----- | ----- | --- | -------------------------- |
| image | image | - | The input image to upscale |
### Output
| Output | Type | Description |
| ----- | ----- | ------------------------------- |
| IMAGE | image | The upscaled and enhanced image |
## 源码参考
\[节点源码 (更新于2025-05-03)]
```python
class RecraftCrispUpscaleNode:
    """
    Upscale image synchronously.
    Enhances a given raster image using ‘crisp upscale’ tool, increasing image resolution, making the image sharper and cleaner.
    """

    RETURN_TYPES = (IO.IMAGE,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"
    RECRAFT_PATH = "/proxy/recraft/images/crispUpscale"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "image": (IO.IMAGE, ),
            },
            "optional": {},
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        image: torch.Tensor,
        auth_token=None,
        **kwargs,
    ):
        images = []
        total = image.shape[0]
        pbar = ProgressBar(total)
        for i in range(total):
            sub_bytes = handle_recraft_file_request(
                image=image[i],
                path=self.RECRAFT_PATH,
                auth_token=auth_token,
            )
            images.append(torch.cat([bytesio_to_image_tensor(x) for x in sub_bytes], dim=0))
            pbar.update(1)
        images_tensor = torch.cat(images, dim=0)
        return (images_tensor,)
```
# Recraft Image Inpainting - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-image-inpainting
Selectively modify image regions with the Recraft API
The Recraft Image Inpainting node lets you selectively modify specific regions of an image while leaving the rest untouched. Given an image, a mask, and a text prompt, it generates new content precisely within the masked area.
## Parameters
### Basic Parameters
| Parameter | Type    | Default | Description                                               |
| --------- | ------- | ------- | --------------------------------------------------------- |
| image     | Image   | -       | Input image to modify                                     |
| mask      | Mask    | -       | Black-and-white mask defining the region to modify        |
| prompt    | String  | ""      | Description of the content to generate in the masked area |
| n         | Integer | 1       | Number of results to generate (1-6)                       |
| seed      | Integer | 0       | Random seed value                                         |
### Optional Parameters
| Parameter          | Type             | Description                                                 |
| ------------------ | ---------------- | ----------------------------------------------------------- |
| recraft\_style     | Recraft Style    | Style of the generated content                              |
| negative\_prompt   | String           | Elements you do not want to appear in the generated content |
| recraft\_controls  | Recraft Controls | Additional control parameters (colors, etc.)                |
### Output
| Output | Type  | Description           |
| ------ | ----- | --------------------- |
| IMAGE  | Image | Modified image result |
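Before the mask reaches the API, the node resizes it to the image's dimensions with nearest-neighbor sampling and binarizes it at a 0.5 threshold. A sketch of that preparation step using NumPy instead of the torch/`common_upscale` pipeline the node actually uses:

```python
import numpy as np


def prepare_mask(mask: np.ndarray, height: int, width: int) -> np.ndarray:
    """Nearest-neighbor resize a (h, w) float mask to (height, width),
    then binarize at 0.5 -- mirroring the node's resize-and-threshold step."""
    h, w = mask.shape
    rows = np.arange(height) * h // height  # source row for each output row
    cols = np.arange(width) * w // width    # source column for each output column
    resized = mask[rows][:, cols]
    return (resized > 0.5).astype(np.float32)


m = np.array([[0.0, 1.0], [0.2, 0.9]])
out = prepare_mask(m, 4, 4)
print(out.shape)  # (4, 4)
```

The hard threshold means soft-edged masks become binary before upload; values at or below 0.5 (like the 0.2 above) are dropped from the editable region.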
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftImageInpaintingNode:
    """
    Modify image based on prompt and mask.
    """

    RETURN_TYPES = (IO.IMAGE,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "image": (IO.IMAGE, ),
                "mask": (IO.MASK, ),
                "prompt": (
                    IO.STRING,
                    {
                        "multiline": True,
                        "default": "",
                        "tooltip": "Prompt for the image generation.",
                    },
                ),
                "n": (
                    IO.INT,
                    {
                        "default": 1,
                        "min": 1,
                        "max": 6,
                        "tooltip": "The number of images to generate.",
                    },
                ),
                "seed": (
                    IO.INT,
                    {
                        "default": 0,
                        "min": 0,
                        "max": 0xFFFFFFFFFFFFFFFF,
                        "control_after_generate": True,
                        "tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
                    },
                ),
            },
            "optional": {
                "recraft_style": (RecraftIO.STYLEV3,),
                "negative_prompt": (
                    IO.STRING,
                    {
                        "default": "",
                        "forceInput": True,
                        "tooltip": "An optional text description of undesired elements on an image.",
                    },
                ),
            },
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        image: torch.Tensor,
        mask: torch.Tensor,
        prompt: str,
        n: int,
        seed,
        auth_token=None,
        recraft_style: RecraftStyle = None,
        negative_prompt: str = None,
        **kwargs,
    ):
        default_style = RecraftStyle(RecraftStyleV3.realistic_image)
        if recraft_style is None:
            recraft_style = default_style
        if not negative_prompt:
            negative_prompt = None
        request = RecraftImageGenerationRequest(
            prompt=prompt,
            negative_prompt=negative_prompt,
            model=RecraftModel.recraftv3,
            n=n,
            style=recraft_style.style,
            substyle=recraft_style.substyle,
            style_id=recraft_style.style_id,
            random_seed=seed,
        )
        # prepare mask tensor
        _, H, W, _ = image.shape
        mask = mask.unsqueeze(-1)
        mask = mask.movedim(-1, 1)
        mask = common_upscale(mask, width=W, height=H, upscale_method="nearest-exact", crop="disabled")
        mask = mask.movedim(1, -1)
        mask = (mask > 0.5).float()
        images = []
        total = image.shape[0]
        pbar = ProgressBar(total)
        for i in range(total):
            sub_bytes = handle_recraft_file_request(
                image=image[i],
                mask=mask[i:i+1],
                path="/proxy/recraft/images/inpaint",
                request=request,
                auth_token=auth_token,
            )
            images.append(torch.cat([bytesio_to_image_tensor(x) for x in sub_bytes], dim=0))
            pbar.update(1)
        images_tensor = torch.cat(images, dim=0)
        return (images_tensor, )
```
# Recraft Image to Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-image-to-image
A Recraft API node that generates new images from a text description and a reference image
The Recraft Image to Image node uses Recraft's API to generate a new image from a reference image and a text prompt.
## Parameters
### Basic Parameters
| Parameter | Type    | Default | Description                                                  |
| --------- | ------- | ------- | ------------------------------------------------------------ |
| image     | Image   | -       | Reference input image                                         |
| prompt    | String  | ""      | Text description of the image to generate                     |
| n         | Integer | 1       | Number of images to generate (1-6)                            |
| strength  | Float   | 0.5     | How far the result may deviate from the original image (0-1)  |
| seed      | Integer | 0       | Random seed value                                             |
### Optional Parameters
| Parameter          | Type             | Description                                  |
| ------------------ | ---------------- | -------------------------------------------- |
| recraft\_style     | Recraft Style    | Style of the generated image                 |
| negative\_prompt   | String           | Elements you do not want to appear           |
| recraft\_controls  | Recraft Controls | Additional control parameters (colors, etc.) |
### Output
| Output | Type  | Description            |
| ------ | ----- | ---------------------- |
| IMAGE  | Image | Generated image result |
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftImageToImageNode:
    """
    Modify image based on prompt and strength.
    """

    RETURN_TYPES = (IO.IMAGE,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "image": (IO.IMAGE, ),
                "prompt": (
                    IO.STRING,
                    {
                        "multiline": True,
                        "default": "",
                        "tooltip": "Prompt for the image generation.",
                    },
                ),
                "n": (
                    IO.INT,
                    {
                        "default": 1,
                        "min": 1,
                        "max": 6,
                        "tooltip": "The number of images to generate.",
                    },
                ),
                "strength": (
                    IO.FLOAT,
                    {
                        "default": 0.5,
                        "min": 0.0,
                        "max": 1.0,
                        "step": 0.01,
                        "tooltip": "Defines the difference with the original image, should lie in [0, 1], where 0 means almost identical, and 1 means miserable similarity."
                    }
                ),
                "seed": (
                    IO.INT,
                    {
                        "default": 0,
                        "min": 0,
                        "max": 0xFFFFFFFFFFFFFFFF,
                        "control_after_generate": True,
                        "tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
                    },
                ),
            },
            "optional": {
                "recraft_style": (RecraftIO.STYLEV3,),
                "negative_prompt": (
                    IO.STRING,
                    {
                        "default": "",
                        "forceInput": True,
                        "tooltip": "An optional text description of undesired elements on an image.",
                    },
                ),
                "recraft_controls": (
                    RecraftIO.CONTROLS,
                    {
                        "tooltip": "Optional additional controls over the generation via the Recraft Controls node."
                    },
                ),
            },
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        image: torch.Tensor,
        prompt: str,
        n: int,
        strength: float,
        seed,
        auth_token=None,
        recraft_style: RecraftStyle = None,
        negative_prompt: str = None,
        recraft_controls: RecraftControls = None,
        **kwargs,
    ):
        default_style = RecraftStyle(RecraftStyleV3.realistic_image)
        if recraft_style is None:
            recraft_style = default_style
        controls_api = None
        if recraft_controls:
            controls_api = recraft_controls.create_api_model()
        if not negative_prompt:
            negative_prompt = None
        request = RecraftImageGenerationRequest(
            prompt=prompt,
            negative_prompt=negative_prompt,
            model=RecraftModel.recraftv3,
            n=n,
            strength=round(strength, 2),
            style=recraft_style.style,
            substyle=recraft_style.substyle,
            style_id=recraft_style.style_id,
            controls=controls_api,
            random_seed=seed,
        )
        images = []
        total = image.shape[0]
        pbar = ProgressBar(total)
        for i in range(total):
            sub_bytes = handle_recraft_file_request(
                image=image[i],
                path="/proxy/recraft/images/imageToImage",
                request=request,
                auth_token=auth_token,
            )
            images.append(torch.cat([bytesio_to_image_tensor(x) for x in sub_bytes], dim=0))
            pbar.update(1)
        images_tensor = torch.cat(images, dim=0)
        return (images_tensor, )
```
# Recraft Remove Background - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-remove-background
A Recraft API node that automatically removes the image background and produces a transparent alpha channel
The Recraft Remove Background node uses Recraft's API to intelligently detect and remove the image background, producing an image with a transparent background plus a matching alpha mask.
## Parameters
### Basic Parameters
| Parameter | Type  | Default | Description                                    |
| --------- | ----- | ------- | ---------------------------------------------- |
| image     | Image | -       | Input image whose background should be removed |
### Output
| Output | Type  | Description                                                  |
| ------ | ----- | ------------------------------------------------------------ |
| IMAGE  | Image | Image with the background removed (includes alpha channel)   |
| MASK   | Mask  | Mask of the subject (white areas mark the preserved subject) |
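The MASK output is derived directly from the IMAGE output: it is simply the alpha channel of the returned RGBA batch. A sketch with NumPy (the node performs the equivalent slice on a torch tensor):

```python
import numpy as np


def split_alpha(images: np.ndarray):
    """Split a batch of RGBA images (B, H, W, 4) into the image batch and
    a (B, H, W) mask taken from the alpha channel."""
    assert images.shape[-1] == 4, "expected RGBA input"
    masks = images[:, :, :, -1]  # alpha channel as mask, in B,H,W format
    return images, masks


batch = np.zeros((2, 8, 8, 4), dtype=np.float32)
batch[0, :, :, 3] = 1.0  # first image fully opaque
imgs, masks = split_alpha(batch)
print(masks.shape)  # (2, 8, 8)
```

Because the mask is just the alpha channel, fully transparent background pixels become 0 in the mask and opaque subject pixels become 1.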
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftRemoveBackgroundNode:
    """
    Remove background from image, and return processed image and mask.
    """

    RETURN_TYPES = (IO.IMAGE, IO.MASK)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "image": (IO.IMAGE, ),
            },
            "optional": {},
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        image: torch.Tensor,
        auth_token=None,
        **kwargs,
    ):
        images = []
        total = image.shape[0]
        pbar = ProgressBar(total)
        for i in range(total):
            sub_bytes = handle_recraft_file_request(
                image=image[i],
                path="/proxy/recraft/images/removeBackground",
                auth_token=auth_token,
            )
            images.append(torch.cat([bytesio_to_image_tensor(x) for x in sub_bytes], dim=0))
            pbar.update(1)
        images_tensor = torch.cat(images, dim=0)
        # use alpha channel as masks, in B,H,W format
        masks_tensor = images_tensor[:, :, :, -1:].squeeze(-1)
        return (images_tensor, masks_tensor)
```
# Recraft Replace Background - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-replace-background
A Recraft API node that automatically detects the foreground subject and replaces the background
The Recraft Replace Background node uses Recraft's API to intelligently detect the subject in an image and generate a new background scene from a text description.
## Parameters
### Basic Parameters
| Parameter | Type    | Default | Description                                                                                    |
| --------- | ------- | ------- | ---------------------------------------------------------------------------------------------- |
| image     | Image   | -       | Input image containing the subject to preserve                                                  |
| prompt    | String  | ""      | Prompt for the image generation                                                                 |
| n         | Integer | 1       | Number of images to generate (1-6)                                                              |
| seed      | Integer | 0       | Seed that determines whether the node should re-run; actual results are independent of the seed |
### Optional Parameters
| Parameter        | Type          | Description                                                        |
| ---------------- | ------------- | ------------------------------------------------------------------ |
| recraft\_style   | Recraft Style | Style of the generated background                                  |
| negative\_prompt | String        | Optional text description of elements you do not want in the image |
### Output
| Output | Type  | Description                                 |
| ------ | ----- | ------------------------------------------- |
| IMAGE  | Image | Complete image with the background replaced |
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftReplaceBackgroundNode:
    """
    Replace background on image, based on provided prompt.
    """

    RETURN_TYPES = (IO.IMAGE,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "image": (IO.IMAGE, ),
                "prompt": (
                    IO.STRING,
                    {
                        "multiline": True,
                        "default": "",
                        "tooltip": "Prompt for the image generation.",
                    },
                ),
                "n": (
                    IO.INT,
                    {
                        "default": 1,
                        "min": 1,
                        "max": 6,
                        "tooltip": "The number of images to generate.",
                    },
                ),
                "seed": (
                    IO.INT,
                    {
                        "default": 0,
                        "min": 0,
                        "max": 0xFFFFFFFFFFFFFFFF,
                        "control_after_generate": True,
                        "tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
                    },
                ),
            },
            "optional": {
                "recraft_style": (RecraftIO.STYLEV3,),
                "negative_prompt": (
                    IO.STRING,
                    {
                        "default": "",
                        "forceInput": True,
                        "tooltip": "An optional text description of undesired elements on an image.",
                    },
                ),
            },
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        image: torch.Tensor,
        prompt: str,
        n: int,
        seed,
        auth_token=None,
        recraft_style: RecraftStyle = None,
        negative_prompt: str = None,
        **kwargs,
    ):
        default_style = RecraftStyle(RecraftStyleV3.realistic_image)
        if recraft_style is None:
            recraft_style = default_style
        if not negative_prompt:
            negative_prompt = None
        request = RecraftImageGenerationRequest(
            prompt=prompt,
            negative_prompt=negative_prompt,
            model=RecraftModel.recraftv3,
            n=n,
            style=recraft_style.style,
            substyle=recraft_style.substyle,
            style_id=recraft_style.style_id,
        )
        images = []
        total = image.shape[0]
        pbar = ProgressBar(total)
        for i in range(total):
            sub_bytes = handle_recraft_file_request(
                image=image[i],
                path="/proxy/recraft/images/replaceBackground",
                request=request,
                auth_token=auth_token,
            )
            images.append(torch.cat([bytesio_to_image_tensor(x) for x in sub_bytes], dim=0))
            pbar.update(1)
        images_tensor = torch.cat(images, dim=0)
        return (images_tensor, )
```
# Recraft Style - Digital Illustration - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-style-digital-illustration
A helper node that sets the digital illustration style for Recraft image generation
This node creates a style configuration object that steers Recraft's image generation toward a digital illustration look.
## Parameters
### Basic Parameters
| Parameter | Type      | Default | Description                                         |
| --------- | --------- | ------- | --------------------------------------------------- |
| substyle  | Selection | None    | Specific substyle of the digital illustration style |
### Output
| Output         | Type          | Description                                                         |
| -------------- | ------------- | ------------------------------------------------------------------- |
| recraft\_style | Recraft Style | Style configuration object; connect it to a Recraft generation node |
## Usage Example
Recraft Text to Image workflow example
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftStyleV3DigitalIllustrationNode(RecraftStyleV3RealisticImageNode):
    """
    Select digital_illustration style and optional substyle.
    """

    RECRAFT_STYLE = RecraftStyleV3.digital_illustration
```
# Recraft Style - Logo Raster - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-style-logo-raster
A helper node that sets the logo raster style for Recraft image generation
This node creates a style configuration object that steers Recraft's image generation toward professional logo design. Choosing a substyle defines the design style, complexity, and intended use of the generated logo.
## Parameters
### Basic Parameters
| Parameter | Type      | Default | Description                                           |
| --------- | --------- | ------- | ----------------------------------------------------- |
| substyle  | Selection | -       | Specific substyle of the logo raster style (required) |
### Output
| Output         | Type          | Description                                                         |
| -------------- | ------------- | ------------------------------------------------------------------- |
| recraft\_style | Recraft Style | Style configuration object; connect it to a Recraft generation node |
## Usage Example
Recraft Text to Image workflow example
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftStyleV3LogoRasterNode(RecraftStyleV3RealisticImageNode):
    """
    Select logo_raster style and required substyle.
    """

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "substyle": (get_v3_substyles(s.RECRAFT_STYLE, include_none=False),),
            }
        }

    RECRAFT_STYLE = RecraftStyleV3.logo_raster
```
# Recraft Style - Realistic Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-style-realistic-image
A helper node that sets a realistic photo style for Recraft image generation
The Recraft Style - Realistic Image node sets the realistic photo style for Recraft image generation and offers several substyle options to control the visual characteristics of the output.
## Node Function
This node creates a style configuration object that steers Recraft's image generation toward a realistic photographic look.
## Parameters
### Basic Parameters
| Parameter | Type      | Default | Description                                               |
| --------- | --------- | ------- | --------------------------------------------------------- |
| substyle  | Selection | None    | Specific substyle of the realistic photo style (required) |
### Output
| Output         | Type          | Description                                                         |
| -------------- | ------------- | ------------------------------------------------------------------- |
| recraft\_style | Recraft Style | Style configuration object; connect it to a Recraft generation node |
## Usage Example
Recraft Text to Image workflow example
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftStyleV3RealisticImageNode:
    """
    Select realistic_image style and optional substyle.
    """

    RETURN_TYPES = (RecraftIO.STYLEV3,)
    RETURN_NAMES = ("recraft_style",)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "create_style"
    CATEGORY = "api node/image/Recraft"
    RECRAFT_STYLE = RecraftStyleV3.realistic_image

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "substyle": (get_v3_substyles(s.RECRAFT_STYLE),),
            }
        }

    def create_style(self, substyle: str):
        if substyle == "None":
            substyle = None
        return (RecraftStyle(self.RECRAFT_STYLE, substyle),)
```
# Recraft Text to Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-text-to-image
A Recraft API node that generates high-quality images from a text description
The Recraft Text to Image node lets you generate high-quality images from text prompts. It connects directly to Recraft AI's image generation API and can produce work in a wide range of styles.
## Parameters
### Basic Parameters
| Parameter | Type      | Default   | Description                               |
| --------- | --------- | --------- | ----------------------------------------- |
| prompt    | String    | ""        | Text description of the image to generate |
| size      | Selection | 1024x1024 | Output image size                         |
| n         | Integer   | 1         | Number of images to generate (1-6)        |
| seed      | Integer   | 0         | Random seed value                         |
### Optional Parameters
| Parameter          | Type             | Description                                                 |
| ------------------ | ---------------- | ----------------------------------------------------------- |
| recraft\_style     | Recraft Style    | Style of the generated image; defaults to "realistic image" |
| negative\_prompt   | String           | Elements you do not want to appear                          |
| recraft\_controls  | Recraft Controls | Additional control parameters (colors, etc.)                |
### Output
| Output | Type  | Description            |
| ------ | ----- | ---------------------- |
| IMAGE  | Image | Generated image result |
## Usage Example
## How It Works
The node processes a request in the following steps:
1. Collect the input parameters: prompt, image size, number of images, and random seed
2. Merge in style settings and control parameters if they are connected
3. Build the API request and send it to the Recraft server
4. Receive the returned image URLs and download the image data
5. Convert the image data into tensors usable by ComfyUI and output them
The node operates synchronously: it blocks workflow execution until all requested images have been generated. It uses Recraft's V3 model, which supports detailed text descriptions and a wide range of style variations.
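The request-assembly part of these steps can be sketched as a plain function that builds the request body. Field names follow the request model in the node's source, but treat the exact wire format as an assumption:

```python
def build_generation_request(prompt: str, size: str = "1024x1024", n: int = 1,
                             negative_prompt: str = None, style: str = "realistic_image",
                             controls: dict = None) -> dict:
    """Assemble a Recraft-style generation payload. An empty negative prompt
    and unused controls are dropped, as in the node's api_call."""
    request = {
        "prompt": prompt,
        "model": "recraftv3",
        "size": size,
        "n": n,
        "style": style,
    }
    if negative_prompt:  # empty string is treated as "not provided"
        request["negative_prompt"] = negative_prompt
    if controls:
        request["controls"] = controls
    return request


req = build_generation_request("a lighthouse at dusk", n=2)
print(req["model"], req["n"])  # recraftv3 2
```

Downloading the returned URLs and stacking them into a batch tensor (steps 4-5) then happens entirely on the client side.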
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftTextToImageNode:
    """
    Generates images synchronously based on prompt and resolution.
    """

    RETURN_TYPES = (IO.IMAGE,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "prompt": (
                    IO.STRING,
                    {
                        "multiline": True,
                        "default": "",
                        "tooltip": "Prompt for the image generation.",
                    },
                ),
                "size": (
                    [res.value for res in RecraftImageSize],
                    {
                        "default": RecraftImageSize.res_1024x1024,
                        "tooltip": "The size of the generated image.",
                    },
                ),
                "n": (
                    IO.INT,
                    {
                        "default": 1,
                        "min": 1,
                        "max": 6,
                        "tooltip": "The number of images to generate.",
                    },
                ),
                "seed": (
                    IO.INT,
                    {
                        "default": 0,
                        "min": 0,
                        "max": 0xFFFFFFFFFFFFFFFF,
                        "control_after_generate": True,
                        "tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
                    },
                ),
            },
            "optional": {
                "recraft_style": (RecraftIO.STYLEV3,),
                "negative_prompt": (
                    IO.STRING,
                    {
                        "default": "",
                        "forceInput": True,
                        "tooltip": "An optional text description of undesired elements on an image.",
                    },
                ),
                "recraft_controls": (
                    RecraftIO.CONTROLS,
                    {
                        "tooltip": "Optional additional controls over the generation via the Recraft Controls node."
                    },
                ),
            },
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        prompt: str,
        size: str,
        n: int,
        seed,
        recraft_style: RecraftStyle = None,
        negative_prompt: str = None,
        recraft_controls: RecraftControls = None,
        auth_token=None,
        **kwargs,
    ):
        default_style = RecraftStyle(RecraftStyleV3.realistic_image)
        if recraft_style is None:
            recraft_style = default_style
        controls_api = None
        if recraft_controls:
            controls_api = recraft_controls.create_api_model()
        if not negative_prompt:
            negative_prompt = None
        operation = SynchronousOperation(
            endpoint=ApiEndpoint(
                path="/proxy/recraft/image_generation",
                method=HttpMethod.POST,
                request_model=RecraftImageGenerationRequest,
                response_model=RecraftImageGenerationResponse,
            ),
            request=RecraftImageGenerationRequest(
                prompt=prompt,
                negative_prompt=negative_prompt,
                model=RecraftModel.recraftv3,
                size=size,
                n=n,
                style=recraft_style.style,
                substyle=recraft_style.substyle,
                style_id=recraft_style.style_id,
                controls=controls_api,
            ),
            auth_token=auth_token,
        )
        response: RecraftImageGenerationResponse = operation.execute()
        images = []
        for data in response.data:
            image = bytesio_to_image_tensor(
                download_url_to_bytesio(data.url, timeout=1024)
            )
            if len(image.shape) < 4:
                image = image.unsqueeze(0)
            images.append(image)
        output_image = torch.cat(images, dim=0)
        return (output_image,)
```
# Recraft Text to Vector - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-text-to-vector
A Recraft API node that generates scalable vector images from a text description
The Recraft Text to Vector node uses Recraft's API to generate high-quality vector graphics (SVG format) from a text description, suitable for logos, icons, and illustrations that scale without loss of quality.
## Parameters
### Basic Parameters
| Parameter | Type      | Default   | Description                                   |
| --------- | --------- | --------- | --------------------------------------------- |
| prompt    | String    | ""        | Description of the vector graphic to generate |
| substyle  | Selection | -         | Substyle of the vector style                  |
| size      | Selection | 1024x1024 | Canvas size of the output vector image        |
| n         | Integer   | 1         | Number of results to generate (1-6)           |
| seed      | Integer   | 0         | Random seed value                             |
### Optional Parameters
| Parameter          | Type             | Description                                  |
| ------------------ | ---------------- | -------------------------------------------- |
| negative\_prompt   | String           | Elements you do not want to appear           |
| recraft\_controls  | Recraft Controls | Additional control parameters (colors, etc.) |
### Output
| Output | Type   | Description                                                           |
| ------ | ------ | --------------------------------------------------------------------- |
| SVG    | Vector | Generated SVG vector graphic; connect it to a SaveSVG node to save it |
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftTextToVectorNode:
    """
    Generates SVG synchronously based on prompt and resolution.
    """

    RETURN_TYPES = (RecraftIO.SVG,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "prompt": (
                    IO.STRING,
                    {
                        "multiline": True,
                        "default": "",
                        "tooltip": "Prompt for the image generation.",
                    },
                ),
                "substyle": (get_v3_substyles(RecraftStyleV3.vector_illustration),),
                "size": (
                    [res.value for res in RecraftImageSize],
                    {
                        "default": RecraftImageSize.res_1024x1024,
                        "tooltip": "The size of the generated image.",
                    },
                ),
                "n": (
                    IO.INT,
                    {
                        "default": 1,
                        "min": 1,
                        "max": 6,
                        "tooltip": "The number of images to generate.",
                    },
                ),
                "seed": (
                    IO.INT,
                    {
                        "default": 0,
                        "min": 0,
                        "max": 0xFFFFFFFFFFFFFFFF,
                        "control_after_generate": True,
                        "tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
                    },
                ),
            },
            "optional": {
                "negative_prompt": (
                    IO.STRING,
                    {
                        "default": "",
                        "forceInput": True,
                        "tooltip": "An optional text description of undesired elements on an image.",
                    },
                ),
                "recraft_controls": (
                    RecraftIO.CONTROLS,
                    {
                        "tooltip": "Optional additional controls over the generation via the Recraft Controls node."
                    },
                ),
            },
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        prompt: str,
        substyle: str,
        size: str,
        n: int,
        seed,
        negative_prompt: str = None,
        recraft_controls: RecraftControls = None,
        auth_token=None,
        **kwargs,
    ):
        # create RecraftStyle so strings will be formatted properly (i.e. "None" will become None)
        recraft_style = RecraftStyle(RecraftStyleV3.vector_illustration, substyle=substyle)
        controls_api = None
        if recraft_controls:
            controls_api = recraft_controls.create_api_model()
        if not negative_prompt:
            negative_prompt = None
        operation = SynchronousOperation(
            endpoint=ApiEndpoint(
                path="/proxy/recraft/image_generation",
                method=HttpMethod.POST,
                request_model=RecraftImageGenerationRequest,
                response_model=RecraftImageGenerationResponse,
            ),
            request=RecraftImageGenerationRequest(
                prompt=prompt,
                negative_prompt=negative_prompt,
                model=RecraftModel.recraftv3,
                size=size,
                n=n,
                style=recraft_style.style,
                substyle=recraft_style.substyle,
                controls=controls_api,
            ),
            auth_token=auth_token,
        )
        response: RecraftImageGenerationResponse = operation.execute()
        svg_data = []
        for data in response.data:
            svg_data.append(download_url_to_bytesio(data.url, timeout=1024))
        return (SVG(svg_data),)
```
# Recraft Vectorize Image - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/recraft-vectorize-image
A Recraft API node that converts raster images to vector SVG format
The Recraft Vectorize Image node uses Recraft's API to convert raster images (such as photos, PNGs, or JPEGs) into vector SVG format.
## Parameters
### Basic Parameters
| Parameter | Type  | Default | Description                             |
| --------- | ----- | ------- | --------------------------------------- |
| image     | Image | -       | Input image to convert to vector format |
### Output
| Output | Type   | Description                                                           |
| ------ | ------ | --------------------------------------------------------------------- |
| SVG    | Vector | Converted SVG vector graphic; connect it to a SaveSVG node to save it |
## Usage Example
Recraft Text to Image workflow example
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class RecraftVectorizeImageNode:
    """
    Generates SVG synchronously from an input image.
    """

    RETURN_TYPES = (RecraftIO.SVG,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Recraft"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "image": (IO.IMAGE, ),
            },
            "optional": {},
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(
        self,
        image: torch.Tensor,
        auth_token=None,
        **kwargs,
    ):
        svgs = []
        total = image.shape[0]
        pbar = ProgressBar(total)
        for i in range(total):
            sub_bytes = handle_recraft_file_request(
                image=image[i],
                path="/proxy/recraft/images/vectorize",
                auth_token=auth_token,
            )
            svgs.append(SVG(sub_bytes))
            pbar.update(1)
        return (SVG.combine_all(svgs), )
```
# Save SVG - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/recraft/save-svg
A utility node that saves SVG vector graphics to file
The Save SVG node lets you save SVG data produced by Recraft's vector generation nodes as files on disk. It is the required component for processing and exporting vector graphics.
## Node Function
This node receives SVG vector data and writes it to standard SVG files on the file system. It supports automatic file naming and an output path, so the vector graphics can be opened and edited in other software.
## Parameters
### Basic Parameters
| Parameter        | Type    | Default   | Description                                                                      |
| ---------------- | ------- | --------- | -------------------------------------------------------------------------------- |
| svg              | SVG     | -         | SVG vector data to save                                                           |
| filename\_prefix | String  | "recraft" | Filename prefix                                                                   |
| output\_dir      | String  | -         | Output directory; defaults to the ComfyUI output folder at `ComfyUI/output/svg/` |
| index            | Integer | -1        | Save index; -1 saves all generated SVGs                                           |
### Output
| Output | Type | Description                       |
| ------ | ---- | --------------------------------- |
| SVG    | SVG  | Passes through the input SVG data |
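When workflow metadata is available, the node embeds it in the saved file by inserting a `<metadata>` element right after the opening `<svg>` tag. A minimal sketch of that injection step:

```python
import re


def inject_metadata(svg_content: str, metadata_json: str) -> str:
    """Insert a <metadata> element (with the JSON wrapped in CDATA)
    immediately after the opening <svg> tag."""
    metadata_element = f"<metadata><![CDATA[{metadata_json}]]></metadata>"
    # Match the opening <svg ...> tag and splice the metadata in after it
    return re.sub(r'(<svg[^>]*>)', r'\1' + metadata_element, svg_content, count=1)


svg = '<svg xmlns="http://www.w3.org/2000/svg"><rect/></svg>'
out = inject_metadata(svg, '{"prompt": "demo"}')
print("<metadata>" in out)  # True
```

Wrapping the JSON in CDATA keeps characters like `<` and `&` in prompts from breaking the XML, while SVG renderers simply ignore the `<metadata>` element.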
## Usage Example
Recraft Text to Image workflow example
## Source Code Reference
[Node source (updated 2025-05-03)]
```python
class SaveSVGNode:
    """
    Save SVG files on disk.
    """

    def __init__(self):
        self.output_dir = folder_paths.get_output_directory()
        self.type = "output"
        self.prefix_append = ""

    RETURN_TYPES = ()
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "save_svg"
    CATEGORY = "api node/image/Recraft"
    OUTPUT_NODE = True

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "svg": (RecraftIO.SVG,),
                "filename_prefix": ("STRING", {"default": "svg/ComfyUI", "tooltip": "The prefix for the file to save. This may include formatting information such as %date:yyyy-MM-dd% or %Empty Latent Image.width% to include values from nodes."})
            },
            "hidden": {
                "prompt": "PROMPT",
                "extra_pnginfo": "EXTRA_PNGINFO"
            }
        }

    def save_svg(self, svg: SVG, filename_prefix="svg/ComfyUI", prompt=None, extra_pnginfo=None):
        filename_prefix += self.prefix_append
        full_output_folder, filename, counter, subfolder, filename_prefix = folder_paths.get_save_image_path(filename_prefix, self.output_dir)
        results = list()
        # Prepare metadata JSON
        metadata_dict = {}
        if prompt is not None:
            metadata_dict["prompt"] = prompt
        if extra_pnginfo is not None:
            metadata_dict.update(extra_pnginfo)
        # Convert metadata to JSON string
        metadata_json = json.dumps(metadata_dict, indent=2) if metadata_dict else None
        for batch_number, svg_bytes in enumerate(svg.data):
            filename_with_batch_num = filename.replace("%batch_num%", str(batch_number))
            file = f"{filename_with_batch_num}_{counter:05}_.svg"
            # Read SVG content
            svg_bytes.seek(0)
            svg_content = svg_bytes.read().decode('utf-8')
            # Inject metadata if available
            if metadata_json:
                # Create metadata element with CDATA section
                metadata_element = f"""  <metadata>
    <![CDATA[
{metadata_json}
    ]]>
  </metadata>
"""
                # Insert metadata after opening svg tag using regex
                import re
                svg_content = re.sub(r'(<svg[^>]*>)', r'\1\n' + metadata_element, svg_content)
            # Write the modified SVG to file
            with open(os.path.join(full_output_folder, file), 'wb') as svg_file:
                svg_file.write(svg_content.encode('utf-8'))
            results.append({
                "filename": file,
                "subfolder": subfolder,
                "type": self.type
            })
            counter += 1
        return { "ui": { "images": results } }
```
# Stability AI Stable Diffusion 3.5 - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/stability-ai/stability-ai-stable-diffusion-3-5-image
A node that generates high-quality images with Stability AI's Stable Diffusion 3.5 model
The Stability AI Stable Diffusion 3.5 Image node generates high-quality images using Stability AI's Stable Diffusion 3.5 API. It supports both text-to-image and image-to-image generation, creating detailed visual content from text prompts.
## Parameters
### Required Parameters
| Parameter     | Type      | Default | Description                                                                                                                                    |
| ------------- | --------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| prompt        | String    | ""      | What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects leads to better results |
| model         | Selection | -       | Which Stability SD 3.5 model to use                                                                                                             |
| aspect\_ratio | Selection | "1:1"   | Aspect ratio of the generated image                                                                                                             |
| style\_preset | Selection | "None"  | Optional style preset for the generated image                                                                                                   |
| cfg\_scale    | Float     | 4.0     | How strictly the diffusion process adheres to the prompt text (higher values keep the image closer to your prompt). Range: 1.0-10.0, step: 0.1  |
| seed          | Integer   | 0       | Random seed used for creating the noise, range 0-4294967294                                                                                     |
### Optional Parameters
| Parameter        | Type   | Default | Description                                                                                                                                                                |
| ---------------- | ------ | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| image            | Image  | -       | Input image. When provided, the node switches to image-to-image mode                                                                                                          |
| negative\_prompt | String | ""      | Keywords of what you do not wish to see in the output image. This is an advanced feature                                                                                      |
| image\_denoise   | Float  | 0.5     | Denoise strength for the input image. 0.0 yields an image identical to the input; 1.0 is as if no image were provided at all. Range: 0.0-1.0, step: 0.01. Only takes effect when an input image is provided |
### Output
| Output | Type  | Description     |
| ------ | ----- | --------------- |
| IMAGE  | Image | Generated image |
## Usage Example
Stability AI Stable Diffusion 3.5 Image workflow example
## Notes
* When an input image is provided, the node switches from text-to-image to image-to-image mode
* In image-to-image mode, the aspect ratio parameter is ignored
* The mode is selected automatically based on whether an image is provided:
  * No image provided: text-to-image mode
  * Image provided: image-to-image mode
* If style\_preset is set to "None", no preset style is applied
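The mode switch described above boils down to a small branch that also drops the parameters that do not apply in the chosen mode. A sketch of that logic (function and return shape are illustrative, not the node's actual API):

```python
def resolve_generation_mode(image=None, aspect_ratio="1:1", image_denoise=0.5):
    """Pick the generation mode and drop the parameters that do not
    apply in that mode, mirroring the node's behavior."""
    if image is not None:
        # image-to-image: the input image determines the output shape,
        # so the requested aspect ratio is discarded
        return "image-to-image", None, image_denoise
    # text-to-image: there is no input image for denoise to act on
    return "text-to-image", aspect_ratio, None


mode, ar, denoise = resolve_generation_mode()
print(mode, ar, denoise)  # text-to-image 1:1 None
```

This is why setting image_denoise without connecting an image has no visible effect: the value is cleared before the request is built.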
## Source
[Node source (updated 2025-05-07)]
```python
class StabilityStableImageSD_3_5Node:
    """
    Generates images synchronously based on prompt and resolution.
    """

    RETURN_TYPES = (IO.IMAGE,)
    DESCRIPTION = cleandoc(__doc__ or "")  # Handle potential None value
    FUNCTION = "api_call"
    API_NODE = True
    CATEGORY = "api node/image/Stability AI"

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "prompt": (
                    IO.STRING,
                    {
                        "multiline": True,
                        "default": "",
                        "tooltip": "What you wish to see in the output image. A strong, descriptive prompt that clearly defines elements, colors, and subjects will lead to better results."
                    },
                ),
                "model": ([x.value for x in Stability_SD3_5_Model],),
                "aspect_ratio": ([x.value for x in StabilityAspectRatio],
                    {
                        "default": StabilityAspectRatio.ratio_1_1,
                        "tooltip": "Aspect ratio of generated image.",
                    },
                ),
                "style_preset": (get_stability_style_presets(),
                    {
                        "tooltip": "Optional desired style of generated image.",
                    },
                ),
                "cfg_scale": (
                    IO.FLOAT,
                    {
                        "default": 4.0,
                        "min": 1.0,
                        "max": 10.0,
                        "step": 0.1,
                        "tooltip": "How strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt)",
                    },
                ),
                "seed": (
                    IO.INT,
                    {
                        "default": 0,
                        "min": 0,
                        "max": 4294967294,
                        "control_after_generate": True,
                        "tooltip": "The random seed used for creating the noise.",
                    },
                ),
            },
            "optional": {
                "image": (IO.IMAGE,),
                "negative_prompt": (
                    IO.STRING,
                    {
                        "default": "",
                        "forceInput": True,
                        "tooltip": "Keywords of what you do not wish to see in the output image. This is an advanced feature."
                    },
                ),
                "image_denoise": (
                    IO.FLOAT,
                    {
                        "default": 0.5,
                        "min": 0.0,
                        "max": 1.0,
                        "step": 0.01,
                        "tooltip": "Denoise of input image; 0.0 yields image identical to input, 1.0 is as if no image was provided at all.",
                    },
                ),
            },
            "hidden": {
                "auth_token": "AUTH_TOKEN_COMFY_ORG",
            },
        }

    def api_call(self, model: str, prompt: str, aspect_ratio: str, style_preset: str, seed: int, cfg_scale: float,
                 negative_prompt: str = None, image: torch.Tensor = None, image_denoise: float = None,
                 auth_token=None):
        validate_string(prompt, strip_whitespace=False)
        # prepare image binary if image present
        image_binary = None
        mode = Stability_SD3_5_GenerationMode.text_to_image
        if image is not None:
            image_binary = tensor_to_bytesio(image, total_pixels=1504*1504).read()
            mode = Stability_SD3_5_GenerationMode.image_to_image
            aspect_ratio = None
        else:
            image_denoise = None
        if not negative_prompt:
            negative_prompt = None
        if style_preset == "None":
            style_preset = None
        files = {
            "image": image_binary
        }
        operation = SynchronousOperation(
            endpoint=ApiEndpoint(
                path="/proxy/stability/v2beta/stable-image/generate/sd3",
                method=HttpMethod.POST,
                request_model=StabilityStable3_5Request,
                response_model=StabilityStableUltraResponse,
            ),
            request=StabilityStable3_5Request(
                prompt=prompt,
                negative_prompt=negative_prompt,
                aspect_ratio=aspect_ratio,
                seed=seed,
                strength=image_denoise,
                style_preset=style_preset,
                cfg_scale=cfg_scale,
                model=model,
                mode=mode,
            ),
            files=files,
            content_type="multipart/form-data",
            auth_token=auth_token,
        )
        response_api = operation.execute()
        if response_api.finish_reason != "SUCCESS":
            raise Exception(f"Stable Diffusion 3.5 Image generation failed: {response_api.finish_reason}.")
        image_data = base64.b64decode(response_api.image)
        returned_image = bytesio_to_image_tensor(BytesIO(image_data))
        return (returned_image,)
```
# Stability Stable Image Ultra - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/image/stability-ai/stability-ai-stable-image-ultra
A node that generates high-quality images with Stability AI's Stable Image Ultra model
The Stability Stable Image Ultra node generates high-quality images through Stability AI's Stable Diffusion Ultra API. It supports both text-to-image and image-to-image generation, producing richly detailed, artistically expressive visuals from text prompts.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ------------- | ------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| prompt | String | "" | Text prompt describing what to generate. A strong, descriptive prompt that clearly defines elements, colors, and subjects leads to better results. Use the `(word:weight)` format to control the weight of a specific word, where weight is a value between 0 and 1. For example: `The sky was a crisp (blue:0.3) and (green:0.8)` conveys a sky that is both blue and green, but more green than blue. |
| aspect\_ratio | Combo | "1:1" | Aspect ratio of the output image |
| style\_preset | Combo | "None" | Optional preset style for the generated image |
| seed | Integer | 0 | Random seed used for creating the noise, range 0-4294967294 |
### Optional Parameters
| Parameter | Type | Default | Description |
| ---------------- | ------ | ------- | ---------------------------------------------------- |
| image | Image | - | Input image for image-to-image generation |
| negative\_prompt | String | "" | Describes what you do not wish to see in the output image. This is an advanced feature |
| image\_denoise | Float | 0.5 | Denoising strength for the input image, range 0.0-1.0. 0.0 yields an image identical to the input; 1.0 is as if no image was provided at all |
### Output
| Output | Type | Description |
| ------ | ----- | ------------------- |
| IMAGE | Image | The generated image |
## Usage Example
Stability AI Stable Image Ultra workflow example
## Notes
* The image\_denoise parameter has no effect when no input image is provided
* If style\_preset is set to "None", no preset style is applied
* When using image-to-image, the input image is converted to an appropriate format before being sent to the API
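The normalization rules above can be sketched as a small helper. This is a minimal sketch mirroring the source listing below; the `normalize_inputs` wrapper itself is illustrative and not part of the node:

```python
def normalize_inputs(image, image_denoise, negative_prompt, style_preset):
    """Drop optional fields that should not be sent to the API."""
    if image is None:
        image_denoise = None      # denoise only applies to image-to-image
    if not negative_prompt:
        negative_prompt = None    # empty string -> omit the field entirely
    if style_preset == "None":
        style_preset = None       # "None" option -> no preset style applied
    return image_denoise, negative_prompt, style_preset
```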
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class StabilityStableImageUltraNode:
"""
Generates images synchronously based on prompt and resolution.
"""
RETURN_TYPES = (IO.IMAGE,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/image/stability"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "What you wish to see in the output image. A strong, descriptive prompt that clearly defines" +
"What you wish to see in the output image. A strong, descriptive prompt that clearly defines" +
"elements, colors, and subjects will lead to better results. " +
"To control the weight of a given word use the format `(word:weight)`," +
"where `word` is the word you'd like to control the weight of and `weight`" +
"is a value between 0 and 1. For example: `The sky was a crisp (blue:0.3) and (green:0.8)`" +
"would convey a sky that was blue and green, but more green than blue."
},
),
"aspect_ratio": ([x.value for x in StabilityAspectRatio],
{
"default": StabilityAspectRatio.ratio_1_1,
"tooltip": "Aspect ratio of generated image.",
},
),
"style_preset": (get_stability_style_presets(),
{
"tooltip": "Optional desired style of generated image.",
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 4294967294,
"control_after_generate": True,
"tooltip": "The random seed used for creating the noise.",
},
),
},
"optional": {
"image": (IO.IMAGE,),
"negative_prompt": (
IO.STRING,
{
"default": "",
"forceInput": True,
"tooltip": "A blurb of text describing what you do not wish to see in the output image. This is an advanced feature."
},
),
"image_denoise": (
IO.FLOAT,
{
"default": 0.5,
"min": 0.0,
"max": 1.0,
"step": 0.01,
"tooltip": "Denoise of input image; 0.0 yields image identical to input, 1.0 is as if no image was provided at all.",
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(self, prompt: str, aspect_ratio: str, style_preset: str, seed: int,
negative_prompt: str=None, image: torch.Tensor = None, image_denoise: float=None,
auth_token=None):
# prepare image binary if image present
image_binary = None
if image is not None:
image_binary = tensor_to_bytesio(image, 1504 * 1504).read()
else:
image_denoise = None
if not negative_prompt:
negative_prompt = None
if style_preset == "None":
style_preset = None
files = {
"image": image_binary
}
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/stability/v2beta/stable-image/generate/ultra",
method=HttpMethod.POST,
request_model=StabilityStableUltraRequest,
response_model=StabilityStableUltraResponse,
),
request=StabilityStableUltraRequest(
prompt=prompt,
negative_prompt=negative_prompt,
aspect_ratio=aspect_ratio,
seed=seed,
strength=image_denoise,
style_preset=style_preset,
),
files=files,
content_type="multipart/form-data",
auth_token=auth_token,
)
response_api = operation.execute()
if response_api.finish_reason != "SUCCESS":
raise Exception(f"Stable Image Ultra generation failed: {response_api.finish_reason}.")
image_data = base64.b64decode(response_api.image)
returned_image = bytesio_to_image_tensor(BytesIO(image_data))
return (returned_image,)
```
# Google Veo2 Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/google/google-veo2-video
A node that generates videos from text descriptions using Google's Veo2 technology
The Google Veo2 Video node generates high-quality videos from text descriptions, using Google's Veo2 API to turn text prompts into dynamic video content.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------------------ | ------- | ------- | ------------------------------------------------------------- |
| prompt | String | "" | Text describing the video content to generate |
| aspect\_ratio | Combo | "16:9" | Aspect ratio of the output video, either "16:9" or "9:16" |
| negative\_prompt | String | "" | Content the system should avoid including in the video |
| duration\_seconds | Integer | 5 | Duration of the output video, 5-8 seconds |
| enhance\_prompt | Boolean | True | Whether to enhance the prompt with AI assistance |
| person\_generation | Combo | "ALLOW" | Whether to allow generating people, either "ALLOW" or "BLOCK" |
| seed | Integer | 0 | Random seed; 0 means random |
### Optional Parameters
| Parameter | Type | Default | Description |
| ----- | ----- | ---- | -------------------------------------------------- |
| image | Image | None | Optional reference image to guide video generation |
### Output
| Output | Type | Description |
| ------ | ----- | ------------------- |
| VIDEO | Video | The generated video |
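The optional parameters follow a simple rule visible in the source below: an empty negative prompt is omitted, and a seed of 0 is omitted so the API picks a random one. A sketch of the parameter assembly (the `build_veo_parameters` wrapper is illustrative, not part of the node):

```python
def build_veo_parameters(aspect_ratio, person_generation, duration_seconds,
                         enhance_prompt, negative_prompt="", seed=0):
    """Assemble the camelCase parameter dict sent to the Veo endpoint."""
    parameters = {
        "aspectRatio": aspect_ratio,
        "personGeneration": person_generation,
        "durationSeconds": duration_seconds,
        "enhancePrompt": enhance_prompt,
    }
    if negative_prompt:  # omit the field when empty
        parameters["negativePrompt"] = negative_prompt
    if seed > 0:         # seed 0 -> let the API choose a random seed
        parameters["seed"] = seed
    return parameters
```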
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class VeoVideoGenerationNode(ComfyNodeABC):
"""
Generates videos from text prompts using Google's Veo API.
This node can create videos from text descriptions and optional image inputs,
with control over parameters like aspect ratio, duration, and more.
"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Text description of the video",
},
),
"aspect_ratio": (
IO.COMBO,
{
"options": ["16:9", "9:16"],
"default": "16:9",
"tooltip": "Aspect ratio of the output video",
},
),
},
"optional": {
"negative_prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Negative text prompt to guide what to avoid in the video",
},
),
"duration_seconds": (
IO.INT,
{
"default": 5,
"min": 5,
"max": 8,
"step": 1,
"display": "number",
"tooltip": "Duration of the output video in seconds",
},
),
"enhance_prompt": (
IO.BOOLEAN,
{
"default": True,
"tooltip": "Whether to enhance the prompt with AI assistance",
}
),
"person_generation": (
IO.COMBO,
{
"options": ["ALLOW", "BLOCK"],
"default": "ALLOW",
"tooltip": "Whether to allow generating people in the video",
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFF,
"step": 1,
"display": "number",
"control_after_generate": True,
"tooltip": "Seed for video generation (0 for random)",
},
),
"image": (IO.IMAGE, {
"default": None,
"tooltip": "Optional reference image to guide video generation",
}),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
RETURN_TYPES = (IO.VIDEO,)
FUNCTION = "generate_video"
CATEGORY = "api node/video/Veo"
DESCRIPTION = "Generates videos from text prompts using Google's Veo API"
API_NODE = True
def generate_video(
self,
prompt,
aspect_ratio="16:9",
negative_prompt="",
duration_seconds=5,
enhance_prompt=True,
person_generation="ALLOW",
seed=0,
image=None,
auth_token=None,
):
# Prepare the instances for the request
instances = []
instance = {
"prompt": prompt
}
# Add image if provided
if image is not None:
image_base64 = convert_image_to_base64(image)
if image_base64:
instance["image"] = {
"bytesBase64Encoded": image_base64,
"mimeType": "image/png"
}
instances.append(instance)
# Create parameters dictionary
parameters = {
"aspectRatio": aspect_ratio,
"personGeneration": person_generation,
"durationSeconds": duration_seconds,
"enhancePrompt": enhance_prompt,
}
# Add optional parameters if provided
if negative_prompt:
parameters["negativePrompt"] = negative_prompt
if seed > 0:
parameters["seed"] = seed
# Initial request to start video generation
initial_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/veo/generate",
method=HttpMethod.POST,
request_model=Veo2GenVidRequest,
response_model=Veo2GenVidResponse
),
request=Veo2GenVidRequest(
instances=instances,
parameters=parameters
),
auth_token=auth_token
)
initial_response = initial_operation.execute()
operation_name = initial_response.name
logging.info(f"Veo generation started with operation name: {operation_name}")
# Define status extractor function
def status_extractor(response):
# Only return "completed" if the operation is done, regardless of success or failure
# We'll check for errors after polling completes
return "completed" if response.done else "pending"
# Define progress extractor function
def progress_extractor(response):
# Could be enhanced if the API provides progress information
return None
# Define the polling operation
poll_operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path="/proxy/veo/poll",
method=HttpMethod.POST,
request_model=Veo2GenVidPollRequest,
response_model=Veo2GenVidPollResponse
),
completed_statuses=["completed"],
failed_statuses=[], # No failed statuses, we'll handle errors after polling
status_extractor=status_extractor,
progress_extractor=progress_extractor,
request=Veo2GenVidPollRequest(
operationName=operation_name
),
auth_token=auth_token,
poll_interval=5.0
)
# Execute the polling operation
poll_response = poll_operation.execute()
# Now check for errors in the final response
# Check for error in poll response
if hasattr(poll_response, 'error') and poll_response.error:
error_message = f"Veo API error: {poll_response.error.message} (code: {poll_response.error.code})"
logging.error(error_message)
raise Exception(error_message)
# Check for RAI filtered content
if (hasattr(poll_response.response, 'raiMediaFilteredCount') and
poll_response.response.raiMediaFilteredCount > 0):
# Extract reason message if available
if (hasattr(poll_response.response, 'raiMediaFilteredReasons') and
poll_response.response.raiMediaFilteredReasons):
reason = poll_response.response.raiMediaFilteredReasons[0]
error_message = f"Content filtered by Google's Responsible AI practices: {reason} ({poll_response.response.raiMediaFilteredCount} videos filtered.)"
else:
error_message = f"Content filtered by Google's Responsible AI practices ({poll_response.response.raiMediaFilteredCount} videos filtered.)"
logging.error(error_message)
raise Exception(error_message)
# Extract video data
video_data = None
if poll_response.response and hasattr(poll_response.response, 'videos') and poll_response.response.videos and len(poll_response.response.videos) > 0:
video = poll_response.response.videos[0]
# Check if video is provided as base64 or URL
if hasattr(video, 'bytesBase64Encoded') and video.bytesBase64Encoded:
# Decode base64 string to bytes
video_data = base64.b64decode(video.bytesBase64Encoded)
elif hasattr(video, 'gcsUri') and video.gcsUri:
# Download from URL
video_url = video.gcsUri
video_response = requests.get(video_url)
video_data = video_response.content
else:
raise Exception("Video returned but no data or URL was provided")
else:
raise Exception("Video generation completed but no video was returned")
if not video_data:
raise Exception("No video data was returned")
logging.info("Video generation completed successfully")
# Convert video data to BytesIO object
video_io = io.BytesIO(video_data)
# Return VideoFromFile object
return (VideoFromFile(video_io),)
```
# Kling Image to Video (Camera Control) - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/kwai_vgi/kling-camera-control-i2v
A Kling image-to-video node with camera control
The Kling Image to Video (Camera Control) node transforms still images into videos with professional camera movements. It supports camera controls such as zoom, rotation, pan, tilt, and first-person view, while maintaining focus on the original image content.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ---------------- | ------ | ------- | ------------------------------------------------------------------ |
| start\_frame | Image | - | Input image to transform into a video |
| prompt | String | "" | Text prompt describing the video's motion and content |
| negative\_prompt | String | "" | Elements you do not want to appear in the video |
| cfg\_scale | Float | 7.0 | Configuration scale controlling how closely the prompt is followed |
| aspect\_ratio | Combo | 16:9 | Aspect ratio of the output video |
### Camera Control Parameters
| Parameter | Type | Description |
| --------------- | ------------- | ----------------------------------------------------------------------- |
| camera\_control | CameraControl | Camera control configuration, created by the Kling Camera Controls node |
### Output
| Output | Type | Description |
| ------ | ----- | ------------------- |
| VIDEO | Video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class KlingCameraControlI2VNode(KlingImage2VideoNode):
"""
Kling Image to Video Camera Control Node. This node is a image to video node, but it supports controlling the camera.
Duration, mode, and model_name request fields are hard-coded because camera control is only supported in pro mode with the kling-v1-5 model at 5s duration as of 2025-05-02.
"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"start_frame": model_field_to_node_input(
IO.IMAGE, KlingImage2VideoRequest, "image"
),
"prompt": model_field_to_node_input(
IO.STRING, KlingImage2VideoRequest, "prompt", multiline=True
),
"negative_prompt": model_field_to_node_input(
IO.STRING,
KlingImage2VideoRequest,
"negative_prompt",
multiline=True,
),
"cfg_scale": model_field_to_node_input(
IO.FLOAT, KlingImage2VideoRequest, "cfg_scale"
),
"aspect_ratio": model_field_to_node_input(
IO.COMBO,
KlingImage2VideoRequest,
"aspect_ratio",
enum_type=AspectRatio,
),
"camera_control": (
"CAMERA_CONTROL",
{
"tooltip": "Can be created using the Kling Camera Controls node. Controls the camera movement and motion during the video generation.",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
DESCRIPTION = "Transform still images into cinematic videos with professional camera movements that simulate real-world cinematography. Control virtual camera actions including zoom, rotation, pan, tilt, and first-person view, while maintaining focus on your original image."
def api_call(
self,
start_frame: torch.Tensor,
prompt: str,
negative_prompt: str,
cfg_scale: float,
aspect_ratio: str,
camera_control: CameraControl,
auth_token: Optional[str] = None,
):
return super().api_call(
model_name="kling-v1-5",
start_frame=start_frame,
cfg_scale=cfg_scale,
mode="pro",
aspect_ratio=aspect_ratio,
duration="5",
prompt=prompt,
negative_prompt=negative_prompt,
camera_control=camera_control,
auth_token=auth_token,
)
```
# Kling Text to Video (Camera Control) - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/kwai_vgi/kling-camera-control-t2v
A Kling text-to-video node with camera control
The Kling Text to Video (Camera Control) node lets users turn text into videos with professional camera movements. It is an extended version of the standard Kling Text to Video node with added camera control.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ---------------- | --------------- | ------ | --------------------------------------------------------------------------------- |
| prompt | String | "" | Text prompt describing the video content to generate |
| negative\_prompt | String | "" | Elements you do not want to appear in the video |
| cfg\_scale | Float | 7.0 | Configuration scale controlling how closely the prompt is followed |
| aspect\_ratio | Combo | "16:9" | Aspect ratio of the output video |
| camera\_control | CAMERA\_CONTROL | None | Camera control configuration; must be created with the Kling Camera Controls node |
### Fixed Parameters
Note: the following parameters are hard-coded inside the node and cannot be changed:
* Model name: kling-v1-5
* Mode: pro
* Video duration: 5 seconds
### Output
| Output | Type | Description |
| ------ | ----- | ------------------- |
| VIDEO | Video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class KlingCameraControlT2VNode(KlingTextToVideoNode):
"""
Kling Text to Video Camera Control Node. This node is a text to video node, but it supports controlling the camera.
Duration, mode, and model_name request fields are hard-coded because camera control is only supported in pro mode with the kling-v1-5 model at 5s duration as of 2025-05-02.
"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": model_field_to_node_input(
IO.STRING, KlingText2VideoRequest, "prompt", multiline=True
),
"negative_prompt": model_field_to_node_input(
IO.STRING,
KlingText2VideoRequest,
"negative_prompt",
multiline=True,
),
"cfg_scale": model_field_to_node_input(
IO.FLOAT, KlingText2VideoRequest, "cfg_scale"
),
"aspect_ratio": model_field_to_node_input(
IO.COMBO,
KlingText2VideoRequest,
"aspect_ratio",
enum_type=AspectRatio,
),
"camera_control": (
"CAMERA_CONTROL",
{
"tooltip": "Can be created using the Kling Camera Controls node. Controls the camera movement and motion during the video generation.",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
DESCRIPTION = "Transform text into cinematic videos with professional camera movements that simulate real-world cinematography. Control virtual camera actions including zoom, rotation, pan, tilt, and first-person view, while maintaining focus on your original text."
def api_call(
self,
prompt: str,
negative_prompt: str,
cfg_scale: float,
aspect_ratio: str,
camera_control: Optional[CameraControl] = None,
auth_token: Optional[str] = None,
):
return super().api_call(
model_name="kling-v1-5",
cfg_scale=cfg_scale,
mode="pro",
aspect_ratio=aspect_ratio,
duration="5",
prompt=prompt,
negative_prompt=negative_prompt,
camera_control=camera_control,
auth_token=auth_token,
)
```
# Kling Camera Controls - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/kwai_vgi/kling-camera-controls
A node that provides camera control parameters for Kling video generation
The Kling Camera Controls node defines the virtual camera's behavior, controlling camera movement and viewpoint changes during Kling video generation.
## Parameters
| Parameter | Type | Default | Description |
| --------------------- | ----- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| camera\_control\_type | Combo | "simple" | Predefined camera movement type. simple: customizable camera movement; down\_back: camera descends and moves backward; forward\_up: camera moves forward and tilts up; right\_turn\_forward: rotate right and move forward; left\_turn\_forward: rotate left and move forward |
| horizontal\_movement | Float | 0 | Controls the camera's movement along the horizontal (x) axis. Negative is left, positive is right |
| vertical\_movement | Float | 0 | Controls the camera's movement along the vertical (y) axis. Negative is down, positive is up |
| pan | Float | 0.5 | Controls the camera's rotation in the vertical plane (x-axis). Negative rotates down, positive rotates up |
| tilt | Float | 0 | Controls the camera's rotation in the horizontal plane (y-axis). Negative rotates left, positive rotates right |
| roll | Float | 0 | Controls the camera's roll (z-axis). Negative is counterclockwise, positive is clockwise |
| zoom | Float | 0 | Controls the change in the camera's focal length. Negative narrows the field of view, positive widens it |
**Note**: At least one camera control parameter must be non-zero for the configuration to take effect.
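This non-zero requirement is what the node's `VALIDATE_INPUTS` check enforces via the `is_valid_camera_control_configs` helper it calls. A plausible implementation of that helper (a sketch, not the verbatim source):

```python
def is_valid_camera_control_configs(values: list[float]) -> bool:
    """A camera control config is valid only if some parameter is non-zero."""
    return any(v != 0 for v in values)
```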
### Output
| Output | Type | Description |
| --------------- | --------------- | --------------------------------------------------- |
| camera\_control | CAMERA\_CONTROL | Configuration object containing the camera settings |
**Note**: Not all model and mode combinations support camera control. Refer to the Kling API documentation for more information.
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class KlingCameraControls(KlingNodeBase):
"""Kling Camera Controls Node"""
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"camera_control_type": (
IO.COMBO,
{
"options": [
camera_control_type.value
for camera_control_type in CameraType
],
"default": "simple",
"tooltip": "Predefined camera movements type. simple: Customizable camera movement. down_back: Camera descends and moves backward. forward_up: Camera moves forward and tilts up. right_turn_forward: Rotate right and move forward. left_turn_forward: Rotate left and move forward.",
},
),
"horizontal_movement": get_camera_control_input_config(
"Controls camera's movement along horizontal axis (x-axis). Negative indicates left, positive indicates right"
),
"vertical_movement": get_camera_control_input_config(
"Controls camera's movement along vertical axis (y-axis). Negative indicates downward, positive indicates upward."
),
"pan": get_camera_control_input_config(
"Controls camera's rotation in vertical plane (x-axis). Negative indicates downward rotation, positive indicates upward rotation.",
default=0.5,
),
"tilt": get_camera_control_input_config(
"Controls camera's rotation in horizontal plane (y-axis). Negative indicates left rotation, positive indicates right rotation.",
),
"roll": get_camera_control_input_config(
"Controls camera's rolling amount (z-axis). Negative indicates counterclockwise, positive indicates clockwise.",
),
"zoom": get_camera_control_input_config(
"Controls change in camera's focal length. Negative indicates narrower field of view, positive indicates wider field of view.",
),
}
}
DESCRIPTION = "Kling Camera Controls Node. Not all model and mode combinations support camera control. Please refer to the Kling API documentation for more information."
RETURN_TYPES = ("CAMERA_CONTROL",)
RETURN_NAMES = ("camera_control",)
FUNCTION = "main"
@classmethod
def VALIDATE_INPUTS(
cls,
horizontal_movement: float,
vertical_movement: float,
pan: float,
tilt: float,
roll: float,
zoom: float,
) -> bool | str:
if not is_valid_camera_control_configs(
[
horizontal_movement,
vertical_movement,
pan,
tilt,
roll,
zoom,
]
):
return "Invalid camera control configs: at least one of the values must be non-zero"
return True
def main(
self,
camera_control_type: str,
horizontal_movement: float,
vertical_movement: float,
pan: float,
tilt: float,
roll: float,
zoom: float,
) -> tuple[CameraControl]:
return (
CameraControl(
type=CameraType(camera_control_type),
config=CameraConfig(
horizontal=horizontal_movement,
vertical=vertical_movement,
pan=pan,
roll=roll,
tilt=tilt,
zoom=zoom,
),
),
)
```
# Kling Image to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/kwai_vgi/kling-image-to-video
A node that turns still images into dynamic videos with Kling's AI technology
The Kling Image to Video node converts still images into dynamic video content through Kling's image-to-video API.
## Parameters
### Basic Parameters
All of the following parameters are required:
| Parameter | Type | Default | Description |
| ---------------- | ------ | ------------ | ------------------------------------------------------------------ |
| start\_frame | Image | - | Input source image |
| prompt | String | "" | Text prompt describing the video's motion and content |
| negative\_prompt | String | "" | Elements you do not want to appear in the video |
| cfg\_scale | Float | 7.0 | Configuration scale controlling how closely the prompt is followed |
| model\_name | Combo | "kling-v1-5" | Model to use |
| aspect\_ratio | Combo | "16:9" | Aspect ratio of the output video |
| duration | Combo | "5s" | Duration of the generated video |
| mode | Combo | "pro" | Video generation mode |
### Output
| Output | Type | Description |
| --------- | ------ | ------------------------------ |
| VIDEO | Video | The generated video |
| video\_id | String | Unique identifier of the video |
| duration | String | Actual duration of the video |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class KlingImage2VideoNode(KlingNodeBase):
"""Kling Image to Video Node"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"start_frame": model_field_to_node_input(
IO.IMAGE, KlingImage2VideoRequest, "image"
),
"prompt": model_field_to_node_input(
IO.STRING, KlingImage2VideoRequest, "prompt", multiline=True
),
"negative_prompt": model_field_to_node_input(
IO.STRING,
KlingImage2VideoRequest,
"negative_prompt",
multiline=True,
),
"model_name": model_field_to_node_input(
IO.COMBO,
KlingImage2VideoRequest,
"model_name",
enum_type=KlingVideoGenModelName,
),
"cfg_scale": model_field_to_node_input(
IO.FLOAT, KlingImage2VideoRequest, "cfg_scale"
),
"mode": model_field_to_node_input(
IO.COMBO,
KlingImage2VideoRequest,
"mode",
enum_type=KlingVideoGenMode,
),
"aspect_ratio": model_field_to_node_input(
IO.COMBO,
KlingImage2VideoRequest,
"aspect_ratio",
enum_type=KlingVideoGenAspectRatio,
),
"duration": model_field_to_node_input(
IO.COMBO,
KlingImage2VideoRequest,
"duration",
enum_type=KlingVideoGenDuration,
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = ("VIDEO", "STRING", "STRING")
RETURN_NAMES = ("VIDEO", "video_id", "duration")
DESCRIPTION = "Kling Image to Video Node"
def get_response(self, task_id: str, auth_token: str) -> KlingImage2VideoResponse:
return poll_until_finished(
auth_token,
ApiEndpoint(
path=f"{PATH_IMAGE_TO_VIDEO}/{task_id}",
method=HttpMethod.GET,
request_model=KlingImage2VideoRequest,
response_model=KlingImage2VideoResponse,
),
)
def api_call(
self,
start_frame: torch.Tensor,
prompt: str,
negative_prompt: str,
model_name: str,
cfg_scale: float,
mode: str,
aspect_ratio: str,
duration: str,
camera_control: Optional[KlingCameraControl] = None,
end_frame: Optional[torch.Tensor] = None,
auth_token: Optional[str] = None,
) -> tuple[VideoFromFile]:
validate_prompts(prompt, negative_prompt, MAX_PROMPT_LENGTH_I2V)
initial_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=PATH_IMAGE_TO_VIDEO,
method=HttpMethod.POST,
request_model=KlingImage2VideoRequest,
response_model=KlingImage2VideoResponse,
),
request=KlingImage2VideoRequest(
model_name=KlingVideoGenModelName(model_name),
image=tensor_to_base64_string(start_frame),
image_tail=(
tensor_to_base64_string(end_frame)
if end_frame is not None
else None
),
prompt=prompt,
negative_prompt=negative_prompt if negative_prompt else None,
cfg_scale=cfg_scale,
mode=KlingVideoGenMode(mode),
aspect_ratio=KlingVideoGenAspectRatio(aspect_ratio),
duration=KlingVideoGenDuration(duration),
camera_control=camera_control,
),
auth_token=auth_token,
)
task_creation_response = initial_operation.execute()
validate_task_creation_response(task_creation_response)
task_id = task_creation_response.data.task_id
final_response = self.get_response(task_id, auth_token)
validate_video_result_response(final_response)
video = get_video_from_response(final_response)
return video_result_to_node_output(video)
```
# Kling Start-End Frame to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/kwai_vgi/kling-start-end-frame-to-video
A node that uses Kling's AI technology to create a video transitioning smoothly from a start frame to an end frame
The Kling Start-End Frame to Video node lets you supply a start image and an end image, then generates a video that transitions smoothly between them. The node creates all the in-between frames automatically, producing a fluid transformation.
## Parameters
### Required Parameters
| Parameter | Type | Description |
| ---------------- | ------ | --------------------------------------------------------------------------- |
| start\_frame | Image | First frame of the video |
| end\_frame | Image | Last frame of the video |
| prompt | String | Text prompt describing the video content and transition |
| negative\_prompt | String | Elements you do not want to appear in the video |
| cfg\_scale | Float | Configuration scale controlling how closely the prompt is followed |
| aspect\_ratio | Combo | Aspect ratio of the output video |
| mode | Combo | Video generation configuration in the format "mode / duration / model_name" |
### Mode Options
The node supports the following mode options:
* standard mode / 5s duration / kling-v1
* standard mode / 5s duration / kling-v1-5
* pro mode / 5s duration / kling-v1
* pro mode / 5s duration / kling-v1-5
* pro mode / 5s duration / kling-v1-6
* pro mode / 10s duration / kling-v1-5
* pro mode / 10s duration / kling-v1-6
The default is "pro mode / 5s duration / kling-v1"
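Each option string encodes three request fields. The node resolves them through a lookup table (shown in the source below) rather than string parsing, partly because "standard" is abbreviated to "std" in the request; still, the encoding itself can be illustrated with a small parser (a sketch; `parse_mode_option` is not part of the node):

```python
def parse_mode_option(option: str) -> tuple[str, str, str]:
    """Split a "mode / duration / model_name" option into its three parts."""
    mode, duration, model_name = (part.strip() for part in option.split(" / "))
    return (
        mode.removesuffix(" mode"),           # "pro mode" -> "pro"
        duration.removesuffix("s duration"),  # "5s duration" -> "5"
        model_name,
    )
```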
### Output
| Output | Type | Description |
| ------ | ----- | ------------------- |
| VIDEO | Video | The generated video |
## How It Works
The Kling Start-End Frame to Video node analyzes the start and end frame images, then creates a smooth transition sequence connecting the two states. The node sends the images and parameters to Kling's API servers, which generate all the necessary in-between frames for a fluid transformation.
The prompt can guide the style and content of the transition, while the negative prompt helps avoid unwanted elements.
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class KlingStartEndFrameNode(KlingImage2VideoNode):
"""
Kling First Last Frame Node. This node allows creation of a video from a first and last frame. It calls the normal image to video endpoint, but only allows the subset of input options that support the `image_tail` request field.
"""
@staticmethod
def get_mode_string_mapping() -> dict[str, tuple[str, str, str]]:
"""
Returns a mapping of mode strings to their corresponding (mode, duration, model_name) tuples.
Only includes config combos that support the `image_tail` request field.
"""
return {
"standard mode / 5s duration / kling-v1": ("std", "5", "kling-v1"),
"standard mode / 5s duration / kling-v1-5": ("std", "5", "kling-v1-5"),
"pro mode / 5s duration / kling-v1": ("pro", "5", "kling-v1"),
"pro mode / 5s duration / kling-v1-5": ("pro", "5", "kling-v1-5"),
"pro mode / 5s duration / kling-v1-6": ("pro", "5", "kling-v1-6"),
"pro mode / 10s duration / kling-v1-5": ("pro", "10", "kling-v1-5"),
"pro mode / 10s duration / kling-v1-6": ("pro", "10", "kling-v1-6"),
}
@classmethod
def INPUT_TYPES(s):
modes = list(KlingStartEndFrameNode.get_mode_string_mapping().keys())
return {
"required": {
"start_frame": model_field_to_node_input(
IO.IMAGE, KlingImage2VideoRequest, "image"
),
"end_frame": model_field_to_node_input(
IO.IMAGE, KlingImage2VideoRequest, "image_tail"
),
"prompt": model_field_to_node_input(
IO.STRING, KlingImage2VideoRequest, "prompt", multiline=True
),
"negative_prompt": model_field_to_node_input(
IO.STRING,
KlingImage2VideoRequest,
"negative_prompt",
multiline=True,
),
"cfg_scale": model_field_to_node_input(
IO.FLOAT, KlingImage2VideoRequest, "cfg_scale"
),
"aspect_ratio": model_field_to_node_input(
IO.COMBO,
KlingImage2VideoRequest,
"aspect_ratio",
enum_type=AspectRatio,
),
"mode": (
modes,
{
"default": modes[2],
"tooltip": "The configuration to use for the video generation following the format: mode / duration / model_name.",
},
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
DESCRIPTION = "Generate a video sequence that transitions between your provided start and end images. The node creates all frames in between, producing a smooth transformation from the first frame to the last."
def parse_inputs_from_mode(self, mode: str) -> tuple[str, str, str]:
"""Parses the mode input into a tuple of (model_name, duration, mode)."""
return KlingStartEndFrameNode.get_mode_string_mapping()[mode]
def api_call(
self,
start_frame: torch.Tensor,
end_frame: torch.Tensor,
prompt: str,
negative_prompt: str,
cfg_scale: float,
aspect_ratio: str,
mode: str,
auth_token: Optional[str] = None,
):
mode, duration, model_name = self.parse_inputs_from_mode(mode)
return super().api_call(
prompt=prompt,
negative_prompt=negative_prompt,
model_name=model_name,
start_frame=start_frame,
cfg_scale=cfg_scale,
mode=mode,
aspect_ratio=aspect_ratio,
duration=duration,
end_frame=end_frame,
auth_token=auth_token,
)
```
# Kling Text to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/kwai_vgi/kling-text-to-video
A node that turns text descriptions into videos with Kling's AI technology
The Kling Text to Video node connects to Kling's API service to generate video from text. Supply a descriptive text prompt and the node creates the corresponding video content.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ---------------- | ------ | ----------------- | ------------------------------------------------------------------ |
| prompt | String | "" | Text prompt describing the video content to generate |
| negative\_prompt | String | "" | Elements you do not want to appear in the video |
| cfg\_scale | Float | 7.0 | Configuration scale controlling how closely the prompt is followed |
| model\_name | Combo | "kling-v2-master" | Video generation model to use |
| aspect\_ratio | Combo | AspectRatio enum | Aspect ratio of the output video |
| duration | Combo | Duration enum | Duration of the generated video |
| mode | Combo | Mode enum | Video generation mode |
### Output
| Output | Type | Description |
| -------------- | ------ | ------------------------- |
| VIDEO | Video | The generated video |
| Kling ID | String | Task identifier |
| Duration (sec) | String | Video duration in seconds |
## How It Works
The node sends the text prompt to Kling's API servers, which process the request and return the generated video. The process consists of an initial request followed by task-status polling; once the task completes, the node downloads the video and outputs the result.
Generation can be tuned through the various parameters, including the negative prompt, configuration scale, and video properties. The node validates prompt length to ensure the request meets API requirements.
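The submit-then-poll pattern described above can be sketched independently of the Kling client types. This is a simplified stand-in for the `PollingOperation` machinery in the source below; `fetch_status` is a hypothetical callable returning the task status string:

```python
import time

def poll_task(fetch_status, poll_interval=5.0, timeout=600.0):
    """Poll fetch_status() until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status == "succeed":   # terminal success state
            return status
        if status == "failed":    # terminal failure state
            raise RuntimeError("Kling task failed")
        time.sleep(poll_interval) # still pending; wait and retry
    raise TimeoutError("polling timed out")
```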
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class KlingTextToVideoNode(KlingNodeBase):
"""Kling Text to Video Node"""
@staticmethod
def poll_for_task_status(task_id: str, auth_token: str) -> KlingText2VideoResponse:
"""Polls the Kling API endpoint until the task reaches a terminal state."""
polling_operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"{PATH_TEXT_TO_VIDEO}/{task_id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=KlingText2VideoResponse,
),
completed_statuses=[
TaskStatus.succeed.value,
],
failed_statuses=[TaskStatus.failed.value],
status_extractor=lambda response: (
response.data.task_status.value
if response.data and response.data.task_status
else None
),
auth_token=auth_token,
)
return polling_operation.execute()
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": model_field_to_node_input(
IO.STRING, KlingText2VideoRequest, "prompt", multiline=True
),
"negative_prompt": model_field_to_node_input(
IO.STRING, KlingText2VideoRequest, "negative_prompt", multiline=True
),
"model_name": model_field_to_node_input(
IO.COMBO,
KlingText2VideoRequest,
"model_name",
enum_type=ModelName,
default="kling-v2-master",
),
"cfg_scale": model_field_to_node_input(
IO.FLOAT, KlingText2VideoRequest, "cfg_scale"
),
"mode": model_field_to_node_input(
IO.COMBO, KlingText2VideoRequest, "mode", enum_type=Mode
),
"duration": model_field_to_node_input(
IO.COMBO, KlingText2VideoRequest, "duration", enum_type=Duration
),
"aspect_ratio": model_field_to_node_input(
IO.COMBO,
KlingText2VideoRequest,
"aspect_ratio",
enum_type=AspectRatio,
),
},
"hidden": {"auth_token": "AUTH_TOKEN_COMFY_ORG"},
}
RETURN_TYPES = ("VIDEO", "STRING", "STRING")
RETURN_NAMES = ("VIDEO", "Kling ID", "Duration (sec)")
DESCRIPTION = "Kling Text to Video Node"
def api_call(
self,
prompt: str,
negative_prompt: str,
model_name: str,
cfg_scale: float,
mode: str,
duration: int,
aspect_ratio: str,
camera_control: Optional[CameraControl] = None,
auth_token: Optional[str] = None,
) -> tuple[VideoFromFile, str, str]:
validate_prompts(prompt, negative_prompt, MAX_PROMPT_LENGTH_T2V)
initial_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=PATH_TEXT_TO_VIDEO,
method=HttpMethod.POST,
request_model=KlingText2VideoRequest,
response_model=KlingText2VideoResponse,
),
request=KlingText2VideoRequest(
prompt=prompt if prompt else None,
negative_prompt=negative_prompt if negative_prompt else None,
duration=Duration(duration),
mode=Mode(mode),
model_name=ModelName(model_name),
cfg_scale=cfg_scale,
aspect_ratio=AspectRatio(aspect_ratio),
camera_control=camera_control,
),
auth_token=auth_token,
)
initial_response = initial_operation.execute()
if not is_valid_initial_response(initial_response):
error_msg = f"Kling initial request failed. Code: {initial_response.code}, Message: {initial_response.message}, Data: {initial_response.data}"
logging.error(error_msg)
raise KlingApiError(error_msg)
task_id = initial_response.data.task_id
final_response = self.poll_for_task_status(task_id, auth_token)
if not is_valid_video_response(final_response):
error_msg = (
f"Kling task {task_id} succeeded but no video data found in response."
)
logging.error(error_msg)
raise KlingApiError(error_msg)
video = final_response.data.task_result.videos[0]
logging.debug("Kling task %s succeeded. Video URL: %s", task_id, video.url)
return (
download_url_to_video_output(video.url),
str(video.id),
str(video.duration),
)
```
# Luma Concepts - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/luma/luma-concepts
A helper node that provides camera concept guidance for Luma generation
The Luma Concepts node lets you apply predefined camera concepts to the Luma generation process, giving precise control over camera work and perspective without complex prompt descriptions.
## Node Function
This node acts as a helper for the Luma generation nodes, letting users select and apply predefined camera concepts. These concepts cover photographic parameters such as shooting angle (e.g. high angle, low angle), camera distance (e.g. close-up, wide shot), and camera motion (e.g. push-in, follow). It streamlines the creative workflow by providing an intuitive way to control the camera behavior of the generated result.
## Parameters
### Basic Parameters
| Parameter | Type | Description |
| --------- | ----- | --------------------------- |
| concept1 | combo | First camera concept; offers multiple preset camera options plus "none" |
| concept2 | combo | Second camera concept; offers multiple preset camera options plus "none" |
| concept3 | combo | Third camera concept; offers multiple preset camera options plus "none" |
| concept4 | combo | Fourth camera concept; offers multiple preset camera options plus "none" |
### Optional Parameters
| Parameter | Type | Description |
| -------------- | -------------- | ---------------------------------- |
| luma\_concepts | LUMA\_CONCEPTS | Optional additional Camera Concepts, merged with the concepts selected here |
### Output
| Output | Type | Description |
| -------------- | -------------- | --------------- |
| luma\_concepts | LUMA\_CONCEPTS | A combined object containing all selected camera concepts |
## Usage Examples
Luma Text to Video workflow example
Luma Image to Video workflow example
## How It Works
The Luma Concepts node offers a rich set of predefined camera concepts to choose from, including but not limited to:
* Camera distance (e.g. close-up, medium shot, wide shot)
* Camera height (e.g. eye level, high angle, low angle)
* Camera motion (e.g. push-in, follow, orbit)
* Special effects (e.g. handheld, stabilized, floating)
Users can select and combine up to four of these concepts. The node builds an object containing the selected camera concepts, which is then passed to a Luma generation node. During generation, Luma AI uses these concepts to influence the perspective and composition of the result, so the output reflects the chosen photographic effects.
By combining multiple camera concepts, users can build complex camera direction without writing detailed prompt descriptions. This is especially useful for scenes that require a specific photographic perspective or composition.
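The merge behavior described above can be sketched as follows. `ConceptChain` here is a simplified stand-in for ComfyUI's `LumaConceptChain`, written purely for illustration; only the filtering of "none" selections and the `clone_and_merge` shape mirror the real class.

```python
# Simplified stand-in for LumaConceptChain, for illustration only.
from dataclasses import dataclass, field

@dataclass
class ConceptChain:
    concepts: list = field(default_factory=list)

    @classmethod
    def from_choices(cls, choices):
        # "none" selections from the four combo inputs are dropped
        return cls([c for c in choices if c != "none"])

    def clone_and_merge(self, other):
        # Clone this chain and append the other chain's concepts
        return ConceptChain(list(self.concepts) + list(other.concepts))

# The node's four selections, with two left at "none"
local = ConceptChain.from_choices(["close_up", "none", "orbit", "none"])
# An optional chain coming in from an upstream Luma Concepts node
upstream = ConceptChain(["handheld"])
merged = upstream.clone_and_merge(local)
print(merged.concepts)  # ['handheld', 'close_up', 'orbit']
```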
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class LumaConceptsNode(ComfyNodeABC):
"""
Holds one or more Camera Concepts for use with Luma Text to Video and Luma Image to Video nodes.
"""
RETURN_TYPES = (LumaIO.LUMA_CONCEPTS,)
RETURN_NAMES = ("luma_concepts",)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "create_concepts"
CATEGORY = "api node/image/Luma"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"concept1": (get_luma_concepts(include_none=True),),
"concept2": (get_luma_concepts(include_none=True),),
"concept3": (get_luma_concepts(include_none=True),),
"concept4": (get_luma_concepts(include_none=True),),
},
"optional": {
"luma_concepts": (
LumaIO.LUMA_CONCEPTS,
{
"tooltip": "Optional Camera Concepts to add to the ones chosen here."
},
),
},
}
def create_concepts(
self,
concept1: str,
concept2: str,
concept3: str,
concept4: str,
luma_concepts: LumaConceptChain = None,
):
chain = LumaConceptChain(str_list=[concept1, concept2, concept3, concept4])
if luma_concepts is not None:
chain = luma_concepts.clone_and_merge(chain)
return (chain,)
```
# Luma Image to Video - ComfyUI Built-in API Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/luma/luma-image-to-video
A node that turns static images into dynamic videos with Luma AI
The Luma Image to Video node uses Luma AI's advanced technology to convert static images into smooth, dynamic video content, bringing images to life.
## Node Function
This node connects to Luma AI's image-to-video API, letting users create dynamic videos from input images. It understands the content of an image and generates natural, coherent motion while preserving the original image's visual style and characteristics. Combined with text prompts, users can precisely control the motion of the generated video.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ---------- | ------- | ------ | ---------------------------- |
| prompt | string | "" | Text prompt describing the video's motion and content |
| model | combo | - | Video generation model to use |
| resolution | combo | "540p" | Output video resolution |
| duration | combo | - | Video duration options |
| loop | boolean | False | Whether the video should loop |
| seed | integer | 0 | Seed used to decide whether the node should re-run; the actual result is independent of the seed |
### Optional Parameters
| Parameter | Type | Description |
| -------------- | -------------- | -------------------------------- |
| first\_image | image | First frame of the video (at least one of first\_image and last\_image must be provided) |
| last\_image | image | Last frame of the video (at least one of first\_image and last\_image must be provided) |
| luma\_concepts | LUMA\_CONCEPTS | Concept guidance for controlling camera motion and shot effects |
### Parameter Requirements
* At least one of **first\_image** and **last\_image** must be provided
* Each image input (first\_image and last\_image) accepts at most one image
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Usage Example
Luma Image to Video workflow tutorial
## How It Works
The Luma Image to Video node analyzes the content and structure of the input image, then combines it with the text prompt to decide how to animate the image. It uses Luma AI's generative model to understand the objects, people, or scenery in the image and create a plausible, coherent motion sequence.
Users can describe the desired type, direction, and intensity of motion in the prompt, and the node generates the corresponding video effect. Parameters such as resolution and duration further customize the output.
In addition, by providing first-frame and last-frame reference images, users can pin down the video's start and end states so the motion evolves in a specific direction. Concept guidance gives further control over the video's overall style, camera motion, and aesthetics.
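A minimal sketch of the keyframe rules above: at least one reference frame is required, and each supplied frame fills one keyframe slot. `upload_image` and the plain dicts are hypothetical placeholders for the real upload helper and the `LumaKeyframes` model.

```python
from typing import Optional

def upload_image(image: str) -> str:
    # Hypothetical placeholder: pretend the image was uploaded and a URL returned
    return f"https://example.com/uploads/{image}"

def build_keyframes(first_image: Optional[str], last_image: Optional[str]) -> dict:
    # Mirrors the node's validation: both frames missing is an error
    if first_image is None and last_image is None:
        raise ValueError("At least one of first_image and last_image is required.")
    keyframes = {}
    if first_image is not None:
        keyframes["frame0"] = {"type": "image", "url": upload_image(first_image)}
    if last_image is not None:
        keyframes["frame1"] = {"type": "image", "url": upload_image(last_image)}
    return keyframes

kf = build_keyframes("start.png", None)
print(sorted(kf))  # ['frame0']
```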
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class LumaImageToVideoGenerationNode(ComfyNodeABC):
"""
Generates videos synchronously based on prompt, input images, and output_size.
"""
RETURN_TYPES = (IO.VIDEO,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/video/Luma"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the video generation",
},
),
"model": ([model.value for model in LumaVideoModel],),
# "aspect_ratio": ([ratio.value for ratio in LumaAspectRatio], {
# "default": LumaAspectRatio.ratio_16_9,
# }),
"resolution": (
[resolution.value for resolution in LumaVideoOutputResolution],
{
"default": LumaVideoOutputResolution.res_540p,
},
),
"duration": ([dur.value for dur in LumaVideoModelOutputDuration],),
"loop": (
IO.BOOLEAN,
{
"default": False,
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
},
),
},
"optional": {
"first_image": (
IO.IMAGE,
{"tooltip": "First frame of generated video."},
),
"last_image": (IO.IMAGE, {"tooltip": "Last frame of generated video."}),
"luma_concepts": (
LumaIO.LUMA_CONCEPTS,
{
"tooltip": "Optional Camera Concepts to dictate camera motion via the Luma Concepts node."
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(
self,
prompt: str,
model: str,
resolution: str,
duration: str,
loop: bool,
seed,
first_image: torch.Tensor = None,
last_image: torch.Tensor = None,
luma_concepts: LumaConceptChain = None,
auth_token=None,
**kwargs,
):
if first_image is None and last_image is None:
raise Exception(
"At least one of first_image and last_image requires an input."
)
keyframes = self._convert_to_keyframes(first_image, last_image, auth_token)
duration = duration if model != LumaVideoModel.ray_1_6 else None
resolution = resolution if model != LumaVideoModel.ray_1_6 else None
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/luma/generations",
method=HttpMethod.POST,
request_model=LumaGenerationRequest,
response_model=LumaGeneration,
),
request=LumaGenerationRequest(
prompt=prompt,
model=model,
aspect_ratio=LumaAspectRatio.ratio_16_9, # ignored, but still needed by the API for some reason
resolution=resolution,
duration=duration,
loop=loop,
keyframes=keyframes,
concepts=luma_concepts.create_api_model() if luma_concepts else None,
),
auth_token=auth_token,
)
response_api: LumaGeneration = operation.execute()
operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"/proxy/luma/generations/{response_api.id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=LumaGeneration,
),
completed_statuses=[LumaState.completed],
failed_statuses=[LumaState.failed],
status_extractor=lambda x: x.state,
auth_token=auth_token,
)
response_poll = operation.execute()
vid_response = requests.get(response_poll.assets.video)
return (VideoFromFile(BytesIO(vid_response.content)),)
def _convert_to_keyframes(
self,
first_image: torch.Tensor = None,
last_image: torch.Tensor = None,
auth_token=None,
):
if first_image is None and last_image is None:
return None
frame0 = None
frame1 = None
if first_image is not None:
download_urls = upload_images_to_comfyapi(
first_image, max_images=1, auth_token=auth_token
)
frame0 = LumaImageReference(type="image", url=download_urls[0])
if last_image is not None:
download_urls = upload_images_to_comfyapi(
last_image, max_images=1, auth_token=auth_token
)
frame1 = LumaImageReference(type="image", url=download_urls[0])
return LumaKeyframes(frame0=frame0, frame1=frame1)
```
# Luma Text to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/luma/luma-text-to-video
A node that turns text descriptions into videos with Luma AI
The Luma Text to Video node lets you use Luma AI's innovative video generation technology to create high-quality, smooth video content from text descriptions.
## Node Function
This node connects to Luma AI's text-to-video API, letting users generate dynamic video content from detailed text prompts.
## Parameters
### Basic Parameters
| Parameter | Type | Default | Description |
| ------------- | ------- | -------------- | ----------------------------- |
| prompt | string | "" | Text prompt describing the video content to generate |
| model | combo | - | Video generation model to use |
| aspect\_ratio | combo | "ratio\_16\_9" | Video aspect ratio |
| resolution | combo | "res\_540p" | Video resolution |
| duration | combo | - | Video duration options |
| loop | boolean | False | Whether the video should loop |
| seed | integer | 0 | Seed used to decide whether the node should re-run; the actual result is independent of the seed |
When the Ray 1.6 model is used, the duration and resolution parameters have no effect.
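That rule can be sketched as a small normalization step, using plain strings in place of the `LumaVideoModel` enum; the node nulls both fields out before building the API request, as the source below shows.

```python
def apply_model_constraints(model: str, duration, resolution):
    # ray-1-6 ignores duration and resolution, so the node sends neither
    if model == "ray-1-6":  # stand-in for LumaVideoModel.ray_1_6
        return None, None
    return duration, resolution

print(apply_model_constraints("ray-1-6", "5s", "540p"))  # (None, None)
print(apply_model_constraints("ray-2", "5s", "540p"))    # ('5s', '540p')
```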
### Optional Parameters
| Parameter | Type | Description |
| -------------- | -------------- | --------------------------------- |
| luma\_concepts | LUMA\_CONCEPTS | Optional Camera Concepts, created by the Luma Concepts node to control camera motion |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Usage Example
Luma Text to Video workflow example
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class LumaTextToVideoGenerationNode(ComfyNodeABC):
"""
Generates videos synchronously based on prompt and output_size.
"""
RETURN_TYPES = (IO.VIDEO,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/video/Luma"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the video generation",
},
),
"model": ([model.value for model in LumaVideoModel],),
"aspect_ratio": (
[ratio.value for ratio in LumaAspectRatio],
{
"default": LumaAspectRatio.ratio_16_9,
},
),
"resolution": (
[resolution.value for resolution in LumaVideoOutputResolution],
{
"default": LumaVideoOutputResolution.res_540p,
},
),
"duration": ([dur.value for dur in LumaVideoModelOutputDuration],),
"loop": (
IO.BOOLEAN,
{
"default": False,
},
),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "Seed to determine if node should re-run; actual results are nondeterministic regardless of seed.",
},
),
},
"optional": {
"luma_concepts": (
LumaIO.LUMA_CONCEPTS,
{
"tooltip": "Optional Camera Concepts to dictate camera motion via the Luma Concepts node."
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(
self,
prompt: str,
model: str,
aspect_ratio: str,
resolution: str,
duration: str,
loop: bool,
seed,
luma_concepts: LumaConceptChain = None,
auth_token=None,
**kwargs,
):
duration = duration if model != LumaVideoModel.ray_1_6 else None
resolution = resolution if model != LumaVideoModel.ray_1_6 else None
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/luma/generations",
method=HttpMethod.POST,
request_model=LumaGenerationRequest,
response_model=LumaGeneration,
),
request=LumaGenerationRequest(
prompt=prompt,
model=model,
resolution=resolution,
aspect_ratio=aspect_ratio,
duration=duration,
loop=loop,
concepts=luma_concepts.create_api_model() if luma_concepts else None,
),
auth_token=auth_token,
)
response_api: LumaGeneration = operation.execute()
operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"/proxy/luma/generations/{response_api.id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=LumaGeneration,
),
completed_statuses=[LumaState.completed],
failed_statuses=[LumaState.failed],
status_extractor=lambda x: x.state,
auth_token=auth_token,
)
response_poll = operation.execute()
vid_response = requests.get(response_poll.assets.video)
return (VideoFromFile(BytesIO(vid_response.content)),)
```
# MiniMax Image to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/minimax/minimax-image-to-video
A node that turns static images into dynamic videos with MiniMax AI
The MiniMax Image to Video node uses MiniMax's API to synchronously generate video content from an input image and a prompt.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ------------ | ------ | -------- | ---------------------------------------------- |
| image | image | - | Input image used as the first frame of the generated video |
| prompt\_text | string | "" | Text prompt to guide the video generation |
| model | combo | "I2V-01" | Available models: "I2V-01-Director", "I2V-01", "I2V-01-live" |
### Optional Parameters
| Parameter | Type | Description |
| ---- | ------- | ------------ |
| seed | integer | Random seed used for creating the noise |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class MinimaxImageToVideoNode(MinimaxTextToVideoNode):
"""
Generates videos synchronously based on an image and prompt, and optional parameters using Minimax's API.
"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"image": (
IO.IMAGE,
{
"tooltip": "Image to use as first frame of video generation"
},
),
"prompt_text": (
"STRING",
{
"multiline": True,
"default": "",
"tooltip": "Text prompt to guide the video generation",
},
),
"model": (
[
"I2V-01-Director",
"I2V-01",
"I2V-01-live",
],
{
"default": "I2V-01",
"tooltip": "Model to use for video generation",
},
),
},
"optional": {
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "The random seed used for creating the noise.",
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
RETURN_TYPES = ("VIDEO",)
DESCRIPTION = "Generates videos from an image and prompts using Minimax's API"
FUNCTION = "generate_video"
CATEGORY = "api node/video/Minimax"
API_NODE = True
OUTPUT_NODE = True
```
# MiniMax Text to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/minimax/minimax-text-to-video
A node that turns text descriptions into videos with MiniMax AI
The MiniMax Text to Video node connects to MiniMax's API, letting users generate high-quality, smooth video content from text prompts. The node supports several video generation models and can create short clips in a variety of styles.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ------------ | ------- | -------- | ----------------------------------------- |
| prompt\_text | string | "" | Text prompt to guide the video generation |
| model | combo | "T2V-01" | Video generation model; available values are "T2V-01" and "T2V-01-Director" |
| seed | integer | 0 | Random seed for generation; affects the initial noise. Defaults to 0 |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-03)]
```python
class MinimaxTextToVideoNode:
"""
Generates videos synchronously based on a prompt, and optional parameters using Minimax's API.
"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt_text": (
"STRING",
{
"multiline": True,
"default": "",
"tooltip": "Text prompt to guide the video generation",
},
),
"model": (
[
"T2V-01",
"T2V-01-Director",
],
{
"default": "T2V-01",
"tooltip": "Model to use for video generation",
},
),
},
"optional": {
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 0xFFFFFFFFFFFFFFFF,
"control_after_generate": True,
"tooltip": "The random seed used for creating the noise.",
},
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
RETURN_TYPES = ("VIDEO",)
DESCRIPTION = "Generates videos from prompts using Minimax's API"
FUNCTION = "generate_video"
CATEGORY = "api node/video/Minimax"
API_NODE = True
OUTPUT_NODE = True
def generate_video(
self,
prompt_text,
seed=0,
model="T2V-01",
image: torch.Tensor=None, # used for ImageToVideo
subject: torch.Tensor=None, # used for SubjectToVideo
auth_token=None,
):
'''
Function used between Minimax nodes - supports T2V, I2V, and S2V, based on provided arguments.
'''
# upload image, if passed in
image_url = None
if image is not None:
image_url = upload_images_to_comfyapi(image, max_images=1, auth_token=auth_token)[0]
# TODO: figure out how to deal with subject properly, API returns invalid params when using S2V-01 model
subject_reference = None
if subject is not None:
subject_url = upload_images_to_comfyapi(subject, max_images=1, auth_token=auth_token)[0]
subject_reference = [SubjectReferenceItem(image=subject_url)]
video_generate_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/minimax/video_generation",
method=HttpMethod.POST,
request_model=MinimaxVideoGenerationRequest,
response_model=MinimaxVideoGenerationResponse,
),
request=MinimaxVideoGenerationRequest(
model=Model(model),
prompt=prompt_text,
callback_url=None,
first_frame_image=image_url,
subject_reference=subject_reference,
prompt_optimizer=None,
),
auth_token=auth_token,
)
response = video_generate_operation.execute()
task_id = response.task_id
if not task_id:
raise Exception(f"Minimax generation failed: {response.base_resp}")
video_generate_operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path="/proxy/minimax/query/video_generation",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=MinimaxTaskResultResponse,
query_params={"task_id": task_id},
),
completed_statuses=["Success"],
failed_statuses=["Fail"],
status_extractor=lambda x: x.status.value,
auth_token=auth_token,
)
task_result = video_generate_operation.execute()
file_id = task_result.file_id
if file_id is None:
raise Exception("Request was not successful. Missing file ID.")
file_retrieve_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/minimax/files/retrieve",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=MinimaxFileRetrieveResponse,
query_params={"file_id": int(file_id)},
),
request=EmptyRequest(),
auth_token=auth_token,
)
file_result = file_retrieve_operation.execute()
file_url = file_result.file.download_url
if file_url is None:
raise Exception(
f"No video was found in the response. Full response: {file_result.model_dump()}"
)
logging.info(f"Generated video URL: {file_url}")
video_io = download_url_to_bytesio(file_url)
if video_io is None:
error_msg = f"Failed to download video from {file_url}"
logging.error(error_msg)
raise Exception(error_msg)
return (VideoFromFile(video_io),)
```
# Pika 2.2 Image to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pika/pika-image-to-video
A node that turns static images into dynamic videos with Pika's AI technology
The Pika 2.2 Image to Video node connects to Pika's latest 2.2 API to turn static images into dynamic videos. It preserves the visual characteristics of the original image while adding natural motion based on the text prompt.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ---------------- | ------- | ------- | --------------- |
| image | image | - | Input image to convert to video |
| prompt\_text | string | "" | Text prompt describing the video's motion and content |
| negative\_prompt | string | "" | Elements that should not appear in the video |
| seed | integer | 0 | Random seed for the generation process |
| resolution | combo | "1080p" | Resolution of the generated video |
| duration | combo | "5s" | Duration of the generated video |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Workflow
The node sends the input image and the related parameters (prompt, resolution, duration, etc.) to Pika's API server as multipart form data. After processing, the API returns the generated video. Users can steer the result by adjusting the prompt, negative prompt, seed, and other parameters.
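As a sketch of that packaging step, the helper below builds the two halves of the multipart request: the image as a file part and the generation parameters as ordinary form fields. `build_payload` is a hypothetical helper written for illustration; the field names mirror the Pydantic model used in the node source below.

```python
import io

def build_payload(image_png: bytes, prompt_text: str, negative_prompt: str,
                  seed: int, resolution: str, duration: int):
    # File part: the image travels as a PNG attachment
    files = {"image": ("image.png", io.BytesIO(image_png), "image/png")}
    # Non-file part: generation parameters as ordinary form fields
    data = {
        "promptText": prompt_text,
        "negativePrompt": negative_prompt,
        "seed": seed,
        "resolution": resolution,
        "duration": duration,
    }
    return files, data

files, data = build_payload(b"\x89PNG", "a cat running", "", 0, "1080p", 5)
print(sorted(data))  # ['duration', 'negativePrompt', 'promptText', 'resolution', 'seed']
```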
## Source Code Reference
[Node source code (updated 2025-05-05)]
```python
class PikaImageToVideoV2_2(PikaNodeBase):
"""Pika 2.2 Image to Video Node."""
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"image": (
IO.IMAGE,
{"tooltip": "The image to convert to video"},
),
**cls.get_base_inputs_types(PikaBodyGenerate22I2vGenerate22I2vPost),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
DESCRIPTION = "Sends an image and prompt to the Pika API v2.2 to generate a video."
RETURN_TYPES = ("VIDEO",)
def api_call(
self,
image: torch.Tensor,
prompt_text: str,
negative_prompt: str,
seed: int,
resolution: str,
duration: int,
auth_token: Optional[str] = None,
) -> tuple[VideoFromFile]:
"""API call for Pika 2.2 Image to Video."""
# Convert image to BytesIO
image_bytes_io = tensor_to_bytesio(image)
image_bytes_io.seek(0) # Reset stream position
# Prepare file data for multipart upload
pika_files = {"image": ("image.png", image_bytes_io, "image/png")}
# Prepare non-file data using the Pydantic model
pika_request_data = PikaBodyGenerate22I2vGenerate22I2vPost(
promptText=prompt_text,
negativePrompt=negative_prompt,
seed=seed,
resolution=resolution,
duration=duration,
)
initial_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=PATH_IMAGE_TO_VIDEO,
method=HttpMethod.POST,
request_model=PikaBodyGenerate22I2vGenerate22I2vPost,
response_model=PikaGenerateResponse,
),
request=pika_request_data,
files=pika_files,
content_type="multipart/form-data",
auth_token=auth_token,
)
return self.execute_task(initial_operation, auth_token)
```
# Pika 2.2 Scenes - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pika/pika-scenes
A node that creates coherent scene videos from multiple images with Pika's AI technology
The Pika 2.2 Scenes node lets you upload multiple images as ingredients and generate a high-quality video containing the objects from those images. Using Pika's 2.2 API, it creates a video with coherent scene transitions based on the supplied images.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ----------------- | ------- | ------------------------- | ------------------ |
| prompt\_text | string | "" | Text prompt describing the video content and scene |
| negative\_prompt | string | "" | Elements that should not appear in the video |
| seed | integer | 0 | Random seed for the generation process |
| ingredients\_mode | combo | "creative" | How the images are combined |
| resolution | combo | API default | Resolution of the generated video |
| duration | combo | API default | Duration of the generated video |
| aspect\_ratio | float | 1.7777777777777777 (16:9) | Aspect ratio of the output video, range 0.4-2.5 |
### Optional Parameters
| Parameter | Type | Description |
| -------------------- | ----- | --------- |
| image\_ingredient\_1 | image | First image of the scene |
| image\_ingredient\_2 | image | Second image of the scene |
| image\_ingredient\_3 | image | Third image of the scene |
| image\_ingredient\_4 | image | Fourth image of the scene |
| image\_ingredient\_5 | image | Fifth image of the scene |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## How It Works
The Pika 2.2 Scenes node analyzes all input images and creates a video incorporating the elements they contain. It sends the images and parameters to Pika's API server and returns the generated video once processing completes.
Users can guide the style and content of the video with the prompt and exclude unwanted elements with the negative prompt. The node accepts up to five ingredient images and generates the final video according to the chosen combination mode, resolution, duration, and aspect ratio.
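The image-collection step can be sketched like this: unconnected inputs are skipped, and each connected image becomes one entry in a repeated multipart `images` field, as in the node source below. `collect_ingredients` is a hypothetical helper for illustration.

```python
import io

def collect_ingredients(*images):
    # Keep only the connected inputs, in slot order
    buffers = [io.BytesIO(img) for img in images if img is not None]
    # Each image becomes one ("images", ...) multipart entry
    return [
        ("images", (f"image_{i}.png", buf, "image/png"))
        for i, buf in enumerate(buffers)
    ]

# Two of the five optional ingredient slots are connected
files = collect_ingredients(b"a", None, b"c", None, None)
print([name for _, (name, _, _) in files])  # ['image_0.png', 'image_1.png']
```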
## Source Code Reference
```python
class PikaScenesV2_2(PikaNodeBase):
"""Pika 2.2 Scenes Node."""
@classmethod
def INPUT_TYPES(cls):
image_ingredient_input = (
IO.IMAGE,
{"tooltip": "Image that will be used as ingredient to create a video."},
)
return {
"required": {
**cls.get_base_inputs_types(
PikaBodyGenerate22C2vGenerate22PikascenesPost,
),
"ingredients_mode": model_field_to_node_input(
IO.COMBO,
PikaBodyGenerate22C2vGenerate22PikascenesPost,
"ingredientsMode",
enum_type=IngredientsMode,
default="creative",
),
"aspect_ratio": model_field_to_node_input(
IO.FLOAT,
PikaBodyGenerate22C2vGenerate22PikascenesPost,
"aspectRatio",
step=0.001,
min=0.4,
max=2.5,
default=1.7777777777777777,
),
},
"optional": {
"image_ingredient_1": image_ingredient_input,
"image_ingredient_2": image_ingredient_input,
"image_ingredient_3": image_ingredient_input,
"image_ingredient_4": image_ingredient_input,
"image_ingredient_5": image_ingredient_input,
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
DESCRIPTION = "Combine your images to create a video with the objects in them. Upload multiple images as ingredients and generate a high-quality video that incorporates all of them."
RETURN_TYPES = ("VIDEO",)
def api_call(
self,
prompt_text: str,
negative_prompt: str,
seed: int,
resolution: str,
duration: int,
ingredients_mode: str,
aspect_ratio: float,
image_ingredient_1: Optional[torch.Tensor] = None,
image_ingredient_2: Optional[torch.Tensor] = None,
image_ingredient_3: Optional[torch.Tensor] = None,
image_ingredient_4: Optional[torch.Tensor] = None,
image_ingredient_5: Optional[torch.Tensor] = None,
auth_token: Optional[str] = None,
) -> tuple[VideoFromFile]:
"""API call for Pika Scenes 2.2."""
all_image_bytes_io = []
for image in [
image_ingredient_1,
image_ingredient_2,
image_ingredient_3,
image_ingredient_4,
image_ingredient_5,
]:
if image is not None:
image_bytes_io = tensor_to_bytesio(image)
image_bytes_io.seek(0)
all_image_bytes_io.append(image_bytes_io)
# Prepare files data for multipart upload
pika_files = [
("images", (f"image_{i}.png", image_bytes_io, "image/png"))
for i, image_bytes_io in enumerate(all_image_bytes_io)
]
# Prepare non-file data using the Pydantic model
pika_request_data = PikaBodyGenerate22C2vGenerate22PikascenesPost(
ingredientsMode=ingredients_mode,
promptText=prompt_text,
negativePrompt=negative_prompt,
seed=seed,
resolution=resolution,
duration=duration,
aspectRatio=aspect_ratio,
)
initial_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=PATH_PIKASCENES,
method=HttpMethod.POST,
request_model=PikaBodyGenerate22C2vGenerate22PikascenesPost,
response_model=PikaGenerateResponse,
),
request=pika_request_data,
files=pika_files,
content_type="multipart/form-data",
auth_token=auth_token,
)
return self.execute_task(initial_operation, auth_token)
```
# Pika 2.2 Text to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pika/pika-text-to-video
A node that turns text descriptions into videos with Pika's AI technology
The Pika 2.2 Text to Video node lets you create video content from text descriptions using Pika's 2.2 API. It connects to Pika's text-to-video endpoint, letting users generate videos from text prompts with a range of parameters for controlling the result.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ---------------- | ------- | ------------------ | -------------------------- |
| prompt\_text | string | "" | Text prompt describing the video content to generate |
| negative\_prompt | string | "" | Elements that should not appear in the video |
| seed | integer | 0 | Random seed for the generation process |
| resolution | combo | "1080p" | Resolution of the generated video |
| duration | combo | "5s" | Duration of the generated video |
| aspect\_ratio | float | 1.7777777777777777 | Aspect ratio of the output video, range 0.4-2.5, step 0.001 |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-05)]
```python
class PikaTextToVideoNodeV2_2(PikaNodeBase):
"""Pika 2.2 Text to Video Node."""
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
**cls.get_base_inputs_types(PikaBodyGenerate22T2vGenerate22T2vPost),
"aspect_ratio": model_field_to_node_input(
IO.FLOAT,
PikaBodyGenerate22T2vGenerate22T2vPost,
"aspectRatio",
step=0.001,
min=0.4,
max=2.5,
default=1.7777777777777777,
),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
RETURN_TYPES = ("VIDEO",)
DESCRIPTION = "Sends a text prompt to the Pika API v2.2 to generate a video."
def api_call(
self,
prompt_text: str,
negative_prompt: str,
seed: int,
resolution: str,
duration: int,
aspect_ratio: float,
auth_token: Optional[str] = None,
) -> tuple[VideoFromFile]:
"""API call for Pika 2.2 Text to Video."""
initial_operation = SynchronousOperation(
endpoint=ApiEndpoint(
path=PATH_TEXT_TO_VIDEO,
method=HttpMethod.POST,
request_model=PikaBodyGenerate22T2vGenerate22T2vPost,
response_model=PikaGenerateResponse,
),
request=PikaBodyGenerate22T2vGenerate22T2vPost(
promptText=prompt_text,
negativePrompt=negative_prompt,
seed=seed,
resolution=resolution,
duration=duration,
aspectRatio=aspect_ratio,
),
auth_token=auth_token,
content_type="application/x-www-form-urlencoded",
)
return self.execute_task(initial_operation, auth_token)
```
# PixVerse Image to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pixverse/pixverse-image-to-video
A node that turns static images into dynamic videos with PixVerse's AI technology
The PixVerse Image to Video node uses PixVerse's API service to convert static images into dynamic videos. It preserves the visual characteristics of the original image while adding natural motion based on the text prompt.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ---------------- | ------- | ------------ | --------------- |
| image | image | - | Input image to convert to video |
| prompt | string | "" | Text prompt describing the video's motion and content |
| negative\_prompt | string | "" | Elements that should not appear in the video |
| seed | integer | -1 | Random seed for the generation process; -1 means random |
| quality | combo | "high" | Quality level of the generated video |
| aspect\_ratio | combo | "r16\_9" | Aspect ratio of the output video |
| duration | combo | "seconds\_4" | Duration of the generated video |
| motion\_mode | combo | "standard" | Motion mode of the video |
### Optional Parameters
| Parameter | Type | Default | Description |
| ------------------ | ------------------ | ---- | --------------- |
| pixverse\_template | PIXVERSE\_TEMPLATE | None | Optional PixVerse template configuration |
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-05)]
```python
class PixverseImageToVideoNode(ComfyNodeABC):
"""
Pixverse Image to Video
Generates videos from an image and prompts.
"""
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"image": ("IMAGE",),
"prompt": ("STRING", {"multiline": True, "default": ""}),
"negative_prompt": ("STRING", {"multiline": True, "default": ""}),
"seed": ("INT", {"default": -1, "min": -1, "max": 0xffffffffffffffff}),
"quality": (list(PixverseQuality.__members__.keys()), {"default": "high"}),
"aspect_ratio": (list(PixverseAspectRatio.__members__.keys()), {"default": "r16_9"}),
"duration": (list(PixverseDuration.__members__.keys()), {"default": "seconds_4"}),
"motion_mode": (list(PixverseMotionMode.__members__.keys()), {"default": "standard"}),
},
"optional": {
"pixverse_template": ("PIXVERSE_TEMPLATE",),
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
RETURN_TYPES = ("VIDEO",)
DESCRIPTION = "Generates videos from an image and prompts using Pixverse's API"
FUNCTION = "generate_video"
CATEGORY = "api node/video/Pixverse"
API_NODE = True
OUTPUT_NODE = True
```
# Pixverse Template - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pixverse/pixverse-template
A helper node that provides preset templates for Pixverse video generation
The Pixverse Template node lets you choose from predefined video generation templates that control the output style and effects of the Pixverse video generation nodes. It is a helper node that connects to a Pixverse generation node, letting users quickly apply a preset video style without manually tuning a complex combination of parameters.
## Parameters
### Required Parameters
| Parameter | Type | Description |
| -------- | ----- | --------------------- |
| template | combo | A template chosen from the list of available preset video generation templates |
### Output
| Output | Type | Description |
| ------------------ | ------------------- | ------------- |
| pixverse\_template | PixverseIO.TEMPLATE | A configuration object containing the selected template ID |
## Source Code Reference
[Node source code (updated 2025-05-05)]
```python
class PixverseTemplateNode:
"""
Select template for Pixverse Video generation.
"""
RETURN_TYPES = (PixverseIO.TEMPLATE,)
RETURN_NAMES = ("pixverse_template",)
FUNCTION = "create_template"
CATEGORY = "api node/video/Pixverse"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"template": (list(pixverse_templates.keys()), ),
}
}
def create_template(self, template: str):
template_id = pixverse_templates.get(template, None)
if template_id is None:
raise Exception(f"Template '{template}' is not recognized.")
# just return the integer
return (template_id,)
```
# PixVerse Text to Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pixverse/pixverse-text-to-video
A node that turns text descriptions into videos with PixVerse's AI technology
The PixVerse Text to Video node connects to PixVerse's text-to-video API, letting users generate high-quality video content from text descriptions. Parameters such as video quality, duration, and motion mode can be adjusted to suit different creative needs.
## Parameters
### Required Parameters
| Parameter | Type | Default | Description |
| ----------------- | ------- | ------------------------- | -------------------- |
| prompt | string | "" | Text prompt describing the video content to generate |
| aspect\_ratio | combo | - | Aspect ratio of the output video |
| quality | combo | PixverseQuality.res\_540p | Quality level of the generated video |
| duration\_seconds | combo | - | Duration of the generated video |
| motion\_mode | combo | - | Motion mode of the video |
| seed | integer | 0 | Random seed for the generation process; controls the consistency of results |
### Optional Parameters
| Parameter | Type | Default | Description |
| ------------------ | ------------------ | ---- | ------------------------ |
| negative\_prompt | string | "" | Elements that should not appear in the video |
| pixverse\_template | PIXVERSE\_TEMPLATE | None | Optional PixVerse template configuration that influences the generation style |
### Special Constraints
* At 1080p quality, only the normal motion mode is supported and the duration is fixed at 5 seconds
* For any duration other than 5 seconds, only the normal motion mode is supported
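These constraints can be sketched as a small normalization step, mirroring what the node does before calling the API (plain strings and integers stand in for the Pixverse enums):

```python
def normalize(quality: str, duration_seconds: int, motion_mode: str):
    # 1080p only supports normal motion at a fixed 5-second duration
    if quality == "res_1080p":
        return 5, "normal"
    # Any non-5-second duration also forces the normal motion mode
    if duration_seconds != 5:
        return duration_seconds, "normal"
    return duration_seconds, motion_mode

print(normalize("res_1080p", 8, "fast"))  # (5, 'normal')
print(normalize("res_540p", 8, "fast"))   # (8, 'normal')
print(normalize("res_540p", 5, "fast"))   # (5, 'fast')
```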
### Output
| Output | Type | Description |
| ----- | ----- | ------- |
| VIDEO | video | The generated video |
## Source Code Reference
[Node source code (updated 2025-05-05)]
```python
class PixverseTextToVideoNode(ComfyNodeABC):
"""
Generates videos synchronously based on prompt and output_size.
"""
RETURN_TYPES = (IO.VIDEO,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/video/Pixverse"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the video generation",
},
),
"aspect_ratio": (
[ratio.value for ratio in PixverseAspectRatio],
),
"quality": (
[resolution.value for resolution in PixverseQuality],
{
"default": PixverseQuality.res_540p,
},
),
"duration_seconds": ([dur.value for dur in PixverseDuration],),
"motion_mode": ([mode.value for mode in PixverseMotionMode],),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2147483647,
"control_after_generate": True,
"tooltip": "Seed for video generation.",
},
),
},
"optional": {
"negative_prompt": (
IO.STRING,
{
"default": "",
"forceInput": True,
"tooltip": "An optional text description of undesired elements on an image.",
},
),
"pixverse_template": (
PixverseIO.TEMPLATE,
{
"tooltip": "An optional template to influence style of generation, created by the Pixverse Template node."
}
)
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(
self,
prompt: str,
aspect_ratio: str,
quality: str,
duration_seconds: int,
motion_mode: str,
seed,
negative_prompt: str=None,
pixverse_template: int=None,
auth_token=None,
**kwargs,
):
# 1080p is limited to 5 seconds duration
# only normal motion_mode supported for 1080p or for non-5 second duration
if quality == PixverseQuality.res_1080p:
motion_mode = PixverseMotionMode.normal
duration_seconds = PixverseDuration.dur_5
elif duration_seconds != PixverseDuration.dur_5:
motion_mode = PixverseMotionMode.normal
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/pixverse/video/text/generate",
method=HttpMethod.POST,
request_model=PixverseTextVideoRequest,
response_model=PixverseVideoResponse,
),
request=PixverseTextVideoRequest(
prompt=prompt,
aspect_ratio=aspect_ratio,
quality=quality,
duration=duration_seconds,
motion_mode=motion_mode,
negative_prompt=negative_prompt if negative_prompt else None,
template_id=pixverse_template,
seed=seed,
),
auth_token=auth_token,
)
response_api = operation.execute()
if response_api.Resp is None:
raise Exception(f"Pixverse request failed: '{response_api.ErrMsg}'")
operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"/proxy/pixverse/video/result/{response_api.Resp.video_id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=PixverseGenerationStatusResponse,
),
completed_statuses=[PixverseStatus.successful],
failed_statuses=[PixverseStatus.contents_moderation, PixverseStatus.failed, PixverseStatus.deleted],
status_extractor=lambda x: x.Resp.status,
auth_token=auth_token,
)
response_poll = operation.execute()
vid_response = requests.get(response_poll.Resp.url)
return (VideoFromFile(BytesIO(vid_response.content)),)
```
# PixVerse Transition Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/api-node/video/pixverse/pixverse-transition-video
Create videos that transition smoothly from a start frame to an end frame using PixVerse's AI technology
The PixVerse Transition Video node connects to PixVerse's transition video generation API. You supply a start image and an end image, and the node generates a video sequence that transitions smoothly between them. It automatically creates all the intermediate frames, producing fluid transformations that are especially well suited to morphing, scene changes, and object evolution effects.
## Parameters
### Required Parameters
| Parameter         | Type   | Default                     | Description                                             |
| ----------------- | ------ | --------------------------- | ------------------------------------------------------- |
| first\_frame      | IMAGE  | -                           | Start frame of the video                                |
| last\_frame       | IMAGE  | -                           | End frame of the video                                  |
| prompt            | STRING | ""                          | Text prompt describing the video content and transition |
| quality           | COMBO  | "PixverseQuality.res\_540p" | Quality level of the generated video                    |
| duration\_seconds | COMBO  | -                           | Duration of the generated video                         |
| motion\_mode      | COMBO  | -                           | Motion mode of the video                                |
| seed              | INT    | 0                           | Random seed for generation, range 0-2147483647          |
### Optional Parameters
| Parameter          | Type               | Default | Description                                                                   |
| ------------------ | ------------------ | ------- | ----------------------------------------------------------------------------- |
| negative\_prompt   | STRING             | ""      | Elements you do not want to appear in the video                               |
| pixverse\_template | PIXVERSE\_TEMPLATE | None    | Optional PixVerse template configuration that influences the generation style |
### Parameter Constraints
* When quality is set to 1080p, motion\_mode is forced to normal and duration\_seconds is forced to 5 seconds
* When duration\_seconds is not 5 seconds, motion\_mode is forced to normal
### Output
| Output | Type  | Description         |
| ------ | ----- | ------------------- |
| VIDEO  | VIDEO | The generated video |
## Source Code Reference
```python
class PixverseTransitionVideoNode(ComfyNodeABC):
"""
Generates videos synchronously based on prompt and output_size.
"""
RETURN_TYPES = (IO.VIDEO,)
DESCRIPTION = cleandoc(__doc__ or "") # Handle potential None value
FUNCTION = "api_call"
API_NODE = True
CATEGORY = "api node/video/Pixverse"
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"first_frame": (
IO.IMAGE,
),
"last_frame": (
IO.IMAGE,
),
"prompt": (
IO.STRING,
{
"multiline": True,
"default": "",
"tooltip": "Prompt for the video generation",
},
),
"quality": (
[resolution.value for resolution in PixverseQuality],
{
"default": PixverseQuality.res_540p,
},
),
"duration_seconds": ([dur.value for dur in PixverseDuration],),
"motion_mode": ([mode.value for mode in PixverseMotionMode],),
"seed": (
IO.INT,
{
"default": 0,
"min": 0,
"max": 2147483647,
"control_after_generate": True,
"tooltip": "Seed for video generation.",
},
),
},
"optional": {
"negative_prompt": (
IO.STRING,
{
"default": "",
"forceInput": True,
"tooltip": "An optional text description of undesired elements on an image.",
},
),
"pixverse_template": (
PixverseIO.TEMPLATE,
{
"tooltip": "An optional template to influence style of generation, created by the Pixverse Template node."
}
)
},
"hidden": {
"auth_token": "AUTH_TOKEN_COMFY_ORG",
},
}
def api_call(
self,
first_frame: torch.Tensor,
last_frame: torch.Tensor,
prompt: str,
quality: str,
duration_seconds: int,
motion_mode: str,
seed,
negative_prompt: str=None,
pixverse_template: int=None,
auth_token=None,
**kwargs,
):
first_frame_id = upload_image_to_pixverse(first_frame, auth_token=auth_token)
last_frame_id = upload_image_to_pixverse(last_frame, auth_token=auth_token)
# 1080p is limited to 5 seconds duration
# only normal motion_mode supported for 1080p or for non-5 second duration
if quality == PixverseQuality.res_1080p:
motion_mode = PixverseMotionMode.normal
duration_seconds = PixverseDuration.dur_5
elif duration_seconds != PixverseDuration.dur_5:
motion_mode = PixverseMotionMode.normal
operation = SynchronousOperation(
endpoint=ApiEndpoint(
path="/proxy/pixverse/video/transition/generate",
method=HttpMethod.POST,
request_model=PixverseTransitionVideoRequest,
response_model=PixverseVideoResponse,
),
request=PixverseTransitionVideoRequest(
first_frame_img=first_frame_id,
last_frame_img=last_frame_id,
prompt=prompt,
quality=quality,
duration=duration_seconds,
motion_mode=motion_mode,
negative_prompt=negative_prompt if negative_prompt else None,
template_id=pixverse_template,
seed=seed,
),
auth_token=auth_token,
)
response_api = operation.execute()
if response_api.Resp is None:
raise Exception(f"Pixverse request failed: '{response_api.ErrMsg}'")
operation = PollingOperation(
poll_endpoint=ApiEndpoint(
path=f"/proxy/pixverse/video/result/{response_api.Resp.video_id}",
method=HttpMethod.GET,
request_model=EmptyRequest,
response_model=PixverseGenerationStatusResponse,
),
completed_statuses=[PixverseStatus.successful],
failed_statuses=[PixverseStatus.contents_moderation, PixverseStatus.failed, PixverseStatus.deleted],
status_extractor=lambda x: x.Resp.status,
auth_token=auth_token,
)
response_poll = operation.execute()
vid_response = requests.get(response_poll.Resp.url)
return (VideoFromFile(BytesIO(vid_response.content)),)
```
# Wan Vace To Video - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/conditioning/video-models/wan-vace-to-video
Create videos with Alibaba Wan's high-resolution video generation models
The Wan Vace To Video node lets you generate videos from text prompts and supports multiple kinds of input, including text, images, video, masks, and control signals.
The node combines the input conditioning (prompts), control video, and masks to produce a high-quality video. It first preprocesses and encodes the inputs, then applies the conditioning information to produce the final video latent.
When a reference image is provided, it serves as the starting reference for the video. The control video and masks can be used to guide the generation process so that the output better matches your expectations.
## Parameters
### Required Parameters
| Parameter   | Type         | Default | Range              | Description                                     |
| ----------- | ------------ | ------- | ------------------ | ----------------------------------------------- |
| positive    | CONDITIONING | -       | -                  | Positive prompt conditioning                    |
| negative    | CONDITIONING | -       | -                  | Negative prompt conditioning                    |
| vae         | VAE          | -       | -                  | VAE model used for encoding/decoding            |
| width       | INT          | 832     | 16-MAX\_RESOLUTION | Width of the generated video, step 16           |
| height      | INT          | 480     | 16-MAX\_RESOLUTION | Height of the generated video, step 16          |
| length      | INT          | 81      | 1-MAX\_RESOLUTION  | Number of frames in the generated video, step 4 |
| batch\_size | INT          | 1       | 1-4096             | Batch size                                      |
| strength    | FLOAT        | 1.0     | 0.0-1000.0         | Conditioning strength, step 0.01                |
### Optional Parameters
| Parameter        | Type  | Description                                                                           |
| ---------------- | ----- | ------------------------------------------------------------------------------------- |
| control\_video   | IMAGE | Control video used to guide the generation process                                    |
| control\_masks   | MASK  | Control masks defining which regions of the video are affected by the control         |
| reference\_image | IMAGE | Reference image (single frame) used as the starting point or reference for generation |
### Outputs
| Output       | Type         | Description |
| ------------ | ------------ | ----------- |
| positive     | CONDITIONING | Processed positive prompt conditioning |
| negative     | CONDITIONING | Processed negative prompt conditioning |
| latent       | LATENT       | The generated video latent |
| trim\_latent | INT          | Trim amount for the latent, 0 by default. When a reference image is provided, this is set to the size the reference occupies in the latent's temporal dimension. It tells downstream nodes how much reference-image content to trim from the generated latent, so that the reference image's influence on the final video output is properly controlled. |
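As a rough sketch of the arithmetic behind `length` and `trim_latent` (assuming the temporal compression factor of 4 used in the source below, and that a single reference image occupies one latent frame; the helper name is made up for illustration):

```python
def vace_latent_dims(length, has_reference, reference_latent_frames=1):
    """Sketch of how WanVaceToVideo sizes its output latent.
    `reference_latent_frames` is an assumption: a single reference image
    is taken to occupy one frame in the latent's temporal dimension."""
    # Temporal compression: every 4 pixel-space frames map to 1 latent frame.
    latent_length = ((length - 1) // 4) + 1
    trim_latent = 0
    if has_reference:
        # The reference is prepended, so downstream nodes must trim it off.
        latent_length += reference_latent_frames
        trim_latent = reference_latent_frames
    return latent_length, trim_latent

print(vace_latent_dims(81, has_reference=True))   # (22, 1)
print(vace_latent_dims(81, has_reference=False))  # (21, 0)
```

With the default `length` of 81 and a reference image, the node produces a 22-frame latent and `trim_latent = 1`, which is the amount a downstream TrimVideoLatent node would cut off.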
## Source Code
\[Source updated: 2025-05-15]
```Python
class WanVaceToVideo:
@classmethod
def INPUT_TYPES(s):
return {"required": {"positive": ("CONDITIONING", ),
"negative": ("CONDITIONING", ),
"vae": ("VAE", ),
"width": ("INT", {"default": 832, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 16}),
"height": ("INT", {"default": 480, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 16}),
"length": ("INT", {"default": 81, "min": 1, "max": nodes.MAX_RESOLUTION, "step": 4}),
"batch_size": ("INT", {"default": 1, "min": 1, "max": 4096}),
"strength": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 1000.0, "step": 0.01}),
},
"optional": {"control_video": ("IMAGE", ),
"control_masks": ("MASK", ),
"reference_image": ("IMAGE", ),
}}
RETURN_TYPES = ("CONDITIONING", "CONDITIONING", "LATENT", "INT")
RETURN_NAMES = ("positive", "negative", "latent", "trim_latent")
FUNCTION = "encode"
CATEGORY = "conditioning/video_models"
EXPERIMENTAL = True
def encode(self, positive, negative, vae, width, height, length, batch_size, strength, control_video=None, control_masks=None, reference_image=None):
latent_length = ((length - 1) // 4) + 1
if control_video is not None:
control_video = comfy.utils.common_upscale(control_video[:length].movedim(-1, 1), width, height, "bilinear", "center").movedim(1, -1)
if control_video.shape[0] < length:
control_video = torch.nn.functional.pad(control_video, (0, 0, 0, 0, 0, 0, 0, length - control_video.shape[0]), value=0.5)
else:
control_video = torch.ones((length, height, width, 3)) * 0.5
if reference_image is not None:
reference_image = comfy.utils.common_upscale(reference_image[:1].movedim(-1, 1), width, height, "bilinear", "center").movedim(1, -1)
reference_image = vae.encode(reference_image[:, :, :, :3])
reference_image = torch.cat([reference_image, comfy.latent_formats.Wan21().process_out(torch.zeros_like(reference_image))], dim=1)
if control_masks is None:
mask = torch.ones((length, height, width, 1))
else:
mask = control_masks
if mask.ndim == 3:
mask = mask.unsqueeze(1)
mask = comfy.utils.common_upscale(mask[:length], width, height, "bilinear", "center").movedim(1, -1)
if mask.shape[0] < length:
mask = torch.nn.functional.pad(mask, (0, 0, 0, 0, 0, 0, 0, length - mask.shape[0]), value=1.0)
control_video = control_video - 0.5
inactive = (control_video * (1 - mask)) + 0.5
reactive = (control_video * mask) + 0.5
inactive = vae.encode(inactive[:, :, :, :3])
reactive = vae.encode(reactive[:, :, :, :3])
control_video_latent = torch.cat((inactive, reactive), dim=1)
if reference_image is not None:
control_video_latent = torch.cat((reference_image, control_video_latent), dim=2)
vae_stride = 8
height_mask = height // vae_stride
width_mask = width // vae_stride
mask = mask.view(length, height_mask, vae_stride, width_mask, vae_stride)
mask = mask.permute(2, 4, 0, 1, 3)
mask = mask.reshape(vae_stride * vae_stride, length, height_mask, width_mask)
mask = torch.nn.functional.interpolate(mask.unsqueeze(0), size=(latent_length, height_mask, width_mask), mode='nearest-exact').squeeze(0)
trim_latent = 0
if reference_image is not None:
mask_pad = torch.zeros_like(mask[:, :reference_image.shape[2], :, :])
mask = torch.cat((mask_pad, mask), dim=1)
latent_length += reference_image.shape[2]
trim_latent = reference_image.shape[2]
mask = mask.unsqueeze(0)
positive = node_helpers.conditioning_set_values(positive, {"vace_frames": control_video_latent, "vace_mask": mask, "vace_strength": strength})
negative = node_helpers.conditioning_set_values(negative, {"vace_frames": control_video_latent, "vace_mask": mask, "vace_strength": strength})
latent = torch.zeros([batch_size, 16, latent_length, height // 8, width // 8], device=comfy.model_management.intermediate_device())
out_latent = {}
out_latent["samples"] = latent
return (positive, negative, out_latent, trim_latent)
```
# TrimVideoLatent Node
Source: https://docs.comfy.org/zh-CN/built-in-nodes/latent/video/trim-video-latent
Trim video frames in latent space
The TrimVideoLatent node trims video frames in latent space (LATENT). It is typically used when working with video latent sequences to remove unwanted leading frames, i.e. to trim the video from the front.
Basic usage: feed the video latent to be trimmed into samples, and set trim\_amount to the number of frames to trim. The node cuts the specified number of frames from the beginning of the video and outputs the remaining latent sequence.
Typical scenarios: removing unwanted leading frames in video generation and editing, or working with other nodes to splice and process video segments.
## Parameters
### Input Parameters
| Parameter    | Type   | Required | Default | Description                               |
| ------------ | ------ | -------- | ------- | ----------------------------------------- |
| samples      | LATENT | Yes      | 0       | Input video latent                        |
| trim\_amount | INT    | Yes      | 0       | Number of frames to trim (from the front) |
### Output Parameters
| Output  | Type   | Description              |
| ------- | ------ | ------------------------ |
| samples | LATENT | The trimmed video latent |
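The trim itself is a slice along the latent's temporal axis (`s1[:, :, trim_amount:]` in the source below). A minimal sketch, using nested lists in place of a `torch.Tensor`:

```python
def trim_video_latent(samples, trim_amount):
    """Drop the first `trim_amount` latent frames.
    `samples` stands in for a [batch, channels, frames, ...] tensor,
    modeled here as nested lists."""
    return [
        [channel[trim_amount:] for channel in batch_item]
        for batch_item in samples
    ]

# One batch item, one channel, five latent "frames":
latent = [[[0, 1, 2, 3, 4]]]
print(trim_video_latent(latent, 2))  # [[[2, 3, 4]]]
```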
## Usage Example
Wan2.1 VACE video generation workflow example
## Source Code
```python
class TrimVideoLatent:
@classmethod
def INPUT_TYPES(s):
return {"required": { "samples": ("LATENT",),
"trim_amount": ("INT", {"default": 0, "min": 0, "max": 99999}),
}}
RETURN_TYPES = ("LATENT",)
FUNCTION = "op"
CATEGORY = "latent/video"
EXPERIMENTAL = True
def op(self, samples, trim_amount):
samples_out = samples.copy()
s1 = samples["samples"]
samples_out["samples"] = s1[:, :, trim_amount:]
return (samples_out,)
```
# ComfyUI Built-in Nodes
Source: https://docs.comfy.org/zh-CN/built-in-nodes/overview
Introduction to ComfyUI's built-in nodes
Built-in nodes are ComfyUI's default nodes. They form the core functionality of ComfyUI and can be used without installing any third-party custom node packages.
## Documentation Update Notes
We now ship help documentation for built-in nodes, so the content in this section is periodically synced from [this repository](https://github.com/Comfy-Org/embedded-docs); at present we perform a manual sync and content update once a week.
## Contributing
If you find errors in our content, or want to fill in something we are missing, please submit an issue or PR in [this repository](https://github.com/Comfy-Org/embedded-docs) to help us improve.
# KSampler - ComfyUI Built-in Node Documentation
Source: https://docs.comfy.org/zh-CN/built-in-nodes/sampling/ksampler
The KSampler node is a commonly used sampling node in ComfyUI.
The KSampler node performs multi-step denoising sampling on a latent image. It combines positive and negative conditioning (prompts) and, using the specified sampling algorithm and scheduler, produces a high-quality latent image. It is a staple of text-to-image, image-to-image, and other AI image generation workflows.
## Parameters
### Input Parameters
| Parameter     | Type         | Required | Default | Description                                                                                                              |
| ------------- | ------------ | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------ |
| model         | MODEL        | Yes      | -       | The model used for denoising (e.g. a Stable Diffusion model)                                                             |
| seed          | INT          | Yes      | 0       | Random seed, makes results reproducible                                                                                  |
| steps         | INT          | Yes      | 20      | Number of denoising steps; more steps give finer images but slower generation                                            |
| cfg           | FLOAT        | Yes      | 8.0     | Classifier-Free Guidance scale; higher values follow the prompt more closely, but values that are too high hurt quality  |
| sampler\_name | COMBO        | Yes      | -       | Sampling algorithm; affects generation speed, style, and quality                                                         |
| scheduler     | COMBO        | Yes      | -       | Scheduler; controls how noise is removed                                                                                 |
| positive      | CONDITIONING | Yes      | -       | Positive conditioning describing what the image should contain                                                           |
| negative      | CONDITIONING | Yes      | -       | Negative conditioning describing what the image should exclude                                                           |
| latent\_image | LATENT       | Yes      | -       | The latent image to denoise, usually noise or the output of a previous step                                              |
| denoise       | FLOAT        | Yes      | 1.0     | Denoising strength; 1.0 is full denoising, lower values preserve more of the original structure, useful for image-to-image |
### Output Parameters
| Output  | Type   | Description                                                          |
| ------- | ------ | -------------------------------------------------------------------- |
| samples | LATENT | The denoised latent image, ready to be decoded into the final image  |
## Usage Examples
Stable Diffusion 1.5 text-to-image workflow example
Stable Diffusion 1.5 image-to-image workflow example
## Source Code
\[Updated May 15, 2025]
```Python
def common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent, denoise=1.0, disable_noise=False, start_step=None, last_step=None, force_full_denoise=False):
latent_image = latent["samples"]
latent_image = comfy.sample.fix_empty_latent_channels(model, latent_image)
if disable_noise:
noise = torch.zeros(latent_image.size(), dtype=latent_image.dtype, layout=latent_image.layout, device="cpu")
else:
batch_inds = latent["batch_index"] if "batch_index" in latent else None
noise = comfy.sample.prepare_noise(latent_image, seed, batch_inds)
noise_mask = None
if "noise_mask" in latent:
noise_mask = latent["noise_mask"]
callback = latent_preview.prepare_callback(model, steps)
disable_pbar = not comfy.utils.PROGRESS_BAR_ENABLED
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
denoise=denoise, disable_noise=disable_noise, start_step=start_step, last_step=last_step,
force_full_denoise=force_full_denoise, noise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
out = latent.copy()
out["samples"] = samples
return (out, )
class KSampler:
@classmethod
def INPUT_TYPES(s):
return {
"required": {
"model": ("MODEL", {"tooltip": "The model used for denoising the input latent."}),
"seed": ("INT", {"default": 0, "min": 0, "max": 0xffffffffffffffff, "control_after_generate": True, "tooltip": "The random seed used for creating the noise."}),
"steps": ("INT", {"default": 20, "min": 1, "max": 10000, "tooltip": "The number of steps used in the denoising process."}),
"cfg": ("FLOAT", {"default": 8.0, "min": 0.0, "max": 100.0, "step":0.1, "round": 0.01, "tooltip": "The Classifier-Free Guidance scale balances creativity and adherence to the prompt. Higher values result in images more closely matching the prompt however too high values will negatively impact quality."}),
"sampler_name": (comfy.samplers.KSampler.SAMPLERS, {"tooltip": "The algorithm used when sampling, this can affect the quality, speed, and style of the generated output."}),
"scheduler": (comfy.samplers.KSampler.SCHEDULERS, {"tooltip": "The scheduler controls how noise is gradually removed to form the image."}),
"positive": ("CONDITIONING", {"tooltip": "The conditioning describing the attributes you want to include in the image."}),
"negative": ("CONDITIONING", {"tooltip": "The conditioning describing the attributes you want to exclude from the image."}),
"latent_image": ("LATENT", {"tooltip": "The latent image to denoise."}),
"denoise": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 1.0, "step": 0.01, "tooltip": "The amount of denoising applied, lower values will maintain the structure of the initial image allowing for image to image sampling."}),
}
}
RETURN_TYPES = ("LATENT",)
OUTPUT_TOOLTIPS = ("The denoised latent.",)
FUNCTION = "sample"
CATEGORY = "sampling"
DESCRIPTION = "Uses the provided model, positive and negative conditioning to denoise the latent image."
def sample(self, model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=1.0):
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
```
# Changelog
Source: https://docs.comfy.org/zh-CN/changelog/index
Track the latest ComfyUI features, improvements, and bug fixes
**Wan2.2 S2V Workflow Enhancements and Expanded Model Support**
This release focuses on Wan2.2 S2V video workflow features and expands model support:
**Wan2.2 S2V Workflow Control**
* **WanSoundImageToVideoExtend node**: New manual video extension node for audio-driven video workflows, giving creators precise control over the length and timing of generated videos. This enables fine-grained control over how audio content is turned into video sequences.
* **Audio/video synchronization**: Fixed a critical issue where workflows failed when the video extension exceeded the audio length, ensuring reliable audio-to-video generation regardless of audio duration.
* **Automatic audio trimming**: Video saving now automatically trims the audio to match the video length, eliminating audio/video sync issues in the final output files.
**Advanced Latent Processing**
* **LatentCut node**: New node for cutting latents at precise points, enabling finer control over latent-space operations in complex generation workflows. This is particularly useful for batch processing and temporal video workflows, such as removing certain frames from a video.
**Wan2.2 5B Model Integration**
* **Fun Control model support**: Added support for the Wan2.2 5B fun control model.
* **Fun Inpaint model support**: Integrated the Wan2.2 5B fun inpaint model.
**Workflow Templates**
* **Templates v0.1.70**: Added an initial Wan2.2 S2V workflow
**Node Model Patch Improvements**
This focused update improves the core node model patch system underpinning ComfyUI's flexible architecture:
**Core Infrastructure Enhancements**
* **Node model patch update**: Enhanced nodes\_model\_patch.py with improved underlying model patch mechanisms, making it easier for ComfyUI to extend to Qwen-Image ControlNet
**Workflow Benefits**
* **Improved stability**: Core model patch improvements contribute to more reliable node execution and model handling across different workflow configurations
**Audio Workflow Integration and Performance Optimizations**
This release adds audio processing capabilities to ComfyUI, along with performance improvements and model compatibility updates:
**Audio Processing Updates**
* **Wav2vec2 audio encoder**: Added a native wav2vec2 implementation as an audio encoder model, enabling audio-to-embedding workflows for multimodal applications
* **Audio encoder directory**: Added the models/audio\_encoders directory, used for the Wan2.2 S2V audio encoder
* **AudioEncoderOutput V3 support**: Made AudioEncoderOutput compatible with the V3 node architecture, ensuring seamless integration with modern workflow architectures
**Google Gemini API Integration**
* **Gemini Image API node**: Added a new Google Gemini Image API node for the "Nano Banana" image editing model API, which offers a high degree of consistency
**Video Generation Performance and Memory Optimizations**
* **WAN 2.2 S2V model support**: In-progress implementation of the WAN 2.2 sound-to-video model, with optimized memory usage and performance
* **Improved S2V performance**: Performance improvements for generating videos longer than 120 frames, improving extended video workflows
* **Better memory estimation**: Improved memory usage estimation for S2V workflows, preventing out-of-memory errors during long video generation
* **Negative audio handling**: Fixed negative audio input handling in S2V workflows to use proper zero values
**Sampling and Node Enhancements**
* **DPM++ 2M SDE Heun (RES) sampler**: New advanced sampler contributed by @Balladie, providing an additional sampling option for fine-grained generation control
* **LatentConcat node**: New node for concatenating latent tensors, enabling advanced latent-space manipulation workflows
* **EasyCache/LazyCache stability**: Fixed critical crashes when tensor attributes (shape/dtype/device) change during sampling, ensuring workflow reliability
**Model Compatibility Improvements**
* **ControlNet-type models**: Compatibility fixes for ControlNet-type models used with Qwen Edit and Kontext workflows
* **Flux memory optimization**: Adjusted Flux model memory usage factors for better resource utilization
**Infrastructure and Reliability**
* **Template updates**: Updated to versions 0.1.66 and 0.1.68
* **Documentation cleanup**: Removed models that are not fully implemented from the README to avoid user confusion
**Expanded Model Support and Qwen Image ControlNet Integration**
This release significantly expands ControlNet functionality and improves model compatibility, making ComfyUI workflows more versatile and reliable:
**Qwen ControlNet Ecosystem**
* **Diffsynth ControlNet support**: Added Canny and depth conditioning support for Qwen Diffsynth ControlNets, enabling precise edge- and depth-based image control
* **InstantX Qwen ControlNet**: Integrated the InstantX Qwen ControlNet, expanding creative control options
* **Inpaint ControlNet/models**: Enhanced inpainting through dedicated Diffsynth inpaint ControlNet support
**Node Architecture and API Evolution**
* **V3 architecture migration**: String nodes and the Google Veo and Ideogram API nodes were upgraded to the V3 architecture for better performance and consistency
* **Enhanced API nodes**: The OpenAI Chat node was renamed "OpenAI ChatGPT" for recognizability, and the Gemini Chat node now includes a copy button
* **Improved usability**: API nodes now offer a better user experience through clearer labels and enhanced interactions
**Workflow Reliability and Performance**
* **LTXV noise mask fix**: Resolved keyframe noise mask dimension issues when a real noise mask is present, ensuring stable video workflow execution
* **3D latent conditioning**: Fixed conditioning masks for 3D latent spaces, enabling proper depth-aware conditioning in advanced workflows
* **Invalid filename handling**: Improved workflow saving by correctly handling invalid filenames, preventing save failures
* **EasyCache and LazyCache**: Implemented advanced caching systems to improve workflow execution performance
**Platform and Development Improvements**
* **Python 3.13 support**: Full compatibility with Python 3.13, keeping ComfyUI aligned with the latest Python releases
* **Frontend update**: Updated to v1.25.10 with improved navigation and user interface enhancements
* **Elementwise fusion**: Added performance optimizations through fused elementwise operations
* **Navigation mode rollback**: Rolled the default navigation back to the legacy mode to avoid the user-experience issues caused by enabling the new standard navigation mode by default; users can still enable standard navigation in the settings
**Model Support**
* **Qwen-Image-Edit model**: Native support for Qwen-Image-Edit
* **FluxKontextMultiReferenceLatentMethod node**: Multi-image reference input node for Flux workflows
* **WAN 2.2 Fun Camera model support**: Supports video generation with camera control
* **Template updates**: Upgraded to version 0.1.62, adding templates such as Wan2.2 Fun Camera and Qwen Image Edit
**Core Improvements**
* **Context Windows node support**: Enhanced sampling code to support generation over longer sequences
* **SDPA backend optimization**: Improved scaled dot-product attention backend settings for better performance
**Multimedia Node Support**
* **Audio recording node**: Added a native audio recording node; you can now record audio directly inside ComfyUI
* **Audio/video integration**: Improved audio-video dependency integration
**API Node Updates**
* **GPT-5 series models**: Support for the latest GPT-5 models
* **Kling V2-1 and V2-1-Master**: Updated video generation model capabilities
* **Minimax Hailuo video node**: New video generation node
* **Vidu video nodes**: Vidu API node support
* **Google model updates**: Added new Google Gemini models
* **OpenAI API fix**: Fixed an error caused by the MIME type of input images in OpenAI API nodes
**Performance Optimizations**
* **Intel GPU compatibility**: Fixed Intel integrated GPU compatibility issues
* **PyTorch compatibility**: Improved compatibility with older PyTorch versions
* **Torch compile optimization**: Improved torch compile behavior
* **Memory management**: Optimized install size and memory efficiency
**Frontend Changes**
* **Subgraph support**: Support for subgraphs
* **Shortcut panel**: Added a bottom shortcut-key panel
* **UI layout changes**: Reworked the terminal entry layout and added multiple entries such as templates and the log panel
* **New standard canvas mode**: Added a standard canvas mode, switchable under Canvas > canvas navigation mode in litegraph
* **Minimap**: Added a workflow minimap
* **Tab previews**: Added workflow tab previews
* **Adjusted the top tab menu layout**
**Model Integration and Performance Improvements**
This update expands ComfyUI's model support, refines API functionality, and improves workflow stability:
**Qwen Model Ecosystem**
* **Qwen image model support**: Improved LoRA loading and model merging, supporting more complex visual workflows
* **Qwen model merge nodes**: New nodes for merging Qwen image models
* **SimpleTuner Lycoris LoRA support**: Extended Lycoris LoRA compatibility for Qwen-Image models
**API and Performance**
* **Async API nodes**: Introduced asynchronous API nodes for non-blocking workflow execution
* **Memory handling**: Improved the RepeatLatentBatch node to correctly handle multi-dimensional latents
* **WAN 2.2 Fun Control support**: Added support for WAN 2.2 fun control
**Hardware Optimizations**
* **AMD GPU improvements**: Better FP16 precision handling and performance on AMD Radeon
* **RDNA3 architecture fix**: Resolved issues using Flux models on gfx1201 GPUs
* **PyTorch support updates**: Upgraded CUDA and ROCm PyTorch versions
**Developer Experience**
* **Logging**: Feature flags are now shown only in verbose mode
* **Audio safety**: Strengthened torchaudio import safety checks
* **Kling API improvement**: Fixed the image type parameter of the Kling Image API node
**UI Refinements and Model Support**
This release improves the user experience and adds support for cutting-edge models:
**User Interface**
* **Recent items API**: New API for tracking recently used items
* **Workflow navigation**: Improved organization of commonly used elements
**Model Integration**
* **Qwen vision model support**: Initial support for Qwen image models
* **Image processing**: Enhanced Qwen model integration
**Video Generation**
* **Veo3 video generation**: New Veo3 video generation node with audio support
* **Audio/video synthesis**: Combined video and audio generation in a single node
**Performance and Stability**
* **Memory management**: Optimized conditioning VRAM usage
* **Device consistency**: Fixed device issues with conditioning data and contexts
* **ControlNet stability**: Resolved ControlNet compatibility issues
**Development and System**
* **Error handling**: Added warnings for conditioning device mismatches
* **Template updates**: Updated multiple template versions
**API Enhancements and Performance Optimizations**
This release introduces backend improvements and performance optimizations:
**ComfyAPI Framework**
* **ComfyAPI Core v0.0.2**: Core API framework update improving stability and extensibility
* **Partial execution support**: New backend support for partial workflow execution
**Video Processing**
* **WAN Camera memory optimization**: Improved memory management for WAN-based camera workflows
* **WanFirstLastFrameToVideo fix**: Resolved a critical video generation issue
**Performance and Models**
* **VAE optimization**: Optimized torch.silu in VAE operations
* **WAN VAE optimization**: Optimized WAN VAE operations
**Node Architecture**
* **V3 node architecture**: Implemented the next-generation node architecture system
* **Template updates**: Updated multiple template versions
**Memory Optimizations for Large Models**
This update optimizes memory usage in large-model workflows:
**WAN 2.2 Models**
* **Reduced memory footprint**: Reduced the memory usage of WAN 2.2 VAE operations
* **5B I2V model support**: Optimized the WAN 2.2 5B image-to-video model
**VRAM Management**
* **Windows high-end GPU support**: Increased VRAM allocation for high-end GPUs on Windows
* **Memory allocation improvements**: Better memory management for users of multiple large models
**Hardware Acceleration and Audio Processing**
This release expands hardware support and enhances audio processing:
**Audio Processing**
* **PyAV audio backend**: Replaced torchaudio.load with PyAV
* **Audio integration**: Improved audio handling in multimedia generation workflows
**Hardware Support**
* **Iluvatar CoreX support**: Added support for Iluvatar CoreX accelerators
* **Intel XPU optimization**: Improved XPU support
* **AMD ROCm enhancement**: Enabled PyTorch attention for gfx1201 on Torch 2.8
* **CUDA memory management**: Fixed CUDA malloc activation issues
**Sampling Algorithms**
* **Euler CFG++ enhancement**: Separated the denoising and noise-estimation steps
* **WAN model support**: Added comprehensive support for WAN models
**Training**
* **Training node enhancements**: Added algorithm support, gradient accumulation, and optional gradient checkpointing
* **Training flexibility**: Better memory management and performance optimization
**Nodes and Workflows**
* **Moonvalley V2V node**: Added the Moonvalley Marey V2V node
* **Negative prompt updates**: Improved negative prompt handling in the Moonvalley nodes
* **History API enhancement**: Added a map\_function parameter to the get\_history API
**API and System**
* **Frontend version tracking**: Added required\_frontend\_version to the /system\_stats API response
* **Device information**: Improved XPU device name printing
* **Template updates**: Updated multiple template versions
**Developer Experience**
* **Documentation updates**: Improved the README and model integration guides
* **Line ending fix**: Standardized line endings in workflows
* **Code cleanup**: Removed deprecated code
**Sampling and Training Improvements**
This release enhances sampling algorithms and training features:
**Sampling and Generation**
* **SA-Solver sampler**: New, reworked SA-Solver sampling algorithm
* **Experimental CFGNorm node**: Classifier-free guidance normalization node
* **Nested dual CFG support**: Added a nested-style configuration for the DualCFGGuider node
* **SamplingPercentToSigma node**: New utility node for computing sigmas
**Training**
* **Multi-dataset support**: The LoRA training node supports multiple image-caption datasets
* **Training loop optimization**: Optimized the training algorithm
* **Better error detection**: Added model-mismatch error messages for LoRA operations
**Platform and Performance**
* **Async node support**: Full support for asynchronous node functions
* **Chroma flexibility**: Removed the hard-coded patch\_size parameter in Chroma
* **LTXV VAE decoder**: Switched to an improved default padding mode
* **Safetensors memory management**: Added a workaround for mmap issues
**API and Integration**
* **Custom prompt IDs**: The API allows specifying prompt IDs
* **Kling API optimization**: Increased the polling timeout
* **History token cleanup**: Removed sensitive tokens from history items
* **Python 3.9 compatibility**: Fixed compatibility issues
**Bug Fixes and Stability**
* **MaskComposite fix**: Resolved a destination-mask bug
* **Fresca input/output**: Corrected handling in Fresca model workflows
* **Reference error fix**: Resolved a reference issue in the Gemini node implementation
* **Line ending normalization**: Automatic detection and removal of Windows line endings
**Developer Experience**
* **Warning system**: Added warnings for torch import errors
* **Template updates**: Updated multiple template versions
* **Documentation**: Improved the fast\_fp16\_accumulation documentation
**Sampling and Model Control Enhancements**
This release improves sampling algorithms and model control systems:
**New Sampling Features**
* **TCFG node**: Enhanced classifier-free guidance control
* **ER-SDE sampler**: Migrated from the VE to the VP algorithm
* **Skip Layer Guidance (SLG)**: Precise per-layer control during inference
**Developer Tools**
* **Custom node management**: New `--whitelist-custom-nodes` argument
* **Performance optimization**: Dual CFG nodes are optimized automatically when CFG is 1.0
* **GitHub Actions integration**: Automated release webhook notifications
**Image Processing**
* **New transform nodes**: Added ImageRotate and ImageFlip nodes
* **ImageColorToMask fix**: Corrected the mask values returned
* **3D model support**: Upload 3D models to custom subfolders
**Guidance and Conditioning**
* **PerpNeg guider**: Updated pre- and post-CFG handling
* **Latent conditioning fix**: Resolved conditioning issues in multi-step workflows
* **Denoising steps**: Added denoising-step support to several samplers
**Platform Stability**
* **PyTorch compatibility**: Fixed issues with PyTorch nightly builds
* **FP8 fallback**: Automatic fallback when FP8 operations hit an exception
* **Audio processing**: Removed the dependency on the deprecated torchaudio.save function
**Model Integration**
* **Moonvalley nodes**: Added native support for Moonvalley model workflows
* **Scheduler reordering**: The simple scheduler is now preferred by default
* **Template updates**: Updated multiple template versions
**Security**
* **Safe loading**: Warn when loading files unsafely
* **File validation**: Strengthened checkpoint loading safety measures
**Model Support and Workflow Reliability**
This update improves model compatibility and workflow stability:
**Expanded model documentation**: Added supporting documentation for the Flux Kontext and Omnigen 2 models
**VAE encoding improvement**: Removed unnecessary random noise injection during VAE encoding
**Memory management fix**: Resolved a critical memory estimation bug affecting Kontext model usage
**New Model Support**
* **Cosmos Predict2 support**: Text-to-image and image-to-video generation workflows
* **Improved Flux compatibility**: The Chroma Text Encoder now works with regular Flux models
* **LoRA training integration**: Added a native LoRA training node
**Performance and Hardware Optimizations**
* **AMD GPU enhancement**: Enabled FP8 operations and PyTorch attention on AMD GPUs
* **Apple Silicon fix**: Resolved FP16 attention issues on Apple devices
* **Flux model stability**: Resolved black-image generation with certain Flux models
**Sampling Improvements**
* **Rectified Flow samplers**: Added RF-capable SEEDS and multistep DPM++ SDE samplers
* **ModelSamplingContinuousEDM**: New cosmos\_rflow option
* **Memory optimization**: Improved memory estimation for Cosmos models
**Development and Integration Features**
* **SQLite database support**: Enhanced data management for custom nodes
* **PyProject.toml integration**: Automatic web folder registration from pyproject files
* **Frontend flexibility**: Support for semantic version suffixes and pre-release frontend versions
* **Tokenizer enhancement**: Configurable min\_length setting via tokenizer\_data
**Usability Improvements**
* **Kontext aspect ratio fix**: Resolved a widget constraint issue
* **SaveLora consistency**: Unified filename formats across all save nodes
* **Python version warnings**: Added alerts for outdated Python installations
* **WebcamCapture fix**: Corrected the IS\_CHANGED signature
**Workflow Tools and Performance Optimizations**
This update brings new workflow tools and performance optimizations:
**New Workflow Tools**
* **ImageStitch node**: Stitch multiple images together within a workflow
* **GetImageSize node**: Extract image dimensions with batch support
* **Regex Replace node**: Advanced text processing
**Model Compatibility**
* **Tensor handling improvements**: Simplified list handling makes multi-model workflows more reliable
* **BFL API optimization**: Refined support for Kontext models
* **Performance boost**: Fused multiply-add operations in Chroma processing
**Developer Experience**
* **Custom node support**: Added pyproject.toml support
* **Help menu integration**: New help system in the node library sidebar
* **API documentation**: Enhanced API node documentation
**Frontend and UI**
* **Frontend updated to v1.21.7**: Multiple stability fixes and performance improvements
* **Custom API base support**: Improved subpath handling for custom deployment configurations
* **Security hardening**: Fixed an XSS vulnerability
**Bug Fixes and Stability**
* **Pillow compatibility**: Updated deprecated API calls
* **ROCm support**: Improved version detection for AMD GPU users
* **Template updates**: Improved project templates for custom node development
# Quick Start
Source: https://docs.comfy.org/zh-CN/comfy-cli/getting-started
### Overview
`comfy-cli` is a [command line tool](https://github.com/Comfy-Org/comfy-cli) that makes it easier to install and manage Comfy.
### Install the CLI
```bash pip
pip install comfy-cli
```
```bash homebrew
brew tap Comfy-Org/comfy-cli
brew install comfy-org/comfy-cli/comfy-cli
```
Get shell autocompletion hints:
```bash
comfy --install-completion
```
### Install ComfyUI
Create a virtual environment with any Python version above 3.9.
```bash conda
conda create -n comfy-env python=3.11
conda activate comfy-env
```
```bash venv
python3 -m venv comfy-env
source comfy-env/bin/activate
```
Install ComfyUI:
```bash
comfy install
```
You still need to install CUDA or ROCm depending on your GPU.
### Run ComfyUI
```bash
comfy launch
```
### Manage Custom Nodes
```bash
comfy node install
```
We use `cm-cli` to install custom nodes. See the [documentation](https://github.com/ltdrdata/ComfyUI-Manager/blob/main/docs/en/cm-cli.md) for more information.
### Manage Models
Downloading models with `comfy-cli` is straightforward. The `download` command takes the model URL via the required `--url` option, plus an optional install path (here `<MODEL_URL>` is a placeholder):
```bash
comfy model download --url <MODEL_URL> --relative-path models/checkpoints
```
### Contributing
We welcome contributions to comfy-cli! If you have suggestions, ideas, or bug reports, please open an issue on our [GitHub repository](https://github.com/Comfy-Org/comfy-cli/issues). If you would like to contribute code, fork the repository and submit a pull request.
For more details, see the [development guide](https://github.com/Comfy-Org/comfy-cli/blob/main/DEV_README.md).
### Analytics
We track CLI usage to improve the user experience. You can disable this by running:
```bash
comfy tracking disable
```
To re-enable tracking, run:
```bash
comfy tracking enable
```
# Reference
Source: https://docs.comfy.org/zh-CN/comfy-cli/reference
# CLI
## Node
**Usage**:
```console
$ comfy node [OPTIONS] COMMAND [ARGS]...
```
**Options**:
* `--install-completion`: Install completion for the current shell.
* `--show-completion`: Show completion for the current shell, to copy it or customize the installation.
* `--help`: Show this message and exit.
**Commands**:
* `deps-in-workflow`
* `disable`
* `enable`
* `fix`
* `install`
* `install-deps`
* `reinstall`
* `restore-dependencies`
* `restore-snapshot`
* `save-snapshot`: Save a snapshot of the current ComfyUI environment...
* `show`
* `simple-show`
* `uninstall`
* `update`
### `deps-in-workflow`
**Usage**:
```console
$ deps-in-workflow [OPTIONS]
```
**Options**:
* `--workflow TEXT`: Workflow file (.json/.png) \[required]
* `--output TEXT`: Output file (.json/.png) \[required]
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `disable`
**Usage**:
```console
$ disable [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Disable custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `enable`
**Usage**:
```console
$ enable [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Enable custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `fix`
**Usage**:
```console
$ fix [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Fix the dependencies of the specified custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `install`
**Usage**:
```console
$ install [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Install custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `install-deps`
**Usage**:
```console
$ install-deps [OPTIONS]
```
**Options**:
* `--deps TEXT`: Dependency spec file (.json)
* `--workflow TEXT`: Workflow file (.json/.png)
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `reinstall`
**Usage**:
```console
$ reinstall [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Reinstall custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `restore-dependencies`
**Usage**:
```console
$ restore-dependencies [OPTIONS]
```
**Options**:
* `--help`: Show this message and exit.
### `restore-snapshot`
**Usage**:
```console
$ restore-snapshot [OPTIONS] PATH
```
**Arguments**:
* `PATH`: \[required]
**Options**:
* `--help`: Show this message and exit.
### `save-snapshot`
Save a snapshot of the current ComfyUI environment.
**Usage**:
```console
$ save-snapshot [OPTIONS]
```
**Options**:
* `--output TEXT`: Specify the output file path (.json/.yaml).
* `--help`: Show this message and exit.
### `show`
**Usage**:
```console
$ show [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: \[installed|enabled|not-installed|disabled|all|snapshot|snapshot-list] \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `simple-show`
**Usage**:
```console
$ simple-show [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: \[installed|enabled|not-installed|disabled|all|snapshot|snapshot-list] \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `uninstall`
**Usage**:
```console
$ uninstall [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Uninstall custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
### `update`
**Usage**:
```console
$ update [OPTIONS] ARGS...
```
**Arguments**:
* `ARGS...`: Update custom nodes \[required]
**Options**:
* `--channel TEXT`: Specify the operation mode
* `--mode TEXT`: \[remote|local|cache]
* `--help`: Show this message and exit.
## Model
**Usage**:
```console
$ comfy model [OPTIONS] COMMAND [ARGS]...
```
**Options**:
* `--install-completion`: Install completion for the current shell.
* `--show-completion`: Show completion for the current shell, to copy it or customize the installation.
* `--help`: Show this message and exit.
**Commands**:
* `download`: Download a model to the specified relative path...
* `list`: Show a list of all currently downloaded models...
* `remove`: Remove one or more downloaded models...
### `download`
Download a model to the specified relative path if it is not already downloaded.
**Usage**:
```console
$ download [OPTIONS]
```
**Options**:
* `--url TEXT`: URL to download the model from \[required]
* `--relative-path TEXT`: Relative path from the current workspace where the model is installed. \[default: models/checkpoints]
* `--help`: Show this message and exit.
### `list`
Display a list of all currently downloaded models in table format.
**Usage**:
```console
$ list [OPTIONS]
```
**Options**:
* `--relative-path TEXT`: Relative path from the current workspace where models are stored. \[default: models/checkpoints]
* `--help`: Show this message and exit.
### `remove`
Remove one or more downloaded models, either by specifying them directly or through interactive selection.
**Usage**:
```console
$ remove [OPTIONS]
```
**Options**:
* `--relative-path TEXT`: Relative path from the current workspace where models are stored. \[default: models/checkpoints]
* `--model-names TEXT`: Space-separated list of model filenames to remove.
* `--help`: Show this message and exit.
# Getting Started
Source: https://docs.comfy.org/zh-CN/comfy-cli/troubleshooting
### Prerequisites
You need Git installed on your system. You can download it [here](https://git-scm.com/downloads).
# Datatypes
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/datatypes
These are the most important built-in datatypes. You can also [define your own](./more_on_inputs#custom-datatypes).
Datatypes are used on the client side to prevent a workflow from passing the wrong kind of data into a node - somewhat like strong typing.
The JavaScript client code generally does not allow a node output to be connected to an input of a different datatype,
although there are a few exceptions, mentioned below.
## Comfy Datatypes
### COMBO
* No additional parameters in `INPUT_TYPES`
* Python datatype: defined as `list[str]`, the output value is a `str`
Represents a dropdown widget.
Unlike the other datatypes, `COMBO` is not specified by a `str` in `INPUT_TYPES`, but by a `list[str]`
corresponding to the options in the dropdown list, with the first option selected by default.
`COMBO` inputs are often generated dynamically at runtime. For instance, in the built-in `CheckpointLoaderSimple` node, you'll find
```
"ckpt_name": (folder_paths.get_filename_list("checkpoints"), )
```
or they might just be a fixed list of options,
```
"play_sound": (["no","yes"], {}),
```
### Primitives and Reroutes
Primitive and reroute nodes exist only on the client side. They have no intrinsic datatype, but take on the datatype of the input or output they are connected to when a connection is made (which is why they can't be connected to `*` inputs...)
## Python Datatypes
### INT
* Additional parameters in `INPUT_TYPES`:
* `default` is required
* `min` and `max` are optional
* Python datatype `int`
### FLOAT
* Additional parameters in `INPUT_TYPES`:
* `default` is required
* `min`, `max`, and `step` are optional
* Python datatype `float`
### STRING
* Additional parameters in `INPUT_TYPES`:
* `default` is required
* Python datatype `str`
### BOOLEAN
* Additional parameters in `INPUT_TYPES`:
* `default` is required
* Python datatype `bool`
## Tensor datatypes
### IMAGE
* No additional parameters in `INPUT_TYPES`
* Python datatype `torch.Tensor` with shape \[B,H,W,C]
A batch of `B` images, of height `H` and width `W`, with `C` channels (generally `C=3` for `RGB`).
### LATENT
* No additional parameters in `INPUT_TYPES`
* Python datatype `dict`, containing a `torch.Tensor` with shape \[B,C,H,W]
The `dict` passed in contains the key `samples`, which is a `torch.Tensor` with shape \[B,C,H,W] representing
a batch of `B` latents, with `C` channels (`C=4` for existing stable diffusion models), of height `H` and width `W`.
The height and width are 1/8 of the corresponding image dimensions (which is the value you set in the Empty Latent Image node).
Other entries in the dictionary contain things such as latent masks.
{/* TODO need to dig into this further */}
{/* TODO new SD models might have a different value of C? */}
### MASK
* No additional parameters in `INPUT_TYPES`
* Python datatype `torch.Tensor` with shape \[H,W] or \[B,C,H,W]
### AUDIO
* No additional parameters in `INPUT_TYPES`
* Python datatype `dict`, containing a `torch.Tensor` with shape \[B, C, T] and a sample rate.
The `dict` passed in contains the key `waveform`, which is a `torch.Tensor` with shape \[B, C, T] representing
a batch of `B` audio samples, with `C` channels (`C=2` for stereo, `C=1` for mono), and `T` time steps (i.e. the number of audio samples).
The `dict` also contains another key, `sample_rate`, which indicates the sample rate of the audio.
## Custom sampling datatypes
### Noise
The `NOISE` datatype represents a *source* of noise (not the noise itself). It can be represented by any Python object that provides a method to generate noise,
with the signature `generate_noise(self, input_latent:Tensor) -> Tensor`, and a property `seed:Optional[int]`.
The `seed` is passed into the `sample` guider in `SamplerCustomAdvanced`, but does not appear to be used by any standard guider.
It is optional, so you can generally set it to None.
When noise is to be added, the latent is passed into this method, which should return a `Tensor` of the same shape containing the noise.
See the [noise mixing example](./snippets#creating-noise-variations)
### Sampler
The `SAMPLER` datatype represents a sampler, which is represented by a Python object providing a `sample` method.
Stable diffusion sampling is beyond the scope of this guide; if you want to dig into this part of the code, have a look at `comfy/samplers.py`.
### Sigmas
The `SIGMAS` datatype represents the sigma values before and after each step of the sampling process, as produced by a scheduler.
It is represented as a one-dimensional tensor of length `steps+1`, where each element is the amount of noise expected before the corresponding step,
and the final value is the amount of noise after the last step.
A `normal` scheduler, for an SDXL model, with 20 steps and a denoise value of 1 produces:
```
tensor([14.6146, 10.7468, 8.0815, 6.2049, 4.8557,
3.8654, 3.1238, 2.5572, 2.1157, 1.7648,
1.4806, 1.2458, 1.0481, 0.8784, 0.7297,
0.5964, 0.4736, 0.3555, 0.2322, 0.0292, 0.0000])
```
The starting value of sigma depends on the model, which is why scheduler nodes require a `MODEL` input to produce a SIGMAS output
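As an illustration of this layout, here is a minimal sketch (plain Python lists standing in for the 1-D tensor, with the values copied from above): the difference between consecutive sigmas is the noise removed at each step.

```python
# SIGMAS holds steps+1 noise levels: element i is the level before step i,
# and the final element is the level after the last step.
sigmas = [14.6146, 10.7468, 8.0815, 6.2049, 4.8557,
          3.8654, 3.1238, 2.5572, 2.1157, 1.7648,
          1.4806, 1.2458, 1.0481, 0.8784, 0.7297,
          0.5964, 0.4736, 0.3555, 0.2322, 0.0292, 0.0000]
steps = len(sigmas) - 1  # 21 values -> 20 steps
noise_removed_per_step = [sigmas[i] - sigmas[i + 1] for i in range(steps)]
# the per-step removals telescope to the total: sigmas[0] - sigmas[-1]
total = sum(noise_removed_per_step)
print(steps, round(total, 4))  # 20 14.6146
```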
### Guider
A `GUIDER` is a generalisation of the denoising process, "guided" by a prompt or any other form of conditioning. In Comfy the guider is represented by
a callable Python object providing a `__call__(*args, **kwargs)` method, which is called by the sampler.
The `__call__` method receives (in `args[0]`) a batch of noisy latents (a tensor `[B,C,H,W]`) and returns a prediction of the noise (a tensor of the same shape).
## Model datatypes
There are a few more technical datatypes for stable diffusion models. The most important are `MODEL`, `CLIP`, `VAE` and `CONDITIONING`.
These are beyond the scope of this guide for now! {/* TODO but maybe not forever */}
## Additional parameters
Below is a list of the officially supported keys that can be used in the "additional options" part of an input definition.
You can use additional keys for your own custom widgets, but you should *not* reuse any of the keys below for other purposes.
{/* TODO -- did I really list everything? */}
| Key              | Description |
| ---------------- | ----------------------------------------------------------------------------------- |
| `default`        | The default value of the widget |
| `min`            | The minimum value for a number type (`FLOAT` or `INT`) |
| `max`            | The maximum value for a number type (`FLOAT` or `INT`) |
| `step`           | The increment/decrement step of the widget |
| `label_on`       | The label shown in the UI when the boolean is `True` (`BOOL`) |
| `label_off`      | The label shown in the UI when the boolean is `False` (`BOOL`) |
| `defaultInput`   | Use an input socket instead of the supported widget by default |
| `forceInput`     | Like `defaultInput`, and also disallow converting to a widget |
| `multiline`      | Use a multiline text box (`STRING`) |
| `placeholder`    | Placeholder text shown in the UI when empty (`STRING`) |
| `dynamicPrompts` | Causes the frontend to evaluate dynamic prompts |
| `lazy`           | Declares that this input uses [lazy evaluation](./lazy_evaluation) |
| `rawLink`        | When a link exists, you receive the link rather than the evaluated value (i.e. `["nodeId", ]`). Mainly useful when the node uses [node expansion](./expansion). |
# Node Expansion
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/expansion
## Node Expansion
Normally, when a node executes, the execution function immediately returns that node's output. "Node expansion" is a relatively advanced technique that allows a node to return a new subgraph of nodes that takes the original node's place in the graph. This technique is what allows custom nodes to implement loops.
### A simple example
First, here is a simple example of node expansion:
We strongly recommend using the `GraphBuilder` class when creating subgraphs. It isn't mandatory, but it protects you from many easy mistakes.
```python
def load_and_merge_checkpoints(self, checkpoint_path1, checkpoint_path2, ratio):
    from comfy_execution.graph_utils import GraphBuilder  # usually at the top of the file
    graph = GraphBuilder()
    checkpoint_node1 = graph.node("CheckpointLoaderSimple", checkpoint_path=checkpoint_path1)
    checkpoint_node2 = graph.node("CheckpointLoaderSimple", checkpoint_path=checkpoint_path2)
    merge_model_node = graph.node("ModelMergeSimple", model1=checkpoint_node1.out(0), model2=checkpoint_node2.out(0), ratio=ratio)
    merge_clip_node = graph.node("ClipMergeSimple", clip1=checkpoint_node1.out(1), clip2=checkpoint_node2.out(1), ratio=ratio)
    return {
        # Return the (MODEL, CLIP, VAE) outputs
        "result": (merge_model_node.out(0), merge_clip_node.out(0), checkpoint_node1.out(2)),
        "expand": graph.finalize(),
    }
```
While this node could previously have been implemented by manually calling into ComfyUI internals, using expansion means each subnode is cached separately (so if you change `model2`, you don't have to reload `model1`).
### Requirements
In order to perform node expansion, a node must return a dictionary with the following keys:
1. `result`: A tuple containing the node's outputs. This may be a mix of final values (like you would return from a normal node) and node outputs.
2. `expand`: The finalized graph to perform expansion on. See below if you are not using `GraphBuilder`.
#### Additional requirements without GraphBuilder
The format of the `expand` key is the same as the ComfyUI API format. The following requirements are handled by `GraphBuilder`, but must be handled manually if you choose not to use it:
1. Node IDs must be unique across the entire graph (including multiple executions of the same node due to the use of lists).
2. Node IDs must be deterministic and consistent across multiple executions of the graph (including partial executions due to caching).
Even if you don't want to use `GraphBuilder` to actually build the graph (e.g. because you loaded the raw JSON of the graph from a file), you can use the `GraphBuilder.alloc_prefix()` function to generate a prefix, and use `comfy.graph_utils.add_graph_prefix` to fix up an existing graph to meet these requirements.
### Efficient subgraph caching
While you can pass non-literal inputs (like torch tensors) to nodes in a subgraph, this can inhibit caching inside the subgraph. When possible, you should pass links to subgraph objects rather than the objects themselves. (You can declare a `rawLink` in the input's [additional parameters](./datatypes#additional-parameters) to do this easily.)
# 图像、潜变量和蒙版
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/images_and_masks
在处理这些数据类型时,你需要了解 `torch.Tensor` 类。\
完整文档请见[这里](https://pytorch.org/docs/stable/tensors.html),或者参考 Comfy 所需关键概念的[介绍](./tensors)。
如果你的节点有且只有一个输出且为张量,记得返回 `(image,)` 而不是 `(image)`
下方大部分概念都在[示例代码片段](./snippets)中有说明。
## 图像(Images)
IMAGE 是一个形状为 `[B,H,W,C]` 的 `torch.Tensor`,其中 `C=3`。如果你需要保存或加载图像,需要在 `PIL.Image` 格式和张量之间进行转换——请参见下方代码片段!注意,有些 `pytorch` 操作提供(或期望)`[B,C,H,W]`,即“通道优先”,这样做是为了计算效率。请务必小心区分。
### 使用 PIL.Image
如果你想加载和保存图像,你需要用到 PIL:
```python
from PIL import Image, ImageOps
```
## Masks
A MASK is a `torch.Tensor` with shape `[B,H,W]`.\
In many contexts, masks have binary values (0 or 1) indicating which pixels a particular operation should apply to.\
Sometimes values between 0 and 1 are used to indicate a degree of masking (for instance to adjust transparency, filters, or compositing).
### Masks from the Load Image node
The `LoadImage` node uses an image's alpha channel (the "A" in "RGBA") to create masks.\
The values of the alpha channel are normalized to the range \[0,1] (torch.float32) and then inverted.\
The `LoadImage` node always produces a MASK output when loading an image. Many images (such as JPEGs) don't have an alpha channel, in which case `LoadImage` creates a default mask with the shape `[1, 64, 64]`.
### Understanding mask shapes
In libraries such as `numpy` and `PIL`, a single-channel image (like a mask) is typically represented as a two-dimensional array, with shape `[H,W]`.\
This means the `C` (channel) dimension is implicit, so unlike the IMAGE type, a batch of masks usually has only three dimensions: `[B, H, W]`.\
You will occasionally encounter a mask whose `B` dimension has been implicitly squeezed, giving a tensor of shape `[H,W]`.
When using a MASK you will often have to match shapes by unsqueezing to get `[B,H,W,C]` with `C=1`.\
To unsqueeze the `C` dimension use `unsqueeze(-1)`; to unsqueeze the `B` dimension use `unsqueeze(0)`.\
If your node receives a MASK as input, it's a good idea to always check `len(mask.shape)`.
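The shape logic above can be sketched without torch (a shape-only helper; the name `mask_to_image_shape` is invented for this example and simply mirrors what `unsqueeze(0)` and `unsqueeze(-1)` do to a tensor's shape):

```python
def mask_to_image_shape(shape):
    """Return the [B,H,W,C] shape (C=1) corresponding to a mask shape.

    Accepts [H,W] (B implicitly squeezed) or [B,H,W].
    """
    if len(shape) == 2:   # [H,W] -> [1,H,W,1]: unsqueeze(0) then unsqueeze(-1)
        return (1, shape[0], shape[1], 1)
    if len(shape) == 3:   # [B,H,W] -> [B,H,W,1]: unsqueeze(-1)
        return (shape[0], shape[1], shape[2], 1)
    raise ValueError(f"unexpected mask shape: {shape}")

print(mask_to_image_shape((512, 512)))     # (1, 512, 512, 1)
print(mask_to_image_shape((4, 512, 512)))  # (4, 512, 512, 1)
```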
## Latents
A LATENT is a `dict`; the latent samples are referenced by the key `samples` and have shape `[B,C,H,W]`, with `C=4`.
LATENTs are channel first; IMAGEs are channel last
# Lazy Evaluation
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/lazy_evaluation
## Lazy Evaluation
By default, all `required` and `optional` inputs are evaluated before a node runs. However, sometimes some inputs will not be used, and evaluating them causes unnecessary processing. Here are some examples of nodes that can benefit from lazy evaluation:
1. A `ModelMergeSimple` node where the ratio is either `0.0` (in which case the first model doesn't need to be loaded)
or `1.0` (in which case the second model doesn't need to be loaded).
2. An interpolation between two images, where the ratio (or mask) is either entirely `0.0` or entirely `1.0`.
3. A Switch node where one input determines which of the other inputs is passed through.
The cost of making an input lazy is very low. If it can be implemented, it generally should be.
### Creating a lazy input
Making an input "lazy" takes two steps:
1. Mark the input as lazy in the dictionary returned by `INPUT_TYPES`
2. Define a method named `check_lazy_status` (note: *not* a class method) that will be called before evaluation to determine whether more inputs are needed.
To demonstrate these, we'll create a "MixImages" node that interpolates between two images according to a mask. If the entire mask is `0.0`, we don't need to evaluate anything upstream of the second image. If the entire mask is `1.0`, we can skip evaluating the first image.
#### Defining `INPUT_TYPES`
Declaring an input as lazy is as simple as adding a `lazy: True` key-value pair to the input's options dictionary.
```python
@classmethod
def INPUT_TYPES(cls):
    return {
        "required": {
            "image1": ("IMAGE",{"lazy": True}),
            "image2": ("IMAGE",{"lazy": True}),
            "mask": ("MASK",),
        },
    }
```
In this example, `image1` and `image2` are both marked lazy, but `mask` will always be evaluated.
#### Defining `check_lazy_status`
A `check_lazy_status` method is called whenever there are one or more lazy inputs that are not yet available. The method receives the same arguments as the standard execution function. All available inputs are passed with their final values; unavailable lazy inputs have a value of `None`.
The responsibility of the `check_lazy_status` method is to return a list of the names of any lazy inputs that are needed to continue execution. If all lazy inputs are available, the method should return an empty list.
Note that the `check_lazy_status` method may be called multiple times. For example, you might discover after evaluating one lazy input that another one needs to be evaluated as well.
Note that because the method uses actual input values, it is *not* a class method.
```python
def check_lazy_status(self, mask, image1, image2):
    mask_min = mask.min()
    mask_max = mask.max()
    needed = []
    if image1 is None and (mask_min != 1.0 or mask_max != 1.0):
        needed.append("image1")
    if image2 is None and (mask_min != 0.0 or mask_max != 0.0):
        needed.append("image2")
    return needed
```
### Full example
```python
class LazyMixImages:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image1": ("IMAGE",{"lazy": True}),
                "image2": ("IMAGE",{"lazy": True}),
                "mask": ("MASK",),
            },
        }

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "mix"
    CATEGORY = "Examples"

    def check_lazy_status(self, mask, image1, image2):
        mask_min = mask.min()
        mask_max = mask.max()
        needed = []
        if image1 is None and (mask_min != 1.0 or mask_max != 1.0):
            needed.append("image1")
        if image2 is None and (mask_min != 0.0 or mask_max != 0.0):
            needed.append("image2")
        return needed

    # Not trying to handle different batch sizes here just to keep the demo simple
    def mix(self, mask, image1, image2):
        mask_min = mask.min()
        mask_max = mask.max()
        if mask_min == 0.0 and mask_max == 0.0:
            return (image1,)
        elif mask_min == 1.0 and mask_max == 1.0:
            return (image2,)
        result = image1 * (1. - mask) + image2 * mask
        return (result,)
```
## Execution Blocking
While lazy evaluation is the recommended way to "disable" part of a graph, sometimes you want to disable an `OUTPUT` node that doesn't implement lazy evaluation. If it's a node you developed, you should add lazy evaluation as follows:
1. Add an `enabled` input, either required (if it's a new node) or optional (if you care about backward compatibility), with a default value of `True`
2. Make all other inputs lazy
3. Only evaluate the other inputs if `enabled` is `True`
If you don't control the node, you can use `comfy_execution.graph.ExecutionBlocker`. This special object can be returned from any output socket. Any node that receives an `ExecutionBlocker` as input skips execution and returns that `ExecutionBlocker` as all of its outputs.
**ExecutionBlockers are intentionally designed so that their forward propagation cannot be stopped.** If you think you need that, you should be using lazy evaluation instead.
### Usage
There are two ways to construct and use an `ExecutionBlocker`:
1. Pass `None` to the constructor to silently block execution. This is useful when blocking execution is part of a successful run - such as disabling an output.
```python
def silent_passthrough(self, passthrough, blocked):
    if blocked:
        return (ExecutionBlocker(None),)
    else:
        return (passthrough,)
```
2. Pass a string to the constructor to display an error message when a node is blocked from executing because it received that object. This is useful when you want to show a meaningful error, for example when someone uses an output that is meaningless - such as the `VAE` output when loading a model that doesn't contain a VAE.
```python
def load_checkpoint(self, ckpt_name):
    ckpt_path = folder_paths.get_full_path("checkpoints", ckpt_name)
    model, clip, vae = load_checkpoint(ckpt_path)
    if vae is None:
        # This error is more useful than a "'NoneType' has no attribute" error in a later node
        vae = ExecutionBlocker(f"No VAE contained in the loaded model {ckpt_name}")
    return (model, clip, vae)
```
# Lifecycle
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/lifecycle
## How Comfy loads custom nodes
When Comfy starts, it scans the `custom_nodes` directory for Python modules and attempts to load them.
If a module exports `NODE_CLASS_MAPPINGS`, it is treated as a custom node.
A Python module is a directory containing an `__init__.py` file.
The module exports whatever is listed in the `__all__` attribute of the `__init__.py` file.
### \_\_init\_\_.py
The `__init__.py` file is executed when Comfy attempts to import the module. For a module to be recognized as containing custom node definitions, it needs to export `NODE_CLASS_MAPPINGS`. If it does (and nothing goes wrong during the import), the nodes defined in the module will be available in Comfy. If there is an error in your code, Comfy will continue, but will report the module as having failed to load. So check the Python console!
A very simple `__init__.py` file would look like this:
```python
from .python_file import MyCustomNode
NODE_CLASS_MAPPINGS = { "My Custom Node" : MyCustomNode }
__all__ = ["NODE_CLASS_MAPPINGS"]
```
#### NODE\_CLASS\_MAPPINGS
`NODE_CLASS_MAPPINGS` must be a `dict` mapping custom node names (unique across the entire Comfy install) to the corresponding node classes.
#### NODE\_DISPLAY\_NAME\_MAPPINGS
`__init__.py` may also export `NODE_DISPLAY_NAME_MAPPINGS`, which maps the same unique names to display names for the nodes.
If `NODE_DISPLAY_NAME_MAPPINGS` is not provided, Comfy uses the unique name as the display name.
#### WEB\_DIRECTORY
If you are deploying client-side code, you also need to export the path, relative to the module, of the JavaScript files. It is conventional to place these in the `js` subdirectory of your custom node directory.
Only `.js` files will be served; you can't deploy `.css` or other file types this way
In previous versions of Comfy, `__init__.py` was required to copy the JavaScript files into the main Comfy web subdirectory. You may still come across code that does this. Don't.
# Data Lists
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/lists
## Length-one processing
Internally, the Comfy server represents the data flowing from one node to the next as a Python `list`, generally of length 1, of the relevant datatype.
In normal operation, when a node returns its output, each element of the output `tuple` is individually wrapped in a length-1 list; when the next node is called, the data is unwrapped and passed to the main function.
You generally don't need to worry about this, because Comfy does the wrapping and unwrapping automatically.
This has nothing to do with batches. A batch (of latents or images, say) is a *single entry* in the list (see [tensor datatypes](./images_and_masks))
## List processing
In some situations, a single workflow deals with multiple instances of data, and the internal data will be a list containing more than one instance -
for example, to process a series of images one at a time to avoid running out of VRAM, or to handle images of different sizes.
By default, Comfy processes the values in a list sequentially:
* If the inputs are `list`s of different lengths, the shorter ones are padded by repeating their last value
* The main method is called once for each value in the input lists
* The outputs are also `list`s, each of the same length as the longest input
The relevant code can be found in the `map_node_over_list` method in `execution.py`.
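The padding and per-value calling described above can be sketched in plain Python (an illustration of the behaviour, not Comfy's actual `map_node_over_list` code; the helper names are invented):

```python
def pad_with_last(values, length):
    """Pad a list to the given length by repeating its last value."""
    return values + [values[-1]] * (length - len(values))

def map_over_lists(func, **inputs):
    """Call func once per position, padding shorter input lists."""
    longest = max(len(v) for v in inputs.values())
    padded = {name: pad_with_last(v, longest) for name, v in inputs.items()}
    return [func(**{name: v[i] for name, v in padded.items()})
            for i in range(longest)]

# three calls: scale is padded from [10] to [10, 10, 10]
out = map_over_lists(lambda x, scale: x * scale, x=[1, 2, 3], scale=[10])
print(out)  # [10, 20, 30]
```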
However, because Comfy wraps node outputs in a length-1 `list`, if a custom node returns a `tuple` containing a `list`, that `list` is wrapped and treated as a single piece of data.
To tell Comfy that the list being returned should not be wrapped, but treated as a series of data for sequential processing, the node should provide a class attribute `OUTPUT_IS_LIST`, a `tuple[bool]` of the same length as `RETURN_TYPES`, specifying which outputs should be treated this way.
Nodes can also override the default input behaviour and receive the whole list in a single call. This is done by setting the class attribute `INPUT_IS_LIST` to `True`.
Here is an (annotated) example from the built-in nodes - `ImageRebatch` takes one or more batches of images (received as a list, because `INPUT_IS_LIST = True`) and rebatches them into batches of the requested size.
`INPUT_IS_LIST` is node level - all inputs are treated the same way. So the value of the `batch_size` widget is accessed as `batch_size[0]`.
```Python
class ImageRebatch:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": { "images": ("IMAGE",),
                              "batch_size": ("INT", {"default": 1, "min": 1, "max": 4096}) }}
    RETURN_TYPES = ("IMAGE",)
    INPUT_IS_LIST = True
    OUTPUT_IS_LIST = (True, )
    FUNCTION = "rebatch"
    CATEGORY = "image/batch"

    def rebatch(self, images, batch_size):
        batch_size = batch_size[0]  # all inputs are lists, so batch_size is a list[int]
        output_list = []
        all_images = []
        for img in images:                    # each img is a batch of images
            for i in range(img.shape[0]):     # each i is a single image
                all_images.append(img[i:i+1])
        for i in range(0, len(all_images), batch_size):  # chunk by batch_size, each chunk becoming a new batch
            output_list.append(torch.cat(all_images[i:i+batch_size], dim=0))  # will error if the image batches differ in H or W!
        return (output_list,)
```
# Publishing to the Manager
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/manager
{/*
description: "Learn how to publish your custom nodes to the ComfyUI Manager database."
*/}
{/*
## What is a custom node?
Comfy's power comes from its node-based approach, which lets you develop new workflows by combining nodes in different ways. The built-in nodes offer a wide range of functionality, but you may find you need something the core nodes don't provide.
Custom nodes are nodes developed by the community. They allow you to implement new functionality and share it with the wider community. If you're interested in developing custom nodes, you can read more about it [here](/zh-CN/custom-nodes/overview).
## ComfyUI Manager
While custom nodes can be installed manually, most people use
[ComfyUI Manager](https://github.com/ltdrdata/ComfyUI-Manager) to install them. **ComfyUI Manager** takes care of installing,
updating, and removing custom nodes and their dependencies. But it isn't part of Comfy core, so it needs to be installed manually.
### Installing ComfyUI Manager
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
```
Then restart Comfy.
See the [ComfyUI Manager installation guide](https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-file#installation) for details or special cases.
*/}
### Using ComfyUI Manager
To make your custom node available through **ComfyUI Manager**, you need to save it as a git repository (generally on `github.com`)
and then submit a Pull Request on the **ComfyUI Manager** git repository, in which you edit `custom-node-list.json` to add your node.
[Detailed instructions](https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-file#how-to-register-your-custom-node-into-comfyui-manager).
When a user installs the node, **ComfyUI Manager** will:
git clone the repository,
install the pip dependencies listed in the custom node repository under `requirements.txt` (if present),
```
pip install -r requirements.txt
```
As with all things `pip`, it's possible for your node's requirements to conflict with those of other custom nodes. Don't make your `requirements.txt` any more restrictive than it needs to be.
execute `install.py`, if it is present in the custom node repository.
`install.py` is executed from the root path of the custom node
### ComfyUI Manager files
As indicated above, **ComfyUI Manager** uses a number of files and scripts to manage the lifecycle of a custom node. These are all optional.
* `requirements.txt` - Python dependencies as described above
* `install.py`, `uninstall.py` - executed when the custom node is installed or uninstalled
Users can just delete the directory, so you can't rely on `uninstall.py` being run
* `disable.py`, `enable.py` - executed when a custom node is disabled or re-enabled
`enable.py` is only run when a disabled node is re-enabled - it should just reverse anything done in `disable.py`
Disabled custom node subdirectories have `.disabled` appended to their names, and Comfy ignores these modules
* `node_list.json` - only required if the custom node's pattern of NODE\_CLASS\_MAPPINGS is not conventional.
See the official [ComfyUI Manager guide](https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-file#custom-node-support-guide) for details.
# Hidden and Flexible Inputs
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/more_on_inputs
## Hidden inputs
In addition to the `required` and `optional` inputs, which create corresponding inputs or widgets on the client side, there are three `hidden` input options that allow a custom node to request certain information from the server.
These are accessed by returning a `hidden` field in the `INPUT_TYPES` `dict`, with the signature `dict[str,str]`, which may contain one or more of `PROMPT`, `EXTRA_PNGINFO`, or `UNIQUE_ID`.
```python
@classmethod
def INPUT_TYPES(s):
    return {
        "required": {...},
        "optional": {...},
        "hidden": {
            "unique_id": "UNIQUE_ID",
            "prompt": "PROMPT",
            "extra_pnginfo": "EXTRA_PNGINFO",
        }
    }
```
### UNIQUE\_ID
`UNIQUE_ID` is the unique identifier of the node, and matches the `id` property of the node on the client side. It is commonly used in client-server communications (see [messages](/zh-CN/development/comfyui-server/comms_messages#获取当前节点-id-node-id)).
### PROMPT
`PROMPT` is the complete prompt sent by the client to the server. See [the prompt object](/zh-CN/custom-nodes/js/javascript_objects_and_hijacking#prompt) for a full description.
### EXTRA\_PNGINFO
`EXTRA_PNGINFO` is a dictionary that will be copied into the metadata of any `.png` files saved. Custom nodes can store additional information in this dictionary for saving (or as a way to communicate with a downstream node).
Note: if Comfy was started with the `disable_metadata` option, this data will not be saved.
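As a sketch of how a node might use this (the `note` key, the node class, and the method name are all invented for this example; `extra_pnginfo` arrives via the hidden-input mapping shown above):

```python
class NoteTaggingNode:
    # INPUT_TYPES would declare {"hidden": {"extra_pnginfo": "EXTRA_PNGINFO"}}
    def main(self, image, extra_pnginfo=None):
        # record a note that will be embedded in the metadata of saved .png files
        if extra_pnginfo is not None:
            extra_pnginfo["note"] = "processed by NoteTaggingNode"
        return (image,)

# simulating the dictionary the server would pass in:
info = {}
NoteTaggingNode().main(image=None, extra_pnginfo=info)
print(info)  # {'note': 'processed by NoteTaggingNode'}
```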
### DYNPROMPT
`DYNPROMPT` is an instance of `comfy_execution.graph.DynamicPrompt`. It differs from `PROMPT` in that it may change during execution due to [node expansion](/zh-CN/custom-nodes/backend/expansion).
`DYNPROMPT` should only be used in advanced cases (such as implementing loops in custom nodes).
## Flexible inputs
### Custom datatypes
If you want to pass data between your custom nodes, it can be helpful to define a custom datatype. This is almost as simple as choosing a unique uppercase string for the datatype name, such as `CHEESE`.
You can then use `CHEESE` in your nodes' `INPUT_TYPES` and `RETURN_TYPES`, and the Comfy client will only allow a `CHEESE` output to be connected to a `CHEESE` input. `CHEESE` can be any Python object.
One thing to note is that because the Comfy client doesn't know anything about `CHEESE`, you need to (unless you define a custom widget for `CHEESE`, which is an advanced topic) force it to be an input rather than a widget. This can be done with the `forceInput` option in the input's options dictionary:
```python
@classmethod
def INPUT_TYPES(s):
    return {
        "required": { "my_cheese": ("CHEESE", {"forceInput":True}) }
    }
```
### Wildcard inputs
```python
@classmethod
def INPUT_TYPES(s):
    return {
        "required": { "anything": ("*",{})},
    }

@classmethod
def VALIDATE_INPUTS(s, input_types):
    return True
```
The frontend allows `*` to indicate that an input can be connected to any source. Since this isn't officially supported by the backend, you can skip type validation by accepting a parameter named `input_types` in a `VALIDATE_INPUTS` function. (See [VALIDATE\_INPUTS](./server_overview#validate-inputs) for more information.)
It is up to the node to deal with whatever data gets passed in.
### Dynamically created inputs
If inputs are created dynamically on the client side, they can't be defined in the Python source code. To access this data, we need an `optional` dictionary that allows Comfy to pass data under any name. Since the Comfy server only checks whether a name is present in the `optional` dict, a dict that claims to contain every key will do the trick:
```python
class ContainsAnyDict(dict):
    def __contains__(self, key):
        return True

...
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {},
            "optional": ContainsAnyDict()
        }
...
    def main_method(self, **kwargs):
        # the data from the dynamically created inputs is in the kwargs dict
```
Thanks to rgthree for this Python trick!
# Properties
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/server_overview
Properties of a custom node
### A simple example
Below is the code for an "invert image node", outlining the key concepts of custom node development.
```python
class InvertImageNode:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": { "image_in" : ("IMAGE", {}) },
        }

    RETURN_TYPES = ("IMAGE",)
    RETURN_NAMES = ("image_out",)
    CATEGORY = "examples"
    FUNCTION = "invert"

    def invert(self, image_in):
        image_out = 1 - image_in
        return (image_out,)
```
### Main properties
Every custom node is a Python class with the following key properties:
#### INPUT\_TYPES
As the name suggests, `INPUT_TYPES` defines the inputs for the node. The method returns a `dict` that **must** contain the key `required`, and **may** also contain the keys `optional` and/or `hidden`. The only difference between `required` and `optional` inputs is that `optional` inputs can be left unconnected.\
For more on `hidden` inputs, see [Hidden Inputs](./more_on_inputs#hidden-inputs).
The value of each key is another `dict`, in which the key-value pairs specify the names and types of the inputs. The type is defined by a `tuple`, of which the first element is the datatype and the second is a `dict` of additional parameters.
Here we have a single required input, named `image_in`, of type `IMAGE`, with no additional parameters.
Note that, unlike the next few properties, `INPUT_TYPES` is a `@classmethod`. This is so that the options in dropdown widgets (like the name of the checkpoint to load) can be computed by Comfy dynamically at runtime. We'll come back to this later. {/* TODO link when written */}
#### RETURN\_TYPES
A `tuple` of `str` defining the datatypes the node returns. If the node has no outputs, you must still provide `RETURN_TYPES = ()`.
If you have a single output, remember the trailing comma: `RETURN_TYPES = ("IMAGE",)`. This is required by Python to create a tuple.
#### RETURN\_NAMES
The names used to label the outputs. This is optional; if omitted, the names are just the `RETURN_TYPES` in lowercase.
#### CATEGORY
The category under which the node will be found in the ComfyUI **Add Node** menu. Submenus can be specified with a path, such as `examples/trivial`.
#### FUNCTION
The name of the Python function to be called when the node executes.
The function is called with named parameters. All `required` (and `hidden`) inputs are included; `optional` inputs are only included if they are connected, so you should provide default values for them in the function definition (or capture them with `**kwargs`).
The function returns a tuple corresponding to `RETURN_TYPES`. Even if you return nothing, you must return a tuple (`return ()`). As before, if you have a single output, remember the trailing comma: `return (image_out,)`!
### Execution-control properties
One great feature of Comfy is that it caches outputs, and only executes nodes whose results might differ from the previous run. This can significantly speed up many workflows.
Essentially, this works by identifying the nodes that produce outputs (like the Image Preview and Save Image nodes, which always get executed), then working backwards to find the nodes that supply data which might have changed since the last run.
There are two optional features of custom nodes that assist this process.
#### OUTPUT\_NODE
By default, a node is not considered an output node. Set `OUTPUT_NODE = True` to specify that it is.
#### IS\_CHANGED
By default, Comfy considers a node to have changed if any of its inputs or widgets have changed. This is generally correct, but there are cases where you may want to override it: a node that uses randomness (without a specified seed - it's usually better to provide a seed input so the user can control reproducibility and avoid unnecessary execution), loads an input that may have changed externally, or sometimes ignores some of its inputs (and therefore doesn't need to execute just because those inputs changed).
Note that IS\_CHANGED should not return a `bool`.
`IS_CHANGED` receives the same arguments as the main function (identified by `FUNCTION`) and can return any Python object. That object is compared with the one returned in the previous run, and the node is considered to have changed if `is_changed != is_changed_old` (the relevant code is in `execution.py`).
Since `True == True`, a node that returns `True` to indicate it has changed will actually be considered *not* to have changed! This could have been fixed in the Comfy code, were it not for backward compatibility with existing nodes.
If you want your node to always be considered changed (not recommended, since it prevents Comfy from optimizing the execution flow), you can `return float("NaN")`. This returns a `NaN`, which isn't equal to anything, not even another `NaN`.
A good example of actually checking for a change is in the code for the built-in LoadImage node, which loads the image and returns a hash:
```python
@classmethod
def IS_CHANGED(s, image):
    image_path = folder_paths.get_annotated_filepath(image)
    m = hashlib.sha256()
    with open(image_path, 'rb') as f:
        m.update(f.read())
    return m.digest().hex()
```
### Other properties
There are three other properties that modify Comfy's default handling of a node.
#### INPUT\_IS\_LIST, OUTPUT\_IS\_LIST
Used to control the sequential processing of data, as described [later](./lists).
### VALIDATE\_INPUTS
If the class method `VALIDATE_INPUTS` is defined, it is called before the workflow begins execution.\
`VALIDATE_INPUTS` should return `True` if the inputs are valid, or a string describing the error (which will prevent execution).
#### Validation of constants
Note that `VALIDATE_INPUTS` only receives inputs that are defined as constants in the workflow. Any inputs that come from other nodes are not available in `VALIDATE_INPUTS`.
`VALIDATE_INPUTS` only receives the inputs requested in its signature (i.e. the parameters returned by `inspect.getfullargspec(obj_class.VALIDATE_INPUTS).args`). Inputs received this way do not undergo the default validation rules. For example, in the snippet below, the frontend will use the `min` and `max` specified for the `foo` input, but the backend will not enforce them.
```python
class CustomNode:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": { "foo" : ("INT", {"min": 0, "max": 10}) },
        }

    @classmethod
    def VALIDATE_INPUTS(cls, foo):
        # YOLO, anything goes!
        return True
```
Additionally, if the function takes a `**kwargs` input, it will receive all available inputs, and all of them will skip validation, as if they had been specified explicitly.
#### Type validation
If the `VALIDATE_INPUTS` method takes an argument named `input_types`, it is passed a dictionary in which, for each input connected to the output of another node, the key is the name of the input and the value is the type of that output.
When this argument is present, all default validation of input types is skipped. Here's an example that takes advantage of the frontend allowing multiple types to be specified:
```python
class AddNumbers:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1" : ("INT,FLOAT", {"min": 0, "max": 1000}),
                "input2" : ("INT,FLOAT", {"min": 0, "max": 1000})
            },
        }

    @classmethod
    def VALIDATE_INPUTS(cls, input_types):
        # The min and max of input1 and input2 are still validated because
        # we didn't take `input1` or `input2` as arguments
        if input_types["input1"] not in ("INT", "FLOAT"):
            return "input1 must be an INT or FLOAT type"
        if input_types["input2"] not in ("INT", "FLOAT"):
            return "input2 must be an INT or FLOAT type"
        return True
```
# Annotated Examples
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/snippets
A growing collection of example code fragments...
## Images and Masks
### Load an image
Load an image into a batch of size 1 (based on the `LoadImage` source code in `nodes.py`)
```python
i = Image.open(image_path)
i = ImageOps.exif_transpose(i)
if i.mode == 'I':
    i = i.point(lambda i: i * (1 / 255))
image = i.convert("RGB")
image = np.array(image).astype(np.float32) / 255.0
image = torch.from_numpy(image)[None,]
```
### Save an image batch
Save a batch of images (based on the `SaveImage` source code in `nodes.py`)
```python
for (batch_number, image) in enumerate(images):
    i = 255. * image.cpu().numpy()
    img = Image.fromarray(np.clip(i, 0, 255).astype(np.uint8))
    filepath = # some path that takes the batch number into account
    img.save(filepath)
```
### Invert a mask
Inverting a mask is a simple process. Since masks are normalized to the range \[0,1]:
```python
mask = 1.0 - mask
```
### Convert a mask to image shape
```Python
# We need [B,H,W,C], with C = 1
if len(mask.shape)==2:                        # it's [H,W], so insert B and C as dimensions of size 1
    mask = mask[None,:,:,None]
elif len(mask.shape)==3 and mask.shape[2]==1: # it's [H,W,C]
    mask = mask[None,:,:,:]
elif len(mask.shape)==3:                      # it's [B,H,W]
    mask = mask[:,:,:,None]
```
### Use a mask as a transparency layer
When used for tasks such as inpainting or segmentation, mask values are eventually rounded to the nearest integer so that they are binary - 0 for regions to ignore and 1 for regions to process. But that doesn't happen until the mask is passed to those nodes. This flexibility lets you use a mask as a transparency layer, as you would in digital photography:
```python
# Invert the mask back into the original transparency layer
mask = 1.0 - mask
# Unsqueeze the `C` (channel) dimension
mask = mask.unsqueeze(-1)
# Concatenate ("cat") along the `C` dimension
rgba_image = torch.cat((rgb_image, mask), dim=-1)
```
## Noise
### Creating noise variations
Here's an example of creating a noise object that mixes two sources of noise. It could be used to generate slight noise variations by varying `weight2`.
```python
class Noise_MixedNoise:
    def __init__(self, noise1, noise2, weight2):
        self.noise1 = noise1
        self.noise2 = noise2
        self.weight2 = weight2

    @property
    def seed(self): return self.noise1.seed

    def generate_noise(self, input_latent:torch.Tensor) -> torch.Tensor:
        noise1 = self.noise1.generate_noise(input_latent)
        noise2 = self.noise2.generate_noise(input_latent)
        return noise1 * (1.0-self.weight2) + noise2 * (self.weight2)
```
# Working with torch.Tensor
Source: https://docs.comfy.org/zh-CN/custom-nodes/backend/tensors
## pytorch, tensors, and torch.Tensor
All the core numerical work in Comfy is done by [pytorch](https://pytorch.org/). If your custom nodes are going to get into the guts of stable diffusion, you will need to become familiar with this library, which is way beyond the scope of this introduction.
However, many custom nodes need to manipulate images, latents, and masks, all of which are represented internally as `torch.Tensor`, so you'll probably want to bookmark
the [official documentation for torch.Tensor](https://pytorch.org/docs/stable/tensors.html).
### What is a tensor?
`torch.Tensor` represents a tensor, which is the mathematical generalization of a vector or matrix to any number of dimensions. A tensor's *rank* is its number of dimensions (so a vector has rank 1, a matrix rank 2); its *shape* describes the size of each dimension.
So an RGB image (of height H and width W) might be thought of as three arrays (one for each color channel), each of size H x W, which could be represented as a tensor with shape `[H,W,3]`. In Comfy, images almost always come in batches (even if the batch contains only one image). `torch` always places the batch dimension first, so Comfy images have shape `[B,H,W,3]`, generally written `[B,H,W,C]` where C stands for Channels.
### squeeze, unsqueeze, and reshape
If a tensor has a dimension of size 1 (known as a collapsed dimension), the tensor is equivalent to one with that dimension removed (a batch containing one image is pretty much just an image). Removing such a collapsed dimension is referred to as squeezing; inserting one is called unsqueezing.
Some torch code, and some custom node authors, will return a squeezed tensor when a dimension is collapsed - such as when a batch has just one member. This is a common cause of bugs!
Representing the same data in a different shape is known as reshaping. This generally requires you to know the underlying data structure, so handle with care!
### Important notation
`torch.Tensor` supports most Python slice notation, iteration, and other common list-like operations. A tensor also has a `.shape` attribute that returns its size as a `torch.Size` (a subclass of `tuple`, and it can be treated as such).
There are a few other important bits of notation that you'll often see (several are not common in standard Python, but come up a lot when working with tensors):
* `torch.Tensor` supports the use of `None` in slice notation, meaning insert a new dimension of size 1.
* `:` is frequently used when slicing a tensor, meaning "keep the whole dimension". It's like Python's `a[start:end]` with the start and end omitted.
* `...` represents "all dimensions left unspecified". So `a[0, ...]` would extract the first item in a batch, however many dimensions there are.
* In functions that require a shape to be passed, it is generally passed as a `tuple`, in which one dimension can be given the value `-1`, meaning the size of that dimension should be calculated from the total amount of data.
```python
>>> a = torch.Tensor((1,2))
>>> a.shape
torch.Size([2])
>>> a[:,None].shape
torch.Size([2, 1])
>>> a.reshape((1,-1)).shape
torch.Size([1, 2])
```
### Elementwise operations
Many binary operations on `torch.Tensor` (including '+', '-', '\*', '/' and '==') are applied elementwise (i.e. to each element independently). The operands must be two tensors of the same shape, or a tensor and a scalar. So:
```python
>>> import torch
>>> a = torch.Tensor((1,2))
>>> b = torch.Tensor((3,2))
>>> a*b
tensor([3., 4.])
>>> a/b
tensor([0.3333, 1.0000])
>>> a==b
tensor([False, True])
>>> a==1
tensor([ True, False])
>>> c = torch.Tensor((3,2,1))
>>> a==c
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 0
```
### Truthiness of tensors
The "truthiness" of a tensor is not the same as that of a Python list.
You may be familiar with the truthiness of Python lists: a non-empty list is `True`, while `None` or `[]` are `False`. By contrast, a `torch.Tensor` (with more than one element) does not have a defined truth value. Instead, you need to use `.all()` or `.any()` to combine the elementwise truth values:
```python
>>> a = torch.Tensor((1,2))
>>> print("yes" if a else "no")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
>>> a.all()
tensor(False)
>>> a.any()
tensor(True)
```
This also means you need `if a is not None:` rather than `if a:` to check whether a tensor variable has been set.
# Help Pages
Source: https://docs.comfy.org/zh-CN/custom-nodes/help_page
How to create help documentation for your custom nodes
## Creating node documentation with Markdown
Custom nodes can use Markdown to create rich documentation that is displayed in the UI, replacing the usual node description. This can give users detailed information about a node's functionality, parameters, and usage examples.
## Setup
To add node documentation for your nodes:
1. Create a `docs` folder in your `WEB_DIRECTORY`
2. Add Markdown files named after your nodes (a node's name is the dictionary key in the `NODE_CLASS_MAPPINGS` dictionary used to register it):
* `WEB_DIRECTORY/docs/NodeName.md` - default documentation
* `WEB_DIRECTORY/docs/NodeName/en.md` - English documentation
* `WEB_DIRECTORY/docs/NodeName/zh.md` - Chinese documentation
* Add other languages as needed (e.g. `fr.md`, `de.md`, etc.)
The system automatically loads the appropriate documentation based on the user's language setting, falling back to `NodeName.md` if no localized version exists.
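The lookup order can be sketched as follows (an illustration of the fallback behaviour described above, not the frontend's actual implementation; the function name is invented):

```python
import os

def find_node_doc(web_directory, node_name, locale):
    """Return the first existing doc file: localized first, then the default."""
    candidates = [
        os.path.join(web_directory, "docs", node_name, f"{locale}.md"),
        os.path.join(web_directory, "docs", f"{node_name}.md"),
    ]
    for path in candidates:
        if os.path.exists(path):
            return path
    return None
```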
## Supported Markdown features
* Standard Markdown syntax (headings, lists, code blocks, etc.)
* Images using Markdown syntax: `![alt text](image.png)`
* HTML media elements with specific attributes:
* `<video>` and `<audio>` tags
* Allowed attributes: `controls`, `autoplay`, `loop`, `muted`, `preload`, `poster`
## Example structure
```
my-custom-node/
├── __init__.py
├── web/                  # WEB_DIRECTORY
│   ├── js/
│   │   └── my-node.js
│   └── docs/
│       ├── MyNode.md     # default documentation
│       └── MyNode/
│           ├── en.md     # English version
│           └── zh.md     # Chinese version
```
## Example Markdown file
```markdown
# My Custom Node
This node processes images using an advanced algorithm.
## Parameters
- **image**: The input image to process
- **strength**: Processing strength (0.0 - 1.0)
## Usage
![Example workflow](example.png)
```
# About Panel Badges
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_about_panel_badges
The About Panel Badges API lets extensions add custom badges to ComfyUI's About page. These badges can display information about your extension and include links to documentation, source code, or other resources.
## Basic usage
```javascript
app.registerExtension({
name: "MyExtension",
aboutPageBadges: [
{
label: "Documentation",
url: "https://example.com/docs",
icon: "pi pi-file"
},
{
label: "GitHub",
url: "https://github.com/username/repo",
icon: "pi pi-github"
}
]
});
```
## Badge configuration
Each badge requires all of the following properties:
```javascript
{
  label: string,  // Text displayed on the badge
  url: string,    // URL opened when the badge is clicked
  icon: string    // Icon class name (e.g. PrimeVue icons)
}
```
## Icon options
Badge icons use the PrimeVue icon set. Some commonly used icons:
* Documentation: `pi pi-file` or `pi pi-book`
* GitHub: `pi pi-github`
* External link: `pi pi-external-link`
* Information: `pi pi-info-circle`
* Download: `pi pi-download`
* Website: `pi pi-globe`
* Discord: `pi pi-discord`
For a complete list of available icons, see the [PrimeVue icon documentation](https://primevue.org/icons/).
## Example
```javascript
app.registerExtension({
name: "BadgeExample",
aboutPageBadges: [
{
label: "Website",
url: "https://example.com",
icon: "pi pi-home"
},
{
label: "Donate",
url: "https://example.com/donate",
icon: "pi pi-heart"
},
{
label: "Documentation",
url: "https://example.com/docs",
icon: "pi pi-book"
}
]
});
```
Badges appear in the About panel of the settings dialog, which can be opened via the gear icon in the top-right corner of the ComfyUI interface.
# Bottom Panel Tabs
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_bottom_panel_tabs
The Bottom Panel Tabs API lets extensions add custom tabs to the bottom panel of the ComfyUI interface. This is useful for adding features like logs, debugging tools, or custom panels.
## Basic usage
```javascript
app.registerExtension({
  name: "MyExtension",
  bottomPanelTabs: [
    {
      id: "customTab",
      title: "Custom Tab",
      type: "custom",
      render: (el) => {
        el.innerHTML = '<div>This is my custom tab content</div>';
      }
    }
  ]
});
```
## Tab configuration
Each tab requires an `id`, `title`, and `type`, plus a render function:
```javascript
{
  id: string,               // Unique identifier for the tab
  title: string,            // Title displayed on the tab
  type: string,             // Tab type (usually "custom")
  icon?: string,            // Icon class name (optional)
  render: (element) => void // Function that populates the tab content
}
```
The `render` function receives a DOM element into which you should insert your tab's content.
## Interactive elements
You can add interactive elements such as buttons:
```javascript
app.registerExtension({
  name: "InteractiveTabExample",
  bottomPanelTabs: [
    {
      id: "controlsTab",
      title: "Controls",
      type: "custom",
      render: (el) => {
        el.innerHTML = `
          <button id="runBtn">Run Workflow</button>
        `;
        // Add event listeners
        el.querySelector('#runBtn').addEventListener('click', () => {
          app.queuePrompt();
        });
      }
    }
  ]
});
```
## Using React components
You can mount React components in bottom panel tabs:
```javascript
// Import React dependencies in your extension
import React from "react";
import ReactDOM from "react-dom/client";

// Simple React component
function TabContent() {
  const [count, setCount] = React.useState(0);
  return (
    <div>
      <h3>React Component</h3>
      <p>Count: {count}</p>
      <button onClick={() => setCount(count + 1)}>Increment</button>
    </div>
  );
}

// Register the extension with React content
app.registerExtension({
  name: "ReactTabExample",
  bottomPanelTabs: [
    {
      id: "reactTab",
      title: "React Tab",
      type: "custom",
      render: (el) => {
        const container = document.createElement("div");
        container.id = "react-tab-container";
        el.appendChild(container);
        // Mount the React component
        ReactDOM.createRoot(container).render(<TabContent />);
      }
    }
  ]
});
```
## Standalone registration
You can also register tabs outside of `registerExtension`:
```javascript
app.extensionManager.registerBottomPanelTab({
  id: "standAloneTab",
  title: "Stand-Alone Tab",
  type: "custom",
  render: (el) => {
    el.innerHTML = '<div>This tab was registered independently</div>';
  }
});
```
# Commands and Keybindings
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_commands_keybindings
The Commands and Keybindings API lets extensions register custom commands and associate them with keyboard shortcuts. This allows users to trigger actions quickly without using the mouse.
## Basic usage
```javascript
app.registerExtension({
  name: "MyExtension",
  // Register commands
  commands: [
    {
      id: "myCommand",
      label: "My Command",
      function: () => {
        console.log("Command executed!");
      }
    }
  ],
  // Associate keybindings with commands
  keybindings: [
    {
      combo: { key: "k", ctrl: true },
      commandId: "myCommand"
    }
  ]
});
```
## Command configuration
Each command requires an `id`, `label`, and `function`:
```javascript
{
  id: string,          // Unique identifier for the command
  label: string,       // Display name for the command
  function: () => void // Function executed when the command is triggered
}
```
## Keybinding configuration
Each keybinding requires a `combo` and a `commandId`:
```javascript
{
  combo: {          // Key combination
    key: string,    // The main key (single character or special key)
    ctrl?: boolean, // Whether the Ctrl key is required (optional)
    shift?: boolean,// Whether the Shift key is required (optional)
    alt?: boolean,  // Whether the Alt key is required (optional)
    meta?: boolean  // Whether the Meta/Command key is required (optional)
  },
  commandId: string // ID of the command to trigger
}
```
### Special keys
For non-character keys, use one of these values:
* Arrow keys: `"ArrowUp"`, `"ArrowDown"`, `"ArrowLeft"`, `"ArrowRight"`
* Function keys: `"F1"` through `"F12"`
* Other special keys: `"Escape"`, `"Tab"`, `"Enter"`, `"Backspace"`, `"Delete"`, `"Home"`, `"End"`, `"PageUp"`, `"PageDown"`
## Command examples
```javascript
app.registerExtension({
  name: "CommandExamples",
  commands: [
    {
      id: "runWorkflow",
      label: "Run Workflow",
      function: () => {
        app.queuePrompt();
      }
    },
    {
      id: "clearWorkflow",
      label: "Clear Workflow",
      function: () => {
        if (confirm("Are you sure you want to clear the workflow?")) {
          app.graph.clear();
        }
      }
    },
    {
      id: "saveWorkflow",
      label: "Save Workflow",
      function: () => {
        app.graphToPrompt().then(workflow => {
          const blob = new Blob([JSON.stringify(workflow)], {type: "application/json"});
          const url = URL.createObjectURL(blob);
          const a = document.createElement("a");
          a.href = url;
          a.download = "workflow.json";
          a.click();
          URL.revokeObjectURL(url);
        });
      }
    }
  ]
});
```
## Keybinding examples
```javascript
app.registerExtension({
  name: "KeybindingExamples",
  commands: [
    /* Command definitions as above */
  ],
  keybindings: [
    // Ctrl+R to run the workflow
    {
      combo: { key: "r", ctrl: true },
      commandId: "runWorkflow"
    },
    // Ctrl+Shift+C to clear the workflow
    {
      combo: { key: "c", ctrl: true, shift: true },
      commandId: "clearWorkflow"
    },
    // Ctrl+S to save the workflow
    {
      combo: { key: "s", ctrl: true },
      commandId: "saveWorkflow"
    },
    // F5 to run the workflow (alternative)
    {
      combo: { key: "F5" },
      commandId: "runWorkflow"
    }
  ]
});
```
## Notes and limitations
* Keybindings defined in ComfyUI core cannot be overridden by extensions. Check the core keybindings in these source files:
* [Core Commands](https://github.com/Comfy-Org/ComfyUI_frontend/blob/e76e9ec61a068fd2d89797762f08ee551e6d84a0/src/composables/useCoreCommands.ts)
* [Core Menu Commands](https://github.com/Comfy-Org/ComfyUI_frontend/blob/e76e9ec61a068fd2d89797762f08ee551e6d84a0/src/constants/coreMenuCommands.ts)
* [Core Keybindings](https://github.com/Comfy-Org/ComfyUI_frontend/blob/e76e9ec61a068fd2d89797762f08ee551e6d84a0/src/constants/coreKeybindings.ts)
* [Reserved Key Combos](https://github.com/Comfy-Org/ComfyUI_frontend/blob/e76e9ec61a068fd2d89797762f08ee551e6d84a0/src/constants/reservedKeyCombos.ts)
* Some key combinations are reserved by the browser (such as Ctrl+F for search) and cannot be overridden
* If multiple extensions register the same keybinding, the behavior is undefined
# 对话框 API
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_dialog
对话框 API 提供了在桌面端和 Web 环境下都能一致工作的标准化对话框。扩展作者会发现 prompt 和 confirm 方法最为实用。
## 基本用法
### 输入对话框(Prompt Dialog)
```javascript
// 显示一个输入对话框
app.extensionManager.dialog.prompt({
title: "用户输入",
message: "请输入你的姓名:",
defaultValue: "User"
}).then(result => {
if (result !== null) {
console.log(`输入内容: ${result}`);
}
});
```
### 确认对话框(Confirm Dialog)
```javascript
// 显示一个确认对话框
app.extensionManager.dialog.confirm({
title: "确认操作",
message: "你确定要继续吗?",
type: "default"
}).then(result => {
console.log(result ? "用户已确认" : "用户已取消");
});
```
## API 参考
### Prompt
```javascript
app.extensionManager.dialog.prompt({
title: string, // 对话框标题
message: string, // 显示的消息/问题
defaultValue?: string // 输入框的初始值(可选)
}).then((result: string | null) => {
// result 是输入的文本,若取消则为 null
});
```
### Confirm
```javascript
app.extensionManager.dialog.confirm({
title: string, // 对话框标题
message: string, // 显示的消息
type?: "default" | "overwrite" | "delete" | "dirtyClose" | "reinstall", // 对话框类型(可选)
itemList?: string[], // 要显示的项目列表(可选)
hint?: string // 显示的提示文本(可选)
}).then((result: boolean | null) => {
// result 为 true 表示确认,false 表示拒绝,null 表示取消
});
```
如需了解 ComfyUI 中其他专用对话框,扩展作者可参考源码中的 `dialogService.ts` 文件。
# 带注释的示例
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_examples
不断增长的示例代码片段集合……
## 右键菜单
### 背景菜单
主背景菜单(在画布上右键)是通过调用 `LGraphCanvas.getCanvasMenuOptions` 生成的。添加自定义菜单选项的一种方式是劫持这个调用:
```Javascript
/* 在 setup() 中 */
const original_getCanvasMenuOptions = LGraphCanvas.prototype.getCanvasMenuOptions;
LGraphCanvas.prototype.getCanvasMenuOptions = function () {
// 获取基础选项
const options = original_getCanvasMenuOptions.apply(this, arguments);
options.push(null); // 插入分隔线
options.push({
content: "菜单的文本",
callback: async () => {
// 执行任意操作
}
})
return options;
}
```
### 节点菜单
当你在节点上右键时,菜单同样是通过 `node.getExtraMenuOptions` 生成的。但这次不是返回一个 options 对象,而是将其作为参数传入……
```javascript
/* 在 beforeRegisterNodeDef() 中 */
if (nodeType?.comfyClass=="MyNodeClass") {
const original_getExtraMenuOptions = nodeType.prototype.getExtraMenuOptions;
nodeType.prototype.getExtraMenuOptions = function(_, options) {
original_getExtraMenuOptions?.apply(this, arguments);
options.push({
content: "做点有趣的事",
callback: async () => {
// 有趣的操作
}
})
}
}
```
### 子菜单
如果你想要子菜单,可以提供一个回调,使用 `LiteGraph.ContextMenu` 创建它:
```javascript
function make_submenu(value, options, e, menu, node) {
const submenu = new LiteGraph.ContextMenu(
["选项 1", "选项 2", "选项 3"],
{
event: e,
callback: function (v) {
// 用 v (=="选项 x") 做点什么
},
parentMenu: menu,
node:node
}
)
}
/* ... */
options.push(
{
content: "带选项的菜单",
has_submenu: true,
callback: make_submenu,
}
)
```
## 捕获 UI 事件
这和你预期的一样——在 DOM 中找到 UI 元素并添加 eventListener。`setup()` 是做这件事的好地方,因为此时页面已完全加载。例如,检测"队列"按钮的点击:
```Javascript
function queue_button_pressed() { console.log("队列按钮被按下!") }
document.getElementById("queue-button").addEventListener("click", queue_button_pressed);
```
## 检测工作流开始
这是众多 `api` 事件之一:
```javascript
import { api } from "../../scripts/api.js";
/* 在 setup() 中 */
function on_execution_start() {
/* 执行任意操作 */
}
api.addEventListener("execution_start", on_execution_start);
```
## 检测工作流被中断
这是一个劫持 api 的简单例子:
```Javascript
import { api } from "../../scripts/api.js";
/* 在 setup() 中 */
const original_api_interrupt = api.interrupt;
api.interrupt = function () {
/* 在调用原方法前做点什么 */
original_api_interrupt.apply(this, arguments);
/* 或者在之后 */
}
```
## 捕获节点点击
`node` 有一个 `onMouseDown` 方法可以被劫持。
这次要注意把原方法的返回值传回去。
```javascript
async nodeCreated(node) {
if (node?.comfyClass === "My Node Name") {
const original_onMouseDown = node.onMouseDown;
node.onMouseDown = function( e, pos, canvas ) {
alert("哎呦!");
return original_onMouseDown?.apply(this, arguments);
}
}
}
```
# Comfy 钩子(Hooks)
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_hooks
## 扩展钩子
在 Comfy 执行的不同阶段,应用会调用 `#invokeExtensionsAsync` 或 `#invokeExtensions`,并传入钩子的名称。
这些方法会在所有已注册的扩展上调用同名方法(如果存在),比如上面例子中的 `setup`。
Comfy 提供了多种钩子,供自定义扩展代码使用,以修改客户端行为。
这些钩子会在 Comfy 客户端元素的创建和修改过程中被调用。工作流执行过程中的事件由 `apiUpdateHandlers` 处理。
下面介绍了一些最重要的钩子。由于 Comfy 仍在积极开发中,新的钩子会不时加入,因此可以在 `app.js` 中搜索 `#invokeExtensions` 以查找所有可用钩子。
另请参阅[钩子的调用顺序](#call-sequences)。
### 常用钩子
从 `beforeRegisterNodeDef` 开始,大多数扩展都会用到它,通常也是唯一需要的钩子。
#### beforeRegisterNodeDef()
对每种节点类型(即 `AddNode` 菜单中可用的节点列表)调用一次,用于修改节点的行为。
```Javascript
async beforeRegisterNodeDef(nodeType, nodeData, app)
```
传入的 `nodeType` 参数本质上是该类型所有节点的模板,因此对 `nodeType.prototype` 的修改会应用到所有该类型节点上。`nodeData` 封装了 Python 代码中定义的节点相关信息,如类别、输入和输出。`app` 是主 Comfy app 对象的引用(你应该已经导入了它!)
该方法会对每个已注册扩展、每种节点类型都调用一次,而不仅仅是扩展自己添加的节点。
常见做法是检查 `nodeType.ComfyClass`,它保存了该节点对应的 Python 类名,以判断是否需要修改该节点。通常这意味着只修改你自己添加的自定义节点,但有时也可能需要修改其他节点(或其他自定义节点也可能修改你的节点!),此时要注意兼容性。
由于其他扩展也可能会修改节点,建议尽量减少假设,并尽量隔离你的更改,友好共存。
在 `beforeRegisterNodeDef` 中非常常见的做法是"劫持"已有方法:
```Javascript
async beforeRegisterNodeDef(nodeType, nodeData, app) {
if (nodeType.comfyClass=="MyNodeClass") {
const onConnectionsChange = nodeType.prototype.onConnectionsChange;
nodeType.prototype.onConnectionsChange = function (side,slot,connect,link_info,output) {
const r = onConnectionsChange?.apply(this, arguments);
console.log("Someone changed my connection!");
return r;
}
}
}
```
这种做法是先保存原型上的原方法,然后替换为新方法。新方法会调用原方法(`?.apply` 保证即使没有原方法也不会出错),然后执行额外操作。根据你的代码逻辑,可能需要在新方法的其他位置调用 `apply`,甚至有条件地调用。
以这种方式劫持方法时,建议查看核心 comfy 代码(断点调试很有用),以确保方法签名一致。
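这种"保存原方法、替换为包装方法、再回调"的劫持模式本身与 ComfyUI 无关,可以用一个普通的 Javascript 类自测(纯示意,`FakeNode` 为假设的类名):

```javascript
class FakeNode {
  onConnectionsChange(side) { return `original:${side}`; }
}

// 先保存原型上的原方法,再替换为新方法
const original = FakeNode.prototype.onConnectionsChange;
FakeNode.prototype.onConnectionsChange = function (...args) {
  const r = original?.apply(this, args); // ?.apply 保证即使原方法不存在也不会出错
  // 在这里执行额外操作
  return r; // 把原返回值传回去
};

const node = new FakeNode();
console.log(node.onConnectionsChange(1)); // "original:1"
```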
#### nodeCreated()
```Javascript
async nodeCreated(node)
```
当某个节点实例被创建时调用(就在 `nodeType` 上的 `ComfyNode()` 构造函数结束时)。在这个钩子里你可以修改节点的具体实例。
如果是对所有实例都生效的更改,建议在上文的 `beforeRegisterNodeDef` 里加到 prototype 上。
#### init()
```Javascript
async init()
```
当 Comfy 网页被加载(或重新加载)时调用。调用时机是在图对象已创建,但还未注册或创建任何节点之前。可以用来劫持 app 或 graph(`LiteGraph` 对象)的方法,从而修改核心 Comfy 行为。详见[Comfy 对象与劫持](./javascript_objects_and_hijacking)。
能力越大,责任越大。劫持核心行为可能导致你的节点与其他自定义节点或未来 Comfy 更新不兼容。
#### setup()
```Javascript
async setup()
```
在启动流程结束时调用。适合添加事件监听器(无论是 Comfy 事件还是 DOM 事件),或添加全局菜单,相关内容在其他地方有详细介绍。
如果要在工作流加载后做事,请用 `afterConfigureGraph`,不要用 `setup`。
### 调用顺序
以下顺序是通过在 Comfy `app.js` 文件中插入日志代码获得的。你也可以用类似方法帮助理解执行流程。
```Javascript
/* 截至目前大约在第 220 行: */
#invokeExtensions(method, ...args) {
console.log(`invokeExtensions ${method}`) // 此行为新增
// ...
}
/* 截至目前大约在第 250 行: */
async #invokeExtensionsAsync(method, ...args) {
console.log(`invokeExtensionsAsync ${method}`) // 此行为新增
// ...
}
```
#### 网页加载时
```
invokeExtensionsAsync init
invokeExtensionsAsync addCustomNodeDefs
invokeExtensionsAsync getCustomWidgets
invokeExtensionsAsync beforeRegisterNodeDef [多次重复]
invokeExtensionsAsync registerCustomNodes
invokeExtensionsAsync beforeConfigureGraph
invokeExtensionsAsync nodeCreated
invokeExtensions loadedGraphNode
invokeExtensionsAsync afterConfigureGraph
invokeExtensionsAsync setup
```
#### 加载工作流
```
invokeExtensionsAsync beforeConfigureGraph
invokeExtensionsAsync beforeRegisterNodeDef [0、1 或多次]
invokeExtensionsAsync nodeCreated [多次重复]
invokeExtensions loadedGraphNode [多次重复]
invokeExtensionsAsync afterConfigureGraph
```
#### 添加新节点
```
invokeExtensionsAsync nodeCreated
```
# Comfy 对象
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_objects_and_hijacking
## LiteGraph
Comfy UI 构建于 [LiteGraph](https://github.com/jagenjo/litegraph.js) 之上。
Comfy 的许多功能都由 LiteGraph 提供,因此如果你要开发更复杂的节点,建议克隆该仓库并查阅文档,文档位于 `doc/index.html`。
## ComfyApp
`app` 对象(始终可通过 `import { app } from "../../scripts/app.js";` 获取)代表在浏览器中运行的 Comfy 应用,包含许多有用的属性和函数,部分如下所示。
不建议劫持 `app` 上的函数,因为 Comfy 正在持续开发,核心行为可能会变化。
### 属性
`app` 的重要属性包括(非完整列表):
| 属性 | 内容 |
| --------------- | --------------------------------------------------------------------- |
| `canvas` | 一个 LGraphCanvas 对象,代表当前用户界面。包含如 `node_over` 和 `selected_nodes` 等有用属性。 |
| `canvasEl` | DOM `<canvas>` 元素 |
| `graph` | 指向当前图的 LGraph 对象的引用 |
| `runningNodeId` | 执行期间,当前正在执行的节点 |
| `ui` | 可访问部分 UI 元素,如队列、菜单和对话框 |
`canvas`(用于图形元素)和 `graph`(用于逻辑连接)大概率是你最常用的。
### 函数
同样,函数也有很多。以下是一些重要的:
| 函数 | 说明 |
| ----------------- | ---------------------------- |
| graphToPrompt | 将图转换为可发送到 Python 服务器的 prompt |
| loadGraphData | 加载一个图 |
| queuePrompt | 将 prompt 提交到队列 |
| registerExtension | 你已经见过——用于添加扩展 |
## LGraph
`LGraph` 对象是 LiteGraph 框架的一部分,代表当前图的逻辑状态(节点和连线)。
如果你想操作图,LiteGraph 文档(克隆 `https://github.com/jagenjo/litegraph.js` 后在 `doc/index.html`)描述了你需要的函数。
你可以用 `graph` 获取节点和连线的详细信息,例如:
```Javascript
const ComfyNode_object_for_my_node = app.graph._nodes_by_id[my_node_id]
ComfyNode_object_for_my_node.inputs.forEach(input => {
const link_id = input.link;
if (link_id) {
const LLink_object = app.graph.links[link_id]
const id_of_upstream_node = LLink_object.origin_id
// 等等
}
});
```
## LLink
`LLink` 对象可通过 `graph.links` 访问,代表图中一条从节点 `link.origin_id` 的输出槽 `link.origin_slot` 到节点 `link.target_id` 的输入槽 `link.target_slot` 的连线。它还有一个字符串类型 `link.type` 和 `link.id`。
`LLink` 是在 `LGraphNode`(`ComfyNode` 是其子类)的 `connect` 方法中创建的。
避免自己创建 LLink 对象——请使用 LiteGraph 的相关函数。
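用一个假设的普通对象模拟 `LLink` 的形状,可以直观看到这些属性(属性名来自上文,数值纯属示例):

```javascript
// 假设的 LLink 形状:节点 4 的输出槽 0 连到节点 3 的输入槽 1
const llink = { id: 9, origin_id: 4, origin_slot: 0, target_id: 3, target_slot: 1, type: "MODEL" };
console.log(`${llink.origin_id}:${llink.origin_slot} -> ${llink.target_id}:${llink.target_slot}`); // "4:0 -> 3:1"
```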
## ComfyNode
`ComfyNode` 是 `LGraphNode` 的子类,因此 LiteGraph 文档对于通用操作也很有帮助。不过,Comfy 对 LiteGraph 的核心行为做了大量扩展,也没有用到 LiteGraph 的全部功能。
以下描述适用于普通节点。
分组节点、原始节点、注释节点和重定向节点有不同的属性。
`ComfyNode` 对象代表当前工作流中的一个节点。它有许多重要属性和大量可用或可劫持的函数,用于修改行为。
为了更全面地了解节点对象,你可以在扩展中插入如下代码,并在 `console.log` 处打断点。创建新节点时即可用调试器查看节点。
```Javascript
async nodeCreated(node) {
console.log("nodeCreated")
}
```
### 属性
| 属性 | 内容 |
| ----------------- | ---------------------------------------------------------------- |
| `bgcolor` | 节点的背景色,默认 undefined |
| `comfyClass` | 节点对应的 Python 类 |
| `flags` | 包含节点状态相关标志的字典。特别是 `flags.collapsed` 表示节点是否折叠。 |
| `graph` | 指向 LGraph 对象的引用 |
| `id` | 唯一 id |
| `input_type` | 输入类型列表(如 "STRING"、"MODEL"、"CLIP" 等)。通常与 Python 的 INPUT\_TYPES 匹配 |
| `inputs` | 输入列表(见下文) |
| `mode` | 通常为 0,静音为 2,旁路为 4。1 和 3 未被 Comfy 使用 |
| `order` | 节点的执行顺序。由 `LGraph.computeExecutionOrder()` 在提交 prompt 时设置 |
| `pos` | 节点在画布上的 \[x,y] 位置 |
| `properties` | 包含 "Node name for S\&R" 的字典,由 LiteGraph 使用 |
| `properties_info` | `properties` 中条目的类型和默认值 |
| `size` | 节点在画布上的宽高 |
| `title` | 显示标题 |
| `type` | 节点类的唯一名称(来自 Python) |
| `widgets` | 小部件列表(见下文) |
| `widgets_values` | 小部件的当前值列表 |
### 函数
函数非常多(上次统计有 85 个)。以下是部分常用函数。
大多数函数未被 Comfy 修改,仍为 LiteGraph 核心代码。
#### 输入、输出、小部件
| 函数 | 说明 |
| ---------------------- | --------------------------------------------------------- |
| Inputs / Outputs | 大多数输入方法都有对应的输出方法(把名称中的 In 换成 Out) |
| `addInput` | 创建新输入,需指定名称和类型 |
| `addInputs` | `addInput` 的数组版本 |
| `findInputSlot` | 通过输入名查找槽索引 |
| `findInputSlotByType` | 按类型查找输入。可选参数控制是否优先使用或仅使用空闲槽 |
| `removeInput` | 按槽索引移除输入 |
| `getInputNode` | 获取连接到该输入的节点。输出等价方法为 `getOutputNodes`,返回列表 |
| `getInputLink` | 获取连接到该输入的 LLink。无输出等价方法 |
| Widgets | |
| `addWidget` | 添加标准 Comfy 小部件 |
| `addCustomWidget` | 添加自定义小部件(在 `getComfyWidgets` 钩子中定义) |
| `addDOMWidget` | 添加由 DOM 元素定义的小部件 |
| `convertWidgetToInput` | 如果 `isConvertableWidget` 允许,将小部件转为输入(见 `widgetInputs.js`) |
#### 连接
| 函数 | 说明 |
| --------------------- | ---------------------------- |
| `connect` | 将本节点输出连接到其他节点输入 |
| `connectByType` | 按类型将输出连接到其他节点——连接到第一个可用的匹配槽 |
| `connectByTypeOutput` | 按类型将输入连接到其他节点输出 |
| `disconnectInput` | 移除输入(按名称或索引)上的所有连线 |
| `disconnectOutput` | 断开输出与指定节点输入的连接 |
| `onConnectionsChange` | 每个节点都会调用。`side==1` 表示是本节点的输入 |
| `onConnectInput` | 在建立连接前调用。如果返回 `false`,则拒绝连接 |
#### 显示
| 函数 | 说明 |
| ------------------ | ------------------------------------------------ |
| `setDirtyCanvas` | 指定前景(节点)和/或背景(连线和图像)需要重绘 |
| `onDrawBackground` | 用 `CanvasRenderingContext2D` 对象绘制背景。Comfy 用于渲染图像 |
| `onDrawForeground` | 用 `CanvasRenderingContext2D` 对象绘制节点 |
| `getTitle` | 要显示的标题 |
| `collapse` | 切换节点折叠状态 |
`collapse` 命名不准确;它实际上是切换折叠状态。
它接受一个布尔参数,可用于覆盖 `node.collapsable === false`。
#### 其他
| 函数 | 说明 |
| ------------ | --------------------------------------- |
| `changeMode` | 用于设置节点为旁路(`mode == 4`)或非旁路(`mode == 0`) |
## 输入与小部件
输入和小部件是向节点输入数据的两种方式。一般来说,小部件可以转为输入,但并非所有输入都能转为小部件(许多数据类型无法通过 UI 元素输入)。
`node.inputs` 是当前所有输入的列表(节点左侧的彩色圆点),包含 `.name`、`.type` 和 `.link`(指向 `app.graph.links` 中的 LLink)。
如果输入是已转换的小部件,还会在 `.widget` 中保存对该小部件(现已失效)的引用。
`node.widgets` 是所有小部件的列表,无论是否已转为输入。小部件有:
| 属性/函数 | 说明 |
| ---------- | ----------------------------- |
| `callback` | 小部件值变化时调用的函数 |
| `last_y` | 小部件在节点中的垂直位置 |
| `name` | 小部件名称(节点内唯一) |
| `options` | Python 代码中指定的选项(如默认值、最小值、最大值) |
| `type` | 小部件类型名称(见下文),小写 |
| `value` | 当前小部件值。此属性有 get/set 方法 |
### 小部件类型
`app.widgets` 是当前已注册小部件类型的字典,键为类型名的大写。
Comfy 内置小部件类型包括直观的 `BOOLEAN`、`INT`、`FLOAT`,
还有 `STRING`(分单行和多行)、
`COMBO`(下拉列表选择)、`IMAGEUPLOAD`(用于加载图片节点)。
可通过在扩展中提供 `getCustomWidgets` 方法添加自定义小部件类型。
### 关联小部件
小部件也可以关联——如内置的 `seed` 和 `control_after_generate`。
关联小部件的 `.type = 'base_widget_type:base_widget_name'`;如 `control_after_generate` 可能有类型 `int:seed`。
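按照上面的约定,可以从类型字符串中拆出基础类型和基础小部件名(示意代码):

```javascript
// 假设某个关联小部件的类型字符串
const widgetType = "int:seed";
const [baseType, baseName] = widgetType.split(":");
console.log(baseType, baseName); // "int seed"
```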
## Prompt
当你在 Comfy 中点击"Queue Prompt"按钮时,会调用 `app.graphToPrompt()` 方法,将当前图转换为可发送到服务器的 prompt。
`app.graphToPrompt` 返回一个对象(下称 `prompt`),包含 `output` 和 `workflow` 两个属性。
### output
`prompt.output` 将图中每个节点的 `node_id` 映射为一个对象,包含两个属性:
* `prompt.output[node_id].class_type`,自定义节点类的唯一名称(在 Python 代码中定义)
* `prompt.output[node_id].inputs`,包含每个输入(或小部件)的值,是一个从输入名到以下内容的映射:
* 如果是小部件,则为选中的值
* 如果有连线,则为一个数组,内容为(`upstream_node_id`, `upstream_node_output_slot`)
* 如果是已转为输入但未连接的小部件,则为 undefined
* 其他未连接的输入不会出现在 `.inputs` 中
注意,描述已连接输入的数组中的 `upstream_node_id` 是字符串而不是整数。
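下面用一个假设的 `prompt.output` 片段说明这种结构(节点 id 与数值均为虚构):

```javascript
const output = {
  "3": {
    class_type: "KSampler",
    inputs: {
      seed: 42,          // 小部件:直接记录选中的值
      model: ["4", 0]    // 已连接的输入:[upstream_node_id, upstream_node_output_slot]
    }
  }
};
const [upstreamId, slot] = output["3"].inputs.model;
console.log(typeof upstreamId); // "string"(如上所述,是字符串而不是整数)
```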
### workflow
`prompt.workflow` 包含以下属性:
* `config` - 额外配置项字典(默认空)
* `extra` - 包含工作流额外信息的字典。默认有:
* `extra.ds` - 描述当前图视图(`scale` 和 `offset`)
* `groups` - 工作流中的所有分组
* `last_link_id` - 最后添加的连线 id
* `last_node_id` - 最后添加的节点 id
* `links` - 图中所有连线的列表。每项为五个整数和一个字符串的数组:
* (`link_id`, `upstream_node_id`, `upstream_node_output_slot`, `downstream_node_id`, `downstream_node_input_slot`, `data type`)
* `nodes` - 图中所有节点的列表。每项为节点部分属性的映射,见[上文](#comfynode)
* 包含属性:`flags`、`id`、`inputs`、`mode`、`order`、`pos`、`properties`、`size`、`type`、`widgets_values`
* 另外,除非节点没有输出,还会有 `outputs` 属性,为该节点所有输出的列表,每项包含:
* `name` - 输出名称
* `type` - 输出数据类型
* `links` - 从该输出出发的所有连线的 `link_id` 列表(无连接时为空数组或 null)
* `shape` - 绘制输出时的形状(默认 3,表示圆点)
* `slot_index` - 输出的槽编号
* `version` - LiteGraph 版本号(当前为 `0.4`)
无输出的节点没有 `outputs` 属性,而不是拥有一个空列表。
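`workflow.links` 中每个条目的六个字段可以这样解读(数值为假设):

```javascript
// [link_id, upstream_node_id, upstream_node_output_slot,
//  downstream_node_id, downstream_node_input_slot, data_type]
const link = [9, 4, 0, 3, 1, "MODEL"];
const [linkId, upId, upSlot, downId, downSlot, dataType] = link;
console.log(`连线 ${linkId}: ${upId}[${upSlot}] -> ${downId}[${downSlot}] (${dataType})`);
```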
# Javascript 扩展
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_overview
## 扩展 Comfy 客户端
Comfy 可以通过扩展机制进行修改。要添加一个扩展,你需要:
* 从你的 Python 模块中导出 `WEB_DIRECTORY`,
* 将一个或多个 `.js` 文件放入该目录,
* 使用 `app.registerExtension` 注册你的扩展。
下面是这三个步骤。了解如何添加扩展后,可以查阅可用的[钩子](/zh-CN/custom-nodes/js/javascript_hooks)以让你的代码被调用,
也可以了解你可能需要的各种 [Comfy 对象](/zh-CN/custom-nodes/js/javascript_objects_and_hijacking),
或者直接跳转到一些[示例代码片段](/zh-CN/custom-nodes/js/javascript_examples)。
### 导出 `WEB_DIRECTORY`
可以通过在你的自定义节点目录下创建一个子目录(通常叫 `js`),并导出 `WEB_DIRECTORY` 来扩展 Comfy 网页客户端——所以你的 `__init__.py` 应该包含如下内容:
```python
WEB_DIRECTORY = "./js"
__all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS", "WEB_DIRECTORY"]
```
### 包含 `.js` 文件
所有 Javascript `.js` 文件会在 Comfy 网页加载时被浏览器自动加载。你无需指定扩展所在的文件。
只有 `.js` 文件会被添加到网页。其他资源(如 `.css` 文件)可以通过 `extensions/custom_node_subfolder/the_file.css` 访问,并可通过代码动态添加。
该路径不包含子文件夹名称。`WEB_DIRECTORY` 的值由服务器插入。
### 注册扩展
扩展的基本结构是导入主 Comfy `app` 对象,并调用 `app.registerExtension`,
传入一个包含唯一 `name` 和一个或多个由 Comfy 钩子调用的函数的字典。
一个完整、简单且“烦人”的扩展示例如下:
```Javascript
import { app } from "../../scripts/app.js";
app.registerExtension({
name: "a.unique.name.for.a.useless.extension",
async setup() {
alert("Setup complete!")
},
})
```
# 选择工具箱
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_selection_toolbox
Selection Toolbox API 允许扩展在画布上选择节点时添加自定义操作按钮。这为所选项目(节点、组等)提供了对上下文敏感命令的快速访问。
## 基本用法
要向选择工具箱添加命令,您的扩展需要:
1. 使用标准[命令接口](https://docs.comfy.org/custom-nodes/js/javascript_commands_keybindings)定义命令
2. 实现 `getSelectionToolboxCommands` 方法来指定工具箱中显示哪些命令
注意:每当进行新选择时,`getSelectionToolboxCommands` 方法会为选择集中的每个项目调用。
```javascript
app.registerExtension({
name: "MyExtension",
commands: [
{
id: "my-extension.duplicate-special",
label: "特殊复制",
icon: "pi pi-copy",
function: (selectedItem) => {
// 您的命令逻辑
console.log("使用特殊行为复制选定的节点");
}
}
],
getSelectionToolboxCommands: (selectedItem) => {
// 返回要在工具箱中显示的命令 ID 数组
return ["my-extension.duplicate-special"];
}
});
```
## 命令定义
选择工具箱的命令使用标准的 ComfyUI 命令接口:
```javascript
{
id: string, // 命令的唯一标识符
label: string, // 按钮工具提示的显示文本
icon?: string, // 按钮的图标类(可选)
function: (selectedItem) => void // 单击时执行的函数
}
```
`function` 接收选定的项作为参数,允许您对当前选择执行操作。
## 图标选项
选择工具箱按钮支持与其他 UI 元素相同的图标库:
* PrimeVue 图标:`pi pi-[icon-name]`(例如 `pi pi-star`)
* Material Design 图标:`mdi mdi-[icon-name]`(例如 `mdi mdi-content-copy`)
## 动态命令可见性
每次选择更改时都会调用 `getSelectionToolboxCommands` 方法,允许您根据选择的内容显示不同的命令:
```javascript
app.registerExtension({
name: "ContextualCommands",
commands: [
{
id: "my-ext.align-nodes",
label: "对齐节点",
icon: "pi pi-align-left",
function: () => {
// 对齐多个节点
}
},
{
id: "my-ext.configure-single",
label: "配置",
icon: "pi pi-cog",
function: () => {
// 配置单个节点
}
}
],
getSelectionToolboxCommands: (selectedItem) => {
const selectedItems = app.canvas.selectedItems;
const itemCount = selectedItems ? selectedItems.size : 0;
if (itemCount > 1) {
// 为多个项目显示对齐命令
return ["my-ext.align-nodes"];
} else if (itemCount === 1) {
// 为单个项目显示配置
return ["my-ext.configure-single"];
}
return [];
}
});
```
## 使用选定的项目
通过应用程序的画布对象访问有关选定项目的信息。`selectedItems` 属性是一个 Set,包含节点、组和其他画布元素:
```javascript
app.registerExtension({
name: "SelectionInfo",
commands: [
{
id: "my-ext.show-info",
label: "显示选择信息",
icon: "pi pi-info-circle",
function: () => {
const selectedItems = app.canvas.selectedItems;
if (selectedItems && selectedItems.size > 0) {
console.log(`选择了 ${selectedItems.size} 个项目`);
// 遍历选定的项目
selectedItems.forEach(item => {
if (item.type) {
console.log(`项目:${item.type}(ID: ${item.id})`);
}
});
}
}
}
],
getSelectionToolboxCommands: () => ["my-ext.show-info"]
});
```
## 完整示例
这是一个展示各种选择工具箱功能的简单示例:
```javascript
app.registerExtension({
name: "SelectionTools",
commands: [
{
id: "selection-tools.count",
label: "计数选择",
icon: "pi pi-hashtag",
function: () => {
const count = app.canvas.selectedItems?.size || 0;
app.extensionManager.toast.add({
severity: "info",
summary: "选择计数",
detail: `您选择了 ${count} 个项目`,
life: 3000
});
}
},
{
id: "selection-tools.copy-ids",
label: "复制 ID",
icon: "pi pi-copy",
function: () => {
const items = Array.from(app.canvas.selectedItems || []);
const ids = items.map(item => item.id).filter(id => id !== undefined);
if (ids.length > 0) {
navigator.clipboard.writeText(ids.join(', '));
app.extensionManager.toast.add({
severity: "success",
summary: "已复制",
detail: `已复制 ${ids.length} 个 ID 到剪贴板`,
life: 2000
});
}
}
},
{
id: "selection-tools.log-types",
label: "记录类型",
icon: "pi pi-info-circle",
function: () => {
const items = Array.from(app.canvas.selectedItems || []);
const typeCount = {};
items.forEach(item => {
const type = item.type || 'unknown';
typeCount[type] = (typeCount[type] || 0) + 1;
});
console.log("选择类型:", typeCount);
}
}
],
getSelectionToolboxCommands: (selectedItem) => {
const selectedItems = app.canvas.selectedItems;
const itemCount = selectedItems ? selectedItems.size : 0;
if (itemCount === 0) return [];
const commands = ["selection-tools.count", "selection-tools.log-types"];
// 仅在项目有 ID 时显示复制命令
const hasIds = Array.from(selectedItems).some(item => item.id !== undefined);
if (hasIds) {
commands.push("selection-tools.copy-ids");
}
return commands;
}
});
```
## 注意事项
* 必须在设置中启用选择工具箱:`Comfy.Canvas.SelectionToolbox`
* 在 `getSelectionToolboxCommands` 中引用命令之前,必须在 `commands` 数组中定义命令
* 选择更改时工具箱会自动更新
* 每当进行新选择时,`getSelectionToolboxCommands` 方法会为选择集中的每个项目调用
* 使用 `app.canvas.selectedItems`(一个 Set)来访问所有选定的项目,包括节点、组和其他画布元素
* 为了向后兼容,`app.canvas.selected_nodes` 仍然存在,但只包含节点
# 设置
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_settings
你可以为 ComfyUI 提供一个设置对象,这些设置会显示在用户打开 ComfyUI 设置面板时。
## 基本操作
### 添加一个设置项
```javascript
import { app } from "../../scripts/app.js";
app.registerExtension({
name: "My Extension",
settings: [
{
id: "example.boolean",
name: "示例布尔设置",
type: "boolean",
defaultValue: false,
},
],
});
```
`id` 必须在所有扩展中唯一,并将用于获取设置值。
如果你没有[指定分类](#categories),那么 `id` 会通过 `.` 分割来决定它在设置面板中的显示位置。
* 如果你的 `id` 不包含 `.`,它会显示在"其他"分类下,并以你的 `id` 作为分组标题。
* 如果你的 `id` 至少包含一个 `.`,最左边的部分会作为设置分类,第二部分作为分组标题,后续部分会被忽略。
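这个拆分规则可以用几行代码模拟(仅示意设置面板的归类逻辑,并非 ComfyUI 内部实现):

```javascript
// 假设的 id 拆分逻辑示意
function locate(id) {
  const parts = id.split(".");
  if (parts.length === 1) return { category: "其他", group: id };
  return { category: parts[0], group: parts[1] }; // 后续部分被忽略
}
console.log(locate("example.boolean")); // { category: 'example', group: 'boolean' }
```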
### 读取设置项
```javascript
import { app } from "../../scripts/app.js";
if (app.extensionManager.setting.get('example.boolean')) {
console.log("设置已启用。");
} else {
console.log("设置已禁用。");
}
```
### 响应设置变化
当用户在设置面板中更改设置时,`onChange()` 事件处理器会被立即调用。
每次页面加载、扩展注册时也会调用。
```javascript
{
id: "example.boolean",
name: "示例布尔设置",
type: "boolean",
defaultValue: false,
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 写入设置项
```javascript
import { app } from "../../scripts/app.js";
try {
await app.extensionManager.setting.set("example.boolean", true);
} catch (error) {
console.error(`更改设置时出错: ${error}`);
}
```
### 额外配置
设置类型基于 [PrimeVue](https://primevue.org/) 组件。
在 `attrs` 字段中添加 PrimeVue 文档中描述的属性即可为 ComfyUI 设置项配置更多参数。
例如,下面为数字输入框添加了增减按钮:
```javascript
{
id: "example.number",
name: "示例数字设置",
type: "number",
defaultValue: 0,
attrs: {
showButtons: true,
},
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
## 类型
### 布尔值(Boolean)
显示一个开关。
基于 [ToggleSwitch PrimeVue 组件](https://primevue.org/toggleswitch/)。
```javascript
{
id: "example.boolean",
name: "示例布尔设置",
type: "boolean",
defaultValue: false,
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 文本(Text)
自由文本输入。
基于 [InputText PrimeVue 组件](https://primevue.org/inputtext/)。
```javascript
{
id: "example.text",
name: "示例文本设置",
type: "text",
defaultValue: "Foo",
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 数字(Number)
用于输入数字。
如需允许小数位,请将 `maxFractionDigits` 属性设置为大于 0 的数字。
基于 [InputNumber PrimeVue 组件](https://primevue.org/inputnumber/)。
```javascript
{
id: "example.number",
name: "示例数字设置",
type: "number",
defaultValue: 42,
attrs: {
showButtons: true,
maxFractionDigits: 1,
},
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 滑块(Slider)
允许用户直接输入数字或通过滑块选择。
基于 [Slider PrimeVue 组件](https://primevue.org/slider/)。不支持区间。
```javascript
{
id: "example.slider",
name: "示例滑块设置",
type: "slider",
attrs: {
min: -10,
max: 10,
step: 0.5,
},
defaultValue: 0,
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 下拉选择(Combo)
允许用户从下拉列表中选择。
你可以用纯字符串或带 `text` 和 `value` 字段的对象提供选项。如果只提供字符串,则会同时作为显示和实际值。
通过 `editable: true` 属性允许用户输入自定义内容,通过 `filter: true` 属性允许搜索。
基于 [Select PrimeVue 组件](https://primevue.org/select/)。不支持分组。
```javascript
{
id: "example.combo",
name: "示例下拉设置",
type: "combo",
defaultValue: "first",
options: [
{ text: "我的第一个选项", value: "first" },
"我的第二个选项",
],
attrs: {
editable: true,
filter: true,
},
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
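字符串选项"同时作为显示和实际值"的规则,相当于下面这种规范化(示意代码,并非 ComfyUI 内部实现):

```javascript
const options = [
  { text: "我的第一个选项", value: "first" },
  "我的第二个选项"
];
// 把纯字符串选项补全为 { text, value } 形式
const normalized = options.map(o =>
  typeof o === "string" ? { text: o, value: o } : o
);
console.log(normalized[1]); // { text: '我的第二个选项', value: '我的第二个选项' }
```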
### 颜色(Color)
允许用户通过颜色选择器选择颜色或输入十六进制颜色值。
注意格式必须为六位十六进制,不支持三位简写。
基于 [ColorPicker PrimeVue 组件](https://primevue.org/colorpicker/)。
```javascript
{
id: "example.color",
name: "示例颜色设置",
type: "color",
defaultValue: "ff0000",
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 图片(Image)
允许用户上传图片。
设置会以 [data URL](https://developer.mozilla.org/en-US/docs/Web/URI/Schemes/data) 格式保存。
基于 [FileUpload PrimeVue 组件](https://primevue.org/fileupload/)。
```javascript
{
id: "example.image",
name: "示例图片设置",
type: "image",
onChange: (newVal, oldVal) => {
console.log(`设置从 ${oldVal} 变为 ${newVal}`);
},
}
```
### 隐藏(Hidden)
隐藏设置不会显示在设置面板,但你可以在代码中读写它们。
```javascript
{
id: "example.hidden",
name: "示例隐藏设置",
type: "hidden",
}
```
## 其他
### 分类(Categories)
你可以通过 `category` 字段单独指定设置的分类。
这样可以在不更改 `id` 的情况下调整分类和命名,不会丢失用户已设置的值。
```javascript
{
id: "example.boolean",
name: "示例布尔设置",
type: "boolean",
defaultValue: false,
category: ["分类名称", "分组标题", "设置标签"],
}
```
### 工具提示(Tooltips)
你可以通过 `tooltip` 字段添加额外的上下文帮助。这会在字段名后显示一个小的 ℹ︎ 图标,用户悬停时会显示帮助文本。
```javascript
{
id: "example.boolean",
name: "示例布尔设置",
type: "boolean",
defaultValue: false,
tooltip: "这是一些有用的提示信息",
}
```
# 侧边栏标签页
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_sidebar_tabs
侧边栏标签页 API 允许扩展为 ComfyUI 界面的侧边栏添加自定义标签页。这对于添加需要持续可见性和快速访问的功能非常有用。
## 基本用法
```javascript
app.extensionManager.registerSidebarTab({
id: "customSidebar",
icon: "pi pi-compass",
title: "自定义标签页",
tooltip: "我的自定义侧边栏标签页",
type: "custom",
render: (el) => {
el.innerHTML = '<div>这是我的自定义侧边栏内容</div>';
}
});
```
## 标签页配置
每个标签页需要以下属性:
```javascript
{
id: string, // 标签页的唯一标识符
icon: string, // 标签按钮的图标类名
title: string, // 标签页标题文本
tooltip?: string, // 悬停时的提示文本(可选)
type: string, // 标签页类型(通常为 "custom")
render: (element) => void // 用于填充标签页内容的函数
}
```
`render` 函数会接收一个 DOM 元素,你应在其中插入标签页的内容。
## 图标选项
侧边栏标签页图标可使用多种图标集:
* PrimeVue 图标:`pi pi-[icon-name]`(如 `pi pi-home`)
* Material Design 图标:`mdi mdi-[icon-name]`(如 `mdi mdi-robot`)
* Font Awesome 图标:`fa-[style] fa-[icon-name]`(如 `fa-solid fa-star`)
使用这些图标前请确保已加载相应的图标库。
## 有状态标签页示例
你可以创建带有状态的标签页:
```javascript
app.extensionManager.registerSidebarTab({
id: "statefulTab",
icon: "pi pi-list",
title: "笔记",
type: "custom",
render: (el) => {
// 创建元素
const container = document.createElement('div');
container.style.padding = '10px';
const notepad = document.createElement('textarea');
notepad.style.width = '100%';
notepad.style.height = '200px';
notepad.style.marginBottom = '10px';
// 加载已保存内容(如有)
const savedContent = localStorage.getItem('comfyui-notes');
if (savedContent) {
notepad.value = savedContent;
}
// 自动保存内容
notepad.addEventListener('input', () => {
localStorage.setItem('comfyui-notes', notepad.value);
});
// 组装 UI
container.appendChild(notepad);
el.appendChild(container);
}
});
```
## 使用 React 组件
你可以在侧边栏标签页中挂载 React 组件:
```javascript
// 在你的扩展中引入 React 依赖
import React from "react";
import ReactDOM from "react-dom/client";
// 注册带有 React 内容的侧边栏标签页
app.extensionManager.registerSidebarTab({
id: "reactSidebar",
icon: "mdi mdi-react",
title: "React 标签页",
type: "custom",
render: (el) => {
const container = document.createElement("div");
container.id = "react-sidebar-container";
el.appendChild(container);
// 定义一个简单的 React 组件
function SidebarContent() {
const [count, setCount] = React.useState(0);
return (
<div>
<h3>React 侧边栏</h3>
<p>计数:{count}</p>
<button onClick={() => setCount(count + 1)}>递增</button>
</div>
);
}
// 挂载 React 组件
ReactDOM.createRoot(container).render(<SidebarContent />);
}
});
```
如需查看将 React 应用集成为侧边栏标签页的真实案例,请参考 [ComfyUI-Copilot 项目(GitHub)](https://github.com/AIDC-AI/ComfyUI-Copilot)。
## 动态内容更新
你可以根据图变化动态更新侧边栏内容:
```javascript
app.extensionManager.registerSidebarTab({
id: "dynamicSidebar",
icon: "pi pi-chart-line",
title: "统计信息",
type: "custom",
render: (el) => {
const container = document.createElement('div');
container.style.padding = '10px';
el.appendChild(container);
// 更新统计信息的函数
function updateStats() {
const stats = {
nodes: app.graph._nodes.length,
connections: Object.keys(app.graph.links).length
};
container.innerHTML = `
<h3>工作流统计</h3>
<p>节点数:${stats.nodes}</p>
<p>连接数:${stats.connections}</p>
`;
}
// 初始更新
updateStats();
// 监听图变化
const api = app.api;
api.addEventListener("graphChanged", updateStats);
// 标签页销毁时清理监听器
return () => {
api.removeEventListener("graphChanged", updateStats);
};
}
});
```
# Toast API
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_toast
Toast API 提供了一种向用户显示非阻塞通知消息的方式。这对于在不打断工作流的情况下提供反馈非常有用。
## 基本用法
### 简单 Toast
```javascript
// 显示一个简单的信息提示
app.extensionManager.toast.add({
severity: "info",
summary: "信息",
detail: "操作已成功完成",
life: 3000
});
```
### Toast 类型
```javascript
// 成功提示
app.extensionManager.toast.add({
severity: "success",
summary: "成功",
detail: "数据保存成功",
life: 3000
});
// 警告提示
app.extensionManager.toast.add({
severity: "warn",
summary: "警告",
detail: "此操作可能导致问题",
life: 5000
});
// 错误提示
app.extensionManager.toast.add({
severity: "error",
summary: "错误",
detail: "请求处理失败",
life: 5000
});
```
### Alert 辅助方法
```javascript
// 快捷方式创建警告提示
app.extensionManager.toast.addAlert("这是一条重要消息");
```
## API 参考
### Toast 消息
```javascript
app.extensionManager.toast.add({
severity?: "success" | "info" | "warn" | "error" | "secondary" | "contrast", // 消息严重级别(默认:"info")
summary?: string, // Toast 的简短标题
detail?: any, // 详细消息内容
closable?: boolean, // 用户是否可以关闭该提示(默认:true)
life?: number, // 自动关闭前的持续时间(毫秒)
group?: string, // 用于管理相关 Toast 的分组标识
styleClass?: any, // 消息的样式类
contentStyleClass?: any // 内容的样式类
});
```
### Alert 辅助方法
```javascript
app.extensionManager.toast.addAlert(message: string);
```
### 其他方法
```javascript
// 移除指定的 toast
app.extensionManager.toast.remove(toastMessage);
// 移除所有 toast
app.extensionManager.toast.removeAll();
```
# 顶部菜单栏
Source: https://docs.comfy.org/zh-CN/custom-nodes/js/javascript_topbar_menu
顶部菜单栏 API 允许扩展为 ComfyUI 的顶部菜单栏添加自定义菜单项。这对于提供高级功能或不常用命令的访问非常有用。
## 基本用法
```javascript
app.registerExtension({
name: "MyExtension",
// 定义命令
commands: [
{
id: "myCommand",
label: "我的命令",
function: () => { alert("命令已执行!"); }
}
],
// 将命令添加到菜单
menuCommands: [
{
path: ["扩展", "我的扩展"],
commands: ["myCommand"]
}
]
});
```
命令定义方式与 [命令与快捷键绑定 API](./javascript_commands_keybindings) 相同。详细定义命令请参见该页面。
## 命令配置
每个命令都需要 `id`、`label` 和 `function`:
```javascript
{
id: string, // 命令的唯一标识符
label: string, // 命令显示名称
function: () => void // 命令被触发时执行的函数
}
```
## 菜单配置
`menuCommands` 数组定义了命令在菜单结构中的位置:
```javascript
{
path: string[], // 表示菜单层级的数组
commands: string[] // 要添加到该位置的命令 ID 数组
}
```
`path` 数组指定菜单层级。例如,`["文件", "导出"]` 会将命令添加到"文件"菜单下的"导出"子菜单。
## 菜单示例
### 添加到已有菜单
```javascript
app.registerExtension({
name: "MenuExamples",
commands: [
{
id: "saveAsImage",
label: "另存为图片",
function: () => {
// 保存画布为图片的代码
}
},
{
id: "exportWorkflow",
label: "导出工作流",
function: () => {
// 导出工作流的代码
}
}
],
menuCommands: [
// 添加到文件菜单
{
path: ["文件"],
commands: ["saveAsImage", "exportWorkflow"]
}
]
});
```
### 创建子菜单结构
```javascript
app.registerExtension({
name: "SubmenuExample",
commands: [
{
id: "option1",
label: "选项 1",
function: () => { console.log("选项 1"); }
},
{
id: "option2",
label: "选项 2",
function: () => { console.log("选项 2"); }
},
{
id: "suboption1",
label: "子选项 1",
function: () => { console.log("子选项 1"); }
}
],
menuCommands: [
// 创建嵌套菜单结构
{
path: ["扩展", "我的工具"],
commands: ["option1", "option2"]
},
{
path: ["扩展", "我的工具", "高级"],
commands: ["suboption1"]
}
]
});
```
### 多个菜单位置
你可以将同一个命令添加到多个菜单位置:
```javascript
app.registerExtension({
name: "MultiLocationExample",
commands: [
{
id: "helpCommand",
label: "获取帮助",
function: () => { window.open("https://docs.example.com", "_blank"); }
}
],
menuCommands: [
// 添加到帮助菜单
{
path: ["帮助"],
commands: ["helpCommand"]
},
// 也添加到扩展菜单
{
path: ["扩展"],
commands: ["helpCommand"]
}
]
});
```
命令可以与其他 ComfyUI API(如设置)配合使用。关于设置 API 的更多信息,请参见 [设置 API](./javascript_settings) 文档。
# 概述
Source: https://docs.comfy.org/zh-CN/custom-nodes/overview
自定义节点允许你实现新功能并与更广泛的社区分享。
自定义节点就像任何 Comfy 节点一样:它接收输入,对其进行处理,然后产生输出。
虽然有些自定义节点执行非常复杂的任务,但许多节点只做一件事。下面是一个简单节点的例子,它接收一张图片并进行反色处理。
## 客户端-服务器模型
Comfy 运行在客户端-服务器模型下。服务器端由 Python 编写,负责所有实际工作:数据处理、模型、图像扩散等。客户端由 Javascript 编写,负责用户界面。
Comfy 也可以以 API 模式使用,在该模式下,工作流由非 Comfy 客户端(如其他 UI 或命令行脚本)发送到服务器。
自定义节点可以分为以下四类:
### 仅服务器端
大多数自定义节点仅在服务器端运行,通过定义一个 Python 类来指定输入和输出类型,并提供一个可调用的函数来处理输入并生成输出。
### 仅客户端
少数自定义节点仅对客户端 UI 进行修改,但不添加核心功能。尽管名字如此,它们甚至可能不会向系统添加新节点。
### 客户端与服务器端独立
自定义节点可以同时提供额外的服务器功能和额外(相关的)UI 功能(例如用于新数据类型的新小部件)。在大多数情况下,客户端和服务器之间的通信可以通过 Comfy 的数据流控制来处理。
### 客户端与服务器端联动
在少数情况下,UI 功能和服务器需要直接相互通信。
任何需要客户端-服务器通信的节点都无法通过 API 使用。
# Tips
Source: https://docs.comfy.org/zh-CN/custom-nodes/tips
### Recommended Development Lifecycle
# 快速入门
Source: https://docs.comfy.org/zh-CN/custom-nodes/walkthrough
本页将带你一步步完成自定义节点的创建过程。
我们的示例将接收一批图片,并返回其中一张图片。最初,这个节点会返回平均颜色最亮的图片;随后我们会扩展它,支持多种选择标准,最后还会添加一些客户端代码。
本页假设你对 Python 或 Javascript 的了解很少。
完成本教程后,可以深入了解 [后端代码](./backend/server_overview) 和 [前端代码](./js/javascript_overview)。
## 编写基础节点
### 前置条件
* 一个可用的 ComfyUI [安装环境](/zh-CN/installation/manual_install)。开发建议手动安装 ComfyUI。
* 一个可用的 comfy-cli [安装环境](/zh-CN/comfy-cli/getting-started)。
### 环境搭建
```bash
cd ComfyUI/custom_nodes
comfy node scaffold
```
回答几个问题后,你会得到一个新的目录。
```bash
~ % comfy node scaffold
You've downloaded .cookiecutters/cookiecutter-comfy-extension before. Is it okay to delete and re-download it? [y/n] (y): y
[1/9] full_name (): Comfy
[2/9] email (you@gmail.com): me@comfy.org
[3/9] github_username (your_github_username): comfy
[4/9] project_name (My Custom Nodepack): FirstComfyNode
[5/9] project_slug (firstcomfynode):
[6/9] project_short_description (A collection of custom nodes for ComfyUI):
[7/9] version (0.0.1):
[8/9] Select open_source_license
1 - GNU General Public License v3
2 - MIT license
3 - BSD license
4 - ISC license
5 - Apache Software License 2.0
6 - Not open source
Choose from [1/2/3/4/5/6] (1): 1
[9/9] include_web_directory_for_custom_javascript [y/n] (n): y
Initialized empty Git repository in firstcomfynode/.git/
✓ Custom node project created successfully!
```
### Defining the Node
Add the following code to the end of `src/nodes.py`:
```Python src/nodes.py
class ImageSelector:
CATEGORY = "example"
@classmethod
def INPUT_TYPES(s):
return { "required": { "images": ("IMAGE",), } }
RETURN_TYPES = ("IMAGE",)
FUNCTION = "choose_image"
```
The basic structure of a custom node is described in detail [here](/zh-CN/custom-nodes/backend/server_overview).
A custom node is defined by a Python class, which must include these four attributes: `CATEGORY`, which specifies where in the add-node menu the node will appear; `INPUT_TYPES`, a class method defining the node's inputs (see [later](/zh-CN/custom-nodes/backend/server_overview#input-types) for details); `RETURN_TYPES`, which defines the node's outputs; and `FUNCTION`, the name of the function called when the node executes.
Note that the data type for both input and output is `IMAGE` (singular), even though we expect to receive a batch of images and return just one. In Comfy, `IMAGE` means an image batch, and a single image is treated as a batch of size 1.
### The Main Function
The main function, `choose_image`, receives named arguments as defined in `INPUT_TYPES`, and returns a `tuple` matching `RETURN_TYPES`. Since we're dealing with images, which are internally stored as `torch.Tensor`,
```Python
import torch
```
Then add the function to your class. The data type of an image is a `torch.Tensor` with shape `[B,H,W,C]`, where `B` is the batch size and `C` is the number of channels (3 for RGB). Iterating over such a tensor yields `B` tensors of shape `[H,W,C]`. The `.flatten()` method turns a tensor into one dimension with length `H*W*C`, `torch.mean()` takes the mean, and `.item()` turns a single-value tensor into a Python float.
```Python
def choose_image(self, images):
brightness = list(torch.mean(image.flatten()).item() for image in images)
brightest = brightness.index(max(brightness))
result = images[brightest].unsqueeze(0)
return (result,)
```
Notes on those last two lines:
* `images[brightest]` returns a tensor of shape `[H,W,C]`. `unsqueeze` inserts a dimension of length 1 at dimension 0, giving `[B,H,W,C]` with `B=1`: a single image.
* In `return (result,)`, the trailing comma is essential to ensure a tuple is returned.
### Register the Node
For Comfy to recognize the new node, it must be available at the package level. Modify the `NODE_CLASS_MAPPINGS` variable at the end of `src/nodes.py`. You'll need to restart ComfyUI to see the change.
```Python src/nodes.py
NODE_CLASS_MAPPINGS = {
"Example" : Example,
"Image Selector" : ImageSelector,
}
# Optional: you can rename nodes in the `NODE_DISPLAY_NAME_MAPPINGS` dictionary.
NODE_DISPLAY_NAME_MAPPINGS = {
"Example": "Example Node",
"Image Selector": "Image Selector",
}
```
For a detailed explanation of how ComfyUI discovers and loads custom nodes, see the [node lifecycle documentation](/zh-CN/custom-nodes/backend/lifecycle).
## Adding Options
That node is maybe a bit boring, so let's add some options: a widget that lets you choose the brightest, reddest, greenest, or bluest image. Change your `INPUT_TYPES` to:
```Python
@classmethod
def INPUT_TYPES(s):
return { "required": { "images": ("IMAGE",),
"mode": (["brightest", "reddest", "greenest", "bluest"],)} }
```
Then update the main function. We'll use a very naive definition of "reddest": the average R value of the pixels divided by the average over all three channels. So:
```Python
def choose_image(self, images, mode):
batch_size = images.shape[0]
brightness = list(torch.mean(image.flatten()).item() for image in images)
if (mode=="brightest"):
scores = brightness
else:
channel = 0 if mode=="reddest" else (1 if mode=="greenest" else 2)
absolute = list(torch.mean(image[:,:,channel].flatten()).item() for image in images)
scores = list( absolute[i]/(brightness[i]+1e-8) for i in range(batch_size) )
best = scores.index(max(scores))
result = images[best].unsqueeze(0)
return (result,)
```
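The scoring logic above can be sanity-checked without torch; a minimal sketch using plain Python lists in place of per-image channel means (the numbers here are made up for illustration):

```python
def pick_best(brightness, channel_means):
    """Score = per-image channel mean divided by overall brightness,
    mirroring the scores computed in choose_image."""
    scores = [c / (b + 1e-8) for b, c in zip(brightness, channel_means)]
    return scores.index(max(scores))

# Three images: the second is dimmer overall but relatively the reddest.
brightness = [0.6, 0.3, 0.5]   # mean over all channels
red_means = [0.6, 0.5, 0.4]    # mean of the red channel only
print(pick_best(brightness, red_means))  # → 1
```

Note how the relative score (channel mean divided by brightness) picks the second image even though its absolute red value is not the highest.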
## Tweaking the UI
Maybe we'd like some visual feedback, so let's send a text message to be displayed.
### Sending a Message from the Server
This just takes two lines of Python:
```Python
from server import PromptServer
```
At the end of the `choose_image` method, add a line to send a message to the frontend (`send_sync` takes a unique message type and a dictionary):
```Python
PromptServer.instance.send_sync("example.imageselector.textmessage", {"message":f"Picked image {best+1}"})
return (result,)
```
### Writing a Client Extension
To add Javascript to the client, create a `web/js` subdirectory in your custom node directory and export `WEB_DIRECTORY` at the end of `__init__.py`:
```Python
WEB_DIRECTORY = "./web/js"
__all__ = ['NODE_CLASS_MAPPINGS', 'WEB_DIRECTORY']
```
Client extensions are saved as `.js` files in the `web/js` subdirectory, so create `image_selector/web/js/imageSelector.js` with the content below. (See [client-side development](./js/javascript_overview) for more.)
```Javascript
app.registerExtension({
name: "example.imageselector",
async setup() {
function messageHandler(event) { alert(event.detail.message); }
app.api.addEventListener("example.imageselector.textmessage", messageHandler);
},
})
```
All we've done is register an extension whose `setup()` method adds a listener for the message type we are sending. It reads the dictionary we sent, which is stored in `event.detail`.
Stop the Comfy server, restart it, reload the web page, and run your workflow.
### The Complete Example
The complete example is available [here](https://gist.github.com/robinjhuang/fbf54b7715091c7b478724fc4dffbd03). You can download the example workflow [JSON file](https://github.com/Comfy-Org/docs/blob/main/public/workflow.json) or view it below:
# Workflow Templates
Source: https://docs.comfy.org/zh-CN/custom-nodes/workflow_templates
If your custom node ships with example workflow files, ComfyUI can present them to users in the template browser (`Workflow`/`Browse Templates` menu). Workflow templates are a great way to help users get started with your nodes quickly.
As a node developer, all you need to do is create an `example_workflows` folder and place `json` files in it. Optionally, you can also add `jpg` files with the same names to serve as template thumbnails.
Under the hood, ComfyUI serves these files statically and exposes an endpoint (`/api/workflow_templates`) that returns the collection of workflow templates.
The following folder names are currently also accepted, but we still recommend `example_workflows`:
* workflow
* workflows
* example
* examples
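The endpoint mentioned above can be queried directly; a minimal sketch using only the standard library, assuming a locally running server on the default port:

```python
import json
from urllib import request

def workflow_templates_url(base_url: str) -> str:
    """Build the URL of the workflow template collection endpoint."""
    return f"{base_url.rstrip('/')}/api/workflow_templates"

def fetch_workflow_templates(base_url: str = "http://127.0.0.1:8188") -> dict:
    """Return the mapping of custom node modules to their template workflows.
    Requires a running ComfyUI server."""
    with request.urlopen(workflow_templates_url(base_url)) as resp:
        return json.loads(resp.read())

# fetch_workflow_templates()  # e.g. {"ComfyUI-MyCustomNodeModule": [...], ...}
```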
## Example
In `ComfyUI-MyCustomNodeModule/example_workflows/`:
* `My_example_workflow_1.json`
* `My_example_workflow_1.jpg`
* `My_example_workflow_2.json`
In this example, ComfyUI's template browser shows a category named `ComfyUI-MyCustomNodeModule` containing two items, one of which has a thumbnail.
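The layout above can be checked with a small script; a sketch that pairs each workflow `json` with a same-named `jpg` thumbnail if one is present:

```python
from pathlib import Path
import tempfile

def template_entries(folder: Path) -> list:
    """Return (workflow name, has_thumbnail) for each json file in the folder."""
    entries = []
    for wf in sorted(folder.glob("*.json")):
        entries.append((wf.stem, wf.with_suffix(".jpg").exists()))
    return entries

# Recreate the folder layout from the example above.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "example_workflows"
    folder.mkdir()
    for name in ["My_example_workflow_1.json", "My_example_workflow_1.jpg",
                 "My_example_workflow_2.json"]:
        (folder / name).touch()
    print(template_entries(folder))
    # → [('My_example_workflow_1', True), ('My_example_workflow_2', False)]
```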
# Using ComfyUI API Nodes via API Key Integration
Source: https://docs.comfy.org/zh-CN/development/comfyui-server/api-key-integration
This article explains how to use ComfyUI's API nodes via API key integration.
Starting with [PR #8041](https://github.com/comfyanonymous/ComfyUI/pull/8041), ComfyUI supports using its built-in API nodes directly by creating an API key, without requiring a specific frontend (or even without using a frontend at all).
This means you can create workflows that combine:
* Local open-source models
* Tools from the custom node community
* Popular paid models
and run everything together through the local Comfy webserver API, letting it handle all the orchestration.
This is helpful for anyone using Comfy as a backend service, running Comfy from the command line, bringing their own frontend, and so on.
## Prerequisites
Using an API key to call ComfyUI's built-in API nodes requires:
* A ComfyUI version >= [PR #8041](https://github.com/comfyanonymous/ComfyUI/pull/8041)
* An API key for the corresponding account
* Sufficient account credits
To call ComfyUI's built-in API nodes with an API key, first register an account on the [ComfyUI Platform](https://platform.comfy.org/login) and then create an API key.
See the user interface section for how to log in with an API key.
Make sure your ComfyUI account has enough credits to test the corresponding features.
See the credits section for how to purchase credits for your account.
## Python Example
The following example shows how to send a workflow containing API nodes to the ComfyUI API from Python code:
```python
"""Using API nodes when running ComfyUI headless or with an alternative frontend.
You can execute ComfyUI workflows that contain API nodes by including an API key in the prompt.
The API key goes in the `extra_data` field of the payload.
Below we show an example of how to do this.
For more information, see:
- API nodes overview: https://docs.comfy.org/zh-CN/tutorials/api-nodes/overview
- To generate an API key, log in here: https://platform.comfy.org/login
"""
import json
from urllib import request
SERVER_URL = "http://127.0.0.1:8188"
# We have a prompt/job (a workflow in API format) that contains API nodes.
workflow_with_api_nodes = """{
"11": {
"inputs": {
"prompt": "A dreamy, surreal half-body portrait of a young woman meditating. She has a short, straight bob haircut dyed in pastel pink, with soft bangs covering her forehead. Her eyes are gently closed, and her hands are raised in a calm, open-palmed meditative pose, fingers slightly curved, as if levitating or in deep concentration. She wears a colorful dress made of patchwork-like pastel tiles, featuring clouds, stars, and rainbows. Around her float translucent, iridescent soap bubbles reflecting the rainbow hues. The background is a fantastical sky filled with cotton-candy clouds and vivid rainbow waves, giving the entire scene a magical, dreamlike atmosphere. Emphasis on youthful serenity, whimsical ambiance, and vibrant soft lighting.",
"prompt_upsampling": false,
"seed": 589991183902375,
"aspect_ratio": "1:1",
"raw": false,
"image_prompt_strength": 0.4000000000000001,
"image_prompt": [
"14",
0
]
},
"class_type": "FluxProUltraImageNode",
"_meta": {
"title": "Flux 1.1 [pro] Ultra Image"
}
},
"12": {
"inputs": {
"filename_prefix": "ComfyUI",
"images": [
"11",
0
]
},
"class_type": "SaveImage",
"_meta": {
"title": "Save Image"
}
},
"14": {
"inputs": {
"image": "example.png"
},
"class_type": "LoadImage",
"_meta": {
"title": "Load Image"
}
}
}"""
prompt = json.loads(workflow_with_api_nodes)
payload = {
"prompt": prompt,
# Add `api_key_comfy_org` to the payload.
# If you serve multiple clients, you can fetch the key from the associated user first.
"extra_data": {
"api_key_comfy_org": "comfyui-87d01e28d*******************************************************" # replace with the actual key
},
}
data = json.dumps(payload).encode("utf-8")
req = request.Request(f"{SERVER_URL}/prompt", data=data)
# Send the request
request.urlopen(req)
```
## Related Documentation
* [API Nodes Overview](https://docs.comfy.org/zh-CN/tutorials/api-nodes/overview)
* [Account Management](https://docs.comfy.org/zh-CN/interface/user)
* [Credits](https://docs.comfy.org/zh-CN/interface/credits)
# Messages
Source: https://docs.comfy.org/zh-CN/development/comfyui-server/comms_messages
## Message Mechanism
During workflow execution (and when the state of the execution queue changes), the `PromptExecutor` sends messages back to the client via the `send_sync` method of the `PromptServer` instance.
These messages are received by the `socket` event listener defined in `api.js` (at the time of writing, roughly around line 90; you can also find it by searching for `this.socket.addEventListener`).
The listener creates a `CustomEvent` object for each known message type and dispatches it to all registered listeners for that type.
Extensions can follow the standard Javascript pattern to register for events (usually done in the `setup()` function):
```Javascript
api.addEventListener(message_type, messageHandler);
```
If `message_type` is not a built-in message type, it is automatically added to the list of known message types.
When called, the registered `messageHandler` function receives a `CustomEvent` object.
This extends the `socket` event with an added `.detail` property, a dictionary containing the data sent by the server. Typical usage therefore looks like:
```Javascript
function messageHandler(event) {
if (event.detail.node == aNodeIdThatIsInteresting) { // is this the node we care about?
// do something with the data in event.detail.other_things
}
}
```
### Built-in Message Types
During workflow execution (and when the state of the execution queue changes), the `PromptExecutor` sends the following message types to the client via the `send_sync` method of the `PromptServer` instance.
Extensions can register to listen for any of these messages.
| event | trigger | data |
| ----------------------- | ---------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| `execution_start` | When a prompt is about to start executing | `prompt_id` |
| `execution_error` | When an error occurs during execution | `prompt_id`, plus additional error details |
| `execution_interrupted` | When execution is interrupted because a node raised `InterruptProcessingException` | `prompt_id`, `node_id`, `node_type`, and `executed` (a list of the IDs of nodes that have executed) |
| `execution_cached` | At the start of execution | `prompt_id`, `nodes` (a list of IDs of nodes whose cached outputs will be reused, so they are skipped) |
| `execution_success` | When all nodes in the prompt have executed successfully | `prompt_id`, `timestamp` |
| `executing` | When a new node is about to execute | `node` (the ID of the executing node, or `None` to indicate the prompt has finished), `prompt_id` |
| `executed` | When a node finishes executing and returns UI elements | `node` (node ID), `prompt_id`, `output` (the UI data returned by the node) |
| `progress` | During execution of a node that implements the relevant progress hook | `node` (node ID), `prompt_id`, `value` (current progress), `max` (maximum progress) |
| `status` | When the state of the execution queue changes | `exec_info` (a dictionary containing `queue_remaining`, the number of jobs left in the queue) |
### Using the `executed` Message
Note that unlike `executing`, the `executed` message is not sent every time a node finishes execution;
it is only triggered when the node needs to update the UI after executing.
To do this, the node's main Python execution function returns a dictionary instead of the usual tuple:
```python
# at the end of the main execution function
return { "ui": a_new_dictionary, "result": the_tuple_of_output_values }
```
The contents of `a_new_dictionary` are then sent to the client as the value of the `output` field of the `executed` message.
If the node itself has no outputs (that is, it produces no data passed to downstream nodes), the `result` key can be omitted from the returned dictionary (see, for example, the `SaveImage` node in `nodes.py`).
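As an illustration only (not the actual `SaveImage` code), the main function of a hypothetical UI-only node might end like this:

```python
def save_and_report(self, images):
    """Hypothetical main function of a UI-only node: it has no downstream
    outputs, so the returned dictionary carries only the "ui" key."""
    filenames = [f"image_{i}.png" for i, _ in enumerate(images)]
    # ... write the images to disk here ...
    return {"ui": {"images": filenames}}

out = save_and_report(None, ["a", "b"])
print(out)  # → {'ui': {'images': ['image_0.png', 'image_1.png']}}
```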
### Custom Message Types
As noted above, on the client side, adding handling for a new message type is as simple as registering a listener for your custom message type name.
```Javascript
api.addEventListener("my.custom.message", messageHandler);
```
On the server side, it is just as simple:
```Python
from server import PromptServer
# then, usually in your node's main execution function
PromptServer.instance.send_sync("my.custom.message", a_dictionary)
```
#### Getting the Current Node ID (node\_id)
Most built-in messages include the ID of the currently executing node in the `node` field. You will probably want to include it in your custom messages as well.
On the server side, the node ID can be obtained through a hidden input, by adding a `hidden` key to the node's `INPUT_TYPES` dictionary:
```Python
@classmethod
def INPUT_TYPES(s):
return {"required" : { }, # whatever your node's normal inputs are
"hidden": { "node_id": "UNIQUE_ID" } } # add this hidden key to get the node ID
def my_main_function(self, required_inputs, node_id): # node_id is passed in as an argument
# do some things...
PromptServer.instance.send_sync("my.custom.message", {"node": node_id, "other_things": etc}) # include the node ID in the message
```
# Server Overview
Source: https://docs.comfy.org/zh-CN/development/comfyui-server/comms_overview
## Overview
The Comfy server is built on the [aiohttp framework](https://docs.aiohttp.org/), which in turn relies on the [asyncio](https://pypi.org/project/asyncio/) library.
Messages from the server to the client are sent as `socket` messages via the server's `send_sync` method (the server is an instance of the `PromptServer` class defined in `server.py`). These messages are handled by the `socket` event listener registered in `api.js`. See [Messages](/zh-CN/development/comfyui-server/comms_messages) for details.
Messages from the client to the server are sent via the `api.fetchApi()` method defined in `api.js`, and these requests are handled by HTTP routes set up on the server. See [Routes](/zh-CN/development/comfyui-server/comms_routes) for details.
When you queue a request, the client submits the complete workflow, including the current values of all widgets.
Once the request is queued, the server does not pick up any modifications you make to the workflow afterwards.
To adjust server behavior dynamically during execution, you need to use routes.
# Routes
Source: https://docs.comfy.org/zh-CN/development/comfyui-server/comms_routes
## Routes
The server defines a series of `get` and `post` methods,
which can be found by searching for `@routes` in `server.py`. When you submit a workflow from the web client,
it is sent to the `/prompt` endpoint, which validates the prompt and adds it to the execution queue,
returning `prompt_id` and `number` (the position in the queue), or `error` and `node_errors` if validation fails.
The prompt queue is defined in `execution.py`, which also defines the `PromptExecutor` class.
### Built-in Routes
`server.py` defines the following routes:
#### Core API Routes
| path | get/post/ws | purpose |
| ------------------------------ | ----------- | ----------------------------------------------------------------------------- |
| `/` | get | Load the Comfy web page |
| `/ws` | websocket | WebSocket endpoint for real-time communication with the server |
| `/embeddings` | get | Get the names of the available embedding models |
| `/extensions` | get | Get the list of extensions that registered a `WEB_DIRECTORY` |
| `/features` | get | Get server features and capabilities |
| `/models` | get | Get the list of available model types |
| `/models/{folder}` | get | Get the models in a specific folder |
| `/workflow_templates` | get | Get a mapping of custom node modules to their associated template workflows |
| `/upload/image` | post | Upload an image |
| `/upload/mask` | post | Upload a mask |
| `/view` | get | View an image. See `@routes.get("/view")` in `server.py` for the many options |
| `/view_metadata/{folder_name}` | get | Get the metadata of a model |
| `/system_stats` | get | Get system information (Python version, devices, VRAM, etc.) |
| `/prompt` | get | Get the current queue status and execution info |
| `/prompt` | post | Submit a prompt to the queue |
| `/object_info` | get | Get detailed information about all node types |
| `/object_info/{node_class}` | get | Get detailed information about a specific node type |
| `/history` | get | Get the queue history |
| `/history/{prompt_id}` | get | Get the queue history for a specific prompt |
| `/history` | post | Clear the history or delete a history item |
| `/queue` | get | Get the current state of the execution queue |
| `/queue` | post | Manage queue operations (clear pending/running jobs) |
| `/interrupt` | post | Stop the currently executing workflow |
| `/free` | post | Free memory by unloading the specified models |
| `/userdata` | get | List user data files in a given directory |
| `/v2/userdata` | get | Enhanced version that lists files and directories in a structured format |
| `/userdata/{file}` | get | Get a specific user data file |
| `/userdata/{file}` | post | Upload or update a user data file |
| `/userdata/{file}` | delete | Delete a specific user data file |
| `/userdata/{file}/move/{dest}` | post | Move or rename a user data file |
| `/users` | get | Get user information |
| `/users` | post | Create a new user (multi-user mode only) |
### WebSocket Communication
The `/ws` endpoint provides real-time, two-way communication between client and server. It is used for:
* Receiving execution progress updates
* Getting node execution status in real time
* Receiving error messages and debug information
* Real-time updates when the queue state changes
The WebSocket connection sends JSON messages of different types, for example:
* `status` - overall system status updates
* `execution_start` - when prompt execution begins
* `execution_cached` - when cached results are used
* `executing` - updates during node execution
* `progress` - progress updates for long-running operations
* `executed` - when a node finishes executing
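A client can decode each JSON message and dispatch on its `type`; a minimal sketch (the payload shapes follow the table in the Messages page, and the sample message here is purely illustrative):

```python
import json

def handle_message(raw: str) -> str:
    """Decode a ComfyUI websocket message and summarize it as a string."""
    msg = json.loads(raw)
    kind, data = msg["type"], msg.get("data", {})
    if kind == "status":
        return f"queue_remaining={data['status']['exec_info']['queue_remaining']}"
    if kind == "progress":
        return f"node {data['node']}: {data['value']}/{data['max']}"
    return kind  # other message types: just report the type

sample = '{"type": "progress", "data": {"node": "5", "prompt_id": "abc", "value": 4, "max": 20}}'
print(handle_message(sample))  # → node 5: 4/20
```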
### Custom Routes
If you want to send messages from the client to the server during execution, you need to add a custom route to the server.
For complex cases you'll want to dig into the [aiohttp framework documentation](https://docs.aiohttp.org/), but most cases can be handled like this:
```Python
from server import PromptServer
from aiohttp import web
routes = PromptServer.instance.routes
@routes.post('/my_new_path')
async def my_function(request):
the_data = await request.post()
# the_data now holds a dictionary of the values sent
MyClass.handle_my_message(the_data)
return web.json_response({})
```
Unless you know exactly what you're doing, don't try to define `my_function` inside a class.
The `@routes.post` decorator does a lot of work! Instead, define the function as above,
and have it call a class method.
You can also define `@routes.get` if you don't need to modify anything.
The client can use this new route by sending a `FormData` object, as in the code below,
which would result in `the_data` above containing `message` and `node_id` keys:
```Javascript
import { api } from "../../scripts/api.js";
function send_message(node_id, message) {
const body = new FormData();
body.append('message',message);
body.append('node_id', node_id);
api.fetchApi("/my_new_path", { method: "POST", body, });
}
```
# Execution Model Inversion Guide
Source: https://docs.comfy.org/zh-CN/development/comfyui-server/execution_model_inversion_guide
[PR #2666](https://github.com/comfyanonymous/ComfyUI/pull/2666) changed the execution model from back-to-front recursion to front-to-back topological sorting. While most custom nodes are expected to keep working as before, this guide aims to help custom node developers identify cases that *might* break due to this change.
## Breaking Changes
### Monkey Patching
Any code that monkey patched the execution model is likely to break under the new one. Notably, the execution performance of this PR exceeds that of most popular monkey patches, so many of those patches are no longer necessary.
### Optional Input Validation
Before this PR, validation was only performed on nodes connected to an output node entirely via a chain of `"required"` inputs. If your custom node was previously only connected via `"optional"` inputs, you may not have noticed that it fails validation.
If your node's outputs could previously be connected to `"required"` inputs, this section is unlikely to affect you. It is mainly aimed at developers of custom nodes that use custom data types and whose nodes connect only via `"optional"` inputs.
Below are some situations that may cause validation failures, with suggested solutions:
* Using reserved [additional parameters](/zh-CN/custom-nodes/backend/datatypes#附加参数) (e.g. `min` and `max`) on types where comparison doesn't make sense (such as dictionaries), in order to configure a custom widget.
* Change the additional parameters to non-reserved keywords, such as `uiMin` and `uiMax`. *(Recommended)*
```python
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"my_size": ("VEC2", {"uiMin": 0.0, "uiMax": 1.0}),
}
}
```
* Define a custom [VALIDATE\_INPUTS](/zh-CN/custom-nodes/backend/server_overview#validate-inputs) function for that input, skipping validation of it. *(Quick fix)*
```python
@classmethod
def VALIDATE_INPUTS(cls, my_size):
return True
```
* Using composite types (e.g. `CUSTOM_A,CUSTOM_B`)
* (As an output) define and use a wrapper like `MakeSmartType` [as seen in this PR's unit tests](https://github.com/comfyanonymous/ComfyUI/pull/2666/files#diff-714643f1fdb6f8798c45f77ab10d212ca7f41dd71bbe55069f1f9f146a8f0cb9R2)
```python
class MyCustomNode:
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"input": (MakeSmartType("FOO,BAR"), {}),
}
}
RETURN_TYPES = (MakeSmartType("FOO,BAR"),)
# ...
```
* (As an input) define a custom [VALIDATE\_INPUTS](/zh-CN/custom-nodes/backend/server_overview#validate-inputs) function that takes an `input_types` argument, skipping type validation.
```python
@classmethod
def VALIDATE_INPUTS(cls, input_types):
return True
```
* (Convenient for both inputs and outputs) define and use the `@VariantSupport` decorator [as seen in this PR's unit tests](https://github.com/comfyanonymous/ComfyUI/pull/2666/files#diff-714643f1fdb6f8798c45f77ab10d212ca7f41dd71bbe55069f1f9f146a8f0cb9R15)
```python
@VariantSupport
class MyCustomNode:
@classmethod
def INPUT_TYPES(cls):
return {
"required": {
"input": ("FOO,BAR", {}),
}
}
RETURN_TYPES = (MakeSmartType("FOO,BAR"),)
# ...
```
* Using a list (e.g. `[1, 2, 3]`) as a constant in a graph definition (e.g. a constant input for a `VEC3` type). This usage previously required a frontend extension. In addition, lists of size exactly `2` would fail anyway: they were treated as invalid links.
* Wrap the list in a dictionary, like `{ "value": [1, 2, 3] }`.
### Execution Order
Execution order already varied with node `ID`s; now it can also change depending on which values are cached. In general, apart from the constraints inherent in the graph structure, execution order should be considered non-deterministic and subject to change at any time.
Never rely on a specific execution order.
*HIC SUNT DRACONES*
## New Features
### Validation Changes
To mitigate the impact of the [optional input validation](#optional-input-validation) change described above, several features have been added to the `VALIDATE_INPUTS` function.
* Inputs received by the `VALIDATE_INPUTS` function no longer go through the default validation process.
* `VALIDATE_INPUTS` functions can now accept a `**kwargs` argument. When used, the node creator is considered to have taken over validation of all inputs.
* `VALIDATE_INPUTS` functions can take an argument named `input_types`. This is a dictionary mapping each input connected via a link to the type of the corresponding connected output. If this argument is defined, type validation is skipped for all of the node's inputs.
See the [VALIDATE\_INPUTS](/zh-CN/custom-nodes/backend/server_overview#validate-inputs) documentation for more details.
### Lazy Evaluation
Inputs now support lazy evaluation: you can determine whether an input value is actually needed before executing the upstream node it is connected to (along with all of that node's dependencies). See [Lazy Evaluation](/zh-CN/custom-nodes/backend/lazy_evaluation) for more information.
### Node Expansion
At runtime, nodes can dynamically expand into a subgraph. Among other things, this makes it possible to implement loops via tail recursion. See [Node Expansion](/zh-CN/custom-nodes/backend/expansion) for more information.
# Custom Nodes
Source: https://docs.comfy.org/zh-CN/development/core-concepts/custom-nodes
Learn how to install, manage dependencies for, update, disable, and uninstall custom nodes in ComfyUI
## About Custom Nodes
After installing ComfyUI, you'll find that it already includes many nodes. These native nodes are called **Comfy Core** nodes and are maintained by the ComfyUI team.
There are also a wide variety of [**custom nodes**](https://registry.comfy.org) from many authors in the ComfyUI community. These custom nodes bring many extended features to ComfyUI, greatly expanding its functionality and capabilities.
In this guide, we cover common operations on custom nodes, including installation, updating, disabling, uninstalling, and dependency installation.
Anyone can develop their own extended functionality for ComfyUI and share it with others. You can find many community custom nodes [here](https://registry.comfy.org). If you want to develop your own custom node, start with the section below:
Learn how to start developing a custom node
## Managing Custom Nodes
In this section we will cover:
* Installing custom nodes
* Installing node dependencies
* Custom node version control
* Uninstalling custom nodes
* Temporarily disabling custom nodes
* Handling custom node dependency conflicts
### 1. Installing Custom Nodes
ComfyUI currently supports installing custom nodes in several ways, including:
* Via ComfyUI Manager (recommended)
* Via Git
* Manual installation
We recommend installing custom nodes with **ComfyUI Manager**, a tool of great significance in the ComfyUI custom node ecosystem. It makes managing custom nodes (searching, installing, updating, disabling, and uninstalling) simple: just search for the node you want in ComfyUI Manager and click install.
However, since all custom nodes are currently hosted on GitHub, for regions that cannot reliably access GitHub this guide also documents the different installation methods in detail.
As we recommend **ComfyUI Manager** for managing custom node packs, you can find its source code [here](https://github.com/Comfy-Org/ComfyUI-Manager).
This document therefore uses installing ComfyUI Manager itself as the custom node installation example, and the relevant sections explain how to use it to manage nodes.
Because ComfyUI Manager is very feature-rich, it is covered in a separate document. Visit the link below to learn how to install custom nodes with ComfyUI Manager.
Learn how to install custom nodes using ComfyUI Manager
First, make sure Git is installed on your system. You can check by entering the following command in your system terminal:
```bash
git --version
```
If Git is already installed, you will see output similar to the following:
If it is not installed, visit [git-scm.com](https://git-scm.com/) to download the installer. Linux users should refer to [git-scm.com/downloads/linux](https://git-scm.com/downloads/linux) to complete the installation.
For the ComfyUI Desktop version, you can use the Desktop terminal to complete the installation as described below.
After installing Git, we need the repository address of the custom node. Here we use the ComfyUI-Manager repository as an example:
```bash
https://github.com/Comfy-Org/ComfyUI-Manager
```
For regions that cannot reliably access GitHub, you can try forking the repository on another code hosting service (such as Gitee) and using that repository address to install the node.
First we need to enter ComfyUI's custom node directory. Taking the ComfyUI portable version as an example, if the folder is located at `D:\ComfyUI_windows_portable`, the custom node folder is `D:\ComfyUI_windows_portable\ComfyUI\custom_nodes`. Use the `cd` command to enter that directory:
```bash
cd D:\ComfyUI_windows_portable\ComfyUI\custom_nodes
```
Then use the `git clone` command to install the node:
```bash
git clone https://github.com/Comfy-Org/ComfyUI-Manager
```
If everything goes well, you will see output similar to the following:
This means you have successfully cloned the custom node's code. Next, we need to install its dependencies.
Follow the instructions in the [Installing Node Dependencies](#2-installing-node-dependencies) section to install the dependencies.
Manual installation is not the recommended approach, but it is a fallback when you cannot install smoothly with git.
Plugins installed this way lose their git version history and cannot be conveniently version-managed afterwards.
For a manual installation, first download the node's code, then extract it into the appropriate directory.
Visit the repository page of the custom node:
1. Click the `Code` button
2. Then click the `Download ZIP` button to download the zip archive.
3. Extract the zip archive
Copy the extracted code into ComfyUI's custom node directory. Taking the ComfyUI portable version as an example, if the folder is located at `D:\ComfyUI_windows_portable`, the custom node folder is `D:\ComfyUI_windows_portable\ComfyUI\custom_nodes`; copy the extracted code into that directory.
Follow the instructions in the [Installing Node Dependencies](#2-installing-node-dependencies) section to install the dependencies.
### 2. Installing Node Dependencies
Custom nodes require their dependencies to be installed. For ComfyUI-Manager, for example, you can check the [requirements.txt](https://github.com/Comfy-Org/ComfyUI-Manager/blob/main/requirements.txt) file to see the required packages.
In the previous steps, we only cloned the custom node's code locally and did not install its dependencies, so next we need to install them.
In fact, if you install plugins with ComfyUI-Manager, it installs the dependencies for you automatically; you only need to restart ComfyUI after installing a plugin. This is one reason we strongly recommend installing custom nodes with ComfyUI Manager.
But there may be situations where you cannot use ComfyUI Manager to install custom nodes, so we provide these more detailed dependency installation instructions.
In the [Dependencies](/zh-CN/development/core-concepts/dependencies) section we explain how dependencies work in ComfyUI. ComfyUI is a **Python**-based project. We build a standalone **Python** environment to run ComfyUI, and all related dependencies need to be installed into this standalone **Python** environment.
If you run `pip install -r requirements.txt` directly in a system-level terminal, the dependencies may be installed into the system-level **Python** environment. The custom node's dependencies would then still be missing from ComfyUI's environment, and the node would fail to run.
So next we need to install the dependencies using ComfyUI's standalone Python environment.
The dependency installation steps differ depending on your ComfyUI version.
The ComfyUI portable version uses an embedded Python located in the `\ComfyUI_windows_portable\python_embeded` directory, and we need to use that Python to install the dependencies.
First start a terminal in the portable version's directory, or start a terminal and use `cd` to enter the `\ComfyUI_windows_portable\` directory
Make sure the terminal's current directory is `\ComfyUI_windows_portable\`, e.g. `D:\ComfyUI_windows_portable\` as shown below
Then use `python_embeded\python.exe` to install the dependencies:
```bash
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-Manager\requirements.txt
```
You can of course replace ComfyUI-Manager with the name of the custom node you actually installed, but make sure a `requirements.txt` file actually exists in that node's directory.
ComfyUI Desktop already installs ComfyUI-Manager and its dependencies during setup. Since this article uses ComfyUI Manager as the installation example, you do not actually need to install ComfyUI Manager's dependencies again on the desktop version.
Unless something unusual comes up, we recommend installing custom nodes with ComfyUI Manager, so that you don't need to install dependencies manually.
Then use the following command to install the plugin's dependencies:
```bash
pip install -r .\custom_nodes\<custom_node_name>\requirements.txt
```
The image below shows the dependency installation for ComfyUI-Hunyuan3Dwrapper.
For users with a custom Python environment, we recommend using `pip install -r requirements.txt` to install the dependencies.
### Custom Node Version Control
Version control for custom nodes is based on Git: you can manage node versions with Git directly, but this version management is also well integrated into ComfyUI Manager. Many thanks to [@Dr.Lt.Data](https://github.com/ltdrdata) for bringing us such a convenient tool.
In this section we explain both approaches to plugin version management. Note that if you installed manually from a zip archive, the git version history is lost and you will not be able to manage versions this way.
Since we are actively iterating on ComfyUI Manager, the latest interface and steps may differ considerably
Follow the steps shown to open the relevant ComfyUI Manager screen
You can use the filter to show only installed node packs, then manage those nodes
Switch to the desired version; Manager will handle the corresponding dependency updates and installation. After switching versions, you usually need to restart ComfyUI for the changes to take effect.
Locate the directory of the node, e.g. `ComfyUI/custom_nodes/ComfyUI-Manager`
Use the `cd` command to enter that folder
```bash
cd <你的安装目录>/ComfyUI/custom_nodes/ComfyUI-Manager
```
You can list all available tags and releases with the following command:
```bash
git tag
```
This lists all version tags; pick the one you want to switch to.
Switch to a specific tag or release with the following command:
```bash
git checkout <tag_name>
```
Replace `<tag_name>` with the version tag you want to switch to.
If you want to switch to a specific commit, use the following command:
```bash
git checkout <commit_hash>
```
Replace `<commit_hash>` with the hash of the commit you want to switch to.
Since the custom node pack's dependencies may change after a version switch, you need to reinstall the node's dependencies. Follow the instructions in the [Installing Node Dependencies](#2-installing-node-dependencies) section to do so in the appropriate environment
### Uninstalling Custom Nodes
To be updated
### Temporarily Disabling Custom Nodes
To be updated
### Custom Node Dependency Conflicts
To be updated
## ComfyUI Manager
The [Desktop version](/zh-CN/installation/desktop/windows) already includes this tool by default. For the [portable version](/zh-CN/installation/comfyui_portable_windows), you need to install it by following the [Installing the Manager](#installing-the-manager) section of this document.
As ComfyUI has evolved, ComfyUI Manager's role has become increasingly important. ComfyUI-Manager has now officially joined the Comfy Org organization and become part of ComfyUI's core dependencies, with the original author [Dr.Lt.Data](https://github.com/ltdrdata) continuing to maintain it. See [this blog post](https://blog.comfy.org/p/comfyui-manager-joins-comfy-org) for more information.
In future iterations we will greatly improve the ComfyUI Manager experience, so the interface shown in this document may differ from the latest version of ComfyUI Manager.
### Installing the Manager
If you are running the ComfyUI server application, you will need to install the manager. If ComfyUI is running, close it before continuing.
The first step is to install Git, a command-line application for software version control. Git will download the ComfyUI manager from [github.com](https://github.com). Download and install Git from [git-scm.com](https://git-scm.com/).
After installing Git, navigate to the ComfyUI server program directory and enter the folder labeled **custom\_nodes**. Open a command window or terminal. Make sure the command line shows the current directory path as **custom\_nodes**. Enter the following command. This downloads the manager. Technically, this is called *cloning a Git repository*.
### Detecting Missing Nodes
After installing the manager, you can use it to detect missing nodes.
## Developing a Custom Node
If you have some development skills, start with the documentation below to learn how to develop a custom node.
Learn how to start developing a custom node
# Dependencies
Source: https://docs.comfy.org/zh-CN/development/core-concepts/dependencies
Learn about dependencies in ComfyUI
## Workflow Files Depend on Other Files
We often get all kinds of workflow files from the community, only to find that a workflow won't run as-is. That's because a workflow file depends on other files beyond the workflow itself: input media assets, models, custom nodes, related Python dependencies, and so on.
A ComfyUI workflow can only run when all of its dependencies are satisfied.
The runtime dependencies of a ComfyUI workflow fall into the following categories:
* Assets (input media files, including audio, video, images, etc.)
* Custom nodes
* Python dependencies
* Models (e.g. Stable Diffusion models)
## Assets
An AI model is an example of an ***asset***. In media production, an asset is some media file that provides input data. For example, a video editing program works with movie files stored on disk. The editor's project file holds links to these movie file assets, allowing non-destructive editing that leaves the original movie files unchanged.
ComfyUI works the same way. A workflow can only run when all required assets are found and loaded. Generative AI models, images, movies, and sounds are some examples of assets a workflow may depend on. These are therefore called ***dependent assets*** or ***asset dependencies***.
## Custom Nodes
Custom nodes are an important part of ComfyUI, extending its functionality beyond the core node set.
## Python Dependencies
ComfyUI is a Python-based project. We build a standalone Python environment to run ComfyUI, and all related dependencies are installed into this standalone Python environment.
### ComfyUI's Dependencies
You can check ComfyUI's current dependencies in its [requirements.txt](https://github.com/comfyanonymous/ComfyUI/blob/master/requirements.txt) file
```text
comfyui-frontend-package==1.14.5
torch
torchsde
torchvision
torchaudio
numpy>=1.25.0
einops
transformers>=4.28.1
tokenizers>=0.13.3
sentencepiece
safetensors>=0.4.2
aiohttp>=3.11.8
yarl>=1.18.0
pyyaml
Pillow
scipy
tqdm
psutil
#non essential dependencies:
kornia>=0.7.1
spandrel
soundfile
av
```
As ComfyUI evolves, we may adjust the dependencies, such as adding new ones or removing ones that are no longer needed.
So if you update ComfyUI via Git, after pulling the latest changes you need to run, in the corresponding environment:
```bash
pip install -r requirements.txt
```
to install ComfyUI's latest dependencies and keep it running correctly. You can also upgrade or downgrade individual dependencies by changing the version constraint of a specific package
In addition, the ComfyUI frontend, [ComfyUI\_frontend](https://github.com/Comfy-Org/ComfyUI_frontend), is currently maintained as a separate project. We update the `comfyui-frontend-package` dependency version once the corresponding release is stable. If you need to switch frontend versions, you can find the version information [here](https://pypi.org/project/comfyui-frontend-package/#history).
### Custom Node Dependencies
Thanks to the efforts of many authors in the ComfyUI community, we can extend ComfyUI's functionality with a wide variety of custom nodes and achieve remarkable creative results.
Usually, each custom node has its own dependencies, and each ships with its own `requirements.txt` file.
If you install custom nodes with [ComfyUI Manager](https://github.com/ltdrdata/ComfyUI-Manager), it will usually install the dependencies for you automatically.
There are also cases where you need to install dependencies manually. Currently all custom nodes are installed under the `ComfyUI/custom_nodes` directory.
You need to enter the plugin's directory in your ComfyUI Python environment and run `pip install -r requirements.txt` to install its dependencies.
For the [Windows portable version](/zh-CN/installation/comfyui_portable_windows), you can run the following from the portable `ComfyUI_windows_portable` directory:
```
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\<custom_node_name>\requirements.txt
```
to install the node's dependencies.
### Dependency Conflicts
Dependency conflicts are a common problem when using ComfyUI. You may find that after installing or updating a custom node, some previously installed custom nodes no longer appear in ComfyUI's node library, or an error dialog pops up. One possible cause is a dependency conflict.
Dependency conflicts can have many causes, for example:
1. Pinned versions in custom nodes
Some plugins pin an exact version of a dependency library (e.g. `open_clip_torch==2.26.1`) while other plugins require a higher version (e.g. `open_clip_torch>=2.29.0`), making the requirements impossible to satisfy simultaneously.
**Solution**: You can try changing the pinned version to a range constraint such as `open_clip_torch>=2.26.1`, then re-run the dependency installation to resolve the conflict.
2. Environment pollution
While installing a custom node's dependencies, the versions of libraries already installed by other plugins may be overwritten. For example, multiple plugins may depend on `PyTorch` but require different CUDA versions, so a later install breaks the existing environment.
**Solutions**:
* Try manually installing specific versions of the dependencies in the Python virtual environment.
* Or create separate Python virtual environments for different plugins.
* Try installing plugins one at a time, restarting ComfyUI after each install to check whether a dependency conflict appears.
3. Custom node dependency versions incompatible with ComfyUI's dependencies
**Solution**: This kind of dependency conflict can be hard to resolve. You may need to upgrade or downgrade ComfyUI, or change the custom node's dependency versions.
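The pinned-versus-range situation in point 1 can be reasoned about with a tiny checker; a sketch using simple tuple comparison (real pip resolution is far more involved):

```python
def parse_version(v: str) -> tuple:
    """Turn '2.26.1' into (2, 26, 1) for comparison."""
    return tuple(int(p) for p in v.split("."))

def satisfies(version: str, spec: str) -> bool:
    """Check one '==' or '>=' constraint (a deliberately naive sketch)."""
    if spec.startswith(">="):
        return parse_version(version) >= parse_version(spec[2:])
    if spec.startswith("=="):
        return parse_version(version) == parse_version(spec[2:])
    raise ValueError(f"unsupported spec: {spec}")

# Plugin A pins ==2.26.1 while plugin B wants >=2.29.0: 2.29.0 fails A's pin.
print(satisfies("2.29.0", "==2.26.1"), satisfies("2.29.0", ">=2.29.0"))  # → False True
# Relaxing A's pin to >=2.26.1 lets 2.29.0 satisfy both constraints.
print(satisfies("2.29.0", ">=2.26.1"))  # → True
```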
## Models
Models are an important asset dependency of ComfyUI. Essentially, all kinds of custom nodes and workflows revolve around specific models, such as the Stable Diffusion family, the Flux family, Ltxv, and so on.
These models are the essential foundation of our creative work in ComfyUI, so we need to make sure the models we use are available and working. Models are usually stored in the corresponding subdirectory of `ComfyUI/models/`, but you can also create an [extra\_model\_paths.yaml](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) from the template to make ComfyUI recognize additional model paths.
This way, multiple ComfyUI instances can share a single model library, reducing disk usage.
# Links
Source: https://docs.comfy.org/zh-CN/development/core-concepts/links
Learn about connection links in ComfyUI
Since ComfyUI is still iterating and developing rapidly, with changes landing every day, the descriptions of operations in this article may have changed or be incomplete; please treat actual behavior as authoritative. If an operation has changed, it may be because we have shipped an update. You can also fork [this repo](https://github.com/Comfy-Org/docs) and help us improve this documentation.
## Links Connect Nodes
In ComfyUI terminology, the lines or curves between nodes are called ***links***. Links can be displayed in several ways, such as curves, right-angled lines, straight lines, or hidden entirely.
You can change the link style under **Settings menu** --> **Lite Graph** --> **Graph** --> **Link Render Mode**.
You can also temporarily hide links from the **Canvas Menu**.
Depending on the situation, you may need to see all the links. Especially when learning, sharing, or simply understanding a workflow, visible links help other users understand how the different nodes interact. But for a packaged workflow that isn't meant to be changed, you can hide the links to get a clean, tidy layout.
### Reroute Nodes
Often, when there are too many nodes, the links inevitably get obscured or cross each other, making the workflow hard to understand. If you want to keep the wiring clear, you can use a **reroute** node to adjust the links manually.
We are also continuously iterating: litegraph's native reroute feature has been improved, and going forward we recommend using that feature to reorganize links.
## Color Coding
The data types of node properties are indicated by the color coding of the input/output ports and the link lines. You can always tell by color which inputs and outputs can be connected to each other. Ports can only be connected to other ports of the same color, ensuring that the data types match.
Common data types:
| Data type | Color |
| --------------------- | ------------ |
| Diffusion model | Lavender |
| CLIP model | Yellow |
| VAE model | Rose |
| Conditioning | Orange |
| Latent image | Pink |
| Pixel image | Blue |
| Mask | Green |
| Number (int or float) | Light green |
| Mesh | Bright green |
# Models
Source: https://docs.comfy.org/zh-CN/development/core-concepts/models
{/*
description: "Learn about AI models and their role in ComfyUI"
*/}
## Models Are Essential
Models are core components of media generation workflows. By combining and mixing them, a wide range of creative effects can be achieved.
The word ***model*** has several meanings. Here, it refers to a data file that carries the information a node graph needs to perform its task. Specifically, it is a data structure that represents some capability. As a verb, to model means to represent or provide an example of something.
In ComfyUI, the typical example of a model data file is an AI ***diffusion model***. This is a large body of data representing the complex relationships between text and images, enabling conversion between words and pictures. Other common models used for image generation include multimodal vision and language models such as CLIP, and image upscaling models such as RealESRGAN.
## Model Files
Model files are essential for generative media production. Without them, a workflow cannot proceed. Model files are not included in the ComfyUI installation, but missing model files can often be downloaded and installed automatically. Many models can be downloaded and installed from the **ComfyUI Manager** window. Models are also available on sites such as [huggingface.co](https://huggingface.co), [civitai.green](https://civitai.green), and [github.com](https://github.com).
### Using Models in ComfyUI
1. Download a model and place it in the ComfyUI program directory
1. In the **models** folder, you will find subfolders for various model types, such as **checkpoints**
2. The **ComfyUI Manager** helps automate the search, download, and installation process
3. If ComfyUI is running, restart it
2. In your workflow, create the node appropriate for the model type, e.g. **Load Checkpoint**, **Load LoRA**, **Load VAE**
3. In the loader node, select the model you wish to use
4. Connect the loader node to other nodes in your workflow
## Adding External Model Paths
If you want to manage your model files outside `ComfyUI/models`, it might be for reasons such as:
* You have multiple ComfyUI instances and want them to share model files to reduce disk usage
* You have several different GUI programs (e.g. WebUI) and want them to share model files
* Model files cannot be recognized or found
We provide a way to add extra model search paths via the `extra_model_paths.yaml` configuration file.
### Configuration File Location by ComfyUI Version
For the [portable](/zh-CN/installation/comfyui_portable_windows) and [manually installed](/zh-CN/installation/manual_install) versions of ComfyUI, you can find the example file `extra_model_paths.yaml.example` in the ComfyUI root directory
```
ComfyUI/extra_model_paths.yaml.example
```
Copy it and rename it to `extra_model_paths.yaml`, keeping it in the ComfyUI root directory; the path should be `ComfyUI/extra_model_paths.yaml`
You can also find the example configuration file [here](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example)
If you are using the ComfyUI desktop application, you can open the extra model configuration file as shown below:
Or open it from the following locations:
```
C:\Users\YourUsername\AppData\Roaming\ComfyUI\extra_models_config.yaml
```
```
~/Library/Application Support/ComfyUI/extra_models_config.yaml
```
The entries automatically generated in this configuration file should not be changed
### Example Configuration
For example, suppose the additional model files you want ComfyUI to recognize are located in the following folders:
```
📁 YOUR_PATH/
├── 📁models/
| ├── 📁 loras/
| │ └── xxxxx.safetensors
| ├── 📁 checkpoints/
| │ └── xxxxx.safetensors
| ├── 📁 vae/
| │ └── xxxxx.safetensors
| └── 📁 controlnet/
| └── xxxxx.safetensors
```
Then you can configure it as follows so that ComfyUI recognizes the model paths on your device:
```
my_custom_config:
base_path: YOUR_PATH
loras: models/loras/
checkpoints: models/checkpoints/
vae: models/vae/
controlnet: models/controlnet/
```
Or, alternatively:
```
my_custom_config:
base_path: YOUR_PATH/models/
loras: loras
checkpoints: checkpoints
vae: vae
controlnet: controlnet
```
For the desktop version, add new entries to the existing configuration file rather than overwriting the path configuration generated automatically during installation. Back up the file before modifying it so that you can recover quickly if the configuration is wrong
Alternatively, you can refer to the default [extra\_model\_paths.yaml.example](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) for configuration. After saving, you need to **restart ComfyUI** for the changes to take effect.
Below is the complete original example configuration:
```yaml
#Rename this to extra_model_paths.yaml and ComfyUI will load it
#config for a1111 ui
#all you have to do is change the base_path to where yours is installed
a111:
base_path: path/to/stable-diffusion-webui/
checkpoints: models/Stable-diffusion
configs: models/Stable-diffusion
vae: models/VAE
loras: |
models/Lora
models/LyCORIS
upscale_models: |
models/ESRGAN
models/RealESRGAN
models/SwinIR
embeddings: embeddings
hypernetworks: models/hypernetworks
controlnet: models/ControlNet
#config for comfyui
#your base path should be either an existing comfy install or a central folder where you store all of your models, loras, etc.
#comfyui:
# base_path: path/to/comfyui/
# # You can use is_default to mark that these folders should be listed first, and used as the default dirs for eg downloads
# #is_default: true
# checkpoints: models/checkpoints/
# clip: models/clip/
# clip_vision: models/clip_vision/
# configs: models/configs/
# controlnet: models/controlnet/
# diffusion_models: |
# models/diffusion_models
# models/unet
# embeddings: models/embeddings/
# loras: models/loras/
# upscale_models: models/upscale_models/
# vae: models/vae/
#other_ui:
# base_path: path/to/ui
# checkpoints: models/checkpoints
# gligen: models/gligen
# custom_nodes: path/custom_nodes
```
### 添加额外自定义节点路径
除了添加外部模型之外,你同样可以添加不在 ComfyUI 默认路径下的自定义节点路径
请注意,这并不会改变自定义节点的默认安装路径,只是在启动 ComfyUI 时会增加额外的路径搜索,你仍旧需要在对应的环境中完成自定义节点的依赖的安装,来保证其运行环境的完整性。
下面是一个简单的配置示例(Mac 系统),请根据你的实际情况进行修改,并新增到对应的配置文件中,保存后需要 **重启 ComfyUI** 才能生效:
```yaml
my_custom_nodes:
custom_nodes: /Users/your_username/Documents/extra_custom_nodes
```
### 文件大小
相对于图像文件,模型文件可能非常大。一个典型的未压缩图像可能需要几 MB 的磁盘存储,而生成式 AI 模型可能大数万倍,单个模型可达数十 GB。它们占用大量磁盘空间,并且在网络上传输需要很长时间。
## 模型训练和优化
生成式 AI 模型是通过在非常大的数据集(例如图像与文本描述的配对)上训练机器学习程序创建的。AI 模型并不显式存储训练数据,而是存储数据中隐含的相关性。
像 Stability AI 和 Black Forest Labs 这样的组织和公司会发布携带大量通用信息的"基础"模型,即通用的生成式 AI 模型。通常,基础模型需要经过***优化***,才能获得高质量的生成输出;有一个专门的社区致力于优化基础模型。新的优化模型能产生更好的输出、提供新的或不同的功能,或者使用更少的资源,因此通常可以在算力和/或内存较少的系统上运行。
## 辅助模型
模型功能可以通过辅助模型进行扩展。例如,艺术指导文本到图像的工作流程以实现特定结果,单靠扩散模型可能会很困难或不可能。额外的模型可以在工作流程图中优化扩散模型,以产生所需的结果。示例包括**LoRA**(低秩适应),一个针对特定主题训练的小模型;**ControlNet**,一个使用引导图像帮助控制构图的模型;以及**Inpainting**,一个允许某些扩散模型在现有图像中生成新内容的模型。
# 节点
Source: https://docs.comfy.org/zh-CN/development/core-concepts/nodes
了解 ComfyUI 中节点的概念。
在 ComfyUI 中,节点是执行任务的基本单元。无论来自 **Comfy Core** 还是 **自定义节点**,每个节点都是一个构建好的、功能独立的模块;节点之间通过连线连接,我们可以像搭乐高积木一样搭建出复杂的功能。
可以说,不同的节点组合构建出了 ComfyUI 的无限可能。
例如在 K-Sampler 节点中,你可以看到它有多个输入和输出,也同时包含多个参数设置,这些参数决定了节点执行的逻辑,它的背后是对应编写好的 Python 逻辑,从而可以让你不用去接触代码就可以实现对应的功能。
由于 ComfyUI 仍在快速迭代和开发中,我们每天都在持续更新,所以本文中相关操作的部分可能会有变动或遗漏,请以实际界面为准。如果你发现实际操作有变动,可能是因为我们进行了迭代更新;你也可以 fork [这个 repo](https://github.com/Comfy-Org/docs) 和我们一起完善这个文档。
## 节点的不同状态
在 ComfyUI 中,节点有多种状态,下面是一些常见的节点状态:
1. **正常(Normal)状态**: 正常状态
2. **运行(Running)状态**: 运行中状态,通常在你开始运行工作流后,正在执行的节点会显示这个状态
3. **错误(Error)状态**: 节点错误,通常在运行工作流后,如果对应的节点输入存在问题,导致了错误会显示这个状态,并用红色标识对应出错的输入节点,你需要解决对应出错的输入来保证工作流正常运行
4. **丢失(Missing)状态**: 这个状态通常在你导入了一些工作流后会出现,存在两种可能
* ComfyCore 原生节点丢失: 这通常是因为 ComfyUI 的版本更新了,而你当前使用的 ComfyUI 版本较旧,你需要更新 ComfyUI 来解决这个问题
* 自定义节点丢失: 工作流中使用了第三方作者开发的自定义节点,而你本地的 ComfyUI 没有安装对应的自定义节点,你可以使用 [ComfyUI-Manager](https://github.com/Comfy-Org/ComfyUI-Manager) 来查找并安装丢失的自定义节点
## 节点之间的连接
在 ComfyUI 中,节点通过[连线](/zh-CN/development/core-concepts/links)连接,从而让相同的数据类型在不同的处理单元之间进行流转处理,从而获得最终的结果。
每个节点都会接收一些输入内容,经过模块处理后转换为对应的输出。不同节点之间的连接必须符合数据类型的要求;在 ComfyUI 中,我们使用不同的颜色来区分数据类型,下面是一些基础的数据类型。
| 数据类型 | 颜色 |
| ----------- | ---- |
| 扩散模型 | 薰衣草色 |
| CLIP 模型 | 黄色 |
| VAE 模型 | 玫瑰色 |
| 条件化 | 橙色 |
| 潜在图像 | 粉色 |
| 像素图像 | 蓝色 |
| 蒙版 | 绿色 |
| 数字 (整数或浮点数) | 浅绿色 |
| 网格(Mesh) | 亮绿色 |
随着 ComfyUI 的迭代,我们可能会拓展更多的数据类型,以符合更多场景的需求。
### 节点之间的连接和取消连接
**连接**: 在上一个节点的输出点中拖拽到下一个节点相同颜色的输入中,即可连接
**取消连接**: 在输入端点上按住鼠标左键,将连线拖离后释放即可取消连接;也可以通过连线中点的菜单来取消连接。
## 节点的外观
我们提供了多种样式设置,你可以根据需求来设置节点的外观:
* 修改样式
* 双击节点标题修改节点名称
* 通过上下文菜单将节点输入在 input 和 组件(widget)之间进行切换
* 通过节点右下角来缩放节点尺寸
### 节点标签 Badges
我们提供了多个节点标签(Badges)的显示功能,比如:
* 节点ID
* 节点来源
目前 **Comfy Core 节点** 采用小狐狸的图标来展示,自定义节点则采用其名称,这样你可以快速了解到对应节点是来自哪个节点包。
你可以在菜单中设置对应的显示:
## 节点上下文菜单
节点的上下文菜单主要分为两种
* 针对节点本身的上下文菜单
* 针对输入 / 输出的上下文菜单
### 节点的上下文菜单
通过在节点上点击鼠标右键,你可以展开对应的节点上下文菜单,下面是对应的菜单截图:
在节点的右键上下文菜单中你可以
* 调整节点的颜色样式
* 修改标题
* 克隆、复制、删除节点
* 设置节点的模式
在这个菜单中,除了外观相关的设置之外,比较重要的是下面的菜单操作
* **模式(Mode)**: 设置节点的模式,Always、Never、绕过(Bypass)
* **切换节点输入的控件(Widget)和输入模式**: 将节点的输入在控件(Widget)和输入(Input)两种模式之间切换
#### 模式(Mode)
对于模式,你可能注意到目前我们提供了 Always、Never、On Event、On Trigger 四种模式,但实际上只有 **Always** 和 **Never** 是有效的;**On Event** 和 **On Trigger** 尚未完全实现。另外,你也可以把 **绕过(Bypass)** 理解为一种模式。下面是对几种可用模式的解释
* **Always**: 节点的默认模式。当节点首次运行,或自上一次执行后对应输入发生变化时,节点都会执行
* **Never**: 节点在任何情况下都不会执行,就像节点被删除了一样,后续节点无法从它读取或接收到任何数据
* **绕过(Bypass)**: 节点在任何情况下都不会执行,但后续节点仍可以尝试获取未经该节点处理的数据
下面是对于 `Never` 和 `Bypass` 模式的对比:
在这个对比的例子中,你可以看到,两个工作流都是同时应用了两个 LoRA 模型,差异在于其中一个`Load LoRA` 节点被设置为 `Never` 模式而另一个被设置为`Bypass` 模式。
* 被设置为 `Never` 模式的节点,后续的节点由于接收不到任何的输入数据而出现了报错
* 被设置为 `Bypass` 模式的节点,后续的节点仍旧可以获取到未经这个节点处理的数据,从而加载了第一个`Load LoRA` 节点的输出数据,所以后续的工作流依旧可以正常运行
#### 切换节点输入的控件(Widget)和输入模式
在有些情况下,我们需要使用来自其它节点的输出结果作为输入,此时就可以通过在控件(Widget)和输入(Input)两种模式之间切换节点输入来实现。
下面是一个非常简单的例子:
通过将 K-Sampler 的 Seed 从输入控件(Widget)切换为输入模式,从而统一多个节点的 seed ,实现多个采样间的变量统一。
对比第一个节点和后续的两个节点,你可以看到后两个节点的 seed 是输入模式了,同样你还可以把它再转换回控件模式:
前端 v1.16.0 版本后,我们改进了这一功能:现在你只需将输入端的连线直接连接到对应组件,即可完成这一过程
### 输入 / 输出的上下文菜单
这里上下文菜单主要和对应输入输出的数据类型相关
在拖动节点的输入 / 输出的时候,当有连线出现,但你未连接到其它节点的输入或输出的时候,此时释放鼠标则会弹出针对输入 / 输出的上下文菜单,用于快速添加相关类型的节点。
你可以在设置中调整对应的节点建议的数量
## 节点选择工具箱
**节点选择工具箱(Selection tool box)** 是一个为节点提供快速操作的浮层工具。当你选中一个节点时,它会悬浮在该节点上方,通过这个工具箱你可以:
* 修改节点的颜色
* 快速设置节点为 Bypass 模式(在运行时候不执行)
* 固定节点
* 删除节点
当然,这些功能在对应节点的右键菜单中也可以找到,节点选择工具箱只是提供了一个快捷操作,如果你想要关闭这个功能,可以在设置中关闭。
## 节点组
在 ComfyUI 中,你可以同时选中工作流的一部分节点,再使用右键菜单将它们合并成一个节点组,使对应部分成为一个可复用的模块,从而在你的 ComfyUI 中重复调用
# 属性
Source: https://docs.comfy.org/zh-CN/development/core-concepts/properties
{/*
description: "了解 ComfyUI 中节点属性"
*/}
## 节点是属性的容器
节点通常具有 ***属性***。也称为 ***参数*** 或 ***特性***,节点属性是可以更改的变量。一些属性可以通过用户手动调整,使用称为 ***小部件*** 的数据输入字段。其他属性可以通过连接到属性 ***输入插槽*** 或端口的其他节点自动驱动。通常,属性可以在小部件和输入之间转换,从而允许用户手动或自动控制属性值。
属性可以采取多种形式,并包含多种不同类型的信息。例如,**加载检查点** 节点具有一个属性:生成模型检查点文件的文件路径。**KSampler** 节点具有多个属性,例如采样 **步骤**、**CFG** 比例、**采样器名称** 等等。
## 数据类型
信息可以以多种不同形式出现,称为 ***数据类型***。例如,字母数字文本称为 ***字符串***,整数称为 ***整数***,带小数点的数字称为 ***浮点数*** 或 ***浮点***。新的数据类型总是被添加到 ComfyUI 中。
ComfyUI 是用 Python 脚本语言编写的,该语言对数据类型非常宽容。相比之下,ComfyUI 环境是非常 ***强类型*** 的。这意味着不同的数据类型不能混合。例如,我们不能将图像输出连接到整数输入。这对用户来说是一个巨大的好处,指导他们正确构建工作流程并防止程序错误。
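作为补充,下面用一个极简的 Python 示意展示节点如何声明输入和输出的数据类型,正是这些类型声明让 ComfyUI 能阻止把图像输出连到整数输入。其中的 `ExampleNode` 是为演示而假设的节点类,并非 ComfyUI 实际源码:

```python
class ExampleNode:
    """一个假设的自定义节点:通过类型声明实现强类型的输入与输出。"""

    @classmethod
    def INPUT_TYPES(cls):
        # 每个输入都绑定一个数据类型:
        # "image" 只能接收 IMAGE 类型的连线,"steps" 是一个整数小部件
        return {
            "required": {
                "image": ("IMAGE",),
                "steps": ("INT", {"default": 20, "min": 1, "max": 100}),
            }
        }

    RETURN_TYPES = ("IMAGE",)  # 输出同样带有类型
    FUNCTION = "process"

    def process(self, image, steps):
        # 示意:原样返回图像
        return (image,)
```

在连线时,前端会比较输出与输入声明的类型字符串(如 `IMAGE`、`INT`),类型不一致的连线会被拒绝。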
# 工作流
Source: https://docs.comfy.org/zh-CN/development/core-concepts/workflow
{/*
description: "了解 ComfyUI 中工作流的概念。"
*/}
## 节点图
ComfyUI 是一个用于构建和运行生成内容的 ***工作流*** 的环境。在这个上下文中,工作流被定义为一组称为 ***节点*** 的程序对象,它们相互连接,形成一个网络。这个网络也被称为 ***图***。
ComfyUI 工作流可以生成任何类型的媒体:图像、视频、音频、AI 模型、AI 代理等。
## 示例工作流
要开始,请尝试一些 [官方工作流](https://comfyanonymous.github.io/ComfyUI_examples)。这些工作流仅使用 ComfyUI 安装中包含的核心节点。一个蓬勃发展的开发者社区创建了丰富的 [生态系统](https://registry.comfy.org) 的自定义节点,以扩展 ComfyUI 的功能。
### 简单示例
## 可视化编程
像 ComfyUI 这样的基于节点的计算机程序提供了一种传统菜单和按钮驱动应用程序无法实现的强大灵活性。ComfyUI 节点图不受传统计算机应用程序提供的工具的限制。它是一个高级的 ***可视化编程环境***,允许用户设计复杂的系统,而无需编写程序代码或理解高级数学。
许多其他计算机应用程序也使用相同的节点图范式。示例包括合成应用程序 Nuke、3D 程序 Maya 和 Blender、实时图形引擎 Unreal,以及交互媒体创作程序 Max。
### 更复杂的示例
## 过程框架
另一个用于描述基于节点的应用程序的术语是 ***过程框架***。过程意味着生成:某种过程或算法被用来生成内容,例如 3D 模型或音乐作品。
ComfyUI 是所有这些东西:一个节点图、一个可视化编程环境和一个过程框架。使 ComfyUI 不同(并且令人惊叹!)的是,它的开放结构允许我们生成任何类型的媒体资产,例如图片、电影、声音、3D 模型、AI 模型等。
在 ComfyUI 的上下文中,***工作流*** 这个术语是节点网络或图的同义词。它对应于 3D 或多媒体程序中的 ***场景图***:特定磁盘文件中所有节点的网络。3D 程序称之为 ***场景文件***。视频编辑、合成和多媒体程序通常称之为 ***项目文件***。
## 保存工作流
ComfyUI 工作流会自动保存在任何生成图像的元数据中,允许用户打开并使用生成图像的图形。工作流也可以存储在遵循 JSON 数据格式的人类可读文本文件中。这对于不支持元数据的媒体格式是必要的。以 JSON 文件格式存储的 ComfyUI 工作流非常小,便于版本控制、归档和共享图形,而不依赖于任何生成的媒体。
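作为示意,下面用 Python 解析一个极简的(假设的)工作流 JSON 片段。ComfyUI 导出的工作流 JSON 通常包含 `nodes`、`links` 等字段;此处的片段仅用于演示结构,并非完整的导出文件:

```python
import json

# 假设的极简工作流片段,仅用于演示 JSON 结构
workflow_text = """
{
  "nodes": [
    {"id": 1, "type": "CheckpointLoaderSimple"},
    {"id": 2, "type": "KSampler"}
  ],
  "links": [
    [1, 1, 0, 2, 0, "MODEL"]
  ]
}
"""

workflow = json.loads(workflow_text)
node_types = [node["type"] for node in workflow["nodes"]]
print(node_types)               # 工作流中包含的节点类型
print(len(workflow["links"]))   # 节点之间连线的数量
```

正因为是纯文本 JSON,这样的文件非常小,适合用 git 等工具做版本控制和分享。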
# 概述
Source: https://docs.comfy.org/zh-CN/development/overview
作为开发者使用 ComfyUI
ComfyUI 是一个强大的 GenAI 推理引擎,可用于本地运行 AI 模型、创建工作流、开发自定义节点,以及部署为服务器。
ComfyUI 的主要功能包括:
* **[创建工作流](/zh-CN/development/core-concepts/workflow)**:工作流是一种编排 AI 模型和自动化任务的方式。它们是一系列相互连接形成管道的节点。
* **[自定义节点](/zh-CN/development/core-concepts/custom-nodes)**:任何人都可以编写自定义节点来扩展 ComfyUI 的功能。节点使用 Python 编写,并由社区发布。
* **扩展**:扩展是改进 ComfyUI 用户界面的第三方应用程序。
* **[部署](/zh-CN/development/comfyui-server/comms_overview)**:ComfyUI 可以在您自己的环境中部署为 API 端点。
# 开始 ComfyUI 的 AI 绘图之旅
Source: https://docs.comfy.org/zh-CN/get_started/first_generation
本部分教程将会带你完成首次 ComfyUI 的图片生成,了解并熟悉 ComfyUI 中的一些界面基础操作,如工作流加载、模型安装、图片生成等
本篇的主要目的是带你初步了解 ComfyUI,熟悉 ComfyUI 的一些基础操作,并引导你完成首次图片生成
1. 加载示例工作流
* 从 ComfyUI 加载`Workflows template`中的`Text to Image`工作流
* 从带有 `metadata` 的图片中加载工作流
2. 指导你完成模型安装
* 自动安装模型
* 手动安装模型
* 使用 **ComfyUI Manager** 的模型管理功能安装模型
3. 进行一次文本到图片的生成
## 关于文生图的说明
**文生图(Text to Image)** 是 AI 绘图的基础:通过输入文本描述来生成对应的图片,是 AI 绘图最常用的功能之一。你可以理解为,你把你的**绘图要求(正向提示词、负向提示词)**告诉一个**画家(绘图模型)**,画家会根据你的要求画出你想要的内容。由于本篇教程主要是为了引导你开始使用 ComfyUI,对于文生图的详细说明,我们将在[文生图](/zh-CN/tutorials/basic/text-to-image)章节进行详细讲解
## ComfyUI 文生图工作流教程讲解
### 1. 启动 ComfyUI
请确定你已经按照安装部分的指南完成了 ComfyUI 的启动,并可以成功打开 ComfyUI 的页面
如果你还未安装 ComfyUI 请根据你的设备情况选择一个合适的版本进行安装
ComfyUI 桌面版目前支持 **Windows 及 macOS(Apple Silicon)** 的独立安装,仍处于 Beta 版本
* 代码开源在 [Github](https://github.com/Comfy-Org/desktop)
由于 Desktop 总是基于稳定版本发布构建,所以我们最新的一些更新,对于 Desktop 来说可能需要等待一段时间才能体验到,如果你想要总是体验最新版本,请使用便携版或者手动安装
你可以从下面选择适合你的系统和硬件开始安装 ComfyUI
适合带有 **Nvidia** 显卡的 **Windows** 版本 ComfyUI 桌面版
适合带有 **Apple Silicon** 的 MacOS ComfyUI 桌面版
ComfyUI桌面版,**暂时没有 Linux 的预构建**,请访问[手动安装](/zh-CN/installation/manual_install)部分进行 ComfyUI 的安装
便携版是一个集成了独立的嵌入式 Python 环境的 ComfyUI 版本,使用便携版你可以体验到最新的功能,目前仅支持 **Windows** 系统
支持 **Nvidia 显卡** 和 **CPU** 运行的 **Windows** ComfyUI 版本,始终使用最新 commit 的代码
支持所有的系统类型和 GPU 类型(Nvidia、AMD、Intel、Apple Silicon、Ascend NPU、寒武纪 MLU)的用户都可以尝试使用手动安装 ComfyUI
### 2. 加载默认文生图工作流
正常情况下,打开 ComfyUI 后会自动加载默认的文生图工作流,不过你仍可以尝试以下不同方式加载工作流,来熟悉 ComfyUI 的一些基础操作
请对照图片中序号所对应的顺序进行操作
1. 点击 ComfyUI 界面右下角的**Fit View**按钮,防止已加载的工作流在视图外而不可见
2. 点击侧边栏的**文件夹图标(workflows)**
3. 点击 工作流(Workflows)面板顶部的**浏览工作流示例(Browse example workflows)** 按钮
下图继续
4. 选择默认的第一个工作流 **Image Generation** 以加载对应工作流
或者你也可以从`workflow`菜单中选择**Browse workflow templates** 浏览工作流模板
所有用 ComfyUI 生成的图片,都会带有 metadata 信息,这些信息会包含图片的 workflow 信息,你可以通过这些信息来加载对应的 workflow。
现在,让我们尝试一下,请保存下面的工作流图片,然后直接拖入 ComfyUI 的界面中,或者使用菜单 **工作流(Workflows)** -> **打开(Open)** 打开这个图片以加载对应的 workflow
ComfyUI 工作流还可以以 JSON 格式存储。当我们完成一个工作流后,可以使用菜单 **工作流(Workflows)** -> **导出(Export)** 导出,对应的工作流就会被保存为 JSON 文件,之后可以再次加载
现在,让我们尝试一下,点击下面的按钮,前往 Github 下载对应的 text-to-image.json 文件
前往下载 text-to-image.json 文件
下载完成后,请使用菜单 **工作流(Workflows)** -> **打开(Open)** 打开这个 json 文件以加载对应的 workflow
### 3. 安装绘图模型
通常在 ComfyUI 的初始安装中,并不会包含任何的绘图模型,但是模型是我们运行图片生成必不可少的部分。
在你完成第二步,工作流的加载后,如果你的电脑上没有安装[v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) 这个模型文件时,一般会出现下图的提示
你可以直接选择点击 `Download` 按钮,让 ComfyUI 自动完成对应的模型的下载,但由于在有些地区不能够顺利访问对应模型的下载源,所以在这个步骤中,我将说明几种不同的模型安装方法。
无论使用哪种方法,模型都会被保存到 `<你的 ComfyUI 安装位置>/ComfyUI/models/` 文件夹下,你可以在你的电脑上尝试找到这个文件夹位置,你可以看到许多文件夹比如 `checkpoints`、`embeddings`、`vae`、`lora`、`upscale_model` 等,这些都是不同的模型保存的文件夹,通常以文件夹名称区分,ComfyUI 在启动时会检测这些文件夹下的模型文件,以及`extra_model_paths.yaml` 文件中配置的文件路径
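上面"按文件夹名区分并检测模型文件"的过程,可以用下面的极简 Python 脚本示意。其中 `list_models` 是为演示而假设的辅助函数,并非 ComfyUI 的实际实现:

```python
from pathlib import Path

def list_models(models_root, folder, exts=(".safetensors", ".ckpt")):
    """列出某个模型子文件夹(如 checkpoints、vae)下的模型文件名。"""
    base = Path(models_root) / folder
    if not base.is_dir():
        return []  # 文件夹不存在时返回空列表
    return sorted(p.name for p in base.rglob("*") if p.suffix in exts)

# 示例:扫描 checkpoints 文件夹,路径请替换为你的实际安装位置
print(list_models("ComfyUI/models", "checkpoints"))
```

ComfyUI 启动时做的事情与此类似:按文件夹名建立"模型类型 → 文件列表"的映射,供各个加载节点的下拉列表使用。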
被检测到的不同的文件夹里的模型将可以在 ComfyUI 的不同 **模型加载节点** 里使用,下面让我们开始了解不同模型的安装方式:
在你点击 **Download** 按钮后,ComfyUI 将会执行下载,根据你使用的版本不同,将会执行不同的行为
桌面版将自动完成模型的下载并保存到 `<你的 ComfyUI 安装位置>/ComfyUI/models/checkpoints` 目录下
你可以等待安装完成或者在侧边栏的模型面板里查看安装进度
如果一切顺利,模型应该可以自动下载到本地,如果长时间未下载成功,请尝试其它安装方法
浏览器将会执行文件下载,请在下载完成后,将文件保存到 `<你的 ComfyUI 安装位置>/ComfyUI_windows_portable/ComfyUI/models/checkpoints` 目录下
[ComfyUI Manager](https://github.com/ltdrdata/ComfyUI-Manager) 是由 [ltdrdata](https://github.com/ltdrdata) 开发的用于扩展和管理自定义节点、模型及插件的工具,目前 ComfyUI 的安装过程会自动完成 ComfyUI Manager 的安装,下面的步骤将会引导你使用 ComfyUI Manager 安装模型
如图,点击对应的 `Manager` 按钮,即可打开 ComfyUI Manager 的界面
如图,点击对应的 `Model Manager` 按钮,即可打开模型管理界面
1. 请在搜索框输入`v1-5-pruned-emaonly.ckpt`
2. 在搜索结果里,点击对应的 `install` 按钮,即可下载模型
不过由于目前各类模型的更新迭代速度较快,你不一定可以在这里找到所有的模型,另外在某些地区因为无法正常访问 ComfyUI Manager 的模型下载源,所以会存在下载不成功的情况,如果长时间无法下载成功,请尝试手动安装
请访问模型地址:[前往下载 v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors)
参考下面图片完成对应模型的下载
下载完成后,请将对应的**v1-5-pruned-emaonly-fp16.safetensors** 文件保存到以下位置
请找到你在安装过程中设置的 ComfyUI 安装位置,将对应模型文件保存到以下文件夹位置 `<你的 ComfyUI 安装位置>/ComfyUI/models/checkpoints`
找到你解压后的便携版的文件夹,在**ComfyUI\_windows\_portable/ComfyUI/models/checkpoints** 文件夹下完成模型的保存
请参考 桌面版和便携版部分的说明查找 **ComfyUI/models/checkpoints**文件夹位置
完成对应保存操作后,请刷新或者重启 ComfyUI 保证对应模型可以被 ComfyUI 检测
### 4. 加载模型,并进行第一次图片生成
在完成了对应的绘图模型安装后,请参考下图步骤加载对应的模型,并进行第一次图片的生成
请对应图片序号,完成下面操作
1. 请在 **Load Checkpoint** 节点使用箭头或者点击文本区域确保 **v1-5-pruned-emaonly-fp16.safetensors** 被选中,且左右切换箭头不会出现 **null** 的文本
2. 点击 `Queue` 按钮,或者使用快捷键 `Ctrl + enter(回车)` 来执行图片生成
等待对应流程执行完成后,你应该可以在界面的 **保存图像(Save Image)** 节点中看到对应的图片结果,可以在上面右键保存到本地
关于文生图的详细说明和指导,请参考下面的指南
点击这里查看文生图工作流的详细说明
## 故障排除
### 模型加载问题
如果 `Load Checkpoint` 节点没有任何模型可以选择,或者显示为 **null**,请先确认你的模型安装位置正确,或者尝试 **刷新** 或者 **重启 ComfyUI** 使得对应文件夹下的模型可以被检测到
# ComfyUI 官方文档
Source: https://docs.comfy.org/zh-CN/index
ComfyUI 官方文档
export const HeroCard = ({filename, darkFilename, title, description, href}) => {
return
{darkFilename ? <>
> : }
{title}
{description}
;
};
{/* header */}
最强大的开源节点式生成式 AI 应用程序
关于 ComfyUI
由 comfyanonymous 和其他贡献者 开发。
ComfyUI 是一个基于节点的生成式 AI 界面和推理引擎
用户可以通过节点组合各种 AI 模型和操作,实现高度可定制和可控的内容生成
ComfyUI 完全开源,可以在本地设备上运行
# 便携版(Windows)
Source: https://docs.comfy.org/zh-CN/installation/comfyui_portable_windows
本篇教程将指导你如何下载和开始使用 ComfyUI Portable(便携版) 并运行对应的程序
**ComfyUI Portable(便携版)** 是一个独立封装完整的 ComfyUI Windows 版本,内部已经整合了 ComfyUI 运行所需的独立的 **Python(python\_embeded)**,只需要解压即可使用,目前便携版本支持通过 **Nvidia** 显卡或者 **CPU** 运行。
本部分指南将引导你完成对应的安装。
## 下载 ComfyUI Portable(便携版)
您可通过点击下面的链接来获取最新的 **ComfyUI Portable(便携版)** 下载链接
下载 ComfyUI Portable(便携版)
下载后你可以使用类似解压软件如 [7-ZIP](https://7-zip.org/) 对压缩包进行解压
便携版解压后对应的文件结构及说明如下:
```
ComfyUI_windows_portable
├── 📂ComfyUI // ComfyUI 程序主体
├── 📂python_embeded // 独立的 Python 环境
├── 📂update // 用于升级便携版安装包的批处理脚本
├── README_VERY_IMPORTANT.txt // 英文版本的 ComfyUI 便携版使用说明
├── run_cpu.bat // 双击启动 ComfyUI(仅支持 CPU)
└── run_nvidia_gpu.bat // 双击启动 ComfyUI(仅支持 Nvidia 显卡)
```
## 如何启动 ComfyUI
根据你的电脑情况,双击 `run_nvidia_gpu.bat` 或者 `run_cpu.bat` 来启动 ComfyUI,你会看到如下图所示的命令运行
当你看到类似图片中的
```
To see the GUI go to: http://127.0.0.1:8188
```
此时你的 ComfyUI 服务已经启动,正常情况下 ComfyUI 会自动打开你的默认浏览器并访问 `http://127.0.0.1:8188` 地址,如果没有自动打开,请手动打开浏览器并访问该地址。
使用过程中请不要关闭对应的命令行窗口,否则 ComfyUI 将会停止运行
## 添加外部模型路径
如果你想要在 `ComfyUI/models` 之外管理你的模型文件,可能出于以下原因:
* 你有多个 ComfyUI 实例,你想要让这些实例共享模型文件,从而减少磁盘占用
* 你有多个不同类型的 GUI 程序,如 WebUI,你想要它们共用模型文件
* 模型文件无法被识别或读取到
我们提供了通过 `extra_model_paths.yaml` 配置文件来添加额外模型搜索路径的方法。
### 不同 ComfyUI 版本配置文件位置
对于[便携版](/zh-CN/installation/comfyui_portable_windows)和[手动安装](/zh-CN/installation/manual_install)的 ComfyUI版本,你可以在 ComfyUI 的根目录下找到 `extra_model_paths.yaml.example` 的示例文件
```
ComfyUI/extra_model_paths.yaml.example
```
复制并重命名为 `extra_model_paths.yaml` 来使用, 并保持在 ComfyUI 的根目录下, 路径应该是 `ComfyUI/extra_model_paths.yaml`
你也可以在 [这里](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 找到配置示例文件
如果你使用的是 ComfyUI 桌面应用程序,你可以参考下图打开额外模型的配置文件:
或者通过下面的位置打开:
```
C:\Users\YourUsername\AppData\Roaming\ComfyUI\extra_models_config.yaml
```
```
~/Library/Application Support/ComfyUI/extra_models_config.yaml
```
对应的配置文件不应该被改变
### 配置示例
比如,你需要额外让 ComfyUI 识别的模型文件位于下面的文件夹:
```
📁 YOUR_PATH/
├── 📁models/
| ├── 📁 loras/
| │ └── xxxxx.safetensors
| ├── 📁 checkpoints/
| │ └── xxxxx.safetensors
| ├── 📁 vae/
| │ └── xxxxx.safetensors
| └── 📁 controlnet/
| └── xxxxx.safetensors
```
那么你可以进行如下的配置来让 ComfyUI 识别到你设备上的模型路径
```
my_custom_config:
base_path: YOUR_PATH
loras: models/loras/
checkpoints: models/checkpoints/
vae: models/vae/
controlnet: models/controlnet/
```
或者使用
```
my_custom_config:
base_path: YOUR_PATH/models/
loras: loras
checkpoints: checkpoints
vae: vae
controlnet: controlnet
```
对于桌面版,请在原有配置文件基础上新增配置,而不要覆盖安装过程中自动生成的路径配置;请在修改前备份对应文件,以便配置出错时及时恢复
或者你也可以参考默认的 [extra\_model\_paths.yaml.example](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 来配置,保存之后, 需要 **重启 ComfyUI** 才能生效。
下面是完整的原始配置示例:
```yaml
#Rename this to extra_model_paths.yaml and ComfyUI will load it
#config for a1111 ui
#all you have to do is change the base_path to where yours is installed
a111:
base_path: path/to/stable-diffusion-webui/
checkpoints: models/Stable-diffusion
configs: models/Stable-diffusion
vae: models/VAE
loras: |
models/Lora
models/LyCORIS
upscale_models: |
models/ESRGAN
models/RealESRGAN
models/SwinIR
embeddings: embeddings
hypernetworks: models/hypernetworks
controlnet: models/ControlNet
#config for comfyui
#your base path should be either an existing comfy install or a central folder where you store all of your models, loras, etc.
#comfyui:
# base_path: path/to/comfyui/
# # You can use is_default to mark that these folders should be listed first, and used as the default dirs for eg downloads
# #is_default: true
# checkpoints: models/checkpoints/
# clip: models/clip/
# clip_vision: models/clip_vision/
# configs: models/configs/
# controlnet: models/controlnet/
# diffusion_models: |
# models/diffusion_models
# models/unet
# embeddings: models/embeddings/
# loras: models/loras/
# upscale_models: models/upscale_models/
# vae: models/vae/
#other_ui:
# base_path: path/to/ui
# checkpoints: models/checkpoints
# gligen: models/gligen
# custom_nodes: path/custom_nodes
```
### 添加额外自定义节点路径
除了添加外部模型之外,你同样可以添加不在 ComfyUI 默认路径下的自定义节点路径
请注意,这并不会改变自定义节点的默认安装路径,只是在启动 ComfyUI 时会增加额外的路径搜索,你仍旧需要在对应的环境中完成自定义节点的依赖的安装,来保证其运行环境的完整性。
下面是一个简单的配置示例(Mac 系统),请根据你的实际情况进行修改,并新增到对应的配置文件中,保存后需要 **重启 ComfyUI** 才能生效:
```yaml
my_custom_nodes:
custom_nodes: /Users/your_username/Documents/extra_custom_nodes
```
## 进行第一次图片生成
安装成功后,你可以参考访问下面的章节,开始你的 ComfyUI 之路。
本教程将引导你完成第一次的模型安装以及对应的文本到图片的生成
## 社区分发版本
在中国,早期有社区作者 [@秋葉aaaki](https://space.bilibili.com/12566101) 制作过独立分发的版本(秋叶整合包),被广泛使用。
如果你在中国使用,这个版本更改了 GitHub 的源地址,并将 PyPI 地址配置为中国国内镜像地址,这可以让你在上手 ComfyUI 时避免一些因网络导致的依赖和更新问题。
访问秋叶整合包原始发布地址
这个社区分发版本并不是由 ComfyUI 官方维护更新
## 其它 ComfyUI 便携版相关说明
### 1. ComfyUI 便携版升级
你可以使用 **update** 文件夹下的相关批处理命令完成 ComfyUI 便携版的升级
```
ComfyUI_windows_portable
└─ 📂update
├── update.py
├── update_comfyui.bat // 更新 ComfyUI 到最新的 Commit 版本
├── update_comfyui_and_python_dependencies.bat // 请仅在你的运行环境存在问题时使用
└── update_comfyui_stable.bat // 更新 ComfyUI 为最新的 stable 版本
```
### 2. ComfyUI 便携版设置局域网访问
如果你的 ComfyUI 运行在局域网内,想让其它设备也能访问到 ComfyUI,你可以用记事本修改 `run_nvidia_gpu.bat` 或者 `run_cpu.bat` 文件来完成配置,主要是添加 `--listen` 参数来设置监听地址
下面的示例是添加了 `--listen` 参数的 `run_nvidia_gpu.bat` 文件命令
```bat
.\python_embeded\python.exe -s ComfyUI\main.py --listen --windows-standalone-build
pause
```
启动 ComfyUI 后,你会发现最后的运行地址变为
```
Starting server
To see the GUI go to: http://0.0.0.0:8188
To see the GUI go to: http://[::]:8188
```
你可以通过 `WIN + R` 输入`cmd` 打开命令行,输入 `ipconfig` 来查看你的局域网 IP 地址,然后在其它设备上输入 `http://你的局域网IP:8188` 来访问 ComfyUI
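完成上面的 `--listen` 配置后,可以在另一台设备上用下面的小脚本检查 ComfyUI 的端口是否可达。其中 `192.168.1.10` 是假设的局域网 IP,请替换为你自己的:

```python
import socket

def comfyui_reachable(host, port=8188, timeout=2.0):
    """尝试与 host:port 建立 TCP 连接,判断 ComfyUI 服务是否可达。"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # 连接被拒绝或超时,说明服务未启动、端口不对或被防火墙拦截
        return False

# 示例:把 192.168.1.10 替换为运行 ComfyUI 的机器的局域网 IP
print(comfyui_reachable("192.168.1.10", 8188))
```

如果返回 `False`,请依次检查:ComfyUI 是否已带 `--listen` 启动、IP 与端口是否正确、以及 Windows 防火墙是否放行了 8188 端口。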
# Linux桌面版
Source: https://docs.comfy.org/zh-CN/installation/desktop/linux
本文将介绍 ComfyUI Desktop Linux 版本的下载以及安装使用
Linux预建包尚不可用。请尝试[手动安装](/zh-CN/installation/manual_install)。
当Linux预建包可用时,你可以配置外部模型路径:
## 添加外部模型路径
如果你在计算机上的 ComfyUI 安装目录之外的其他位置存储了模型,可以通过配置 `extra_model_paths.yaml` 文件将它们添加到 ComfyUI 中。
对于 ComfyUI 桌面版,对应文件路径为:
* Windows:`C:\Users\<你的用户名>\AppData\Roaming\ComfyUI\extra_model_paths.yaml`
* macOS:`~/Library/Application Support/ComfyUI/extra_model_paths.yaml`
* Linux:`~/.config/ComfyUI/extra_model_paths.yaml`
详细说明请参见[模型文档](/zh-CN/development/core-concepts/models#adding-external-model-paths)
# macOS 桌面版
Source: https://docs.comfy.org/zh-CN/installation/desktop/macos
本文将介绍 ComfyUI Desktop macOS 版本的下载以及安装使用。
export const log_path_0 = "~/Library/Logs/ComfyUI"
export const config_path_0 = "~/Library/Application Support/ComfyUI"
**ComfyUI 桌面版(Desktop)** 是一个独立的安装版本,可以像常规软件一样安装,支持快捷安装、自动配置 **Python 环境及依赖**,并支持导入已有的 ComfyUI 设置、模型、工作流和文件。
ComfyUI 桌面版是一个开源项目,完整代码请访问[这里](https://github.com/Comfy-Org/desktop)。
ComfyUI 桌面版(macOS)目前仅支持 Apple Silicon。
本篇教程将引导你完成对应的软件安装,并提供相关的安装配置说明。
由于 **ComfyUI 桌面版** 仍处于 **Beta** 状态,实际的安装过程可能会发生变化。
## ComfyUI 桌面版(macOS)下载
请点击下面的按钮下载对应的 macOS 系统 **ComfyUI 桌面版** 安装包。
Download for macOS
## 通过 Homebrew 安装
ComfyUI 桌面版也可通过 [Homebrew](https://brew.sh/) 安装:
```
brew install comfyui
```
## ComfyUI 桌面版安装步骤
1. 双击下载到的安装包文件。
2. 如图所示,请将 **ComfyUI** 程序按箭头所示拖入 **Applications** 文件夹。
3. 如果在打开安装包后,文件夹显示如下(图标上出现禁止符号),则说明你当前的系统版本与 **ComfyUI 桌面版** 不兼容。
4. 然后在 **启动台 (Launchpad)** 中找到对应的 **ComfyUI 图标**,点击它即可进入 ComfyUI 的初始化设置。
## ComfyUI 桌面版初始化流程
点击 **Get Started** 开始初始化步骤。
安装 ComfyUI 时可能会出现许多问题,比如安装 pytorch(约 15 GB)时网络连接失败,或者你没有安装 git。当检测到问题时,维护页面会自动打开,并提供解决问题的方法。
你可以使用它来解决大多数问题:
* 创建一个 Python 虚拟环境
* 重新安装所有缺失的核心依赖项到由桌面端管理的 Python 虚拟环境
* 安装 git、VC Redist(Visual C++ 运行库)
* 选择一个新的安装位置
默认维护页面会显示当前报错的内容
点击 `All` 可以切换查看可以操作的所有内容
对应三个选项:
1. **MPS(推荐):** Metal Performance Shaders (MPS) 是苹果的优化框架,让开发者能在苹果设备上利用 GPU 加速包括机器学习在内的高性能计算任务,且支持 PyTorch 等框架使用 GPU 提升模型训练和推理效率。
2. **Manual Configuration(手动配置):** 你需要手动安装和配置 Python 运行环境。除非你知道应该如何配置,否则请不要选择。
3. **Enable CPU Mode(启用 CPU 模式):** 仅适用于开发人员和特殊情况。除非你确定需要使用此模式,否则请不要选择。
如无特殊情况,请按截图所示选择 **MPS**,然后点击 **Next** 进入下一步。
在这一步选择 ComfyUI 以下相关内容的安装位置:
* **Python 环境**
* **Models(模型文件)**
* **Custom Nodes(自定义节点)**
建议:
* 请新建一个单独的空白文件夹作为 ComfyUI 的安装目录。
* 请确保磁盘至少有 **5GB** 左右的可用空间,以保证 **ComfyUI 桌面版** 的正常安装。
ComfyUI 并非所有文件都安装在此目录下,部分文件将会位于 macOS 的系统目录下。你可以参考本篇指南的卸载部分,了解如何完全卸载 ComfyUI 桌面版。
完成后点击 **Next** 进入下一步。
在这一步,你可以将已有的 ComfyUI 安装内容迁移到 ComfyUI 桌面版中。选择你电脑上已有的 **ComfyUI** 安装目录,安装程序会自动识别对应目录下的:
* **User Files(用户文件)**
* **Models(模型文件):** 不会进行复制,只是与桌面版进行关联。
* **Custom Nodes(自定义节点):** 节点将会重新进行安装。
不用担心,此步骤不会复制模型文件。你可以按需勾选或取消勾选对应的选项,然后点击 **Next** 进入下一步。
这一步是偏好设置:
1. **Automatic Updates(自动更新):** 设置是否在 ComfyUI 更新可用时自动更新。
2. **Usage Metrics(使用情况分析):** 如果启用,我们将收集**匿名的使用数据**,用于帮助我们改进 ComfyUI。
3. **Mirror Settings(镜像设置):** 由于程序需要联网下载 Python 以完成相关环境安装,如果你在安装时也如图所示出现红色的 ❌,这可能导致后续安装失败。请参考下面步骤进行处理:
展开对应的镜像设置,找到具体失败的镜像。在当前截图中,错误为 **Python Install Mirror** 镜像下载失败。
对于不同的镜像错误,你可以参考下面的内容尝试手动查找不同的镜像,并进行替换
以下情况主要针对中国境内用户
#### Python 安装镜像
如果默认镜像无法使用,请尝试使用下面的镜像
```
https://python-standalone.org/mirror/astral-sh/python-build-standalone
```
如果你需要查找其它备选的 GitHub 镜像地址,请查找并构建指向 `python-build-standalone` 仓库 releases 的镜像地址
```
https://github.com/astral-sh/python-build-standalone/releases/download
```
构建类似下面格式的链接
```
https://xxx/astral-sh/python-build-standalone/releases/download
```
由于大多 Github 镜像服务都由第三方提供,所以请注意使用过程中的安全性。
#### PyPI 镜像
* 阿里云:[https://mirrors.aliyun.com/pypi/simple/](https://mirrors.aliyun.com/pypi/simple/)
* 腾讯云:[https://mirrors.cloud.tencent.com/pypi/simple/](https://mirrors.cloud.tencent.com/pypi/simple/)
* 中国科技大学:[https://pypi.mirrors.ustc.edu.cn/simple/](https://pypi.mirrors.ustc.edu.cn/simple/)
* 上海交通大学:[https://pypi.sjtu.edu.cn/simple/](https://pypi.sjtu.edu.cn/simple/)
#### Torch 镜像
* 阿里云: [https://mirrors.aliyun.com/pytorch-wheels/cu121/](https://mirrors.aliyun.com/pytorch-wheels/cu121/)
如果一切无误,安装程序将完成安装并自动进入 ComfyUI 桌面版界面,这说明已经安装成功。
## 进行第一次图片生成
安装成功后,你可以参考访问下面的章节,开始你的 ComfyUI 之路。
本教程将引导你完成第一次的模型安装以及对应的文本到图片的生成
## 如何更新 ComfyUI 桌面版
目前 ComfyUI 桌面版更新采用自动检测更新,请确保在设置中已经启用自动更新
你也可以在 `菜单` --> `帮助` --> `检查更新` 中选择手动检查是否有可用的更新
## 添加外部模型路径
如果你想要在 `ComfyUI/models` 之外管理你的模型文件,可能出于以下原因:
* 你有多个 ComfyUI 实例,你想要让这些实例共享模型文件,从而减少磁盘占用
* 你有多个不同类型的 GUI 程序,如 WebUI,你想要它们共用模型文件
* 模型文件无法被识别或读取到
我们提供了通过 `extra_model_paths.yaml` 配置文件来添加额外模型搜索路径的方法。
### 不同 ComfyUI 版本配置文件位置
对于[便携版](/zh-CN/installation/comfyui_portable_windows)和[手动安装](/zh-CN/installation/manual_install)的 ComfyUI版本,你可以在 ComfyUI 的根目录下找到 `extra_model_paths.yaml.example` 的示例文件
```
ComfyUI/extra_model_paths.yaml.example
```
复制并重命名为 `extra_model_paths.yaml` 来使用, 并保持在 ComfyUI 的根目录下, 路径应该是 `ComfyUI/extra_model_paths.yaml`
你也可以在 [这里](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 找到配置示例文件
如果你使用的是 ComfyUI 桌面应用程序,你可以参考下图打开额外模型的配置文件:
或者通过下面的位置打开:
```
C:\Users\YourUsername\AppData\Roaming\ComfyUI\extra_models_config.yaml
```
```
~/Library/Application Support/ComfyUI/extra_models_config.yaml
```
对应的配置文件不应该被改变
### 配置示例
比如,你需要额外让 ComfyUI 识别的模型文件位于下面的文件夹:
```
📁 YOUR_PATH/
├── 📁models/
| ├── 📁 loras/
| │ └── xxxxx.safetensors
| ├── 📁 checkpoints/
| │ └── xxxxx.safetensors
| ├── 📁 vae/
| │ └── xxxxx.safetensors
| └── 📁 controlnet/
| └── xxxxx.safetensors
```
那么你可以进行如下的配置来让 ComfyUI 识别到你设备上的模型路径
```
my_custom_config:
base_path: YOUR_PATH
loras: models/loras/
checkpoints: models/checkpoints/
vae: models/vae/
controlnet: models/controlnet/
```
或者使用
```
my_custom_config:
base_path: YOUR_PATH/models/
loras: loras
checkpoints: checkpoints
vae: vae
controlnet: controlnet
```
对于桌面版,请在原有配置文件基础上新增配置,而不要覆盖安装过程中自动生成的路径配置;请在修改前备份对应文件,以便配置出错时及时恢复
或者你也可以参考默认的 [extra\_model\_paths.yaml.example](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 来配置,保存之后, 需要 **重启 ComfyUI** 才能生效。
下面是完整的原始配置示例:
```yaml
#Rename this to extra_model_paths.yaml and ComfyUI will load it
#config for a1111 ui
#all you have to do is change the base_path to where yours is installed
a111:
base_path: path/to/stable-diffusion-webui/
checkpoints: models/Stable-diffusion
configs: models/Stable-diffusion
vae: models/VAE
loras: |
models/Lora
models/LyCORIS
upscale_models: |
models/ESRGAN
models/RealESRGAN
models/SwinIR
embeddings: embeddings
hypernetworks: models/hypernetworks
controlnet: models/ControlNet
#config for comfyui
#your base path should be either an existing comfy install or a central folder where you store all of your models, loras, etc.
#comfyui:
# base_path: path/to/comfyui/
# # You can use is_default to mark that these folders should be listed first, and used as the default dirs for eg downloads
# #is_default: true
# checkpoints: models/checkpoints/
# clip: models/clip/
# clip_vision: models/clip_vision/
# configs: models/configs/
# controlnet: models/controlnet/
# diffusion_models: |
# models/diffusion_models
# models/unet
# embeddings: models/embeddings/
# loras: models/loras/
# upscale_models: models/upscale_models/
# vae: models/vae/
#other_ui:
# base_path: path/to/ui
# checkpoints: models/checkpoints
# gligen: models/gligen
# custom_nodes: path/custom_nodes
```
### 添加额外自定义节点路径
除了添加外部模型之外,你同样可以添加不在 ComfyUI 默认路径下的自定义节点路径
请注意,这并不会改变自定义节点的默认安装路径,只是在启动 ComfyUI 时会增加额外的路径搜索,你仍旧需要在对应的环境中完成自定义节点的依赖的安装,来保证其运行环境的完整性。
下面是一个简单的配置示例(Mac 系统),请根据你的实际情况进行修改,并新增到对应的配置文件中,保存后需要 **重启 ComfyUI** 才能生效:
```yaml
my_custom_nodes:
custom_nodes: /Users/your_username/Documents/extra_custom_nodes
```
## 桌面端 Python 环境相关
桌面端安装时,会在你选择的安装目录下创建一个 Python 虚拟环境,通常是一个隐藏的 `.venv` 文件夹。
如果你需要为 ComfyUI 插件处理相关依赖,则需要在这个环境中进行;直接使用系统命令行操作,有将对应依赖安装到系统环境的风险,请参考下面的指示完成对应环境的激活。
### 如何使用 桌面端 的 python 环境?
你可以使用桌面端自带的终端来使用 python 环境。
1. 点击菜单栏的 icon 打开底部面板
2. 点击 `Terminal` 打开终端
3. 如果你想要看对应环境的 python 安装位置,可以使用下面的命令
```
python -c "import sys; print(sys.executable)"
```
```
which python
```
除非你了解你当前的操作含义,否则你的操作可能会导致对应环境依赖的问题,请谨慎使用此方式进行操作
你也可以使用你喜欢的终端来使用 python 环境,但在使用之前需要先激活对应的虚拟环境。
在使用其它终端进行操作时,如果你不熟悉相关操作,可能会将对应依赖安装到系统环境,请谨慎使用此方式进行操作
在截图中,是一个 macOS 终端的示例。如果你使用的是 Windows,请参考以下步骤在系统上激活虚拟环境。
打开你喜欢的终端,使用 `cd` 命令进入你安装的 ComfyUI 的目录如
```
cd <你的 ComfyUI 安装目录>/ComfyUI
```
在终端中输入下面的命令激活虚拟环境
```
.venv/Scripts/activate
```
```
source .venv/bin/activate
```
激活后,你可以在终端中看到类似 `(ComfyUI)` 的提示,表示你已经激活了虚拟环境
使用 `which python` 查看当前 python 所在环境,确认不是系统环境
完成以上步骤后你就激活了对应的 Python 环境,你可以继续在这个环境里进行依赖安装相关的操作了。
## 如何卸载 ComfyUI 桌面版
要卸载 **ComfyUI 桌面版**,你可以直接在 **Applications** 文件夹内删除 **ComfyUI** 程序。
如果你想要**完全删除** **ComfyUI 桌面版** 的所有文件,可以手动删除以下文件夹:
```
~/Library/Application Support/ComfyUI
```
以上操作**不会**删除你的以下文件夹。如果需要,请手动删除:
* Models(模型文件)
* Custom Nodes(自定义节点)
* Input/Output 目录(图片输入/输出目录)
## 故障排除
### 如何定位安装错误
如果安装失败,你应该可以看到下面的界面显示
此时建议你采取以下几种方式查找错误原因
1. 点击 `Show Terminal` 查看错误输出
2. 点击 `Open Logs` 查看安装过程日志
3. 访问官方论坛查找错误反馈
4. 点击`Reinstall`尝试重新安装
建议在提交反馈之前,你可以将对应的**错误输出**以及 **log 文件**信息提供给类似 **GPT** 一类的工具
如上图所示,询问对应错误的原因;或者完全删除 ComfyUI 后重试安装
### 反馈错误
如果在安装过程中,你发生了任何错误,请通过以下任意方式查看是否有类似错误反馈,或者向我们提交错误
* Github Issues: [https://github.com/Comfy-Org/desktop/issues](https://github.com/Comfy-Org/desktop/issues)
* Comfy 官方论坛: [https://forum.comfy.org/](https://forum.comfy.org/)
请在提交错误时确保提交了以下日志以及配置文件,方便我们进行问题的定位和查找
1. 日志文件
| 文件名 | 描述 | 位置 |
| ----------- | ------------------------------------------- | ------------ |
| main.log | 包含与桌面应用和服务器启动相关的日志,来自桌面的 Electron 进程。 | {log_path_0} |
| comfyui.log | 包含与 ComfyUI 正常运行相关的日志,例如核心 ComfyUI 进程的终端输出。 | {log_path_0} |
2. 配置文件
| 文件名 | 描述 | 位置 |
| ------------------------ | ---------------------------- | --------------- |
| extra\_model\_paths.yaml | 包含 ComfyUI 将搜索模型和自定义节点的额外路径。 | {config_path_0} |
| config.json | 包含应用配置。此文件通常不应直接编辑。 | {config_path_0} |
# Windows桌面版
Source: https://docs.comfy.org/zh-CN/installation/desktop/windows
本文将介绍 ComfyUI Desktop Windows 版本的下载以及安装使用
export const log_path_0 = "C:\\Users\\<你的用户名>\\AppData\\Roaming\\ComfyUI\\logs"
export const config_path_0 = "C:\\Users\\<你的用户名>\\AppData\\Roaming\\ComfyUI"
**ComfyUI 桌面版(Desktop)** 是一个独立的安装版本,可以像常规软件一样进行安装,支持快捷安装、自动配置 **Python 环境及依赖**,并支持导入已有的 ComfyUI 设置、模型、工作流和文件,可以快速从已有的[ComfyUI 便携版](/zh-CN/installation/comfyui_portable_windows)迁移到桌面版
ComfyUI 桌面版是一个开源项目,完整代码请访问 [这里](https://github.com/Comfy-Org/desktop)
ComfyUI 桌面版(Windows)硬件要求:
* NVIDIA 显卡
本篇教程将引导你完成对应的软件安装,并说明相关安装配置说明。
由于 **ComfyUI 桌面版** 仍旧处于 **Beta** 状态,实际的安装过程可能会发生变化
## ComfyUI 桌面版(Windows)下载
请点击下面的按钮下载对应的针对 Windows 系统的 **ComfyUI 桌面版** 安装包
Download for Windows (NVIDIA)
## ComfyUI 桌面版安装步骤
双击下载到的安装包文件,首先将会执行一次自动安装,并在桌面生成一个 **ComfyUI 桌面版** 的快捷方式
双击对应的快捷,进入 ComfyUI 的初始化设置
### ComfyUI 桌面版初始化流程
点击 **Get Started** 开始初始化步骤
安装 ComfyUI 时可能会出现许多问题,比如安装 pytorch(约 15 GB)时网络连接失败,或者你没有安装 git。当检测到问题时,维护页面会自动打开,并提供解决问题的方法。
你可以使用它来解决大多数问题:
* 创建一个 Python 虚拟环境
* 重新安装所有缺失的核心依赖项到由桌面端管理的 Python 虚拟环境
* 安装 git、VC Redist(Visual C++ 运行库)
* 选择一个新的安装位置
默认维护页面会显示当前报错的内容
点击 `All` 可以切换查看可以操作的所有内容
对应三个选项为:
1. **Nvidia 显卡(推荐):** 直接支持使用 pytorch 和 CUDA
2. **Manual Configuration 手动配置:** 你需要手动安装和配置 python 运行环境,除非你知道应该如何配置,否则请不要选择
3. **Enable CPU Mode 启用 CPU 模式:** 仅适用于开发人员和特殊情况,除非你确定你需要使用这个模式,否则请不要选择
如无特殊情况,请按截图所示选择**NVIDIA**,并点击 **Next** 进入下一步
在这一步将选择 ComfyUI 以下内容的安装位置:
* **Python 虚拟环境**
* **Models 模型文件**
* **Custom Nodes 自定义节点**
建议:
* 请选择**固态硬盘**作为安装位置,这将提高 ComfyUI 访问模型的速度。
* 请新建一个单独的空白文件夹作为 ComfyUI 的安装目录
* 请保证对应磁盘至少有 **15GB** 左右的可用空间,以保证 ComfyUI Desktop 的正常安装
ComfyUI 并非所有文件都安装在此目录下,部分文件依然会安装在 C 盘,后期如需卸载,你可以参考本篇指南的卸载部分完成完整的 ComfyUI 桌面版的卸载
完成后点击 **Next** 进入下一步
在这一步你可以将你已有的 ComfyUI 安装内容迁移到 ComfyUI 桌面版中,如图所示,选择了原本的 **D:\ComfyUI\_windows\_portable\ComfyUI** 安装目录,安装程序会自动识别对应目录下的:
* **User Files 用户文件**
* **Models 模型文件:** 不会进行复制,只是与桌面版进行关联
* **Custom Nodes 自定义节点:** 自定义节点将会重新进行安装
不要担心,这个步骤并不会复制模型文件,你可以按你的需要勾选或者取消勾选对应的选项,点击 **Next** 进入下一步
这一步是偏好设置
1. **Automatic Updates 自动更新:** 是否设置在 ComfyUI 更新可用时自动更新
2. **Usage Metrics 使用情况分析:** 如果启用,我们将收集**匿名的使用数据** 用于帮助我们改进 ComfyUI
3. **Mirror Settings 镜像设置:** 由于程序需要联网下载 Python 以完成相关环境安装,如果你在安装时也如图所示出现红色的 ❌,这可能导致后续安装失败,请参考下面步骤进行处理
展开对应的镜像设置,找到具体失败的镜像,在当前截图中错误为 **Python Install Mirror** 镜像失败。
对于不同的镜像错误,你可以参考下面的内容尝试手动查找不同的镜像,并进行替换
以下情况主要针对中国境内用户
#### Python 安装镜像
如果默认镜像无法使用,请尝试使用下面的镜像
```
https://python-standalone.org/mirror/astral-sh/python-build-standalone
```
如果你需要查找其它备选的 GitHub 镜像地址,请查找并构建指向 `python-build-standalone` 仓库 releases 的镜像地址
```
https://github.com/astral-sh/python-build-standalone/releases/download
```
构建类似下面格式的链接
```
https://xxx/astral-sh/python-build-standalone/releases/download
```
由于大多 Github 镜像服务都由第三方提供,所以请注意使用过程中的安全性。
#### PyPI 镜像
* 阿里云:[https://mirrors.aliyun.com/pypi/simple/](https://mirrors.aliyun.com/pypi/simple/)
* 腾讯云:[https://mirrors.cloud.tencent.com/pypi/simple/](https://mirrors.cloud.tencent.com/pypi/simple/)
* 中国科技大学:[https://pypi.mirrors.ustc.edu.cn/simple/](https://pypi.mirrors.ustc.edu.cn/simple/)
* 上海交通大学:[https://pypi.sjtu.edu.cn/simple/](https://pypi.sjtu.edu.cn/simple/)
#### Torch 镜像
* 阿里云: [https://mirrors.aliyun.com/pytorch-wheels/cu121/](https://mirrors.aliyun.com/pytorch-wheels/cu121/)
如果一切无误,安装程序将完成安装并自动进入 ComfyUI 桌面版界面, 则说明已经安装成功
## 进行第一次图片生成
安装成功后,你可以参考访问下面的章节,开始你的 ComfyUI 之路。
本教程将引导你完成第一次的模型安装以及对应的文本到图片的生成
## 如何更新 ComfyUI 桌面版
目前 ComfyUI 桌面版更新采用自动检测更新,请确保在设置中已经启用自动更新
你也可以在 `菜单` --> `帮助` --> `检查更新` 中选择手动检查是否有可用的更新
## 添加外部模型路径
如果你想要在 `ComfyUI/models` 之外管理你的模型文件,可能出于以下原因:
* 你有多个 ComfyUI 实例,你想要让这些实例共享模型文件,从而减少磁盘占用
* 你有多个不同类型的 GUI 程序,如 WebUI,你想要它们共用模型文件
* 模型文件无法被识别或读取到
我们提供了通过 `extra_model_paths.yaml` 配置文件来添加额外模型搜索路径的方法。
### 不同 ComfyUI 版本配置文件位置
对于[便携版](/zh-CN/installation/comfyui_portable_windows)和[手动安装](/zh-CN/installation/manual_install)的 ComfyUI版本,你可以在 ComfyUI 的根目录下找到 `extra_model_paths.yaml.example` 的示例文件
```
ComfyUI/extra_model_paths.yaml.example
```
复制并重命名为 `extra_model_paths.yaml` 来使用, 并保持在 ComfyUI 的根目录下, 路径应该是 `ComfyUI/extra_model_paths.yaml`
你也可以在 [这里](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 找到配置示例文件
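复制并重命名这一步在终端中可以这样完成(以下在临时目录中演示,实际请在你的 ComfyUI 根目录执行):

```shell
# 在临时目录中模拟 ComfyUI 根目录(仅为演示)
cd "$(mktemp -d)"
touch extra_model_paths.yaml.example        # 模拟仓库自带的示例文件
# 复制并重命名为 extra_model_paths.yaml,保持在根目录下
cp extra_model_paths.yaml.example extra_model_paths.yaml
ls extra_model_paths.yaml
```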
如果你使用的是 ComfyUI 桌面应用程序,你可以参考下图打开额外模型的配置文件:
或者通过下面的位置打开:
```
C:\Users\YourUsername\AppData\Roaming\ComfyUI\extra_models_config.yaml
```
```
~/Library/Application Support/ComfyUI/extra_models_config.yaml
```
对应配置文件的位置不应该被改变
### 配置示例
比如,你需要额外让 ComfyUI 识别的模型文件位于下面的文件夹:
```
📁 YOUR_PATH/
├── 📁models/
| ├── 📁 loras/
| │ └── xxxxx.safetensors
| ├── 📁 checkpoints/
| │ └── xxxxx.safetensors
| ├── 📁 vae/
| │ └── xxxxx.safetensors
| └── 📁 controlnet/
| └── xxxxx.safetensors
```
那么你可以进行如下的配置来让 ComfyUI 识别到你设备上的模型路径
```
my_custom_config:
    base_path: YOUR_PATH
    loras: models/loras/
    checkpoints: models/checkpoints/
    vae: models/vae/
    controlnet: models/controlnet/
```
或者使用
```
my_custom_config:
    base_path: YOUR_PATH/models/
    loras: loras
    checkpoints: checkpoints
    vae: vae
    controlnet: controlnet
```
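上面两种写法是等价的:ComfyUI 会将 `base_path` 与各类目的相对路径拼接。下面用 shell 变量简单示意这种拼接逻辑(仅为示意,并非 ComfyUI 的实际实现):

```shell
# 写法一:base_path 不含 models,类目路径带 models/ 前缀
base_a="YOUR_PATH";        sub_a="models/loras"
# 写法二:base_path 含 models,类目路径只写类目名
base_b="YOUR_PATH/models"; sub_b="loras"
path_a="$base_a/$sub_a"
path_b="$base_b/$sub_b"
echo "$path_a"   # YOUR_PATH/models/loras
test "$path_a" = "$path_b" && echo "两种配置指向同一目录"
```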
对于桌面版,请在原有配置文件中新增配置,而不要覆盖安装过程中自动生成的路径配置。请在修改前备份对应的文件,以便在配置错误时及时恢复
或者你也可以参考默认的 [extra\_model\_paths.yaml.example](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 来配置,保存之后, 需要 **重启 ComfyUI** 才能生效。
下面是完整的原始配置示例:
```yaml
#Rename this to extra_model_paths.yaml and ComfyUI will load it

#config for a1111 ui
#all you have to do is change the base_path to where yours is installed
a111:
    base_path: path/to/stable-diffusion-webui/

    checkpoints: models/Stable-diffusion
    configs: models/Stable-diffusion
    vae: models/VAE
    loras: |
        models/Lora
        models/LyCORIS
    upscale_models: |
        models/ESRGAN
        models/RealESRGAN
        models/SwinIR
    embeddings: embeddings
    hypernetworks: models/hypernetworks
    controlnet: models/ControlNet

#config for comfyui
#your base path should be either an existing comfy install or a central folder where you store all of your models, loras, etc.
#comfyui:
#    base_path: path/to/comfyui/
#    # You can use is_default to mark that these folders should be listed first, and used as the default dirs for eg downloads
#    #is_default: true
#    checkpoints: models/checkpoints/
#    clip: models/clip/
#    clip_vision: models/clip_vision/
#    configs: models/configs/
#    controlnet: models/controlnet/
#    diffusion_models: |
#        models/diffusion_models
#        models/unet
#    embeddings: models/embeddings/
#    loras: models/loras/
#    upscale_models: models/upscale_models/
#    vae: models/vae/

#other_ui:
#    base_path: path/to/ui
#    checkpoints: models/checkpoints
#    gligen: models/gligen
#    custom_nodes: path/custom_nodes
```
### 添加额外自定义节点路径
除了添加外部模型之外,你同样可以添加不在 ComfyUI 默认路径下的自定义节点路径
请注意,这并不会改变自定义节点的默认安装路径,只是在启动 ComfyUI 时会增加额外的路径搜索,你仍旧需要在对应的环境中完成自定义节点的依赖的安装,来保证其运行环境的完整性。
下面是一个简单的配置示例(Mac 系统),请根据你的实际情况进行修改,并新增到对应的配置文件中,保存后需要 **重启 ComfyUI** 才能生效:
```yaml
my_custom_nodes:
    custom_nodes: /Users/your_username/Documents/extra_custom_nodes
```
## 桌面端 Python 环境相关
桌面端的安装会在你选择的安装目录下创建一个 Python 虚拟环境,通常这是一个隐藏的 `.venv` 文件夹。
如果你需要为 ComfyUI 插件处理相关依赖,则需要在这个环境中进行,直接使用系统的命令行会有将对应依赖安装到系统环境的风险,请参考下面的指引完成对应环境的激活。
### 如何使用桌面端的 Python 环境?
你可以使用桌面端自带的终端来使用 python 环境。
1. 点击菜单栏的 icon 打开底部面板
2. 点击 `Terminal` 打开终端
3. 如果你想要看对应环境的 python 安装位置,可以使用下面的命令
```
python -c "import sys; print(sys.executable)"
```
```
which python
```
除非你清楚当前操作的含义,否则你的操作可能会导致对应环境的依赖问题,请谨慎使用此方式进行操作
你也可以使用你喜欢的终端来使用 python 环境,但在使用之前需要先激活对应的虚拟环境。
在使用其它终端进行操作时,如果你不熟悉相关操作,可能会将对应依赖安装到系统环境,请谨慎使用此方式进行操作
在截图中,是一个 macOS 终端的示例。如果你使用的是 Windows,请参考以下步骤在系统上激活虚拟环境。
打开你喜欢的终端,使用 `cd` 命令进入你安装的 ComfyUI 的目录,如:
```
cd <你的 ComfyUI 安装目录>/ComfyUI
```
在终端中输入下面的命令激活虚拟环境
```
.venv/Scripts/activate
```
```
source .venv/bin/activate
```
激活后,你可以在终端中看到类似 `(ComfyUI)` 的提示,表示你已经激活了虚拟环境
使用 `which python` 查看当前 python 所在环境,确认不是系统环境
完成以上步骤后你就激活了对应的 Python 环境,你可以继续在这个环境里进行依赖安装相关的操作了。
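如果你想验证激活是否成功,可以检查 `python` 的路径是否位于 `.venv` 之下。下面在临时目录中用一个演示用虚拟环境示意这一检查(仅为示意,实际请在你的 ComfyUI 安装目录中执行):

```shell
# 在临时目录创建一个演示用虚拟环境并激活(仅为演示)
cd "$(mktemp -d)"
python3 -m venv .venv
. .venv/bin/activate                  # Windows 下对应 .venv/Scripts/activate
# 确认 python 指向 .venv 内,而不是系统环境
command -v python | grep -q "/.venv/bin/python" && echo "虚拟环境已激活"
```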
## 如何卸载 ComfyUI 桌面版
对于 **ComfyUI 桌面版** 你可以在 Windows 的系统设置中使用系统的卸载功能来完成对应软件的卸载操作
如果你想要完全删除 **ComfyUI 桌面版** 的所有文件,你可以手动删除以下文件夹:
* C:\Users\<你的用户名>\AppData\Local\@comfyorgcomfyui-electron-updater
* C:\Users\<你的用户名>\AppData\Local\Programs\@comfyorgcomfyui-electron
* C:\Users\<你的用户名>\AppData\Roaming\ComfyUI
以上操作并不会删除你的以下文件夹,如果你需要删除对应文件,请手动删除:
* models 模型文件
* custom nodes 自定义节点
* input/output directories. 图片输入/输出目录
## 故障排除
### 显示不支持的设备
由于 ComfyUI 桌面版(Windows)仅支持可以使用 **CUDA 的 NVIDIA 显卡**,所以如果你的设备不支持,可能会出现此界面
* 请更换使用支持的设备
* 或者考虑使用 [ComfyUI便携版](/zh-CN/installation/comfyui_portable_windows) 或者通过[手动安装](/zh-CN/installation/manual_install)来使用 ComfyUI
### 如何定位安装错误
如果安装失败,你应该可以看到下面的界面显示
此时建议你采取以下几种方式查找错误原因
1. 点击 `Show Terminal` 查看错误输出
2. 点击 `Open Logs` 查看安装过程日志
3. 访问官方论坛查找错误反馈
4. 点击`Reinstall`尝试重新安装
建议在提交反馈之前,如上图所示,将对应的**错误输出**以及 **log 文件**信息提供给类似 **GPT** 一类的工具,询问对应错误的原因,或者完全删除 ComfyUI 后重新安装重试
### 反馈错误
如果在安装过程中发生了任何错误,请通过以下任意方式查看是否有类似的错误反馈,或者向我们提交错误
* Github Issues: [https://github.com/Comfy-Org/desktop/issues](https://github.com/Comfy-Org/desktop/issues)
* Comfy 官方论坛: [https://forum.comfy.org/](https://forum.comfy.org/)
请在提交错误时确保提交了以下日志以及配置文件,方便我们进行问题的定位和查找
1. 日志文件
| 文件名 | 描述 | 位置 |
| ----------- | ------------------------------------------- | ------------ |
| main.log | 包含与桌面应用和服务器启动相关的日志,来自桌面的 Electron 进程。 | {log_path_0} |
| comfyui.log | 包含与 ComfyUI 正常运行相关的日志,例如核心 ComfyUI 进程的终端输出。 | {log_path_0} |
2. 配置文件
| 文件名 | 描述 | 位置 |
| ------------------------ | ---------------------------- | --------------- |
| extra\_model\_paths.yaml | 包含 ComfyUI 将搜索模型和自定义节点的额外路径。 | {config_path_0} |
| config.json | 包含应用配置。此文件通常不应直接编辑。 | {config_path_0} |
# 如何在 ComfyUI 中安装自定义节点
Source: https://docs.comfy.org/zh-CN/installation/install_custom_node
本部分将指导你如何使用不同的方式来进行 ComfyUI 自定义节点的安装
## 什么是自定义节点?
自定义节点是ComfyUI的扩展插件,能够增加新功能,如高级图像处理、机器学习微调、颜色调整等。这些节点由社区开发,可显著扩展ComfyUI的基础功能。
在安装自定义节点前,你有必要对自定义节点进行审查,由于 ComfyUI 是一个开源项目,也有可能存在恶意利用自定义节点进行攻击的插件
1. 请只安装来自可信作者和社区常用的自定义节点
2. 在安装前了解插件功能,避免安装不明来源的插件以确保系统安全
3. 避免安装冷门插件和来路不明的插件,未经审核的插件可能存在安全风险,进而导致你的电脑被黑客入侵。
所有的自定义节点安装都需要完成下面的两个步骤:
1. 克隆节点代码到 `ComfyUI/custom_nodes` 目录
2. 安装对应的 Python 依赖
在本文中我们将介绍三种安装方法,下面是对应的优缺点对比。目前 [ComfyUI Manager](https://github.com/Comfy-Org/ComfyUI-Manager) 还未正式加入核心依赖,但未来它将成为核心依赖的一部分;本部分同时提供了其它安装插件的方法,以便满足你的需求。
| 方法 | 优点 | 缺点 |
| ------------------------ | ---------------------------------- | ---------------------------------------- |
| **ComfyUI Manager** (推荐) | 1. 自动化安装 2. 依赖处理 3. 图形界面 | 不在 registry 中注册的节点无法通过 Manager 直接搜索到 |
| **Git 克隆** | 可以安装不在 registry 中注册的节点 | 1. 需要Git知识 2. 手动处理依赖 3. 存在安装风险 |
| **代码仓库 ZIP 下载** | 1. 无需Git 2. 手动处理一切 | 1. 需要手动处理依赖 2. 无版本控制 3. 存在安装风险 |
提示: 在安装自定义节点前,请先查看插件的 README 文件,了解插件的安装方法和使用方法,有些插件有对特定的包和环境有要求,比如对应的模型、依赖的版本、常见问题解决等
## 方法一:ComfyUI Manager(推荐)
使用 ComfyUI Manager 安装自定义节点时,在列表中的自定义节点并不代表其是安全的,请在安装前了解其功能和作用,确保只安装可信插件和热门作者的自定义节点,否则将会给你的设备带来潜在风险
1. 如果有标记`⚠️`的节点,代表其可能与其它一些插件存在依赖冲突
2. 作者名称标注为`✅`的节点,代表对应作者在 Github 的活跃度
3. 对于插件可能存在的风险行为,会在列表中使用红色字段提醒,请在确保插件安全后再进行安装
找到需要安装的节点,点击`Install`按钮
在弹出的窗口中会要求选择版本:
* `nightly` 版本: 直接从 Github 下载最新源码,但在 `security_level` 为 `normal` 时,会提示当前的安全等级设置不允许下载该插件
* `latest` 或带数字的稳定版本(推荐): 从 [https://registry.comfy.org/](https://registry.comfy.org/) 下载经过审查的代码,不会触发安全检查
`nightly` 版本通常是最新的版本,但因为它直接从 github 下载没有经过审查的版本,存在一定的代码风险,如果你一定需要安装`nightly`版本,请将 Manager 的 `security_level` 设置为 `weak`
对应配置文件路径为 `ComfyUI/user/default/ComfyUI-Manager/config.ini`,但请注意这并不是我们推荐的配置,请只临时使用这个配置
点击安装后 Manager 会自动完成依赖的安装,并会在安装完成后提示你重启 ComfyUI
重启后在 ComfyUI Manager 中检查对应的插件是否安装成功, 是否出现`import failed`的错误提示
## 方法二:使用 Git 手动安装
适用于Manager中找不到的新节点或需要特定版本时, 需要你的系统中已经安装好了 [Git](https://git-scm.com/)
在GitHub页面点击"Code"按钮复制HTTPS链接
```bash
cd /path/to/ComfyUI/custom_nodes
```
```bash
git clone [仓库URL]
```
安装依赖需要安装到对应的 ComfyUI 环境中,请注意不要混淆到你的系统环境, 否则可能会导致你的系统环境污染
在 Windows 便携版中, 依赖需要安装到对应的嵌入式 Python 环境中
```bash
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\[节点目录名]\requirements.txt
```
在你的 ComfyUI 环境中安装依赖
```bash
cd [节点目录名]
pip install -r requirements.txt
```
重启ComfyUI并刷新浏览器, 检查对应的插件是否安装成功, 在启动日志界面是否出现`import failed`的错误提示
## 方法三:ZIP下载安装
适用于无法使用 Git 或 Manager 安装的用户
我们不是非常推荐使用这种方式来进行安装,因为这种安装方式会丢失版本控制,后续无法方便地通过 Git 更新或回退
在GitHub页面点击 `Code` → `Download ZIP`
解压ZIP文件
将解压后的文件夹复制到 `ComfyUI/custom_nodes/` 目录
重启ComfyUI并刷新浏览器
重启后在 ComfyUI Manager 中检查对应的插件是否安装成功, 是否出现`import failed`的错误提示
{/* ## 常见问题排查
[To Be Updated] */}
# 如何在不同系统上手动安装 ComfyUI
Source: https://docs.comfy.org/zh-CN/installation/manual_install
本部分将指导你完成在 Windows、MacOS 以及 Linux 的手动安装过程
对于 ComfyUI 的安装, 主要分为几个步骤
1. 创建一个虚拟环境(避免污染系统级 Python 环境)
2. 克隆 ComfyUI 代码仓库
3. 安装依赖
4. 启动 ComfyUI
你也可以参考 [ComfyUI CLI](/zh-CN/comfy-cli/getting-started) 来安装 ComfyUI, 它是一个命令行工具,可以方便地安装 ComfyUI 并管理其依赖。
## (可选)创建虚拟环境
独立的虚拟环境是必要的,因为 ComfyUI 的依赖可能会与系统上的其他依赖冲突,也可以避免对系统级 Python 环境的污染。
[Install Miniconda](https://docs.anaconda.com/free/miniconda/index.html#latest-miniconda-installer-links). 这将帮助您安装 ComfyUI 所需的正确版本的 Python 和其他库。
使用 Conda 创建一个环境。
```
conda create -n comfyenv
conda activate comfyenv
```
## 克隆代码仓库
你需要保证你的系统上已经安装了 [Git](https://git-scm.com/downloads), 首先你需要打开终端(命令行),然后克隆代码仓库。
如果你还没有安装 Microsoft Visual C++ Redistributable,请在[这里安装](https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170)
打开终端应用程序。
打开[终端应用程序](https://support.apple.com/guide/terminal/open-or-quit-terminal-apd5265185d-f365-44cb-8b09-71a064a42125/mac)。
```bash
git clone git@github.com:comfyanonymous/ComfyUI.git
```
## 安装 GPU 及 ComfyUI 依赖
安装 GPU 依赖
```
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```
或者,您可以安装 PyTorch 的夜间版本。
安装夜间版本(可能风险更大)
```
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch-nightly -c nvidia
```
```
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
```
或者,您可以安装 PyTorch 的夜间版本。
安装夜间版本(可能风险更大)
```
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.0
```
```bash
conda install pytorch-nightly::pytorch torchvision torchaudio -c pytorch-nightly
```
```bash
cd ComfyUI
pip install -r requirements.txt
```
启动 ComfyUI
```
cd ComfyUI
python main.py
```
## 如何更新 ComfyUI
使用命令行进入 ComfyUI 的安装路径,然后拉取最新代码。
```bash
cd <安装路径>/ComfyUI
git pull
```
使用命令行进入 ComfyUI 的安装路径,然后安装依赖。
你需要确保当前的 Python 环境是 ComfyUI 的虚拟环境,否则依赖会安装到系统级 Python 环境,造成污染。
```bash
pip install -r requirements.txt
```
## 添加外部模型路径
如果你想要在 `ComfyUI/models` 之外管理你的模型文件,可能出于以下原因:
* 你有多个 ComfyUI 实例,你想要让这些实例共享模型文件,从而减少磁盘占用
* 你有多个不同类型的 GUI 程序,如 WebUI,你想要它们共用模型文件
* 模型文件无法被识别或读取到
我们提供了通过 `extra_model_paths.yaml` 配置文件来添加额外模型搜索路径的方法。
### 不同 ComfyUI 版本配置文件位置
对于[便携版](/zh-CN/installation/comfyui_portable_windows)和[手动安装](/zh-CN/installation/manual_install)的 ComfyUI版本,你可以在 ComfyUI 的根目录下找到 `extra_model_paths.yaml.example` 的示例文件
```
ComfyUI/extra_model_paths.yaml.example
```
复制并重命名为 `extra_model_paths.yaml` 来使用, 并保持在 ComfyUI 的根目录下, 路径应该是 `ComfyUI/extra_model_paths.yaml`
你也可以在 [这里](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 找到配置示例文件
如果你使用的是 ComfyUI 桌面应用程序,你可以参考下图打开额外模型的配置文件:
或者通过下面的位置打开:
```
C:\Users\YourUsername\AppData\Roaming\ComfyUI\extra_models_config.yaml
```
```
~/Library/Application Support/ComfyUI/extra_models_config.yaml
```
对应配置文件的位置不应该被改变
### 配置示例
比如,你需要额外让 ComfyUI 识别的模型文件位于下面的文件夹:
```
📁 YOUR_PATH/
├── 📁models/
| ├── 📁 loras/
| │ └── xxxxx.safetensors
| ├── 📁 checkpoints/
| │ └── xxxxx.safetensors
| ├── 📁 vae/
| │ └── xxxxx.safetensors
| └── 📁 controlnet/
| └── xxxxx.safetensors
```
那么你可以进行如下的配置来让 ComfyUI 识别到你设备上的模型路径
```
my_custom_config:
    base_path: YOUR_PATH
    loras: models/loras/
    checkpoints: models/checkpoints/
    vae: models/vae/
    controlnet: models/controlnet/
```
或者使用
```
my_custom_config:
    base_path: YOUR_PATH/models/
    loras: loras
    checkpoints: checkpoints
    vae: vae
    controlnet: controlnet
```
对于桌面版,请在原有配置文件中新增配置,而不要覆盖安装过程中自动生成的路径配置。请在修改前备份对应的文件,以便在配置错误时及时恢复
或者你也可以参考默认的 [extra\_model\_paths.yaml.example](https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example) 来配置,保存之后, 需要 **重启 ComfyUI** 才能生效。
下面是完整的原始配置示例:
```yaml
#Rename this to extra_model_paths.yaml and ComfyUI will load it

#config for a1111 ui
#all you have to do is change the base_path to where yours is installed
a111:
    base_path: path/to/stable-diffusion-webui/

    checkpoints: models/Stable-diffusion
    configs: models/Stable-diffusion
    vae: models/VAE
    loras: |
        models/Lora
        models/LyCORIS
    upscale_models: |
        models/ESRGAN
        models/RealESRGAN
        models/SwinIR
    embeddings: embeddings
    hypernetworks: models/hypernetworks
    controlnet: models/ControlNet

#config for comfyui
#your base path should be either an existing comfy install or a central folder where you store all of your models, loras, etc.
#comfyui:
#    base_path: path/to/comfyui/
#    # You can use is_default to mark that these folders should be listed first, and used as the default dirs for eg downloads
#    #is_default: true
#    checkpoints: models/checkpoints/
#    clip: models/clip/
#    clip_vision: models/clip_vision/
#    configs: models/configs/
#    controlnet: models/controlnet/
#    diffusion_models: |
#        models/diffusion_models
#        models/unet
#    embeddings: models/embeddings/
#    loras: models/loras/
#    upscale_models: models/upscale_models/
#    vae: models/vae/

#other_ui:
#    base_path: path/to/ui
#    checkpoints: models/checkpoints
#    gligen: models/gligen
#    custom_nodes: path/custom_nodes
```
### 添加额外自定义节点路径
除了添加外部模型之外,你同样可以添加不在 ComfyUI 默认路径下的自定义节点路径
请注意,这并不会改变自定义节点的默认安装路径,只是在启动 ComfyUI 时会增加额外的路径搜索,你仍旧需要在对应的环境中完成自定义节点的依赖的安装,来保证其运行环境的完整性。
下面是一个简单的配置示例(Mac 系统),请根据你的实际情况进行修改,并新增到对应的配置文件中,保存后需要 **重启 ComfyUI** 才能生效:
```yaml
my_custom_nodes:
    custom_nodes: /Users/your_username/Documents/extra_custom_nodes
```
# 系统要求
Source: https://docs.comfy.org/zh-CN/installation/system_requirements
本文将介绍 ComfyUI 目前的一些系统要求,包括硬件及软件要求
在本篇我们将介绍安装 ComfyUI 的系统要求, 由于 ComfyUI 的更新频繁,本篇文档未必能够及时更新,请参考[ComfyUI](https://github.com/comfyanonymous/ComfyUI)中的相关说明。
无论是哪个版本的 ComfyUI,都是运行在一个独立的 Python 环境中。
### 操作系统要求
目前我们支持以下操作系统:
* Windows
* Linux
* macOS(支持 Apple 芯片,如 M 系列)
请参考[ComfyUI Windows 和 Linux 手动安装章节](https://github.com/comfyanonymous/ComfyUI?tab=readme-ov-file#manual-install-windows-linux)了解详细的安装步骤。
你可以参考下面的章节来了解不同系统和版本 ComfyUI 的安装方式,在不同版本的安装中我们简单对安装的系统要求进行了说明。
ComfyUI 桌面版目前支持 **Windows 及 MacOS(Apple Silicon)** 的独立安装,目前仍在 Beta 版本
* 代码开源在 [Github](https://github.com/Comfy-Org/desktop)
由于 Desktop 总是基于稳定版本发布构建,所以我们最新的一些更新,对于 Desktop 来说可能需要等待一段时间才能体验到,如果你想要总是体验最新版本,请使用便携版或者手动安装
你可以从下面选择适合你的系统和硬件开始安装 ComfyUI
适合带有 **NVIDIA** 显卡的 **Windows** 版本的 ComfyUI 桌面版
适合带有 **Apple Silicon** 的 MacOS ComfyUI 桌面版
ComfyUI桌面版,**暂时没有 Linux 的预构建**,请访问[手动安装](/zh-CN/installation/manual_install)部分进行 ComfyUI 的安装
便携版是一个集成了独立的嵌入式 Python 环境的 ComfyUI 版本,使用便携版你可以体验到最新的功能,目前仅支持 **Windows** 系统
支持 **NVIDIA 显卡** 和在 **CPU** 运行的 **Windows** ComfyUI 版本,始终使用最新 commit 的代码
支持所有的系统类型和 GPU 类型(Nvidia、AMD、Intel、Apple Silicon、Ascend NPU、寒武纪 MLU)的用户都可以尝试使用手动安装 ComfyUI
### Python 版本
* 推荐 Python 3.12
* 支持 Python 3.13(部分自定义节点可能不兼容)
### 支持的硬件
* NVIDIA 显卡
* AMD 显卡
* Intel 显卡(包括 Arc 系列,支持 IPEX)
* Apple Silicon(M1/M2)
* Ascend NPU
* Cambricon MLU
* CPU(可用 --cpu 参数,速度较慢)
请参考[ComfyUI Windows 和 Linux 手动安装章节](https://github.com/comfyanonymous/ComfyUI?tab=readme-ov-file#manual-install-windows-linux)了解详细的安装步骤。
目前稳定版本的 PyTorch 2.7 已经支持 Blackwell 架构(CUDA 12.8),ComfyUI 的核心版本和桌面版本已经采用了这个版本。
### 依赖
* 需安装 PyTorch(不同硬件需不同版本,详见下方)
* 需安装 ComfyUI 的 requirements.txt 中所有依赖
请参考手动安装章节了解详细的安装步骤。
# 如何更新 ComfyUI
Source: https://docs.comfy.org/zh-CN/installation/update_comfyui
本部分将指导你详细了解 ComfyUI 的更新过程
尽管我们可能已经在不同版本的各部分章节,对于 ComfyUI 的更新过程都有所说明,但是为了方便用户能够更清楚的了解 ComfyUI 的更新过程,我们会在这部分对于 ComfyUI 的更新进行详细的说明。
## 如何更新 ComfyUI?
ComfyUI 便携版提供了便捷的批处理脚本来完成更新操作。
### 更新脚本位置
在便携版安装目录下的 `update` 文件夹中,可以找到以下更新脚本:
```
ComfyUI_windows_portable
└─ 📂update
├── update.py
├── update_comfyui.bat // 更新到最新开发版本
├── update_comfyui_stable.bat // 更新到最新稳定版本
└── update_comfyui_and_python_dependencies.bat // 更新依赖(问题修复时使用)
```
更新过程中请保持网络连接稳定,如果在中国大陆地区可能需要配置网络代理来保证能够顺利访问 GitHub 仓库
ComfyUI 桌面版采用自动更新机制,确保你始终使用最新版本,不过由于 Desktop 版本是基于稳定版本构建的,所以它的功能更新相对滞后,你也可以访问 [下载页面](https://www.comfy.org/download) 来获取最新版本。
### 自动更新设置
请确保在设置中已经启用自动更新功能:
### 手动检查更新
你也可以手动检查是否有可用的更新:
1. 点击菜单栏中的 `菜单`
2. 选择 `帮助`
3. 点击 `检查更新`
桌面版会自动处理所有更新过程,包括 ComfyUI 核心代码和相关依赖的更新
手动安装的 ComfyUI 需要通过 Git 命令来完成更新操作。
### 更新前准备
确保你的系统已安装 [Git](https://git-scm.com/downloads) 并且 ComfyUI 是通过 Git 克隆安装的。
### 更新步骤
首先激活 ComfyUI 的 Python 虚拟环境(如果使用了虚拟环境):
```bash
# 对于 conda 环境
conda activate comfyui
# 对于 venv 环境
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
```
进入 ComfyUI 安装目录并拉取最新代码:
```bash
cd <ComfyUI 安装目录>
git pull
```
安装或更新 ComfyUI 的依赖包:
```bash
pip install -r requirements.txt
```
请确保当前处于 ComfyUI 的虚拟环境中,避免污染系统级 Python 环境
更新完成后重新启动 ComfyUI:
```bash
python main.py
```
### 切换版本(可选)
如果需要切换到特定版本,可以使用以下命令:
```bash
# 查看提交历史
git log --oneline
# 切换到特定提交
git checkout <commit-id>
# 返回最新版本
git checkout master
```
建议定期更新以获得最新功能和安全修复,同时推荐使用稳定版本以确保系统稳定性
## ComfyUI 的不同版本说明
取决于安装方式的不同,目前 ComfyUI 有以下几种安装版本,下面的相关链接中已经包含了针对不同版本的更新说明。
ComfyUI 桌面版目前支持 **Windows 及 MacOS(Apple Silicon)** 的独立安装,目前仍在 Beta 版本
* 代码开源在 [Github](https://github.com/Comfy-Org/desktop)
由于 Desktop 总是基于稳定版本发布构建,所以我们最新的一些更新,对于 Desktop 来说可能需要等待一段时间才能体验到,如果你想要总是体验最新版本,请使用便携版或者手动安装
你可以从下面选择适合你的系统和硬件开始安装 ComfyUI
适合带有 **NVIDIA** 显卡的 **Windows** 版本的 ComfyUI 桌面版
适合带有 **Apple Silicon** 的 MacOS ComfyUI 桌面版
ComfyUI桌面版,**暂时没有 Linux 的预构建**,请访问[手动安装](/zh-CN/installation/manual_install)部分进行 ComfyUI 的安装
便携版是一个集成了独立的嵌入式 Python 环境的 ComfyUI 版本,使用便携版你可以体验到最新的功能,目前仅支持 **Windows** 系统
支持 **NVIDIA 显卡** 和在 **CPU** 运行的 **Windows** ComfyUI 版本,始终使用最新 commit 的代码
支持所有的系统类型和 GPU 类型(Nvidia、AMD、Intel、Apple Silicon、Ascend NPU、寒武纪 MLU)的用户都可以尝试使用手动安装 ComfyUI
## 在更新 ComfyUI 时都需要更新什么内容?
目前 ComfyUI 的更新主要需要确保两部分内容:
1. 更新 ComfyUI 的核心代码
2. 更新 ComfyUI 的核心依赖,包括必要的 Python 依赖和 ComfyUI 的功能依赖包。
**核心代码**: 新的节点,新的模型支持,新的功能等。
**核心依赖**: 主要包括 ComfyUI 的前端功能,工作流模板,节点帮助文档等。
```
comfyui-frontend-package # ComfyUI 前端功能
comfyui-workflow-templates # ComfyUI 工作流模板
comfyui-embedded-docs # ComfyUI 节点的帮助文档
```
目前这三个核心依赖项目分别在不同的仓库中维护:
* [ComfyUI\_frontend](https://github.com/Comfy-Org/ComfyUI_frontend/) - 前端界面和交互功能
* [workflow\_templates](https://github.com/Comfy-Org/workflow_templates) - 预置工作流模板
* [comfyui-embedded-docs](https://github.com/Comfy-Org/embedded-docs) - 节点帮助文档
另外很有必要说明的一点是,开发版本(nightly) 和 稳定版本(release) 的区别:
* **开发版本(nightly)**:最新 commit 的代码,你可以体验到我们最新提供的一些功能,但是也有可能存在一些潜在的问题
* **稳定版本(release)**:是基于稳定版本构建,通常会滞后于开发版本,但是稳定性更高,我们会在相关功能发布稳定后对稳定版本进行支持
目前较多用户在更新后仍处于 release 版本或者桌面版,却发现需要的功能只在开发版本中提供。对于此情况,请检查本地 `ComfyUI/requirements.txt` 与 [nightly 版本的依赖](https://github.com/comfyanonymous/ComfyUI/blob/master/requirements.txt)是否一致,来确定当前所有依赖是否都支持最新版本的功能。
## 常见更新问题
### 更新后前端、工作流模板、节点帮助文档等缺失或滞后
经常有用户只是使用 `git pull` 命令来更新 ComfyUI 的代码,但**忽略了核心依赖更新**,导致出现以下问题:
* 前端功能缺失或显示异常
* 找不到新增的工作流模板
* 节点帮助文档过时或缺失
* 新功能没有对应的前端支持
请在使用了 `git pull` 命令后,在对应的 ComfyUI 环境使用 `pip install -r requirements.txt` 命令来更新依赖。
如果依赖更新失败,常见原因是网络或者计算机权限问题。目前在更新过程中如果出现核心依赖安装失败的情况,会回退到旧版本,通常你可以在启动日志中看到类似下面的日志:
```
Falling back to the default frontend.
ComfyUI frontend version: xxx
```
请按以下步骤排查:
1. 在对应的环境中使用 `pip list` 命令来查看当前安装的依赖包,如果发现有版本不一致的情况,请在对应的 ComfyUI 环境使用 `pip install -r requirements.txt` 命令来尝试再次更新依赖。
2. 如果更新后仍然存在问题,请检查网络连接是否正常,如果在中国大陆地区可能需要配置网络代理来保证能够顺利访问 GitHub 仓库。
3. 如果仍然存在问题,请检查计算机权限是否正常,如果需要管理员权限,请使用管理员权限运行命令行。
### 如何正确更新核心依赖
**推荐方法**:使用 `ComfyUI_windows_portable\update\update_comfyui.bat` 这个批处理脚本,这个脚本会同时更新 ComfyUI 代码和所有 Python 依赖包。
**手动更新依赖**:
如果你需要手动更新依赖,可以使用以下命令:
```bash
# 在便携版目录下打开命令行
.\python_embeded\python.exe -m pip install -r ComfyUI\requirements.txt
```
**标准更新流程**:
如你使用 Conda 管理虚拟环境,请先激活虚拟环境:
```bash
conda activate comfyui # 或其他环境名
```
这一步我们需要进入 ComfyUI 的根目录,并使用 Git 更新对应的代码。
```
cd <ComfyUI 安装目录>
git pull
```
这一步我们需要更新 ComfyUI 的依赖,这个步骤非常重要,特别是前端 `comfyui-frontend-package` 的更新
```
pip install -r requirements.txt
```
桌面版通常会自动处理依赖更新。如果遇到问题:
1. **检查自动更新设置**是否启用
2. **手动触发更新**:菜单 → 帮助 → 检查更新
3. **重新安装桌面版**(极端情况下)
### 依赖更新故障排除
如果依赖更新失败,请按以下步骤排查:
如果位于中国大陆地区,请确保能够访问 PyPI 或配置中国境内镜像:
```bash
# 使用清华大学镜像
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```
如果批量安装失败,可尝试逐个安装。**首先查看 `ComfyUI/requirements.txt` 中的版本要求**:
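查看版本要求时,可以用 `grep` 从 requirements.txt 中快速过滤出核心依赖(下面用一份模拟的 requirements.txt 演示,版本号仅为示例,请以你本地的 `ComfyUI/requirements.txt` 为准):

```shell
cd "$(mktemp -d)"
# 模拟一份 ComfyUI/requirements.txt 的片段(版本号仅为示例)
printf '%s\n' 'comfyui-frontend-package==1.17.11' \
              'comfyui-workflow-templates==1.0.0' \
              'comfyui-embedded-docs==1.0.0' > requirements.txt
# 过滤出以 comfyui- 开头的核心依赖及其版本
grep '^comfyui-' requirements.txt
```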
**然后按照指定版本安装:**
```bash
pip install comfyui-frontend-package==1.17.11
pip install comfyui-workflow-templates==1.0.0
pip install comfyui-embedded-docs==1.0.0
```
建议使用 `ComfyUI/requirements.txt` 中指定的确切版本号,不要自行升级到最新版本,这可能导致兼容性问题。
### 为什么我更新后找不到新功能?
这是最常见的问题之一:
* 如果你使用的是**桌面版**,因为桌面版是基于稳定版本构建的,它的功能更新相对滞后
* 请确定你使用的是**开发版本(nightly)**,而不是**稳定版本(release)**
另外还需要确保在更新过程中对应的依赖已经成功更新,如果更新后仍然存在问题,请参考[依赖更新故障排除](#依赖更新故障排除)章节来排查问题。
### 如何切换到开发(nightly)版本或者稳定(release)版本?
不同版本的区别
* **特点**:包含最新的 commit 代码
* **优势**:可以第一时间体验到最新功能和改进
* **风险**:可能存在未发现的 bug 或不稳定因素
* **适合人群**:开发者、测试用户、想要体验最新功能的用户
* **特点**:经过测试和验证的稳定代码
* **优势**:稳定性高,适合生产环境使用
* **劣势**:功能更新会有延迟,可能落后开发版本数周或数月
* **适合人群**:需要稳定性的用户、生产环境用户
使用 `update_comfyui.bat` 而不是 `update_comfyui_stable.bat`:
```
# 开发版本(最新功能)
double-click: update_comfyui.bat
# 稳定版本
double-click: update_comfyui_stable.bat
```
```bash
# 切换到开发版本
git checkout master
git pull
# 切换到最新稳定版本
git fetch --tags
git checkout $(git describe --tags `git rev-list --tags --max-count=1`)
```
桌面版通常基于稳定版本构建,目前暂时不支持切换版本,如果你需要最新功能,建议:
1. 等待桌面版更新
2. 或者使用便携版/手动安装来体验最新功能
### 更新后出现错误怎么办?
1. **检查依赖**:运行 `pip install -r requirements.txt` 确保所有依赖都已更新
2. **检查自定义节点**:某些自定义节点可能与新版本不兼容
3. **回退版本**:如果问题严重,可以回退到之前的稳定版本
如果出现问题,可以参考我们的问题排查页面来解决。
查看如何进行 ComfyUI 的故障排查
### 如何了解最新功能?
* **GitHub Releases**:查看 [ComfyUI Releases](https://github.com/comfyanonymous/ComfyUI/releases) 了解稳定版本更新
* **GitHub Commits**:查看 [最新提交](https://github.com/comfyanonymous/ComfyUI/commits/master) 了解开发进度
* **社区讨论**:关注我们的[博客](https://blog.comfy.org)和[推特](https://x.com/comfyui)来了解最新动态
# ComfyUI 外观设置
Source: https://docs.comfy.org/zh-CN/interface/appearance
ComfyUI 外观设置选项的详细说明
这部分的设置主要用于自定义 ComfyUI 的外观,包括色彩主题、背景图片、节点样式等。
## 色彩主题
自定义 ComfyUI 外观的主要方式是通过内置的调色板系统。
1. 切换 ComfyUI 主题
2. 将当前选中的主题导出为 JSON 格式
3. 从Json文件中载入自定义主题配置
4. 删除自定义主题配置
对于调色板无法满足的外观需求,你可以通过 [user.css](#使用user-css进行高级外观自定义) 文件进行高级外观自定义
### 如何自定义颜色主题
调色板允许您修改许多特定属性。以下是一些最常自定义的元素,颜色采用十六进制表示:
1. 下面 JSON 中的注释只是为了说明,实际使用时请不要直接复制下面带注释的内容进行修改,否则将会导致主题无法正常使用
2. 由于我们仍在频繁迭代,下面的这些内容可能会随着 ComfyUI 前端的更新而有所调整,如果需要修改,请从设置中导出主题配置,然后进行修改
```json
{
"id": "dark", // 必须是唯一的,不能和其它主题的id相同
"name": "Dark (Default)", // 主题名称,显示在主题选择器中
"colors": {
"node_slot": { // 节点连接槽的颜色配置
"CLIP": "#FFD500", // CLIP 模型连接槽颜色
"CLIP_VISION": "#A8DADC", // CLIP Vision 模型连接槽颜色
"CLIP_VISION_OUTPUT": "#ad7452", // CLIP Vision 输出连接槽颜色
"CONDITIONING": "#FFA931", // 条件控制连接槽颜色
"CONTROL_NET": "#6EE7B7", // ControlNet 模型连接槽颜色
"IMAGE": "#64B5F6", // 图像数据连接槽颜色
"LATENT": "#FF9CF9", // 潜在空间连接槽颜色
"MASK": "#81C784", // 蒙版数据连接槽颜色
"MODEL": "#B39DDB", // 模型连接槽颜色
"STYLE_MODEL": "#C2FFAE", // 风格模型连接槽颜色
"VAE": "#FF6E6E", // VAE 模型连接槽颜色
"NOISE": "#B0B0B0", // 噪声数据连接槽颜色
"GUIDER": "#66FFFF", // 引导器连接槽颜色
"SAMPLER": "#ECB4B4", // 采样器连接槽颜色
"SIGMAS": "#CDFFCD", // Sigmas 数据连接槽颜色
"TAESD": "#DCC274" // TAESD 模型连接槽颜色
},
"litegraph_base": { // LiteGraph 基础界面配置
"BACKGROUND_IMAGE": "", // 背景图片,默认为空
"CLEAR_BACKGROUND_COLOR": "#222", // 主画布背景色
"NODE_TITLE_COLOR": "#999", // 节点标题文本颜色
"NODE_SELECTED_TITLE_COLOR": "#FFF", // 选中节点的标题颜色
"NODE_TEXT_SIZE": 14, // 节点文本大小
"NODE_TEXT_COLOR": "#AAA", // 节点文本颜色
"NODE_TEXT_HIGHLIGHT_COLOR": "#FFF", // 节点文本高亮颜色
"NODE_SUBTEXT_SIZE": 12, // 节点子文本大小
"NODE_DEFAULT_COLOR": "#333", // 节点默认颜色
"NODE_DEFAULT_BGCOLOR": "#353535", // 节点默认背景色
"NODE_DEFAULT_BOXCOLOR": "#666", // 节点默认边框颜色
"NODE_DEFAULT_SHAPE": 2, // 节点默认形状
"NODE_BOX_OUTLINE_COLOR": "#FFF", // 节点边框轮廓颜色
"NODE_BYPASS_BGCOLOR": "#FF00FF", // 节点旁路背景色
"NODE_ERROR_COLOUR": "#E00", // 节点错误状态颜色
"DEFAULT_SHADOW_COLOR": "rgba(0,0,0,0.5)", // 默认阴影颜色
"DEFAULT_GROUP_FONT": 24, // 分组默认字体大小
"WIDGET_BGCOLOR": "#222", // 小部件背景色
"WIDGET_OUTLINE_COLOR": "#666", // 小部件轮廓颜色
"WIDGET_TEXT_COLOR": "#DDD", // 小部件文本颜色
"WIDGET_SECONDARY_TEXT_COLOR": "#999", // 小部件次要文本颜色
"WIDGET_DISABLED_TEXT_COLOR": "#666", // 小部件禁用状态文本颜色
"LINK_COLOR": "#9A9", // 连接线颜色
"EVENT_LINK_COLOR": "#A86", // 事件连接线颜色
"CONNECTING_LINK_COLOR": "#AFA", // 正在连接时的连接线颜色
"BADGE_FG_COLOR": "#FFF", // 徽章前景色
"BADGE_BG_COLOR": "#0F1F0F" // 徽章背景色
},
"comfy_base": { // ComfyUI 基础界面配置
"fg-color": "#fff", // 前景色
"bg-color": "#202020", // 背景色
"comfy-menu-bg": "#353535", // 菜单背景色
"comfy-menu-secondary-bg": "#303030", // 次级菜单背景色
"comfy-input-bg": "#222", // 输入框背景色
"input-text": "#ddd", // 输入文本颜色
"descrip-text": "#999", // 描述文本颜色
"drag-text": "#ccc", // 拖拽文本颜色
"error-text": "#ff4444", // 错误文本颜色
"border-color": "#4e4e4e", // 边框颜色
"tr-even-bg-color": "#222", // 表格偶数行背景色
"tr-odd-bg-color": "#353535", // 表格奇数行背景色
"content-bg": "#4e4e4e", // 内容区背景色
"content-fg": "#fff", // 内容区前景色
"content-hover-bg": "#222", // 内容区悬停背景色
"content-hover-fg": "#fff", // 内容区悬停前景色
"bar-shadow": "rgba(16, 16, 16, 0.5) 0 0 0.5rem" // 栏阴影效果
}
}
}
```
## 画布
### 背景图片
* 版本要求:ComfyUI 前端版本 1.20.5 或更新版本
* 功能:为画布设置自定义背景图片,提供更加个性化的工作空间,你可以上传图片或者使用网络图片来为画布设置背景图片
## 节点
### 节点不透明度
* 功能:设置节点的不透明度,0表示完全透明,1表示完全不透明
## 节点组件
### 文本域小部件字体大小
* **范围**:8 - 24
* **功能**:设置文本域小部件中的字体大小,调整文本输入框中文字的显示大小,提升可读性
## 侧边栏
### 统一侧边栏宽度
* **功能**:启用后,当你在不同的侧边栏之间切换时,侧边栏的宽度将统一为一致的宽度,如果禁用,不同的侧边栏的宽度在切换时可以保持自定义的宽度
### 侧边栏大小
* **功能**:控制侧边栏的尺寸大小,可以设置为正常或者小
### 侧边栏位置
* **功能**:控制侧边栏显示在界面的左侧还是右侧,允许用户根据使用习惯调整侧边栏位置
## 树形浏览器
### 树形浏览器项目内边距
* **功能**:设置树形浏览器(侧边栏面板)中项目的内边距,调整树形结构中各项目之间的间距
## 使用user.css进行高级外观自定义
对于调色板不能提供足够控制的情况,您可以通过 user.css 文件使用自定义 CSS。此方法推荐给需要自定义调色板系统中不可用元素的高级用户。
### 要求
* ComfyUI 前端版本 1.20.5 或更新版本
### 设置 user.css
1. 在 ComfyUI 用户目录(与工作流和设置相同位置 - 请参阅下面的位置详细信息)中创建一个名为 `user.css` 的文件
2. 在此文件中添加您的自定义 CSS 规则
3. 重启 ComfyUI 或刷新页面以应用更改
### 用户目录位置
ComfyUI 用户目录是存储您的个人设置、工作流和自定义内容的地方。位置取决于您的安装类型:
```
C:\Users\<你的用户名>\AppData\Roaming\ComfyUI\user
```
```
~//ComfyUI/user
```
用户目录位于您的 ComfyUI 安装文件夹中:
```
/user/
```
例如,如果您将 ComfyUI 克隆到 `C:\ComfyUI`,您的用户目录将是 `C:\ComfyUI\user\default`(或者如果您设置了自定义用户名,则为 `C:\ComfyUI\user\john`)。
ComfyUI 的每个安装都支持多个用户。如果您没有配置自定义用户名,默认为 "default"。每个用户在 `user` 文件夹内都有自己的子目录。
用户目录位于您的 ComfyUI 便携版文件夹中:
```
/ComfyUI/user/
```
例如:`ComfyUI_windows_portable/ComfyUI/user/default`
ComfyUI 的每个安装都支持多个用户。如果您没有配置自定义用户名,默认为 "default"。每个用户在 `user` 文件夹内都有自己的子目录。
此位置包含您的工作流、设置和其他用户特定文件。
找到上述文件夹位置后,请将对应的 CSS 文件复制到对应的用户目录中(如默认用户文件夹为 `ComfyUI/user/default`),然后重启 ComfyUI 或刷新页面以应用更改
### user.css 示例及相关说明
`user.css` 文件会在启动的早期就进行加载,所以可能需要在 CSS 规则中使用 `!important` 来确保它们覆盖默认样式。
**user.css 自定义示例**
```css
/* 增加输入框和菜单中的字体大小以提高可读性 */
.comfy-multiline-input, .litecontextmenu .litemenu-entry {
font-size: 20px !important;
}
/* 使上下文菜单项更大,便于选择 */
.litegraph .litemenu-entry,
.litemenu-title {
font-size: 24px !important;
}
/* 为调色板中不可用的特定元素自定义样式 */
.comfy-menu {
border: 1px solid rgb(126, 179, 189) !important;
border-radius: 0px 0px 0px 10px !important;
backdrop-filter: blur(2px);
}
```
**最佳实践**
1. **首先使用调色板**进行大多数自定义
2. **仅在必要时使用 user.css**,用于调色板未涵盖的元素
3. **在进行重大更改前导出您的主题**,以便在需要时恢复
4. **与社区分享您的主题**,以启发他人
**故障排除**
* 如果您的调色板更改没有显示,尝试刷新页面
* 如果 CSS 自定义不起作用,检查您是否使用前端版本 1.20.5+
* 尝试在未应用的 user.css 规则中添加 `!important`
* 保留您的自定义备份,以便轻松恢复
# ComfyUI 积分管理
Source: https://docs.comfy.org/zh-CN/interface/credits
在本篇中我们将介绍 ComfyUI 的积分管理功能,包括积分的获取、使用、查看等操作。
积分系统是为了支持 `API Nodes` 节点而新增的,由于调用闭源 AI 模型需要消耗Token,所以对应的积分管理是很有必要的,在默认情况下积分界面并不会展示,请首先在`设置` -> `用户`中登录对应的 ComfyUI 账号,然后你就可以在 `设置` -> `积分` 中查看关联账号的积分信息了。
ComfyUI将始终保持完全开源,并对本地用户免费。
## 如何购买积分?
下面是对应的积分购买演示视频:
详细操作步骤如下:
在 `设置` -> `用户` 中登录你的 ComfyUI 账号
在登录后你应该可以看到对应的菜单增加了 `积分(Credits)` 选项
前往 `设置` -> `积分` 中对积分进行购买
在弹窗中设置购买金额,并点击 `购买`按钮
在支付页面请进行以下步骤:
1. 选择用于支付的币种
2. 确认对应的邮箱是你在 ComfyUI 中的注册邮箱
3. 选择对应的支付方式
* 信用卡
* 微信(仅在选择使用美元支付时才支持)
* 支付宝(仅在选择使用美元支付时才支持)
4. 点击 `Pay` 按钮或者`Generate QR Code` 按钮完成对应的支付流程
在支付动作完成后,请返回 `菜单` -> `积分` 检查你的余额是否已经更新,可以试着刷新界面或者重启 ComfyUI
## 常见问题
不支持,当积分为负数时你将无法运行 API Nodes
目前我们暂不支持退款
点击`设置` -> `积分` 即可在对应页面看到当前余额和`积分历史`的入口
我们可以在多个设备间登录同一个账号,但是不支持共享积分给其它用户
由于不同尺寸的图像和生成数量,每次消耗的`Token` 和 `积分`是不一样的,在 `设置` -> `积分` 中,你可以看到每次消耗的积分,以及对应的积分历史
请确保你选择了美元支付,目前微信支付宝等仅在美元支付时才支持
# 部分执行 - 允许你只运行 ComfyUI 中的部分节点
Source: https://docs.comfy.org/zh-CN/interface/features/partial-execution
ComfyUI 中部分执行(Partial Execution)功能的使用方法和条件
**部分执行** 功能是位于 ComfyUI 节点选择工具箱上的一个功能,它能够让你 **只运行工作流的一部分** ,而不是完整运行工作流中的所有节点,它仅在所选节点是一个 **输出节点** 时才可用,可用时会显示为绿色的三角形图标
## 什么是部分执行?
部分执行(Partial Execution),就像它字面上的意思一样,只运行工作流的一部分,而不是完整运行工作流中的所有节点
在上面的示意图中,是部分执行与运行工作流功能的对比
1. 部分执行(左侧):只运行从起始节点到输出节点分支的工作流
2. 运行工作流(右侧):运行工作流中的所有节点
这个功能能够让你更灵活地运行工作流的特定部分,而不是每次都运行整个工作流。
## 如何使用部分执行功能?
要使用部分执行功能,需要满足一个条件:当前选中的节点是一个输出节点,如保存或者预览节点。当对应的节点符合条件时,选中节点后选择工具箱上的按钮会显示为绿色三角形图标,点击该图标即可运行部分工作流
## 常见问题
Q: 为什么在使用这个功能的时候,所有节点都运行了?
A: 请确保你的 ComfyUI 前端版本在 v1.23.4 之后,甚至可能需要 v1.24.x 的版本(对应的缺陷在 1.24.x 左右的版本中才修复),请更新你的 ComfyUI 到最新版本,确保前端版本符合要求
# 子图功能 - ComfyUI 中的工作流组织工具
Source: https://docs.comfy.org/zh-CN/interface/features/subgraph
讲解 ComfyUI 中子图(Subgraph)功能的使用方法,包括创建、导航和管理子图
子图功能需要至少 ComfyUI 前端版本 1.24.3 版本的支持,如果你未在你的 ComfyUI 中发现此功能,请参考这篇文档进行更新: [如何更新ComfyUI](/zh-CN/installation/update_comfyui)
* 本文示例图片使用 nightly 版本前端制作,界面请以实际为准
子图是 ComfyUI 中的一个高级功能,它允许你将复杂的工作流打包成一个新的节点,使得它更易管理和复用。
简单来说,子图就像是工作流程中的"文件夹",你可以将相关的节点组织在一起,像使用单个节点一样使用整个子图。
你可以使用子图达到:
* 简化复杂工作流
* 轻松复用一些常见的节点组合
* 创建可快速复用的节点组合来提高搭建工作流的效率
## 如何创建子图?
在 ComfyUI 中选中任意节点
ComfyUI 会根据选中节点的输入输出,自动创建一个子图,并将其添加到画布中。
经过编辑和整理,你可以将子图调整成一个具备完善功能的节点。
## 编辑子图
### 1. 子图有和普通节点一样的编辑功能
你可以像一般节点一样使用子图:
1. 修改外观节点颜色
2. 修改节点名称
3. 使用绕过(Bypass)来禁用
4. 等等
### 2. 子图的编辑
* 在子图的非组件(widget)空白处双击以进入子图编辑状态
* 点击子图编辑按钮
下面为进入子图的编辑状态
1. 子图导航,你可以通过这里退出当前子图,并返回上级
2. 输入插槽(slot): 会显示在子图外部的内部节点输入
* 你可以像链接正常节点一样将输出连接到插槽中
* 通过在连接点上**点击鼠标右键**,你可以重命名/删除 暴露在子图外部的插槽
3. 输出插槽(slot): 暴露在子图外部的输出,和输入插槽功能类似
1. 图中数字为 1 的插槽为**默认插槽**: 用于新增 输入 / 输出 连接以暴露给上级的子图视图
2. 在已有插槽上**点击鼠标右键**,可以重命名、删除、取消原有节点的连接
> 插槽连接同样遵循数据类型验证
### 3. 子图的嵌套
在子图中你可以进一步嵌套子图,来创建更复杂的工作流
同时在编辑嵌套的子图时也提供了多级的导航来方便你返回到上一级
## 将子图转回普通节点
当子图编辑完成后,你可以将子图转回为普通节点,只需要选中子图后,在选择工具箱上点击同样的按钮即可完成子图转回节点的操作,右键菜单上也有对应的选项
# 工作流模板 - ComfyUI 内置工作流模板
Source: https://docs.comfy.org/zh-CN/interface/features/template
模板中提供了 ComfyUI 原生支持的模型工作流以及自定义节点提供的示例工作流,你可以在模板中找到目前已被支持的相关模型的工作流并进行使用
工作流模板是 ComfyUI 中原生支持的模型工作流及自定义节点提供的模型工作流的浏览窗口。
在 ComfyUI 的工作流模板功能中,你可以找到以下内容:
* 原生支持的模型工作流
* 自定义节点提供的示例工作流
## 如何打开 ComfyUI 的模板
通过菜单 `工作流` --> `浏览工作流模板` 即可打开工作流模板
## 如何使用模板
1. 加载模板: 点击你需要使用的任意模板以加载对应的模板工作流
2. 下载模型: 在加载模板时,ComfyUI 会自动检查模型文件是否存在,如果模型文件不完整,则会提示你下载模型
3. 使用工作流: 确定工作流需要的各种要素如模型、输入图片、提示词等是完整的,即可点击运行按钮开始使用工作流
### 模型保存位置
每个工作流模板都嵌入了对应的模型链接,在首次使用时如果没有检测到对应的模型文件,则会提示下载信息
1. 对于桌面版,当你点击`下载`按钮后, 桌面程序会自动为你完成模型文件的下载
2. 其它版本则会通过浏览器下载对应模型,你需要下载后将对应的模型保存到 `ComfyUI/models` 对应的文件夹,比如截图中对应的模型应保存的位置为:
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── qwen_image_fp8_e4m3fn.safetensors
│ ├── 📂 vae/
│ │ └── qwen_image_vae.safetensors
│ └── 📂 text_encoders/
│ └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
在目前版本中,模板的缺失文件检测仅支持检测对应的一级目录下是否有同名文件,也就是文件需要存在于 `ComfyUI/models/diffusion_models` 下。
如果你已经下载好了模型并存放在类似 `ComfyUI/models/diffusion_models/wan_video` 的子文件夹,那么你可以忽略弹窗提示,在对应的模型加载节点中确保对应的模型被选择加载即可。
如果你好奇我们是如何做到嵌入模型链接的,我们在对应节点的 `properties` 中添加了 `models` 信息, 下面是嵌入了模型信息的 `DualCLIPLoader` 节点的完整片段:
```json
{
"id": 40,
"type": "DualCLIPLoader",
"pos": [
-320,
290
],
"size": [
270,
130
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"type": "CLIP",
"links": [
64
]
}
],
"properties": {
"Node name for S&R": "DualCLIPLoader",
"cnr_id": "comfy-core",
"ver": "0.3.40",
"models": [
{
"name": "clip_l.safetensors",
"url": "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors",
"directory": "text_encoders"
},
{
"name": "t5xxl_fp16.safetensors",
"url": "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors",
"directory": "text_encoders"
}
]
},
"widgets_values": [
"clip_l.safetensors",
"t5xxl_fp16.safetensors",
"flux",
"default"
]
}
```
在 `properties` 中新增了 `models`, 对应的模型信息包含了对应的 `name`、`url` 以及 `directory` 信息。
* `name` 对应了模型文件的名称
* `url` 对应了模型文件的下载链接(可直接下载的文件链接,而不是仓库地址)
* `directory` 对应了模型文件的下载后应存储到的 ComfyUI/models 目录下的哪个子目录, 如 `vae` 则对应 `ComfyUI/models/vae` 目录
目前模型链接仅支持 huggingface 和 civitai 来源的链接,并且要求为安全的模型格式,如 `.safetensors`、`.sft` 等;类似 `.gguf` 的不安全格式在嵌入后会提示不安全且不显示链接
你可以使用[这个工具](https://comfyui-wiki.github.io/ComfyUI-Workflow-JSON-Editor/)来编辑工作流模板中的模型信息,但目前 [@ComfyUI-Wiki](https://github.com/ComfyUI-Wiki) 仅对原生的节点增加了支持。
## 如何更新模板?
目前模板是通过一个单独依赖的形式进行管理和更新的,对应的地址为:[comfyui-workflow-templates](https://pypi.org/project/comfyui-workflow-templates/)
如果你在更新 ComfyUI 后发现没有文档或者我们的更新说明中新增的模板,那么可能是你没有在更新 ComfyUI 之后更新对应的依赖, 你可以访问 [ComfyUI/requirements.txt](https://github.com/comfyanonymous/ComfyUI/blob/master/requirements.txt) 来查看对应的依赖版本
通常在每次更新 ComfyUI 时,我们也可能会同时升级下面这三个依赖:
```
comfyui-frontend-package==1.24.4
comfyui-workflow-templates==0.1.52
comfyui-embedded-docs==0.2.4
```
如果你不了解如何正确更新,请参考 [更新 ComfyUI](/zh-CN/installation/update_comfyui) 章节了解如何更新 ComfyUI 和对应的依赖
## 如何向 ComfyUI 官方贡献模板?
目前所有的模板都托管在 [workflow\_templates](https://github.com/Comfy-Org/workflow_templates/) 仓库中,你可以通过提交 PR 的方式来贡献模板,对于官方模板我们要求如下:
1. 不使用任何的第三方节点(避免用户因为缺失节点导致需要额外的安装)
2. 对应的模板不与已有模板重复, 需要是已支持的相关模型功能
3. 你可以在对应的仓库提交 issue 进行提问
## 自定义节点模板
如果自定义节点作者提供了模板和示例工作流,你也可以在模板浏览里找到,通常只需要找到对应节点名称的分类,即可看到作者提供的所有模板。
如果你是自定义节点作者,请注意目前我们仅支持模板文件夹下的一级目录,并不支持多级目录,且仅支持 JSON 格式的模板
## 如何为自定义节点添加模板?
[自定义节点模板](/zh-CN/custom-nodes/workflow_templates) 中介绍了如何为你开发的自定义节点添加工作流模板,你也可以向自定义节点作者提交 PR 来贡献自定义节点模板
# 遮罩编辑器 - 在 ComfyUI 中创建和编辑遮罩
Source: https://docs.comfy.org/zh-CN/interface/maskeditor
讲解 ComfyUI 中遮罩编辑器(Mask Editor)的使用方法,以及对应设置等
遮罩编辑器是 ComfyUI 中一个非常实用的功能,它可以帮助用户在图像中创建和编辑遮罩,而不需要在其它应用程序中进行操作。
遮罩编辑器目前通过 `Load image` 节点来触发,当你上传图像后,可以在节点上右键通过菜单 `Open in MaskEditor` 来打开遮罩编辑器。
然后你就可以通过鼠标在图像上点击来创建和编辑遮罩了。
## 演示视频
# ComfyUI 界面概览
Source: https://docs.comfy.org/zh-CN/interface/overview
在本篇中,我们将简要介绍 ComfyUI 的基础用户界面,带你熟悉 ComfyUI 界面的各个部分。
可视化界面是目前绝大多数用户使用 ComfyUI 来调用 [ComfyUI Server](/zh-CN/development/comfyui-server/comms_overview) 进行相应媒体资源生成的方式,它提供了一个可供用户操作和组织工作流的可视化界面,用于组织和调试工作流,并生成令人惊叹的作品。
在本篇我们将粗略介绍 ComfyUI 的界面以及各个部分的功能,在后续的章节中我们将详细介绍各个部分的功能和使用方法。
通常,当你启动 ComfyUI 后,你可以看到下面这样的一个界面:
如果你是较为早期的用户,你应该还见过之前的这样的菜单界面:
这两个菜单界面可以通过设置进行切换,但随着 ComfyUI 的功能日益强大和复杂,我们建议你使用新版的菜单界面来获得更好的使用体验。
目前 ComfyUI 前端是一个独立的项目,作为一个独立的 ComfyUI 依赖进行发布和更新维护,如果你想要参与贡献,可以 fork 这个[仓库](https://github.com/Comfy-Org/ComfyUI_frontend),并提交 pull request。
## 本地化支持
目前 ComfyUI 界面支持多种语言,包括英文、中文、俄语、法语、日文、韩文等。
如果你需要切换界面语言到你需要的语言可以点击 **设置齿轮图标** 然后在 `Comfy` --> `Locale` 中选择你需要的语言。
## 新版菜单界面
### 界面分区(Workspace)
下面是主要的 ComfyUI 的界面分区以及各部分的简要介绍。
目前 ComfyUI 的界面除开主要的工作流界面,主要分为以下几个部分:
1. 菜单栏:提供工作流、编辑、帮助菜单,工作流执行、ComfyUI Manager入口等等
2. 侧边栏面板切换按钮:用于切换工作流历史队列、节点库、模型库、本地用户工作流浏览等
3. 切换主题按钮: 在 ComfyUI 默认的暗色主题和亮色主题之间进行快速切换
4. 设置:点击后可打开设置面板
5. 画布菜单: 提供了ComfyUI 画布的视图放大、缩小、自适应操作等
### 菜单栏功能
上图标注了顶部菜单栏包含的常见功能,我们会在具体的功能使用部分再详细介绍对应的功能
### 侧边栏面板按钮
在目前的 ComfyUI 中,我们提供了四个侧边面板包含了以下功能:
1. 工作流历史队列(Queue): 所有 ComfyUI 执行媒体内容生成的队列信息
2. 节点库(Node Library): 所有 ComfyUI 中的节点,包括 `Comfy Core` 和你安装的自定义节点,都可以在这里进行查找
3. 模型库(Model Library): 你本地的`ComfyUI/models` 目录下的模型可以在这里被查找到
4. 本地用户工作流(Workflows): 你本地保存的工作流可以在这里被查找到
## 旧版菜单
目前 ComfyUI 默认启用新版界面,如果你更偏好使用旧版界面,可以点击 **设置齿轮图标** 然后在 `Comfy` --> `菜单(Menu)` 将 `使用新菜单(Use new menu)` 设置为 `disabled` 即可切换到旧版本的菜单。
旧版菜单界面仅支持英文。
旧版本菜单界面功能标注说明如下:
# ComfyUI 3D 设置
Source: https://docs.comfy.org/zh-CN/interface/settings/3d
ComfyUI 3D 设置选项的详细说明
这部分的设置主要用于控制 ComfyUI 中 3D 相关组件的初始化设置,包括相机、光照、场景等,在创建新的3D组件时,会根据这些设置进行初始化,在创建后,这些设置仍然可以单独调整。
## 相机
### 摄像机类型
* **选项**:
* `perspective` (透视)
* `orthographic` (正交)
* **功能**: 控制创建新的3D组件时,默认的相机是透视还是正交。这个默认设置仍然可以在创建后在节点的画布中单独调整
## 光照
这部分菜单用于设置 3D 相关组件的光照设置的预设, 对应的设置在 ComfyUI 的 3D 设置中同样可以进行修改
### 光照调整步长
* **默认值**: 0.5
* **功能**: 控制在3D场景中调整光照强度时的步长。较小的步长值可以实现更精细的光照调整,较大的值则会使每次调整的变化更加明显
### 光照强度下限
* **默认值**: 1
* **功能**: 设置3D场景允许的最小光照强度值。此项定义在调整任何3D控件照明时可设定的最低亮度
### 最大光照强度
* **默认值**: 10
* **功能**: 设置3D场景允许的最大光照强度值。此项定义了在调整任何3D控件照明时可设定的最高亮度上限
### 初始光照强度
* **默认值**: 3
* **功能**: 设置3D场景中灯光的默认亮度级别。该数值决定新建3D控件时灯光照亮物体的强度,但每个控件在创建后都可以单独调整
## 场景
这部分设置允许你定义 3D 节点的默认偏好设置
### 初始背景颜色
* **作用**: 控制3D场景的默认背景颜色。此设置决定新建3D组件时的背景外观,但每个组件在创建后都可以单独调整
* **默认值**: `282828` (深灰色)
修改背景颜色,同样可以在画布中进行调整
### 显示预览
* **作用**: 控制创建新的3D组件时是否默认显示预览屏幕。此默认设置在创建后仍可为每个组件单独切换
* **默认值**: true (开启)
### 显示网格
* **作用**: 控制创建新的3D组件时是否默认显示网格。此默认设置在创建后仍可为每个组件单独切换
* **默认值**: true (开启)
# 关于页面
Source: https://docs.comfy.org/zh-CN/interface/settings/about
ComfyUI 关于设置页详细说明
About 页面是 ComfyUI 设置系统中的一个信息展示面板,用于显示应用程序版本信息、相关链接和系统统计数据,这些设置在向我们提交反馈问题时,可以提供给我们一些非常关键的信息。
### 版本信息徽章
About 页面显示以下核心版本信息:
* **ComfyUI 版本**:显示后端 ComfyUI 的版本号,链接到官方 GitHub 仓库
* **ComfyUI\_frontend 版本**:显示前端界面的版本号,链接到前端 GitHub 仓库
* **Discord 社区**:提供 ComfyOrg Discord 服务器的链接
* **官方网站**:链接到 ComfyOrg 官方网站
由于这里的版本信息主要是对应稳定版本的信息,如果你正在使用的是 nightly 版本,那么这里并不会显示对应的 commit hash 等,如果你正在使用 nightly 版本,可以在对应的 ComfyUI 主目录中使用 `git log` 命令来查看对应的 commit hash 等信息。
另外一个常见的问题是,部分依赖包可能会在更新过程中安装失败并回滚到旧版本,导致版本信息与预期不符
### 自定义节点徽章
如果安装了自定义节点,About 页面还会显示自定义节点提供的额外徽章信息。这些徽章由各个自定义节点通过 `aboutPageBadges` 属性注册。
### 系统统计信息
页面底部显示详细的系统统计信息,包括:
* 硬件配置信息
* 软件环境信息
* 系统性能数据
## 扩展开发者指南
扩展开发者可以通过在扩展配置中添加 `aboutPageBadges` 属性来向 About 页面添加自定义徽章:
```javascript
app.registerExtension({
name: 'MyExtension',
aboutPageBadges: [
{
label: 'My Extension v1.0.0',
url: 'https://github.com/myuser/myextension',
icon: 'pi pi-github'
}
]
})
```
# Comfy 设置
Source: https://docs.comfy.org/zh-CN/interface/settings/comfy
ComfyUI 核心设置选项的详细说明
## API 节点
### 显示 API 节点定价徽章
* **默认值**:启用
* **功能**:控制是否在 API 节点上显示定价徽章,帮助用户识别 API 节点的使用成本
> 更多关于 API 节点请参考 [API 节点](/zh-CN/tutorials/api-nodes/overview)
## 开发者模式
### 启用开发模式选项
* **默认值**:禁用
* **功能**:启用开发者模式选项(如API保存等)
## 编辑令牌权重
### Ctrl+上/下 精度
* **默认值**:0.01
* **功能**:当你在使用类似 CLIPTextEncode 或者文本框输入类的节点组件时,使用 Ctrl+上/下 可以快速调整权重,这个选项会改变每次调整的权重步长
## 区域设置(本地化)
### 语言
* **选项**:中文 (Chinese)、English (英文)、日本語 (日文)、한국어 (韩文)、Русский (俄文)、Español (西班牙语)、Français (法语)
* **默认值**:自动检测浏览器语言
* **功能**:修改 ComfyUI 界面显示语言
## 菜单
### 使用新菜单
* **默认值**:顶部
* **功能**:选择菜单界面和位置,目前仅支持顶部、底部、禁用
菜单界面将会显示在工作界面的顶部
菜单栏界面将会显示工作界面的底部
如果你偏好早期旧版菜单,可以尝试使用这个选项。
由于我们在不断更新,一些新功能支持并不会在旧版菜单中同步支持。
## 模型库
模型库指的是 ComfyUI 侧边菜单栏中的模型管理功能,你可以通过这个功能来查看你在 `ComfyUI/models` 及额外配置的模型文件夹中的模型
### 模型库名称格式
* **默认值**:标题
* **功能**:选择在模型库树视图中显示的名称格式,目前仅支持文件名、标题
### 自动加载所有模型文件夹
* **默认值**:禁用
* **功能**:是否在点击模型库时自动检测所有文件夹下的模型文件,启用可能导致加载延迟(需要循环遍历所有文件夹),禁用时只有点击文件夹名称才会加载对应文件夹下的文件
## 节点
在 ComfyUI 的迭代过程中,我们会对一些节点进行调整,也会弃用一些节点,这些节点可能在未来版本中发生重大变化或被移除,但为了保证兼容性,已弃用的节点并没有被移除,你可以通过下面的设置来控制是否显示 **实验性节点** 和 **已弃用节点**
### 显示已弃用节点
* **默认值**:禁用
* **功能**:控制是否在搜索中显示已弃用的节点,已弃用节点在UI中默认隐藏,但在现有工作流中仍然有效
### 显示实验性节点
* **默认值**:启用
* **功能**:控制是否在搜索中显示实验性节点,实验性节点提供一些新的功能支持,但尚未完全稳定,可能在未来版本中发生变化或被移除
## 节点搜索框
### 节点建议数量
* **默认值**:5
* **功能**:用于修改相关节点上下文菜单中推荐的节点数量,数值越大显示的相关推荐节点数量越多
### 显示节点频率
* **默认值**:禁用
* **功能**:控制是否在搜索结果中显示节点使用频率
### 显示节点ID名称
* **默认值**:禁用
* **功能**:控制是否在搜索结果中显示节点ID名称
### 显示节点类别
* **默认值**:启用
* **功能**:控制是否在搜索结果中显示节点类别,帮助用户了解节点的分类信息
### 节点预览
* **默认值**:启用
* **功能**:控制是否在搜索结果中显示节点预览,方便你快速预览节点
### 节点搜索框
* **默认值**:默认
* **功能**: 选择节点搜索框的实现方式(实验性功能),如果选择 `litegraph(旧版)` 则会切换到早期的 ComfyUI 的搜索框
## 节点组件
### 组件控制模式
* **选项**:之前、之后
* **功能**:控制节点组件值的更新时机是在工作流运行之前还是之后,比如更新 seed 种子的值
### 文本框组件拼写检查
* **默认值**:禁用
* **功能**:控制文本域小部件是否启用拼写检查, 在文本输入时提供拼写检查功能, 这一功能是通过浏览器的 spellcheck 属性实现的
## 队列
### 队列历史大小
* **默认值**:100
* **功能**:控制侧边栏队列历史面板里记录的队列历史大小,数值越大记录的队列历史越多,数量过大时加载页面也会占用更多内存
## 执行按钮
### 批处理计数限制
* **默认值**:100
* **功能**:设置单次点击添加到队列的最大任务数量,防止意外添加过多任务到队列
## 验证和开发者设置
### 验证节点定义
* **默认值**:禁用
* **功能**:控制是否在启动时验证所有节点定义(慢),仅推荐给节点开发者使用,当启用时系统会使用 Zod 模式对每个节点定义进行严格验证,这一功能会消耗更多内存和时间
* **错误处理**:验证失败的节点定义会被跳过,并在控制台输出警告信息
由于需要对所有节点定义进行详细的模式验证,这个功能会显著增加启动时间,因此默认关闭,仅推荐给节点开发者使用
### 校验工作流
* **默认值**:启用
* **功能**:确保工作流的结构和连接正确性, 如果启用,系统会调用 `useWorkflowValidation().validateWorkflow()` 对工作流数据进行验证
* **验证过程**:验证过程包含两个步骤:
* 模式验证: 使用 Zod 模式验证工作流结构
* 链接修复: 检查并修复节点间的连接问题
* **错误处理**:验证失败时会显示错误提示,但不会阻止工作流加载
## 窗口
### 关闭窗口时显示确认
* **默认值**:启用
* **功能**:当存在已修改但未保存的工作流时,控制关闭浏览器窗口或标签页时是否显示确认,防止意外关闭窗口导致未保存的工作流丢失
## 工作流
### 持久化工作流状态并在页面(重新)加载时恢复
* **默认值**:启用
* **功能**:控制是否在页面(重新)加载时恢复工作流状态,在页面刷新后保持工作流内容
### 自动保存
* **默认值**:关闭
* **功能**:控制工作流的自动保存行为,自动保存工作流更改,避免数据丢失
### 自动保存延迟(毫秒)
* **默认值**:1000
* **功能**:设置自动保存的延迟时间,仅在自动保存设置为"延迟后"时生效
### 删除工作流时显示确认
* **默认值**:启用
* **功能**:控制在侧边栏删除工作流时是否显示确认对话框,防止意外删除重要工作流
### 在工作流中保存和恢复视图位置及缩放
* **默认值**:启用
* **功能**:控制是否在工作流中保存和恢复画布位置和缩放级别,在重新打开工作流时恢复之前的视图状态
### 已打开工作流位置
* **选项**:侧边栏、顶部栏、顶部栏(第二行)
* **默认值**:顶部栏
* **功能**:控制打开的工作流标签的显示位置,目前仅支持侧边栏、顶部栏、顶部栏(第二行)
### 保存工作流时提示文件名
* **默认值**:启用
* **功能**:控制保存工作流时是否提示输入文件名,允许用户自定义工作流文件名
### 保存工作流时排序节点ID
* **默认值**:禁用
* **功能**:决定保存工作流时是否对节点ID进行排序,使工作流文件格式更规范,便于版本控制
### 显示缺失节点警告
* **默认值**:启用
* **功能**:控制是否显示工作流中缺失节点的警告,帮助用户识别工作流中不可用的节点
### 显示缺失模型警告
* **默认值**:启用
* **功能**: 我们支持在工作流文件中对 widget 的值添加模型链接信息,用于加载模型文件时的提示,当启用时如果你本地没有对应的模型文件则会显示工作流中缺失模型的警告
### 清除工作流时需要确认
* **默认值**:启用
* **功能**:控制清除工作流时是否显示确认对话框,防止意外清除工作流内容
### 保存节点ID到工作流
* **默认值**:启用
* **功能**:控制是否在保存工作流时保存节点ID,使工作流文件格式更规范,便于版本控制
# ComfyUI 桌面应用通用设置
Source: https://docs.comfy.org/zh-CN/interface/settings/comfy-desktop
ComfyUI 桌面应用通用设置选项的详细说明
## 常规
**窗口样式**
* **默认值**: default
* **说明**: 控制应用窗口的标题栏样式
**自动更新检查**
* **默认值**: 启用
* **说明**: 自动检查 ComfyUI 桌面版的更新,在有可用更新时将会提醒你进行更新
**发送匿名使用情况统计**
* **默认值**: 启用
* **说明**: 发送匿名使用情况统计数据,帮助改进软件,对应的设置修改需要重启才能生效
## UV(包管理器)
这部分设置主要面向中国大陆地区用户,因为 Desktop 使用的许多原始镜像源位于中国境外,对国内用户的访问不一定友好,你可以在这里设置你自己的镜像源,以提高访问速度,保证对应的包可以正常下载。
**Python 安装镜像**
* **默认值**: 空(使用默认源)
* **说明**:
* 管理的 Python 安装包从 Astral python-build-standalone 项目下载
* 可设置镜像 URL 使用不同的 Python 安装源
* 提供的 URL 将替换默认的 GitHub 下载地址
* 支持使用 file:// 协议从本地目录读取分发包
* **验证**: 自动检查镜像可达性
**PyPI 安装镜像**
* **默认值**: 空(使用默认源)
* **说明**: 默认的 pip 包安装镜像源
**Torch安装镜像**
* **默认值**: 空(使用默认源)
* **说明**: PyTorch 专用的 pip 安装镜像源
# 扩展设置
Source: https://docs.comfy.org/zh-CN/interface/settings/extension
ComfyUI 扩展管理和设置选项的详细说明
扩展设置面板是 ComfyUI 前端设置系统中的一个特殊管理面板,专门用于管理前端扩展插件的启用/禁用状态,区别于自定义节点(Custom Node),这个面板只是用于管理自定义节点注册的前端扩展,而不是禁用自定义节点。
这些前端扩展插件是用于增强 ComfyUI 的体验,比如提供快捷键、设置、UI 组件、菜单项等功能。
扩展状态更改后需要重新加载页面才能生效:
## Extension 设置面板功能
### 1. 扩展列表管理
显示所有已注册的扩展,包括:
* 扩展名称
* 核心扩展标识(显示 "Core" 标签)
* 启用/禁用状态
### 2. 搜索功能
提供搜索框快速查找特定扩展:
### 3. 启用/禁用控制
每个扩展都有独立的切换开关:
### 4. 批量操作
提供右键菜单进行批量操作:
* 启用所有扩展
* 禁用所有扩展
* 禁用第三方扩展(保留核心扩展)
## 注意事项
* 扩展状态更改需要重新加载页面才能生效
* 某些核心扩展无法被禁用
* 系统会自动禁用已知有问题的扩展
* 扩展设置会自动保存到用户配置文件中
这个 Extension 设置面板本质上是一个"前端插件管理器",让用户可以灵活控制 ComfyUI 的功能模块。
# ComfyUI 画面(LiteGraph)设置
Source: https://docs.comfy.org/zh-CN/interface/settings/lite-graph
ComfyUI 图形渲染引擎 LiteGraph 的设置选项详细说明
LiteGraph 是 ComfyUI 的底层图形渲染引擎,这个分类下的设置主要控制画布、节点、链接等图形界面的行为和外观。
## 画布相关设置
### 显示选择工具箱
* **默认值**:启用
* **功能**:选择工具箱是选中节点后在节点上浮动显示的快捷操作工具栏,提供了常用的快捷操作如部分运行、固定、删除、颜色修改等等
### 低质量渲染缩放阈值
* **默认值**:0.6
* **范围**:0.1 - 1.0
* **功能**: 在渲染界面时,尤其当工作流非常复杂、画布非常大时,元素的前端渲染会消耗大量内存而造成卡顿。通过调低此阈值,可以让元素在缩放到特定百分比时进入低质量渲染模式,从而降低内存消耗,对应不同渲染模式如下图
### 最大FPS
* **默认值**:0(使用屏幕刷新率)
* **范围**:0 - 120
* **功能**:限制画布的渲染帧率,0 表示使用屏幕刷新率。FPS 越高,画面(Canvas)渲染越流畅,但同时也会消耗更多性能;设置过小时则会有明显的卡顿感。
### 始终吸附到网格
* **默认值**:禁用
* **功能**:在此选项没有启用时,你可以按住 `Shift` 键来使节点边缘和网格对齐,在启用后则无需按住 `Shift` 键即可自动对齐网格
### 吸附网格大小
* **范围**:1 - 500
* **功能**:在启用自动吸附或者按住 `Shift` 键进行节点的移动时,这个参数会决定吸附的网格大小,默认值为 10,你可以根据你的需求进行调整。
### 启用快速缩放快捷键
* **默认值**:启用
* **功能**:启用 `Ctrl + Shift + 鼠标左键拖拽` 的快速缩放功能,提供更快速的缩放操作方式
### 显示图形画布菜单
* **默认值**:启用
* **功能**:控制是否显示右下角的画布菜单
画布菜单位于整个 ComfyUI 界面的右下角,包含了画布的缩放、临时隐藏所有连线、快速缩放工作流到适应画布等操作,如下图所示
### 画布缩放速度
* **默认值**:1.1
* **范围**:1.01 - 2.5
* **功能**:控制画布缩放的速度,调整鼠标滚轮缩放的敏感度
### 在左下角显示画布信息(fps等)
* **默认值**:启用
* **功能**:控制是否在左下角显示画布信息,显示性能指标如 FPS 等
## 上下文菜单
### 放大时缩放节点组合部件菜单(列表)
* **默认值**:启用
* **功能**:控制是否在放大时显示节点组合部件菜单(列表),允许用户选择节点组合部件
## 画面
### 连线渲染模式
* **默认值**:2(Spline样条线)
* **选项**:直线、线性、样条线、隐藏
* **功能**:设置连线的渲染样式,控制节点间连线的视觉样式
## 组
这部分的设置主要和节点组功能相关
### 双击组标题以编辑
* **默认值**:启用
* **功能**:控制是否可以双击组标题进行编辑,允许用户重命名节点组,图中标注为 `1` 的部分
### 分组选中节点填充
* **默认值**:10
* **范围**:0 - 100
* **功能**:设置分组选中节点时的内边距,控制分组框与节点间的间距,图中标注为 `2` 箭头标注部分
## 连线
### 链接中点标记
* **默认值**:Circle(圆形)
* **选项**:无、圆形、箭头
* **功能**:设置链接中点的标记样式,在链接中点显示方向指示
## 释放链接
这部分的菜单目前主要控制当链接连线释放时的相关操作,目前两个相关操作为:
**释放后会出现和当前输入 / 输出相关的节点推荐列表**
**释放后会启动搜索框**
### 链接释放动作(Shift键)
* **默认值**: 搜索框
* **选项**: 上下文菜单、搜索框、无操作
* **功能**:设置按住Shift键释放链接时的动作,按住Shift释放链接时的特殊行为
### 链接释放动作(无修饰键)
* **默认值**: 上下文菜单
* **选项**: 上下文菜单、搜索框、无操作
* **功能**:设置释放链接时的默认动作,控制拖拽链接后释放时的行为
## 节点
### 始终收缩新节点
* **默认值**:启用
* **功能**:控制是否在创建新节点时自动收缩,从而让节点能够始终显示最小的尺寸,但可能会导致添加时有些文本显示会被截断,需要手动调整节点大小
### 启用DOM元素裁剪(启用可能会降低性能)
* **默认值**:启用
* **功能**:启用 DOM 元素裁剪以优化渲染,但可能降低性能
### 中键单击创建新的转接点
* **默认值**:启用
* **功能**:中键点击时创建新的重路由节点,快速创建用于整理连线的重路由节点
### 删除节点时保留连线
* **默认值**:启用
* **功能**:删除中间节点时自动绕过连接,删除节点时尝试重新连接其输入输出链接
### 吸附高亮节点
* **默认值**:启用
* **功能**:拖拽链接到节点时高亮显示节点,提供视觉反馈,显示可连接的节点,启用后效果如下图,对应链接的一侧会显示高亮的样式
### 连线自动吸附到节点接口
* **默认值**:启用
* **功能**:拖拽链接到节点上时自动吸附到可用插槽,简化连接操作,自动找到合适的输入插槽
### 启用工具提示
* **默认值**:启用
* **功能**:在部分节点信息中会包含一些工具提示,包含了一些参数说明等,当启用后会在鼠标悬停时显示这些工具提示,如下图
### 工具提示延迟
* **默认值**:500
* **功能**:控制工具提示的延迟时间,单位为毫秒,设置为0表示立即显示工具提示
### 节点制作周期标签
* **默认值**:ShowAll(显示全部)
* **功能**:控制节点生命周期标记的显示,显示节点的状态信息
### 节点ID标签
* **默认值**:None(不显示)
* **功能**:控制节点ID标记的显示,显示节点的唯一标识符
### 节点源标签
* **选项**:
* None(不显示)
* HideBuiltIn(隐藏内置)
* ShowAll(显示全部)
* **功能**:控制节点源标记的显示模式,显示节点来源信息,对应的显示效果如下图,如果显示全部则会显示自定义节点和内置节点的标签,方便你判断对应的节点来源,对应小狐狸标志为 ComfyUI 内置节点
### 双击节点标题以编辑
* **默认值**:启用
* **功能**:控制是否可以双击节点标题进行编辑,允许用户重命名节点
## 节点组件
### 浮点组件四舍五入的小数位数 \[0 = 自动]
* **默认值**:0(自动)
* **范围**:0 - 6
* **功能**:设置浮点小部件四舍五入的小数位数,0表示自动,需要页面重新加载
### 禁用默认浮点组件四舍五入
* **默认值**:禁用
* **功能**:控制是否禁用默认的浮点小部件四舍五入,需要页面重新加载,当节点后端设置了四舍五入时无法禁用
### 禁用节点组件滑块
* **默认值**:禁用
* **功能**:控制是否禁用节点小部件中的滑块控件,强制使用文本输入而非滑块
### 预览图像格式
* **默认值**:空字符串(使用原格式)
* **功能**:设置图像小部件中预览图像的格式,转换为轻量级格式如 webp、jpeg 等
### 在图像预览下方显示宽度×高度
* **默认值**:启用
* **功能**:在图像预览下方显示宽度×高度信息,显示图像的尺寸信息
## 指针
### 启用触控板手势
* **默认值**:启用
* **功能**:此设置为画布启用触控板模式,允许使用双指捏合缩放和拖动。
### 双击间隔(最大)
* **默认值**:300
* **功能**:双击的两次点击之间的最大时间(毫秒)。增加此值有助于解决双击有时未被识别的问题。
### 指针点击漂移延迟
* **默认值**:150
* **功能**:按下指针按钮后,忽略指针移动的最大时间(毫秒)。有助于防止在点击时意外移动鼠标。
### 指针点击漂移(距离)
* **默认值**:6
* **功能**:如果指针在按住按钮时移动超过此距离,则视为拖动(而不是点击)。有助于防止在点击时意外移动鼠标
## 重新路由
### 重新路由样条偏移
* **默认值**:20
* **功能**:用于确定重路由节点两侧的曲线的平滑程度,值越大,曲线越平滑,值越小,曲线越尖锐
# ComfyUI 遮罩编辑器设置
Source: https://docs.comfy.org/zh-CN/interface/settings/mask-editor
ComfyUI 遮罩编辑器设置选项的详细说明
## 画笔调整
### 画笔调整速度倍增器
* **功能**: 控制调整时画笔大小和硬度变化的速度
* **说明**: 更高的值意味着更快的变化
### 将画笔调整锁定到主轴
* **功能**: 启用后,画笔调整将仅根据您移动的方向影响大小或硬度
* **说明**: 这个功能可以让用户更精确地控制画笔属性的调整
## 新编辑器
### 使用新画笔编辑器
* **功能**: 切换到新的画笔编辑器界面
* **说明**: 允许用户在新旧编辑器界面之间切换
新版本具有更好的 UI 界面和交互,功能会更加完整
旧版本是早期版本,功能比较简单,但是可以满足基本需求,后续不会再进行更新
# ComfyUI 设置概览
Source: https://docs.comfy.org/zh-CN/interface/settings/overview
ComfyUI 设置概览的详细说明
这个部分是关于 ComfyUI 前端设置菜单中详细的设置说明,所有的用户设置将会自动保存到 `ComfyUI/user/default/comfy.settings.json` 文件中
你可以使用 `ctrl + ,` 快捷键来打开设置面板, 然后点击对应的设置选项进行设置。
由于自定义节点也可以在菜单中注册对应的设置类目,在我们的官方文档说明中目前仅包含原生的设置内容,另外有部分选项设置 **仅针对 ComfyUI 桌面版** 有效,我们也在对应页面中做了注释说明。
## ComfyUI 设置菜单
用户设置与 ComfyUI 账户相关,主要用于登录 ComfyUI 账户,以使用 API 节点
购买积分及积分余额的历史入口,仅在登录 ComfyUI 账户后可见
ComfyUI 核心设置选项的详细说明
ComfyUI 中画面(Lite Graph)设置选项的详细说明
修改 ComfyUI 的外观选项,如主题、背景色、侧边栏位置等等
管理 ComfyUI 中前端扩展插件的启用/禁用状态
针对 3D 节点初始化时的一些设置选项
桌面版更新设置,镜像设置等(仅针对 ComfyUI 桌面版有效)
调整蒙版编辑器的使用偏好
修改 ComfyUI 的快捷键设置
了解当前的 ComfyUI 版本信息,设备运行信息等,这在日常反馈中非常有用
修改 ComfyUI 的配置文件, 此设置仅针对 ComfyUI 桌面版有效
# 服务器配置
Source: https://docs.comfy.org/zh-CN/interface/settings/server-config
ComfyUI 服务器配置选项的详细说明
目前 `Server-Config` 设置菜单仅在 Desktop 版本中存在,其它版本中并无此设置菜单项目
## 网络
### 主机:要监听的IP地址
* **主机地址 (listen)**:
* **功能**:设置服务器绑定的IP地址。默认 `127.0.0.1` 表示只允许本地访问,如果需要局域网访问可设置为 `0.0.0.0`
虽然我们为 Desktop 版本提供了局域网监听设置的能力,但是作为桌面版,它本身并不适合作为服务器使用,如果你需要将 ComfyUI 实例作为局域网内的公共服务使用,我们建议参考手动部署教程部署对应的 ComfyUI 服务。
### 端口:要监听的端口
**功能**:服务器监听的端口号。桌面版默认8000端口,Web版通常使用8188端口
### TLS 密钥文件:HTTPS 的 TLS 密钥文件路径
**功能**:HTTPS加密所需的私钥文件路径,用于建立安全连接
### TLS 证书文件:HTTPS 的 TLS 证书文件路径
**功能**:HTTPS加密所需的证书文件路径,与私钥配合使用
### 最大上传大小
* **最大上传大小 (max-upload-size)**:
* **功能**:限制单个文件上传的最大尺寸,单位为 MB,默认 100MB。影响图片、模型等文件的上传限制
## CUDA
### 要使用的 CUDA 设备索引
**功能**:指定使用哪块NVIDIA显卡。0表示第一块显卡,1表示第二块,以此类推。对多GPU系统很重要
### 使用 CUDA malloc 进行内存分配
**功能**:控制是否使用CUDA的内存分配器。可以改善某些情况下的内存管理效率
## 推理
### 全局浮点精度
**功能**:设置模型计算的数值精度。FP16节省显存但可能影响质量,FP32更精确但占用更多显存
### UNET 精度
**选项**:
* `auto`:自动选择最合适的精度
* `fp64`:64位浮点精度,精度最高但显存占用最大
* `fp32`:32位浮点精度,标准精度
* `fp16`:16位浮点精度,可节省显存
* `bf16`:16位brain浮点精度,介于fp16和fp32之间
* `fp8_e4m3fn`:8位浮点精度(e4m3),显存占用最小
* `fp8_e5m2`:8位浮点精度(e5m2),显存占用最小
**功能**:专门控制扩散模型核心组件UNET的计算精度。更高的精度可以提供更好的图像生成质量,但会占用更多显存。较低的精度可以显著节省显存,但可能会影响生成结果的质量。
### VAE 精度
**选项与建议**:
* `auto`:自动选择最合适的精度,推荐8-12GB显存的用户使用
* `fp16`:16位浮点精度,推荐6GB及以下显存的用户使用,可节省显存但可能影响质量
* `fp32`:32位浮点精度,推荐16GB及以上显存且追求最佳质量的用户使用
* `bf16`:16位brain浮点精度,推荐支持此格式的新型显卡使用,可获得更好的性能平衡
**功能**:控制变分自编码器(VAE)的计算精度,影响图像编码/解码的质量和速度。更高的精度可以提供更好的图像重建质量,但会占用更多显存。较低的精度可以节省显存,但可能会影响图像的细节还原。
### 文本编码器精度
**选项**:
* `auto`:自动选择最合适的精度
* `fp8_e4m3fn`:8位浮点精度(e4m3),显存占用最小
* `fp8_e5m2`:8位浮点精度(e5m2),显存占用最小
* `fp16`:16位浮点精度,可节省显存
* `fp32`:32位浮点精度,标准精度
**功能**:控制文本提示词编码器的计算精度,影响文本理解的准确性和显存占用。更高的精度可以提供更准确的文本理解,但会占用更多显存。较低的精度可以节省显存,但可能会影响提示词的解析效果。
## 内存
### 强制使用 channels-last 内存格式
**功能**:改变内存中数据的排列方式,可能提升某些硬件上的性能
### DirectML 设备索引
**功能**:在Windows上使用DirectML加速时指定设备,主要用于AMD显卡
### 禁用IPEX优化
**功能**:关闭Intel CPU优化,主要影响Intel处理器的性能
### VRAM 管理模式
**选项**:
* `auto`:自动管理显存,根据模型大小和需求自动分配显存
* `lowvram`:低显存模式,只使用最低限度的显存,可能会影响生成质量
* `normalvram`:标准显存模式,平衡显存使用和性能
* `highvram`:高显存模式,使用较多显存以获得更好性能
* `novram`:不使用显存,完全使用系统内存运行
* `cpu`:仅使用CPU运行,不使用显卡
**功能**:控制显存的使用策略,如自动管理、低显存模式等
### 保留VRAM
**功能**:为操作系统和其他程序预留的显存量,防止系统卡死
### 禁用智能内存管理
**功能**:关闭自动内存优化,强制将模型移到系统内存以释放显存
### CPU 运行 VAE
**功能**:强制VAE在CPU上运行,可以节省显存但会降低处理速度
## 预览
### 用于潜空间预览的方法
**选项**:
* `none`: 不显示预览图像,生成过程中只显示进度条
* `auto`: 自动选择最合适的预览方法,根据系统性能和显存情况动态调整
* `latent2rgb`: 直接将潜空间数据转换为RGB图像进行预览,速度较快但质量一般
* `taesd`: 使用轻量级的TAESD模型进行预览,在速度和质量之间取得平衡
**功能**: 控制生成过程中如何预览中间结果。不同的预览方法会影响预览的质量和性能消耗。选择合适的预览方法可以在预览效果和系统资源占用之间找到平衡点。
### 预览图像大小
**功能**:设置预览图像的分辨率,影响预览清晰度和性能,尺寸越大,预览质量越高,但也会占用更多显存
### 算法优化
* **确定性算法 (deterministic)**:
启用后使用确定性算法,相同输入会产生相同输出,但计算速度较慢
* **快速模式 (fast)**:
启用实验性优化,可能提升速度但可能影响生成质量
## 缓存
### 经典缓存系统
**功能**:使用传统的缓存策略,更保守但稳定
### 使用 LRU 缓存,最多缓存 N 个节点结果
**功能**:使用最近最少使用(Least Recently Used)算法的缓存系统,可以缓存指定数量的节点计算结果
**说明**:
* 通过设置一个具体的数字来控制最大缓存数量,如 10、50、100 等
* 缓存可以避免重复计算相同的节点操作,提高工作流执行速度
* 当缓存达到上限时,会自动清除最久未使用的结果
* 缓存的结果会占用系统内存(RAM/VRAM),数值越大占用越多
**使用建议**:
* 默认值为 null,表示不启用 LRU 缓存
* 根据系统内存容量和使用需求设置合适的缓存数量
* 对于经常重复使用相同节点配置的工作流,建议启用此功能
* 如果系统内存充足,可以设置较大的数值以获得更好的性能提升
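LRU 缓存的淘汰行为可以用下面这个简化的 Python 草图来理解。这仅是对"最近最少使用"策略的示意,`NodeResultCache` 是假设的类名,并非 ComfyUI 的实际实现:

```python
from collections import OrderedDict

class NodeResultCache:
    """简化的 LRU 缓存示意:最多保留 max_size 个节点计算结果。"""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self._cache = OrderedDict()

    def get(self, key):
        if key not in self._cache:
            return None
        self._cache.move_to_end(key)          # 命中后标记为最近使用
        return self._cache[key]

    def put(self, key, result):
        self._cache[key] = result
        self._cache.move_to_end(key)
        if len(self._cache) > self.max_size:  # 超出上限时淘汰最久未使用的结果
            self._cache.popitem(last=False)

cache = NodeResultCache(max_size=2)
cache.put("node_a", "result_a")
cache.put("node_b", "result_b")
cache.get("node_a")             # node_a 变为最近使用
cache.put("node_c", "result_c") # 超出上限,最久未使用的 node_b 被淘汰
```

可以看到,缓存数量上限越大,保留的历史结果越多,占用的内存也就越多,这正是上文建议根据内存容量来设置数值的原因。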
## 注意力
### 交叉注意力方法
**选项**:
* `auto`: 自动选择最合适的注意力计算方法
* `split`: 分块计算注意力,可以节省显存但速度较慢
* `quad`: 使用四分注意力算法,在速度和显存使用上取得平衡
* `pytorch`: 使用PyTorch原生注意力计算,速度较快但显存占用大
**功能**: 控制模型计算注意力时使用的具体算法。不同的算法会在生成质量、速度和显存占用之间做出不同的权衡。通常建议使用auto自动选择。
* **强制upcast-attention (force-upcast-attention)**:
强制使用高精度计算注意力,提升质量但增加显存使用
* **禁用upcast-attention (dont-upcast-attention)**:
禁用高精度注意力计算,节省显存
## 常规
### 禁用xFormers优化
**功能**:关闭 xFormers 库的优化功能。xFormers 是一个专门优化 Transformer 模型注意力机制的库,通常可以提高计算效率、减少内存使用并加快推理速度。禁用此优化后会:
* 回退到标准的注意力计算方法
* 可能增加内存使用和计算时间
* 在某些情况下提供更稳定的运行环境
**使用场景**:
* 遇到与 xFormers 相关的兼容性问题时
* 需要更精确的计算结果时(某些优化可能影响数值精度)
* 在调试或排查问题时需要使用标准实现
### 模型文件的默认哈希函数
**选项**:
* `sha256`: 使用 SHA-256 算法进行哈希校验,安全性高但计算较慢
* `sha1`: 使用 SHA-1 算法,速度较快但安全性稍低
* `sha512`: 使用 SHA-512 算法,提供最高安全性但计算最慢
* `md5`: 使用 MD5 算法,速度最快但安全性较低
**功能**:设置模型文件校验的哈希算法,用于验证文件完整性。不同的哈希算法在计算速度和安全性之间有不同的权衡。通常建议使用 sha256 作为默认选项,它能在安全性和性能之间取得较好的平衡。
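文件哈希校验本身可以用 Python 标准库 `hashlib` 来演示。下面是一个分块计算文件哈希的示意函数(`file_hash` 为假设的函数名,与 ComfyUI 内部实现无关):

```python
import hashlib

def file_hash(path: str, algo: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """分块读取文件并计算指定算法的哈希值,避免将大模型文件一次性载入内存。"""
    h = hashlib.new(algo)  # algo 可为 "sha256"、"sha1"、"sha512"、"md5" 等
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

对几十 GB 的模型文件来说,分块读取是必要的;不同算法之间的差异只在于 `hashlib.new()` 的参数,这也解释了为什么切换哈希算法只影响速度与安全性,不影响校验流程本身。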
### 使 pytorch 在可以时使用较慢的确定性算法
**功能**: 强制 PyTorch 在可能的情况下使用确定性算法,以提高结果的可重现性。
**说明**:
* 启用后 PyTorch 会优先使用确定性算法而不是更快的非确定性算法
* 相同的输入将产生相同的输出,有助于调试和结果验证
* 确定性算法通常比非确定性算法运行更慢
* 即使启用此设置,也不能完全保证在所有情况下都能产生完全相同的图像结果
**使用场景**:
* 科学研究需要严格的结果可重现性
* 调试过程中需要稳定的输出结果
* 生产环境中需要保证结果一致性
### 不打印服务器输出
**功能**:禁止在控制台显示服务器运行信息,保持界面整洁。
**说明**:
* 启用后将不显示 ComfyUI 服务器的日志和运行信息
* 可以减少控制台的信息干扰,使界面更加清爽
* 在大量日志输出时可能略微提升系统性能
* 默认为关闭状态(false),即默认显示服务器输出
**使用场景**:
* 生产环境中不需要查看调试信息时
* 希望保持控制台界面整洁时
* 系统运行稳定无需监控日志时
**注意**:在开发和调试过程中建议保持此选项关闭,以便及时查看服务器的运行状态和错误信息。
### 禁用在文件中保存提示元数据
**功能**:不在生成的图片中保存工作流信息,减少文件大小,但同时也意味着对应工作流信息的缺失,你无法再使用工作流输出的文件来重现对应的生成结果
### 禁用所有自定义节点
**功能**:禁止加载所有第三方扩展节点,通常用于在排查问题时使用,用于来定位对应的错误是否由于第三方扩展节点导致
### 日志详细级别
**功能**:控制日志输出的详细程度,用于调试和监控系统运行状态。
**选项**:
* `CRITICAL`: 仅输出严重错误信息,这些错误可能导致程序无法继续运行
* `ERROR`: 输出错误信息,表示某些功能无法正常工作
* `WARNING`: 输出警告信息,表示可能存在的问题但不影响主要功能
* `INFO`: 输出一般信息,包括系统运行状态和重要操作记录
* `DEBUG`: 输出最详细的调试信息,包括系统内部运行的细节
**说明**:
* 日志级别从上到下详细程度递增
* 每个级别都会包含比它更高级别的所有日志信息
* 建议在正常使用时设置为 INFO 级别
* 在排查问题时可以设置为 DEBUG 级别以获取更多信息
* 在生产环境中可以设置为 WARNING 或 ERROR 级别以减少日志量
## 目录
### 输入目录
**功能**:设置输入文件(如图片、模型)的默认存放路径
### 输出目录
**功能**:设置生成结果的保存路径
# ComfyUI 的快捷键及自定义设置
Source: https://docs.comfy.org/zh-CN/interface/shortcuts
ComfyUI 的键盘和鼠标快捷键及相关设置
目前 ComfyUI 已经支持快捷键自定义,你可以在点击 `设置(齿轮图标)` --> `快捷键` 中进行快捷键的设置。
在对应菜单中,你可以看到目前 ComfyUI 所有的快捷键设置,点击对应命令之前的`编辑图标`,就可以对快捷键进行自定义。
| 快捷键 | 命令 |
| ------------------------------- | ------------------------------ |
| Ctrl + Enter | 执行提示词 |
| Ctrl + Shift + Enter | 执行提示词(前端) |
| Ctrl + Alt + Enter | 中断 |
| Ctrl + Z / Ctrl + Y | 撤销/重做 |
| Ctrl + S | 保存工作流 |
| Ctrl + O | 加载工作流 |
| Ctrl + A | 选择所有节点 |
| Alt + C | 折叠/展开选定节点 |
| Ctrl + M | 静音/取消静音选定节点 |
| Ctrl + B | 忽略/取消忽略选定节点 |
| Delete / Backspace | 删除选定节点 |
| Backspace | 清除工作流 |
| Space | 按住并移动光标时移动画布 |
| Ctrl + Click / Shift + Click | 将点击的节点添加到选择中 |
| Ctrl + C/Ctrl + V | 复制并粘贴选定节点(不保持与未选定节点输出的连接) |
| Ctrl + C/Ctrl + Shift + V | 复制并粘贴选定节点(保持未选定节点输出到粘贴节点输入的连接) |
| Shift + Drag | 同时移动多个选定节点 |
| Ctrl + G | 添加框到选中节点 |
| Ctrl + , | 显示设置对话框 |
| Alt + = | 放大(画布) |
| Alt + - | 缩小(画布) |
| . | 适应视图到选中节点 |
| P | 固定/取消固定选中项 |
| Q | 切换执行队列侧边栏 |
| W | 切换工作流侧边栏 |
| N | 切换节点库侧边栏 |
| M | 切换模型库侧边栏 |
| Ctrl + \` | 切换日志底部面板 |
| F | 切换焦点模式(全屏) |
| R | 刷新节点定义 |
| 双击左键 | 快速搜索要添加的节点 |
| 快捷键 | 说明 |
| -------------------------------- | ------------------------------ |
| Cmd ⌘ + Enter | 执行提示词 |
| Cmd ⌘ + Shift + Enter | 执行提示词(前端) |
| Cmd ⌘ + Alt + Enter | 中断 |
| Cmd ⌘ + Z/Cmd ⌘ + Y | 撤销/重做 |
| Cmd ⌘ + S | 保存工作流 |
| Cmd ⌘ + O | 加载工作流 |
| Cmd ⌘ + A | 选择所有节点 |
| Opt ⌥ + C | 折叠/展开选定节点 |
| Cmd ⌘ + M | 静音/取消静音选定节点 |
| Cmd ⌘ + B | 忽略/取消忽略选定节点 |
| Delete / Backspace | 删除选定节点 |
| Backspace | 清除工作流 |
| Space | 按住并移动光标时移动画布 |
| Cmd ⌘ + Click / Shift + Click | 将点击的节点添加到选择中 |
| Cmd ⌘ + C / Cmd ⌘ + V | 复制并粘贴选定节点(不保持与未选定节点输出的连接) |
| Cmd ⌘ + C / Cmd ⌘ + Shift + V | 复制并粘贴选定节点(保持未选定节点输出到粘贴节点输入的连接) |
| Shift + Drag | 同时移动多个选定节点 |
| Cmd ⌘ + G | 添加框到选中节点 |
| Cmd ⌘ + , | 显示设置对话框 |
| Opt ⌥ + = | 放大(画布) |
| Opt ⌥ + - | 缩小(画布) |
| . | 适应视图到选中节点 |
| P | 固定/取消固定选中项 |
| Q | 切换执行队列侧边栏 |
| W | 切换工作流侧边栏 |
| N | 切换节点库侧边栏 |
| M | 切换模型库侧边栏 |
| Cmd ⌘ + \` | 切换日志底部面板 |
| F | 切换焦点模式(全屏) |
| R | 刷新节点定义 |
| 双击左键 | 快速搜索要添加的节点 |
# ComfyUI 账号管理
Source: https://docs.comfy.org/zh-CN/interface/user
在本篇中我们将介绍 ComfyUI 的账号管理功能,包括账号的登录、注册、注销等操作。
账号系统是为了支持 `API Nodes` 节点而新增的,`API Nodes`支持了对闭源模型 API 的调用,这大大扩展了 ComfyUI 的可能性,由于对应的 API 调用需要消耗 Token,所以我们增加了对应的用户系统。
当前我们支持以下几种登录方式:
* 邮箱登录
* Google 登录
* Github 登录
* API Key 登录(非白名单网站授权使用)
相关的登录要求及说明,我们会在本篇文档中进行对应介绍
## ComfyUI 版本要求
你可能至少需要使用 [ComfyUI v0.3.30](https://github.com/comfyanonymous/ComfyUI/releases/tag/v0.3.30) 版本才能使用账号系统,确保对应前端版本至少为`1.17.11`,有时候前端可能会安装失败而导致回滚到旧版本,所以请在`设置` -> `关于` 查看前端版本是否大于`1.17.11`
在部分地区可能会因为网络限制无法正常访问登录 API 导致登录超时或者失败, 在登录前请**确保你的网络环境不会被限制对应 API 的访问**,保证能够正常访问 Google 或者 Github 等网站。
由于我们仍在快速迭代更新,所以相关功能可能会有变动,如果没有特殊情况,请尽量更新到最新版本以获取相关功能支持。
## 网络要求
要使用API,你必须处于安全的网络环境中:
* 允许从`127.0.0.1`或`localhost`访问。
* 不支持使用`--listen`参数通过局域网访问API节点。
* 在未启用 SSL 证书即非 `https` 开头的站点你可能无法成功登录
* 你可能无法在不在我们白名单的站点中登录(可以通过 API Key 登录)
* 确保你能够正常连接我们的服务(在某些地区可能需要代理来访问)。
## 如何进行登录
在 `设置` -> `用户` 中进行登录:
## 登录方式
如果是首次登录,请首先创建一个账户。
## 使用 API Key 进行登录
由于目前并非所有 ComfyUI 的相关部署都是在我们的域名授权白名单中,所以在通过非白名单的网站登录时,在近期(2025-05-10)的更新中我们提供了 API Key 登录,下面是使用 API Key 登录的相关步骤:
在登录弹窗中选择 `Comfy API Key` 登录
1. 输入你的 API Key,并保存即可
2. 如果你没有 API Key,点击 `在这里获取` 链接跳转到 [https://platform.comfy.org/login](https://platform.comfy.org/login) 并登录获取。
登录成功后你可以在设置菜单中看到对应的 API Key 登录信息
请参考下列步骤进行 API Key 的申请和获取:
请访问 [https://platform.comfy.org/login](https://platform.comfy.org/login) 并使用对应的账号进行登录
在 API Keys 点击 `+ New` 创建 API Key
1. (必填)输入 API Key 名称,
2. 点击 `Generate` 创建
由于 API Key 仅在第一次创建时可见,所以请在创建后立即保存,后续无法查看,请妥善保存,请注意不要公开分享你的 API Key,如果对应的 Key 已经泄露,请及时删除并创建新的 API Key。
对于不使用的 API Key,或者存在泄露风险的 API Key,你可以点击 `Delete` 进行删除,以免造成不必要的损失。
如果你已经获取到了 API Key,并且是在公共的设备上登录的,请及时退出账号登录。
## 登录后状态
登录后,ComfyUI 界面顶部菜单栏会显示登录按钮,可以通过该按钮打开对应的登录界面,并可以在设置菜单中退出对应的账号
## 常见问题
我们不会对登录设备进行限制,你可以在任何设备上登录你的账号,但是请注意,你的账号信息可能会被其他设备访问,所以请不要在公共设备上登录你的账号。
目前局域网内仅支持通过 API Key 登录,如果你是通过局域网访问 ComfyUI 相关服务,请使用 API Key 来进行登录。
由于我们的登录服务设置了白名单,所以你可能在一些服务端上部署的 ComfyUI 无法正常登录,对于此类情况,你可以使用 API Key 登录来解决。
# 自定义节点 CI/CD
Source: https://docs.comfy.org/zh-CN/registry/cicd
## 简介
在对自定义节点进行更改时,在 Comfy 或其他自定义节点中出现问题并不罕见。在每种操作系统和不同的 Pytorch 配置上进行测试通常是不现实的。
### 使用 Github Actions 运行 Comfy 工作流
[Comfy-Action](https://github.com/Comfy-Org/comfy-action) 允许您在 Github Actions 上运行 Comfy workflow\.json 文件。它支持下载模型、自定义节点,并可在 Linux/Mac/Windows 上运行。
### 结果
输出文件会上传到 [CI/CD 仪表板](https://comfyci.org),可以在提交新更改或发布自定义节点的新版本之前作为最后一步查看。
# 概述
Source: https://docs.comfy.org/zh-CN/registry/overview
## 简介
注册表(Registry)是一个自定义节点的公共集合。开发者可以发布、版本控制、弃用和跟踪与其自定义节点相关的指标。ComfyUI 用户可以从注册表中发现、安装和评价自定义节点。
## 为什么使用 Registry?
注册表通过标准化自定义节点的开发来帮助社区:
**节点版本控制:** 开发者经常发布其自定义节点的新版本,这往往会破坏依赖它们的工作流。通过使用[语义化版本控制](https://semver.org/),用户现在可以选择安全地升级、弃用或锁定其节点版本,提前了解其操作将如何影响其工作流。工作流 JSON 将存储所使用的节点版本,因此您可以始终可靠地重现您的工作流。
**节点安全性:** 注册表将作为 [ComfyUI-manager](https://github.com/comfy-org/ComfyUI-Manager) 的后端。所有节点都将被扫描是否存在恶意行为,如自定义 pip 包、任意系统调用等。通过这些检查的节点将在 ComfyUI-Manager 中其名称旁边显示验证标志。有关安全标准列表,请参阅[标准](/zh-CN/registry/standards)。
**搜索:** 在 Registry 上搜索所有节点,为您的工作流找到现有节点。
## 发布节点
按照[教程](/zh-CN/registry/publishing)开始发布您的第一个节点。
## 常见问题
是的,Registry 上的自定义节点具有全局唯一名称,这允许 Comfy Workflow JSON 文件唯一标识任何自定义节点而不会发生冲突。
查看[标准](/zh-CN/registry/standards)了解更多信息。
一旦自定义节点版本发布,就不能更改。这确保用户可以依赖自定义节点随时间推移的稳定性。
自定义节点使用[语义化版本控制](https://semver.org/)进行版本控制。这使用户能够理解升级到新版本的影响。
您可以在 Comfy Registry 网站上通过点击 **更多操作 > 弃用** 来弃用版本。安装此版本的用户将看到弃用消息,并被鼓励升级到更新的版本。
当发布后发现问题时,弃用版本很有用。
# 发布节点
Source: https://docs.comfy.org/zh-CN/registry/publishing
## 设置注册表账户
按照以下步骤设置注册表账户并发布您的第一个节点。
### 观看教程
### 创建发布者
发布者是一个可以向注册表(registry)发布自定义节点的身份。每个自定义节点都需要在 pyproject.toml 文件中包含发布者标识符。
访问 [Comfy Registry](https://registry.comfy.org),创建一个发布者账户。您的发布者 ID 是全球唯一的,并且之后不能更改,因为它用于您的自定义节点的 URL 中。
您的发布者 ID 可以在个人资料页面上 `@` 符号后面找到。
### 创建用于发布的 API 密钥
访问[这里](https://registry.comfy.org/nodes)并点击你想要为其创建 API 密钥的发布者。这将用于通过 CLI 发布自定义节点。
为 API 密钥命名并将其安全保存。如果密钥丢失,请重新创建一个新的密钥。
### 添加元数据
安装 comfy-cli 了吗?
如果没有请 [先安装它](/zh-CN/comfy-cli/getting-started)。
```bash
comfy node init
```
这个命令将会生成下面这样的元数据:
```toml
# pyproject.toml
[project]
name = "" # Unique identifier for your node. Immutable after creation.
description = ""
version = "1.0.0" # Custom Node version. Must be semantically versioned.
license = { file = "LICENSE.txt" }
dependencies = [] # Filled in from requirements.txt
[project.urls]
Repository = "https://github.com/..."
[tool.comfy]
PublisherId = "" # TODO (fill in Publisher ID from Comfy Registry Website).
DisplayName = "" # Display name for the Custom Node. Can be changed later.
Icon = "https://example.com/icon.png" # SVG, PNG, JPG or GIF (MAX. 800x400px)
```
将此文件添加到您的仓库中。查看[规范](/zh-CN/registry/specifications)以获取有关 pyproject.toml 文件的更多信息。
## 发布到注册表(registry)
### 选项 1: Comfy CLI
运行下面的命令手动将您的节点发布到注册表。
```bash
comfy node publish
```
此时会提示您输入 API 密钥。
```bash
API Key for publisher '': ****************************************************
...Version 1.0.0 Published.
See it here: https://registry.comfy.org/publisherId/your-node
```
请记住,API 密钥默认是隐藏的。
当使用 CTRL+V 复制粘贴时,您的 API 密钥可能会有一个额外的 \x16 在后面,例如: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\x16。
建议通过右键点击复制粘贴您的 API 密钥。
### 选项 2: Github Actions
通过 Github Actions 自动发布您的节点。
前往 Settings -> Secrets and Variables -> Actions -> Under Secrets Tab and Repository secrets -> New Repository Secret.
创建一个名为 `REGISTRY_ACCESS_TOKEN` 的 secret 并存储您的 API 密钥作为值。
复制下面的代码并粘贴到 `/.github/workflows/publish_action.yml`
```yaml
name: Publish to Comfy registry
on:
workflow_dispatch:
push:
branches:
- main
paths:
- "pyproject.toml"
jobs:
publish-node:
name: Publish Custom Node to registry
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v4
- name: Publish Custom Node
uses: Comfy-Org/publish-node-action@main
with:
personal_access_token: ${{ secrets.REGISTRY_ACCESS_TOKEN }} ## Add your own personal access token to your Github Repository secrets and reference it here.
```
如果您的分支名称不是 `main`,例如 `master`,请在 branches 部分添加名称。
推送对 `pyproject.toml` 中版本号的更新后,您应该能在注册表中看到更新后的节点。
Github Action 会自动在您每次推送 `pyproject.toml` 文件的更新时运行。
# pyproject.toml
Source: https://docs.comfy.org/zh-CN/registry/specifications
# 规范
`pyproject.toml` 文件包含两个主要的 ComfyUI 自定义节点部分:`[project]` 和 `[tool.comfy]`。以下是每个部分的规范。
## \[project] 部分
### name(必需)
节点 ID 唯一标识自定义节点,并将用于注册表中的 URL。用户可以通过引用此名称来安装节点:
```bash
comfy node install
```
**要求:**
* 必须小于 100 个字符
* 只能包含字母、数字、连字符、下划线和句点
* 不能有连续的特殊字符
* 不能以数字或特殊字符开头
* 不区分大小写比较
**最佳实践:**
* 使用简短、描述性的名称
* 不要在名称中包含 "ComfyUI"
* 使其易于记忆和输入
**示例:**
```toml
name = "image-processor" # ✅ Good: Simple and clear
name = "super-resolution" # ✅ Good: Describes functionality
name = "ComfyUI-enhancer" # ❌ Bad: Includes ComfyUI
name = "123-tool" # ❌ Bad: Starts with number
```
更多详细信息,请参阅[官方 Python 文档](https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#name)。
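上述命名规则可以用一个简短的正则校验来表达。下面是一个示意性的 Python 实现(`is_valid_node_name` 等函数名为假设,并非注册表的官方校验逻辑;规则以注册表实际行为为准):

```python
import re

# 规则示意:以字母开头;仅字母数字、连字符、下划线、句点;
# 特殊字符后必须紧跟字母数字(即不允许连续特殊字符)。
NAME_RE = re.compile(r"^[A-Za-z](?:[A-Za-z0-9]|[-_.](?=[A-Za-z0-9]))*$")

def is_valid_node_name(name: str) -> bool:
    """检查名称是否小于 100 字符且符合上述字符规则。"""
    return len(name) < 100 and bool(NAME_RE.match(name))

def names_equal(a: str, b: str) -> bool:
    """注册表中的名称比较不区分大小写。"""
    return a.lower() == b.lower()
```

注意 `ComfyUI-enhancer` 这类名称在语法上是合法的,"不要包含 ComfyUI" 属于最佳实践而非硬性规则。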
### version(必需)
使用 [语义化版本控制](https://semver.org/) 并包含三个数字的版本号 X.Y.Z:
* X(**MAJOR**):重大更改
* Y(**MINOR**):新功能(向后兼容)
* Z(**PATCH**):错误修复
**示例:**
```toml
version = "1.0.0" # 初始版本
version = "1.1.0" # 添加新功能
version = "1.1.1" # 修复错误
version = "2.0.0" # 重大更改
```
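判断一次升级属于哪一类变更,只需逐位比较版本号。下面是一个简化的 Python 草图(`bump_kind` 为假设的函数名,不处理预发布号等完整语义化版本规则):

```python
def parse_semver(v: str) -> tuple:
    """将 "X.Y.Z" 解析为整数三元组。"""
    major, minor, patch = (int(x) for x in v.split("."))
    return major, minor, patch

def bump_kind(old: str, new: str) -> str:
    """根据语义化版本号判断升级类型(简化示意)。"""
    o, n = parse_semver(old), parse_semver(new)
    if n[0] != o[0]:
        return "major"   # 重大更改
    if n[1] != o[1]:
        return "minor"   # 新功能(向后兼容)
    return "patch"       # 错误修复
```

例如从 `1.1.0` 升级到 `2.0.0` 会被判定为 major,提示用户该升级可能破坏现有工作流。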
### license(可选)
指定自定义节点的许可证。可以以两种方式指定:
1. **文件引用:**
```toml
license = { file = "LICENSE" } # ✅ 指向 LICENSE 文件
license = { file = "LICENSE.txt" } # ✅ 指向 LICENSE.txt 文件
license = "LICENSE" # ❌ 格式错误
```
2. **许可证名称:**
```toml
license = { text = "MIT License" } # ✅ 正确格式
license = { text = "Apache-2.0" } # ✅ 正确格式
license = "MIT LICENSE" # ❌ 格式错误
```
常见许可证:[MIT](https://opensource.org/license/mit), [GPL](https://www.gnu.org/licenses/gpl-3.0.en.html), [Apache](https://www.apache.org/licenses/LICENSE-2.0)
### description(推荐)
自定义节点的简要描述。
```toml
description = "A super resolution node for enhancing image quality"
```
### repository(必需)
相关资源的链接:
```toml
[project.urls]
Repository = "https://github.com/username/repository"
```
### urls(推荐)
相关资源的链接:
```toml
[project.urls]
Documentation = "https://github.com/username/repository/wiki"
"Bug Tracker" = "https://github.com/username/repository/issues"
```
### requires-python(推荐)
指定自定义节点支持的 Python 版本:
```toml
requires-python = ">=3.8" # Python 3.8 或更高版本
requires-python = ">=3.8,<3.11" # Python 3.8 到 3.11 之间(不包括 3.11)
```
### 前端版本兼容性(可选)
如果你的节点对 ComfyUI 前端版本有特定要求,你可以使用 `comfyui-frontend-package` 依赖项来指定。该包发布在 [PyPI](https://pypi.org/project/comfyui-frontend-package/) 上。
在以下情况下使用此字段:
* 你的自定义节点使用了特定版本中引入的前端 API
* 你发现了你的节点与某些前端版本之间的不兼容性
* 你的节点需要仅在较新前端版本中可用的特定 UI 功能
```toml
[project]
dependencies = [
"comfyui-frontend-package>=1.20.0" # 需要前端 1.20.0 或更新版本
"comfyui-frontend-package<=1.21.6" # 限制前端版本最高到 1.21.6
"comfyui-frontend-package>=1.19,<1.22" # 适用于前端 1.19 到 1.21.x
"comfyui-frontend-package~=1.20.0" # 兼容 1.20.x 但不包括 1.21.0
"comfyui-frontend-package!=1.21.3" # 适用于任何版本,除了 1.21.3
]
```
### classifiers(推荐)
使用分类器指定操作系统的兼容性和GPU加速器。这个信息用于帮助用户找到适合他们系统的节点。
```toml
[project]
classifiers = [
# 适用于所有操作系统的节点
"Operating System :: OS Independent",
# 或者对于特定操作系统的节点,指定支持的系统:
"Operating System :: Microsoft :: Windows", # Windows specific
"Operating System :: POSIX :: Linux", # Linux specific
"Operating System :: MacOS", # macOS specific
# GPU 加速器支持
"Environment :: GPU :: NVIDIA CUDA", # NVIDIA CUDA 支持
"Environment :: GPU :: AMD ROCm", # AMD ROCm 支持
"Environment :: GPU :: Intel Arc", # Intel Arc 支持
"Environment :: NPU :: Huawei Ascend", # 华为昇腾支持
"Environment :: GPU :: Apple Metal", # Apple Metal 支持
]
```
## \[tool.comfy] 部分
### PublisherId(必需)
你的唯一发布者标识符,通常与您的 GitHub 用户名匹配。
**示例:**
```toml
PublisherId = "john-doe" # ✅ 匹配 GitHub 用户名
PublisherId = "image-wizard" # ✅ 唯一标识符
```
### DisplayName(可选)
你的自定义节点的用户友好名称。
```toml
DisplayName = "Super Resolution Node"
```
### Icon(可选)
你的自定义节点的图标 URL,将在 ComfyUI Registry 和 ComfyUI-Manager 中显示。
**要求:**
* 文件类型:SVG, PNG, JPG, 或 GIF
* 最大分辨率:400px × 400px
* 长宽比应该是正方形
```toml
Icon = "https://raw.githubusercontent.com/username/repo/main/icon.png"
```
### Banner(可选)
URL 指向一个较大的横幅图像,将在 ComfyUI Registry 和 ComfyUI-Manager 中显示。
**要求:**
* 文件类型:SVG, PNG, JPG, 或 GIF
* 长宽比:21:9
```toml
Banner = "https://raw.githubusercontent.com/username/repo/main/banner.png"
```
### requires-comfyui(可选)
指定你的节点兼容的 ComfyUI 版本。这有助于用户确保他们安装了正确版本的 ComfyUI。
**支持的操作符:** `<`, `>`, `<=`, `>=`, `~=`, `<>`, `!=` 和范围
```toml
requires-comfyui = ">=1.0.0" # ComfyUI 1.0.0 或更高版本
requires-comfyui = ">=1.0.0,<2.0.0" # ComfyUI 1.0.0 到 2.0.0 之间(不包括 2.0.0)
requires-comfyui = "~=1.0.0" # 兼容版本:1.0.0 或更新版本,但不包括 2.0.0
requires-comfyui = "!=1.2.3" # 任何版本,除了 1.2.3
requires-comfyui = ">0.1.3,<1.0.0" # 大于 0.1.3 且小于 1.0.0
```
### includes(可选)
指定是否强制包含某些特定文件夹。对于一些情况,例如在 frontend 项目中的自定义节点,最终打包输出的文件夹可能会被包含在 .gitignore 中。在这种情况下,我们需要强制包含它以用于注册表使用。
```toml
includes = ['dist']
```
## Full Example
```toml
[project]
name = "super-resolution-node"
version = "1.0.0"
description = "Enhance image quality using advanced super resolution techniques"
license = { file = "LICENSE" }
requires-python = ">=3.8"
dependencies = [
"comfyui-frontend-package<=1.21.6" # Frontend version compatibility
]
classifiers = [
"Operating System :: OS Independent" # Works on all operating systems
]
dynamic = ["dependencies"]
[tool.setuptools.dynamic]
dependencies = {file = ["requirements.txt"]}
[project.urls]
Repository = "https://github.com/username/super-resolution-node"
Documentation = "https://github.com/username/super-resolution-node/wiki"
"Bug Tracker" = "https://github.com/username/super-resolution-node/issues"
[tool.comfy]
PublisherId = "image-wizard"
DisplayName = "Super Resolution Node"
Icon = "https://raw.githubusercontent.com/username/super-resolution-node/main/icon.png"
Banner = "https://raw.githubusercontent.com/username/super-resolution-node/main/banner.png"
requires-comfyui = ">=1.0.0" # ComfyUI version compatibility
```
# Standards
Source: https://docs.comfy.org/zh-CN/registry/standards
Security and other standards for publishing to the Registry
## Base Standards
### 1. Community Value
Custom nodes must provide functionality that is valuable to the ComfyUI community
Avoid:
* Excessive self-promotion
* Impersonation or misleading behavior
* Malicious behavior
Additionally:
* Self-promotion is allowed only within the designated settings menu section
* The top and side menus should contain only utility functions
### 2. Node Compatibility
Do not interfere with the operation of other custom nodes (installation, update, removal)
* When depending on other custom nodes:
  * Show a clear warning when the dependent feature is used
  * Provide example workflows that showcase the required nodes
### 3. Legal Compliance
Must comply with all applicable laws and regulations
### 4. Quality Requirements
Nodes must be functional, well documented, and actively maintained.
### 5. Fork Guidelines
Forked nodes must:
* Have a name clearly distinct from the original node's
* Provide meaningful differences in functionality or code
The following are the standards custom nodes must meet to be published to the Registry.
## Security Standards
Custom nodes should be secure. We will begin working with the authors of custom nodes that violate these standards to rewrite them. If there is major functionality that should be exposed by core, please file a request in the [rfcs repository](https://github.com/comfy-org/rfcs).
### eval/exec Calls
#### Policy
The `eval` and `exec` functions are prohibited in custom nodes due to security concerns.
#### Reason
These functions enable arbitrary code execution, creating potential remote code execution (RCE) vulnerabilities when handling user input. Workflows containing nodes that pass user input to `eval` or `exec` can be exploited for a variety of attacks, including:
* Keylogging
* Ransomware
* Other malicious code execution
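When a node genuinely needs to turn user-supplied text into a Python value, `ast.literal_eval` from the standard library is a safer alternative to `eval`: it accepts only literals (numbers, strings, lists, dicts, and so on) and raises on anything containing executable expressions. A minimal sketch (the function name is illustrative):

```python
import ast

def parse_user_value(text: str):
    """Safely parse user-supplied text into a Python literal.

    Unlike eval(), ast.literal_eval() only accepts literal syntax and
    raises ValueError/SyntaxError on calls, attribute access, or any
    other code, so no user-controlled code can run.
    """
    return ast.literal_eval(text)

print(parse_user_value("[1, 2, 3]"))  # plain literals parse fine
try:
    parse_user_value("__import__('os').getcwd()")  # code is rejected
except (ValueError, SyntaxError):
    print("rejected")
```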
### subprocess for pip install
#### Policy
Runtime package installation via subprocess calls is not allowed.
#### Reason
* ComfyUI Manager ships alongside ComfyUI and lets users install dependencies
* Centralized dependency management improves security and user experience
* Helps prevent potential supply-chain attacks
* Eliminates the need to reload ComfyUI multiple times
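Instead of installing packages at runtime, a node can declare its dependencies in `requirements.txt`/`pyproject.toml` and simply *report* anything missing, leaving installation to ComfyUI Manager. A sketch of such a check using only the standard library (the module names below are placeholders):

```python
import importlib.util

def missing_dependencies(module_names):
    """Return the module names that cannot be imported.

    This only reports missing dependencies; it never shells out to
    `pip install`, so installation stays with ComfyUI Manager.
    """
    return [n for n in module_names if importlib.util.find_spec(n) is None]

# Illustrative names; a real node would list its actual imports.
print(missing_dependencies(["json", "ast"]))  # []
print(missing_dependencies(["definitely_not_a_real_module"]))
```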
### Code Obfuscation
#### Policy
Code obfuscation is prohibited in custom nodes.
#### Reason
Obfuscated code:
* Cannot be reviewed and is very likely malicious
# Node Definition JSON
Source: https://docs.comfy.org/zh-CN/specs/nodedef_json
JSON schema for ComfyUI nodes.
The node definition JSON is defined with [JSON Schema](https://json-schema.org/). Changes to this schema are discussed in the [rfcs repository](https://github.com/comfy-org/rfcs).
## v2.0 (Latest)
```json Node Definition v2.0
{
"$ref": "#/definitions/ComfyNodeDefV2",
"definitions": {
"ComfyNodeDefV2": {
"type": "object",
"properties": {
"inputs": {
"type": "object",
"additionalProperties": {
"anyOf": [
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"control_after_generate": {
"type": "boolean"
},
"type": {
"type": "string",
"const": "INT"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"round": {
"anyOf": [
{
"type": "number"
},
{
"type": "boolean",
"const": false
}
]
},
"type": {
"type": "string",
"const": "FLOAT"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {
"type": "boolean"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"label_on": {
"type": "string"
},
"label_off": {
"type": "string"
},
"type": {
"type": "string",
"const": "BOOLEAN"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {
"type": "string"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"multiline": {
"type": "boolean"
},
"dynamicPrompts": {
"type": "boolean"
},
"defaultVal": {
"type": "string"
},
"placeholder": {
"type": "string"
},
"type": {
"type": "string",
"const": "STRING"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
},
"type": {
"type": "string",
"const": "COMBO"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"type": {
"type": "string"
},
"name": {
"type": "string"
},
"isOptional": {
"type": "boolean"
}
},
"required": [
"type",
"name"
],
"additionalProperties": true
}
]
}
},
"outputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"index": {
"type": "number"
},
"name": {
"type": "string"
},
"type": {
"type": "string"
},
"is_list": {
"type": "boolean"
},
"options": {
"type": "array"
},
"tooltip": {
"type": "string"
}
},
"required": [
"index",
"name",
"type",
"is_list"
],
"additionalProperties": false
}
},
"hidden": {
"type": "object",
"additionalProperties": {}
},
"name": {
"type": "string"
},
"display_name": {
"type": "string"
},
"description": {
"type": "string"
},
"category": {
"type": "string"
},
"output_node": {
"type": "boolean"
},
"python_module": {
"type": "string"
},
"deprecated": {
"type": "boolean"
},
"experimental": {
"type": "boolean"
}
},
"required": [
"inputs",
"outputs",
"name",
"display_name",
"description",
"category",
"output_node",
"python_module"
],
"additionalProperties": false
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
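For orientation, here is a sketch of a minimal document satisfying the `required` fields of the v2.0 schema above. The node name, input name, and module path are made up; the field layout follows the schema's `required` lists:

```python
import json

# Hypothetical minimal v2.0 node definition. Top-level required keys per
# the schema: inputs, outputs, name, display_name, description, category,
# output_node, python_module. Each input needs "type" and "name"; each
# output needs "index", "name", "type", "is_list".
node_def = {
    "inputs": {
        "value": {"type": "INT", "name": "value", "default": 0, "min": 0, "max": 10}
    },
    "outputs": [
        {"index": 0, "name": "INT", "type": "INT", "is_list": False}
    ],
    "name": "ExampleNode",
    "display_name": "Example Node",
    "description": "A minimal example node definition.",
    "category": "examples",
    "output_node": False,
    "python_module": "custom_nodes.example",
}

REQUIRED = ["inputs", "outputs", "name", "display_name",
            "description", "category", "output_node", "python_module"]
assert all(key in node_def for key in REQUIRED)
print(json.dumps(node_def, indent=2)[:60])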
# Node Definition JSON 1.0
Source: https://docs.comfy.org/zh-CN/specs/nodedef_json_1_0
JSON schema for ComfyUI nodes.
## v1.0
```json Node Definition v1.0
{
"$ref": "#/definitions/ComfyNodeDefV1",
"definitions": {
"ComfyNodeDefV1": {
"type": "object",
"properties": {
"input": {
"type": "object",
"properties": {
"required": {
"type": "object",
"additionalProperties": {
"anyOf": [
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "INT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"control_after_generate": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "FLOAT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"round": {
"anyOf": [
{
"type": "number"
},
{
"type": "boolean",
"const": false
}
]
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "BOOLEAN"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "boolean"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"label_on": {
"type": "string"
},
"label_off": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "STRING"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "string"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"multiline": {
"type": "boolean"
},
"dynamicPrompts": {
"type": "boolean"
},
"defaultVal": {
"type": "string"
},
"placeholder": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "COMBO"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
}
]
}
},
"optional": {
"type": "object",
"additionalProperties": {
"anyOf": [
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "INT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"control_after_generate": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "FLOAT"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"anyOf": [
{
"type": "number"
},
{
"type": "array",
"items": {
"type": "number"
}
}
]
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"min": {
"type": "number"
},
"max": {
"type": "number"
},
"step": {
"type": "number"
},
"display": {
"type": "string",
"enum": [
"slider",
"number",
"knob"
]
},
"round": {
"anyOf": [
{
"type": "number"
},
{
"type": "boolean",
"const": false
}
]
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "BOOLEAN"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "boolean"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"label_on": {
"type": "string"
},
"label_off": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "STRING"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {
"type": "string"
},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"multiline": {
"type": "boolean"
},
"dynamicPrompts": {
"type": "boolean"
},
"defaultVal": {
"type": "string"
},
"placeholder": {
"type": "string"
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string",
"const": "COMBO"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
},
"control_after_generate": {
"type": "boolean"
},
"image_upload": {
"type": "boolean"
},
"image_folder": {
"type": "string",
"enum": [
"input",
"output",
"temp"
]
},
"allow_batch": {
"type": "boolean"
},
"video_upload": {
"type": "boolean"
},
"remote": {
"type": "object",
"properties": {
"route": {
"anyOf": [
{
"type": "string",
"format": "uri"
},
{
"type": "string",
"pattern": "^\\/"
}
]
},
"refresh": {
"anyOf": [
{
"type": "number",
"minimum": -9007199254740991,
"maximum": 9007199254740991
},
{
"type": "number",
"maximum": 9007199254740991,
"minimum": -9007199254740991
}
]
},
"response_key": {
"type": "string"
},
"query_params": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"refresh_button": {
"type": "boolean"
},
"control_after_refresh": {
"type": "string",
"enum": [
"first",
"last"
]
},
"timeout": {
"type": "number",
"minimum": 0
},
"max_retries": {
"type": "number",
"minimum": 0
}
},
"required": [
"route"
],
"additionalProperties": false
},
"options": {
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
},
"additionalProperties": true
}
]
}
]
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "string"
},
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"default": {},
"defaultInput": {
"type": "boolean"
},
"forceInput": {
"type": "boolean"
},
"tooltip": {
"type": "string"
},
"hidden": {
"type": "boolean"
},
"advanced": {
"type": "boolean"
},
"rawLink": {
"type": "boolean"
},
"lazy": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
}
]
}
]
}
},
"hidden": {
"type": "object",
"additionalProperties": {}
}
},
"additionalProperties": false
},
"output": {
"type": "array",
"items": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": [
"string",
"number"
]
}
}
]
}
},
"output_is_list": {
"type": "array",
"items": {
"type": "boolean"
}
},
"output_name": {
"type": "array",
"items": {
"type": "string"
}
},
"output_tooltips": {
"type": "array",
"items": {
"type": "string"
}
},
"name": {
"type": "string"
},
"display_name": {
"type": "string"
},
"description": {
"type": "string"
},
"category": {
"type": "string"
},
"output_node": {
"type": "boolean"
},
"python_module": {
"type": "string"
},
"deprecated": {
"type": "boolean"
},
"experimental": {
"type": "boolean"
}
},
"required": [
"name",
"display_name",
"description",
"category",
"output_node",
"python_module"
],
"additionalProperties": false
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
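The v1.0 layout differs from v2.0 mainly in how inputs are expressed: each input is a `[type, options]` pair under `input.required` (or `input.optional`) rather than a flat object, and outputs are parallel arrays. A hypothetical minimal instance, with its top-level required keys checked against the schema above:

```python
# Hypothetical minimal v1.0 node definition; names are made up.
node_def_v1 = {
    "input": {
        "required": {
            # v1.0 input spec: a [type, options] pair
            "value": ["INT", {"default": 0, "min": 0, "max": 10}]
        }
    },
    "output": ["INT"],
    "output_is_list": [False],
    "output_name": ["INT"],
    "name": "ExampleNode",
    "display_name": "Example Node",
    "description": "A minimal example node definition.",
    "category": "examples",
    "output_node": False,
    "python_module": "custom_nodes.example",
}

# The v1.0 schema only marks these top-level keys as required:
REQUIRED_V1 = ["name", "display_name", "description",
               "category", "output_node", "python_module"]
assert all(key in node_def_v1 for key in REQUIRED_V1)
print("ok")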
# Workflow JSON
Source: https://docs.comfy.org/zh-CN/specs/workflow_json
JSON schema for ComfyUI workflows.
The workflow JSON is defined with [JSON Schema](https://json-schema.org/). Changes to this schema are discussed in the [rfcs repository](https://github.com/comfy-org/rfcs).
## v1.0 (Latest)
```json ComfyUI Workflow v1.0
{
"$ref": "#/definitions/ComfyWorkflow1_0",
"definitions": {
"ComfyWorkflow1_0": {
"type": "object",
"properties": {
"version": {
"type": "number",
"const": 1
},
"config": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"links_ontop": {
"type": "boolean"
},
"align_to_grid": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"state": {
"type": "object",
"properties": {
"lastGroupid": {
"type": "number"
},
"lastNodeId": {
"type": "number"
},
"lastLinkId": {
"type": "number"
},
"lastRerouteId": {
"type": "number"
}
},
"additionalProperties": true
},
"groups": {
"type": "array",
"items": {
"type": "object",
"properties": {
"title": {
"type": "string"
},
"bounding": {
"type": "array",
"minItems": 4,
"maxItems": 4,
"items": [
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
}
]
},
"color": {
"type": "string"
},
"font_size": {
"type": "number"
},
"locked": {
"type": "boolean"
}
},
"required": [
"title",
"bounding"
],
"additionalProperties": true
}
},
"nodes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"type": {
"type": "string"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"size": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"flags": {
"type": "object",
"properties": {
"collapsed": {
"type": "boolean"
},
"pinned": {
"type": "boolean"
},
"allow_interaction": {
"type": "boolean"
},
"horizontal": {
"type": "boolean"
},
"skip_repeated_outputs": {
"type": "boolean"
}
},
"additionalProperties": true
},
"order": {
"type": "number"
},
"mode": {
"type": "number"
},
"inputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"link": {
"type": [
"number",
"null"
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"outputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"links": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"properties": {
"type": "object",
"properties": {
"Node name for S&R": {
"type": "string"
}
},
"additionalProperties": true
},
"widgets_values": {
"anyOf": [
{
"type": "array"
},
{
"type": "object",
"additionalProperties": {}
}
]
},
"color": {
"type": "string"
},
"bgcolor": {
"type": "string"
}
},
"required": [
"id",
"type",
"pos",
"size",
"flags",
"order",
"mode",
"properties"
],
"additionalProperties": true
}
},
"links": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"origin_id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"origin_slot": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"target_id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"target_slot": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"parentId": {
"type": "number"
}
},
"required": [
"id",
"origin_id",
"origin_slot",
"target_id",
"target_slot",
"type"
],
"additionalProperties": true
}
},
"reroutes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"linkIds": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
}
},
"required": [
"id",
"pos"
],
"additionalProperties": true
}
},
"extra": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"ds": {
"type": "object",
"properties": {
"scale": {
"type": "number"
},
"offset": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
}
},
"required": [
"scale",
"offset"
],
"additionalProperties": true
},
"info": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"author": {
"type": "string"
},
"description": {
"type": "string"
},
"version": {
"type": "string"
},
"created": {
"type": "string"
},
"modified": {
"type": "string"
},
"software": {
"type": "string"
}
},
"required": [
"name",
"author",
"description",
"version",
"created",
"modified",
"software"
],
"additionalProperties": true
},
"linkExtensions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
}
},
"required": [
"id",
"parentId"
],
"additionalProperties": true
}
},
"reroutes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"linkIds": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
}
},
"required": [
"id",
"pos"
],
"additionalProperties": true
}
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"models": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"url": {
"type": "string",
"format": "uri"
},
"hash": {
"type": "string"
},
"hash_type": {
"type": "string"
},
"directory": {
"type": "string"
}
},
"required": [
"name",
"url",
"directory"
],
"additionalProperties": false
}
}
},
"required": [
"version",
"state",
"nodes"
],
"additionalProperties": true
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
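As a sketch, a minimal workflow satisfying the required fields of the v1.0 schema above looks like this (the node type is made up; only `version`, `state`, and `nodes` are required at the top level, and each node needs `id`, `type`, `pos`, `size`, `flags`, `order`, `mode`, and `properties`):

```python
# Hypothetical minimal v1.0 workflow: one node, no links or groups.
workflow = {
    "version": 1,
    "state": {"lastNodeId": 1, "lastLinkId": 0},
    "nodes": [
        {
            "id": 1,
            "type": "ExampleNode",           # made-up node type
            "pos": [100, 100],               # array form; {"0": x, "1": y} also allowed
            "size": [200, 60],
            "flags": {},
            "order": 0,
            "mode": 0,
            "properties": {"Node name for S&R": "ExampleNode"},
        }
    ],
}

assert all(key in workflow for key in ["version", "state", "nodes"])
node_required = ["id", "type", "pos", "size", "flags", "order", "mode", "properties"]
assert all(key in workflow["nodes"][0] for key in node_required)
print("ok")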
## Older versions
* [0.4](./workflow_json_0.4)
# Workflow JSON 0.4
Source: https://docs.comfy.org/zh-CN/specs/workflow_json_0.4
JSON schema for ComfyUI workflows.
## v0.4
```json
{
"$ref": "#/definitions/ComfyWorkflow0_4",
"definitions": {
"ComfyWorkflow0_4": {
"type": "object",
"properties": {
"last_node_id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"last_link_id": {
"type": "number"
},
"nodes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
"type": {
"type": "string"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"size": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"flags": {
"type": "object",
"properties": {
"collapsed": {
"type": "boolean"
},
"pinned": {
"type": "boolean"
},
"allow_interaction": {
"type": "boolean"
},
"horizontal": {
"type": "boolean"
},
"skip_repeated_outputs": {
"type": "boolean"
}
},
"additionalProperties": true
},
"order": {
"type": "number"
},
"mode": {
"type": "number"
},
"inputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"link": {
"type": [
"number",
"null"
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"outputs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"type": {
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
},
"links": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
},
"slot_index": {
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
}
},
"required": [
"name",
"type"
],
"additionalProperties": true
}
},
"properties": {
"type": "object",
"properties": {
"Node name for S&R": {
"type": "string"
}
},
"additionalProperties": true
},
"widgets_values": {
"anyOf": [
{
"type": "array"
},
{
"type": "object",
"additionalProperties": {}
}
]
},
"color": {
"type": "string"
},
"bgcolor": {
"type": "string"
}
},
"required": [
"id",
"type",
"pos",
"size",
"flags",
"order",
"mode",
"properties"
],
"additionalProperties": true
}
},
"links": {
"type": "array",
"items": {
"type": "array",
"minItems": 6,
"maxItems": 6,
"items": [
{
"type": "number"
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "integer"
},
{
"type": "string"
}
]
},
{
"anyOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "number"
}
]
}
]
}
},
"groups": {
"type": "array",
"items": {
"type": "object",
"properties": {
"title": {
"type": "string"
},
"bounding": {
"type": "array",
"minItems": 4,
"maxItems": 4,
"items": [
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
},
{
"type": "number"
}
]
},
"color": {
"type": "string"
},
"font_size": {
"type": "number"
},
"locked": {
"type": "boolean"
}
},
"required": [
"title",
"bounding"
],
"additionalProperties": true
}
},
"config": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"links_ontop": {
"type": "boolean"
},
"align_to_grid": {
"type": "boolean"
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"extra": {
"anyOf": [
{
"anyOf": [
{
"not": {}
},
{
"type": "object",
"properties": {
"ds": {
"type": "object",
"properties": {
"scale": {
"type": "number"
},
"offset": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
}
},
"required": [
"scale",
"offset"
],
"additionalProperties": true
},
"info": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"author": {
"type": "string"
},
"description": {
"type": "string"
},
"version": {
"type": "string"
},
"created": {
"type": "string"
},
"modified": {
"type": "string"
},
"software": {
"type": "string"
}
},
"required": [
"name",
"author",
"description",
"version",
"created",
"modified",
"software"
],
"additionalProperties": true
},
"linkExtensions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
}
},
"required": [
"id",
"parentId"
],
"additionalProperties": true
}
},
"reroutes": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "number"
},
"parentId": {
"type": "number"
},
"pos": {
"anyOf": [
{
"type": "object",
"properties": {
"0": {
"type": "number"
},
"1": {
"type": "number"
}
},
"required": [
"0",
"1"
],
"additionalProperties": true
},
{
"type": "array",
"minItems": 2,
"maxItems": 2,
"items": [
{
"type": "number"
},
{
"type": "number"
}
]
}
]
},
"linkIds": {
"anyOf": [
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "null"
}
]
}
},
"required": [
"id",
"pos"
],
"additionalProperties": true
}
}
},
"additionalProperties": true
}
]
},
{
"type": "null"
}
]
},
"version": {
"type": "number"
},
"models": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"url": {
"type": "string",
"format": "uri"
},
"hash": {
"type": "string"
},
"hash_type": {
"type": "string"
},
"directory": {
"type": "string"
}
},
"required": [
"name",
"url",
"directory"
],
"additionalProperties": false
}
}
},
"required": [
"last_node_id",
"last_link_id",
"nodes",
"links",
"version"
],
"additionalProperties": true
}
},
"$schema": "http://json-schema.org/draft-07/schema#"
}
```
# How to Troubleshoot and Fix Issues Caused by Custom Nodes in ComfyUI
Source: https://docs.comfy.org/zh-CN/troubleshooting/custom-node-issues
Troubleshoot and fix problems caused by custom nodes and extensions
The overall approach of this guide is as follows:
```mermaid
flowchart TD
A[Problem occurs] --> B{Does the problem disappear after disabling all custom nodes?}
B -- Yes --> C[The problem is caused by a custom node]
B -- No --> D[The problem is not caused by custom nodes; see the other troubleshooting guides]
C --> E{Check frontend extensions first?}
E -- Yes --> F[Troubleshoot in the ComfyUI frontend; only a frontend reload is needed]
E -- No --> G[Use generic binary search; requires restarting ComfyUI multiple times]
F --> H[Use binary search to locate the problematic node]
G --> H
H --> I[Fix, replace, report, or remove the problematic node]
I --> J[Problem solved]
```
## How to Disable All Custom Nodes
On ComfyUI Desktop, launch with custom nodes disabled from the settings menu
Or run the server manually:
```bash
cd path/to/your/comfyui
python main.py --disable-all-custom-nodes
```
```bash
cd ComfyUI
python main.py --disable-all-custom-nodes
```
For the portable version, open its folder and locate the `run_nvidia_gpu.bat` or `run_cpu.bat` file
1. Copy `run_nvidia_gpu.bat` or `run_cpu.bat` and rename the copy to `run_nvidia_gpu_disable_custom_nodes.bat`
2. Open the copied file with Notepad
3. Add the `--disable-all-custom-nodes` flag to the command, or paste the lines below into a text file and change its extension to `.bat`
```bash
.\python_embeded\python.exe -s ComfyUI\main.py --disable-all-custom-nodes --windows-standalone-build
pause
```
4. Save and close the file
5. Double-click the file to run it; if everything works, ComfyUI should start with custom nodes disabled
1. Open the portable version's folder
2. Open a terminal via the right-click menu → Open Terminal
3. Make sure the terminal's working directory is the portable version's folder
4. Run the following command to start ComfyUI with the embedded Python and custom nodes disabled
```
.\python_embeded\python.exe -s ComfyUI\main.py --disable-all-custom-nodes
```
**Result:**
* ✅ **Problem disappears**: a custom node is the cause → continue to step 2
* ❌ **Problem persists**: not a custom node issue → [report the problem](#报告问题)
## Binary Search
This section introduces using binary search to track down custom node problems: rule out half of the custom nodes at a time until you isolate the one causing the problem
The flow is shown below: in each round, enable half of the still-suspect nodes and check whether the problem appears, repeating until the problematic custom node is identified
```mermaid
flowchart TD
A[Start] --> B{Split all custom nodes into two halves}
B --> C[Enable the first half]
C --> D[Restart ComfyUI and test]
D --> E{Does the problem appear?}
E --> |Yes| F[The problem is among the enabled nodes]
E --> |No| G[The problem is among the disabled nodes]
F --> H{More than one node enabled?}
G --> I{More than one node disabled?}
H --> |Yes| J[Continue bisecting the enabled nodes]
I --> |Yes| K[Continue bisecting the disabled nodes]
H --> |No| L[Problematic node found]
I --> |No| L
J --> B
K --> B
L --> M[End]
```
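For readers who prefer code to flowcharts, the loop above can be sketched in a few lines of Python. This is an illustrative sketch, not part of ComfyUI: `is_problem_present` is a hypothetical stand-in for "enable exactly this subset of nodes, restart ComfyUI, and check whether the bug reproduces".

```python
def find_bad_node(nodes, is_problem_present):
    """Binary-search a list of custom node names for the single culprit.

    Assumes exactly one node causes the problem and that
    is_problem_present(enabled_subset) reproduces it deterministically.
    """
    candidates = list(nodes)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_problem_present(half):
            # Problem appears with only this half enabled: culprit is inside it.
            candidates = half
        else:
            # Problem is gone: culprit must be in the other half.
            candidates = candidates[len(candidates) // 2:]
    return candidates[0]
```

With n custom nodes this takes about log2(n) restart-and-test rounds instead of n, which is why bisection beats disabling nodes one at a time.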
## Two Troubleshooting Approaches
In this guide we divide the custom nodes under investigation into two categories
* A: custom nodes that ship frontend extensions
* B: regular nodes
First, let's look at the problems each category tends to cause, and why
Custom nodes that ship frontend extensions deserve priority: they cause the most problems, mainly conflicts with ComfyUI frontend version updates.
Common symptoms include:
* Workflows fail to execute
* Node preview images missing
* Misaligned UI elements
* Unable to load the ComfyUI frontend
* UI completely broken or a blank screen
* Unable to communicate with the ComfyUI backend
* Links between nodes not working
* etc.
Typical causes for this category are:
* The frontend changed during an update and the custom node was not updated in time
* Users update ComfyUI without updating their custom nodes, so even though the author has published a fix, the installed version is not the latest compatible one
* The author stopped maintaining the node, so the extension is no longer compatible with the frontend
If the problem is not caused by a custom node's frontend extension, it is most often a dependency issue. Common symptoms of this category:
* "Failed to import" errors in the console/logs
* Nodes still reported missing after installing a workflow's missing nodes and restarting
* ComfyUI crashes or fails to start
* etc.
Typical causes for these errors are:
* The custom node requires an extra wheel, e.g. ComfyUI-Nunchaku
* The custom node pins a strict dependency version, e.g. `torch==2.4.1`, while another plugin requires `torch>=2.4.2`; installing one node then breaks the other's dependencies
* A network problem prevented a dependency from installing successfully
Problems in the Python environment itself (conflicting dependency versions and the like) require some knowledge of how to install and uninstall packages in a Python environment, and are relatively more complex to resolve
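When you suspect a version conflict like the `torch` example above, it helps to see exactly which versions are installed in the active environment. A minimal standard-library sketch (the package names are only examples):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_versions(package_names):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

# Example: inspect the torch stack before and after installing a custom node.
print(installed_versions(["torch", "torchvision", "torchaudio"]))
```

Running this before and after installing a custom node shows whether the install silently up- or downgraded a shared dependency; `pip check` can then report any resulting conflicts.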
## Troubleshooting with Binary Search
Of the two categories above, conflicts between custom node frontend extensions and ComfyUI are the more common, so we check those first. The overall process is as follows
### 1. Check the Frontend Extensions of Custom Nodes
After starting ComfyUI, open the `Extensions` panel in the settings menu and disable all third-party extensions as shown in the image above
After disabling frontend extensions for the first time, restart ComfyUI once to make sure none of them are loaded
* If the problem disappears, it is caused by a custom node's frontend extension, and you can continue to the next step and bisect
* If the problem persists, it is not caused by a frontend extension; see the other part of this guide on troubleshooting custom node dependency issues
Use the approach described at the start of this guide: enable half of the custom nodes at a time until you find the one causing the problem
As shown in the image, enable half of the frontend extensions; extensions with similar names usually come from the same custom node
Once you find the problematic custom node, see the fix section of this guide to repair it
With this method you don't need to restart ComfyUI repeatedly: reloading the frontend after each enable/disable is enough, and since you only examine nodes that ship frontend extensions, the search space is much smaller
### 2. Generic Custom Node Troubleshooting
Besides doing the binary search by hand, you can automate it with comfy-cli:
Using Comfy CLI requires some command-line experience; if you are not comfortable with it, use the manual binary search instead
If you have installed [Comfy CLI](/zh-CN/comfy-cli/getting-started), it provides an automated bisect tool to find the problematic node:
```bash
# Start a bisect session
comfy-cli node bisect start
# Follow the prompts:
# - Test ComfyUI with the currently enabled set of nodes
# - If the problem disappears, mark it 'good': comfy-cli node bisect good
# - If the problem persists, mark it 'bad': comfy-cli node bisect bad
# - Repeat until the problematic node is identified
# Reset when done
comfy-cli node bisect reset
```
The bisect tool automatically enables/disables nodes and guides you through the process.
Before you start, **create a backup** of your custom\_nodes folder in case something goes wrong.
If you prefer to do it manually, or don't have Comfy CLI installed, follow these steps:
Before starting, change into the `<your install location>\ComfyUI\` folder
* **Back up all custom nodes**: copy `custom_nodes` and rename the copy `custom_nodes_backup`
* **Create a temporary folder**: create a folder named `custom_nodes_temp`
Or use the command line to make the backup:
```bash
# Create backup and temp folders
mkdir "%USERPROFILE%\custom_nodes_backup"
mkdir "%USERPROFILE%\custom_nodes_temp"
# Back everything up first
xcopy "custom_nodes\*" "%USERPROFILE%\custom_nodes_backup\" /E /H /Y
```
Back up the custom\_nodes folder manually
Or use the command line to make the backup
```bash
# Create backup and temp folders
mkdir ~/custom_nodes_backup
mkdir ~/custom_nodes_temp
# Back everything up
cp -r custom_nodes/* ~/custom_nodes_backup/
```
```bash
# Create backup and temp folders
mkdir /content/custom_nodes_backup
mkdir /content/custom_nodes_temp
# Back everything up first
cp -r /content/ComfyUI/custom_nodes/* /content/custom_nodes_backup/
```
On Windows you can skip this step thanks to the graphical file manager, unless you are working purely from the command line
```bash
dir custom_nodes
```
```bash
ls custom_nodes/
```
```bash
ls /content/ComfyUI/custom_nodes/
```
Suppose you have 8 custom nodes. Move the first half into temporary storage:
```bash
# Move the first half (nodes 1-4) to the temp folder
move "custom_nodes\node1" "%USERPROFILE%\custom_nodes_temp\"
move "custom_nodes\node2" "%USERPROFILE%\custom_nodes_temp\"
move "custom_nodes\node3" "%USERPROFILE%\custom_nodes_temp\"
move "custom_nodes\node4" "%USERPROFILE%\custom_nodes_temp\"
```
```bash
# Move the first half (nodes 1-4) to the temp folder
mv custom_nodes/node1 ~/custom_nodes_temp/
mv custom_nodes/node2 ~/custom_nodes_temp/
mv custom_nodes/node3 ~/custom_nodes_temp/
mv custom_nodes/node4 ~/custom_nodes_temp/
```
```bash
# Move the first half (nodes 1-4) to the temp folder
mv /content/ComfyUI/custom_nodes/node1 /content/custom_nodes_temp/
mv /content/ComfyUI/custom_nodes/node2 /content/custom_nodes_temp/
mv /content/ComfyUI/custom_nodes/node3 /content/custom_nodes_temp/
mv /content/ComfyUI/custom_nodes/node4 /content/custom_nodes_temp/
```
Start ComfyUI normally
```bash
python main.py
```
* **Problem persists**: the problem is in the remaining nodes (5-8)
* **Problem disappears**: the problem is in the moved nodes (1-4)
* If the problem persists: move half of the remaining nodes (e.g. nodes 7-8) to the temp folder
* If the problem disappears: move half of the temp nodes (e.g. nodes 3-4) back into custom\_nodes
* Repeat until a single problematic node remains
## Fixing Custom Node Problems
Once you have identified the problematic custom node:
### Option 1: Update the Node
1. Check ComfyUI Manager for an available update
2. Update the node and test again
### Option 2: Replace the Node
1. Look for an alternative custom node with similar functionality
2. Browse the [ComfyUI Registry](https://registry.comfy.org) for alternatives
### Option 3: Report the Issue
Contact the custom node's developer:
1. Find the node's GitHub repository
2. Open an issue that includes:
* Your ComfyUI version
* Error messages/logs
* Steps to reproduce
* Your operating system
### Option 4: Remove the Node
If no fix is available and you don't need its functionality:
1. Remove the problematic node from `custom_nodes/`
2. Restart ComfyUI
## Reporting Problems Not Caused by Custom Nodes
If the problem is not caused by a custom node, see the [troubleshooting overview](/zh-CN/troubleshooting/overview) for other common issues.
### Custom Node-Specific Issues
Contact the custom node's developer:
* Find the node's GitHub repository
* Open an issue with your ComfyUI version, error messages, reproduction steps, and operating system
* Check the node's documentation for known issues
### ComfyUI Core Issues
* **GitHub**: [ComfyUI Issues](https://github.com/comfyanonymous/ComfyUI/issues)
* **Forum**: [Official ComfyUI Forum](https://forum.comfy.org/)
### Desktop App Issues
* **GitHub**: [ComfyUI Desktop Issues](https://github.com/Comfy-Org/desktop/issues)
### Frontend Issues
* **GitHub**: [ComfyUI Frontend Issues](https://github.com/Comfy-Org/ComfyUI_frontend/issues)
For general installation, model, or performance problems, see our [troubleshooting overview](/zh-CN/troubleshooting/overview) and [model issues](/zh-CN/troubleshooting/model-issues) pages.
# How to Troubleshoot and Fix Model-Related Issues in ComfyUI
Source: https://docs.comfy.org/zh-CN/troubleshooting/model-issues
Troubleshoot model-related issues, including architecture mismatches, missing models, and loading errors
## Model Architecture Mismatch
**Symptoms:** tensor dimension errors during generation, especially at the VAE decode stage
**Common error messages:**
* `Given groups=1, weight of size [64, 4, 3, 3], expected input[1, 16, 128, 128] to have 4 channels, but got 16 channels instead`
* `Given groups=1, weight of size [4, 4, 1, 1], expected input[1, 16, 144, 112] to have 4 channels, but got 16 channels instead`
* `Given groups=1, weight of size [320, 4, 3, 3], expected input[2, 16, 192, 128] to have 4 channels, but got 16 channels instead`
* `The size of tensor a (49) must match the size of tensor b (16) at non-singleton dimension 1`
* `Tensors must have same number of dimensions: got 2 and 3`
* `mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)`
**Root cause:** mixing models from different architecture families
### Solutions
1. **Verify model family compatibility:**
* **Flux models** use a 16-channel latent space with dual text encoder conditioning (CLIP-L + T5-XXL)
* **SD1.5 models** use a 4-channel latent space with a single CLIP ViT-L/14 text encoder
* **SDXL models** use a 4-channel latent space with dual text encoders (CLIP ViT-L/14 + OpenCLIP ViT-bigG/14)
* **SD3 models** use a 16-channel latent space with triple text encoder conditioning (CLIP-L + OpenCLIP bigG + T5-XXL)
* **ControlNet models** must match the base checkpoint's architecture (SD1.5 ControlNets only work with SD1.5 checkpoints, SDXL ControlNets only with SDXL checkpoints, and so on)
2. **Common mismatch scenarios and fixes:**
**Flux + wrong VAE:**
```
Problem: using taesd or sdxl_vae.safetensors with a Flux checkpoint
Fix: use ae.safetensors (the Flux VAE) from the Hugging Face Flux release
```
**Flux + incorrect CLIP configuration:**
```
Problem: using t5xxl_fp8_e4m3fn.safetensors in both CLIP slots of DualClipLoader
Fix: use t5xxl_fp8_e4m3fn.safetensors in one slot and clip_l.safetensors in the other
```
**ControlNet architecture mismatch:**
```
Problem: SD1.5 ControlNet with an SDXL checkpoint (or vice versa)
Error: "mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)"
Fix: use a ControlNet model built for your checkpoint's architecture
- SD1.5 checkpoints need SD1.5 ControlNets
- SDXL checkpoints need SDXL ControlNets
```
3. **Quick diagnosis:**
```bash
# Check whether the error occurs at the VAE decode stage
# Look for "expected input[X, Y, Z] to have N channels, but got M channels"
# The Y value is the channel count: 4 = SD models, 16 = Flux models
```
4. **Prevention strategies:**
* Keep all models in a workflow within the same architecture family
* Download complete model bundles from the same source/release
* Use ComfyUI Manager's model compatibility indicators
* Test workflows with the default examples before customizing
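The channel check in the quick diagnosis can be automated. A small sketch, not part of ComfyUI: the regular expression targets the PyTorch error strings quoted above, and the family labels are an assumption based on the latent sizes listed in step 1.

```python
import re

# Matches e.g. "expected input[1, 16, 128, 128] to have 4 channels, but got 16 channels"
_CHANNEL_ERR = re.compile(
    r"expected input\[\d+, (\d+)(?:, \d+)*\] to have (\d+) channels, but got (\d+) channels"
)

_FAMILY = {4: "SD1.5/SDXL (4-channel latents)", 16: "Flux/SD3 (16-channel latents)"}

def diagnose_channel_mismatch(message):
    """Return (what_the_model_expects, what_the_input_looks_like), or None."""
    m = _CHANNEL_ERR.search(message)
    if not m:
        return None
    expected, got = int(m.group(2)), int(m.group(3))
    return (_FAMILY.get(expected, f"{expected}-channel (unknown family)"),
            _FAMILY.get(got, f"{got}-channel (unknown family)"))
```

For the first error message above this reports that the loaded weights expect 4-channel (SD-family) latents while the incoming latent is 16-channel (Flux/SD3), i.e. models from two families were mixed.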
## Missing Model Errors
**Error message:**
```
Prompt execution failed
Prompt outputs failed validation:
CheckpointLoaderSimple:
- Value not in list: ckpt_name: 'model-name.safetensors' not in []
```
### Solutions
1. **Download the required models:**
* Use ComfyUI Manager to download models automatically
* Verify the models are in the correct subfolders
2. **Check the model paths:**
* **Checkpoints**: `models/checkpoints/`
* **VAE**: `models/vae/`
* **LoRA**: `models/loras/`
* **ControlNet**: `models/controlnet/`
* **Embeddings**: `models/embeddings/`
3. **Share models between UIs or use custom paths:**
* See [ComfyUI model sharing and custom model folder configuration](/zh-CN/installation/comfyui_portable_windows#2-comfyui-模型共享或自定义模型文件夹存储位置配置) for detailed instructions
* Edit the `extra_model_paths.yaml` file to add custom model directories
### Model Search Path Configuration
If your models are in a custom location, see the detailed [ComfyUI model sharing and custom model folder configuration](/zh-CN/installation/comfyui_portable_windows#2-comfyui-模型共享或自定义模型文件夹存储位置配置) guide to configure ComfyUI to find them.
## Model Loading Errors
**Error message:** "Error while deserializing header"
### Solutions
1. **Re-download the model** - the file may have been corrupted during download
2. **Check available disk space** - make sure there is enough room to load the model (models can be 2-15GB+)
3. **Check file permissions** - make sure ComfyUI can read the model file
4. **Test with a different model** - determine whether the problem is model-specific or system-wide
## Model Performance Issues
### Slow Model Loading
**Symptoms:** long delays when switching models or starting generation
**Solutions:**
1. **Keep models in VRAM:**
```bash
python main.py --highvram
```
2. **Use faster storage:**
* Move models to an SSD if you are using an HDD
* Use an NVMe SSD for best performance
3. **Adjust cache settings:**
```bash
python main.py --cache-classic # use the old-style (aggressive) cache
```
### Memory Issues with Large Models
**"RuntimeError: CUDA out of memory":**
```bash
# Progressive memory reduction
python main.py --lowvram # try this first
python main.py --novram # if lowvram is not enough
python main.py --cpu # last resort
```
**Model-specific memory optimizations:**
```bash
# Force lower precision
python main.py --force-fp16
# Reduce attention memory usage
python main.py --use-pytorch-cross-attention
```
For additional model configuration and setup information, see the [models documentation](/zh-CN/development/core-concepts/models).
# How to Troubleshoot and Fix Errors in ComfyUI
Source: https://docs.comfy.org/zh-CN/troubleshooting/overview
Common ComfyUI issues, their solutions, and how to report bugs effectively
Of the feedback we receive day to day, the vast majority of reported problems are related to custom nodes. Before filing a bug report, please read the [custom node troubleshooting](/zh-CN/troubleshooting/custom-node-issues) guide carefully to make sure the problem is not being mistaken for a ComfyUI core issue.
See how to troubleshoot problems caused by custom nodes.
## Common Issues and Quick Fixes
Before diving into detailed troubleshooting, try these common solutions:
### ComfyUI Won't Start
**Symptoms:** the application crashes at startup, shows a black screen, or fails to load
**Quick fixes:**
1. **Check system requirements** - make sure your system meets the [minimum requirements](/zh-CN/installation/system_requirements)
2. **Update GPU drivers** - download the latest drivers from NVIDIA/AMD/Intel
### Generation Fails or Produces Errors
**Symptoms:** a "Prompt execution failed" dialog with a "Show report" button; the workflow stops executing
**Quick fixes:**
1. **Click "Show report"** - read the detailed error message to identify the specific problem
2. **Check whether it is a custom node issue** - [follow our custom node troubleshooting guide](/zh-CN/troubleshooting/custom-node-issues)
3. **Verify model files** - see the [models documentation](/zh-CN/development/core-concepts/models) for model setup
4. **Check VRAM usage** - close other applications that use GPU memory
### Slow Performance
**Symptoms:** very slow generation times, system freezes, out-of-memory errors
**Quick fixes:**
1. **Lower the resolution/batch size** - reduce the image size or image count
2. **Use memory optimization flags** - see the performance optimization section below
3. **Close unnecessary applications** - free up RAM and VRAM
4. **Check CPU/GPU utilization** - use the task manager to identify bottlenecks
**Performance optimization commands:**
For low-VRAM systems:
```bash
# Low VRAM mode (splits models into parts)
python main.py --lowvram
# Even lower VRAM mode, for when --lowvram is not enough
python main.py --novram
# CPU mode (very slow, but works on any hardware)
python main.py --cpu
```
To improve performance:
```bash
# Disable previews (saves VRAM and processing)
python main.py --preview-method none
# Keep models in VRAM (faster, but uses more VRAM)
python main.py --highvram
# Force FP16 precision (faster, uses less VRAM)
python main.py --force-fp16
# Use optimized attention mechanisms
python main.py --use-pytorch-cross-attention
python main.py --use-flash-attention
# Asynchronous weight offloading
python main.py --async-offload
```
Memory management:
```bash
# Reserve a specific amount of VRAM for the OS (in GB)
python main.py --reserve-vram 2
# Disable smart memory management
python main.py --disable-smart-memory
# Use a different cache strategy
python main.py --cache-none # lower memory usage
python main.py --cache-lru 10 # cache 10 results
```
## Installation Problems
### Desktop App Issues
For comprehensive desktop installation troubleshooting, see the [desktop installation guide](/zh-CN/installation/desktop/windows).
* **Won't install**: run the installer as administrator
* **Missing dependencies**: install the [Visual C++ Redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe)
* **Crashes at startup**: check Windows Event Viewer for error details
* **"App is damaged"**: allow the app in the Security & Privacy settings
* **Performance issues**: grant Full Disk Access in the privacy settings
* **Crashes**: check the Console app for crash reports
* **Missing libraries**: install dependencies with your package manager
* **LD\_LIBRARY\_PATH errors**: PyTorch library path issues (see below)
### Manual Installation Issues
The documentation may be slightly out of date. If something goes wrong, manually check whether a newer stable release of PyTorch or any listed library exists. Consult resources such as the [PyTorch installation matrix](https://pytorch.org/get-started/locally/) or the [ROCm website](https://rocm.docs.amd.com/projects/install-on-linux/en/develop/install/3rd-party/pytorch-install.html#using-a-wheels-package).
**Python version conflicts:**
```bash
# Check the Python version (3.9+ required, 3.12 recommended)
python --version
# Use a virtual environment (recommended)
python -m venv comfyui_env
source comfyui_env/bin/activate # Linux/Mac
comfyui_env\Scripts\activate # Windows
```
**Package installation failures:**
```bash
# Update pip first
python -m pip install --upgrade pip
# Install dependencies
pip install -r requirements.txt
# For NVIDIA GPUs (CUDA 12.8)
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
# For AMD GPUs (Linux only - ROCm 6.3)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.3
```
### Linux-Specific Issues
**LD\_LIBRARY\_PATH errors:**
Common errors:
* "libcuda.so.1: cannot open shared object file"
* "libnccl.so: cannot open shared object file"
* "ImportError: libnvinfer.so.X: cannot open shared object file"
**Solutions:**
1. **Modern PyTorch installations (most common):**
```bash
# For virtual environments with NVIDIA packages
export LD_LIBRARY_PATH=$VIRTUAL_ENV/lib/python3.12/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# For conda environments
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib/python3.12/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# Or find your Python site-packages automatically
PYTHON_PATH=$(python -c "import site; print(site.getsitepackages()[0])")
export LD_LIBRARY_PATH=$PYTHON_PATH/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# You may also need other NVIDIA libraries
export LD_LIBRARY_PATH=$PYTHON_PATH/nvidia/cuda_runtime/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$PYTHON_PATH/nvidia/cublas/lib:$LD_LIBRARY_PATH
```
2. **Find which libraries you have:**
```bash
# Check installed NVIDIA packages
python -c "import site; import os; nvidia_path=os.path.join(site.getsitepackages()[0], 'nvidia'); print('NVIDIA libs:', [d for d in os.listdir(nvidia_path) if os.path.isdir(os.path.join(nvidia_path, d))] if os.path.exists(nvidia_path) else 'Not found')"
# Find the missing libraries PyTorch needs
python -c "import torch; print(torch.__file__)"
ldd $(python -c "import torch; print(torch.__file__.replace('__init__.py', 'lib/libtorch_cuda.so'))")
```
3. **Set it permanently for your environment:**
```bash
# For virtual environments, append to the activation script
echo 'export LD_LIBRARY_PATH=$VIRTUAL_ENV/lib/python*/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH' >> $VIRTUAL_ENV/bin/activate
# For conda environments
conda env config vars set LD_LIBRARY_PATH=$CONDA_PREFIX/lib/python*/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
# For a global bashrc (adjust the Python version as needed)
echo 'export LD_LIBRARY_PATH=$(python -c "import site; print(site.getsitepackages()[0])")/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
```
4. **Alternative: use ldconfig:**
```bash
# Check the current library cache
ldconfig -p | grep cuda
ldconfig -p | grep nccl
# If missing, add the library path (requires root; tee is used because
# a plain "sudo echo ... >" would run the redirect unprivileged)
echo "/usr/local/cuda/lib64" | sudo tee /etc/ld.so.conf.d/cuda.conf
sudo ldconfig
```
5. **Debug library loading:**
```bash
# Verbose library loading to see what is missing
LD_DEBUG=libs python main.py 2>&1 | grep "looking for"
# Check PyTorch CUDA availability
python -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('CUDA version:', torch.version.cuda)"
```
## Model-Related Issues
For comprehensive model troubleshooting, including architecture mismatches, missing models, and loading errors, see the dedicated [model issues](/zh-CN/troubleshooting/model-issues) page.
## Network and API Issues
### API Nodes Not Working
**Symptoms:** API calls fail, timeout errors, quota exceeded
**Solutions:**
1. **Check API key validity** - verify the key in [user settings](/zh-CN/interface/user)
2. **Check account credits** - make sure you have enough [API credits](/zh-CN/interface/credits)
3. **Verify internet connectivity** - test with other online services
4. **Check service status** - the provider may be experiencing an outage
### Connection Issues
**Symptoms:** "Unable to connect to server", timeout errors
**Solutions:**
1. **Check firewall settings** - allow ComfyUI through the firewall
2. **Try a different port** - the default is 8188; try 8189 or 8190
3. **Temporarily disable your VPN** - a VPN may block the connection
4. **Check proxy settings** - disable the proxy if it is not needed
## Hardware-Specific Issues
### NVIDIA GPU Issues
**CUDA errors, GPU not detected:**
```bash
# Check the CUDA installation
nvidia-smi
# Verify PyTorch CUDA support
python -c "import torch; print(torch.cuda.is_available())"
# Reinstall PyTorch with CUDA
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```
### AMD GPU Issues
**ROCm support, performance problems:**
```bash
# Install the ROCm build of PyTorch
pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm5.7
```
### Apple Silicon (M1/M2/M3) Issues
**MPS backend errors:**
```bash
# Check MPS availability
python -c "import torch; print(torch.backends.mps.is_available())"
# Force CPU if MPS causes problems
python main.py --force-fp16 --cpu
```
## Getting Help and Reporting Bugs
### Before Reporting a Bug
1. **Check whether it is a known issue:**
* Search [GitHub Issues](https://github.com/comfyanonymous/ComfyUI/issues)
* Check the [ComfyUI Forum](https://forum.comfy.org/)
* Review the [Discord discussions](https://discord.com/invite/comfyorg)
2. **Try basic troubleshooting:**
* Test with the [default workflow](/zh-CN/get_started/first_generation)
* Disable all custom nodes (see [custom node troubleshooting](/zh-CN/troubleshooting/custom-node-issues))
* Check the console/terminal for error messages
### How to Report Bugs Effectively
#### For ComfyUI Core Issues
**File at:** [GitHub Issues](https://github.com/comfyanonymous/ComfyUI/issues)
#### For Desktop App Issues
**File at:** [Desktop GitHub Issues](https://github.com/Comfy-Org/desktop/issues)
#### For Frontend Issues
**File at:** [Frontend GitHub Issues](https://github.com/Comfy-Org/ComfyUI_frontend/issues)
#### For Custom Node Issues
**File at:** the repository of the custom node in question
### Information to Include in Your Issue
When reporting any problem, include the following:
* Operating system (Windows 11, macOS 14.1, Ubuntu 22.04, etc.)
* ComfyUI version (check the About page in settings)
* Python version: `python --version`
* PyTorch version: `python -c "import torch; print(torch.__version__)"`
* GPU model and driver version
```bash
# System information
systeminfo | findstr /C:"OS Name" /C:"OS Version"
# GPU information
wmic path win32_VideoController get name
# Python and PyTorch information
python --version
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"
```
```bash
# System information
uname -a
# GPU information (Linux)
lspci | grep VGA
# Python and PyTorch information
python --version
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"
```
* A clear description of the problem
* Steps to reproduce it
* Expected vs. actual behavior
* Screenshots or a screen recording of the reproduction, if possible
* The full error text from the console/terminal
* Browser console errors (F12 → Console tab)
* Any crash logs or error dialogs
* The list of installed custom nodes
* The workflow file (.json) that reproduces the problem
* Recent changes (new installs, updates, etc.)
## Community Resources
* **Official forum:** [forum.comfy.org](https://forum.comfy.org/)
* **Discord:** [ComfyUI Discord server](https://discord.com/invite/comfyorg)
* **Reddit:** [r/comfyui](https://reddit.com/r/comfyui)
* **YouTube:** [ComfyUI tutorials](https://www.youtube.com/@comfyorg)
For urgent issues affecting many users, check our [status page](https://status.comfy.org) and the [official Twitter](https://x.com/ComfyUI) for announcements.
# ComfyUI Hunyuan3D-2 Examples
Source: https://docs.comfy.org/zh-CN/tutorials/3d/hunyuan3D-2
This article walks through 3D asset generation workflow examples in ComfyUI using Hunyuan3D-2.
# Introduction to Hunyuan3D 2.0


[Hunyuan3D 2.0](https://github.com/Tencent/Hunyuan3D-2) is an open-source 3D asset generation model from Tencent that can generate high-fidelity 3D models with high-resolution texture maps from text, images, or sketches.
Hunyuan3D 2.0 uses two-stage generation: it first generates an untextured geometric model and then synthesizes a high-resolution texture map, effectively decoupling the complexity of shape and texture generation. Its two core components are:
1. **Geometry generation model (Hunyuan3D-DiT)**: a flow-diffusion-based Transformer architecture that generates untextured geometry precisely matching the input conditions.
2. **Texture generation model (Hunyuan3D-Paint)**: combines geometric conditioning with multi-view diffusion to add high-resolution textures to the model, with PBR material support.
**Key advantages**
* **High-precision generation**: sharp geometry and rich texture colors, with PBR material generation for near-realistic lighting.
* **Multiple ways to use it**: code APIs, a Blender plugin, a Gradio app, and an online demo on the official site, suiting different user needs.
* **Lightweight and compatible**: the Hunyuan3D-2mini model needs only 5GB of VRAM; shape generation with the standard model needs 6GB, and the full pipeline (shape + texture) needs only 12GB.
Recently (March 18, 2025), Hunyuan3D 2.0 also released a multi-view shape generation model (Hunyuan3D-2mv) that supports inputs from multiple viewpoints to generate finer geometry.
This example includes three workflows:
* Generating a 3D model with Hunyuan3D-2mv and multiple view inputs
* Generating a 3D model with Hunyuan3D-2mv-turbo and multiple view inputs
* Generating a 3D model with Hunyuan3D-2 and a single view input
ComfyUI natively supports Hunyuan3D-2mv but does not yet support texture and material generation. Before starting, make sure you have upgraded to the latest version of [ComfyUI](https://github.com/comfyanonymous/ComfyUI).
The example input images in the workflow sections below are PNG files whose metadata embeds the workflow JSON. To load one:
* Drag it directly into ComfyUI
* Use the menu `Workflows` -> `Open(ctrl+o)`
This loads the corresponding workflow and prompts you to download the models; the resulting `.glb` model is written to the `ComfyUI/output/mesh` folder.
## ComfyUI Hunyuan3D-2mv Workflow Example
In the Hunyuan3D-2mv workflow, we generate a 3D model from multi-view images. The extra views are optional in this workflow: you can generate a 3D model from just the `front` view.
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest development (nightly) version.
2. You are on the stable or desktop release (which doesn't include the latest updates).
3. Some nodes failed to import at startup.
### 1. Workflow
Download the image below and drag it into ComfyUI to load the workflow,

Download the images below; we will use them as the image inputs
The input images provided in this example have been pre-processed to remove extra background. In practice, you can use custom nodes such as [ComfyUI\_essentials](https://github.com/cubiq/ComfyUI_essentials) to remove backgrounds automatically.
### 2. Install the Models Manually
Download the model below and save it to the corresponding ComfyUI folder
* hunyuan3d-dit-v2-mv: [model.fp16.safetensors](https://huggingface.co/tencent/Hunyuan3D-2mv/resolve/main/hunyuan3d-dit-v2-mv/model.fp16.safetensors?download=true) - after downloading, rename it to `hunyuan3d-dit-v2-mv.safetensors`
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── hunyuan3d-dit-v2-mv.safetensors // renamed file
```
### 3. Run the Workflow Step by Step
1. Make sure the Image Only Checkpoint Loader(img2vid model) node has loaded the `hunyuan3d-dit-v2-mv.safetensors` model we downloaded and renamed
2. Load the image for each view in the corresponding `Load Image` node
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
If you need more views, make sure the extra views are connected in the `Hunyuan3Dv2ConditioningMultiView` node and that the corresponding images are loaded in the `Load Image` nodes.
## Hunyuan3D-2mv-turbo Workflow
In the Hunyuan3D-2mv-turbo workflow, we use the Hunyuan3D-2mv-turbo model, a step-distillation version of Hunyuan3D-2mv that generates 3D models faster. In this version of the workflow we set `cfg` to 1.0 and add a `flux guidance` node to control the `distilled cfg`.
### 1. Workflow
Download the image below and drag it into ComfyUI to load the workflow,

We will use the images below as the multi-view inputs
### 2. Install the Models Manually
Download the model below and save it to the corresponding ComfyUI folder
* hunyuan3d-dit-v2-mv-turbo: [model.fp16.safetensors](https://huggingface.co/tencent/Hunyuan3D-2mv/resolve/main/hunyuan3d-dit-v2-mv-turbo/model.fp16.safetensors?download=true) - after downloading, rename it to `hunyuan3d-dit-v2-mv-turbo.safetensors`
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── hunyuan3d-dit-v2-mv-turbo.safetensors // renamed file
```
### 3. Run the Workflow Step by Step
1. Make sure the `Image Only Checkpoint Loader(img2vid model)` node has loaded the renamed `hunyuan3d-dit-v2-mv-turbo.safetensors` model
2. Load the image for each view in the corresponding `Load Image` node
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Hunyuan3D-2 Single-View Workflow
In the Hunyuan3D-2 workflow, we use the Hunyuan3D-2 model, which is not a multi-view model. In this workflow, the `Hunyuan3Dv2ConditioningMultiView` node is replaced with the `Hunyuan3Dv2Conditioning` node.
### 1. Workflow
Download the image below and drag it into ComfyUI to load the workflow

We will use this image as the input

### 2. Install the Models Manually
Download the model below and save it to the corresponding ComfyUI folder
* hunyuan3d-dit-v2-0: [model.fp16.safetensors](https://huggingface.co/tencent/Hunyuan3D-2/resolve/main/hunyuan3d-dit-v2-0/model.fp16.safetensors?download=true) - after downloading, rename it to `hunyuan3d-dit-v2.safetensors`
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── hunyuan3d-dit-v2.safetensors // renamed file
```
### 3. Run the Workflow Step by Step
1. Make sure the `Image Only Checkpoint Loader(img2vid model)` node has loaded the renamed `hunyuan3d-dit-v2.safetensors` model
2. Load the input image in the `Load Image` node
3. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Community Resources
Below are ComfyUI community resources related to Hunyuan3D-2
* [ComfyUI-Hunyuan3DWrapper](https://github.com/kijai/ComfyUI-Hunyuan3DWrapper)
* [Kijai/Hunyuan3D-2\_safetensors](https://huggingface.co/Kijai/Hunyuan3D-2_safetensors/tree/main)
* [ComfyUI-3D-Pack](https://github.com/MrForExample/ComfyUI-3D-Pack)
## The Hunyuan3D 2.0 Open-Source Model Family
Hunyuan3D 2.0 has open-sourced several models covering the full 3D generation pipeline; visit [Hunyuan3D-2](https://github.com/Tencent/Hunyuan3D-2) to learn more.
**Hunyuan3D-2mini series**
| Model | Description | Date | Parameters | Huggingface |
| --------------------- | ------------------------- | ---------- | ---------- | ------------------------------------------------------------------------------------ |
| Hunyuan3D-DiT-v2-mini | Mini image-to-shape model | 2025-03-18 | 0.6B | [Link](https://huggingface.co/tencent/Hunyuan3D-2mini/tree/main/hunyuan3d-dit-v2-mini) |
**Hunyuan3D-2mv series**
| Model | Description | Date | Parameters | Huggingface |
| ------------------------ | ------------------------------------------------------------------------------------ | ---------- | ---------- | ------------------------------------------------------------------------------------- |
| Hunyuan3D-DiT-v2-mv-Fast | Guidance-distilled version that roughly halves DiT inference time | 2025-03-18 | 1.1B | [Link](https://huggingface.co/tencent/Hunyuan3D-2mv/tree/main/hunyuan3d-dit-v2-mv-fast) |
| Hunyuan3D-DiT-v2-mv | Multi-view image-to-shape model, suited to 3D creation that needs the scene understood from several angles | 2025-03-18 | 1.1B | [Link](https://huggingface.co/tencent/Hunyuan3D-2mv/tree/main/hunyuan3d-dit-v2-mv) |
**Hunyuan3D-2 series**
| Model | Description | Date | Parameters | Huggingface |
| ----------------------- | ------------------------- | ---------- | ---------- | ---------------------------------------------------------------------------------- |
| Hunyuan3D-DiT-v2-0-Fast | Guidance-distilled model | 2025-02-03 | 1.1B | [Link](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-dit-v2-0-fast) |
| Hunyuan3D-DiT-v2-0 | Image-to-shape model | 2025-01-21 | 1.1B | [Link](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-dit-v2-0) |
| Hunyuan3D-Paint-v2-0 | Texture generation model | 2025-01-21 | 1.3B | [Link](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-paint-v2-0) |
| Hunyuan3D-Delight-v2-0 | Image delighting model | 2025-01-21 | 1.3B | [Link](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-delight-v2-0) |
# Flux 1.1 Pro Ultra Image API Node: Official ComfyUI Example Workflows
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/black-forest-labs/flux-1-1-pro-ultra-image
This article covers using the Flux 1.1 Pro Ultra Image API node in ComfyUI
FLUX 1.1 Pro Ultra is a high-performance AI image generation tool from Black Forest Labs, focused on ultra-high resolution and fast generation. It supports images up to 4MP (4× the standard version) while keeping single-image generation under 10 seconds, 2.5× faster than comparable high-resolution models.
The tool offers two core modes:
* **Ultra mode**: built for high-resolution needs such as advertising and e-commerce, where details must hold up when enlarged; it follows prompts accurately while maintaining generation speed.
* **Raw mode**: emphasizes natural realism, improving skin tones, lighting, and natural landscape detail and reducing the "AI look"; suited to photographic and realistic styles.
ComfyUI now supports the Flux 1.1 Pro Ultra Image node. This article covers:
* Flux 1.1 Pro text-to-image
* Flux 1.1 Pro image-to-image (Remix)
Using API nodes requires that you are logged in and on a permitted network; see the [API nodes overview](/zh-CN/tutorials/api-nodes/overview) for the specific requirements.
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest development (nightly) version.
2. You are on the stable or desktop release (which doesn't include the latest updates).
3. Some nodes failed to import at startup.
## Flux 1.1 Pro Ultra Image Node Documentation
See the documentation below for the node's detailed parameter settings
* [Flux 1.1 Pro Ultra Image](/zh-CN/built-in-nodes/api-node/image/bfl/flux-1-1-pro-ultra-image)
## Flux 1.1 \[pro] Text-to-Image Tutorial
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the workflow

### 2. Run the Workflow Step by Step
You can follow the numbered steps in the image for a basic run
1. (Optional) Edit the workflow's prompt in the `prompt` field of the `Flux 1.1 [pro] Ultra Image` node
2. (Optional) Change the `raw` parameter to `false` to make the output image more realistic
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
4. Once the API returns the result, view the generated image in the `Save Image` node; it is also saved to the `ComfyUI/output/` directory
## Flux 1.1 \[pro] Image-to-Image Tutorial
When an `image_prompt` is supplied as a node input, the output blends (remixes) the characteristics of the input image. `image_prompt_strength` controls the blend ratio: the higher the value, the closer the result is to the input image.
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the workflow, or, in the **text-to-image workflow**, right-click the purple node and set `mode` to `always` to enable the `image_prompt` input

We will use the image below as the input:

### 2. Run the Workflow Step by Step
You can follow the numbered steps in the image for a basic run
1. Click **Upload** on the `Load Image` node to upload the input image
2. (Optional) Adjust `image_prompt_strength` in `Flux 1.1 [pro] Ultra Image` to change the blend ratio
3. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
4. Once the API returns the result, view the generated image in the `Save Image` node; it is also saved to the `ComfyUI/output/` directory
Below is a comparison of outputs at different `image_prompt_strength` values
# ComfyUI Flux.1 Kontext Pro Image API 节点 ComfyUI 官方示例
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/black-forest-labs/flux-1-kontext
本文将介绍如何在 ComfyUI 中使用 Flux.1 Kontext Pro Image API 节点来完成图像编辑功能
FLUX.1 Kontext 是由 Black Forest Labs 开发的一款专业的图像到图像编辑模型,专注于智能理解图像上下文并执行精确编辑。
能够在无需复杂描述的情况下实现多种编辑任务,包括对象修改、风格转换、背景替换、角色一致性编辑和文本编辑等。
Kontext 的核心优势在于其出色的上下文理解能力和角色一致性保持,即使经过多次迭代编辑,也能确保人物特征、构图布局等关键元素保持稳定。
目前,ComfyUI 中支持了 Flux.1 Kontext 的两个模型:
* **Kontext Pro** 适合编辑、合成和混音。
* **Kontext Max** 在排版、提示词精确度和速度方面突破极限。
本篇指南,我们将通过对应的工作流来简单介绍如何使用 Flux.1 Kontext 的相关 API 节点来完成图像编辑。
使用 API 节点需要保证你已经正常登录,并在受许可的网络环境下使用,请参考[API 节点总览](/zh-CN/tutorials/api-nodes/overview)部分文档来了解使用 API 节点的具体使用要求。
请确保你的 ComfyUI 已经更新。
* [ComfyUI 下载](https://www.comfy.org/download)
* [ComfyUI 更新教程](/zh-CN/installation/update_comfyui)
本指南里的工作流可以在 ComfyUI 的[工作流模板](/zh-CN/interface/features/template)中找到。如果找不到,可能是 ComfyUI 没有更新。
如果加载工作流时有节点缺失,可能原因有:
1. 你用的不是最新开发版(nightly)。
2. 你用的是稳定版或桌面版(没有包含最新的更新)。
3. 启动时有些节点导入失败。
## Flux.1 Kontext 多图输入工作流
我们最近更新支持了多图输入工作流,使用新增的 `Image Stitch` 节点,将允许你将多张图像拼接成一张图像,并使用 Flux.1 Kontext 进行编辑。
### 1. 工作流文件下载
下面的图片的`metadata`中已经包含工作流信息,请下载并拖入 ComfyUI 中加载对应工作流。

下载下面的图片用于输入或者使用你自己的图片:



### 2. 按步骤完成工作流的运行
你可参考图片中的序号来完成图工作流的运行:
1. 在 `Load image` 节点中请分别上传提供的图片
2. 在 `Flux.1 Kontext Pro Image` 修改必要的参数:
* `prompt` 输入你想要编辑的图像的提示词
* `aspect_ratio` 设置原图的高宽比,比例必须在 1:4 到 4:1 之间
* `prompt_upsampling` 设置是否使用提示词上采样,如果开启,会自动修改提示词以获得更丰富的结果,但结果是不可重复的
3. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行图像的编辑。
4. 等待 API 返回结果后,你可在 `Save Image` 节点中查看编辑后的图像,对应的图像也会被保存至 `ComfyUI/output/` 目录下。
后续的两个工作流只是使用的 API 节点不同,实际上你只需要基于多图输入工作流修改即可,没有太大的差别
## Flux.1 Kontext Pro Image API 节点 工作流
### 1. 工作流文件下载
下面的图片的`metadata`中已经包含工作流信息,请下载并拖入 ComfyUI 中加载对应工作流。

下载下面的图片用于输入或者使用你自己的图片:

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the workflow:
1. Load the image you want to edit in the `Load Image` node
2. Set the required parameters in the `Flux.1 Kontext Pro Image` node
3. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the edit.
4. Once the API returns, you can view the edited image in the `Save Image` node; it is also saved to `ComfyUI/output/`.
## Flux.1 Kontext Max Image API Node Workflow
### 1. Download the Workflow File
The workflow is embedded in the `metadata` of the image below. Download it and drag it into ComfyUI to load the workflow.

Download the image below as input, or use your own:

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the workflow:
1. Load the image you want to edit in the `Load Image` node
2. Set the required parameters in the `Flux.1 Kontext Max Image` node
3. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the edit.
4. Once the API returns, you can view the edited image in the `Save Image` node; it is also saved to `ComfyUI/output/`.
## Flux Kontext Prompting Tips
### 1. Basic Modifications
* Keep it simple and direct: `"Change the car color to red"`
* Preserve the style: `"Change to daytime while maintaining the same style of the painting"`
### 2. Style Transfer
**Principles:**
* Name the style explicitly: `"Transform to Bauhaus art style"`
* Describe its characteristics: `"Transform to oil painting with visible brushstrokes, thick paint texture"`
* Preserve the composition: `"Change to Bauhaus style while maintaining the original composition"`
### 3. Character Consistency
**Framework:**
* Be specific: `"The woman with short black hair"` rather than `"she"`
* Preserve features: `"while maintaining the same facial features, hairstyle, and expression"`
* Edit in steps: change the background first, then the pose
### 4. Text Editing
* Use quotation marks: `"Replace 'joy' with 'BFL'"`
* Preserve the formatting: `"Replace text while maintaining the same font style"`
## Troubleshooting Common Issues
### The character changes too much
❌ Wrong: `"Transform the person into a Viking"`
✅ Right: `"Change the clothes to be a viking warrior while preserving facial features"`
### The composition shifts
❌ Wrong: `"Put him on a beach"`
✅ Right: `"Change the background to a beach while keeping the person in the exact same position, scale, and pose"`
### The style is applied inaccurately
❌ Wrong: `"Make it a sketch"`
✅ Right: `"Convert to pencil sketch with natural graphite lines, cross-hatching, and visible paper texture"`
## Core Principles
1. **Be specific** - use precise descriptions and avoid vague wording
2. **Edit step by step** - break complex changes into several simple steps
3. **State what to preserve** - say explicitly what must stay the same
4. **Choose verbs carefully** - prefer "change" or "replace" over "transform"
## Best-Practice Templates
**Object modification:**
`"Change [object] to [new state], keep [content to preserve] unchanged"`
**Style transfer:**
`"Transform to [specific style], while maintaining [composition/character/other] unchanged"`
**Background replacement:**
`"Change the background to [new background], keep the subject in the exact same position and pose"`
**Text editing:**
`"Replace '[original text]' with '[new text]', maintain the same font style"`
> **Remember:** the more specific, the better. Kontext excels at following detailed instructions while staying consistent.
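When generating many edits in batches, the templates above can also be filled in programmatically. A minimal sketch; the `build_prompt` helper is ours, not part of ComfyUI or the Kontext API:

```python
# Best-practice prompt templates from this guide.
TEMPLATES = {
    "object": "Change {object} to {new_state}, keep {preserve} unchanged",
    "style": "Transform to {style}, while maintaining {preserve} unchanged",
    "background": ("Change the background to {background}, "
                   "keep the subject in the exact same position and pose"),
    "text": "Replace '{old}' with '{new}', maintain the same font style",
}

def build_prompt(kind: str, **fields: str) -> str:
    """Fill one of the best-practice templates with concrete values."""
    return TEMPLATES[kind].format(**fields)

print(build_prompt("object", object="the car", new_state="red", preserve="the background"))
# Change the car to red, keep the background unchanged
```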
# API Nodes FAQ
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/faq
Common issues you may run into when using API Nodes.
This page collects frequently asked questions about using API nodes.
Please update ComfyUI to the latest version (the latest commit, or the latest [desktop release](https://www.comfy.org/download)). We may add support for more APIs over time, and the nodes will be updated accordingly, so keep ComfyUI up to date.
Note the difference between the nightly and release versions: in some cases only the `nightly` version (the latest code commit) contains the newest nodes, because releases may lag behind.
We are still iterating quickly, so if you cannot find a node, first make sure you are on the latest version.
API access requires that your request comes from a secure network environment. The current requirements are:
* Local access is only allowed via `127.0.0.1` or `localhost`; this means you may not be able to use API nodes on a ComfyUI instance started with `--listen` over a LAN
* You must be able to reach our API service (a proxy may be required in some regions)
* Your account must have sufficient [credits](/zh-CN/interface/credits)
Because API Nodes call closed-source models through an API, they consume credits and cannot be used for free.
Please follow the steps below:
1. [Comfy account](/zh-CN/interface/user): log in from the `User` section of the settings menu
2. [Credits](/zh-CN/interface/credits): after logging in, a Credits menu appears in the settings; purchase credits under `Settings` → `Credits`. We use prepayment, so there are no surprise charges
3. Complete the payment via Stripe
4. Check whether your credits have updated; if not, try restarting or refreshing the page
We currently do not offer refunds for credits.
If you believe a technical error has left you with an unused balance, please [contact support](mailto:support@comfy.org).
Credits cannot go negative, so make sure you have enough credits before making an API call.
After logging in, visit the [Credits](/zh-CN/interface/credits) menu to check your balance.
API Nodes are still in beta; this is not supported yet, but the feature is under consideration.
No, your credits do not expire.
No, your credits cannot be transferred to other users and can only be used by the logged-in account, but we do not limit the number of devices you can log in from.
We do not limit the number of devices; you can use your account wherever you like.
Send a request to [support@comfy.org](mailto:support@comfy.org) and we will delete your information.
# Google Gemini API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/google/gemini
This guide explains how to use the Google Gemini API node in ComfyUI for chat.
Google Gemini is a powerful AI model from Google that supports chat, text generation, and more. ComfyUI has integrated the Google Gemini API, so you can use the corresponding nodes directly in ComfyUI for chat.
In this guide, we will walk you through the chat workflow.
## Google Gemini Chat Workflow
### 1. Download the Workflow File
Download the JSON file below and drag it into ComfyUI to load the workflow.
Download the workflow file in JSON format
### 2. Run the Workflow Step by Step
In this template, we set up a system prompt for prompt analysis, which interprets your image into a corresponding drawing prompt.
Follow the numbered steps in the image to run the basic workflow:
1. Load the image you want the AI to interpret in the `Load Image` node
2. (Optional) Modify the prompt in the `Google Gemini` node to have the AI perform a specific task
3. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the chat.
4. Once the API returns, you can view the AI's response in the `Preview Any` node.
### 3. Additional Notes
* The file input node `Gemini Input Files` currently requires files to be uploaded to `ComfyUI/input/` first. The node is being improved, and we will update the template afterwards
* The workflow includes an example of using `Batch Images` for input. If you have multiple images for the AI to interpret, follow the step image: right-click the node and set its mode to `always` to enable it
# ComfyUI Ideogram 3.0 API Node Official Examples
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/ideogram/ideogram-v3
This guide covers the features of the Ideogram 3.0 API node in ComfyUI.
Ideogram 3.0 is a powerful text-to-image model from Ideogram, known for its photographic realism, precise text rendering, and consistent style control.
The [Ideogram V3](/zh-CN/built-in-nodes/api-node/image/ideogram/ideogram-v3) node currently supports two modes:
* Text-to-image mode
* Image editing mode (when both an image and a mask are provided)
## Ideogram 3.0 Node Documentation
See the documentation below for detailed parameter settings:
* [Ideogram V3](/zh-CN/built-in-nodes/api-node/image/ideogram/ideogram-v3)
## Ideogram 3.0 API Node Text-to-Image Mode
When you use [Ideogram V3](/zh-CN/built-in-nodes/api-node/image/ideogram/ideogram-v3) without an image and mask input, the node runs in text-to-image mode.
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the workflow.

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic workflow:
1. Enter a description of the image you want in the `prompt` field of the `Ideogram V3` node
2. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
3. Once the API returns, you can view the generated image in the `Save Image` node; it is also saved to `ComfyUI/output/`
## Ideogram 3.0 API Node Image Editing Mode
\[To be updated]
# Luma Image to Image API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/luma/luma-image-to-image
This guide explains how to use the Luma Image to Image API node in ComfyUI.
The [Luma Image to Image](/zh-CN/built-in-nodes/api-node/image/luma/luma-image-to-image) node lets you use Luma AI's technology to modify an existing image from a text prompt while preserving certain features and structure of the original.
In this guide, we will walk you through setting up an image-to-image workflow with this node.
## Luma Image to Image Node Documentation
See the documentation below for detailed parameter settings:
Luma Image to Image API node documentation
## Luma Image to Image API Node Image-to-Image Workflow
This feature works well for changing objects and shapes, but may be less effective for changing colors. A low weight value, roughly between 0.0 and 0.1, is recommended.
### 1. Download the Workflow File
The workflow is embedded in the `metadata` of the image below. Download it and drag it into ComfyUI to load the workflow.

Download the image below; we will use it as input:

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic workflow:
1. Click **upload** in the `Load Image` node and upload the input image
2. (Optional) Edit the workflow's prompt
3. (Optional) Adjust `image_weight` to change the influence of the input image (lower values stay closer to the original)
4. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
5. Once the API returns, you can view the generated image in the `Save Image` node; it is also saved to `ComfyUI/output/`
### 3. Results with Different `image_weight` Values
# Luma Image to Video API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/luma/luma-image-to-video
This guide explains how to use the Luma Image to Video API node in ComfyUI.
The [Luma Image to Video](/zh-CN/built-in-nodes/api-node/video/luma/luma-image-to-video) node lets you use Luma AI's technology to turn static images into smooth, dynamic video, bringing them to life.
In this guide, we will walk you through setting up an image-to-video workflow with this node.
## Luma Image to Video Node Documentation
See the documentation below for detailed parameter settings:
Luma Image to Video API node documentation
Luma Concepts API node documentation
## Luma Image to Video API Node Image-to-Video Workflow
The Luma Image to Video node requires at least one image input (`first_image` or `last_image`), combined with a text prompt that determines the video's motion. In this guide, we built an example using `first_image` and `luma_concepts` so you can experience Luma AI's video generation capabilities.
### 1. Download the Workflow File
The workflow is embedded in the `metadata` of the video below. Download it and drag it into ComfyUI to load the workflow.

Download the image below; we will use it as input:

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic workflow:
1. Upload your input image to the `first_image` input
2. (Optional) Write a prompt in the Luma Image to Video node describing how you want the image to move
3. (Optional) Modify the `Luma Concepts` node to control camera movement and add professional camera language to the video
4. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
5. Once the API returns, you can view the generated video in the `Save Video` node; it is also saved to `ComfyUI/output/`
### 3. Additional Notes
* **Input image requirements**: at least one of `first_image` and `last_image` must be provided; each input accepts at most one image
* **Luma Concepts**: mainly used to control camera movement for more professional-looking shots
* **Seed parameter**: only determines whether the node should re-run; the actual output is unrelated to the seed value
* **Enabling input nodes**: to enable an input, right-click the corresponding purple node (currently in `Bypass` mode) and set its `mode` to `always`
* **Model selection**: different video generation models have different strengths; choose one via the model parameter
* **Resolution and duration**: adjust the output video's resolution and duration via the resolution and duration parameters
# Luma Text to Image API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/luma/luma-text-to-image
This guide explains how to use the Luma Text to Image API node in ComfyUI.
The [Luma Text to Image](/zh-CN/built-in-nodes/api-node/image/luma/luma-text-to-image) node lets you use Luma AI's technology to generate high-quality images from text prompts, producing both photorealistic content and artistic styles.
In this guide, we will walk you through setting up a text-to-image workflow with this node.
## Luma Text to Image Node Documentation
See the documentation below for detailed parameter settings:
Luma Text to Image API node documentation
Luma Reference API node documentation
## Luma Text to Image API Node Text-to-Image Workflow
When the `Luma Text to Image` node has no image inputs, the workflow is a plain text-to-image workflow. In this guide, we built an example using `style_image` and `image_luma_ref`
so you can experience Luma AI's image-processing capabilities.
### 1. Download the Workflow File
The workflow is embedded in the `metadata` of the image below. Download it and drag it into ComfyUI to load the workflow.

Download the images below; we will use them as inputs:


### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic workflow:
1. Upload the reference image in the `Load Image` node
2. Upload the style reference image in the `Load Image` node renamed to `styleref`
3. (Optional) Edit the prompt in the `Luma Text to Image` node
4. (Optional) Adjust `style_image_weight` to change the weight of the style reference image
5. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the image
6. Once the API returns, you can view the generated image in the `Save Image` node; it is also saved to `ComfyUI/output/`
### 3. Additional Notes
* The [corresponding node](/zh-CN/built-in-nodes/api-node/image/luma/luma-text-to-image) allows up to 4 reference images and 4 character references at the same time.
* To enable multiple image inputs, right-click the purple nodes in `Bypass` mode and set their `mode` to `always`
# Luma Text to Video API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/luma/luma-text-to-video
This guide explains how to use the Luma Text to Video API node in ComfyUI.
The [Luma Text to Video](/zh-CN/built-in-nodes/api-node/video/luma/luma-text-to-video) node lets you use Luma AI's video generation technology to create high-quality, smooth video from a text description.
In this guide, we will walk you through setting up a text-to-video workflow with this node.
## Luma Text to Video Node Documentation
See the documentation below for detailed parameter settings:
Luma Text to Video API node documentation
Luma Concepts API node documentation
## Luma Text to Video API Node Text-to-Video Workflow
The Luma Text to Video node requires a text prompt describing the video content. In this guide, we built an example using `prompt` and `luma_concepts` so you can experience Luma AI's video generation capabilities.
### 1. Download the Workflow File
The workflow is embedded in the `metadata` of the video below. Download it and drag it into ComfyUI to load the workflow.

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic workflow:
1. Write a prompt in the `Luma Text to Video` node describing the video you want
2. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
3. Once the API returns, you can view the generated video in the `Save Video` node; it is also saved to `ComfyUI/output/`
> (Optional) Modify the `Luma Concepts` node to control camera movement and add professional camera language to the video
### 3. Additional Notes
* **Prompt writing**: describe the scene, subject, motion, and mood in as much detail as possible for the best results
* **Luma Concepts**: mainly used to control camera movement for more professional-looking shots
* **Seed parameter**: only determines whether the node should re-run; the actual output is unrelated to the seed value
* **Model selection**: different video generation models have different strengths; choose one via the model parameter
* **Resolution and duration**: adjust the output video's resolution and duration via the resolution and duration parameters
* **Ray 1.6 model note**: when the Ray 1.6 model is used, the duration and resolution parameters have no effect
# Moonvalley API Node ComfyUI Official Examples
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/moonvalley/moonvalley-video-generation
This guide explains how to use the Moonvalley API nodes in ComfyUI for text-to-video, image-to-video, and video-to-video restyling.
Moonvalley Marey Realism v1.5 is an AI video generation model built for film-grade production. It is **trained entirely on commercially licensed content**, making it **copyright-safe for commercial use**.
## Highlights
* Strong prompt adherence: faithfully follows complex prompt instructions
* Native 1080p output: the training set is based on **1080p** video, producing fine-grained detail
* Realistic physics and motion: accurately simulates physical movement and natural dynamics for professional-grade realism
* Layered scenes with advanced lighting: supports foreground/midground/background layering of complex scenes, understands spatial relationships, and automatically generates realistic lighting for composite scenes
* Production-grade controls such as motion transfer and pose transfer
The Moonvalley API nodes are now natively supported in ComfyUI, so you can use the corresponding text-to-video, image-to-video, and video restyling capabilities directly.
## Moonvalley Text-to-Video Workflow
### 1. Download the Workflow File
Download the workflow file in JSON format
### 2. Run the Workflow Step by Step
1. Enter a positive prompt (what you want to appear in the frame)
2. Enter a negative prompt (what you do not want to appear in the frame)
3. Set the output video resolution
4. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
5. Once the API returns, you can view the generated video in the `Save Video` node; it is also saved to `ComfyUI/output/`
## Moonvalley Image-to-Video Workflow
### 1. Download the Workflow File
Download the workflow file in JSON format
Download the image below as the input image

### 2. Run the Workflow Step by Step
1. Load the input image in the `Load Image` node
2. Enter a positive prompt (what you want to appear in the frame)
3. Enter a negative prompt (what you do not want to appear in the frame)
4. Set the output video resolution
5. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
6. Once the API returns, you can view the generated video in the `Save Video` node; it is also saved to `ComfyUI/output/`
## Moonvalley Video-to-Video Workflow
The `Moonvalley Marey Video to Video` node lets you provide a reference video for restyling; generation can follow either the motion of the reference footage or the poses of its characters.
### 1. Download the Workflow File
Download the workflow file in JSON format
Download the video below as the input video:
### 2. Run the Workflow Step by Step
1. Load the reference video (or your own footage) in the `Load Video` node
* If the final video is 5s long, the input video must be longer than 5s
* If the final video is 10s long, the input video must be longer than 10s
2. Enter a positive prompt (what you want to appear in the frame)
3. Enter a negative prompt (what you do not want to appear in the frame)
4. Set the `length` parameter to the desired video duration
5. Set the `control_type` parameter to choose the restyling reference:
* `Motion Transfer`: generate following the motion of the reference video
* `Pose Transfer`: generate following the character poses in the reference video
6. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
7. Once the API returns, you can view the generated video in the `Save Video` node; it is also saved to `ComfyUI/output/`
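The reference-length rule in step 1 (the input clip must be longer than the requested output) can be expressed as a quick pre-flight check. The helper name is ours, and this runs outside ComfyUI:

```python
def check_reference_length(reference_seconds: float, output_seconds: float) -> None:
    """The reference clip must be strictly longer than the requested output."""
    if reference_seconds <= output_seconds:
        raise ValueError(
            f"reference video ({reference_seconds}s) must be longer than "
            f"the requested output ({output_seconds}s)"
        )

check_reference_length(7.2, 5)  # a 7.2s clip is enough for a 5s output
```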
# OpenAI Chat API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/openai/chat
This guide explains how to use the OpenAI Chat API node in ComfyUI for chat.
OpenAI is a technology company focused on generative AI, offering powerful chat capabilities. ComfyUI has integrated the OpenAI API, so you can use the corresponding nodes directly in ComfyUI for chat.
In this guide, we will walk you through the chat workflow.
## OpenAI Chat Workflow
### 1. Download the Workflow File
Download the JSON file below and drag it into ComfyUI to load the workflow.
Download the workflow file in JSON format
### 2. Run the Workflow Step by Step
In this template, we set up a persona for prompt analysis.
Follow the numbered steps in the image to run the basic workflow:
1. Load the image you want the AI to interpret in the `Load Image` node
2. (Optional) Adjust the settings in `OpenAI Chat Advanced Options` to have the AI perform a specific task
3. In the `OpenAI Chat` node, edit `Prompt` to set the chat prompt, or change `model` to select a different model
4. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the chat.
5. Once the API returns, you can view the AI's response in the `Preview Any` node.
### 3. Additional Notes
* The file input node `OpenAI Chat Input Files` currently requires files to be uploaded to `ComfyUI/input/` first. The node is being improved, and we will update the template afterwards
* The workflow includes an example of using `Batch Images` for input. If you have multiple images for the AI to interpret, follow the step image: right-click the node and set its mode to `always` to enable it
# OpenAI DALL·E 2 Node
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/openai/dall-e-2
Learn how to generate images in ComfyUI with the OpenAI DALL·E 2 API node
OpenAI DALL·E 2 is part of the ComfyUI API node family; it lets users generate images with OpenAI's **DALL·E 2** model.
This node supports:
* Text-to-image generation
* Image editing (inpainting via a mask)
## Node Overview
The **OpenAI DALL·E 2** node generates images synchronously through OpenAI's image generation API. It takes a text prompt and returns images matching the description.
## Parameters
### Required Parameters
| Parameter | Description |
| -------- | ----------------- |
| `prompt` | Text prompt describing the image you want to generate |
### Widget Parameters
| Parameter | Description | Options/Range | Default |
| ------ | ------------------ | --------------------------------- | ----------- |
| `seed` | Seed for image generation (not currently implemented on the backend) | 0 to 2^31-1 | 0 |
| `size` | Output image size | "256x256", "512x512", "1024x1024" | "1024x1024" |
| `n` | Number of images to generate | 1 to 8 | 1 |
### Optional Parameters
| Parameter | Description | Options/Range | Default |
| ------- | -------------- | ------ | --- |
| `image` | Optional reference image for image editing | any image input | none |
| `mask` | Optional mask for inpainting | mask input | none |
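The widget ranges in the tables above can be collected into a quick validation sketch. The helper is ours, runs outside ComfyUI, and only mirrors the documented ranges:

```python
# Allowed output sizes for DALL·E 2, per the widget parameter table.
ALLOWED_SIZES = {"256x256", "512x512", "1024x1024"}

def validate_dalle2_widgets(seed: int = 0, size: str = "1024x1024", n: int = 1) -> None:
    """Check widget values against the ranges documented above."""
    if not 0 <= seed <= 2**31 - 1:
        raise ValueError("seed must be between 0 and 2^31-1")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")

validate_dalle2_widgets()  # the documented defaults pass
```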
## Workflow Examples
This API node currently supports two workflows:
* Text to Image
* Inpainting
Image-to-image workflows are not supported
### Text to Image Example
The image below contains a simple text-to-image workflow. Download it and drag it into ComfyUI to load the workflow.

The example is very simple:
after adding the `OpenAI DALL·E 2` node, enter a description of the image you want in the `prompt` field, connect a `Save Image` node, and run the workflow.
### Inpainting Workflow
DALL·E 2 supports image editing, letting you use a mask to specify the region to replace. Below is a simple inpainting workflow example:
#### 1. Download the Workflow File
Download the image below and drag it into ComfyUI to load the workflow.

We will use the image below as input:

#### 2. How to Use the Workflow
The workflow is quite simple; if you want to build it manually, follow these steps:
1. Load an image with the `Load Image` node
2. Right-click the Load Image node and choose `MaskEditor`
3. In the mask editor, paint the region you want to repaint
4. Connect the loaded image to the `image` input of the **OpenAI DALL·E 2** node
5. Connect the mask to the `mask` input of the **OpenAI DALL·E 2** node
6. Edit the prompt in the `prompt` field
7. Run the workflow
**Notes**
* To use image editing, you must provide both the image and the mask (neither can be omitted)
* The mask and the image must be the same size
* Large input images are automatically downscaled to an appropriate size
* The URLs returned by the API are short-lived; save any results you need promptly
* Each generation consumes credits, billed by image size and count
# OpenAI DALL·E 3 Node
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/openai/dall-e-3
Learn how to generate images in ComfyUI with the OpenAI DALL·E 3 API node
OpenAI DALL·E 3 is part of the ComfyUI API node family; it lets users generate images with OpenAI's **DALL·E 3** model. This node supports text-to-image generation.
## Node Overview
DALL·E 3 is OpenAI's latest image generation model, able to create detailed, high-quality images from text prompts. With this node, you can access DALL·E 3's generation capabilities directly without leaving the ComfyUI interface.
The **OpenAI DALL·E 3** node generates images synchronously through OpenAI's image generation API. It takes a text prompt and returns images matching the description.
## Parameters
### Required Parameters
| Parameter | Type | Description |
| ------ | -- | ------------------------------------ |
| prompt | text | Text prompt used to generate the image. Supports multi-line input for detailed descriptions. |
### Widget Parameters
| Parameter | Type | Options | Default | Description |
| ------- | -- | ------------------------------- | --------- | ---------------------------------------------------- |
| seed | integer | 0-2147483647 | 0 | Random seed controlling the generation result |
| quality | option | standard, hd | standard | Image quality. "hd" produces higher-quality images but may require more resources |
| style | option | natural, vivid | natural | Image style. "vivid" leans toward hyper-real, dramatic images; "natural" produces more natural, less exaggerated results |
| size | option | 1024x1024, 1024x1792, 1792x1024 | 1024x1024 | Output image size: square, or a rectangle in either orientation |
## Usage Example
You can download the image below and drag it into ComfyUI to load the workflow.

Since the workflow is very simple, you can also just add the **OpenAI DALL·E 3** node in ComfyUI, enter a description of the image you want, and run the workflow:
1. Add the **OpenAI DALL·E 3** node in ComfyUI
2. Enter a description of the image you want in the prompt text box
3. Adjust the optional parameters as needed (quality, style, size, etc.)
4. Run the workflow to generate the image
# OpenAI GPT-Image-1 Node
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/openai/gpt-image-1
Learn how to generate images in ComfyUI with the OpenAI GPT-Image-1 API node
OpenAI GPT-Image-1 is part of the ComfyUI API node family; it lets users generate images with OpenAI's **GPT-Image-1** model, the same model behind ChatGPT 4o image generation.
This node supports:
* Text-to-image generation
* Image editing (inpainting via a mask)
## Node Overview
The **OpenAI GPT-Image-1** node generates images synchronously through OpenAI's image generation API. It takes a text prompt and returns images matching the description. GPT-Image-1 is currently OpenAI's most advanced image generation model, able to create highly detailed and realistic images.
## Parameters
### Required Parameters
| Parameter | Type | Description |
| -------- | -- | ----------------- |
| `prompt` | text | Text prompt describing the image you want to generate |
### Widget Parameters
| Parameter | Type | Options | Default | Description |
| ------------ | -- | ------------------------------------- | ------ | ---------------- |
| `seed` | integer | 0-2147483647 | 0 | Random seed controlling the generation result |
| `quality` | option | low, medium, high | low | Image quality setting; affects cost and generation time |
| `background` | option | opaque, transparent | opaque | Whether the returned image has a background |
| `size` | option | auto, 1024x1024, 1024x1536, 1536x1024 | auto | Output image size |
| `n` | integer | 1-8 | 1 | Number of images to generate |
### Optional Parameters
| Parameter | Type | Options | Default | Description |
| ------- | -- | ------ | --- | ---------------------- |
| `image` | image | any image input | none | Optional reference image for image editing |
| `mask` | mask | mask input | none | Optional mask for inpainting (white areas will be replaced) |
## Usage Examples
### Text to Image Example
The image below contains a simple text-to-image workflow. Download it and drag it into ComfyUI to load the workflow.

The workflow is very simple:
just add the `OpenAI GPT-Image-1` node, enter a description of the image you want in the `prompt` field, connect a `Save Image` node, and run the workflow.
### Image to Image Example
The image below contains a simple image-to-image workflow. Download it and drag it into ComfyUI to load the workflow.

We will use the image below as input:

In this workflow, we use the `OpenAI GPT-Image-1` node to generate the image, load the input image with the `Load Image` node, and connect it to the `image` input of the `OpenAI GPT-Image-1` node.
### Multiple Image Input Example
Download the image below and drag it into ComfyUI to load the workflow.

Use the hat image below as the additional input image

The workflow is shown in the image:
it uses the `Batch Images` node to feed multiple images into the `OpenAI GPT-Image-1` node.
### Inpainting Workflow
GPT-Image-1 also supports image editing, letting you use a mask to specify the region to replace. Below is a simple inpainting workflow example:
Download the image below and drag it into ComfyUI to load the workflow; we will keep using the input image from the image-to-image section.

The workflow is shown in the image.
Compared to the image-to-image workflow, we open the `MaskEditor` from the right-click menu on `Load Image`, paint a mask, and connect it to the `mask` input of the `OpenAI GPT-Image-1` node to complete the workflow.
**Notes**
* The mask and the image must be the same size
* Large input images are automatically downscaled to an appropriate size
# API Nodes
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/overview
This page introduces ComfyUI's API Nodes.
API Nodes are ComfyUI's new way of calling closed-source models through an API, giving ComfyUI users access to state-of-the-art external AI models without complex API key setup.
## What Are API Nodes?
API Nodes are a set of special nodes that connect to external API services, letting you use closed-source or third-party-hosted AI models directly in your ComfyUI workflows. They are designed to integrate external model capabilities seamlessly while keeping the ComfyUI core open source.
Currently supported models include:
* **Black Forest Labs**: Flux 1.1\[pro] Ultra, Flux .1\[pro], Flux .1 Kontext Pro, Flux .1 Kontext Max
* **Google**: Veo2, Gemini 2.5 Pro, Gemini 2.5 Flash
* **Ideogram**: V3, V2, V1
* **Kling**: 2.0, 1.6, 1.5 & Various Effects
* **Luma**: Photon, Ray2, Ray1.6
* **MiniMax**: Text-to-Video, Image-to-Video
* **OpenAI**: o1, o1-pro, o3, gpt-4o, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, DALL·E 2, DALL·E 3, GPT-Image-1
* **PixVerse**: V4 & Effects
* **Pika**: 2.2
* **Recraft**: V3, V2 & Various Tools
* **Rodin**: 3D Generation
* **Stability AI**: Stable Image Ultra, Stable Diffusion 3.5 Large, Image Upscale
* **Tripo**: v1-4, v2.0, v2.5
## Prerequisites for Using API Nodes
To use API Nodes, the following requirements must be met:
### 1. ComfyUI Version
Update ComfyUI to the latest version. We may add more API support over time, and the nodes will be updated accordingly, so keep ComfyUI up to date.
Note the difference between the nightly and release versions: we recommend the `nightly` version (the latest code commit), because releases may lag behind.
These correspond to the development and stable versions. We are still iterating quickly, and this documentation may not always be up to date, so mind the version differences.
### 2. Account and Balance
You must be logged in to a [Comfy account](/zh-CN/interface/user) in ComfyUI, with a [credits](/zh-CN/interface/credits) balance greater than 0.
Log in under `Settings` -> `User`:
and purchase credits under `Settings` -> `Credits`:
See the account and credits documentation to make sure this requirement is met:
* [Comfy account](/zh-CN/interface/user): log in from the `User` section of the settings menu
* [Credits](/zh-CN/interface/credits): after logging in, a Credits menu appears in the settings; purchase credits under `Settings` → `Credits`. We use prepayment, so there are no surprise charges
### 3. Network Requirements
* Local access via `127.0.0.1` or `localhost` can use the login feature directly
* For LAN access or non-whitelisted sites, log in with an API key; see [Logging in with an API Key](/zh-CN/interface/user#使用-api-key-进行登录)
* You must be able to reach our API service (a proxy may be required in some regions)
* Access must be over `https` to keep requests secure
Access from an insecure context carries serious risks, which may include:
1. Your authentication being stolen, leaking your account information
2. Your account being abused, causing financial loss
Even if we relax this restriction later, we strongly advise against accessing the API service over insecure connections; the risk is substantial.
### 4. Using the Nodes
**Add to workflow**: add an API node to your workflow just like any other node
**Run**: set the parameters and run the workflow
## Logging in to a ComfyUI Account with an API Key on Non-Whitelisted Sites
We currently maintain a whitelist of sites that may log in to ComfyUI accounts. To log in from a non-whitelisted site, see the account management documentation for how to log in with an API key; in that case, the site does not need to be on our whitelist.
See how to log in with a ComfyUI API key
## Calling Paid Model API Nodes via ComfyUI API Key Integration
We support calling paid model API nodes through ComfyUI API Key integration; see the API Key integration section for details.
## Benefits of API Nodes
API Nodes offer ComfyUI users several important benefits:
* **Access to closed-source models**: use state-of-the-art AI models without deploying them yourself
* **Seamless integration**: API nodes are fully compatible with other ComfyUI nodes and can be combined into complex workflows
* **A simplified experience**: no API keys to manage and no complex API requests to handle
* **Controlled costs**: the prepaid system keeps spending fully under your control, with no surprise charges
## Pricing
See the pricing page for API prices.
## On Open Source and Opt-In
Importantly, **API Nodes are entirely optional**. ComfyUI will always remain fully open source and free for local users. API Nodes are designed as an opt-in convenience for users who want access to external SOTA (state-of-the-art) models.
## How to Use API Nodes
A powerful application of API Nodes is combining external model outputs with local nodes. For example:
* Generate a base image with [GPT-Image-1](/zh-CN/tutorials/api-nodes/openai/gpt-image-1), then turn it into video with the local `WanImageToVideo` node
* Combine externally generated images with local upscaling or style-transfer nodes
* Create hybrid workflows that draw on the strengths of both closed-source and open-source models
This flexibility makes ComfyUI a true universal front end for generative AI, bringing diverse AI capabilities into one unified workflow and opening up many more possibilities.
# Pricing
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/pricing
This page lists the current per-call prices for API Nodes.
The tables below list the current prices for API Node calls. All prices are in USD.
## BFL
| Model (node name) | Category | Price-affecting parameters | Parameter combination | Price (USD) |
| --------------------------- | ----- | ------- | ---- | -------- |
| Flux 1.1 \[pro] Ultra Image | Image | NA | NA | 0.06 |
| Flux.1 Canny Control Image | Image | NA | NA | 0.05 |
| Flux.1 Depth Control Image | Image | NA | NA | 0.05 |
| Flux.1 Expand Image | Image | NA | NA | 0.05 |
| Flux.1 Fill Image | Image | NA | NA | 0.05 |
| Flux.1 Kontext \[max] Image | Image | NA | NA | 0.08 |
| Flux.1 Kontext \[pro] Image | Image | NA | NA | 0.04 |
| Flux.1 Kontext \[pro] Image | Image | NA | NA | 0.05 |
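For budgeting, the BFL rows above can be transcribed into a small lookup table. Note that the table lists `Flux.1 Kontext [pro] Image` twice with different prices, so that row is omitted here; the `estimate_cost` helper name is ours:

```python
# Per-call prices in USD, transcribed from the BFL table above.
BFL_PRICES = {
    "Flux 1.1 [pro] Ultra Image": 0.06,
    "Flux.1 Canny Control Image": 0.05,
    "Flux.1 Depth Control Image": 0.05,
    "Flux.1 Expand Image": 0.05,
    "Flux.1 Fill Image": 0.05,
    "Flux.1 Kontext [max] Image": 0.08,
}

def estimate_cost(node: str, calls: int = 1) -> float:
    """Estimated spend in USD for a number of calls to one node."""
    return round(BFL_PRICES[node] * calls, 2)

print(estimate_cost("Flux.1 Kontext [max] Image", 10))  # 0.8
```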
## Kling
| Model (node name) | Category | Price-affecting parameters | Parameter combination | Price (USD) |
| ------------------------------------ | ----- | ---------------------------- | ---------------------------------------------- | -------- |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1-5, 1, image to image | 0.028 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1-5, 1, text to image | 0.014 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1, 1, image to image | 0.0035 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v1, 1, text to image | 0.0035 |
| Kling Image Generation | Image | image input, model\_name, n | kling-v2, 1, text to image | 0.014 |
| Kling Virtual Try On | Image | NA | NA | 0.07 |
| Kling Text to Video (Camera Control) | Video | NA | NA | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, pro, 5 | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, pro, 10 | 0.98 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, std, 5 | 0.28 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-5, std, 10 | 0.56 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, pro, 5 | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, pro, 10 | 0.98 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, std, 5 | 0.28 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1-6, std, 10 | 0.56 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, pro, 5 | 0.49 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, pro, 10 | 0.98 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, std, 5 | 0.14 |
| Kling Dual Character Video Effects | Video | duration, mode, model\_name | kling-v1, std, 10 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, pro, 5 | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, pro, 10 | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, std, 5 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-5, std, 10 | 0.56 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, pro, 5 | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, pro, 10 | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, std, 5 | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1-6, std, 10 | 0.56 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, pro, 5 | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, pro, 10 | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, std, 5 | 0.14 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v1, std, 10 | 0.28 |
| Kling Image to Video | Video | duration, model\_name | kling-v2-1-master, 5s | 1.4 |
| Kling Image to Video | Video | duration, model\_name | kling-v2-1-master, 10s | 2.8 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, pro, 5s | 0.49 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, pro, 10s | 0.98 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, std, 5s | 0.28 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-1, std, 10s | 0.56 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, pro, 5s | 1.4 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, pro, 10s | 2.8 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, std, 5s | 1.4 |
| Kling Image to Video | Video | duration, mode, model\_name | kling-v2-master, std, 10s | 2.8 |
| Kling Lip Sync Video with Audio | Video | output\_length | 5s | 0.07 |
| Kling Lip Sync Video with Audio | Video | output\_length | 10s | 0.14 |
| Kling Lip Sync Video with Text | Video | output\_length | 5s | 0.07 |
| Kling Lip Sync Video with Text | Video | output\_length | 10s | 0.14 |
| Kling Start-End Frame to Video | Video | duration, mode, model\_name | pro mode / 5s duration / kling-v1 | 0.49 |
| Kling Start-End Frame to Video | Video | duration, mode, model\_name | pro mode / 5s duration / kling-v1-5 | 0.49 |
| Kling Start-End Frame to Video | Video | duration, mode, model\_name | pro mode / 5s duration / kling-v1-6 | 0.49 |
| Kling Start-End Frame to Video | Video | duration, mode, model\_name | pro mode / 10s duration / kling-v1-5 | 0.98 |
| Kling Start-End Frame to Video | Video | duration, mode, model\_name | pro mode / 10s duration / kling-v1-6 | 0.98 |
| Kling Start-End Frame to Video | Video | duration, mode, model\_name | standard mode / 5s duration / kling-v1 | 0.14 |
| Kling Text to Video | Video | duration, model\_name | 5s duration / kling-v2-1-master | 1.4 |
| Kling Text to Video | Video | duration, model\_name | 10s duration / kling-v2-1-master | 2.8 |
| Kling Text to Video | Video | duration, mode, model\_name | pro mode / 5s duration / kling-v1 | 0.49 |
| Kling Text to Video | Video | duration, mode, model\_name | pro mode / 5s duration / kling-v2-master | 1.4 |
| Kling Text to Video | Video | duration, mode, model\_name | pro mode / 10s duration / kling-v1 | 0.98 |
| Kling Text to Video | Video | duration, mode, model\_name | pro mode / 10s duration / kling-v2-master | 2.8 |
| Kling Text to Video | Video | duration, mode, model\_name | standard mode / 5s duration / kling-v1 | 0.14 |
| Kling Text to Video | Video | duration, mode, model\_name | standard mode / 5s duration / kling-v1-6 | 0.28 |
| Kling Text to Video | Video | duration, mode, model\_name | standard mode / 5s duration / kling-v2-master | 1.4 |
| Kling Text to Video | Video | duration, mode, model\_name | standard mode / 10s duration / kling-v1 | 0.28 |
| Kling Text to Video | Video | duration, mode, model\_name | standard mode / 10s duration / kling-v1-6 | 0.56 |
| Kling Text to Video | Video | duration, mode, model\_name | standard mode / 10s duration / kling-v2-master | 2.8 |
| Kling Video Effects | Video | duration, effect\_scene | dizzydizzy or bloombloom, 5 | 0.49 |
| Kling Video Effects | Video | duration, effect\_scene | fuzzyfuzzy or squish or expansion, 5 | 0.28 |
| Kling Video Extend | Video | NA | NA | 0.28 |
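Most Kling video rows above are priced by the (model\_name, mode, duration) combination. As an illustration of how you might estimate cost before queuing a job, here is a minimal sketch; the table and function names are our own, with prices copied from the Kling Image to Video rows above:

```python
# Hypothetical helper: look up the Kling Image to Video price (USD)
# from the (model_name, mode, duration) combination, as listed above.
KLING_I2V_PRICES = {
    ("kling-v1", "std", 5): 0.14,
    ("kling-v1", "std", 10): 0.28,
    ("kling-v1", "pro", 5): 0.49,
    ("kling-v1", "pro", 10): 0.98,
    ("kling-v1-5", "std", 5): 0.28,
    ("kling-v1-5", "std", 10): 0.56,
    ("kling-v1-5", "pro", 5): 0.49,
    ("kling-v1-5", "pro", 10): 0.98,
}

def kling_i2v_price(model_name: str, mode: str, duration: int) -> float:
    """Return the per-run price in USD, or raise if the combination is not listed."""
    try:
        return KLING_I2V_PRICES[(model_name, mode, duration)]
    except KeyError:
        raise ValueError(f"No price listed for {model_name}/{mode}/{duration}s")

print(kling_i2v_price("kling-v1-5", "pro", 10))  # 0.98
```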
## Luma
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------- | ----- | -------------------------- | ---------------------- | -------- |
| Luma Text to Image | Image | model | photon-flash-1 | 0.0019 |
| Luma Image to Image | Image | model | photon-1 | 0.0073 |
| Luma Image to Video | Video | duration, model, resolution | ray-1-6, 720p, 5s | 0.35 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 4k, 5s | 6.37 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 4k, 9s | 11.47 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 540p, 5s | 0.4 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 540p, 9s | 0.72 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 720p, 5s | 0.71 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 720p, 9s | 1.27 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 1080p, 5s | 1.59 |
| Luma Image to Video | Video | duration, model, resolution | ray-2, 1080p, 9s | 2.87 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 4k, 5s | 2.19 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 4k, 9s | 3.94 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 540p, 5s | 0.14 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 540p, 9s | 0.25 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 720p, 5s | 0.24 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 720p, 9s | 0.44 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 5s | 0.55 |
| Luma Image to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 9s | 0.99 |
| Luma Text to Video | Video | duration, model, resolution | ray-1-6, 720p, 5s | 0.35 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 4k, 5s | 6.37 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 4k, 9s | 11.47 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 540p, 5s | 0.4 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 540p, 9s | 0.72 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 720p, 5s | 0.71 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 720p, 9s | 1.27 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 1080p, 5s | 1.59 |
| Luma Text to Video | Video | duration, model, resolution | ray-2, 1080p, 9s | 2.87 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 4k, 5s | 2.19 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 4k, 9s | 3.94 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 540p, 5s | 0.14 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 540p, 9s | 0.25 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 720p, 5s | 0.24 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 720p, 9s | 0.44 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 5s | 0.55 |
| Luma Text to Video | Video | duration, model, resolution | ray-flash-2, 1080p, 9s | 0.99 |
## Google
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ----------------------------- | ----- | ---------------------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-fast-generate-001, false | 2 |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-fast-generate-001, true | 3.2 |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-generate-001, false | 4 |
| Google Veo 3 Video Generation | Video | generate\_audio, model | veo-3.0-generate-001, true | 6 |
| Google Veo2 Video Generation | Video | duration | 5 | 2.5 |
| Google Veo2 Video Generation | Video | duration | 8 | 4 |
| Google Gemini | Text | model | gemini-2.5-flash-preview-04-17 | $0.3/1M input tokens (txt, img, vid) + $2.5/1M output text tokens + \$1/1M input audio tokens (\< 200K tokens) |
| Google Gemini | Text | model | gemini-2.5-pro-preview-05-06 | $0.16/1M input tokens + $0.6/1M output tokens + \$1/1M input audio tokens (\< 200K tokens) |
| Google Gemini | Text | model | gemini-2.5-flash | $0.3/1M input tokens (txt, img, vid) + $2.5/1M output text tokens + \$1/1M input audio tokens (\< 200K tokens) |
| Google Gemini | Text | model | gemini-2.5-pro | $1.25/1M input tokens (text, img, vid) + $10/1M output tokens (\< 200K total tokens) |
| Google Gemini | Text | model | gemini-2.5-pro-preview-05-06 | $1.25/1M input tokens (text, img, vid) + $10/1M output tokens (\< 200K total tokens) |
| Google Gemini Image | Image | model | gemini-2.5-flash-image-preview | $0.3/1M input tokens (text, video, image) + $1/1M input tokens (audio) + $2.5/1M output tokens (text) + $30/1M output tokens (image) |
## Minimax
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ---------------------- | ----- | ------- | ---------------------------------- | -------- |
| Minimax Text to Video | Video | model, resolution, duration | Hailuo-02 / 768P / 6s | 0.28 |
| Minimax Text to Video | Video | model, resolution, duration | Hailuo-02 / 768P / 10s | 0.56 |
| Minimax Text to Video | Video | model, resolution, duration | Hailuo-02 / 1080P / 6s | 0.49 |
| Minimax Text to Video | Video | model | T2V-01 | 0.43 |
| Minimax Text to Video | Video | model | T2V-01-Director | 0.43 |
| Minimax Image to Video | Video | model, resolution, duration | Hailuo-02 / 768P / 6s | 0.28 |
| Minimax Image to Video | Video | model, resolution, duration | Hailuo-02 / 768P / 10s | 0.56 |
| Minimax Image to Video | Video | model, resolution, duration | Hailuo-02 / 1080P / 6s | 0.49 |
| Minimax Image to Video | Video | model | I2V-01 | 0.43 |
| Minimax Image to Video | Video | model | I2V-01-live | 0.43 |
| Minimax Video | Video | model | 768P, 6 (Text or Image to video) | 0.28 |
| Minimax Video | Video | model | 768P, 10 (Text or Image to video) | 0.56 |
| Minimax Video | Video | model | 1080P, 6 (Text or Image to video) | 0.49 |
## Recraft
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------------------ | ----- | ------- | ---- | -------- |
| Recraft Creative Upscale Image | Image | NA | NA | 0.25 |
| Recraft Crisp Upscale Image | Image | NA | NA | 0.004 |
| Recraft Image to Image | Image | n | 1 | 0.04 |
| Recraft Remove Background | Image | NA | NA | 0.01 |
| Recraft Replace Background | Image | n | 1 | 0.04 |
| Recraft Text to Image | Image | n | 1 | 0.04 |
| Recraft Vectorize Image | Image | NA | NA | 0.01 |
| Recraft Text to Vector | Image | n | 1 | 0.08 |
| Recraft Image Inpainting | Image | n | 1 | 0.04 |
## Ideogram
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ----------- | ----- | ------------------------------ | ----------- | -------- |
| Ideogram V1 | Image | num\_images, turbo(true/false) | 1, false | 0.06 |
| Ideogram V1 | Image | num\_images, turbo(true/false) | 1, true | 0.02 |
| Ideogram V2 | Image | num\_images, turbo(true/false) | 1, false | 0.08 |
| Ideogram V2 | Image | num\_images, turbo(true/false) | 1, true | 0.05 |
| Ideogram V3 | Image | num\_images, rendering\_speed | 1, Balanced | 0.06 |
| Ideogram V3 | Image | num\_images, rendering\_speed | 1, Quality | 0.09 |
| Ideogram V3 | Image | num\_images, rendering\_speed | 1, Turbo | 0.03 |
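The image tables above list prices for the `n = 1` / `num_images = 1` combination; per-image pricing implies the total scales with the image count. A minimal sketch of that assumption (the function name is our own, and whether rounding matches the provider's billing is not confirmed by this table):

```python
def batch_price(unit_price: float, num_images: int) -> float:
    """Total cost assuming the listed price is charged per generated image."""
    return round(unit_price * num_images, 4)

# e.g. four Ideogram V1 images (turbo=false) at the listed 0.06 per image:
print(batch_price(0.06, 4))  # 0.24
```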
## Runway
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ----------------------------------- | ----- | -------- | ---- | -------- |
| Runway Text to Image | Image | NA | NA | 0.08 |
| Runway First-Last-Frame to Video | Video | duration | 5s | 0.25 |
| Runway First-Last-Frame to Video | Video | duration | 10s | 0.5 |
| Runway Image to Video (Gen3a Turbo) | Video | duration | 5s | 0.25 |
| Runway Image to Video (Gen3a Turbo) | Video | duration | 10s | 0.5 |
| Runway Image to Video (Gen4 Turbo) | Video | duration | 5s | 0.25 |
| Runway Image to Video (Gen4 Turbo) | Video | duration | 10s | 0.5 |
## OpenAI
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------------------- | ----- | ---------------- | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| GPT-Image-1 - Actual | Image | n, quality, size | input image tokens: $10/1M + input text tokens: $5/1M + output tokens: \$40/1M | varies (token-based) |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | high, 1024x1024 | 0.167 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | high, 1024x1536 | 0.25 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | high, 1536x1024 | 0.25 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | low, 1024x1024 | 0.011 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | low, 1024x1536 | 0.016 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | low, 1536x1024 | 0.016 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | medium, 1024x1024 | 0.042 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | medium, 1024x1536 | 0.063 |
| GPT-Image-1 (Approximate price) | Image | n, quality, size | medium, 1536x1024 | 0.063 |
| Image Generation (DALL·E 2) | Image | size | size = 512 \* 512 | 0.018 |
| Image Generation (DALL·E 2) | Image | size | size = 1024 \* 1024 | 0.02 |
| Image Generation (DALL·E 2) | Image | size | size = 256 \* 256 | 0.016 |
| Image Generation (DALL·E 3 HD) | Image | quality, size | size = 1024 \* 1024, hd | 0.08 |
| Image Generation (DALL·E 3 HD) | Image | quality, size | size = 1024 \* 1792, hd | 0.12 |
| Image Generation (DALL·E 3 HD) | Image | quality, size | size = 1792 \* 1024, hd | 0.12 |
| Image Generation (DALL·E 3 Std) | Image | quality, size | size = 1024 \* 1024, std | 0.04 |
| Image Generation (DALL·E 3 Std) | Image | quality, size | size = 1024 \* 1792, std | 0.08 |
| Image Generation (DALL·E 3 Std) | Image | quality, size | size = 1792 \* 1024, std | 0.08 |
| OpenAI Chat | Text | model | gpt-4.1 | Per 1M tokens: Input text: \$2, Cached input text: \$0.5, Output text: \$8 |
| OpenAI Chat | Text | model | gpt-4.1-mini | Per 1M tokens: Input text: \$0.4, Cached input text: \$0.1, Output text: \$1.6 |
| OpenAI Chat | Text | model | gpt-4.1-nano | Per 1M tokens: Input text: \$0.1, Cached input text: \$0.025, Output text: \$0.4 |
| OpenAI Chat | Text | model | gpt-5 | Per 1M tokens: Input text: \$1.25, Cached input text: \$0.125, Output text: \$10 |
| OpenAI Chat | Text | model | gpt-5-mini | Per 1M tokens: Input text: \$0.25, Cached input text: \$0.025, Output text: \$2 |
| OpenAI Chat | Text | model | gpt-5-nano | Per 1M tokens: Input text: \$0.05, Cached input text: \$0.005, Output text: \$0.4 |
| OpenAI Chat | Text | model | gpt-4o | Per 1M tokens: Input text: \$2.5, Cached input text: \$1.25, Output text: \$10 |
| OpenAI Chat | Text | model | o1 | Per 1M tokens: Input text: \$15, Cached input text: \$7.5, Output text: \$60 |
| OpenAI Chat | Text | model | o1-pro | Per 1M tokens: Input text: \$150, Cached input text: NA, Output text: \$600 |
| OpenAI Chat | Text | model | o3 | Per 1M tokens: Input text: \$2, Cached input text: \$0.5, Output text: \$8 |
| OpenAI Chat | Text | model | o4-mini | Per 1M tokens: Input text: \$1.1, Cached input text: \$0.275, Output text: \$4.4 |
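Unlike the flat-rate rows elsewhere in this page, the OpenAI Chat models bill per token. A minimal sketch of the arithmetic — the function name is our own, and the default rates are copied from the gpt-4.1 row above:

```python
def chat_cost_usd(input_tokens: int, cached_tokens: int, output_tokens: int,
                  rates: tuple = (2.0, 0.5, 8.0)) -> float:
    """Cost in USD given per-1M-token rates (input, cached input, output).

    Default rates correspond to the gpt-4.1 row in the table above.
    """
    in_rate, cached_rate, out_rate = rates
    return (input_tokens * in_rate
            + cached_tokens * cached_rate
            + output_tokens * out_rate) / 1_000_000

# e.g. 10k fresh input tokens + 2k output tokens at gpt-4.1 rates:
print(round(chat_cost_usd(10_000, 0, 2_000), 4))  # 0.036
```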
## Pixverse
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------------- | ----- | ---------------------------- | --------------- | -------- |
| PixVerse Text to Video | Video | duration, quality, resolution | 360p fast 5s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 360p normal 5s | 0.45 |
| PixVerse Text to Video | Video | duration, quality, resolution | 360p normal 8s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 540p fast 5s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 540p normal 5s | 0.45 |
| PixVerse Text to Video | Video | duration, quality, resolution | 540p normal 8s | 0.9 |
| PixVerse Text to Video | Video | duration, quality, resolution | 720p fast 5s | 1.2 |
| PixVerse Text to Video | Video | duration, quality, resolution | 720p normal 5s | 0.6 |
| PixVerse Text to Video | Video | duration, quality, resolution | 720p normal 8s | 1.2 |
| PixVerse Text to Video | Video | duration, quality, resolution | 1080p normal 5s | 1.2 |
| PixVerse Transition Video | Video | duration, quality, resolution | 360p fast 5s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 360p normal 5s | 0.45 |
| PixVerse Transition Video | Video | duration, quality, resolution | 360p normal 8s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 540p fast 5s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 540p normal 5s | 0.45 |
| PixVerse Transition Video | Video | duration, quality, resolution | 540p normal 8s | 0.9 |
| PixVerse Transition Video | Video | duration, quality, resolution | 720p fast 5s | 1.2 |
| PixVerse Transition Video | Video | duration, quality, resolution | 720p normal 5s | 0.6 |
| PixVerse Transition Video | Video | duration, quality, resolution | 720p normal 8s | 1.2 |
| PixVerse Transition Video | Video | duration, quality, resolution | 1080p normal 5s | 1.2 |
| PixVerse Image to Video | Video | duration, quality, resolution | 360p fast 5s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 360p normal 5s | 0.45 |
| PixVerse Image to Video | Video | duration, quality, resolution | 360p normal 8s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 540p fast 5s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 540p normal 5s | 0.45 |
| PixVerse Image to Video | Video | duration, quality, resolution | 540p normal 8s | 0.9 |
| PixVerse Image to Video | Video | duration, quality, resolution | 720p fast 5s | 1.2 |
| PixVerse Image to Video | Video | duration, quality, resolution | 720p normal 5s | 0.6 |
| PixVerse Image to Video | Video | duration, quality, resolution | 720p normal 8s | 1.2 |
| PixVerse Image to Video | Video | duration, quality, resolution | 1080p normal 5s | 1.2 |
## Pika
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------------------------- | ----- | ------------------- | ---------- | -------- |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 720p, 5s | 0.3 |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 720p, 10s | 0.4 |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 1080p, 5s | 0.5 |
| Pika Scenes (Video Image Composition) | Video | duration, resolution | 1080p, 10s | 1.5 |
| Pika Start and End Frame to Video | Video | duration, resolution | 720p, 5s | 0.2 |
| Pika Start and End Frame to Video | Video | duration, resolution | 720p, 10s | 0.25 |
| Pika Start and End Frame to Video | Video | duration, resolution | 1080p, 5s | 0.3 |
| Pika Start and End Frame to Video | Video | duration, resolution | 1080p, 10s | 1 |
| Pika Text to Video | Video | duration, resolution | 720p, 5s | 0.2 |
| Pika Text to Video | Video | duration, resolution | 720p, 10s | 0.6 |
| Pika Text to Video | Video | duration, resolution | 1080p, 5s | 0.45 |
| Pika Text to Video | Video | duration, resolution | 1080p, 10s | 1 |
| Pika Image to Video | Video | duration, resolution | 720p, 5s | 0.2 |
| Pika Image to Video | Video | duration, resolution | 720p, 10s | 0.6 |
| Pika Image to Video | Video | duration, resolution | 1080p, 5s | 0.45 |
| Pika Image to Video | Video | duration, resolution | 1080p, 10s | 1 |
| Pika Swaps (Video Object Replacement) | Video | NA | NA | 0.3 |
| Pikadditions (Video Object Insertion) | Video | NA | NA | 0.3 |
| Pikaffects (Video Effects) | Video | NA | NA | 0.45 |
## Moonvalley
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------- | ----- | ------- | ---- | -------- |
| Image to video - 5s | Video | NA | NA | 1.5 |
| Text to video - 5s | Video | NA | NA | 1.5 |
| Video to video - 5s | Video | NA | NA | 2.25 |
## Rodin
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ------------------------------------ | -- | ------- | ---- | -------- |
| Rodin 3D Generate - Regular Generate | 3D | NA | NA | 0.4 |
| Rodin 3D Generate - Detail Generate | 3D | NA | NA | 0.4 |
| Rodin 3D Generate - Sketch Generate | 3D | NA | NA | 0.4 |
| Rodin 3D Generate - Smooth Generate | 3D | NA | NA | 0.4 |
## Tripo
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ----------------------------------------- | -- | -------------------------------------- | ------------------------------------ | -------- |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, false | 0.15 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, true | 0.2 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, false | 0.35 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, true | 0.4 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, false | 0.25 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, true | 0.3 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, false | 0.1 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, true | 0.15 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, false | 0.3 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, true | 0.35 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, false | 0.2 |
| Tripo: Text to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, true | 0.25 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, false | 0.25 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, false, any quality, true | 0.3 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, false | 0.45 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, detailed, true | 0.5 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, false | 0.35 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | any style, true, standard, true | 0.4 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, false | 0.2 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, false, any quality, true | 0.25 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, false | 0.4 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, detailed, true | 0.45 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, false | 0.3 |
| Tripo: Image to Model / Multiview to Model | 3D | quad, style, texture, texture\_quality | none, true, standard, true | 0.35 |
| Tripo: Convert model | 3D | NA | NA | 0.1 |
| Tripo: Refine Draft model | 3D | NA | NA | 0.3 |
| Tripo: Retarget rigged model | 3D | NA | NA | 0.1 |
| Tripo: Rig model | 3D | NA | NA | 0.25 |
| Tripo: Texture model | 3D | texture\_quality | detailed | 0.2 |
| Tripo: Texture model | 3D | texture\_quality | standard | 0.1 |
## Stability AI
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| --------------------------------------- | ----- | ------- | ------------ | -------- |
| Stability AI Stable Image Ultra | Image | NA | NA | 0.08 |
| Stability AI Stable Diffusion 3.5 Image | Image | model | sd3.5-large | 0.065 |
| Stability AI Stable Diffusion 3.5 Image | Image | model | sd3.5-medium | 0.035 |
| Stability AI Upscale Conservative | Image | NA | NA | 0.40 |
| Stability AI Upscale Creative | Image | NA | NA | 0.60 |
| Stability AI Upscale Fast | Image | NA | NA | 0.02 |
## Vidu
| Model (Node Name) | Category | Price-Affecting Parameters | Parameter Combination | Price (USD) |
| ----------------------- | ----- | ------- | ---- | -------- |
| ViduImageToVideoNode | Video | NA | NA | 0.4 |
| ViduReferenceVideoNode | Video | NA | NA | 0.4 |
| ViduStartEndToVideoNode | Video | NA | NA | 0.4 |
| ViduTextToVideoNode | Video | NA | NA | 0.4 |
# Recraft Text to Image API Node ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/recraft/recraft-text-to-image
This guide explains how to use the Recraft Text to Image API node in ComfyUI
The [Recraft Text to Image](/zh-CN/built-in-nodes/api-node/image/recraft/recraft-text-to-image) node lets you use Recraft AI's image generation technology to create high-quality images in a wide variety of styles from a text description.
In this guide, we will walk you through setting up a text-to-image workflow with this node.
Using API nodes requires that you are logged in and working from an authorized network environment; see the [API Nodes Overview](/zh-CN/tutorials/api-nodes/overview) for the detailed requirements.
Make sure your ComfyUI is up to date.
* [Download ComfyUI](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you cannot find them there, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, the possible causes are:
1. You are not running the latest nightly build.
2. You are running the stable or desktop release, which does not yet include the latest updates.
3. Some nodes failed to import at startup.
## Recraft Text to Image API Node Text-to-Image Workflow
### 1. Download the workflow file
The `metadata` of the image below already contains the workflow; download the image and drag it into ComfyUI to load the workflow.

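Dragging the image into ComfyUI works because ComfyUI embeds the workflow JSON in the PNG's text metadata. If you want to inspect that metadata yourself, here is a minimal sketch assuming Pillow is installed; the filename is a placeholder:

```python
import json
from PIL import Image  # assumes Pillow is installed

def read_embedded_workflow(path: str):
    """Return the workflow dict embedded in a ComfyUI PNG, or None.

    ComfyUI saves the graph as JSON in the PNG's text chunks,
    normally under the "workflow" (and "prompt") keys.
    """
    raw = Image.open(path).info.get("workflow")
    return json.loads(raw) if raw else None

# Usage (placeholder filename):
# read_embedded_workflow("recraft_text_to_image_workflow.png")
```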
### 2. Run the workflow step by step
Follow the numbered steps in the image to run the basic workflow:
1. (Optional) Change the color of the `Recraft Color RGB` node connected to `Color` to the color you want
2. (Optional) Adjust the `Recraft Style` node to control the visual style, such as digital illustration, realistic photo, or logo design; the group also contains other style nodes that you can enable as needed
3. (Optional) Edit the `prompt` parameter in the `Recraft Text to Image` node; you can also change the output dimensions via the `size` parameter
4. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter`, to generate the image
5. Once the API returns the result, you can view the generated image in the `Save Image` node; it is also saved to the `ComfyUI/output/` directory
> (Optional) The workflow includes a **Convert to SVG** group. Because the `Recraft Vectorize Image` node in that group consumes additional credits, enable it only when you want to convert the generated image to SVG format
### 3. Notes
* **Recraft Style**: provides a range of preset styles, such as realistic photo, digital illustration, and raster logo
* **Seed parameter**: only determines whether the node should re-run; the actual output is unrelated to the seed value
## Related Node Documentation
See the documentation below for the detailed parameter settings of each node
Recraft Text to Image API node documentation
Recraft Style - Realistic Image API node documentation
Recraft Controls API node documentation
# Rodin API Node Model Generation ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/rodin/model-generation
This guide explains how to generate 3D models in ComfyUI with the Rodin node API
Hyper3D Rodin (hyper3d.ai) is a platform focused on rapidly generating high-quality, production-ready 3D models and materials with AI.
ComfyUI now natively integrates the Rodin model generation API, so you can conveniently use the related nodes to generate models in ComfyUI.
ComfyUI's API nodes currently support the following Rodin model generation capabilities:
* Single-view model generation
* Multi-view model generation
* Model generation at several levels of detail
## Single-View Model Generation Workflow
### 1. Download the workflow file
Download the file below and drag it into ComfyUI to load the workflow.
Download the workflow file in JSON format
Download the image below to use as the input image

### 2. Run the workflow step by step
Follow the numbered steps in the image to run the basic workflow:
1. Load the provided input image in the `Load Image` node
2. (Optional) Adjust the parameters in `Rodin 3D Generate - Regular Generate`
* polygon\_count: sets the face count; higher values produce smoother, more detailed models
3. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter`, to generate the model; when the workflow finishes, the model is saved automatically to the `ComfyUI/output/Rodin` directory
4. Click the `Preview 3D` node to expand its menu
5. Choose `Export` to export the model directly
## Multi-View Model Generation Workflow
The `Rodin 3D Generate - Regular Generate` node accepts up to 5 input images
### 1. Download the workflow file
You can modify the single-view workflow into a multi-view one, or download the workflow file below directly
Download the file below and drag it into ComfyUI to load the workflow.
Download the workflow file in JSON format
Download the images below to use as the input images



### 2. Run the workflow step by step
Follow the numbered steps in the image to run the basic workflow:
1. Load the provided input images in the `Load Image` nodes
2. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter`, to generate the model; when the workflow finishes, the model is saved automatically to the `ComfyUI/output/Rodin` directory
3. Click the `Preview 3D` node to expand its menu
4. Choose `Export` to export the model directly
## Other Related Nodes
Rodin provides several other model generation node types in ComfyUI. They take the same inputs as the workflow described in this guide, so you can enable them as needed; the template also includes these nodes, and you can switch the node mode to enable them.
# Runway API Node Image Generation ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/runway/image-generation
This guide explains how to use the Runway nodes in ComfyUI for text-to-image and reference-image generation
Runway is a technology company focused on generative AI, offering powerful image generation capabilities. Its models support style transfer, image expansion, and fine-grained detail control. ComfyUI has integrated the Runway API, so you can use the related nodes to generate images directly in ComfyUI.
In this guide, we will walk you through the following workflows:
* Text to image
* Reference-image to image
## Runway Text-to-Image Workflow
### 1. Download the workflow file
The `metadata` of the image below already contains the workflow; download the image and drag it into ComfyUI to load the workflow.

### 2. Run the workflow step by step
Follow the numbered steps in the image to run the basic text-to-image workflow:
1. Enter your prompt in the `prompt` field of the `Runway Text to Image` node
2. (Optional) Adjust the `ratio` parameter to set a different output aspect ratio
3. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter`, to generate the image.
4. Once the API returns the result, you can view the generated image in the `Save Image` node (right-click to save); it is also saved to the `ComfyUI/output/` directory.
## Runway Reference-Image Workflow
### 1. Download the workflow and input image
The `metadata` of the image below already contains the workflow; download the image and drag it into ComfyUI to load the workflow.

Download the image below to use as the input

### 2. Run the workflow step by step
Follow the numbered steps in the image to run the basic reference-image workflow:
1. Load the provided input image in the `Load Image` node
2. Enter your prompt in the `prompt` field of the `Runway Text to Image` node and adjust the output size
3. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter`, to generate the image.
4. Once the API returns the result, you can view the generated image in the `Save Image` node (right-click to save); it is also saved to the `ComfyUI/output/` directory.
# Runway API Node Video Generation ComfyUI Official Example
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/runway/video-generation
This guide explains the video generation workflows available with the Runway nodes in ComfyUI
Runway is a technology company focused on generative AI, offering powerful video generation capabilities. ComfyUI has integrated the Runway API, so you can use the related nodes to generate videos directly in ComfyUI.
ComfyUI currently integrates the following Runway video generation models natively:
* Runway Gen3a turbo
* Runway Gen4 turbo
* Runway First Last Frame to video
## Gen3a turbo 图生视频工作流
### 1. 工作流文件下载
下面的视频的`metadata`中已经包含工作流信息,请下载并拖入 ComfyUI 中加载对应工作流。
下载 Json 格式工作流文件
下载下面的图片作为输入图片

### 2. 按步骤完成工作流的运行
你可参考图片中的序号来完成最基础的文生图工作流运行:
1. 在 `Load Image` 节点中加载提供的输入图片
2. 在 `Runway Gen3a turbo` 节点中设置 `prompt` 描述视频内容,修改 `duration` 参数来设置视频时长, 修改 `ratio` 参数来设置视频宽高比
3. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行视频的生成。
4. 等待 API 返回结果后,你可在 `Save Video` 节点中查看生成的视频(右键菜单可以保存),对应的视频也会被保存至 `ComfyUI/output/` 目录下。
## Gen4 turbo 图生视频工作流
### 1. 工作流文件下载
下面的视频的`metadata`中已经包含工作流信息,请下载并拖入 ComfyUI 中加载对应工作流。
下载 Json 格式工作流文件
下载下面的图片作为输入图片

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic image-to-video workflow:
1. Load the provided input image in the `Load Image` node
2. In the `Runway Gen4 turbo` node, write a `prompt` describing the video content, adjust `duration` to set the video length, and adjust `ratio` to set the aspect ratio
3. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the video.
4. Once the API returns a result, you can view the generated video in the `Save Video` node (save it via the context menu); the video is also saved to the `ComfyUI/output/` directory.
## First-Last-Frame Video Generation Workflow
### 1. Download the Workflow File
The `metadata` of the video below contains the workflow. Download it and drag it into ComfyUI to load the corresponding workflow.
Download the workflow file in JSON format
Download the images below to use as input images


### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic first-last-frame video workflow:
1. Load the start frame in the first `Load Image` node
2. Load the end frame in the second `Load Image` node
3. In the `Runway First-Last-Frame to Video` node, write a `prompt` describing the video content, adjust `duration` to set the video length, and adjust `ratio` to set the aspect ratio
4. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the video.
5. Once the API returns a result, you can view the generated video in the `Save Video` node (save it via the context menu); the video is also saved to the `ComfyUI/output/` directory.
# Stability AI Stable Diffusion 3.5 API Node ComfyUI Official Examples
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/stability-ai/stable-diffusion-3-5-image
This guide explains how to use the text-to-image and image-to-image features of the Stability AI Stable Diffusion 3.5 API node in ComfyUI.
The [Stability AI Stable Diffusion 3.5 Image](/zh-CN/built-in-nodes/api-node/image/stability-ai/stability-ai-stable-diffusion-3-5-image) node lets you use Stability AI's Stable Diffusion 3.5 models to create high-quality, richly detailed images from text prompts or reference images.
In this guide, we walk you through setting up both text-to-image and image-to-image workflows with this node.
To use API nodes, you must be logged in and on a network that is permitted to access them. See the [API Nodes Overview](/zh-CN/tutorials/api-nodes/overview) documentation for the specific requirements.
Make sure your ComfyUI is up to date.
* [Download ComfyUI](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not running the latest nightly (development) version.
2. You are running the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## Stability AI Stable Diffusion 3.5 Text-to-Image Workflow
### 1. Download the Workflow File
The `metadata` of the image below contains the workflow. Download it and drag it into ComfyUI to load the corresponding workflow.

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic text-to-image workflow:
1. (Optional) Edit the `prompt` parameter in the `Stability AI Stable Diffusion 3.5 Image` node to describe the image you want. More detailed prompts usually produce better images.
2. (Optional) Use the `model` parameter to choose the SD 3.5 model version.
3. (Optional) Use the `style_preset` parameter to control the visual style. Different presets produce different looks, such as "cinematic" or "anime". Choose "None" to apply no particular style.
4. (Optional) Edit the `String(Multiline)` node to change the negative prompt, which specifies elements you don't want in the generated image.
5. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the image.
6. Once the API returns a result, you can view the generated image in the `Save Image` node; the image is also saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
* **Prompt**: The prompt is one of the most important parameters; detailed, clear descriptions give better results. It can cover scene, subject, colors, lighting, style, and more.
* **CFG Scale**: Controls how closely the generator follows the prompt. Higher values keep the image closer to the prompt, but values that are too high can cause oversaturation or unnatural results.
* **Style Preset**: Provides a range of presets that quickly define the overall look of the image.
* **Negative Prompt**: Specifies elements you don't want to appear in the generated image.
* **Seed**: Useful for reproducing or fine-tuning results, which helps during iterative creation.
* The `Load Image` node is in "Bypass" mode. To enable it, right-click the node as shown in the step image and set its "Mode" to "Always"; this switches the workflow to image-to-image mode.
* The `image_denoise` parameter has no effect when no input image is provided.
## Stability AI Stable Diffusion 3.5 Image-to-Image Workflow
### 1. Download the Workflow File
The `metadata` of the image below contains the workflow. Download it and drag it into ComfyUI to load the corresponding workflow.

Download the image below to use as the input image

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the image-to-image workflow:
1. Load a reference image with the `Load Image` node; it serves as the basis for generation.
2. (Optional) Edit the `prompt` parameter in the `Stability AI Stable Diffusion 3.5 Image` node to describe what you want to change or enhance relative to the reference image.
3. (Optional) Use the `style_preset` parameter to control the visual style; different presets produce images with different stylistic traits.
4. (Optional | Important) Adjust the `image_denoise` parameter (range 0.0-1.0) to control how much the original image is modified:
* The closer to 0.0, the closer the output is to the input reference image (at 0.0 it is essentially identical to the original)
* The closer to 1.0, the closer the output is to pure text-to-image generation (at 1.0 it is as if no reference image were provided)
5. (Optional) Edit the `String(Multiline)` node to change the negative prompt, which specifies elements you don't want in the generated image.
6. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the image.
7. Once the API returns a result, you can view the generated image in the `Save Image` node; the image is also saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
The image below compares results with and without an input image under identical parameter settings:
**Image Denoise**: This parameter determines how much of the original image's features are preserved during generation and is the most important control in image-to-image mode. The image below shows results at different denoise strengths:
* **Reference image selection**: Images with a clear subject and good composition usually produce better results.
* **Prompting tips**: In image-to-image mode, the prompt should focus on what you want to change or enhance, rather than describing everything already present in the image.
* **Mode switching**: When an input image is provided, the node automatically switches from text-to-image to image-to-image mode and ignores the aspect-ratio setting.
## Related Node Documentation
See the documentation below for detailed parameter descriptions of the node:
Stability Stable Diffusion 3.5 Image API node documentation
# Stability AI Stable Image Ultra API Node ComfyUI Official Examples
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/stability-ai/stable-image-ultra
This guide explains how to use the text-to-image and image-to-image features of the Stability AI Stable Image Ultra API node in ComfyUI.
The [Stability Stable Image Ultra](/zh-CN/built-in-nodes/api-node/image/stability-ai/stability-ai-stable-image-ultra) node lets you use Stability AI's Stable Image Ultra model to create high-quality, richly detailed images from text prompts or reference images.
In this guide, we walk you through setting up both text-to-image and image-to-image workflows with this node.
To use API nodes, you must be logged in and on a network that is permitted to access them. See the [API Nodes Overview](/zh-CN/tutorials/api-nodes/overview) documentation for the specific requirements.
Make sure your ComfyUI is up to date.
* [Download ComfyUI](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not running the latest nightly (development) version.
2. You are running the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## Stability AI Stable Image Ultra Text-to-Image Workflow
### 1. Download the Workflow File
The `metadata` of the image below contains the workflow. Download it and drag it into ComfyUI to load the corresponding workflow.

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic text-to-image workflow:
1. (Optional) Edit the `prompt` parameter in the `Stability AI Stable Image Ultra` node to describe the image you want. More detailed prompts usually produce better images. You can use the `(word:weight)` syntax to control the weight of specific words; for example, `The sky was crisp (blue:0.3) and (green:0.8)` makes the sky both blue and green, with green more prominent.
2. (Optional) Use the `style_preset` parameter to control the visual style. Different presets produce different looks, such as "cinematic" or "anime". Choose "None" to apply no particular style.
3. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the image.
4. Once the API returns a result, you can view the generated image in the `Save Image` node; the image is also saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
* **Prompt**: The prompt is one of the most important parameters; detailed, clear descriptions give better results. It can cover scene, subject, colors, lighting, style, and more.
* **Style Preset**: Provides presets such as cinematic, anime, and digital art that quickly define the overall look of the image.
* **Negative Prompt**: Specifies elements you don't want in the generated image; it helps avoid common problems such as extra limbs or distorted faces.
* **Seed**: Useful for reproducing or fine-tuning results, which helps during iterative creation.
* The `Load Image` node is in "Bypass" mode. To enable it, right-click the node as shown in the step image and set its "Mode" to "Always"; this switches the workflow to image-to-image mode
## Stability AI Stable Image Ultra Image-to-Image Workflow
### 1. Download the Workflow File
The `metadata` of the image below contains the workflow. Download it and drag it into ComfyUI to load the corresponding workflow.

Download the image below to use as the input image

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the image-to-image workflow:
1. Load a reference image with the `Load Image` node; it serves as the basis for generation.
2. (Optional) Edit the `prompt` parameter in the `Stability Stable Image Ultra` node to describe what you want to change or enhance relative to the reference image.
3. (Optional) Adjust the `image_denoise` parameter (range 0.0-1.0) to control how much the original image is modified:
* The closer to 0.0, the closer the output is to the input reference image
* The closer to 1.0, the closer the output is to pure text-to-image generation
4. (Optional) You can also set `style_preset` and other parameters to further shape the result.
5. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the image.
6. Once the API returns a result, you can view the generated image in the `Save Image` node; the image is also saved to the `ComfyUI/output/` directory.
### 3. Additional Notes
**Image Denoise**: This parameter determines how much of the original image's features are preserved during generation and is the most important control in image-to-image mode. The image below shows results at different denoise strengths
* **Reference image selection**: Images with a clear subject and good composition usually produce better results.
* **Prompting tips**: In image-to-image mode, the prompt should focus on what you want to change or enhance, rather than describing everything already present in the image.
## Related Node Documentation
See the documentation below for detailed parameter descriptions of the node:
Stability Stable Image Ultra API node documentation
# Tripo API Node Model Generation ComfyUI Official Examples
Source: https://docs.comfy.org/zh-CN/tutorials/api-nodes/tripo/model-generation
This guide explains how to use the Tripo node API in ComfyUI for 3D model generation.
Tripo AI is a company focused on generative AI 3D modeling. It offers a user-friendly platform and API services that can quickly turn text prompts or 2D images (single or multiple) into high-quality 3D models.
ComfyUI now natively integrates the Tripo API, so you can conveniently use the corresponding nodes for model generation in ComfyUI.
ComfyUI's API nodes currently support the following Tripo capabilities:
* Text to model
* Image to model
* Multi-view model generation
* Rigging
* Skeletal animation
To use API nodes, you must be logged in and on a network that is permitted to access them. See the [API Nodes Overview](/zh-CN/tutorials/api-nodes/overview) documentation for the specific requirements.
Make sure your ComfyUI is up to date.
* [Download ComfyUI](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not running the latest nightly (development) version.
2. You are running the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## Text-to-Model Workflow
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download the workflow file in JSON format
### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic text-to-model workflow:
1. Enter a prompt in the `prompt` field of the `Tripo: Text to Model` node
* model: selects the model version; currently only the v1.4 model supports further refinement with `Tripo: Refine Draft model`
* style: sets the style
* texture\_quality: sets the texture quality
2. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the model. When the workflow finishes, the model is saved automatically to the `ComfyUI/output/` directory
3. Click to expand the menu in the `Preview 3D` node
4. Choose `Export` to export the model directly
## Image-to-Model Workflow
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download the workflow file in JSON format
Download the image below to use as the input image

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic image-to-model workflow:
1. Load the provided input image in the `Load Image` node
2. Adjust the parameters in the `Tripo: Image to Model` node as needed
* model: selects the model version; currently only the v1.4 model supports further refinement with `Tripo: Refine Draft model`
* style: sets the style
* texture\_quality: sets the texture quality
3. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the model. When the workflow finishes, the model is saved automatically to the `ComfyUI/output/` directory
4. To download the model, see the instructions in the text-to-model section
## Multi-View Model Generation Workflow
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the corresponding workflow.
Download the workflow file in JSON format
Download the images below to use as input images


### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to run the basic multi-view workflow:
1. Load the provided input images in the corresponding `Load Image` nodes
2. Adjust the parameters in the `Tripo: Image to Model` node as needed
* model: selects the model version; currently only the v1.4 model supports further refinement with `Tripo: Refine Draft model`
* style: sets the style
* texture\_quality: sets the texture quality
3. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the model. When the workflow finishes, the model is saved automatically to the `ComfyUI/output/` directory
4. To enable the other view inputs, set the corresponding nodes' mode to `Always`, as shown in the step image
5. To download the model, see the instructions in the text-to-model section
## Follow-up Tasks for the Same Job
Tripo's nodes support follow-up processing of the same task: simply feed the corresponding `model_task_id` into the relevant node. The templates already include these nodes, and you can enable them as needed by changing the node mode
The `Tripo: Refine Draft model` node only works with the v1.4 model; other models are not supported
# ComfyUI ACE-Step Native Example
Source: https://docs.comfy.org/zh-CN/tutorials/audio/ace-step/ace-step-v1
This guide walks you through using the ACE-Step model in ComfyUI to create expressive music.
ACE-Step is an open-source music generation foundation model jointly developed by the Chinese team StepFun and ACE Studio, designed to give music creators efficient, flexible, and high-quality tools for generating and editing music.
The model is released under the [Apache-2.0](https://github.com/ace-step/ACE-Step?tab=readme-ov-file#-license) license and is free for commercial use.
As a powerful music generation foundation, ACE-Step offers rich extensibility. Using fine-tuning techniques such as LoRA and ControlNet, developers can customize the model for their own needs.
Whether for audio editing, vocal synthesis, accompaniment production, voice cloning, or style transfer, ACE-Step provides stable and reliable technical support.
This flexible architecture greatly simplifies the development of music AI applications, letting more creators quickly bring AI into music creation.
ACE-Step has already released training code, including LoRA model training; ControlNet training code will follow later. Visit their [GitHub](https://github.com/ace-step/ACE-Step?tab=readme-ov-file#-roadmap) for more details.
Make sure your ComfyUI is up to date.
* [Download ComfyUI](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not running the latest nightly (development) version.
2. You are running the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## ACE-Step ComfyUI Text-to-Audio Workflow Example
### 1. Download the Workflow and Models
Click the button below to download the workflow file, then drag it into ComfyUI to load the workflow; it includes the model download information.
Download the workflow file in JSON format
You can also manually download [ace\_step\_v1\_3.5b.safetensors](https://huggingface.co/Comfy-Org/ACE-Step_ComfyUI_repackaged/blob/main/all_in_one/ace_step_v1_3.5b.safetensors) and save it to the `ComfyUI/models/checkpoints` folder
### 2. Run the Workflow Step by Step
1. Make sure the `Load Checkpoints` node has loaded the `ace_step_v1_3.5b.safetensors` model
2. (Optional) Set the duration of the generated music in the `EmptyAceStepLatentAudio` node
3. (Optional) In the `LatentOperationTonemapReinhard` node, adjust the `multiplier` to control the vocal volume (larger values make the vocals more prominent)
4. (Optional) Enter the music style and similar tags in the `tags` field of `TextEncodeAceStepAudio`
5. (Optional) Enter the lyrics in the `lyrics` field of `TextEncodeAceStepAudio`; if you are not sure what to write, see the prompt guide later in this document
6. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the audio.
7. When the workflow finishes, you can view the generated audio in the `Save Audio` node and click to play it; the audio is also saved to `ComfyUI/output/audio` (the subdirectory name is set by the `Save Audio` node).
## ACE-Step ComfyUI Audio-to-Audio Workflow
Much like an image-to-image workflow, you can input a piece of music and use the workflow below to resample and regenerate it. You can also control how much the result differs from the original audio with the `denoise` parameter of the `KSampler`.
This process lets you re-edit a piece of music to achieve the effect you want.
### 1. Download the Workflow File
Click the button below to download the workflow file, then drag it into ComfyUI to load the workflow
Download the workflow file in JSON format
Download the audio below to use as input
Download the sample audio file for input
### 2. Run the Workflow Step by Step
1. Make sure the `Load Checkpoints` node has loaded the `ace_step_v1_3.5b.safetensors` model
2. Upload the provided audio file in the `LoadAudio` node
3. (Optional) Enter the music style tags and lyrics in the `tags` and `lyrics` fields of `TextEncodeAceStepAudio`; providing lyrics is very important for audio editing
4. (Optional) Adjust the `denoise` parameter of the `KSampler` node to control how much noise is added during sampling and therefore how similar the result is to the original audio (smaller values stay closer to the original; at `1.00` it is effectively as if there were no audio input)
5. Click the `Run` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the audio.
6. When the workflow finishes, you can view the generated audio in the `Save Audio` node and click to play it; the audio is also saved to `ComfyUI/output/audio` (the subdirectory name is set by the `Save Audio` node).
### 3. Workflow Notes
1. In the example workflow, the `tags` in `TextEncodeAceStepAudio` change the original male vocals to `female voice` to generate a female-voiced version
2. In the example workflow, the `lyrics` in `TextEncodeAceStepAudio` have been modified from the original; see the examples on the ACE-Step project page to learn how to make such edits
## ACE-Step Prompt Guide
ACE-Step currently uses two prompt fields: `tags` and `lyrics`.
* `tags`: mainly describes the style, scene, and so on of the music. Like prompts for other generation tasks, it describes the overall style and requirements of the audio, with entries separated by English commas
* `lyrics`: mainly the lyrics. It supports structure tags such as \[verse], \[chorus], and \[bridge] to mark the different sections; for instrumental music you can also enter instrument names
You can find plenty of `tags` and `lyrics` examples on the [ACE-Step project page](https://ace-step.github.io/). This guide organizes some of them from the project so you can quickly try combinations and get the result you want
### Tags (prompt)
#### Mainstream Music Styles
Combine short tags to generate music in a specific style
* electronic
* rock
* pop
* funk
* soul
* cyberpunk
* acid jazz
* electro
* em (electronic music)
* soft electric drums
* melodic
#### Scene Types
Combine specific usage scenarios and moods to generate music that fits the atmosphere
* background music for parties
* radio broadcasts
* workout playlists
#### Instrument Elements
* saxophone, jazz
* piano, violin
#### Vocal Types
* female voice
* male voice
* clean vocals
#### Technical Terms
Use common music production terms to precisely control the result
* 110 bpm
* fast tempo
* slow tempo
* loops
* fills
* acoustic guitar
* electric bass
{/* - Variance control:
- variance=0.3 (controls how far the generated music deviates from the original style) */}
{/* - Lyric editing:
- edit lyrics: 'When I was young' -> 'When you were kid' (lyric editing example) */}
### Lyrics
#### Lyric Structure Tags
* \[intro]
* \[verse]
* \[pre-chorus]
* \[chorus]
* \[bridge]
* \[outro]
* \[hook]
* \[refrain]
* \[interlude]
* \[breakdown]
* \[ad-lib]
#### Multilingual Support
* ACE-Step V1 supports multiple languages. In practice, ACE-Step converts lyrics in other languages into romanized (English-letter) form before generating the music.
* ComfyUI does not yet implement this conversion for every language; currently only [Japanese hiragana and katakana characters](https://github.com/comfyanonymous/ComfyUI/commit/5d3cc85e13833aeb6ef9242cdae243083e30c6fc) are handled
So to generate music in other languages, first romanize the lyrics yourself, then prefix each line in `lyrics` with the abbreviation of the language code, such as `[zh]` for Chinese or `[ko]` for Korean
For example:
```
[verse]
[zh]wo3zou3guo4shen1ye4de5jie1dao4
[zh]leng3feng1chui1luan4si1nian4de5piao4liang4wai4tao4
[zh]ni3de5wei1xiao4xiang4xing1guang1hen3xuan4yao4
[zh]zhao4liang4le5wo3gu1du2de5mei3fen1mei3miao3
[chorus]
[verse]
[ko]hamkke si-kkeuleo-un sesang-ui sodong-eul pihae
[ko]honja ogsang-eseo dalbich-ui eolyeompus-ileul balaboda
[ko]niga salang-eun lideum-i ganghan eum-ag gatdago malhaess-eo
[ko]han ta han tamada ma-eum-ui ondoga eolmana heojeonhanji ijge hae
[bridge]
[es]cantar mi anhelo por ti sin ocultar
[es]como poesía y pintura, lleno de anhelo indescifrable
[es]tu sombra es tan terca como el viento, inborrable
[es]persiguiéndote en vuelo, brilla como cruzar una mar de nubes
[chorus]
[fr]que tu sois le vent qui souffle sur ma main
[fr]un contact chaud comme la douce pluie printanière
[fr]que tu sois le vent qui s'entoure de mon corps
[fr]un amour profond qui ne s'éloignera jamais
```
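Prefixing every romanized line with its language tag can be automated with a tiny helper like the one below. This is just an illustration (the romanization itself, e.g. pinyin for Chinese, must be produced beforehand with a tool of your choice):

```python
def tag_lines(lines, lang_code: str) -> str:
    """Prefix each romanized lyric line with a language tag like [zh] or [ko].

    `lines` are already-romanized lyric lines; `lang_code` is the
    two-letter language code ACE-Step expects, without brackets.
    """
    return "\n".join(f"[{lang_code}]{line}" for line in lines)

print(tag_lines(["wo3 ai4 ni3", "ni3 hao3"], "zh"))
# [zh]wo3 ai4 ni3
# [zh]ni3 hao3
```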
ACE-Step currently supports 19 languages, but the following ten are supported better:
* English
* Chinese: \[zh]
* Russian: \[ru]
* Spanish: \[es]
* Japanese: \[ja]
* German: \[de]
* French: \[fr]
* Portuguese: \[pt]
* Italian: \[it]
* Korean: \[ko]
The language tags above were not fully tested at the time of writing. If a language tag is incorrect, please [file an issue in our docs repository](https://github.com/Comfy-Org/docs/issues) and we will fix it promptly
## ACE-Step Resources
* [Project page](https://ace-step.github.io/)
* [Hugging Face model](https://huggingface.co/ACE-Step/ACE-Step-v1-3.5B)
* [GitHub repository](https://github.com/ace-step/ACE-Step)
* [Training scripts](https://github.com/ace-step/ACE-Step?tab=readme-ov-file#-train)
# ComfyUI Image-to-Image Workflow
Source: https://docs.comfy.org/zh-CN/tutorials/basic/image-to-image
This guide introduces the image-to-image workflow and walks you through one.
## What Is Image-to-Image
Image-to-image is a ComfyUI workflow that takes an image as input and generates a new image from it.
Image-to-image is useful in scenarios such as:
* Converting the style of the original image, e.g. turning a realistic photo into an art style
* Turning line art into a realistic image
* Image restoration
* Colorizing old photos
* ... and more
As an analogy: you ask a painter to produce a work with the specific effect you require, based on your reference image.
If you compare this tutorial carefully with the [text-to-image](/zh-CN/tutorials/basic/text-to-image) tutorial, you'll find the two flows are very similar; image-to-image simply adds a reference image as an input condition. In text-to-image, we let the painter (the image model) generate freely from our prompt, while in image-to-image we let the painter generate from both our reference image and our prompt.
## ComfyUI Image-to-Image Workflow Walkthrough
### 1. Install a Model
Make sure you have at least one SD1.5 model file in the `ComfyUI/models/checkpoints` folder. If you don't know how to install models yet, see the model installation instructions in [Your First Image Generation with ComfyUI](/zh-CN/get_started/first_generation#3-安装绘图模型).
You can use any of the following models:
* [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors)
* [Dreamshaper 8](https://civitai.com/models/4384?modelVersionId=128713)
* [Anything V5](https://civitai.com/models/9409?modelVersionId=30163)
### 2. Image-to-Image Workflow Files
Save the image below, then **drag it into ComfyUI or open it with ComfyUI** to load the corresponding workflow

Or load the **image to image** workflow from ComfyUI's **workflow templates**
Download the image below as the example input; we'll use it in the steps that follow
### 3. Run the Image-to-Image Workflow
After loading the workflow, follow the numbered steps against the image to complete the example generation
1. Load your local image model in the **Load Checkpoint** node
2. Click the `upload` button in the **Load Image** node and upload the image provided in the preparation step
3. Click the `Queue` button, or use the shortcut `Ctrl + Enter` to generate the image
## Try It Yourself
1. Try changing the `denoise` parameter in the **KSampler** node, moving it gradually from 1 toward 0, and observe how the generated image changes
2. Swap in your own prompts and reference images to create your own results
## Key Points of the Image-to-Image Workflow
The core of the image-to-image workflow is that the `denoise` parameter in the `KSampler` node must be **less than 1**
If you experiment with the `denoise` parameter, you will find:
* the smaller `denoise` is, the smaller the difference between the generated image and the reference image;
* the larger `denoise` is, the larger the difference.
This is because `denoise` determines how much noise is added to the latent image after the input image is encoded into latent space. If `denoise` is 1, the latent becomes completely random noise, the same as the latent produced by the `Empty Latent Image` node, and all features of the reference image are lost.
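As a rough sketch of this idea (illustrative only; ComfyUI's samplers use a proper noise schedule rather than a simple linear blend), the effect of `denoise` can be pictured as blending the encoded latent with random noise:

```python
import numpy as np

def add_noise(latent: np.ndarray, denoise: float, seed: int = 0) -> np.ndarray:
    """Blend a latent with random noise; a simplified picture of what
    `denoise` controls in KSampler (not ComfyUI's actual scheduler math)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(latent.shape)
    # denoise = 0.0 keeps the encoded image unchanged; denoise = 1.0 is
    # pure noise, the same as starting from an Empty Latent Image.
    return (1.0 - denoise) * latent + denoise * noise

latent = np.ones((4, 64, 64))  # a stand-in for a VAE-encoded reference image
print(np.allclose(add_noise(latent, 0.0), latent))   # True: reference kept
```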
For the underlying theory, see the explanation in the [text-to-image](/zh-CN/tutorials/basic/text-to-image) tutorial.
# ComfyUI Inpainting Workflow
Source: https://docs.comfy.org/zh-CN/tutorials/basic/inpaint
This guide introduces the inpainting workflow in ComfyUI, walks you through an inpainting example, and covers the use of the Mask Editor
We will look at the concept of inpainting in AI image generation and complete an inpainting workflow in ComfyUI, covering:
* Modifying an image with an inpainting workflow
* Understanding and using ComfyUI's Mask Editor
* The VAE Encode (for Inpainting) node
## About Inpainting
When generating images with AI, we often get a result that is largely satisfying but contains a few unwanted or incorrect elements. Regenerating could produce a completely different image, so repairing just those elements with inpainting is very useful.
It's like asking a **painter (the AI model)** to paint a picture that still has **a few areas needing adjustment**: we indicate **the areas to adjust (the mask)**, and the painter **repaints** them according to our requirements.
Inpainting scenarios include:
* **Defect repair:** removing unwanted objects from a photo, fixing incorrectly generated limbs, and so on
* **Detail refinement:** precisely adjusting local elements (e.g. changing a clothing texture, adjusting a facial expression)
* and other scenarios
## ComfyUI Inpainting Workflow Walkthrough
### Models and Assets
#### 1. Install a Model
Download the model file below and save it to the `ComfyUI/models/checkpoints` directory
* [512-inpainting-ema.safetensors](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.safetensors)
#### 2. Inpainting Asset
Download the image below; we will use it as the input in this example
This image already contains an alpha transparency channel, so you don't need to draw a mask manually. This tutorial also covers how to draw masks with the Mask Editor, and we will guide you step by step through the whole inpainting process
#### 3. Inpainting Workflow
The metadata of the image below contains the corresponding JSON workflow. Download it and **drag it into** the ComfyUI interface, or use the menu **Workflow** --> **Open (shortcut `Ctrl + O`)** to load this inpainting workflow

### Running the Inpainting Example
Follow the numbered steps against the image to complete the following operations:
1. Make sure the model you downloaded is loaded
2. Load the inpainting asset in the `Load Image` node
3. Click the `Queue` button, or use the shortcut `Ctrl + Enter` to generate the image
For comparison, the image below shows the result of inpainting with the [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) model.
You will find that the [512-inpainting-ema.safetensors](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.safetensors) model produces a better inpainting result with a more natural transition.
That is because this model is purpose-built for inpainting: it gives better control over the generated region and therefore better inpainting results.
Remember the analogy we have been using? Different models are like painters with different abilities, and each painter has limits; choosing the right model gives you better results.
You can try the following to push the image toward what you want:
1. Modify the positive and negative prompts with more specific descriptions
2. Run the workflow several times so that `KSampler` uses different seeds, producing different results
3. After reading the Mask Editor section of this guide, inpaint the generated result again until you are satisfied.
Next we will briefly look at the **Mask Editor**. The input image provided earlier already contains an `alpha` transparency channel (the region we want edited during generation), so you didn't need to draw anything by hand, but in everyday use you will more often draw masks with the **Mask Editor**
### Drawing a Mask with the Mask Editor
First, right-click the `Save Image` node from the previous run and choose `Copy (Clipspace)` from the context menu to copy the current image to the clipboard
Then right-click the **Load Image** node and choose `Paste (Clipspace)` from the context menu to paste the image from the clipboard
Then right-click the **Load Image** node again and choose `Open in MaskEditor` to open the Mask Editor
1. Edit the parameters on the right, such as brush size and opacity
2. Use the eraser to remove any mistakes
3. When you are done, click the `Save` button to save the mask
The drawn content is then passed as the mask input to the VAE Encode (for Inpainting) node and encoded together with the image
Then adjust the prompts and generate again until you get a result you are happy with.
## Inpainting-Related Nodes
Comparing the [text-to-image](/zh-CN/tutorials/basic/text-to-image) and [image-to-image](/zh-CN/tutorials/basic/image-to-image) workflows with this one, you can see that the main difference between them lies in the conditioning input around the VAE stage.
In this workflow we use the **VAE Encode (for Inpainting)** node, which is dedicated to inpainting; it gives better control over the generated region and therefore better results.
**Input types**
| Parameter | Function |
| -------------- | ------------------------------------------------------ |
| `pixels` | The input image to encode into latent space. |
| `vae` | The VAE model used to encode the image from pixel space into latent space. |
| `mask` | The image mask specifying which region should be modified. |
| `grow_mask_by` | The number of pixels to expand the mask outward, creating a transition zone around the masked region so the repainted area blends into the original image without a hard edge. |
**Output types**
| Parameter | Function |
| -------- | ----------------- |
| `latent` | The latent image produced by the VAE encoding. |
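To make `grow_mask_by` concrete, here is a minimal sketch of expanding a binary mask outward by N pixels (illustrative only; ComfyUI's own implementation differs in detail):

```python
import numpy as np

def grow_mask(mask: np.ndarray, grow_by: int) -> np.ndarray:
    """Expand a binary mask outward by `grow_by` pixels, a rough sketch of
    what the `grow_mask_by` parameter does."""
    grown = mask.copy()
    for _ in range(grow_by):
        padded = np.pad(grown, 1)  # zero-pad so edges stay in bounds
        # a pixel becomes masked if any of its 4 neighbours is masked
        grown = (padded[1:-1, 1:-1] | padded[:-2, 1:-1] | padded[2:, 1:-1]
                 | padded[1:-1, :-2] | padded[1:-1, 2:])
    return grown

mask = np.zeros((7, 7), dtype=np.uint8)
mask[3, 3] = 1                      # a single masked pixel
print(grow_mask(mask, 2).sum())     # 13: a diamond of radius 2
```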
# ComfyUI LoRA Example
Source: https://docs.comfy.org/zh-CN/tutorials/basic/lora
This guide introduces and walks you through using a single LoRA model
A **LoRA (Low-Rank Adaptation)** model is an efficient technique for fine-tuning large generative models such as Stable Diffusion.
It introduces trainable low-rank matrices on top of the pretrained model and adjusts only a subset of parameters instead of retraining the whole model, achieving task-specific optimization at much lower computational cost. Compared with a large model like SD1.5, a LoRA model is smaller and easier to train.
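The low-rank idea can be sketched in a few lines of NumPy: instead of learning a full weight update, LoRA learns two small matrices whose product approximates it (illustrative only; a real LoRA applies such an update per attention layer, with a scaling factor):

```python
import numpy as np

rng = np.random.default_rng(0)
d, rank = 768, 8                      # layer width and LoRA rank (examples)

W = rng.standard_normal((d, d))       # frozen pretrained weight
A = rng.standard_normal((rank, d))    # trainable, rank x d
B = np.zeros((d, rank))               # trainable, d x rank (initialized to zero)

strength = 0.8                        # plays the role of strength_model
W_adapted = W + strength * (B @ A)    # LoRA update: a low-rank delta on W

# Trainable parameters: 2*d*rank for LoRA vs d*d for full fine-tuning
print(2 * d * rank, d * d)            # 12288 589824
```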
The image above compares generation under identical parameters with [dreamshaper\_8](https://civitai.com/models/4384?modelVersionId=128713) alone versus with the [blindbox\_V1Mix](https://civitai.com/models/25995/blindbox) LoRA model. As you can see, a LoRA model lets you generate images closer to what you want without modifying the base model.
We will demonstrate how to use a LoRA. All LoRA variants, such as LyCORIS, LoHa, LoKr, and LoCon, are used in the same way.
In this example we will learn how to load and use LoRA models in [ComfyUI](https://github.com/comfyanonymous/ComfyUI), covering:
1. Installing a LoRA model
2. Generating images with a LoRA model
3. A brief introduction to the `Load LoRA` node
## Install the Models
Download [dreamshaper\_8.safetensors](https://civitai.com/api/download/models/128713?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16) and save it to the `ComfyUI/models/checkpoints` directory
Download [blindbox\_V1Mix.safetensors](https://civitai.com/api/download/models/32988?type=Model\&format=SafeTensor\&size=full\&fp=fp16) and save it to the `ComfyUI/models/loras` directory
## LoRA Workflow File
Download the workflow image below and drag it into ComfyUI to load the workflow

An image whose metadata contains the workflow JSON can be dragged into ComfyUI directly, or loaded via the menu `Workflows` -> `Open (Ctrl+O)`.
## Run the Workflow Step by Step
Follow the steps in the image below to make sure the workflow runs correctly
1. Make sure `Load Checkpoint` has loaded `dreamshaper_8.safetensors`
2. Make sure `Load LoRA` has loaded `blindbox_V1Mix.safetensors`
3. Click the `Queue` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the image
## About the Load LoRA Node
Models located in `ComfyUI\models\loras` are detected by ComfyUI and can be loaded in this node
### Input types
| Parameter | Function |
| ---------------- | ----------------------------------------- |
| `model` | The base model to connect |
| `clip` | The CLIP model to connect |
| `lora_name` | The LoRA model to load |
| `strength_model` | How strongly the LoRA affects the model weights; larger values make the LoRA style stronger |
| `strength_clip` | How strongly the LoRA affects the CLIP text embeddings |
### Output types
| Parameter | Function |
| ------- | ---------------------- |
| `model` | The model with the LoRA adjustments applied |
| `clip` | The CLIP model with the LoRA adjustments applied |
This node supports chaining: several `Load LoRA` nodes can be connected in series to apply multiple LoRA models. See [ComfyUI Multiple LoRAs Example](/zh-CN/tutorials/basic/multiple-loras) for details
## Try It Yourself
1. Try changing the prompt, or adjusting the `Load LoRA` node's parameters such as `strength_model`, to observe how the generated image changes and get familiar with the node.
2. Visit the [CivitAI](https://civitai.com/models) site and download LoRA models of other styles to try.
# ComfyUI Multiple LoRAs Example
Source: https://docs.comfy.org/zh-CN/tutorials/basic/multiple-loras
This guide walks you through applying multiple LoRA models at the same time in ComfyUI
In the [ComfyUI LoRA Example](/zh-CN/tutorials/basic/lora) we covered loading and using a LoRA model in ComfyUI and mentioned that the node supports chaining.
In this guide we will chain `Load LoRA` nodes to use several LoRA models at once. In this example we use the [blindbox\_V1Mix](https://civitai.com/models/25995?modelVersionId=32988) and [MoXinV1](https://civitai.com/models/12597?modelVersionId=14856) LoRA models.
The image below shows the two LoRA models used individually with the same parameters
By chaining multiple LoRA models, however, we can see the two styles blended together in the final result
## Install the Models
Download [dreamshaper\_8.safetensors](https://civitai.com/api/download/models/128713?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16) and save it to the `ComfyUI/models/checkpoints` directory
Download [blindbox\_V1Mix.safetensors](https://civitai.com/api/download/models/32988?type=Model\&format=SafeTensor\&size=full\&fp=fp16) and save it to the `ComfyUI/models/loras` directory
Download [MoXinV1.safetensors](https://civitai.com/api/download/models/14856?type=Model\&format=SafeTensor\&size=full\&fp=fp16) and save it to the `ComfyUI/models/loras` directory
## Multiple LoRA Example Workflow
Download the workflow image below and drag it into ComfyUI to load the workflow

An image whose metadata contains the workflow JSON can be dragged into ComfyUI directly, or loaded via the menu `Workflows` -> `Open (Ctrl+O)`.
## Run the Workflow Step by Step
Follow the steps in the image below to make sure the workflow runs correctly
1. Make sure `Load Checkpoint` loads **dreamshaper\_8.safetensors**
2. Make sure the first `Load LoRA` loads **blindbox\_V1Mix.safetensors**
3. Make sure the second `Load LoRA` loads **MoXinV1.safetensors**
4. Click the `Queue` button, or use the shortcut `Ctrl(Cmd) + Enter` to generate the image
## Try It Yourself
1. Try adjusting the `strength_model` parameter of the two `Load LoRA` nodes to change how much each LoRA model affects the final image
2. Visit the [CivitAI](https://civitai.com/models) site and download LoRA models of other styles to combine into a result you like
# ComfyUI Outpainting Workflow Example
Source: https://docs.comfy.org/zh-CN/tutorials/basic/outpaint
This guide introduces the outpainting workflow in ComfyUI and walks you through an outpainting example
We will look at the concept of outpainting in AI image generation and complete an outpainting workflow in ComfyUI, covering:
* Extending an image with an outpainting workflow
* Understanding and using the outpainting-related nodes in ComfyUI
* Mastering the basic outpainting procedure
## About Outpainting
In AI image generation we often meet this need: an existing image is well composed, but its field of view is too small, and we want to extend the canvas to get a larger scene. That is what outpainting is for.
It's like asking the **painter (the AI model)** to extend an existing painting outward into a larger scene. We tell the painter **which directions to extend and by how much**, and the painter extends the scene plausibly based on the existing content.
Its requirements are essentially the same as [inpainting](/zh-CN/tutorials/basic/inpaint); the difference is **which node builds the mask**
Outpainting scenarios include:
* **Scene extension:** enlarging the scene to show a more complete environment
* **Composition adjustment:** optimizing the overall composition by extending the canvas
* **Content addition:** adding more related scene elements to the original image
## ComfyUI Outpainting Workflow Walkthrough
### Preparation
#### 1. Install a Model
* [512-inpainting-ema.safetensors](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.safetensors)
#### 2. Input Image
Prepare an image you want to extend. In this example we will use the following image:
#### 3. Outpainting Workflow
Download the image below and **drag it into** the ComfyUI interface, or use the menu **Workflow** --> **Open (shortcut `Ctrl + O`)** to load this outpainting workflow

### Running the Outpainting Workflow
The key steps of the outpainting workflow are:
1. Load your locally installed model file in the `Load Checkpoint` node
2. Click the `Upload` button in the `Load Image` node and upload your image
3. Click the `Queue` button, or use the shortcut `Ctrl + Enter` to generate the image
In this workflow, the `Pad Image for outpainting` node controls the direction and extent of the image extension. This is really an [inpainting](/zh-CN/tutorials/basic/inpaint) workflow; the difference is which node builds the mask.
### The Pad Image for outpainting Node
This node takes an input image and outputs a padded image together with the corresponding mask, where the mask is built from the node's padding parameters.
#### Input parameters
| Parameter | Function |
| ------------ | ---------------------------- |
| `image` | The input image |
| `left` | Padding on the left |
| `top` | Padding on the top |
| `right` | Padding on the right |
| `bottom` | Padding on the bottom |
| `feathering` | Controls how smoothly the original image transitions into the added padding; larger values are smoother |
#### Output parameters
| Parameter | Function |
| ------- | ---------------------- |
| `image` | The padded image |
| `mask` | A mask marking the original image versus the added padding regions |
#### Node output
After processing by the `Pad Image for outpainting` node, the output image and mask previews look like this:
You can see in the outputs that
* `Image` is the extended image
* `Mask` marks the extended regions
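A minimal sketch of what this node does (ignoring `feathering`; ComfyUI's implementation differs in detail) might look like:

```python
import numpy as np

def pad_for_outpainting(image: np.ndarray, left: int, top: int,
                        right: int, bottom: int):
    """Rough sketch of Pad Image for outpainting (without feathering):
    pad the image on each side and return a mask that is 1 over the
    newly added regions and 0 over the original pixels."""
    h, w = image.shape[:2]
    padded = np.pad(image, ((top, bottom), (left, right), (0, 0)))
    mask = np.ones((h + top + bottom, w + left + right), dtype=np.float32)
    mask[top:top + h, left:left + w] = 0.0   # 0 = keep original pixels
    return padded, mask

img = np.ones((64, 64, 3), dtype=np.uint8)
padded, mask = pad_for_outpainting(img, 32, 0, 32, 0)  # extend left + right
print(padded.shape, int(mask.sum()))   # (64, 128, 3) 4096
```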
# ComfyUI Text-to-Image Workflow
Source: https://docs.comfy.org/zh-CN/tutorials/basic/text-to-image
This guide introduces the concept of text-to-image in AI image generation and completes a text-to-image workflow in ComfyUI
The goal of this guide is to give you a first look at ComfyUI's text-to-image workflow and a basic understanding of the ComfyUI nodes involved.
In this document we will:
* Complete a text-to-image workflow
* Gain a basic understanding of how diffusion models work
* Learn the functions and roles of the nodes in the workflow
* Get a first look at the SD1.5 model
We will run the text-to-image workflow first and then explain the concepts; start with whichever part you need.
## About Text-to-Image
**Text-to-image** is the foundational process in AI image generation: it generates an image from a text description, and its core is a **diffusion model**.
Text-to-image generation needs the following:
* **Painter:** the image model
* **Canvas:** the latent space
* **Requirements for the image (prompts):** the positive prompt (elements you want in the image) and the negative prompt (elements you don't want in the image)
This text-to-image process can be understood simply as telling your **requirements (positive and negative prompts)** to a **painter (the image model)**, who then paints what you asked for.
## ComfyUI Text-to-Image Workflow Walkthrough
### 1. Before You Start
Make sure you have at least one SD1.5 model file in the `ComfyUI/models/checkpoints` folder. If you don't know how to install models yet, see the model installation instructions in [Your First Image Generation with ComfyUI](/zh-CN/get_started/first_generation#3-安装绘图模型).
You can use any of the following models:
* [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors)
* [Dreamshaper 8](https://civitai.com/models/4384?modelVersionId=128713)
* [Anything V5](https://civitai.com/models/9409?modelVersionId=30163)
### 2. Load the Text-to-Image Workflow
Download the image below and drag it into the ComfyUI interface, or use the menu **Workflows** -> **Open** to open the image and load the corresponding workflow

You can also choose the **Text to Image** workflow from the menu **Workflows** -> **Browse example workflows**
### 3. Load the Model and Generate Your First Image
After installing the image model, follow the steps in the image below to load it and generate your first image
Follow the numbered steps against the image:
1. In the **Load Checkpoint** node, use the arrows or click the text area to make sure **v1-5-pruned-emaonly-fp16.safetensors** is selected, and that switching with the left/right arrows does not show the text **null**
2. Click the `Queue` button, or use the shortcut `Ctrl + Enter` to generate the image
When the run finishes, you should see the resulting image in the **Save Image** node; right-click it to save it locally
If you are not satisfied with the result, run the generation a few more times. On each run, the **KSampler** uses a different random seed based on its `seed` parameter, so each result will be different
### 4. Try It Yourself
Try editing the text in the **CLIP Text Encoder** nodes
The one connected to the KSampler node's `Positive` input is the positive prompt, and the one connected to its `Negative` input is the negative prompt
Here are some simple prompting principles for the SD1.5 model:
* Use English whenever possible
* Separate prompts with English commas `,`
* Prefer phrases over long sentences
* Use specific descriptions
* You can use expressions like `(golden hour:1.2)` to increase the weight of particular keywords so they are more likely to appear in the image; `1.2` is the weight and `golden hour` is the keyword
* You can use keywords such as `masterpiece, best quality, 4k` to improve generation quality
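As an illustration of the `(keyword:weight)` syntax, a minimal parser might look like this (a sketch only; ComfyUI's actual prompt parser also handles nesting and escaping):

```python
import re

# Matches "(keyword:1.2)"-style spans: text, a colon, then a number.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt: str):
    """Extract (keyword:weight) pairs from a prompt string."""
    return [(kw.strip(), float(w)) for kw, w in WEIGHT_RE.findall(prompt)]

print(parse_weights("a portrait, (golden hour:1.2), (film grain:1.1)"))
# [('golden hour', 1.2), ('film grain', 1.1)]
```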
Below are several groups of example prompts. Try them to see what they generate, or experiment with your own prompts
**1. Anime Style**
Positive prompt:
```
anime style, 1girl with long pink hair, cherry blossom background, studio ghibli aesthetic, soft lighting, intricate details
masterpiece, best quality, 4k
```
Negative prompt:
```
low quality, blurry, deformed hands, extra fingers
```
**2. Realistic Style**
Positive prompt:
```
(ultra realistic portrait:1.3), (elegant woman in crimson silk dress:1.2),
full body, soft cinematic lighting, (golden hour:1.2),
(fujifilm XT4:1.1), shallow depth of field,
(skin texture details:1.3), (film grain:1.1),
gentle wind flow, warm color grading, (perfect facial symmetry:1.3)
```
Negative prompt:
```
(deformed, cartoon, anime, doll, plastic skin, overexposed, blurry, extra fingers)
```
**3. Specific Artist Style**
Positive prompt:
```
fantasy elf, detailed character, glowing magic, vibrant colors, long flowing hair, elegant armor, ethereal beauty, mystical forest, magical aura, high detail, soft lighting, fantasy portrait, Artgerm style
```
Negative prompt:
```
blurry, low detail, cartoonish, unrealistic anatomy, out of focus, cluttered, flat lighting
```
## How Text-to-Image Works
The whole text-to-image process can be understood as the **reverse diffusion process** of a diffusion model. The [v1-5-pruned-emaonly-fp16.safetensors](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive/blob/main/v1-5-pruned-emaonly-fp16.safetensors) file we downloaded is a trained model that can **generate a target image from pure Gaussian noise**: we only need to supply our prompt, and it denoises random noise into the target image.
```mermaid
graph LR
A[Pure Gaussian noise] --> B[Iterative denoising]
B --> C[Intermediate latents]
C --> D[Final generated image]
E[Text prompt] --> F[CLIP encoder]
F --> G[Semantic vector]
G --> B
```
Two concepts are worth understanding:
1. **Latent space:** The latent space is an abstract data representation used by diffusion models. Converting images from pixel space to latent space reduces storage, makes diffusion-model training easier, and lowers the complexity of denoising. It's like an architect designing with blueprints (latent space) instead of working on the building itself (pixel space): the structural features are preserved while the cost of making changes drops dramatically
2. **Pixel space:** The pixel space is where images are stored as pixel values; it is the image we finally see.
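To make the latent-space savings concrete: SD1.5's VAE downsamples each spatial dimension by a factor of 8 and stores 4 channels, so a 512×512 RGB image becomes a 4×64×64 latent:

```python
# Pixel space: a 512x512 RGB image
pixel = 512 * 512 * 3

# Latent space for SD1.5: 4 channels, spatial dims downsampled 8x
latent = 4 * (512 // 8) * (512 // 8)

print(pixel, latent, pixel / latent)   # 786432 16384 48.0
```

Diffusion therefore runs on roughly 1/48 as many values as it would in pixel space.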
If you'd like to learn more about diffusion models, see the following papers:
* [Denoising Diffusion Probabilistic Models (DDPM)](https://arxiv.org/pdf/2006.11239)
* [Denoising Diffusion Implicit Models (DDIM)](https://arxiv.org/pdf/2010.02502)
* [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/pdf/2112.10752)
## Text-to-Image Workflow Nodes
### A. Load Checkpoint Node
This node loads the image model. A `checkpoint` usually contains three components: `MODEL (UNet)`, `CLIP`, and `VAE`
* `MODEL (UNet)`: the model's UNet, responsible for noise prediction and image generation during diffusion; it drives the diffusion process
* `CLIP`: the text encoder. The model cannot understand our text prompts directly, so the prompt must be encoded into a semantic vector the model can understand
* `VAE`: the variational autoencoder. The diffusion model works in latent space while our images live in pixel space, so the image must be converted into latent space for diffusion, and the latent converted back into an image at the end
### B. Empty Latent Image Node
Defines a latent space that is passed to the KSampler node. The Empty Latent Image node builds a **pure-noise latent space**
You can think of it as defining the canvas size, i.e. the dimensions of the final generated image
### C. CLIP Text Encoder Node
Encodes the prompts, i.e. your requirements for the image
* The one connected to the KSampler node's `Positive` conditioning input is the positive prompt (elements you want in the image)
* The one connected to the KSampler node's `Negative` conditioning input is the negative prompt (elements you don't want in the image)
The prompts are encoded into semantic vectors by the `CLIP` component from the `Load Checkpoint` node and output to the KSampler node as conditioning
### D. KSampler Node
The **KSampler** is the core of the whole workflow: the entire denoising process happens in this node, which finally outputs a latent image
```mermaid
graph LR
A[Random noise latent] --> B{KSampler}
C[Diffusion model] --> B
D[CLIP semantic vectors] --> B
B --> E[Denoised latent]
```
The KSampler node's parameters are as follows
| Parameter | Description | Role |
| ---------------------------- | ---------- | ------------------------------------ |
| **model** | The diffusion model used for denoising | Determines the style and quality of the generated image |
| **positive** | Encoded positive prompt conditioning | Steers generation toward the specified elements |
| **negative** | Encoded negative prompt conditioning | Suppresses unwanted content |
| **latent\_image** | The latent image to denoise | The carrier the noise is initialized into |
| **seed** | Random seed for noise generation | Controls the randomness of the result |
| **control\_after\_generate** | Seed control mode after generation | Determines how the seed changes across batch runs |
| **steps** | Number of denoising iterations | More steps give finer detail but take longer |
| **cfg** | Classifier-free guidance scale | Controls how strongly the prompt constrains the result (too high causes overfitting) |
| **sampler\_name** | Sampling algorithm | Determines the mathematical method of the denoising path |
| **scheduler** | Scheduler type | Controls the noise decay rate and step-size allocation |
| **denoise** | Denoising strength | Controls how much noise is added to the latent: 0.0 preserves the original input features, 1.0 is pure noise |
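The loop these parameters drive can be sketched as follows (purely illustrative: real samplers such as Euler or DPM++ follow a proper noise schedule and call a trained UNet with CFG guidance):

```python
import numpy as np

def sample(model, latent, steps: int, denoise: float, seed: int):
    """Toy denoising loop: start from a (partially) noised latent and
    repeatedly subtract the model's noise prediction."""
    rng = np.random.default_rng(seed)           # seed controls the noise
    noise = rng.standard_normal(latent.shape)
    x = (1.0 - denoise) * latent + denoise * noise  # denoise=1.0: pure noise
    for _ in range(steps):                      # steps denoising iterations
        predicted_noise = model(x)              # stand-in for the UNet
        x = x - predicted_noise / steps         # remove a fraction each step
    return x

# A fake "model" that predicts a small residual, for illustration only:
out = sample(lambda x: x * 0.1, np.zeros((4, 64, 64)),
             steps=20, denoise=1.0, seed=0)
print(out.shape)   # (4, 64, 64)
```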
In the KSampler node, the latent space is initialized with random noise using `seed`, and the `Positive` and `Negative` semantic vectors are fed to the diffusion model as conditioning
Then, over the number of iterations given by `steps`, the node denoises the latent; each pass applies the denoising strength given by `denoise` and produces a new latent image
### E. VAE Decode Node
Converts the latent image output by the **KSampler** into a pixel-space image
### F. Save Image Node
Previews the image decoded from latent space and saves it to the local `ComfyUI/output` folder
## About the SD1.5 Model
**SD1.5 (Stable Diffusion 1.5)** is an image generation model developed by [Stability AI](https://stability.ai/). It is the base version of the Stable Diffusion series, trained on **512×512** images, so it handles **512×512** generation well. It is about 4GB in size and runs smoothly on **consumer GPUs (e.g. 6GB VRAM)**. SD1.5 has a very rich ecosystem and supports a wide range of plugins (such as ControlNet and LoRA) and optimization tools.
As a milestone model in AI image generation, SD1.5 remains an excellent entry point thanks to its open-source nature, lightweight architecture, and rich ecosystem. Even though SDXL, SD3, and other upgrades have since been released, its price/performance on consumer hardware remains hard to beat.
### Basic Information
* **Release date**: October 2022
* **Core architecture**: based on the Latent Diffusion Model (LDM)
* **Training data**: LAION-Aesthetics v2.5 dataset (about 590 million training steps)
* **Open source**: fully open model, code, and training data
### Strengths and Limitations
Strengths:
* Lightweight: only about 4GB, runs smoothly on consumer GPUs
* Low barrier to entry: supports a wide range of plugins and optimization tools
* Mature ecosystem: supports a wide range of plugins and optimization tools
* Fast generation: runs smoothly on consumer GPUs
Limitations:
* Detail handling: hands and complex lighting are prone to distortion
* Resolution limits: quality degrades when generating 1024×1024 directly
* Prompt dependence: requires precise English descriptions to control the result
# ComfyUI Image Upscaling Workflow
Source: https://docs.comfy.org/zh-CN/tutorials/basic/upscale
This guide introduces the concept of image upscaling in AI image generation and completes an upscaling workflow in ComfyUI
## What Is Image Upscaling
Image upscaling converts a low-resolution image into a high-resolution one using an algorithm. Unlike traditional interpolation, AI upscaling models (such as ESRGAN) can intelligently reconstruct detail while preserving image quality.
For example, the SD1.5 model by default performs poorly when generating large images, so when we need high resolution we usually generate a small image first and then upscale it.
This article covers just one of the many upscaling methods in ComfyUI. In this walkthrough we will:
* Download and install an upscale model
* Perform a simple upscale with the model
* Combine upscaling with a text-to-image workflow
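For contrast with model-based upscaling, traditional interpolation only replicates or blends existing pixels. Nearest-neighbour scaling, the simplest case, can be written as:

```python
import numpy as np

def upscale_nearest(image: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upscaling: each pixel becomes a factor x factor
    block. Unlike an ESRGAN-style model, no new detail is invented."""
    return image.repeat(factor, axis=0).repeat(factor, axis=1)

img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)  # a tiny 2x2 RGB image
print(upscale_nearest(img, 4).shape)   # (8, 8, 3)
```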
## Download and Install an Upscale Model
You additionally need to download an upscale model such as ESRGAN (required):
Visit [OpenModelDB](https://openmodeldb.info/) to search for and download the upscale model you need (e.g. RealESRGAN)
As shown in the screenshot:
1. Use the filters to narrow the model list by your image type
2. The upscale factor is labeled in the top-right corner of each model; for example, the model in our screenshot upscales images 2x
In this tutorial we will use the [4x-ESRGAN](https://openmodeldb.info/models/4x-ESRGAN) model. Open its details page and click `Download` to download the model

Place the model file (.pth) in the `ComfyUI\models\upscale_models` directory
## Simple Upscaling Workflow
### 1. Workflow and Assets
Download the image below and drag it into ComfyUI to load the simple upscaling workflow
Download the small image below to use as the input
### 2. Workflow Walkthrough
1. In the `Load Upscale Model` node, select the upscale model we downloaded earlier
2. In the `Load Image` node, select the input image prepared earlier
3. Click the `Queue` button, or use the shortcut `Ctrl + Enter` to run the generation
These steps complete a basic upscale. The core of this workflow is the combination of `Load Upscale Model` and `Upscale Image(Using Model)`: they take an image as input and enlarge it with the upscale model.
## Upscaling Combined with Text-to-Image
Having finished the simple upscaling workflow, we can combine it with the [text-to-image](/zh-CN/tutorials/basic/text-to-image) workflow into a complete generate-then-upscale pipeline. For the text-to-image basics and the model requirements, see the [text-to-image](/zh-CN/tutorials/basic/text-to-image) guide.
Download and save the image below, then drag it into ComfyUI to load the combined workflow
As you can see, this workflow simply feeds the image produced by the text-to-image workflow into the upscaling stage to enlarge it.
## Additional Notes
Characteristics of different upscale models:
* **RealESRGAN**: general-purpose upscaling, suitable for most scenes
* **BSRGAN**: excels at text and sharp edges
* **SwinIR**: preserves natural textures, good for landscape photos
1. **Chained upscaling**: for very high magnification, chain multiple upscale nodes (e.g. 2x then 4x)
2. **Hybrid upscaling**: append an upscale node after a generation workflow for an integrated "generate + enhance" pipeline
3. **Comparative testing**: different models vary considerably on specific image types; test several models side by side
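The chained-upscaling tip above is just multiplication of the stage factors; a quick sketch (the helper function is ours, not a ComfyUI API):

```python
def chained_scale(size, factors):
    """Total magnification of chained upscale nodes is the product of
    the individual factors, e.g. 2x followed by 4x gives 8x overall."""
    width, height = size
    for f in factors:
        width, height = width * f, height * f
    return width, height

print(chained_scale((512, 512), [2, 4]))  # (4096, 4096)
```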
# ComfyUI ControlNet Usage Example
Source: https://docs.comfy.org/zh-CN/tutorials/controlnet/controlnet
This guide introduces basic ControlNet concepts and walks through the corresponding image generation in ComfyUI
Precisely controlling AI image generation is not an easy task; it usually takes many generation attempts to get a satisfying image. With the arrival of **ControlNet**, this problem was largely solved.
ControlNet is a conditional-control generation model built on diffusion models (such as Stable Diffusion), first proposed in 2023 by [Lvmin Zhang](https://lllyasviel.github.io/), Maneesh Agrawala, et al. in [Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543)
By introducing multimodal input conditions (edge maps, depth maps, pose keypoints, and more), ControlNet significantly improves the controllability and detail fidelity of image generation.
This lets us control image style, detail, character pose, composition, and more. These constraints make generation far more controllable, and multiple ControlNet models can be used together in one workflow for even better results.
Before ControlNet, each generation was a roll of the dice: we could only re-run the model until we happened to get an image we liked.
With ControlNet, we can steer generation with extra conditions; for example, a simple scribble can guide the model to produce a broadly similar image.
In this example, we will walk through installing and using a ControlNet model in [ComfyUI](https://github.com/comfyanonymous/ComfyUI), finishing with a scribble-controlled generation example.

Workflows for the other ControlNet V1.1 model types are the same as this example; just pick the corresponding model and upload a matching reference image.
## Notes on ControlNet Image Preprocessing
Different types of ControlNet models usually require different kinds of reference images:

> Image source: [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
Since the **Comfy Core** nodes do not include every type of **preprocessor**, the examples in this documentation provide already-processed images.
In practice you may need custom nodes to preprocess images to meet the requirements of different ControlNet models; some relevant plugins are listed below
* [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet)
* [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
## ComfyUI ControlNet Workflow Walkthrough
### 1. ControlNet Workflow Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Images whose metadata contains the workflow JSON can be dragged directly into ComfyUI, or loaded via the menu `Workflows` -> `Open(ctrl+o)`.
The image includes download links for the required models; dragging it into ComfyUI will automatically prompt you to download them.
Download the image below; we will use it as the input

### 2. Manual Model Installation
If the automatic model download fails on your network, download the models below manually and place them in the indicated directories
* [dreamCreationVirtual3DECommerce\_v10.safetensors](https://civitai.com/api/download/models/731340?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true)
* [control\_v11p\_sd15\_scribble\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_scribble_fp16.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── dreamCreationVirtual3DECommerce_v10.safetensors
│ ├── vae/
│ │ └── vae-ft-mse-840000-ema-pruned.safetensors
│ └── controlnet/
│ └── control_v11p_sd15_scribble_fp16.safetensors
```
In this example you could also use the VAE embedded in the dreamCreationVirtual3DECommerce\_v10.safetensors model; here we follow the model author's recommendation and use a standalone VAE.
### 3. Run the Workflow Step by Step
1. Make sure `Load Checkpoint` loads **dreamCreationVirtual3DECommerce\_v10.safetensors**
2. Make sure `Load VAE` loads **vae-ft-mse-840000-ema-pruned.safetensors**
3. In `Load Image`, click `Upload` and upload the input image provided earlier
4. Make sure `Load ControlNet` loads **control\_v11p\_sd15\_scribble\_fp16.safetensors**
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the generation
## Node Reference
### Load ControlNet Node
Models placed in `ComfyUI\models\controlnet` are detected by ComfyUI and can be identified and loaded in this node
### Apply ControlNet Node
This node takes the ControlNet model loaded by `Load ControlNet` and generates the corresponding control conditioning based on the input image.
**Inputs**
| Parameter | Purpose |
| --------------- | ----------------------------------------------------------------- |
| `positive` | Positive conditioning |
| `negative` | Negative conditioning |
| `control_net` | The ControlNet model to apply |
| `image` | The preprocessed reference image the ControlNet conditions on |
| `vae` | VAE model input |
| `strength` | How strongly the ControlNet is applied; higher values give it more influence over the generated image |
| `start_percent` | When the ControlNet starts to apply, as a fraction of the diffusion process; e.g. 0.2 means its guidance kicks in once diffusion is 20% complete |
| `end_percent` | When the ControlNet stops applying; e.g. 0.8 means its guidance stops once diffusion is 80% complete |
**Outputs**
| Parameter | Purpose |
| ---------- | ------------------------- |
| `positive` | Positive conditioning with the ControlNet applied |
| `negative` | Negative conditioning with the ControlNet applied |
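How `strength`, `start_percent`, and `end_percent` interact can be sketched as a simple gating function, assuming an on/off window scaled by the strength (illustrative pseudologic only, not ComfyUI's internal implementation):

```python
def controlnet_weight(progress, strength, start_percent, end_percent):
    """Return the effective ControlNet weight at a point in the
    diffusion process. `progress` runs from 0.0 (start) to 1.0 (end);
    guidance is active only inside [start_percent, end_percent]."""
    if start_percent <= progress <= end_percent:
        return strength
    return 0.0

# With start_percent=0.2 and end_percent=0.8: off early, on mid-run
print(controlnet_weight(0.1, 1.0, 0.2, 0.8))  # 0.0
print(controlnet_weight(0.5, 1.0, 0.2, 0.8))  # 1.0
```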
You can chain multiple ControlNet models together, as shown below; see the [Mixing ControlNet Models](/zh-CN/tutorials/controlnet/mixing-controlnets) guide to learn more about combining ControlNet models
You may see the `Apply ControlNet(Old)` node in some early workflows. It is the legacy ControlNet node, now deprecated and hidden from search and the node list by default
To re-enable it, go to **Settings** --> **comfy** --> **Node** and turn on the `Show deprecated nodes in search` option; using the new node is recommended
## Try It Yourself
1. Make a similar scribble image, or even draw one by hand, and generate images with the ControlNet model to experience what ControlNet can do
2. Adjust the `Control Strength` parameter of the Apply ControlNet node to control how much the ControlNet model influences the generated image
3. Visit the [ControlNet-v1-1\_fp16\_safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/tree/main) repository to download other types of ControlNet models and try generating images with them
# ComfyUI Depth ControlNet Usage Example
Source: https://docs.comfy.org/zh-CN/tutorials/controlnet/depth-controlnet
This guide introduces basic Depth ControlNet concepts and walks through the corresponding image generation in ComfyUI
## Depth Maps and Depth ControlNet
A depth map is a special kind of image that encodes the distance from each object in a scene to the observer or camera as a grayscale value. Brightness is inversely related to distance: brighter regions (toward white) are closer, darker regions (toward black) are farther away.

Depth ControlNet is a ControlNet model trained specifically to understand and use depth-map information. It helps the AI interpret spatial relationships correctly, so the generated image follows the spatial structure specified by the depth map, enabling precise control over three-dimensional layout.
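The brightness-distance relationship described above can be written down directly. Here is a linear-mapping sketch (the exact encoding varies between depth estimators, so treat this as an illustration):

```python
def depth_to_gray(distance, max_distance):
    """Encode distance as an 8-bit gray value: near -> bright (255),
    far -> dark (0), matching how depth maps are usually read."""
    d = min(max(distance, 0.0), max_distance)  # clamp into valid range
    return round(255 * (1 - d / max_distance))

print(depth_to_gray(0.0, 10.0))   # 255: the closest point is white
print(depth_to_gray(10.0, 10.0))  # 0: the farthest point is black
```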
### Use Cases for Depth Maps with ControlNet
Depth maps are useful in many scenarios:
1. **Portraits**: control the spatial relationship between subject and background, avoiding distortion of key areas such as the face
2. **Landscapes**: control the layering of foreground, midground, and background
3. **Architecture**: control the spatial structure and perspective of buildings
4. **Product shots**: control the separation and spatial placement of product and background
In this example, we will use a depth map to generate an architectural visualization scene.
## ComfyUI ControlNet Workflow Walkthrough
### 1. ControlNet Workflow Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Images whose metadata contains the workflow JSON can be dragged directly into ComfyUI, or loaded via the menu `Workflows` -> `Open(ctrl+o)`.
The image includes download links for the required models; dragging it into ComfyUI will automatically prompt you to download them.
Download the image below; we will use it as the input.

### 2. Model Installation
If the automatic model download fails on your network, download the models below manually and place them in the indicated directories
* [architecturerealmix\_v11.safetensors](https://civitai.com/api/download/models/431755?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [control\_v11f1p\_sd15\_depth\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11f1p_sd15_depth_fp16.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── architecturerealmix_v11.safetensors
│ └── controlnet/
│ └── control_v11f1p_sd15_depth_fp16.safetensors
```
### 3. Run the Workflow Step by Step
1. Make sure `Load Checkpoint` loads **architecturerealmix\_v11.safetensors**
2. Make sure `Load ControlNet` loads **control\_v11f1p\_sd15\_depth\_fp16.safetensors**
3. In `Load Image`, click `Upload` and upload the depth image provided earlier
4. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the generation
## Combining Depth Control with Other Techniques
Depending on your creative needs, a depth ControlNet can be mixed with other ControlNet types for better results:
1. **Depth + Lineart**: reinforces contours while preserving spatial relationships; good for architecture, products, and character design
2. **Depth + Pose**: controls character pose while maintaining correct spatial relationships; good for scenes with people
For mixing multiple ControlNets, see the [Mixing ControlNets](/zh-CN/tutorials/controlnet/mixing-controlnets) example.
# ComfyUI Depth T2I Adapter Usage Example
Source: https://docs.comfy.org/zh-CN/tutorials/controlnet/depth-t2i-adapter
This guide introduces the basics of the Depth T2I Adapter and walks through the corresponding image generation in ComfyUI
## About T2I Adapter
[T2I-Adapter](https://huggingface.co/TencentARC/T2I-Adapter) is a lightweight adapter developed by the [Tencent ARC Lab](https://github.com/TencentARC) to enhance the structure, color, and style control of text-to-image models such as Stable Diffusion.
It aligns external conditions (edge maps, depth maps, sketches, or color reference images) with the model's internal features to achieve precise control without modifying the base model's structure. With only about 77M parameters (roughly 300MB), it runs inference about 3× faster than [ControlNet](https://github.com/lllyasviel/ControlNet-v1-1-nightly) and supports combining multiple conditions (e.g. sketch + color grid). Applications include turning line art into images, color and style transfer, and multi-element scene generation.
### T2I Adapter vs. ControlNet
Although similar in function, the two differ noticeably in implementation and usage:
1. **Lightweight design**: T2I Adapter has fewer parameters and a smaller memory footprint
2. **Inference speed**: T2I Adapter is typically about 3× faster than ControlNet
3. **Control precision**: ControlNet is more precise in some scenarios, while T2I Adapter suits lightweight control
4. **Multi-condition combinations**: T2I Adapter's resource advantage grows when combining several conditions
### Main T2I Adapter Types
T2I Adapter comes in several types for controlling different aspects:
* **Depth**: controls the spatial structure and depth relationships of the image
* **Canny/Sketch**: controls the edges and lines of the image
* **Keypose**: controls character pose and movement
* **Seg**: controls scene layout via semantic segmentation
* **Color**: controls the overall color scheme of the image
In ComfyUI, using a T2I Adapter looks and works much like using [ControlNet](/zh-CN/tutorials/controlnet/controlnet). In this example, we will use a depth T2I Adapter to control an interior scene and demonstrate its usage.

## Why Use a Depth T2I Adapter
Depth maps have several important applications in image generation:
1. **Spatial layout control**: accurately describes 3D structure; good for interior design and architectural visualization
2. **Object placement**: controls the relative position and size of objects in a scene; good for product shots and scene building
3. **Perspective**: maintains plausible perspective and proportions; good for landscapes and cityscapes
4. **Lighting layout**: natural light and shadow distribution based on depth information, improving realism
We will use interior design as the example, but these techniques apply to other scenarios as well.
## ComfyUI Depth T2I Adapter Workflow Walkthrough
### 1. Depth T2I Adapter Workflow Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Images whose metadata contains the workflow JSON can be dragged directly into ComfyUI, or loaded via the menu `Workflows` -> `Open(ctrl+o)`.
The image includes download links for the required models; dragging it into ComfyUI will automatically prompt you to download them.
Download the image below; we will use it as the input

### 2. Model Installation
If the automatic model download fails on your network, download the models below manually and place them in the indicated directories
* [interiordesignsuperm\_v2.safetensors](https://civitai.com/api/download/models/93152?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [t2iadapter\_depth\_sd15v2.pth](https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_depth_sd15v2.pth?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── interiordesignsuperm_v2.safetensors
│ └── controlnet/
│ └── t2iadapter_depth_sd15v2.pth
```
### 3. Run the Workflow Step by Step
1. Make sure `Load Checkpoint` loads **interiordesignsuperm\_v2.safetensors**
2. Make sure `Load ControlNet` loads **t2iadapter\_depth\_sd15v2.pth**
3. In `Load Image`, click `Upload` and upload the input image provided earlier
4. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the generation
## General T2I Adapter Tips
### Optimizing Input Image Quality
Whatever the use case, a high-quality input image is key to using T2I Adapter successfully:
1. **Moderate contrast**: the control image (depth map, line art) should have clear contrast without being extreme
2. **Clean boundaries**: make sure major structures and element boundaries are clearly distinguishable in the control image
3. **Noise control**: avoid excessive noise in the control image, especially in depth maps and line art
4. **Sensible layout**: the control image should have a reasonable spatial layout and element distribution
## T2I Adapter Usage Patterns
A major advantage of T2I Adapter is that multiple conditions can easily be combined for complex control effects:
1. **Depth + edges**: controls spatial layout while keeping structural edges sharp; good for architecture and interior design
2. **Line art + color**: controls shapes while specifying the color scheme; good for character design and illustration
3. **Pose + segmentation**: controls character movement while defining scene regions; good for complex narrative scenes
Mixing T2I Adapters, or combining them with other control methods (ControlNet, regional prompts, etc.), further expands creative possibilities. To mix them, simply chain multiple `Apply ControlNet` nodes, exactly as in [Mixing ControlNets](/zh-CN/tutorials/controlnet/mixing-controlnets).
# ComfyUI Mixing ControlNets Example
Source: https://docs.comfy.org/zh-CN/tutorials/controlnet/mixing-controlnets
In this example we will mix multiple ControlNets and learn to use several ControlNet models together to control image generation
In AI image generation, a single control condition is often not enough for complex scenes. Mixing multiple ControlNets lets you control different regions or different aspects of an image at once, for more precise control over generation.
In some scenarios, mixing ControlNets exploits the characteristics of different control conditions for finer-grained control:
1. **Scene complexity**: complex scenes need several control conditions working together
2. **Fine-grained control**: adjusting each ControlNet's strength parameter lets you precisely tune how much each part is influenced
3. **Complementary effects**: different ControlNet types can complement each other and offset the limitations of a single control
4. **Creative expression**: combining different controls can produce unique creative effects
### How to Mix ControlNets
When mixing ControlNets, each one influences image generation in the region it applies to. ComfyUI chains `Apply ControlNet` nodes together, letting multiple ControlNet conditions be stacked and applied in sequence:
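The chaining described above can be sketched as data flowing through successive nodes. The data shapes here are illustrative only, not ComfyUI's real conditioning format; the point is that each `Apply ControlNet` takes the previous node's conditioning output and stacks one more control on top:

```python
def apply_controlnet(conditioning, model_name, strength):
    """Sketch of one Apply ControlNet node: extend the incoming
    conditioning with one more (model, strength) control entry."""
    return conditioning + [(model_name, strength)]

cond = []                                        # from CLIP Text Encode
cond = apply_controlnet(cond, "openpose", 1.0)   # first Apply ControlNet
cond = apply_controlnet(cond, "scribble", 1.0)   # second, chained onto the first
print(cond)  # [('openpose', 1.0), ('scribble', 1.0)]
```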
## Region-Based ControlNet Mixing Example
In this example we combine **Pose ControlNet** and **Scribble ControlNet** to generate a scene with multiple elements: a person on the left controlled by the Pose ControlNet, and a cat on a scooter on the right controlled by the Scribble ControlNet.
### 1. Mixed-ControlNet Workflow Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

The workflow image contains metadata and can be dragged directly into ComfyUI, or loaded via the menu `Workflows` -> `Open(ctrl+o)`. The required models will be detected automatically and you will be prompted to download them.
Pose input image (controls the person on the left):

Scribble input image (controls the cat and scooter on the right):

### 2. Manual Model Installation
If the automatic model download fails on your network, download the models below manually and place them in the indicated directories
* [awpainting\_v14.safetensors](https://civitai.com/api/download/models/624939?type=Model\&format=SafeTensor\&size=full\&fp=fp16)
* [control\_v11p\_sd15\_scribble\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_scribble_fp16.safetensors?download=true)
* [control\_v11p\_sd15\_openpose\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_openpose_fp16.safetensors?download=true)
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── awpainting_v14.safetensors
│ ├── controlnet/
│ │ └── control_v11p_sd15_scribble_fp16.safetensors
│ │ └── control_v11p_sd15_openpose_fp16.safetensors
│ ├── vae/
│ │ └── vae-ft-mse-840000-ema-pruned.safetensors
```
### 3. Run the Workflow Step by Step
Follow the numbered markers in the image:
1. Make sure `Load Checkpoint` loads **awpainting\_v14.safetensors**
2. Make sure `Load VAE` loads **vae-ft-mse-840000-ema-pruned.safetensors**
The first ControlNet group uses the Openpose model:
3\. Make sure `Load ControlNet Model` loads **control\_v11p\_sd15\_openpose\_fp16.safetensors**
4\. In `Load Image`, click `Upload` and upload the pose image provided earlier
The second ControlNet group uses the Scribble model:
5\. Make sure `Load ControlNet Model` loads **control\_v11p\_sd15\_scribble\_fp16.safetensors**
6\. In `Load Image`, click `Upload` and upload the scribble image provided earlier
7\. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the generation
## Workflow Notes
#### Balancing Strengths
When controlling different regions of the image, balancing the strength parameters is especially important:
* If one region's ControlNet strength is much higher than the other's, its control may dominate and suppress the other region
* It is recommended to set similar strength values for the different regions, e.g. 1.0 for both
#### Prompting Tips
In region-based mixing, the prompt must describe both regions at once:
```
"A woman in red dress, a cat riding a scooter, detailed background, high quality"
```
This prompt covers both the person and the cat on the scooter, ensuring the model attends to both controlled regions.
## Multi-Dimensional Control of a Single Subject
Besides the region-based mixing shown here, another common pattern is applying several controls to the same subject. For example:
* **Pose + Depth**: control the pose and the sense of depth
* **Pose + Canny**: control the pose and the edge detail
* **Pose + Reference**: control the pose while referencing a particular style
In this pattern, the reference images for the different ControlNets should be aligned on the same subject, with each strength adjusted for a proper balance.
By combining different ControlNet types and assigning their control regions, you can control the elements of a picture precisely.
# ComfyUI Pose ControlNet Usage Example
Source: https://docs.comfy.org/zh-CN/tutorials/controlnet/pose-controlnet-2-pass
This guide introduces the basics of Pose ControlNet and uses a two-pass image-to-image approach to generate large images in ComfyUI
## About OpenPose
[OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) is an open-source real-time multi-person pose estimation system developed by Carnegie Mellon University (CMU), a major breakthrough in computer vision. For multiple people in an image, it can simultaneously detect:
* **Body skeleton**: 18 keypoints, including head, shoulders, elbows, wrists, hips, knees, and ankles
* **Facial expression**: 70 facial keypoints capturing micro-expressions and face contours
* **Hand detail**: 21 hand keypoints for precise finger poses and gestures
* **Foot pose**: 6 foot keypoints recording stance and movement detail
In AI image generation, skeleton maps produced by OpenPose serve as ControlNet conditions to precisely control the pose, movement, and expression of generated characters. This lets us generate realistic figures in the intended poses, greatly improving the controllability and practical value of AI-generated content.
Especially with the early Stable Diffusion 1.5 series models, skeleton maps generated with OpenPose effectively prevent distorted poses, limbs, and expressions.
## ComfyUI 2-Pass Pose ControlNet Example
### 1. Pose ControlNet Workflow Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Images whose metadata contains the workflow JSON can be dragged directly into ComfyUI, or loaded via the menu `Workflows` -> `Open(ctrl+o)`.
The image includes download links for the required models; dragging it into ComfyUI will automatically prompt you to download them.
Download the image below; we will use it as the input

### 2. Manual Model Installation
If the automatic model download fails on your network, download the models below manually and place them in the indicated directories
* [control\_v11p\_sd15\_openpose\_fp16.safetensors](https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/resolve/main/control_v11p_sd15_openpose_fp16.safetensors?download=true)
* [majicmixRealistic\_v7.safetensors](https://civitai.com/api/download/models/176425?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16)
* [japaneseStyleRealistic\_v20.safetensors](https://civitai.com/api/download/models/85426?type=Model\&format=SafeTensor\&size=pruned\&fp=fp16)
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors?download=true)
```
ComfyUI/
├── models/
│ ├── checkpoints/
│ │ └── majicmixRealistic_v7.safetensors
│ │ └── japaneseStyleRealistic_v20.safetensors
│ ├── vae/
│ │ └── vae-ft-mse-840000-ema-pruned.safetensors
│ └── controlnet/
│ └── control_v11p_sd15_openpose_fp16.safetensors
```
### 3. Run the Workflow Step by Step
Follow the numbered markers in the image:
1. Make sure `Load Checkpoint` loads **majicmixRealistic\_v7.safetensors**
2. Make sure `Load VAE` loads **vae-ft-mse-840000-ema-pruned.safetensors**
3. Make sure `Load ControlNet Model` loads **control\_v11p\_sd15\_openpose\_fp16.safetensors**
4. In the `Load Image` node, click the select button and upload the pose input image provided earlier, or use your own OpenPose skeleton map
5. Make sure the second `Load Checkpoint` loads **japaneseStyleRealistic\_v20.safetensors**
6. Click the `Queue` button or use the shortcut `Ctrl(cmd) + Enter` to run the generation
## How the 2-Pass Workflow Works
This workflow uses a two-pass image-to-image approach, splitting generation into two stages:
### Stage 1: Base Pose Image
The first stage combines the **majicmixRealistic\_v7** model with Pose ControlNet to generate an initial posed figure:
1. `Load Checkpoint` loads the majicmixRealistic\_v7 model
2. `Load ControlNet Model` loads the pose control model
3. The input pose image is fed into the `Apply ControlNet` node and combined with the positive and negative prompt conditioning
4. The first `KSampler` node (typically 20-30 steps) generates the base posed image
5. `VAE Decode` converts the first-stage result into a pixel-space image
This stage focuses on getting the pose and basic structure right, ensuring the generated figure matches the input skeleton.
### Stage 2: Style and Detail Refinement
The second stage uses the first stage's output as a reference and refines it with the **japaneseStyleRealistic\_v20** model for stylization and detail enhancement:
1. The first stage's image is transferred into a higher-resolution latent created by the `Upscale Latent` node
2. The second `Load Checkpoint` loads the japaneseStyleRealistic\_v20 model, which specializes in detail and style
3. The second `KSampler` node refines with a lower `denoise` strength (typically 0.4-0.6), preserving the first stage's underlying structure
4. Finally, the second `VAE Decode` and `Save Image` nodes output a higher-quality, higher-resolution image
This stage focuses on stylistic consistency, richer detail, and overall image quality.
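The two stages above can be summarized as a small plan structure; a sketch with illustrative values (the real workflow encodes this in the node graph, and the names here are our own):

```python
def two_pass_plan(base_size, upscale_factor, refine_denoise=0.5):
    """Pass 1 builds structure at full denoise; pass 2 refines an
    upscaled latent at low denoise so pass 1's pose survives."""
    return [
        {"pass": 1, "size": base_size, "denoise": 1.0},
        {"pass": 2, "size": base_size * upscale_factor,
         "denoise": refine_denoise},
    ]

for stage in two_pass_plan(512, 2):
    print(stage)
```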
## Advantages of 2-Pass Generation
Compared with single-pass generation, the two-pass approach has the following advantages:
1. **Higher resolution**: the second pass produces resolutions beyond what a single pass can manage
2. **Style mixing**: combine the strengths of different models, e.g. a photorealistic model in pass 1 and a stylized model in pass 2
3. **Better detail**: pass 2 can focus on refining details without worrying about the overall structure
4. **Precise control**: with the pose fixed in pass 1, pass 2 can concentrate on polishing style and detail
5. **Lower GPU load**: splitting generation into two passes produces large, high-quality images within limited GPU resources
For more techniques for mixing multiple ControlNets, see the [Mixing ControlNet Models](/zh-CN/tutorials/controlnet/mixing-controlnets) tutorial.
# ComfyUI Flux.1 ControlNet Examples
Source: https://docs.comfy.org/zh-CN/tutorials/flux/flux-1-controlnet
This guide uses Flux.1 ControlNet to build ControlNet workflow examples.
## About the FLUX.1 ControlNet Models
FLUX.1 Canny and Depth are two powerful models from the [FLUX.1 Tools suite](https://blackforestlabs.ai/flux-1-tools/) released by [Black Forest Labs](https://blackforestlabs.ai/). The suite adds control and steering capabilities to FLUX.1, letting users modify and re-create real or generated images.
**FLUX.1-Depth-dev** and **FLUX.1-Canny-dev** are both 12B-parameter Rectified Flow Transformer models that generate images from text descriptions while staying consistent with an input image. The Depth version preserves the source image's spatial structure via depth-map extraction, while the Canny version preserves its structural features via edge detection, so you can choose the control method that suits your needs.
Both models share the following characteristics:
* Top-tier output quality and detail
* Excellent prompt adherence while preserving the source image's structural layout
* Trained with guidance distillation for greater efficiency
* Open weights for community research
* An API (pro versions) and open weights (dev versions)
Black Forest Labs also provides **FLUX.1-Depth-dev-lora** and **FLUX.1-Canny-dev-lora** adapter versions extracted from the full models. They can be applied to the FLUX.1 \[dev] base model to provide similar functionality at a much smaller file size, ideal for resource-constrained environments.
This guide uses the full **FLUX.1-Canny-dev** model and the **FLUX.1-Depth-dev-lora** adapter to build the Flux ControlNet workflow examples in ComfyUI.
Images whose metadata contains the workflow JSON can be dragged directly into ComfyUI, or loaded via the menu `Workflows` -> `Open(ctrl+o)`.
The images in this guide include download links for the required models; dragging them into ComfyUI will automatically prompt you to download them.
For image preprocessing you can use the custom nodes below; in this example we provide already-processed images as input.
* [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet)
* [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
## FLUX.1-Canny-dev Full Workflow
### 1. Workflow and Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Download the image below; we will use it as the input image

### 2. Manual Model Download
If you have previously used the [full Flux workflows](/zh-CN/tutorials/flux/flux-1-text-to-image), you only need to download the **flux1-canny-dev.safetensors** model file.
You must first accept the agreement of [black-forest-labs/FLUX.1-Canny-dev](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev), so visit the [black-forest-labs/FLUX.1-Canny-dev](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev) page and make sure you have agreed to the terms as shown below.
Full model list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-canny-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev/resolve/main/flux1-canny-dev.safetensors?download=true) (请确保你已经同意了对应 repo 的协议)
File locations:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ └── diffusion_models/
│ └── flux1-canny-dev.safetensors
```
### 3. Run the Workflow Step by Step
1. Make sure `Load VAE` loads `ae.safetensors`
2. Make sure `Load Diffusion Model` loads `flux1-canny-dev.safetensors`
3. Make sure the following models are loaded in `DualCLIPLoader`:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
4. Upload the input image provided in this guide in the `Load Image` node
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
### 4. Try It Yourself
Try building the Depth version of the workflow with the [FLUX.1-Depth-dev](https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev) model
You can use the image below as input

Or preprocess images with one of the custom nodes below:
* [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet)
* [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
## FLUX.1-Depth-dev-lora Workflow
The LoRA workflow builds on the full version by adding a LoRA model; compared with the [full Flux workflow](/zh-CN/tutorials/flux/flux-1-text-to-image), it adds the nodes that load and apply the corresponding LoRA model.
### 1. Workflow and Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Download the image below; we will use it as the input image

### 2. Manual Model Download
If you have previously used the [full Flux workflows](/zh-CN/tutorials/flux/flux-1-text-to-image), you only need to download the **flux1-depth-dev-lora.safetensors** model file.
Full model list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors?download=true)
* [flux1-depth-dev-lora.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev-lora/resolve/main/flux1-depth-dev-lora.safetensors?download=true)
File locations:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ ├── diffusion_models/
│ │ └── flux1-dev.safetensors
│ └── loras/
│ └── flux1-depth-dev-lora.safetensors
```
### 3. Run the Workflow Step by Step
1. Make sure `Load Diffusion Model` loads `flux1-dev.safetensors`
2. Make sure `LoraLoaderModelOnly` loads `flux1-depth-dev-lora.safetensors`
3. Make sure the following models are loaded in `DualCLIPLoader`:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
4. Upload the input image provided in this guide in the `Load Image` node
5. Make sure `Load VAE` loads `ae.safetensors`
6. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
### 4. Try It Yourself
Try building the Canny version of the workflow with the [FLUX.1-Canny-dev-lora](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev-lora) model
Use [ComfyUI-Advanced-ControlNet](https://github.com/Kosinkadink/ComfyUI-Advanced-ControlNet) or [ComfyUI ControlNet aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for image preprocessing
## Community Flux ControlNets
XLab and InstantX + Shakker Labs have released ControlNets for Flux.
**InstantX:**
* [FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny/blob/main/diffusion_pytorch_model.safetensors)
* [FLUX.1-dev-ControlNet-Depth](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Depth/blob/main/diffusion_pytorch_model.safetensors)
* [FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro/blob/main/diffusion_pytorch_model.safetensors)
**XLab**: [flux-controlnet-collections](https://huggingface.co/XLabs-AI/flux-controlnet-collections)
Place these files in the `ComfyUI/models/controlnet` directory.
You can get the workflow image from the [Flux ControlNet example](https://raw.githubusercontent.com/comfyanonymous/ComfyUI_examples/refs/heads/master/flux/flux_controlnet_example.png) and use [this image](https://raw.githubusercontent.com/comfyanonymous/ComfyUI_examples/refs/heads/master/flux/girl_in_field.png) as the input image.
# ComfyUI Flux.1 fill dev Example
Source: https://docs.comfy.org/zh-CN/tutorials/flux/flux-1-fill-dev
This guide uses Flux.1 fill dev to build inpainting and outpainting workflow examples.
## About the Flux.1 fill dev Model
Flux.1 fill dev is one of the core tools in the [FLUX.1 Tools suite](https://blackforestlabs.ai/flux-1-tools/) from [Black Forest Labs](https://blackforestlabs.ai/), designed specifically for image repair and extension.
Key features of Flux.1 fill dev:
* Powerful inpainting and outpainting capabilities, with results second only to the commercial FLUX.1 Fill \[pro].
* Excellent prompt understanding and adherence, capturing user intent precisely while staying highly consistent with the original image.
* Advanced guidance-distillation training, making the model more efficient while maintaining high-quality output.
* Friendly license terms: generated outputs can be used for personal, scientific, and commercial purposes; see the [FLUX.1 \[dev\] Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md) for details.
Model repository: [FLUX.1 \[dev\]](https://huggingface.co/black-forest-labs/FLUX.1-dev)
This guide builds the inpainting and outpainting workflows on the Flux.1 fill dev model.
If you are unfamiliar with inpainting or outpainting workflows, see the relevant sections of the [ComfyUI Inpainting Example](/zh-CN/tutorials/basic/inpaint) and the [ComfyUI Outpainting Example](/zh-CN/tutorials/basic/outpaint).
## Installing the Flux.1 Fill dev Workflow Models
Before starting, let's install the Flux.1 Fill dev model files. The inpainting and outpainting workflows use exactly the same models, and if you have previously used the full [Flux.1 text-to-image workflow](/zh-CN/tutorials/flux/flux-1-text-to-image), you only need to download the **flux1-fill-dev.safetensors** model file.
Downloading the model requires agreeing to its usage terms, so visit the [black-forest-labs/FLUX.1-Fill-dev](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev) page and make sure you have agreed to the terms as shown below.
Full model list:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-fill-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev/resolve/main/flux1-fill-dev.safetensors?download=true)
File locations:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── t5xxl_fp16.safetensors
│ ├── vae/
│ │ └── ae.safetensors
│ └── diffusion_models/
│ └── flux1-fill-dev.safetensors
```
## Flux.1 Fill dev Inpainting Workflow
### 1. Inpainting Workflow and Assets
Download the image below and drag it into ComfyUI to load the corresponding workflow

Download the image below; we will use it as the input image

The image already contains an alpha channel, so you do not need to draw a mask yourself. If you would like to draw your own mask, [click here](https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/flux/inpaint/flux_fill_inpaint_input_original.png) for the version without a mask, and see the MaskEditor section of the [ComfyUI Inpainting Example](/zh-CN/tutorials/basic/inpaint) to learn how to draw a mask in the `Load Image` node.
### 2. Check the Workflow Against the Numbered Steps
1. Make sure the `Load Diffusion Model` node loads `flux1-fill-dev.safetensors`
2. Make sure the following models are loaded in the `DualCLIPLoader` node:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
3. Make sure the `Load VAE` node loads `ae.safetensors`
4. Upload the input image provided in this guide in the `Load Image` node; if you are using the unmasked version, remember to draw the mask with the mask editor
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Flux.1 Fill dev Outpainting Workflow
### 1. Outpainting Workflow
Download the image below and drag it into ComfyUI to load the corresponding workflow

Download the image below; we will use it as the input image

### 2. Check the Workflow Against the Numbered Steps
1. Make sure the `Load Diffusion Model` node loads `flux1-fill-dev.safetensors`
2. Make sure the following models are loaded in the `DualCLIPLoader` node:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
3. Make sure the `Load VAE` node loads `ae.safetensors`
4. Upload the input image provided in this guide in the `Load Image` node
5. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
# ComfyUI Flux Kontext Dev Native Workflow Example
Source: https://docs.comfy.org/zh-CN/tutorials/flux/flux-1-kontext-dev
ComfyUI Flux Kontext Dev native workflow example.
## About FLUX.1 Kontext Dev
FLUX.1 Kontext is a breakthrough multimodal image-editing model from Black Forest Labs. It accepts text and images as simultaneous input, intelligently understands the image context, and performs precise edits. The dev release is an open-source diffusion transformer with 12 billion parameters, featuring excellent context understanding and character consistency: key elements such as facial features and composition stay stable even across many rounds of iterative editing.
It shares the same core capabilities as the FLUX.1 Kontext suite:
Character consistency: preserves distinctive elements of an image, such as a reference character or object, across multiple scenes and environments.
Local editing: makes targeted modifications to specific elements of an image without affecting the rest.
Style reference: generates novel scenes in the distinctive style of a reference image, guided by a text prompt.
Interactive speed: minimal latency for both image generation and editing.
While the previously released API version offers the highest fidelity and speed, FLUX.1 Kontext \[dev] runs entirely on your local machine, giving developers, researchers, and power users who want to experiment unmatched flexibility.
### Versions
* **FLUX.1 Kontext \[pro]** - commercial version, focused on fast iterative editing
* **FLUX.1 Kontext \[max]** - experimental version with stronger prompt adherence
* **FLUX.1 Kontext \[dev]** - open-source version (used in this tutorial), 12B parameters, intended primarily for research
All of these versions are currently usable in ComfyUI: the [Pro and Max versions](/zh-CN/tutorials/api-nodes/black-forest-labs/flux-1-kontext) can be called through API nodes, while the open-source Dev version is covered in this guide.
## Workflow Notes
This tutorial covers two workflows that are essentially the same:
* One uses the **FLUX.1 Kontext Image Edit** group node, which keeps the interface simple and makes the workflow easy to reuse
* The other does not use the group node and is the complete original workflow.
The main advantage of the group node is that the workflow stays concise and the node group can be reused to build complex workflows quickly. The new frontend also adds a quick way to insert the group node for Flux.1 Kontext Dev:
This feature is currently experimental and may change in future releases.
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when a workflow loads, possible causes are:
1. You are not on the latest nightly (development) version.
2. You are on the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## Model Downloads
To run the workflows in this guide smoothly, first download the model files below. You can also get the download links directly from the corresponding workflows, which include the model download information.
**Diffusion Model**
* [flux1-dev-kontext\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/flux1-kontext-dev_ComfyUI/resolve/main/split_files/diffusion_models/flux1-dev-kontext_fp8_scaled.safetensors)
If you want to use the original weights, you can get them from Black Forest Labs' repositories.
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/blob/main/split_files/vae/ae.safetensors)
**Text Encoder**
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors) or [t5xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn_scaled.safetensors)
Model locations
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── flux1-dev-kontext_fp8_scaled.safetensors
│ ├── 📂 vae/
│ │ └── ae.safetensors
│ └── 📂 text_encoders/
│ ├── clip_l.safetensors
│ └── t5xxl_fp16.safetensors or t5xxl_fp8_e4m3fn_scaled.safetensors
```
## Flux.1 Kontext Dev Workflow
This is a standard workflow, except that it uses the `Load Image(from output)` node to load the image to edit; this makes it easy to pick up the edited image for further rounds of editing
### 1. Download the Workflow and Input Image
Download the file below and drag it into ComfyUI to load the corresponding workflow

**Input image**

### 2. Run the Workflow Step by Step
Follow the numbered markers in the image to run the workflow:
1. Load the `flux1-dev-kontext_fp8_scaled.safetensors` model in the `Load Diffusion Model` node
2. In the `DualCLIP Load` node, make sure `clip_l.safetensors` and either `t5xxl_fp16.safetensors` or `t5xxl_fp8_e4m3fn_scaled.safetensors` are loaded
3. Make sure the `Load VAE` node loads the `ae.safetensors` model
4. Load the provided input image in the `Load Image(from output)` node
5. Edit the prompt in the `CLIP Text Encode` node; only English is supported
6. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Flux Kontext Prompting Tips
### 1. Basic Edits
* Simple and direct: `"Change the car color to red"`
* Preserve the style: `"Change to daytime while maintaining the same style of the painting"`
### 2. Style Transfer
**Principles:**
* Name the style explicitly: `"Transform to Bauhaus art style"`
* Describe its features: `"Transform to oil painting with visible brushstrokes, thick paint texture"`
* Preserve the composition: `"Change to Bauhaus style while maintaining the original composition"`
### 3. Character Consistency
**Framework:**
* Describe specifically: `"The woman with short black hair"` rather than `"she"`
* Preserve features: `"while maintaining the same facial features, hairstyle, and expression"`
* Edit step by step: change the background first, then the action
### 4. Text Editing
* Use quotation marks: `"Replace 'joy' with 'BFL'"`
* Keep the formatting: `"Replace text while maintaining the same font style"`
## Troubleshooting
### The Character Changes Too Much
❌ Wrong: `"Transform the person into a Viking"`
✅ Right: `"Change the clothes to be a viking warrior while preserving facial features"`
### The Composition Shifts
❌ Wrong: `"Put him on a beach"`
✅ Right: `"Change the background to a beach while keeping the person in the exact same position, scale, and pose"`
### The Style Is Applied Inaccurately
❌ Wrong: `"Make it a sketch"`
✅ Right: `"Convert to pencil sketch with natural graphite lines, cross-hatching, and visible paper texture"`
## Core Principles
1. **Be specific** - use precise descriptions and avoid vague wording
2. **Edit in steps** - split complex modifications into several simple ones
3. **State what to keep** - say explicitly what must stay unchanged
4. **Choose verbs carefully** - prefer "change" or "replace" over "transform"
## Best-Practice Templates
**Object modification:**
`"Change [object] to [new state], keep [content to preserve] unchanged"`
**Style transfer:**
`"Transform to [specific style], while maintaining [composition/character/other] unchanged"`
**Background replacement:**
`"Change the background to [new background], keep the subject in the exact same position and pose"`
**Text editing:**
`"Replace '[original text]' with '[new text]', maintain the same font style"`
> **Remember:** the more specific, the better; Kontext excels at following detailed instructions and maintaining consistency.
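Since the templates above are plain string recipes, they are easy to script. Here is a small helper for the background-replacement template (the function name is our own; Kontext simply receives the final English string):

```python
def background_swap_prompt(new_background):
    """Build a background-replacement prompt that explicitly states
    what must stay unchanged, following the template above."""
    return (f"Change the background to {new_background}, keep the subject "
            "in the exact same position and pose")

print(background_swap_prompt("a beach"))
# Change the background to a beach, keep the subject in the exact same position and pose
```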
# ComfyUI Flux 文生图工作示例
Source: https://docs.comfy.org/zh-CN/tutorials/flux/flux-1-text-to-image
本文将简要介绍 Flux 绘图模型,并指导使用 Flux 模型进行文生图的示例包括原始完整版本和 FP8 Checkpoint 版本。
Flux 是目前最大的开源AI绘画模型之一,拥有 12B 参数,原始文件大小约为23GB。它由 [Black Forest Labs](https://blackforestlabs.ai/) 开发,该团队由前 Stable Diffusion 团队成员创立。
Flux 以其卓越的画面质量和灵活性而闻名,能够生成高质量、多样化的图像。
目前 Flux.1 模型主要有以下几个版本:
* **Flux.1 Pro:** 效果最佳模型,闭源模型,仅支持通过 API 调用。
* **[Flux.1 \[dev\]:](https://huggingface.co/black-forest-labs/FLUX.1-dev)** 开源但仅限非商业使用,从 Pro 版本蒸馏而来,效果接近Pro版。
* \*\*[Flux.1 \[schnell\]:](https://huggingface.co/black-forest-labs/FLUX.1-schnell)\*\*采用 Apache2.0 许可,仅需4步即可生成图像,适合低配置硬件。
**Flux.1 模型特点**
* **混合架构:** 结合了 Transformer 网络和扩散模型的优势,有效整合文本与图像信息,提升生成图像与提示词的对齐精度,对复杂的提示词依旧有非常好的还原能力。
* **参数规模:** Flux 拥有 12B 参数,可捕捉更复杂的模式关系,生成更逼真、多样化的图像。
* **支持多种风格:** 支持多样化的风格,对各种类型的图像都有非常好的表现能力。
在本篇示例中,我们将介绍使用 Flux.1 Dev 和 Flux.1 Schnell 两个版本进行文生图的示例,包括原始完整版模型和 FP8 Checkpoint 简化版本。
* **Flux 完整版本:** 效果最佳,但需要较大的显存资源(推荐16GB以上),需要安装多个模型文件。
* **Flux FP8 Checkpoint:** 仅需一个 fp8 版本的模型,但是质量相对完整版会有所降低。
本篇示例中的所有工作流图片的 Metadata 中已包含对应模型下载信息,使用以下方式来加载工作流:
* 直接拖入 ComfyUI
* 或使用菜单 `Workflows` -> `Open(ctrl+o)`
如果你使用的不是 Desktop 版本或者部分模型无法顺利下载,请参考手动安装部分保存模型文件到对应的文件夹。
请在开始之前确保你的 ComfyUI 已更新到最新版本。
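工作流图片之所以能被直接拖入加载,是因为 ComfyUI 在导出 PNG 时会把工作流 JSON 写入图片的 `tEXt` 元数据块(键名为 `workflow`)。下面是一个仅用标准库读取该元数据的最小示意(细节为假设,以实际文件为准):

```python
import json
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def extract_workflow(png_bytes):
    """从 PNG 字节中查找键为 'workflow' 的 tEXt 块并解析为 JSON。"""
    assert png_bytes[:8] == PNG_SIG, "不是 PNG 文件"
    pos = 8
    while pos + 8 <= len(png_bytes):
        # 每个块: 4 字节长度 + 4 字节类型 + 数据 + 4 字节 CRC
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key == b"workflow":
                return json.loads(value.decode("utf-8"))
        pos += 8 + length + 4
    return None
```

用它可以在拖入 ComfyUI 之前先检查一张图片里是否嵌入了工作流。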
## Flux.1 原始版本模型文生图示例
请注意如果你无法下载 [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) 中的模型,请确保你已登录 Huggingface 并同意了对应 Repo 的协议。
### Flux.1 Dev 完整版本工作流
#### 1. 工作流文件
请下载下面的图片,并拖入 ComfyUI 中加载工作流。

#### 2. 手动安装模型
* `flux1-dev.safetensors` 文件需要同意 [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) 的协议后才能使用浏览器进行下载。
* 如果你的显存较低,可以尝试使用 [t5xxl\_fp8\_e4m3fn.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors?download=true) 来替换 `t5xxl_fp16.safetensors` 文件。
请下载下面的模型文件:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true) 当你的显存大于 32GB 时推荐使用。
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors)
文件保存位置:
```
ComfyUI/
├── models/
│   ├── text_encoders/
│   │   ├── clip_l.safetensors
│   │   └── t5xxl_fp16.safetensors
│   ├── vae/
│   │   └── ae.safetensors
│   └── diffusion_models/
│       └── flux1-dev.safetensors
```
#### 3. 按步骤检查确保工作流可以正常运行
请参照下面的图片,确保各个模型文件都已经加载完成
1. 确保在`DualCLIPLoader`节点中下面的模型已加载:
* clip\_name1: t5xxl\_fp16.safetensors
* clip\_name2: clip\_l.safetensors
2. 确保在`Load Diffusion Model`节点加载了`flux1-dev.safetensors`
3. 确保在`Load VAE`节点中加载了`ae.safetensors`
4. 点击 `Queue` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来运行工作流
得益于 Flux 良好的提示词遵循能力,我们并不需要任何的负向提示词
### Flux.1 Schnell 完整版本工作流
#### 1. 工作流文件
请下载下面的图片,并拖入 ComfyUI 中加载工作流。

#### 2. 手动安装模型
在这个工作流中,只有两个模型文件与 Flux1 Dev 版本的工作流不同,对于 t5xxl 你仍可使用 fp16 版本来获得更好的效果。
* **t5xxl\_fp16.safetensors** -> **t5xxl\_fp8\_e4m3fn.safetensors**
* **flux1-dev.safetensors** -> **flux1-schnell.safetensors**
完整模型文件列表:
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp8\_e4m3fn.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors?download=true)
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
* [flux1-schnell.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors)
文件保存位置:
```
ComfyUI/
├── models/
│   ├── text_encoders/
│   │   ├── clip_l.safetensors
│   │   └── t5xxl_fp8_e4m3fn.safetensors
│   ├── vae/
│   │   └── ae.safetensors
│   └── diffusion_models/
│       └── flux1-schnell.safetensors
```
#### 3. 按步骤检查确保工作流可以正常运行
1. 确保在`DualCLIPLoader`节点中下面的模型已加载:
* clip\_name1: t5xxl\_fp8\_e4m3fn.safetensors
* clip\_name2: clip\_l.safetensors
2. 确保在`Load Diffusion Model`节点加载了`flux1-schnell.safetensors`
3. 确保在`Load VAE`节点中加载了`ae.safetensors`
4. 点击 `Queue` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来运行工作流
## Fp8 Checkpoint 版文生图示例
fp8 版本是对 flux1 原版 fp16 版本的量化版本,在一定程度上这个版本的质量会低于 fp16 版本,但同时它需要的显存也会更少,而且你仅需要安装一个模型文件即可尝试运行。
### Flux.1 Dev fp8 Checkpoint 版工作流
请下载下面的图片,并拖入 ComfyUI 中加载工作流。

请下载 [flux1-dev-fp8.safetensors](https://huggingface.co/Comfy-Org/flux1-dev/resolve/main/flux1-dev-fp8.safetensors?download=true) 并保存至 `ComfyUI/models/checkpoints/` 目录下。
确保对应的 `Load Checkpoint` 节点加载了 `flux1-dev-fp8.safetensors`,即可测试运行。
### Flux.1 Schnell fp8 Checkpoint 版工作流
请下载下面的图片,并拖入 ComfyUI 中加载工作流。

请下载 [flux1-schnell-fp8.safetensors](https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell-fp8.safetensors?download=true) 并保存至 `ComfyUI/models/checkpoints/` 目录下。
确保对应的 `Load Checkpoint` 节点加载了 `flux1-schnell-fp8.safetensors`,即可测试运行。
# Flux.1 Krea Dev ComfyUI 工作流教程
Source: https://docs.comfy.org/zh-CN/tutorials/flux/flux1-krea-dev
Black Forest Labs 与 Krea 合作开发的最佳开源 FLUX 模型,专注于独特美学风格和自然细节,避免 AI 感,提供卓越的真实感和图像质量。
[Flux.1 Krea Dev](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev) 是由 Black Forest Labs (BFL) 与 Krea 合作开发的先进文本生成图像模型。这是目前最好的开源权重 FLUX 模型,专为文本到图像生成而设计。
**模型特点**
* **独特美学风格**: 专注于生成具有独特美学的图像,避免常见的"AI感"外观
* **自然细节**: 不会产生过曝的高光,保持自然的细节表现
* **卓越的真实感**: 提供出色的真实感和图像质量
* **完全兼容架构**: 与 FLUX.1 \[dev] 完全兼容的架构设计
**模型许可**
该模型采用 [flux-1-dev-non-commercial-license](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/blob/main/LICENSE.md) 许可发布
## Flux.1 Krea Dev ComfyUI 工作流
请确保你的 ComfyUI 已经更新。
* [ComfyUI 下载](https://www.comfy.org/download)
* [ComfyUI 更新教程](/zh-CN/installation/update_comfyui)
本指南里的工作流可以在 ComfyUI 的[工作流模板](/zh-CN/interface/features/template)中找到。如果找不到,可能是 ComfyUI 没有更新。
如果加载工作流时有节点缺失,可能原因有:
1. 你用的不是最新开发版(nightly)。
2. 你用的是稳定版或桌面版(没有包含最新的更新)。
3. 启动时有些节点导入失败。
#### 1. 工作流文件
下载下面的图片或JSON,并拖入 ComfyUI 以加载对应工作流

下载 JSON 格式工作流
#### 2. 模型链接
**Diffusion model** 下面两个模型选择其中一个版本即可
* [flux1-krea-dev\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/FLUX.1-Krea-dev_ComfyUI/blob/main/split_files/diffusion_models/flux1-krea-dev_fp8_scaled.safetensors)
下面这个版本是原始权重,如果你追求更高质量且有足够的显存,可以尝试这个版本
* [flux1-krea-dev.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/resolve/main/flux1-krea-dev.safetensors)
`flux1-krea-dev.safetensors` 文件需要同意 [black-forest-labs/FLUX.1-Krea-dev](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev/) 的协议后才能使用浏览器进行下载。
如果你使用过 Flux 相关的工作流,下面的模型是相同的,不需要重复下载
**Text encoders**
* [clip\_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true)
* [t5xxl\_fp16.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true) 当你的显存大于 32GB 时推荐使用。
* [t5xxl\_fp8\_e4m3fn.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors) 适合低显存用户
**VAE**
* [ae.safetensors](https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors?download=true)
文件保存位置:
```
ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── flux1-krea-dev_fp8_scaled.safetensors 或 flux1-krea-dev.safetensors
│   ├── text_encoders/
│   │   ├── clip_l.safetensors
│   │   └── t5xxl_fp16.safetensors 或 t5xxl_fp8_e4m3fn.safetensors
│   └── vae/
│       └── ae.safetensors
```
#### 3. 按步骤检查确保工作流可以正常运行
对于低显存用户, 这个模型可能无法在你的设备上顺利运行,你可以等待社区提供 FP8 或 GGUF 版本
请参照下面的图片,确保各个模型文件都已经加载完成
1. 确保在`Load Diffusion Model`节点加载了`flux1-krea-dev_fp8_scaled.safetensors` 或 `flux1-krea-dev.safetensors`
* `flux1-krea-dev_fp8_scaled.safetensors` 推荐低显存用户使用
* `flux1-krea-dev.safetensors` 如果你有足够的显存如 24GB, 你可以尝试这个版本以追求更好的质量
2. 确保在`DualCLIPLoader`节点中下面的模型已加载:
* clip\_name1: t5xxl\_fp16.safetensors 或 t5xxl\_fp8\_e4m3fn.safetensors
* clip\_name2: clip\_l.safetensors
3. 确保在`Load VAE`节点中加载了`ae.safetensors`
4. 点击 `Queue` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来运行工作流
# Cosmos Predict2 文生图 ComfyUI 官方示例
Source: https://docs.comfy.org/zh-CN/tutorials/image/cosmos/cosmos-predict2-t2i
本文介绍了如何在 ComfyUI 中完成 Cosmos-Predict2 文生图的工作流
Cosmos-Predict2 是由 NVIDIA 推出的新一代物理世界基础模型,专为物理 AI 场景下的高质量视觉生成与预测任务设计。
该模型具备极高的物理准确性、环境交互性和细节还原能力,能够真实模拟复杂的物理现象与动态场景。
Cosmos-Predict2 支持文本到图像(Text2Image)和视频到世界(Video2World)等多种生成方式,广泛应用于工业仿真、自动驾驶、城市规划、科学研究等领域,是推动智能视觉与物理世界深度融合的重要基础工具。
GitHub:[Cosmos-predict2](https://github.com/nvidia-cosmos/cosmos-predict2)
huggingface: [Cosmos-Predict2](https://huggingface.co/collections/nvidia/cosmos-predict2-68028efc052239369a0f2959)
本篇指南将引导你在 ComfyUI 中完成 **文生图** 工作流程。
视频生成部分请参考"使用 Cosmos-Predict2 进行视频生成"教程。
请确保你的 ComfyUI 已经更新。
* [ComfyUI 下载](https://www.comfy.org/download)
* [ComfyUI 更新教程](/zh-CN/installation/update_comfyui)
本指南里的工作流可以在 ComfyUI 的[工作流模板](/zh-CN/interface/features/template)中找到。如果找不到,可能是 ComfyUI 没有更新。
如果加载工作流时有节点缺失,可能原因有:
1. 你用的不是最新开发版(nightly)。
2. 你用的是稳定版或桌面版(没有包含最新的更新)。
3. 启动时有些节点导入失败。
{/*
## Cosmos Predict2 Video2World 工作流
对于 2B 版本,在实际运行时,需要 10GB 的显存
### 1.下载工作流文件

### 2.手动模型安装
**Diffusion model**
- [cosmos_predict2_2B_t2i.safetensors](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged/resolve/main/cosmos_predict2_2B_t2i.safetensors)
其它权重请访问 [Cosmos_Predict2_repackaged](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged) 进行下载
**Text encoder**
[oldt5_xxl_fp8_e4m3fn_scaled.safetensors](https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/resolve/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors)
**VAE**
[wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
文件保存位置
```
📂 ComfyUI/
├──📂 models/
│ ├── 📂 diffusion_models/
│ │ └─── cosmos_predict2_2B_t2i.safetensors
│ ├── 📂 text_encoders/
│ │ └─── oldt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └── 📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. 按步骤完成工作流运行

请参照图片序号进行逐步确认,来保证对应工作流的顺利运行
1. 确保 `Load Diffusion Model` 节点加载了 `cosmos_predict2_2B_t2i.safetensors`
2. 确保 `Load CLIP` 节点加载了 `oldt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. 确保 `Load VAE` 节点加载了 `wan_2.1_vae.safetensors`
4. 在 `EmptySD3LatentImage` 设置图片的尺寸
5. 在 `ClipTextEncode` 节点中修改提示词
6. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行文生图
7. 生成完成后对应的图片会自动保存到 `ComfyUI/output/` 目录下,你也可以在 `save image` 节点中预览或者调整保存位置 */}
# ComfyUI 原生版本 HiDream-E1, E1.1 工作流示例
Source: https://docs.comfy.org/zh-CN/tutorials/image/hidream/hidream-e1
本篇将引导了解并完成 ComfyUI 原生版本 HiDream-E1 及 E1.1 图像编辑工作流实例

HiDream-E1 是智象未来(HiDream-ai) 正式开源的交互式图像编辑大模型,是基于 HiDream-I1 构建的图像编辑模型。
可以使用自然语言来实现对图像的编辑,该模型基于 [MIT 许可证](https://github.com/HiDream-ai/HiDream-E1?tab=MIT-1-ov-file) 发布,支持用于个人项目、科学研究以及商用。
通过与此前发布的 [hidream-i1](/zh-CN/tutorials/image/hidream/hidream-i1) 组合使用,实现了**从图像生成到编辑**的创作能力。
| 名称 | 更新时间 | 推理步数 | 分辨率 | HuggingFace 仓库 |
| --------------- | --------- | ---- | --------- | ----------------------------------------------------------------------- |
| HiDream-E1-Full | 2025-4-28 | 28 | 768x768 | 🤗 [HiDream-E1-Full](https://huggingface.co/HiDream-ai/HiDream-E1-Full) |
| HiDream-E1.1 | 2025-7-16 | 28 | 动态(1百万像素) | 🤗 [HiDream-E1.1](https://huggingface.co/HiDream-ai/HiDream-E1-1) |
[HiDream E1 - Github](https://github.com/HiDream-ai/HiDream-E1)
请确保你的 ComfyUI 已经更新。
* [ComfyUI 下载](https://www.comfy.org/download)
* [ComfyUI 更新教程](/zh-CN/installation/update_comfyui)
本指南里的工作流可以在 ComfyUI 的[工作流模板](/zh-CN/interface/features/template)中找到。如果找不到,可能是 ComfyUI 没有更新。
如果加载工作流时有节点缺失,可能原因有:
1. 你用的不是最新开发版(nightly)。
2. 你用的是稳定版或桌面版(没有包含最新的更新)。
3. 启动时有些节点导入失败。
## HiDream E1 及 E1.1 工作流相关模型
本篇指南涉及的所有模型你都可以在[这里](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/tree/main/split_files)找到,E1、E1.1 除了 Diffusion model 之外都使用相同的模型
我们在对应的工作流文件中也已包含了对应的模型信息,你可以选择手动下载模型保存,或者在加载工作流后按工作流提示进行下载,推荐使用 E1.1
这个模型的运行对显存占用要求极高,具体显存占用请参考对应部分的说明
**Diffusion Model**
你不用同时下载这两个模型,由于 E1.1 是基于 E1 的迭代版本,在实际测试中它的质量和效果较 E1 都有较大提升
* [hidream\_e1\_1\_bf16.safetensors(推荐)](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_e1_1_bf16.safetensors) 34.2GB
* [hidream\_e1\_full\_bf16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_e1_full_bf16.safetensors) 34.2GB
**Text Encoder**:
* [clip\_l\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_l_hidream.safetensors) 236.12MB
* [clip\_g\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_g_hidream.safetensors) 1.29GB
* [t5xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/t5xxl_fp8_e4m3fn_scaled.safetensors) 4.8GB
* [llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/llama_3.1_8b_instruct_fp8_scaled.safetensors) 8.46GB
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/vae/ae.safetensors) 319.77MB
> 这个是 Flux 的 VAE 模型,如果你之前使用过 Flux 的工作流,你可能已经下载了这个文件。
文件保存位置
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │   ├── clip_l_hidream.safetensors
│   │   ├── clip_g_hidream.safetensors
│   │   ├── t5xxl_fp8_e4m3fn_scaled.safetensors
│   │   └── llama_3.1_8b_instruct_fp8_scaled.safetensors
│   ├── 📂 vae/
│   │   └── ae.safetensors
│   └── 📂 diffusion_models/
│       ├── hidream_e1_1_bf16.safetensors
│       └── hidream_e1_full_bf16.safetensors
```
## HiDream E1.1 ComfyUI 原生工作流示例
E1.1 是于 2025 年 7 月 16 日更新迭代的版本,支持动态一百万总像素的分辨率,工作流中使用了 `Scale Image to Total Pixels` 节点来将输入图片动态调整为一百万像素
这里是在测试使用时对应的显存占用参考:
1. A100 40GB (VRAM 使用率 95%):第一次生成: 211s,第二次生成: 73s
2. 4090D 24GB (VRAM 使用率 98%)
* 完整版本: Out of memory
* FP8\_e4m3fn\_fast (VRAM 98%) 第一次生成: 120s, 第二次生成: 91s
### 1. HiDream E1.1 工作流及相关素材
下载下面的图片并拖入 ComfyUI 以加载对应工作流及模型

下载下面的图片作为输入

### 2. 按步骤完成 HiDream-e1 工作流运行
按步骤完成工作流的运行
1. 确保`Load Diffusion Model` 节点加载了 `hidream_e1_1_bf16.safetensors` 模型
2. 确保`QuadrupleCLIPLoader` 中四个对应的 text encoder 被正确加载
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. 确保`Load VAE` 节点中使用的是 `ae.safetensors` 文件
4. 在 `Load Image` 节点中加载提供的输入或你需要的图片
5. 在`Empty Text Encoder(Positive)` 节点中输入 **想要对图片进行的修改**
6. 在`Empty Text Encoder(Negative)` 节点中输入 **不想要在画面中出现的内容**
7. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行图片生成
### 3. 工作流补充说明
* 由于 HiDream E1.1 支持的是总像素为一百万像素的动态输入,工作流使用了 `Scale Image to Total Pixels` 对所有输入图片进行处理转化,这可能导致输出图片的尺寸比例相对于输入图片有所变化
* 在实际测试中,A100 40GB 和 4090D 24GB 上使用完整(fp16)版本时会 Out of memory,所以工作流默认设置使用 `fp8_e4m3fn_fast` 来进行推理
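`Scale Image to Total Pixels` 的缩放逻辑可以用几行代码示意:按目标总像素等比缩放(此处按 1 megapixel = 1024×1024 像素计,具体实现细节为假设):

```python
import math

def scale_to_total_pixels(width, height, megapixels=1.0):
    """等比缩放到约 megapixels 总像素,返回新的 (width, height)。"""
    target = megapixels * 1024 * 1024
    scale = math.sqrt(target / (width * height))
    return round(width * scale), round(height * scale)
```

例如 512×768 的输入会被等比放大到约一百万总像素,长宽比保持不变,这也解释了上文中输出尺寸相对输入会有变化的原因。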
## HiDream E1 ComfyUI 原生 工作流示例
E1 是于 2025 年 4 月 28 日发布的,这个模型只支持 768\*768 的分辨率
### 1. HiDream-e1 工作流及相关素材
供参考,本文工作流在 Google Colab L4 22.5GB显存下采样步数 28步首次运行消耗 500s,第二次运行消耗 370s。
#### 1.1 下载工作流文件
请下载下面的图片并拖入 ComfyUI 中,工作流已包含模型下载信息,加载后将会提示你进行对应的模型下载。

#### 1.2 下载输入图片
请下载下面的图片,我们将用于输入

### 2. 按步骤完成 HiDream-e1 工作流运行
按步骤完成工作流的运行
1. 确保`Load Diffusion Model` 节点加载了 `hidream_e1_full_bf16.safetensors` 模型
2. 确保`QuadrupleCLIPLoader` 中四个对应的 text encoder 被正确加载
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. 确保`Load VAE` 节点中使用的是 `ae.safetensors` 文件
4. 在 `Load Image` 节点中加载我们之前下载的输入图片
5. (重要)在`Empty Text Encoder(Positive)` 节点中输入 **想要修改的画面的提示词**
6. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行图片生成
### ComfyUI HiDream-e1 工作流补充说明
* 可能需要修改多次提示词或者进行多次的生成才能得到较好的结果
* 这个模型在改变图片风格上比较难保持一致性,需要尽可能完善提示词
* 由于模型仅支持 768\*768 的分辨率,实际测试中尝试过其它尺寸,图像表现不佳,甚至差异较大
# ComfyUI 原生版本 HiDream-I1 文生图工作流示例
Source: https://docs.comfy.org/zh-CN/tutorials/image/hidream/hidream-i1
本篇将引导了解并完成 ComfyUI 原生版本 HiDream-I1 文生图工作流实例

HiDream-I1 是智象未来(HiDream-ai)于2025年4月7日正式开源的文生图模型。该模型拥有17B参数规模,采用 [MIT 许可证](https://github.com/HiDream-ai/HiDream-I1/blob/main/LICENSE) 发布,支持用于个人项目、科学研究以及商用,目前在多项基准测试中该模型表现优异。
## 模型特点
**混合架构设计**
采用扩散模型(DiT)与混合专家系统(MoE)的结合架构:
* 主体基于Diffusion Transformer(DiT),通过双流MMDiT模块处理多模态信息,单流DiT模块优化全局一致性。
* 动态路由机制灵活分配计算资源,提升复杂场景处理能力,在色彩还原、边缘处理等细节上表现优异。
**多模态文本编码器集成**
整合四个文本编码器:
* OpenCLIP ViT-bigG、OpenAI CLIP ViT-L(视觉语义对齐)
* T5-XXL(长文本解析)
* Llama-3.1-8B-Instruct(指令理解)
这一组合使其在颜色、数量、空间关系等复杂语义解析上达到SOTA水平,中文提示词支持显著优于同类开源模型。
**原始模型版本**
智象未来(HiDream-ai)提供了三个版本的 HiDream-I1 模型,以满足不同场景的需求,下面是原始的模型仓库链接:
* 完整版本:[🤗 HiDream-I1-Full](https://huggingface.co/HiDream-ai/HiDream-I1-Full) 推理步数为 50
* 蒸馏开发版本:[🤗 HiDream-I1-Dev](https://huggingface.co/HiDream-ai/HiDream-I1-Dev) 推理步数为 28
* 蒸馏快速版本:[🤗 HiDream-I1-Fast](https://huggingface.co/HiDream-ai/HiDream-I1-Fast) 推理步数为 16
## 关于本篇工作流示例
我们将在本篇示例中使用 ComfyOrg 的 repackaged 的版本,你可以在 [HiDream-I1\_ComfyUI](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/) 仓库中找到我们将在本篇示例中使用的所有模型文件。
在开始前请更新你的 ComfyUI 版本,至少保证在这个[提交](https://github.com/comfyanonymous/ComfyUI/commit/9ad792f92706e2179c58b2e5348164acafa69288) 之后才能确保你的 ComfyUI 有 HiDream 的原生支持
## HiDream-I1 工作流
对应不同 ComfyUI 原生版本 HiDream-I1 工作流的模型要求基本上是相同的,只有使用过的 [diffusion models](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/tree/main/split_files/diffusion_models) 文件不同。
如果你不知道如何选择合适的版本,请参考以下建议:
* **HiDream-I1-Full** 可以生成质量最高的图像
* **HiDream-I1-Dev** 在生成较高质量的图像的同时,又兼顾速度
* **HiDream-I1-Fast** 只需要 16 步就可以生成图像,适合需要实时迭代的场景
对于 **dev** 和 **fast** 版本并不需要负向提示词,所以请在采样时设置`cfg` 参数为 `1.0`,我们对应参数设置已在相关工作流中备注。
以上三个版本的完整版本对显存要求较高,你可能需要 27GB 以上的显存才能顺利运行。在对应版本的工作流教程中,我们将会使用 **fp8** 版本作为示例演示,以保证大多用户都可以顺利运行,不过我们仍会在对应示例中提供不同版本的模型下载链接,你可以根据你的显存情况来选择合适的文件。
### 模型安装
下面的模型文件是我们会共用的模型文件,请点击对应的链接进行下载,并参照模型文件保存位置进行保存,对应的 **diffusion models** 模型我们会在对应工作流中引导你进行下载。
**text\_encoders**:
* [clip\_l\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_l_hidream.safetensors)
* [clip\_g\_hidream.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/clip_g_hidream.safetensors)
* [t5xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/t5xxl_fp8_e4m3fn_scaled.safetensors) 这个模型在许多的工作流中都有使用过,你可能已经下载了这个文件。
* [llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/text_encoders/llama_3.1_8b_instruct_fp8_scaled.safetensors)
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/blob/main/split_files/vae/ae.safetensors) 这个是 Flux 的 VAE 模型,如果你之前使用过 Flux 的工作流,你可能已经下载了这个文件。
**diffusion models**
这部分我们将在对应工作流中具体引导下载对应的模型文件。
模型文件保存位置
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │   ├── clip_l_hidream.safetensors
│   │   ├── clip_g_hidream.safetensors
│   │   ├── t5xxl_fp8_e4m3fn_scaled.safetensors
│   │   └── llama_3.1_8b_instruct_fp8_scaled.safetensors
│   ├── 📂 vae/
│   │   └── ae.safetensors
│   └── 📂 diffusion_models/
│       └── ... # 将在对应版本的工作流中引导你进行安装
```
### HiDream-I1 full 版本工作流
#### 1. 模型文件下载
请根据你的硬件情况选择合适的版本,点击链接并下载对应的模型文件保存到 `ComfyUI/models/diffusion_models/` 文件夹下。
* FP8 版本:[hidream\_i1\_full\_fp8.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_full_fp8.safetensors?download=true) 需要 16GB 以上的显存
* 完整版本:[hidream\_i1\_full\_f16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_full_fp16.safetensors?download=true) 需要 27GB 以上的显存
#### 2. 工作流文件下载
请下载下面的图片,并拖入 ComfyUI 中以加载对应的工作流

#### 3. 按步骤完成工作流的运行
按步骤完成工作流的运行
1. 确保`Load Diffusion Model` 节点中使用的是 `hidream_i1_full_fp8.safetensors` 文件
2. 确保`QuadrupleCLIPLoader` 中四个对应的 text encoder 被正确加载
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. 确保`Load VAE` 节点中使用的是 `ae.safetensors` 文件
4. 对于 **full** 版本你需要设置 `ModelSamplingSD3` 中的 `shift` 参数为 `3.0`
5. 对于 `Ksampler` 节点,你需要进行以下设置
* `steps` 设置为 `50`
* `cfg` 设置为 `5.0`
* (可选) `sampler` 设置为 `lcm`
* (可选) `scheduler` 设置为 `normal`
6. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行图片生成
### HiDream-I1 dev 版本工作流
#### 1. 模型文件下载
请根据你的硬件情况选择合适的版本,点击链接并下载对应的模型文件保存到 `ComfyUI/models/diffusion_models/` 文件夹下。
* FP8 版本:[hidream\_i1\_dev\_fp8.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_dev_fp8.safetensors?download=true) 需要 16GB 以上的显存
* 完整版本:[hidream\_i1\_dev\_bf16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_dev_bf16.safetensors?download=true) 需要 27GB 以上的显存
#### 2. 工作流文件下载
请下载下面的图片,并拖入 ComfyUI 中以加载对应的工作流

#### 3. 按步骤完成工作流的运行
按步骤完成工作流的运行
1. 确保`Load Diffusion Model` 节点中使用的是 `hidream_i1_dev_fp8.safetensors` 文件
2. 确保`QuadrupleCLIPLoader` 中四个对应的 text encoder 被正确加载
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. 确保`Load VAE` 节点中使用的是 `ae.safetensors` 文件
4. 对于 **dev** 版本你需要设置 `ModelSamplingSD3` 中的 `shift` 参数为 `6.0`
5. 对于 `Ksampler` 节点,你需要进行以下设置
* `steps` 设置为 `28`
* (重要) `cfg` 设置为 `1.0`
* (可选) `sampler` 设置为 `lcm`
* (可选) `scheduler` 设置为 `normal`
6. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行图片生成
### HiDream-I1 fast 版本工作流
#### 1. 模型文件下载
请根据你的硬件情况选择合适的版本,点击链接并下载对应的模型文件保存到 `ComfyUI/models/diffusion_models/` 文件夹下。
* FP8 版本:[hidream\_i1\_fast\_fp8.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_fast_fp8.safetensors?download=true) 需要 16GB 以上的显存
* 完整版本:[hidream\_i1\_fast\_bf16.safetensors](https://huggingface.co/Comfy-Org/HiDream-I1_ComfyUI/resolve/main/split_files/diffusion_models/hidream_i1_fast_bf16.safetensors?download=true) 需要 27GB 以上的显存
#### 2. 工作流文件下载
请下载下面的图片,并拖入 ComfyUI 中以加载对应的工作流

#### 3. 按步骤完成工作流的运行
按步骤完成工作流的运行
1. 确保`Load Diffusion Model` 节点中使用的是 `hidream_i1_fast_fp8.safetensors` 文件
2. 确保`QuadrupleCLIPLoader` 中四个对应的 text encoder 被正确加载
* clip\_l\_hidream.safetensors
* clip\_g\_hidream.safetensors
* t5xxl\_fp8\_e4m3fn\_scaled.safetensors
* llama\_3.1\_8b\_instruct\_fp8\_scaled.safetensors
3. 确保`Load VAE` 节点中使用的是 `ae.safetensors` 文件
4. 对于 **fast** 版本你需要设置 `ModelSamplingSD3` 中的 `shift` 参数为 `3.0`
5. 对于 `Ksampler` 节点,你需要进行以下设置
* `steps` 设置为 `16`
* (重要) `cfg` 设置为 `1.0`
* (可选) `sampler` 设置为 `lcm`
* (可选) `scheduler` 设置为 `normal`
6. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行图片生成
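上文三个版本的关键采样参数可以汇总成一个速查表,数值均来自上文各工作流步骤(仅为整理示意):

```python
# HiDream-I1 各版本的 ModelSamplingSD3 shift 与 KSampler 参数速查
HIDREAM_I1_SETTINGS = {
    "full": {"steps": 50, "cfg": 5.0, "shift": 3.0},
    "dev":  {"steps": 28, "cfg": 1.0, "shift": 6.0},
    "fast": {"steps": 16, "cfg": 1.0, "shift": 3.0},
}

def sampler_settings(version):
    """返回指定版本的采样参数;dev/fast 为蒸馏版,cfg 必须为 1.0。"""
    if version not in HIDREAM_I1_SETTINGS:
        raise ValueError(f"未知版本: {version},可选 {sorted(HIDREAM_I1_SETTINGS)}")
    return HIDREAM_I1_SETTINGS[version]
```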
## 使用建议
* 虽然 HiDream-I1 支持中文提示词,但建议还是优先使用英文提示词来保证准确性
* 你可以使用 fast 版本来快速生成示例验证,然后再用完整版本的模型来生成较高质量的图像
## 其它相关资源
### GGUF 版本模型
* [HiDream-I1-Full-gguf](https://huggingface.co/city96/HiDream-I1-Full-gguf)
* [HiDream-I1-Dev-gguf](https://huggingface.co/city96/HiDream-I1-Dev-gguf)
你需要使用 City96 的 [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) 中的 `Unet Loader (GGUF)`节点替换掉 `Load Diffusion Model` 节点来使用 GGUF 版本模型。
* [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF)
### NF4 版本模型
* [HiDream-I1-nf4](https://github.com/hykilpikonna/HiDream-I1-nf4)
* 使用 [ComfyUI-HiDream-Sampler](https://github.com/SanDiegoDude/ComfyUI-HiDream-Sampler) 节点来使用 NF4 版本模型。
# ComfyUI OmniGen2 原生工作流示例
Source: https://docs.comfy.org/zh-CN/tutorials/image/omnigen/omnigen2
ComfyUI OmniGen2 原生工作流示例 - 统一的文生图、图像编辑和多图像合成模型。
## 关于 OmniGen2
OmniGen2 是一个强大且高效的统一多模态生成模型,总参数量约 **7B**(3B 文本模型 + 4B 图像生成模型)。与 OmniGen v1 不同,OmniGen2 采用创新的双路径 Transformer 架构,具有完全独立的文本自回归模型和图像扩散模型,实现参数解耦和专门优化。
### 模型亮点
* **视觉理解**:继承了 Qwen-VL-2.5 基础模型强大的图像内容解释和分析能力
* **文生图生成**:从文本提示创建高保真度和美观的图像
* **指令引导的图像编辑**:执行复杂的、基于指令的图像修改,在开源模型中达到最先进的性能
* **上下文生成**:多功能的能力,可以处理和灵活结合多样化的输入(包括人物、参考对象和场景),产生新颖且连贯的视觉输出
### 技术特性
* **双路径架构**:基于 Qwen 2.5 VL(3B)文本编码器 + 独立扩散 Transformer(4B)
* **Omni-RoPE 位置编码**:支持多图像空间定位和身份区分
* **参数解耦设计**:避免文本生成对图像质量的负面影响
* 支持复杂的文本理解和图像理解
* 可控的图像生成和编辑
* 优秀的细节保持能力
* 统一架构支持多种图像生成任务
* 文字生成能力:可以在图像中生成清晰的文字内容
请确保你的 ComfyUI 已经更新。
* [ComfyUI 下载](https://www.comfy.org/download)
* [ComfyUI 更新教程](/zh-CN/installation/update_comfyui)
本指南里的工作流可以在 ComfyUI 的[工作流模板](/zh-CN/interface/features/template)中找到。如果找不到,可能是 ComfyUI 没有更新。
如果加载工作流时有节点缺失,可能原因有:
1. 你用的不是最新开发版(nightly)。
2. 你用的是稳定版或桌面版(没有包含最新的更新)。
3. 启动时有些节点导入失败。
## OmniGen2 模型下载
由于本文涉及不同工作流,对应的模型文件及安装位置如下,对应工作流中也已包含了模型文件下载信息:
**Diffusion Models**
* [omnigen2\_fp16.safetensors](https://huggingface.co/Comfy-Org/Omnigen2_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/omnigen2_fp16.safetensors)
**VAE**
* [ae.safetensors](https://huggingface.co/Comfy-Org/Lumina_Image_2.0_Repackaged/resolve/main/split_files/vae/ae.safetensors)
**Text Encoders**
* [qwen\_2.5\_vl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Omnigen2_ComfyUI_repackaged/resolve/main/split_files/text_encoders/qwen_2.5_vl_fp16.safetensors)
文件保存位置:
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── omnigen2_fp16.safetensors
│   ├── 📂 vae/
│   │   └── ae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_fp16.safetensors
```
## ComfyUI OmniGen2 文生图工作流
### 1. 工作流文件下载

### 2. 按步骤完成工作流运行
请参照图片序号进行逐步确认,来保证对应工作流的顺利运行:
1. **加载主模型**:确保 `Load Diffusion Model` 节点加载了 `omnigen2_fp16.safetensors`
2. **加载文本编码器**:确保 `Load CLIP` 节点加载了 `qwen_2.5_vl_fp16.safetensors`
3. **加载 VAE**:确保 `Load VAE` 节点加载了 `ae.safetensors`
4. **设置图像尺寸**:在 `EmptySD3LatentImage` 节点设置生成图片的尺寸(推荐 1024x1024)
5. **输入提示词**:
* 在第一个 `CLIPTextEncode` 节点中输入正向提示词(想要出现在图像中的内容)
* 在第二个 `CLIPTextEncode` 节点中输入负向提示词(不想要出现在图像中的内容)
6. **开始生成**:点击 `Queue Prompt` 按钮,或使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行文生图
7. **查看结果**:生成完成后对应的图片会自动保存到 `ComfyUI/output/` 目录下,你也可以在 `SaveImage` 节点中预览
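除了在界面里点击 `Queue Prompt`,也可以通过 ComfyUI 的 HTTP 接口提交工作流(POST `/prompt`)。下面用标准库构造请求的最小示意(默认服务器地址为假设,实际提交需要 API 格式的工作流 JSON):

```python
import json
import urllib.request

def build_prompt_request(workflow, server="http://127.0.0.1:8188"):
    """构造提交工作流到 ComfyUI /prompt 接口的 HTTP 请求(只构造,不发送)。"""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
```

实际使用时再调用 `urllib.request.urlopen(build_prompt_request(workflow_json))` 发送请求。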
## ComfyUI OmniGen2 图片编辑工作流
OmniGen2 有丰富的图像编辑能力,并且支持为图像添加文本
### 1. 工作流文件下载

下载下面的图片,我们将使用它作为输入图片。

### 2. 按步骤完成工作流运行
1. **加载主模型**:确保 `Load Diffusion Model` 节点加载了 `omnigen2_fp16.safetensors`
2. **加载文本编码器**:确保 `Load CLIP` 节点加载了 `qwen_2.5_vl_fp16.safetensors`
3. **加载 VAE**:确保 `Load VAE` 节点加载了 `ae.safetensors`
4. **上传图像**:在 `Load Image` 节点中上传提供的图片
5. **输入提示词**:
* 在第一个 `CLIPTextEncode` 节点中输入正向提示词(想要出现在图像中的内容)
* 在第二个 `CLIPTextEncode` 节点中输入负向提示词(不想要出现在图像中的内容)
6. **开始生成**:点击 `Queue Prompt` 按钮,或使用快捷键 `Ctrl(cmd) + Enter(回车)` 来执行文生图
7. **查看结果**:生成完成后对应的图片会自动保存到 `ComfyUI/output/` 目录下,你也可以在 `SaveImage` 节点中预览
### 3. 工作流补充说明
* 如果你想要启用第二张图像输入,可以选中工作流中呈粉紫色(被绕过)的节点,使用快捷键 **Ctrl + B** 启用对应的节点输入
* 如果你想要自定义尺寸,可以删除连接到 `EmptySD3LatentImage` 节点的 `Get image size` 节点,并输入自定义尺寸
# Qwen-Image ComfyUI原生工作流示例
Source: https://docs.comfy.org/zh-CN/tutorials/image/qwen/qwen-image
Qwen-Image 是一个拥有 20B 参数的 MMDiT(多模态扩散变换器)模型,基于 Apache 2.0 许可证开源。
**Qwen-Image** 是阿里巴巴通义千问团队发布的首个图像生成基础模型,这是一个拥有 20B 参数的 MMDiT(多模态扩散变换器)模型,基于 Apache 2.0 许可证开源。该模型在**复杂文本渲染**和**精确图像编辑**方面取得了显著进展,无论是英语还是中文等多种语言都能实现高保真输出。
**模型亮点**:
* **卓越的多语言文本渲染**:支持英语、中文、韩语、日语等多种语言的高精度文本生成,保持字体细节和布局一致性
* **多样化艺术风格**:从照片级真实到印象派绘画,从动漫美学到极简设计,流畅适应各种创意提示
**相关链接**:
* [GitHub](https://github.com/QwenLM/Qwen-Image)
* [Hugging Face](https://huggingface.co/Qwen/Qwen-Image)
* [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image)
另外目前 Qwen-Image 有多种 ControlNet 支持
* [Qwen-Image-DiffSynth-ControlNets/model\_patches](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/tree/main/split_files/model_patches): 包括 canny、depth、inpaint 三个模型
* [qwen\_image\_union\_diffsynth\_lora.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/blob/main/split_files/loras/qwen_image_union_diffsynth_lora.safetensors): 图像结构控制lora 支持 canny、depth、pose、lineart、softedge、normal、openpose
* InstantX ControlNet: 待更新
## ComfyOrg Qwen-Image live stream
**Qwen-Image in ComfyUI - Lightning & LoRAs**
**Qwen-Image ControlNet in ComfyUI - DiffSynth**
## Qwen-Image 原生工作流示例
请确保你的 ComfyUI 已经更新。
* [ComfyUI 下载](https://www.comfy.org/download)
* [ComfyUI 更新教程](/zh-CN/installation/update_comfyui)
本指南里的工作流可以在 ComfyUI 的[工作流模板](/zh-CN/interface/features/template)中找到。如果找不到,可能是 ComfyUI 没有更新。
如果加载工作流时有节点缺失,可能原因有:
1. 你用的不是最新开发版(nightly)。
2. 你用的是稳定版或桌面版(没有包含最新的更新)。
3. 启动时有些节点导入失败。
本篇文档所附工作流中使用的模型有三种:
1. Qwen-Image 原版模型 fp8\_e4m3fn
2. 8 步加速版:Qwen-Image 原版模型 fp8\_e4m3fn 搭配 lightx2v 8 步 LoRA
3. 蒸馏版:Qwen-Image 蒸馏版模型 fp8\_e4m3fn
**显存使用参考**
GPU: RTX4090D 24GB
| 使用模型 | VRAM Usage | 首次生成 | 第二次生成 |
| ------------------------------- | ---------- | ----- | ----- |
| fp8\_e4m3fn | 86% | ≈ 94s | ≈ 71s |
| fp8\_e4m3fn 使用 lightx2v 8步 LoRA | 86% | ≈ 55s | ≈ 34s |
| 蒸馏版 fp8\_e4m3fn | 86% | ≈ 69s | ≈ 36s |
### 1. 工作流文件
更新 ComfyUI 后你可以从模板中找到工作流文件,或者将下面的工作流拖入 ComfyUI 中加载

下载原始版 JSON 格式工作流
蒸馏版
下载蒸馏版 JSON 格式工作流
### 2. 模型下载
**你可以在 ComfyOrg 仓库找到的版本**
* Qwen-Image\_bf16 (40.9 GB)
* Qwen-Image\_fp8 (20.4 GB)
* 蒸馏版本 (非官方,仅需 15 步)
所有模型均可在 [Huggingface](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) 或者 [魔搭](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files) 找到
**Diffusion model**
* [qwen\_image\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors)
**Qwen\_image\_distill**
* [qwen\_image\_distill\_full\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/non_official/diffusion_models/qwen_image_distill_full_fp8_e4m3fn.safetensors)
* [qwen\_image\_distill\_full\_bf16.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/non_official/diffusion_models/qwen_image_distill_full_bf16.safetensors)
- 蒸馏版本原始作者建议在 15 步 cfg 1.0
- 经测试该蒸馏版本在 10 步 cfg 1.0 下表现良好,根据你想要的图像类型选择 euler 或 res\_multistep
**LoRA**
* [Qwen-Image-Lightning-8steps-V1.0.safetensors](https://huggingface.co/lightx2v/Qwen-Image-Lightning/resolve/main/Qwen-Image-Lightning-8steps-V1.0.safetensors)
**Text encoder**
* [qwen\_2.5\_vl\_7b\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
**VAE**
* [qwen\_image\_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)
模型保存位置
```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   ├── qwen_image_fp8_e4m3fn.safetensors
│   │   └── qwen_image_distill_full_fp8_e4m3fn.safetensors ## 蒸馏版
│   ├── 📂 loras/
│   │   └── Qwen-Image-Lightning-8steps-V1.0.safetensors ## 8步加速 LoRA 模型
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
### 3. 工作流使用说明
1. 确保 `Load Diffusion Model`节点加载了`qwen_image_fp8_e4m3fn.safetensors`
2. 确保 `Load CLIP`节点中加载了`qwen_2.5_vl_7b_fp8_scaled.safetensors`
3. 确保 `Load VAE`节点中加载了`qwen_image_vae.safetensors`
4. 确保 `EmptySD3LatentImage`节点中设置好了图片的尺寸
5. 在`CLIP Text Encoder`节点中设置好提示词,经测试目前至少支持:英语、中文、韩语、日语、意大利语等
6. 如果需要启用 lightx2v 的 8 步加速 LoRA,请选中后用 `Ctrl + B` 启用该节点,并按序号 `8` 处的参数修改 `KSampler` 的设置
7. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来运行工作流
8. 对于不同版本的模型和工作流的对应 KSampler 的参数设置
蒸馏版模型和 lightx2v 的 8 步加速 LoRA 似乎并不兼容,你可以测试具体的组合参数来验证组合使用的方式是否可行
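上文几种模型变体的 KSampler 参数可以整理成如下速查表(仅收录文中给出的数值;原版模型的步数上文未给出,此处从略):

```python
# Qwen-Image 不同变体的 KSampler 参数(源自上文说明)
QWEN_IMAGE_PRESETS = {
    # lightx2v 8 步加速 LoRA:名义 8 步,蒸馏类模型通常配合 cfg 1.0(此处为假设)
    "lightning_8step_lora": {"steps": 8, "cfg": 1.0},
    # 蒸馏版:原始作者建议 15 步 cfg 1.0;实测 10 步也表现良好
    "distill": {"steps": 15, "cfg": 1.0},
}

def preset(name):
    """返回变体对应的 KSampler 参数字典。"""
    return QWEN_IMAGE_PRESETS[name]
```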
## Qwen Image InstantX ControlNet 工作流
这是一个 Union 类型的 ControlNet 模型,支持多种控制模式
### 1. 工作流及输入图片
下载下面的图片并拖入 ComfyUI 以加载工作流

下载 JSON 格式工作流
下载下面的图片作为输入

### 2. 模型链接
1. InstantX Controlnet
下载 [Qwen-Image-InstantX-ControlNet-Union.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-InstantX-ControlNets/resolve/main/split_files/controlnet/Qwen-Image-InstantX-ControlNet-Union.safetensors) 并保存到 `ComfyUI/models/controlnet/` 文件夹下
2. **Lotus Depth 模型**
你也可以使用类似 [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) 等自定义节点来完成各种类型图像的预处理
我们将使用这个模型来生成图像的深度图,它需要安装以下两个模型:
**Diffusion Model**
* [lotus-depth-d-v1-1.safetensors](https://huggingface.co/Comfy-Org/lotus/resolve/main/lotus-depth-d-v1-1.safetensors)
**VAE Model**
* [vae-ft-mse-840000-ema-pruned.safetensors](https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors) 或者任意的 SD1.5 的 VAE 都可以使用
```
ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── lotus-depth-d-v1-1.safetensors
│   └── vae/
│       └── vae-ft-mse-840000-ema-pruned.safetensors
```
### 3. 工作流说明
1. 确保 `Load ControlNet Model` 节点正确加载了 `Qwen-Image-InstantX-ControlNet-Union.safetensors` 模型
2. 上传输入图像
3. 这里是一个子图,其中包含 ComfyUI 支持的 Lotus Depth 模型工作流,你可以在模板中找到 Lotus Depth 或编辑对应子图了解详细流程,请确保其中的模型都已正确加载
4. 点击 `Run` 按钮,或者使用快捷键 `Ctrl(cmd) + Enter(回车)` 来运行工作流
## Qwen Image ControlNet DiffSynth-ControlNets Model Patches 工作流
这个模型实际上并不是一个 controlnet,而是一个 Model patch, 支持 canny、depth、inpaint 三种不同的控制模式
原始模型地址:[DiffSynth-Studio/Qwen-Image ControlNet](https://www.modelscope.cn/collections/Qwen-Image-ControlNet-6157b44e89d444)
Comfy Org rehost 地址: [Qwen-Image-DiffSynth-ControlNets/model\_patches](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/tree/main/split_files/model_patches)
### 1. 工作流及输入图片
下载下面的图片拖入 ComfyUI 中以加载对应的工作流

下载 JSON 格式工作流
下载下面的图片作为输入图片:

### 2. 模型链接
其它模型与 Qwen-Image 基础工作流一致,你只需下载下面的模型并保存到 `ComfyUI/models/model_patches` 文件夹中
* [qwen\_image\_canny\_diffsynth\_controlnet.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/resolve/main/split_files/model_patches/qwen_image_canny_diffsynth_controlnet.safetensors)
* [qwen\_image\_depth\_diffsynth\_controlnet.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/resolve/main/split_files/model_patches/qwen_image_depth_diffsynth_controlnet.safetensors)
* [qwen\_image\_inpaint\_diffsynth\_controlnet.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/resolve/main/split_files/model_patches/qwen_image_inpaint_diffsynth_controlnet.safetensors)
### 3. Workflow Instructions
DiffSynth currently provides three patch models: Canny, Depth, and Inpaint
If this is your first time using a ControlNet-style workflow, note that the control image must be preprocessed into a supported format before the model can use and recognize it:
* Canny: a preprocessed canny edge map (line-art outline)
* Depth: a preprocessed depth map that conveys spatial relationships
* Inpaint: a mask marking the region to repaint
Since this patch comes as three separate models, choose the correct preprocessing type for your input to ensure the image is preprocessed correctly
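The pairing between patch model and required preprocessing can be sketched as a simple lookup table (the `expected_preprocessing` helper is illustrative only, not a ComfyUI API):

```python
# Illustrative mapping from DiffSynth patch model to the preprocessing it expects.
# The file names mirror the three models listed above; the helper is hypothetical.
PATCH_PREPROCESSING = {
    "qwen_image_canny_diffsynth_controlnet.safetensors": "canny edge map",
    "qwen_image_depth_diffsynth_controlnet.safetensors": "depth map",
    "qwen_image_inpaint_diffsynth_controlnet.safetensors": "mask over region to repaint",
}

def expected_preprocessing(patch_file: str) -> str:
    """Return the preprocessing a given patch model expects."""
    try:
        return PATCH_PREPROCESSING[patch_file]
    except KeyError:
        raise ValueError(f"unknown patch model: {patch_file}") from None
```

Picking the wrong pair (e.g. feeding a depth map to the canny patch) will not error out, but the control signal will not match what the model was trained on.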
**Canny Model ControlNet Instructions**
1. Make sure `qwen_image_canny_diffsynth_controlnet.safetensors` is loaded
2. Upload the input image for subsequent processing
3. The Canny node is a native preprocessing node; it preprocesses the input image according to the parameters you set, guiding the generation
4. If needed, adjust `strength` on the `QwenImageDiffsynthControlnet` node to control the strength of the line-art control
5. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
> For qwen\_image\_depth\_diffsynth\_controlnet.safetensors, the image must be preprocessed into a depth map, replacing the `image processing` step; for that part, refer to the InstantX approach described in this document. Everything else works like the Canny model
**Inpaint Model ControlNet Instructions**
The Inpaint model requires the [Mask Editor](/zh-CN/interface/maskeditor) to draw a mask that serves as the control condition
1. Make sure `ModelPatchLoader` loads the `qwen_image_inpaint_diffsynth_controlnet.safetensors` model
2. Upload an image and draw a mask with the [Mask Editor](/zh-CN/interface/maskeditor); connect the `mask` output of the `Load Image` node to the `mask` input of `QwenImageDiffsynthControlnet` so the mask is actually used
3. Use the `Ctrl-B` shortcut to set the workflow's Canny node to bypass mode so the Canny preprocessing has no effect
4. In `CLIP Text Encoder`, describe what the masked region should become
5. If needed, adjust `strength` on the `QwenImageDiffsynthControlnet` node to control the control strength
6. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
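Conceptually, the mask drawn in the Mask Editor selects which pixels the model may repaint while everything else is preserved. A minimal pure-Python sketch of that masking idea (an illustration only, not how the node is implemented):

```python
def apply_inpaint_mask(image, repaint, mask):
    """Keep original pixels where mask == 0; take repainted pixels where mask == 1.

    image, repaint, mask are nested lists of the same (rows x cols) shape.
    """
    return [
        [r if m else p for p, r, m in zip(img_row, rep_row, mask_row)]
        for img_row, rep_row, mask_row in zip(image, repaint, mask)
    ]

# Tiny 2x2 example: only the top-left pixel is marked for repainting.
img = [[10, 20], [30, 40]]
new = [[99, 99], [99, 99]]
m = [[1, 0], [0, 0]]
print(apply_inpaint_mask(img, new, m))  # [[99, 20], [30, 40]]
```

This is why step 2 matters: without the `mask` connection, the model has no way of knowing which region it is allowed to change.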
## Qwen Image Union ControlNet LoRA Workflow
Original model: [DiffSynth-Studio/Qwen-Image-In-Context-Control-Union](https://www.modelscope.cn/models/DiffSynth-Studio/Qwen-Image-In-Context-Control-Union/)
Comfy Org rehost: [qwen\_image\_union\_diffsynth\_lora.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/blob/main/split_files/loras/qwen_image_union_diffsynth_lora.safetensors): an image structure-control LoRA supporting canny, depth, pose, lineart, softedge, normal, and openpose
### 1. Workflow and Input Image
Download the image below and drag it into ComfyUI to load the workflow

Download the workflow in JSON format
Download the image below to use as the input image

### 2. Model Links
Download the model below. Since it is a LoRA, save it to the `ComfyUI/models/loras/` folder
* [qwen\_image\_union\_diffsynth\_lora.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-DiffSynth-ControlNets/blob/main/split_files/loras/qwen_image_union_diffsynth_lora.safetensors): an image structure-control LoRA supporting canny, depth, pose, lineart, softedge, normal, and openpose
### 3. Workflow Notes
This model is a unified control LoRA supporting canny, depth, pose, lineart, softedge, normal, openpose, and similar controls. Because many preprocessors are not yet fully covered by native nodes, you may need something like [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for the other preprocessing types
1. Make sure `LoraLoaderModelOnly` loads the `qwen_image_union_diffsynth_lora.safetensors` model
2. Upload the input image
3. Adjust the `Canny` node parameters if needed; different input images need different settings for good preprocessing results, so experiment to keep more or fewer details
4. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
> For other control types, replace the image-preprocessing part accordingly
# Qwen-Image-Edit ComfyUI Native Workflow Example
Source: https://docs.comfy.org/zh-CN/tutorials/image/qwen/qwen-image-edit
Qwen-Image-Edit is the image-editing version of Qwen-Image. Further trained from the 20B model, it supports precise text editing along with dual semantic/appearance editing.
**Qwen-Image-Edit** is the image-editing version of Qwen-Image. Trained further on the 20B Qwen-Image model, it successfully extends Qwen-Image's signature text-rendering ability to editing tasks, enabling precise text edits. In addition, Qwen-Image-Edit feeds the input image into both Qwen2.5-VL (for visual semantic control) and the VAE Encoder (for visual appearance control), giving it both semantic and appearance editing capabilities.
**Model Features**
Key features include:
* Precise text editing: Qwen-Image-Edit supports bilingual (Chinese and English) text editing, and can add, delete, or modify text in an image while preserving its size, font, and style.
* Dual semantic/appearance editing: Qwen-Image-Edit supports not only low-level visual appearance edits (style transfer, adding/removing/changing elements, etc.) but also high-level visual semantic edits (IP creation, object rotation, etc.)
* Strong cross-benchmark performance: evaluations on multiple public benchmarks show that Qwen-Image-Edit achieves SOTA on editing tasks, making it a powerful foundation model for image generation.
**Official links**:
* [GitHub repository](https://github.com/QwenLM/Qwen-Image)
* [Hugging Face](https://huggingface.co/Qwen/Qwen-Image-Edit)
* [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image-Edit)
## ComfyOrg Qwen-Image-Edit Live Stream Replay
VIDEO
## Qwen-Image-Edit ComfyUI Native Workflow Example
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
### 1. Workflow File
After updating ComfyUI, you can find this workflow in the templates, or drag the workflow below into ComfyUI to load it

Download the workflow in JSON format
Download the image below as input

### 2. Model Download
All models can be found at [Comfy-Org/Qwen-Image\_ComfyUI](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [Comfy-Org/Qwen-Image-Edit\_ComfyUI](https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI)
**Diffusion model**
* [qwen\_image\_edit\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors)
**LoRA**
* [Qwen-Image-Lightning-4steps-V1.0.safetensors](https://huggingface.co/lightx2v/Qwen-Image-Lightning/resolve/main/Qwen-Image-Lightning-4steps-V1.0.safetensors)
**Text encoder**
* [qwen\_2.5\_vl\_7b\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
**VAE**
* [qwen\_image\_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)
Model Storage Location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── qwen_image_edit_fp8_e4m3fn.safetensors
│ ├── 📂 loras/
│ │ └── Qwen-Image-Lightning-4steps-V1.0.safetensors
│ ├── 📂 vae/
│ │ └── qwen_image_vae.safetensors
│ └── 📂 text_encoders/
│ └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
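To double-check that your local files match the tree above, a small hypothetical helper can report which expected files are missing (paths assume the default ComfyUI layout; adjust if yours differs):

```python
from pathlib import Path

# Expected files for this workflow, relative to the ComfyUI root (see tree above).
EXPECTED = [
    "models/diffusion_models/qwen_image_edit_fp8_e4m3fn.safetensors",
    "models/loras/Qwen-Image-Lightning-4steps-V1.0.safetensors",
    "models/vae/qwen_image_vae.safetensors",
    "models/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors",
]

def missing_models(comfyui_root: str) -> list:
    """Return the expected model files that are not present under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]

# Example: missing_models("/path/to/ComfyUI") lists anything you still need to download.
```

If a file shows up as missing even though you downloaded it, check for a typo in the folder name (`diffusion_models` vs `checkpoints` is a common mix-up).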
### 3. Complete the Workflow Step by Step
1. Model loading
* Make sure the `Load Diffusion Model` node loads `qwen_image_edit_fp8_e4m3fn.safetensors`
* Make sure the `Load CLIP` node loads `qwen_2.5_vl_7b_fp8_scaled.safetensors`
* Make sure the `Load VAE` node loads `qwen_image_vae.safetensors`
2. Image loading
* Make sure the image you want to edit is uploaded in the `Load Image` node
3. Prompt setup
* Set your prompt in the `CLIP Text Encoder` node
4. The `Scale Image to Total Pixels` node rescales the input image to a total of one megapixel
* This mainly avoids the quality loss caused by oversized inputs such as 2048x2048
* If you are confident about your input image size, you can bypass this node with `Ctrl+B`
5. To speed up generation with the 4-step Lightning LoRA, select the `LoraLoaderModelOnly` node and press `Ctrl+B` to enable it
6. For the `steps` and `cfg` settings of the KSampler node, we added a note below the node; experiment to find the best settings
7. Click `Queue`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
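The rescaling in step 4 can be checked numerically. Assuming the node targets one megapixel, taken here as 1024×1024 = 1,048,576 pixels (an assumption for illustration; the actual node may round dimensions differently), and preserves aspect ratio:

```python
import math

def scale_to_total_pixels(width, height, total=1024 * 1024):
    """Scale (width, height) so the pixel count is ~`total`, keeping aspect ratio.

    The 1024*1024 target is an assumption for illustration.
    """
    factor = math.sqrt(total / (width * height))
    return round(width * factor), round(height * factor)

print(scale_to_total_pixels(2048, 2048))  # (1024, 1024): a 2048x2048 input is halved
```

So an oversized 2048x2048 input is brought back to roughly the resolution the model was trained around, which is why bypassing the node only makes sense when your input is already near one megapixel.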
# Cosmos Predict2 Video Generation ComfyUI Official Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/cosmos/cosmos-predict2-video2world
This guide shows how to complete Cosmos-Predict2 text-to-video and image-to-video workflows in ComfyUI
Cosmos-Predict2 is NVIDIA's next-generation physical-world foundation model, designed for high-quality visual generation and prediction in physical-AI scenarios.
The model offers high physical accuracy, environment interactivity, and detail fidelity, and can realistically simulate complex physical phenomena and dynamic scenes.
Cosmos-Predict2 supports multiple generation modes, including Text2Image and Video2World, with broad applications in industrial simulation, autonomous driving, urban planning, scientific research, and other fields, making it an important foundation for integrating intelligent vision with the physical world.
GitHub: [Cosmos-predict2](https://github.com/nvidia-cosmos/cosmos-predict2)
Hugging Face: [Cosmos-Predict2](https://huggingface.co/collections/nvidia/cosmos-predict2-68028efc052239369a0f2959)
This guide walks you through an **image-to-video** workflow in ComfyUI
For the text-to-image part, see the section below
Text-to-image with Cosmos-Predict2
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
{/*
## Cosmos Predict2 Video2World Workflow
In our tests, the 2B version uses about 16GB of VRAM
#### 1. Download the Workflow File
Download the workflow file in JSON format
Download the image below as the input file:

### 2. Manual Model Installation
**Diffusion model**
- [cosmos_predict2_2B_video2world_480p_16fps.safetensors](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged/resolve/main/cosmos_predict2_2B_video2world_480p_16fps.safetensors)
For other weights, visit [Cosmos_Predict2_repackaged](https://huggingface.co/Comfy-Org/Cosmos_Predict2_repackaged)
**Text encoder**
[oldt5_xxl_fp8_e4m3fn_scaled.safetensors](https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/resolve/main/text_encoders/oldt5_xxl_fp8_e4m3fn_scaled.safetensors)
**VAE**
[wan_2.1_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
File storage location
```
📂 ComfyUI/
├──📂 models/
│ ├── 📂 diffusion_models/
│ │ └─── cosmos_predict2_2B_video2world_480p_16fps.safetensors
│ ├── 📂 text_encoders/
│ │ └─── oldt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └── 📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Complete the Workflow Step by Step

Follow the numbered steps in the image to make sure the workflow runs smoothly
1. Make sure the `Load Diffusion Model` node loads `cosmos_predict2_2B_video2world_480p_16fps.safetensors`
2. Make sure the `Load CLIP` node loads `oldt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node loads `wan_2.1_vae.safetensors`
4. Upload the provided input image in the `Load Image` node
5. (Optional) If you need first/last-frame control, use the shortcut `Ctrl(cmd) + B` to enable the end-frame input
6. (Optional) Edit the prompt in the `ClipTextEncode` node
7. (Optional) Adjust the size and frame count in the `CosmosPredict2ImageToVideoLatent` node
8. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
9. When finished, the video is saved automatically to `ComfyUI/output/`; you can also preview it or change the save location in the `save video` node */}
# ComfyUI Hunyuan Video Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/hunyuan-video
This guide shows how to complete Hunyuan text-to-video and image-to-video workflows in ComfyUI
The Hunyuan Video series is developed and open-sourced by [Tencent](https://huggingface.co/tencent). Built around a hybrid architecture, it supports [text-to-video](https://github.com/Tencent/HunyuanVideo) and [image-to-video](https://github.com/Tencent/HunyuanVideo-I2V) generation, with 13B parameters.
Technical highlights:
* **Core architecture:** a Sora-like DiT (Diffusion Transformer) architecture that effectively fuses text, image, and motion information, improving the consistency, quality, and alignment of generated frames; a unified full-attention mechanism enables multi-view camera transitions while keeping the subject consistent.
* **3D VAE:** a custom 3D VAE compresses video into a compact latent space, making image-to-video generation more efficient.
* **Superior image-video-text alignment:** an MLLM text encoder performs strongly in image and video generation, following text instructions better, capturing details, and handling complex reasoning.
You can find more open-source details at [HunyuanVideo](https://github.com/Tencent/HunyuanVideo) and [HunyuanVideo-I2V](https://github.com/Tencent/HunyuanVideo-I2V).
This guide walks you through **text-to-video** and **image-to-video** generation in ComfyUI.
The workflow images in this tutorial contain model download information in their metadata; drag them into ComfyUI or use the menu `Workflows` -> `Open(ctrl+o)` to load them, and you will be prompted to download the models.
Model links are also provided in this guide; if automatic download fails or you are not using the Desktop version, please download the models manually.
All models can be downloaded from [here](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files)
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
## Models Shared by Both Workflows
The following models are used by both the text-to-video and image-to-video workflows; download them and save them to the specified directories
* [clip\_l.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/clip_l.safetensors?download=true)
* [llava\_llama3\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/llava_llama3_fp8_scaled.safetensors?download=true)
* [hunyuan\_video\_vae\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/vae/hunyuan_video_vae_bf16.safetensors?download=true)
Storage location:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors
│ │ └── llava_llama3_fp8_scaled.safetensors
│ ├── vae/
│ │ └── hunyuan_video_vae_bf16.safetensors
```
## Hunyuan Text-to-Video Workflow
Hunyuan text-to-video was open-sourced in December 2024. It generates 5-second short videos from natural-language descriptions and accepts both Chinese and English input.
### 1. Text-to-Video Workflow
Save the image below and drag it into ComfyUI to load the workflow

### 2. Text-to-Video Model
Download [hunyuan\_video\_t2v\_720p\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models` folder
Make sure you have all of the following model files, including the shared ones:
```
ComfyUI/
├── models/
│ ├── text_encoders/
│ │ ├── clip_l.safetensors // shared model
│ │ └── llava_llama3_fp8_scaled.safetensors // shared model
│ ├── vae/
│ │ └── hunyuan_video_vae_bf16.safetensors // shared model
│ └── diffusion_models/
│ └── hunyuan_video_t2v_720p_bf16.safetensors // T2V model
```
### 3. Complete the Workflow Step by Step
1. Make sure the following models are loaded in `DualCLIPLoader`:
* clip\_name1: clip\_l.safetensors
* clip\_name2: llava\_llama3\_fp8\_scaled.safetensors
2. Make sure `Load Diffusion Model` loads `hunyuan_video_t2v_720p_bf16.safetensors`
3. Make sure `Load VAE` loads `hunyuan_video_vae_bf16.safetensors`
4. Click `Queue`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
When the `length` of the `EmptyHunyuanLatentVideo` node is set to 1, the model can generate a static image.
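This follows from how pixel frames map to latent frames. Assuming the roughly 4x temporal compression of the 3D VAE described above, with a `(length - 1) / 4 + 1` packing rule (the exact formula is an assumption for illustration), `length = 1` collapses to a single latent frame, i.e. a still image:

```python
def latent_frames(length, temporal_compression=4):
    """Pixel-frame count -> latent-frame count, assuming (length - 1) / c + 1 packing."""
    if length < 1:
        raise ValueError("length must be >= 1")
    return (length - 1) // temporal_compression + 1

print(latent_frames(1))   # 1: a single latent frame, i.e. a static image
print(latent_frames(49))  # 13: 49 pixel frames pack into 13 latent frames
```
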
## Hunyuan Image-to-Video Workflow
The Hunyuan image-to-video model was open-sourced on March 6, 2025. Based on the HunyuanVideo framework, it turns static images into smooth, high-quality videos; LoRA training code was also released, supporting custom video effects such as hair growth and object deformation.
There are currently two versions of the model:
* v1 "concat": smoother video motion, but follows the image guidance less closely
* v2 "replace": released the day after the v1 update; better image guidance, but appears somewhat less dynamic than v1
v1 "concat"
v2 "replace"
### Models Shared by v1 and v2
Download the file below and save it to the `ComfyUI/models/clip_vision` directory
* [llava\_llama3\_vision.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/clip_vision/llava_llama3_vision.safetensors?download=true)
### v1 "concat" Image-to-Video Workflow
#### 1. Workflow and Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Download the image below; we will use it as the start frame for image-to-video

#### 2. v1 Model
* [hunyuan\_video\_image\_to\_video\_720p\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_image_to_video_720p_bf16.safetensors?download=true)
Make sure you have all of the following model files, including the shared ones:
```
ComfyUI/
├── models/
│ ├── clip_vision/
│ │ └── llava_llama3_vision.safetensors // I2V shared model
│ ├── text_encoders/
│ │ ├── clip_l.safetensors // shared model
│ │ └── llava_llama3_fp8_scaled.safetensors // shared model
│ ├── vae/
│ │ └── hunyuan_video_vae_bf16.safetensors // shared model
│ └── diffusion_models/
│ └── hunyuan_video_image_to_video_720p_bf16.safetensors // I2V v1 "concat" model
```
#### 3. Complete the Workflow Step by Step
1. Make sure the following models are loaded in `DualCLIPLoader`:
* clip\_name1: clip\_l.safetensors
* clip\_name2: llava\_llama3\_fp8\_scaled.safetensors
2. Make sure `Load CLIP Vision` loads `llava_llama3_vision.safetensors`
3. Make sure the start-frame image is uploaded in the `Load Image` node
4. Make sure `Load VAE` loads `hunyuan_video_vae_bf16.safetensors`
5. Make sure `Load Diffusion Model` loads `hunyuan_video_image_to_video_720p_bf16.safetensors`
6. Click `Queue`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
### v2 "replace" Image-to-Video Workflow
The v2 workflow is essentially the same as v1; you only need to download the **replace** model and select it in `Load Diffusion Model`.
#### 1. Workflow and Assets
Download the workflow image below and drag it into ComfyUI to load the workflow

Download the image below; we will use it as the start frame for image-to-video

#### 2. v2 Model
* [hunyuan\_video\_v2\_replace\_image\_to\_video\_720p\_bf16.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors?download=true)
Make sure you have all of the following model files, including the shared ones:
```
ComfyUI/
├── models/
│ ├── clip_vision/
│ │ └── llava_llama3_vision.safetensors // I2V shared model
│ ├── text_encoders/
│ │ ├── clip_l.safetensors // shared model
│ │ └── llava_llama3_fp8_scaled.safetensors // shared model
│ ├── vae/
│ │ └── hunyuan_video_vae_bf16.safetensors // shared model
│ └── diffusion_models/
│ └── hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors // v2 "replace" model
```
#### 3. Complete the Workflow Step by Step
1. Make sure the following models are loaded in `DualCLIPLoader`:
* clip\_name1: clip\_l.safetensors
* clip\_name2: llava\_llama3\_fp8\_scaled.safetensors
2. Make sure `Load CLIP Vision` loads `llava_llama3_vision.safetensors`
3. Make sure the start-frame image is uploaded in the `Load Image` node
4. Make sure `Load VAE` loads `hunyuan_video_vae_bf16.safetensors`
5. Make sure `Load Diffusion Model` loads `hunyuan_video_v2_replace_image_to_video_720p_bf16.safetensors`
6. Click `Queue`, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
## Start Experimenting
Below are some example images and their prompts; use them as a starting point to create your own videos.
```
Futuristic robot dancing ballet, dynamic motion, fast motion, fast shot, moving scene
```
***
```
Samurai waving sword and hitting the camera. camera angle movement, zoom in, fast scene, super fast, dynamic
```
***
```
flying car fastly moving and flying through the city
```
***
```
cyberpunk car race in night city, dynamic, super fast, fast shot
```
# LTX-Video
Source: https://docs.comfy.org/zh-CN/tutorials/video/ltxv
Generate controllable videos quickly
Drag any of the videos directly into ComfyUI to get started
## Quick Start
[LTX-Video](https://huggingface.co/Lightricks/LTX-Video) is an efficient video generation model developed by Lightricks.
The key to using this model is providing detailed, long descriptive prompts.
Download [ltx-video-2b-v0.9.5.safetensors](https://huggingface.co/Lightricks/LTX-Video/resolve/main/ltx-video-2b-v0.9.5.safetensors?download=true) and place it in the `ComfyUI/models/checkpoints` directory.
If you have not yet downloaded [t5xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/mochi_preview_repackaged/resolve/main/split_files/text_encoders/t5xxl_fp16.safetensors?download=true), place it in the `ComfyUI/models/text_encoders` directory.
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
## Multi-Frame Control
Control video generation with a series of images. Download the input images: [start frame](https://raw.githubusercontent.com/Comfy-Org/example_workflows/refs/heads/main/ltxv/multi-frame/house1.png) and [end frame](https://raw.githubusercontent.com/Comfy-Org/example_workflows/refs/heads/main/ltxv/multi-frame/house2.png).
## Image to Video
Control video generation with a first-frame image: [example first frame](https://raw.githubusercontent.com/Comfy-Org/example_workflows/refs/heads/main/ltxv/i2v/girl1.png).
## Text to Video
# ComfyUI Wan2.1 Fun Camera Official Native Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/fun-camera
This guide shows how to generate video in ComfyUI with Wan2.1 Fun Camera
## About Wan2.1 Fun Camera
**Wan2.1 Fun Camera** is a video generation project from the Alibaba team, focused on controlling video generation through camera motion.
**Model weights**:
* [14B version](https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-14B-Control-Camera)
* [1.3B version](https://huggingface.co/alibaba-pai/Wan2.1-Fun-V1.1-1.3B-Control-Camera)
**Code repository**: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
**ComfyUI now natively supports the Wan2.1 Fun Camera model.**
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
## Model Installation
You only need to install these models once. The workflow images also embed the model download information, so use whichever download method you prefer.
All of the models below can be found at [Wan\_2.1\_ComfyUI\_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged)
**Diffusion Models** (choose 1.3B or 14B):
* [wan2.1\_fun\_camera\_v1.1\_1.3B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors)
* [wan2.1\_fun\_camera\_v1.1\_14B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_camera_v1.1_14B_bf16.safetensors)
If you have used Wan2.1 models before, you may already have the following; if not, download them:
**Text Encoders** (choose one):
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors)
File storage location:
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ ├── wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors # 1.3B version
│ │ └── wan2.1_fun_camera_v1.1_14B_bf16.safetensors # 14B version
│ ├── 📂 text_encoders/
│ │ └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ ├── 📂 vae/
│ │ └── wan_2.1_vae.safetensors
│ └── 📂 clip_vision/
│ └── clip_vision_h.safetensors
```
## ComfyUI Wan2.1 Fun Camera 1.3B Native Workflow Example
### 1. Workflow Files
#### 1.1 Workflow File
Download the video below and drag it into ComfyUI to load the workflow:
Download the workflow file in JSON format
To use the 14B version, simply swap in the 14B model file, but mind the VRAM requirements.
#### 1.2 Input Image
Download the image below; we will use it as the start frame:

### 2. Complete the Workflow Step by Step
1. Make sure the correct model version is loaded:
* 1.3B version: `wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors`
* 14B version: `wan2.1_fun_camera_v1.1_14B_bf16.safetensors`
2. Make sure the `Load CLIP` node loads `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node loads `wan_2.1_vae.safetensors`
4. Make sure the `Load CLIP Vision` node loads `clip_vision_h.safetensors`
5. Upload the start frame in the `Load Image` node
6. Edit the prompt if you are using your own input image
7. Set the camera motion in the `WanCameraEmbedding` node
8. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate
## ComfyUI Wan2.1 Fun Camera 14B Workflow and Input Image
Download the workflow file in JSON format
**Input image**

## Performance Reference
**1.3B version**:
* About 72 seconds to generate 81 frames at 512×512 on an RTX 4090
**14B version**:
* An RTX 4090 (24GB VRAM) may run out of memory at 512×512; out-of-memory has also occurred on an A100 at larger sizes
# ComfyUI Wan2.1 Fun Control Video Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/fun-control
This guide shows how to use Wan2.1 Fun Control in ComfyUI to generate videos guided by control videos
## About Wan2.1-Fun-Control
**Wan2.1-Fun-Control** is an open-source video generation and control project from the Alibaba team. By introducing an innovative Control Codes mechanism combined with deep learning and multimodal conditional inputs, it generates high-quality videos that follow preset control conditions. The project focuses on precisely guiding generated video content through multimodal control conditions.
The Fun Control models currently support several control conditions, including **Canny (line art)**, **Depth**, **OpenPose (human pose)**, and **MLSD (geometric edges)**, as well as **trajectory control**.
The models also support multi-resolution video prediction (512, 768, or 1024) at 16 frames per second, generating up to 81 frames (about 5 seconds) of video.
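The frame/duration arithmetic above is easy to verify: at 16 frames per second, 81 frames last just over five seconds:

```python
def video_duration_seconds(frames, fps=16):
    """Duration in seconds for a clip with `frames` frames at `fps` frames/second."""
    return frames / fps

print(video_duration_seconds(81))  # 5.0625: 81 frames at 16 fps is ~5 seconds
```
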
Model versions:
* **1.3B** lightweight: suitable for local deployment and fast inference, with **lower VRAM requirements**
* **14B** high-performance: 32GB+ of weights, better results but **requires more VRAM**
Related repositories:
* [Wan2.1-Fun-1.3B-Control](https://huggingface.co/alibaba-pai/Wan2.1-Fun-1.3B-Control)
* [Wan2.1-Fun-14B-Control](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-Control)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
**ComfyUI now natively supports the Wan2.1 Fun Control model.** Before starting this tutorial, update your ComfyUI to a version after [this commit](https://github.com/comfyanonymous/ComfyUI/commit/3661c833bcc41b788a7c9f0e7bc48524f8ee5f82)
In this guide we provide two workflows:
* A workflow using only native Comfy Core nodes
* A workflow using custom nodes
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
Because native nodes still have gaps in video support, the fully native workflow exists so that users can run it without installing any custom nodes. However, for video generation it is currently hard to provide a good experience without custom nodes, so this guide provides both versions of the workflow.
## Model Installation
You only need to install these models once. The workflow images also embed the model download information, so use whichever download method you prefer.
The models below can be found at [Wan\_2.1\_ComfyUI\_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged) and [Wan2.1-Fun](https://huggingface.co/collections/alibaba-pai/wan21-fun-67e4fb3b76ca01241eb7e334)
Click the links to download. If you have used Wan-related workflows before, you only need to download the **Diffusion models**
**Diffusion models** (choose 1.3B or 14B; the 14B file is much larger at 32GB and needs more VRAM):
* [wan2.1\_fun\_control\_1.3B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_control_1.3B_bf16.safetensors?download=true)
* [Wan2.1-Fun-14B-Control](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-Control/blob/main/diffusion_pytorch_model.safetensors?download=true): rename the file to `Wan2.1-Fun-14B-Control.safetensors` after downloading
**Text encoders** (choose one of the two models below; the fp16 version is larger and more demanding):
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File storage location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── wan2.1_fun_control_1.3B_bf16.safetensors
│ ├── 📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └── 📂 vae/
│ │ └── wan_2.1_vae.safetensors
│ └── 📂 clip_vision/
│ └── clip_vision_h.safetensors
```
## ComfyUI Native Workflow
In this workflow we use a video converted to WebP format, because the `Load Image` node does not yet support mp4 videos. We also use Canny Edge to preprocess the original video. Since users often run into installation failures and environment problems while installing custom nodes, this version of the workflow uses only native nodes, putting the out-of-the-box experience first.
Thanks to our powerful ComfyUI node authors for their feature-rich nodes. If you want to go straight to the custom-node version, see the [workflow using custom nodes](#workflow-using-custom-nodes)
### 1. Workflow Files
#### 1.1 Workflow File
Download the image below and drag it into ComfyUI to load the workflow

#### 1.2 Input Image and Video
Download the image and video below; we will use them as input.


### 2. Complete the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node loads `wan2.1_fun_control_1.3B_bf16.safetensors`
2. Make sure the `Load CLIP` node loads `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node loads `wan_2.1_vae.safetensors`
4. Make sure the `Load CLIP Vision` node loads `clip_vision_h.safetensors`
5. Upload the start frame in the `Load Image` node (renamed `Start_image`)
6. Upload the control video in the second `Load Image` node. Note: this node does not yet support mp4; use a WebP video
7. (Optional) Edit the prompt; both Chinese and English work
8. (Optional) Adjust the video dimensions in `WanFunControlToVideo`; avoid oversized dimensions
9. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
### 3. Usage Notes
* The frame count fed into `WanFunControlToVideo` needs to match the control video. If the requested frame count exceeds the actual number of control-video frames, the extra frames will not follow the control conditions; this is solved in the [workflow using custom nodes](#workflow-using-custom-nodes)
* Use something like [ComfyUI-comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for richer control
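One way to picture the first point: if the requested frame count exceeds the control clip, the trailing frames receive no control signal. A sketch of clamping the requested length to the control clip (illustrative only; the native nodes do not do this automatically):

```python
def clamp_length_to_control(requested_frames, control_frames):
    """Never ask for more frames than the control video provides."""
    return min(requested_frames, control_frames)

print(clamp_length_to_control(81, 49))  # 49: only 49 frames are actually controlled
```

The custom-node workflow below reads the real frame count from the loaded video, which achieves the same effect without manual bookkeeping.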
## Workflow Using Custom Nodes
We will need to install the following two custom node packs:
* [ComfyUI-VideoHelperSuite](https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite)
* [ComfyUI-comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
You can install the missing nodes with [ComfyUI Manager](https://github.com/Comfy-Org/ComfyUI-Manager) or follow each node pack's installation instructions
### 1. Workflow Files
#### 1.1 Workflow File
Download the image below and drag it into ComfyUI to load the workflow

Because the video file is large, you can also click [here](https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/wan2.1_fun_control/wan2.1_fun_control_use_custom_nodes.json) to download the workflow in JSON format.
#### 1.2 Input Image and Video
Download the image and video below; they will be used as input

### 2. Complete the Workflow Step by Step
> The model setup is essentially the same; if you have already tried the native-only workflow, you can simply upload the inputs and run
1. Make sure the `Load Diffusion Model` node loads `wan2.1_fun_control_1.3B_bf16.safetensors`
2. Make sure the `Load CLIP` node loads `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node loads `wan_2.1_vae.safetensors`
4. Make sure the `Load CLIP Vision` node loads `clip_vision_h.safetensors`
5. Upload the start frame in the `Load Image` node
6. Upload an mp4 video in the `Load Video(Upload)` custom node; note that this workflow adjusts the default `frame_load_cap`
7. The `DWPose Estimator` uses only the `detect_face` option for the current image
8. (Optional) Edit the prompt; both Chinese and English work
9. (Optional) Adjust the video dimensions in `WanFunControlToVideo`; avoid oversized dimensions
10. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
### 3. Workflow Notes
Thanks to the ComfyUI community authors for their custom node packs
* This example uses `Load Video(Upload)` to support mp4 videos
* The `video_info` obtained from `Load Video(Upload)` lets us keep the same `fps` for the output video
* You can replace `DWPose Estimator` with other preprocessors from the `ComfyUI-comfyui_controlnet_aux` node pack
## Usage Tips
* A useful trick: combine several image preprocessing techniques, then use the `Image Blend` node to apply multiple control methods at once.
* You can use the `Video Combine` node from `ComfyUI-VideoHelperSuite` to save the output video as mp4
* We use `SaveAnimatedWEBP` because embedding workflow metadata in **mp4** is not yet supported, and some custom nodes may not implement workflow embedding; to keep the workflow stored in the video, we chose the `SaveAnimatedWEBP` node.
* Avoid oversized dimensions, which can make sampling very slow; try generating at a small size first, then upscale
* Use your imagination: build text-to-image or other workflows on top of this one to go straight from text to video style transfer
* In the `WanFunControlToVideo` node, `control_video` is not required, so you can sometimes skip the control video, first generate a very small video such as 320x320, and then use it as the control-video input to get deterministic results
## Other Wan2.1 Fun Control or Video-Related Custom Nodes
* [ComfyUI-WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)
* [ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes)
# ComfyUI Wan2.1 Fun InP Video Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/fun-inp
This guide shows how to complete Wan2.1 Fun InP first/last-frame video generation examples in ComfyUI
## About Wan2.1-Fun-InP
Wan-Fun InP is an open-source video generation model from Alibaba, part of the Wan2.1-Fun series, focused on generating video from images with first/last-frame control.
**Core features**:
* First/last-frame control: generates the transition video between a given first and last frame, improving coherence and creative freedom. Compared with earlier community versions, the official Alibaba model is more stable with significantly better quality.
* Multi-resolution support: generates video at 512×512, 768×768, 1024×1024, and other resolutions to fit different scenarios.
**Model versions**:
* 1.3B lightweight: suitable for local deployment and fast inference, with lower VRAM requirements
* 14B high-performance: 32GB+ of weights, better results but requires more VRAM
Related model weights and repositories:
* [Wan2.1-Fun-1.3B-Input](https://huggingface.co/alibaba-pai/Wan2.1-Fun-1.3B-Input)
* [Wan2.1-Fun-14B-Input](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-Input)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
Make sure your ComfyUI is up to date.
* [ComfyUI download](https://www.comfy.org/download)
* [ComfyUI update guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may not be up to date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not on the latest nightly build.
2. You are on the stable or desktop release (which does not include the latest updates).
3. Some nodes failed to import at startup.
ComfyUI now natively supports the Wan2.1 Fun InP model. Before starting this tutorial, update your ComfyUI to a version after [this commit](https://github.com/comfyanonymous/ComfyUI/commit/0a1f8869c9998bbfcfeb2e97aa96a6d3e0a2b5df)
## Wan2.1 Fun InP Workflow
Download the image below and drag it into ComfyUI to load the workflow

### 1. Workflow File Download
### 2. Manual Model Installation
If the automatic model download fails, download the models manually and save them to the corresponding folders
The models below can be found at [Wan\_2.1\_ComfyUI\_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged) and [Wan2.1-Fun](https://huggingface.co/collections/alibaba-pai/wan21-fun-67e4fb3b76ca01241eb7e334)
**Diffusion models** (choose 1.3B or 14B; the 14B file is much larger at 32GB and needs more VRAM):
* [wan2.1\_fun\_inp\_1.3B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_fun_inp_1.3B_bf16.safetensors?download=true)
* [Wan2.1-Fun-14B-InP](https://huggingface.co/alibaba-pai/Wan2.1-Fun-14B-InP/resolve/main/diffusion_pytorch_model.safetensors?download=true): rename the file to `Wan2.1-Fun-14B-InP.safetensors` after downloading
**Text encoders** (choose one of the two models below; the fp16 version is larger and more demanding):
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File storage location
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └── wan2.1_fun_inp_1.3B_bf16.safetensors
│ ├── 📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └── 📂 vae/
│ │ └── wan_2.1_vae.safetensors
│ └── 📂 clip_vision/
│ └── clip_vision_h.safetensors
```
### 3. Complete the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node loads `wan2.1_fun_inp_1.3B_bf16.safetensors`
2. Make sure the `Load CLIP` node loads `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node loads `wan_2.1_vae.safetensors`
4. Make sure the `Load CLIP Vision` node loads `clip_vision_h.safetensors`
5. Upload the start frame in the `Load Image` node (renamed `Start_image`)
6. Upload the end frame in the second `Load Image` node
7. (Optional) Edit the prompt; both Chinese and English work
8. (Optional) Adjust the video dimensions in `WanFunInpaintToVideo`; avoid oversized dimensions
9. Click `Run`, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
### 4. Workflow Notes
Make sure to use the correct model: `wan2.1_fun_inp_1.3B_bf16.safetensors` and `wan2.1_fun_control_1.3B_bf16.safetensors` are saved in the same folder and have very similar names.
* When experimenting with Wan Fun InP, you may need to tweak the prompt frequently to get accurate transitions between frames
## Other Wan2.1 Fun InP or Video-Related Custom Nodes
* [ComfyUI-VideoHelperSuite](https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite)
* [ComfyUI-WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)
* [ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes)
# ComfyUI Wan2.1 VACE Video Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/vace
This guide shows how to complete Wan2.1 VACE video generation examples in ComfyUI
Since the templates have been updated and usage notes for the CausVid LoRA have been added, this document needs an update, which will take some time to prepare; in the meantime, please refer to the notes in the templates
## About VACE
VACE 14B is an open-source unified video editing model from the Alibaba Wan team. By integrating multi-task capabilities, supporting high-resolution processing, and offering a flexible multimodal input mechanism, it significantly improves the efficiency and quality of video creation.
The model is open-sourced under the [Apache-2.0](https://github.com/ali-vilab/VACE?tab=Apache-2.0-1-ov-file) license and can be used for personal commercial purposes.
A summary of its core features and technical highlights:
* Multimodal input: supports text, images, video, masks, control signals, and other input forms
* Unified architecture: a single model supports multiple tasks with freely combinable capabilities
* Motion transfer: generates coherent motion from a reference video
* Local replacement: replaces specific regions of a video via masks
* Video extension: completes motion or extends the background
* Background replacement: keeps the subject while swapping the environment
VACE has been released in 1.3B and 14B versions; compared with 1.3B, the 14B version supports 720P output with better detail and stability.
| 模型 | 480P | 720P |
| ----------------------------------------------------------- | ---- | ---- |
| [VACE-1.3B](https://huggingface.co/Wan-AI/Wan2.1-VACE-1.3B) | ✅ | ❌ |
| [VACE-14B](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B) | ✅ | ✅ |
Related model weights and repositories:
* [VACE-1.3B](https://huggingface.co/Wan-AI/Wan2.1-VACE-1.3B)
* [VACE-14B](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B)
* [Github](https://github.com/ali-vilab/VACE)
* [VACE 项目主页](https://ali-vilab.github.io/VACE-Page/)
## Model Download and Loading in the Workflow
Since the workflows in this document all use the same workflow template, we first cover model download and loading, and then enable/disable different inputs by bypassing different nodes to obtain the different workflows.
Model download information is also embedded in each example's workflow, so you can complete the model downloads while downloading each example's workflow.
### Model Download
**diffusion\_models**
[wan2.1\_vace\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_vace_14B_fp16.safetensors)
[wan2.1\_vace\_1.3B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_vace_1.3B_fp16.safetensors)
If you have used Wan Video workflows before, you have already downloaded the following model files.
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
从**Text encoders** 选择一个版本进行下载
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
文件保存位置
```
📂 ComfyUI/
├── 📂 models/
│ ├── 📂 diffusion_models/
│ │ └─── wan2.1_vace_14B_fp16.safetensors # 或 wan2.1_vace_1.3B_fp16.safetensors
│ ├── 📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors # 或 umt5_xxl_fp16.safetensors
│ └── 📂 vae/
│ └── wan_2.1_vae.safetensors
```
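The save locations above mirror the `split_files/<subfolder>/<file>` layout of the download URLs, so the destination path can be computed mechanically from a link. A minimal sketch (the helper name is illustrative, not part of ComfyUI):

```python
from urllib.parse import urlparse

def dest_path(url: str, comfy_root: str = "ComfyUI") -> str:
    """Map a repackaged-model URL (.../split_files/<kind>/<file>) to its
    save location under ComfyUI/models/, following the layout shown above."""
    parts = urlparse(url).path.split("/")
    kind, name = parts[-2], parts[-1]  # e.g. "vae", "wan_2.1_vae.safetensors"
    return f"{comfy_root}/models/{kind}/{name}"

url = "https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true"
print(dest_path(url))  # -> ComfyUI/models/vae/wan_2.1_vae.safetensors
```

The query string (`?download=true`) is dropped automatically by `urlparse`, so links copied straight from this page work as-is.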
### Model Loading
Since this guide uses the same models and the same workflow throughout, with only some nodes bypassed to enable/disable different inputs, refer to the image below to make sure the corresponding models are loaded correctly in each workflow:
1. Make sure the `Load Diffusion Model` node has loaded `wan2.1_vace_14B_fp16.safetensors`
2. Make sure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors` or `umt5_xxl_fp16.safetensors`
3. Make sure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
### How to Toggle a Node's Bypass State
When a node is set to Bypass, data passing through it is output as-is, unaffected by the node. Below are three ways to toggle a node's Bypass state.
We usually bypass nodes we temporarily don't need, rather than deleting them and changing the workflow.
1. Select the node, then click the arrow in the selection toolbox to quickly toggle its Bypass state
2. Select the node, right-click it, and choose `Mode` -> `Always` to switch it back to Always mode
3. Select the node, right-click it, and choose the `Bypass` option to toggle its Bypass state
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## VACE Text-to-Video Workflow
If you cannot load the workflow from the mp4 file, make sure your ComfyUI frontend is up to date; see [requirements.txt](https://github.com/comfyanonymous/ComfyUI/blob/master/requirements.txt) for the frontend version required to load workflows from mp4 files.
Currently, 1.19.9 is the latest ComfyUI frontend version in requirements.txt.
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow.
### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to make sure the workflow runs successfully.
1. Enter the positive prompt in the `CLIP Text Encode (Positive Prompt)` node
2. Enter the negative prompt in the `CLIP Text Encode (Negative Prompt)` node
3. In `WanVaceToVideo`, set the output size (640×640 is recommended for a first run) and the frame count (video length)
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
5. When generation finishes, the video is saved automatically under `ComfyUI/output/video` (the subfolder depends on the `save video` node settings)
In testing on an RTX 4090:
* 720×1280 resolution, 81 frames: about 40 minutes
* 640×640 resolution, 49 frames: about 7 minutes
The trade-off is that the 720P video quality is better.
## VACE Image-to-Video Workflow
You can keep using the workflow file above: just un-bypass the `Load image` node in **Load reference image** and provide an image. You can also use the file below, in which we have already set the corresponding parameters.
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow.
Download the image below to use as the input image

### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to make sure the workflow runs successfully.
1. Provide the image in the `Load image` node
2. You can modify and edit the prompts just as in the text-to-video workflow
3. In `WanVaceToVideo`, set the output size (640×640 is recommended for a first run) and the frame count (video length)
4. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
5. When generation finishes, the video is saved automatically under `ComfyUI/output/video` (the subfolder depends on the `save video` node settings)
You may want to use a node that reads the image size to set the resolution automatically, but because the node has step requirements for width and height, an error can occur if your image dimensions are not divisible by 16.
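The divisibility issue above can be avoided by snapping the measured image size down to the nearest valid step before feeding it to the workflow. A minimal sketch, assuming the step of 16 mentioned above (the function name is illustrative):

```python
def snap_to_step(width: int, height: int, step: int = 16) -> tuple[int, int]:
    """Round dimensions down to the nearest multiple of `step` so they
    satisfy the width/height step requirement described above."""
    return (width // step) * step, (height // step) * step

print(snap_to_step(1000, 562))  # -> (992, 560)
```

Dimensions that are already multiples of 16, such as the recommended 640×640, pass through unchanged.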
### 3. Additional Workflow Notes
VACE also supports placing multiple reference images in a single input image to generate a video; see the related [examples](https://ali-vilab.github.io/VACE-Page/) on the VACE project page
## VACE Video-to-Video Workflow
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow
We will use the following assets as input:
1. The input image used as the reference image

2. The video below has been preprocessed and will be used to control video generation
3. The video below is the original; you can download it and preprocess it with nodes such as [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux)
### 2. Run the Workflow Step by Step
Follow the numbered steps in the image to make sure the workflow runs successfully.
1. Provide the reference image in the `Load Image` node under `Load reference image`
2. Provide the control video in the `Load Video` node under `Load control video`; the provided video is already preprocessed, so no extra processing is needed
3. If you need to preprocess an original video yourself, modify the `Image preprocessing` group, or use `comfyui_controlnet_aux` nodes to do the preprocessing
4. Edit the prompts
5. In `WanVaceToVideo`, set the output size (640×640 is recommended for a first run) and the frame count (video length)
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
7. When generation finishes, the video is saved automatically under `ComfyUI/output/video` (the subfolder depends on the `save video` node settings)
## VACE Video Extension Workflow
\[To be updated]
## VACE First-Last Frame Video Generation
\[To be updated]
For the first and last frames to take effect, the settings must satisfy:
* The video `length` must be set so that `length - 1` is divisible by `4`
* `Batch_size` must satisfy `Batch_size = length - 2`
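The two constraints above can be checked with a small helper before queueing the workflow (names are illustrative):

```python
def check_flf_settings(length: int, batch_size: int) -> list[str]:
    """Validate the two first/last-frame constraints listed above."""
    problems = []
    if (length - 1) % 4 != 0:
        problems.append(f"length={length}: length - 1 must be divisible by 4")
    if batch_size != length - 2:
        problems.append(f"Batch_size={batch_size}: expected length - 2 = {length - 2}")
    return problems

print(check_flf_settings(81, 79))  # -> [] (valid: 80 is divisible by 4, 79 == 81 - 2)
```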
## Related Node Documentation
See the documentation below to learn about the related nodes
WanVaceToVideo node documentation
ComfyUI TrimVideoLatent node documentation
# Wan ATI ComfyUI Native Workflow Tutorial
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan-ati
Control video generation with trajectories.
**ATI (Any Trajectory Instruction)** is a controllable video generation framework from the ByteDance team. Built on Wan2.1, ATI supports unified control of objects, local regions, and camera motion in a video via arbitrary trajectory instructions.
Project page: [https://github.com/bytedance/ATI](https://github.com/bytedance/ATI)
## Key Features
* **Unified motion control**: trajectory control over objects, local regions, camera movement, and other motion types.
* **Interactive trajectory editor**: a visual tool for freely drawing and editing motion trajectories on an image.
* **Wan2.1 compatible**: based on the official Wan2.1 implementation, with compatible environment and model structure.
* **Rich visualization tools**: supports visualizing input trajectories, output videos, and trajectory overlays.
## WAN ATI Trajectory Control Workflow Example
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
### 1. Workflow Download
Download the video below and drag it into ComfyUI to load the corresponding workflow
We will use the following image as input:

### 2. Model Download
If the model files in the workflow did not download successfully, try downloading them manually with the links below
**Diffusion Model**
* [Wan2\_1-I2V-ATI-14B\_fp8\_e4m3fn.safetensors](https://huggingface.co/Kijai/WanVideo_comfy/resolve/main/Wan2_1-I2V-ATI-14B_fp8_e4m3fn.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**Text encoders** Choose one of the following models
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**clip\_vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors)
File save location
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ └───Wan2_1-I2V-ATI-14B_fp8_e4m3fn.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors # or other version
│ ├───📂 clip_vision/
│ │ └─── clip_vision_h.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Run the Workflow Step by Step
Follow the numbered steps in the image to make sure the workflow runs successfully.
1. Make sure the `Load Diffusion Model` node has loaded the `Wan2_1-I2V-ATI-14B_fp8_e4m3fn.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. Make sure the `Load CLIP Vision` node has loaded the `clip_vision_h.safetensors` model
5. Upload the provided input image in the `Load Image` node
6. Trajectory editing: ComfyUI does not yet have a built-in trajectory editor; you can use the link below to edit trajectories
* [Online trajectory editing tool](https://comfyui-wiki.github.io/Trajectory-Annotation-Tool/)
7. To modify the prompts (positive and negative), edit them in the `CLIP Text Encoder` nodes at number `5`
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
# ComfyUI Wan2.1 FLF2V Native Example
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan-flf
This guide shows how to complete Wan2.1 FLF2V video generation examples in ComfyUI
Wan FLF2V (first-last frame video generation) is an open-source video generation model from Alibaba's Tongyi Wanxiang team, released under the [Apache 2.0](https://github.com/Wan-Video/Wan2.1?tab=Apache-2.0-1-ov-file) license.
Given just two images, a start frame and an end frame, the model automatically generates the intermediate transition frames, producing a logically coherent, naturally flowing 720p HD video.
**Core Technical Highlights**
1. **Precise first-last frame control**: the first/last frame match rate reaches 98%; the start and end frames define the video's boundaries, and the model intelligently fills in the dynamic changes in between, enabling effects such as scene transitions and object morphing.
2. **Stable, smooth video generation**: CLIP semantic features and cross-attention reduce video jitter by 37% compared with similar models, ensuring natural, smooth transitions.
3. **Versatile creative capabilities**: supports dynamic embedding of Chinese and English subtitles and multiple styles (anime, realistic, fantasy, etc.) for different creative needs.
4. **720p HD output**: directly generates 1280×720 video with no post-processing, suitable for social media and commercial applications.
5. **Open-source ecosystem**: model weights, code, and training framework are fully open-sourced, with support for mainstream AI platform deployment.
**Technical Principles and Architecture**
1. **DiT architecture**: based on diffusion models and the Diffusion Transformer architecture, with a Full Attention mechanism to optimize spatiotemporal dependency modeling and ensure video coherence.
2. **3D causal variational encoder**: Wan-VAE compresses HD frames to 1/128 of their size while preserving fine dynamic detail, significantly reducing VRAM requirements.
3. **Three-stage training strategy**: pretraining starts at 480P and scales up to 720P, balancing generation quality and compute cost through staged optimization.
**Links**
* **GitHub repository**: [GitHub](https://github.com/Wan-Video/Wan2.1)
* **Hugging Face model page**: [Hugging Face](https://huggingface.co/Wan-AI/Wan2.1-FLF2V-14B-720P)
* **ModelScope**: [ModelScope](https://www.modelscope.cn/models/Wan-AI/Wan2.1-FLF2V-14B-720P)
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## Wan2.1 FLF2V 720P ComfyUI Native Workflow Example
### 1. Download the Workflow File and Input Files
Since this model is trained on high-resolution images, smaller sizes may not give good results. This example uses 720×1280, which may be hard for low-VRAM users to run, and generation will be very slow. If needed, reduce the video size at first, but only for testing; for final results, use a size around 720×1280.
Download the WebP file below and drag it into ComfyUI to load the corresponding workflow; the model download information is embedded in the workflow.

Download the two images below; we will use them as the start and end frames of the video


### 2. Manual Model Installation
All models used in this guide can be found [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files).
**diffusion\_models** Choose one version based on your hardware; the FP8 version requires less VRAM
* FP16:[wan2.1\_flf2v\_720p\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_flf2v_720p_14B_fp16.safetensors?download=true)
* FP8:[wan2.1\_flf2v\_720p\_14B\_fp8\_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/diffusion_models/wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors)
If you have run Wan Video workflows before, you may already have the following files.
Choose one version from **Text encoders** to download:
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File save location
```
ComfyUI/
├── models/
│ ├── diffusion_models/
│ │ └─── wan2.1_flf2v_720p_14B_fp16.safetensors # or the FP8 version
│ ├── text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors # or the version you chose
│ ├── vae/
│ │ └── wan_2.1_vae.safetensors
│ └── clip_vision/
│ └── clip_vision_h.safetensors
```
### 3. Run the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded `wan2.1_flf2v_720p_14B_fp16.safetensors` or `wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors`
2. Make sure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
4. Make sure the `Load CLIP Vision` node has loaded `clip_vision_h.safetensors`
5. Upload the start frame in the `Start_image` node
6. Upload the end frame in the `End_image` node
7. (Optional) Edit the positive and negative prompts; both Chinese and English work
8. (**Important**) In `WanFirstLastFrameToVideo`, set the video size. We default to 720×1280 because this is a 720P model; smaller sizes will not give good results.
9. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
# ComfyUI Wan2.1 Video Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan-video
This guide shows how to complete Wan2.1 Video video generation examples in ComfyUI
The Wan2.1 Video series is a set of video generation models open-sourced by Alibaba in February 2025 under the [Apache 2.0](https://github.com/Wan-Video/Wan2.1?tab=Apache-2.0-1-ov-file) license, available in 14B (14 billion parameters) and 1.3B (1.3 billion parameters) versions, covering text-to-video (T2V), image-to-video (I2V), and other tasks.
These models not only outperform existing open-source models; more importantly, the lightweight version runs on just 8GB of VRAM, greatly lowering the barrier to entry.
* [Wan2.1 code repository](https://github.com/Wan-Video/Wan2.1)
* [Wan2.1 model repositories](https://huggingface.co/Wan-AI)
## Wan2.1 ComfyUI Native Workflow Examples
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
## Model Installation
All models used in this guide can be found [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files). Below are the shared models used by the examples in this guide, which you can download in advance:
Choose one version from **Text encoders** to download:
* [umt5\_xxl\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp16.safetensors?download=true)
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors?download=true)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors?download=true)
**CLIP Vision**
* [clip\_vision\_h.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors?download=true)
File save location
```
ComfyUI/
├── models/
│ ├── diffusion_models/
│ │ └── ... # explained in the corresponding workflow sections below
│ ├── text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ ├── vae/
│ │ └── wan_2.1_vae.safetensors
│ └── clip_vision/
│ └── clip_vision_h.safetensors
```
For the diffusion models, this guide uses the fp16 precision versions, because we found fp16 to perform better than bf16. If you need other precisions, download them [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models)
## Wan2.1 Text-to-Video Workflow
Before starting, download [wan2.1\_t2v\_1.3B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_fp16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models/` directory.
> If you need other t2v precision versions, download them [here](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/diffusion_models)
### 1. Download the Workflow File
Download the file below and drag it into ComfyUI to load the corresponding workflow

### 2. Run the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded the `wan2.1_t2v_1.3B_fp16.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. (Optional) The video size is set in the `EmptyHunyuanLatentVideo` node; modify it if needed
5. (Optional) To modify the prompts (positive and negative), edit them in the `CLIP Text Encoder` nodes at number `5`
6. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
## Wan2.1 Image-to-Video Workflow
**Since Wan Video provides separate 480P and 720P models**, this guide gives examples for both resolutions; besides using different models, they also have slightly different parameters.
### 480P Version
#### 1. Workflow and Input Image
Download the image below and drag it into ComfyUI to load the corresponding workflow

We will use the following image as input:

#### 2. Model Download
Download [wan2.1\_i2v\_480p\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_i2v_480p_14B_fp16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models/` directory
#### 3. Run the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded the `wan2.1_i2v_480p_14B_fp16.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. Make sure the `Load CLIP Vision` node has loaded the `clip_vision_h.safetensors` model
5. Upload the provided input image in the `Load Image` node
6. (Optional) Enter a description of the video you want to generate in the `CLIP Text Encoder` node
7. (Optional) The video size is set in the `WanImageToVideo` node; modify it if needed
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
### 720P Version
#### 1. Workflow and Input Image
Download the image below and drag it into ComfyUI to load the corresponding workflow

We will use the following image as input:

#### 2. Model Download
Download [wan2.1\_i2v\_720p\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_i2v_720p_14B_fp16.safetensors?download=true) and save it to the `ComfyUI/models/diffusion_models/` directory
#### 3. Run the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded the `wan2.1_i2v_720p_14B_fp16.safetensors` model
2. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
3. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
4. Make sure the `Load CLIP Vision` node has loaded the `clip_vision_h.safetensors` model
5. Upload the provided input image in the `Load Image` node
6. (Optional) Enter a description of the video you want to generate in the `CLIP Text Encoder` node
7. (Optional) The video size is set in the `WanImageToVideo` node; modify it if needed
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
# ComfyUI Wan2.2 Fun Camera Control Video Generation Workflow Example
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan2-2-fun-camera
This guide shows how to generate videos with camera control using Wan2.2 Fun Camera Control in ComfyUI
**Wan2.2-Fun-Camera-Control** is a new-generation video generation and camera control model from the Alibaba PAI team. By introducing an innovative Camera Control Codes mechanism, combined with deep learning and multimodal conditional inputs, it can generate high-quality videos that follow preset camera motion. The model is released under the **Apache 2.0 license** and supports commercial use.
**Core features**:
* **Camera motion control**: supports multiple camera motions, including **Pan Up**, **Pan Down**, **Pan Left**, **Pan Right**, **Zoom In**, and **Zoom Out**, as well as combined motion control
* **High-quality video generation**: based on the Wan2.2 architecture, producing cinema-grade video
Related model weights and code repositories:
* [🤗Wan2.2-Fun-A14B-Control-Camera](https://huggingface.co/alibaba-pai/Wan2.2-Fun-A14B-Control-Camera)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## Wan2.2 Fun Camera Control Video Generation Workflow Example
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
The workflows provided here come in two versions:
1. One using lightx2v's [Wan2.2-Lightning](https://huggingface.co/lightx2v/Wan2.2-Lightning) 4-step LoRA: generation is much faster, though the video may lose some motion dynamics
2. An fp8\_scaled version without the acceleration LoRA
Below are test results on an RTX 4090D GPU with 24GB of VRAM, at 640×640 resolution and 81 frames:
| Model type | Resolution | VRAM usage | First run | Second run |
| ---------------------- | ------- | ---- | ------ | ------- |
| fp8\_scaled | 640×640 | 84% | ≈ 536s | ≈ 513s |
| fp8\_scaled + 4-step LoRA | 640×640 | 89% | ≈ 108s | ≈ 71s |
Because the 4-step LoRA gives first-time users a better experience, despite some possible loss of motion dynamics, the accelerated LoRA version is enabled by default. To switch to the other workflow group, box-select it and press **Ctrl+B** to enable it
### 1. Workflow and Asset Download
Download the video or JSON file below and drag it into ComfyUI to load the corresponding workflow; the workflow will prompt you to download the models
Download the workflow in JSON format
Download the image below; we will use it as input.

### 2. Model Links
You can find the models below at [Wan\_2.2\_ComfyUI\_Repackaged](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged)
**Diffusion Model**
* [wan2.2\_fun\_camera\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_fun\_camera\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_camera_low_noise_14B_fp8_scaled.safetensors)
**Wan2.2-Lightning LoRA (optional, for acceleration)**
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_high\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors)
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_low\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
File save location
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_fun_camera_low_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors
│ ├───📂 loras/
│ │ ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│ │ └─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Run the Workflow Step by Step
This workflow uses LoRAs; make sure each Diffusion model is paired with its matching LoRA: the high noise and low noise models must be used with their corresponding LoRAs
1. **High noise** model and **LoRA** loading
* Make sure the `Load Diffusion Model` node has loaded the `wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors` model
* Make sure the `LoraLoaderModelOnly` node has loaded `wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors`
2. **Low noise** model and **LoRA** loading
* Make sure the `Load Diffusion Model` node has loaded the `wan2.2_fun_camera_low_noise_14B_fp8_scaled.safetensors` model
* Make sure the `LoraLoaderModelOnly` node has loaded `wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors`
3. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
4. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
5. Upload the start frame in the `Load Image` node
6. Edit the prompt; both Chinese and English work
7. Set the camera control parameters in the `WanCameraEmbedding` node:
* **Camera Motion**: choose the camera motion type (Zoom In, Zoom Out, Pan Up, Pan Down, Pan Left, Pan Right, Static, etc.)
* **Width/Height**: set the video resolution
* **Length**: set the number of frames (default 81)
* **Speed**: set the video speed (default 1.0)
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
# ComfyUI Wan2.2 Fun Control Video Control Generation Example
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan2-2-fun-control
This guide shows how to generate videos from a control video using Wan2.2 Fun Control in ComfyUI
**Wan2.2-Fun-Control** is a new-generation video generation and control model from the Alibaba PAI team. By introducing an innovative Control Codes mechanism, combined with deep learning and multimodal conditional inputs, it can generate high-quality videos that follow preset control conditions. The model is released under the **Apache 2.0 license** and supports commercial use.
**Core features**:
* **Multimodal control**: supports multiple control conditions, including **Canny (line art)**, **Depth**, **OpenPose (human pose)**, and **MLSD (geometric edges)**, as well as **trajectory control**
* **High-quality video generation**: based on the Wan2.2 architecture, producing cinema-grade video
* **Multilingual support**: supports prompts in Chinese, English, and other languages
Related model weights and code repositories:
* [🤗Wan2.2-Fun-A14B-Control](https://huggingface.co/alibaba-pai/Wan2.2-Fun-A14B-Control)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## ComfyOrg Wan2.2 Fun InP & Control YouTube Live Replays
We hosted live streams on using Wan2.2 in ComfyUI; you can watch the replays to learn how to use it
## Wan2.2 Fun Control Video Control Generation Workflow Example
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
The workflows provided here come in two versions:
1. One using lightx2v's [Wan2.2-Lightning](https://huggingface.co/lightx2v/Wan2.2-Lightning) 4-step LoRA: generation is much faster, though the video may lose some motion dynamics
2. An fp8\_scaled version without the acceleration LoRA
Below are test results on an RTX 4090D GPU with 24GB of VRAM, at 640×640 resolution and 81 frames:
| Model type | Resolution | VRAM usage | First run | Second run |
| ---------------------- | ------- | ---- | ------ | ------- |
| fp8\_scaled | 640×640 | 83% | ≈ 524s | ≈ 520s |
| fp8\_scaled + 4-step LoRA | 640×640 | 89% | ≈ 138s | ≈ 79s |
Because the 4-step LoRA gives first-time users a better experience, despite some possible loss of motion dynamics, the accelerated LoRA version is enabled by default. To switch to the other workflow group, box-select it and press **Ctrl+B** to enable it
### 1. Workflow and Asset Download
Download the video or JSON file below and drag it into ComfyUI to load the corresponding workflow
Download the workflow in JSON format
Download the image and video below; we will use them as input.

> The video here has been preprocessed and can be used directly to control video generation
### 2. Manual Model Download
You can find the models below at [Wan\_2.2\_ComfyUI\_Repackaged](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged)
**Diffusion Model**
* [wan2.2\_fun\_control\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_fun\_control\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors)
**Wan2.2-Lightning LoRA (optional, for acceleration)**
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_high\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors)
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_low\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
File save location
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors
│ ├───📂 loras/
│ │ ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│ │ └─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Run the Workflow Step by Step
This workflow uses LoRAs; make sure each Diffusion model is paired with its matching LoRA: the high noise and low noise models must be used with their corresponding LoRAs
1. **High noise** model and **LoRA** loading
* Make sure the `Load Diffusion Model` node has loaded the `wan2.2_fun_control_high_noise_14B_fp8_scaled.safetensors` model
* Make sure the `LoraLoaderModelOnly` node has loaded `wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors`
2. **Low noise** model and **LoRA** loading
* Make sure the `Load Diffusion Model` node has loaded the `wan2.2_fun_control_low_noise_14B_fp8_scaled.safetensors` model
* Make sure the `LoraLoaderModelOnly` node has loaded `wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors`
3. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
4. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
5. Upload the start frame in the `Load Image` node
6. Provide the pose control video in the second `Load video` node; the provided video is already preprocessed and can be used directly
7. Because the provided video is a preprocessed pose video, the corresponding video image preprocessing nodes must be disabled; select them and press `Ctrl + B` to disable them
8. Edit the prompt; both Chinese and English work
9. In `Wan22FunControlToVideo`, set the video size; the default is 640×640 so that this workflow is not too slow for low-VRAM users
10. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
### Additional Notes
Since Canny is the only preprocessor among ComfyUI's built-in nodes, you can use something like [comfyui\_controlnet\_aux](https://github.com/Fannovel16/comfyui_controlnet_aux) for other types of image preprocessing
# ComfyUI Wan2.2 Fun Inp First-Last Frame Video Generation Example
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan2-2-fun-inp
This guide shows how to complete Wan2.2 Fun Inp first-last frame video generation examples in ComfyUI
**Wan2.2-Fun-Inp** is a first-last-frame-controlled video generation model from the Alibaba PAI team. It takes a **start frame and an end frame** as input and generates the transition video in between, giving creators stronger creative control. The model is released under the **Apache 2.0 license** and supports commercial use.
**Core features**:
* **First-last frame control**: takes start and end frame images as input and generates the transition video in between, improving coherence and creative freedom
* **High-quality video generation**: based on the Wan2.2 architecture, producing cinema-grade video
* **Multi-resolution support**: generates video at 512×512, 768×768, 1024×1024, and other resolutions for different scenarios
**Model version**:
* **14B high-performance version**: over 32GB in size; better results but requires more VRAM
Related model weights and code repositories:
* [🤗Wan2.2-Fun-Inp-14B](https://huggingface.co/alibaba-pai/Wan2.2-Fun-A14B-InP)
* Code repository: [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun)
## ComfyOrg Wan2.2 Fun InP & Control YouTube Live Replays
We hosted live streams on using Wan2.2 in ComfyUI; you can watch the replays to learn how to use it
## Wan2.2 Fun Inp First-Last Frame Video Generation Workflow Example
Make sure your ComfyUI is up to date.
* [ComfyUI Download](https://www.comfy.org/download)
* [ComfyUI Update Guide](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you can't find them, your ComfyUI may be out of date.
If nodes are missing when loading a workflow, possible reasons are:
1. You are not using the latest nightly (development) version.
2. You are using the stable or desktop release, which does not include the latest updates.
3. Some nodes failed to import at startup.
The workflows provided here come in two versions:
1. One using lightx2v's [Wan2.2-Lightning](https://huggingface.co/lightx2v/Wan2.2-Lightning) 4-step LoRA: generation is much faster, though the video may lose some motion dynamics
2. An fp8\_scaled version without the acceleration LoRA
Below are test results on an RTX 4090D GPU with 24GB of VRAM, at 640×640 resolution and 81 frames:
| Model type | Resolution | VRAM usage | First run | Second run |
| ---------------------- | ------- | ---- | ------ | ------- |
| fp8\_scaled | 640×640 | 83% | ≈ 524s | ≈ 520s |
| fp8\_scaled + 4-step LoRA | 640×640 | 89% | ≈ 138s | ≈ 79s |
Because the acceleration LoRA speeds up generation significantly, and despite some loss of motion dynamics is friendlier to low-VRAM users, the accelerated LoRA version is enabled by default in the two workflow groups provided. To switch to the other group, box-select it and press **Ctrl+B** to enable it
### 1. Download the Workflow File
Download the workflow in JSON format
Use the assets below as the start and end frames


### 2. Manual Model Download
**Diffusion Model**
* [wan2.2\_fun\_inpaint\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_fun\_inpaint\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors)
**Lightning LoRA (optional, for acceleration)**
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_high\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors)
* [wan2.2\_i2v\_lightx2v\_4steps\_lora\_v1\_low\_noise.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors
│ ├───📂 loras/
│ │ ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
│ │ └─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Run the Workflow Step by Step
This workflow uses LoRAs; make sure each Diffusion model is paired with its matching LoRA
1. **High noise** model and **LoRA** loading
* Make sure the `Load Diffusion Model` node has loaded the `wan2.2_fun_inpaint_high_noise_14B_fp8_scaled.safetensors` model
* Make sure the `LoraLoaderModelOnly` node has loaded `wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors`
2. **Low noise** model and **LoRA** loading
* Make sure the `Load Diffusion Model` node has loaded the `wan2.2_fun_inpaint_low_noise_14B_fp8_scaled.safetensors` model
* Make sure the `LoraLoaderModelOnly` node has loaded `wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors`
3. Make sure the `Load CLIP` node has loaded the `umt5_xxl_fp8_e4m3fn_scaled.safetensors` model
4. Make sure the `Load VAE` node has loaded the `wan_2.1_vae.safetensors` model
5. Upload the start and end frame images in the corresponding nodes
6. Enter the prompt in the Prompt group
7. Adjust the size and video length in the `WanFunInpaintToVideo` node
* Adjust the `width` and `height`; the default is `640`. We set a small size; modify it as needed
* Adjust `length`, the total number of frames. The current workflow runs at 16 fps, so for a 5-second video you should set 5×16 = 80
8. Click the `Run` button, or use the shortcut `Ctrl(cmd) + Enter` to generate the video
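The length calculation in step 7 is simply duration × frame rate. A minimal helper, assuming the 16 fps stated above (the function name is illustrative):

```python
def frames_for_duration(seconds: float, fps: int = 16) -> int:
    """Total frame count (the node's `length`) for a target duration
    at the workflow's frame rate."""
    return round(seconds * fps)

print(frames_for_duration(5))  # -> 80, matching the 5-second example above
```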
# Wan2.2-S2V Audio-Driven Video Generation ComfyUI Native Workflow Example
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan2-2-s2v
This is a ComfyUI-native workflow example for Wan2.2-S2V audio-driven video generation.
We are excited to announce that Wan2.2-S2V, an advanced audio-driven video generation model, is now natively supported in ComfyUI! This powerful AI model turns static images and audio input into dynamic video content, covering dialogue, singing, performance, and other creative needs.
**Model Highlights**
* **Audio-driven video generation**: turns static images and audio into synchronized video
* **Cinematic quality**: generates high-quality video with natural expressions and movements
* **Minute-level generation**: supports long-form video creation
* **Multi-format support**: works with full-body and half-body characters
* **Enhanced motion control**: generates actions and environments from text instructions
Wan2.2 S2V code repository: [Github](https://github.com/aigc-apps/VideoX-Fun)
Wan2.2 S2V model repository: [Hugging Face](https://huggingface.co/Wan-AI/Wan2.2-S2V-14B)
## Wan2.2 S2V ComfyUI Native Workflow
### 1. Download the Workflow File
Download the workflow file below and drag it into ComfyUI to load it.
Download JSON Workflow
Download the image and audio below as input:

Download the input audio
### 2. Model Links
You can find all the models in [our repository](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged).
**diffusion\_models**
* [wan2.2\_s2v\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_fp8_scaled.safetensors)
* [wan2.2\_s2v\_14B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_bf16.safetensors)
**audio\_encoders**
* [wav2vec2\_large\_english\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/audio_encoders/wav2vec2_large_english_fp16.safetensors)
**vae**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**text\_encoders**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├──── wan2.2_s2v_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_s2v_14B_bf16.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ ├───📂 audio_encoders/ # create this folder manually if it doesn't exist
│ │ └─── wav2vec2_large_english_fp16.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. 工作流说明
#### 3.1 关于 Lightning LoRA
#### 3.2 关于 fp8\_scaled 和 bf16 模型
你可以在 [这里](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/diffusion_models) 找到两种模型:
* [wan2.2\_s2v\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_fp8_scaled.safetensors)
* [wan2.2\_s2v\_14B\_bf16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_s2v_14B_bf16.safetensors)
本模板使用 `wan2.2_s2v_14B_fp8_scaled.safetensors`,它需要更少的显存。但你可以尝试 `wan2.2_s2v_14B_bf16.safetensors` 来减少质量损失。
#### 3.3 Step-by-Step Instructions
**Step 1: Load the Models**
1. **Load Diffusion Model**: load `wan2.2_s2v_14B_fp8_scaled.safetensors` or `wan2.2_s2v_14B_bf16.safetensors`
* The provided workflow uses `wan2.2_s2v_14B_fp8_scaled.safetensors`, which requires less VRAM
* You can try `wan2.2_s2v_14B_bf16.safetensors` instead to reduce quality loss
2. **Load CLIP**: load `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. **Load VAE**: load `wan_2.1_vae.safetensors`
4. **AudioEncoderLoader**: load `wav2vec2_large_english_fp16.safetensors`
5. **LoraLoaderModelOnly**: load `wan2.2_t2v_lightx2v_4steps_lora_v1.1_high_noise.safetensors` (Lightning LoRA)
* We tested all the Wan2.2 Lightning LoRAs. Since none of them was trained specifically for Wan2.2 S2V, many keys do not match; but because it greatly reduces generation time, we will keep optimizing this template
* Using it causes a large loss of motion dynamics and quality
* If you find the output quality too poor, try the original 20-step workflow
6. **LoadAudio**: upload the audio file we provide, or your own audio
7. **Load Image**: upload a reference image
8. **Batch sizes**: set this according to the number of Video S2V Extend subgraph nodes you add
* Each Video S2V Extend subgraph adds 77 frames to the final output
* For example, with 2 Video S2V Extend subgraphs, the batch size should be 3, i.e. the total number of sampling passes
* **Chunk Length**: keep the default value of 77
9. **Sampler settings**: choose the settings according to whether you use the Lightning LoRA
* With the 4-step Lightning LoRA: steps: 4, cfg: 1.0
* Without the 4-step Lightning LoRA: steps: 20, cfg: 6.0
10. **Size settings**: set the dimensions of the output video
11. **Video S2V Extend**: the video-extension subgraph node. Each sampling pass generates 77 frames by default, and since this is a 16 fps model, each extension produces 77 / 16 ≈ 4.8 seconds of video
* You need a little arithmetic to match the number of extension subgraph nodes to the input audio length. For example, 14 s of input audio needs 14 × 16 = 224 frames in total; each extension covers 77 frames, so 224 / 77 ≈ 2.9, which rounds up to 3 extension subgraph nodes
12. Press Ctrl+Enter, or click the Run button, to run the workflow
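The frame arithmetic in steps 8 and 11 can be sketched as a small helper. The names are illustrative; this follows step 8's convention (batch size = total sampling passes, Extend subgraphs = passes − 1), while the worked example in step 11 rounds the same quotient up when counting nodes.

```python
import math

FPS = 16           # Wan2.2 S2V is a 16 fps model
CHUNK_FRAMES = 77  # frames produced per sampling pass (the default Chunk Length)

def s2v_plan(audio_seconds: float) -> dict:
    """Frames, sampling passes (batch size), and Extend subgraphs for a clip."""
    total_frames = math.ceil(audio_seconds * FPS)
    passes = math.ceil(total_frames / CHUNK_FRAMES)  # batch size = total sampling passes
    return {
        "total_frames": total_frames,
        "batch_size": passes,
        "extend_nodes": passes - 1,  # the first pass is the base sampling
    }

# 14 s of audio: 224 frames -> ceil(224 / 77) = 3 passes -> batch size 3
```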
# Wan2.2 Video Generation ComfyUI Official Native Workflow Examples
Source: https://docs.comfy.org/zh-CN/tutorials/video/wan/wan2_2
The official guide to using Alibaba Cloud's Tongyi Wanxiang 2.2 video generation models in ComfyUI
Tongyi Wanxiang 2.2 (Wan 2.2) is Alibaba Cloud's new generation of multimodal generative models. It uses an innovative MoE (Mixture of Experts) architecture composed of a high-noise expert model and a low-noise expert model; the experts are assigned by denoising timestep, which yields higher-quality video.
Wan 2.2 has three core features: cinematic aesthetic control, deeply integrating the aesthetic standards of professional filmmaking with multi-dimensional visual control over lighting, color, and composition; large-scale complex motion, easily reproducing all kinds of complex movement with smoother and more controllable motion; and precise semantic compliance, excelling at complex scenes and multi-object generation to better realize the user's creative intent.
The models support text-to-video, image-to-video, and other generation modes, and suit content creation, artistic creation, education and training, and many other applications.
[Wan2.2 Prompt Guide](https://alidocs.dingtalk.com/i/nodes/jb9Y4gmKWrx9eo4dCql9LlbYJGXn6lpz)
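The high/low-noise expert split can be pictured with a toy scheduler: earlier, noisier denoising steps go to the high-noise expert, later steps to the low-noise expert. The 50% boundary below is purely illustrative; the real model's switch point is determined by its denoising timestep schedule.

```python
def pick_expert(step: int, total_steps: int, boundary: float = 0.5) -> str:
    """Toy router: which expert denoises a given step (0-indexed)."""
    return "high_noise" if step < total_steps * boundary else "low_noise"

# A 20-step schedule with an illustrative 50% boundary:
# steps 0-9 use the high-noise expert, steps 10-19 the low-noise expert
```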
## Model Highlights
* **Cinematic aesthetic control**: professional camera language with multi-dimensional visual control over lighting, color, and composition
* **Large-scale complex motion**: smoothly reproduces all kinds of complex movement, with improved motion controllability and naturalness
* **Precise semantic compliance**: complex-scene understanding and multi-object generation that better realize creative intent
* **Efficient compression technology**: the 5B version's high-compression-ratio VAE reduces VRAM usage and supports hybrid training
## Wan2.2 Open-Source Model Versions
The Wan2.2 model family is released under the Apache 2.0 open-source license and supports commercial use. Apache 2.0 lets you freely use, modify, and distribute these models, including commercially, as long as you keep the original copyright notice and license text.
| Model Type | Model Name | Parameters | Main Function | Repository |
| -------------- | --------------- | ---------- | -------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------- |
| Hybrid | Wan2.2-TI2V-5B | 5B | Hybrid model supporting both text-to-video and image-to-video; a single model covers both core tasks | 🤗 [Wan2.2-TI2V-5B](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B) |
| Image-to-video | Wan2.2-I2V-A14B | 14B | Turns static images into dynamic video, preserving content consistency and smooth motion | 🤗 [Wan2.2-I2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B) |
| Text-to-video | Wan2.2-T2V-A14B | 14B | Generates high-quality video from text, with cinematic aesthetic control and precise semantic compliance | 🤗 [Wan2.2-T2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B) |
## ComfyOrg Wan2.2 Livestream Replays
We held livestreams on using Wan2.2 in ComfyUI; you can watch the replays to learn how to use it.
This tutorial uses the repackaged versions from [🤗 Comfy-Org/Wan\_2.2\_ComfyUI\_Repackaged](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged).
Make sure your ComfyUI is up to date.
* [Download ComfyUI](https://www.comfy.org/download)
* [How to update ComfyUI](/zh-CN/installation/update_comfyui)
The workflows in this guide can be found in ComfyUI's [workflow templates](/zh-CN/interface/features/template). If you cannot find them, your ComfyUI may be out of date.
If nodes are missing when you load a workflow, possible causes are:
1. You are not using the latest development (nightly) version.
2. You are using the stable or desktop version (which does not include the latest updates).
3. Some nodes failed to import at startup.
## Wan2.2 TI2V 5B Hybrid Workflow Example
The Wan2.2 5B version, combined with ComfyUI's native offloading, fits well within 8GB of VRAM.
### 1. Download the Workflow File
Update ComfyUI to the latest version, then open the menu `Workflow` -> `Browse Templates` -> `Video` and find "Wan2.2 5B video generation" to load the workflow
Download the workflow in JSON format
### 2. Download the Models Manually
**Diffusion Model**
* [wan2.2\_ti2v\_5B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_ti2v_5B_fp16.safetensors)
**VAE**
* [wan2.2\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan2.2_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ └─── wan2.2_ti2v_5B_fp16.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan2.2_vae.safetensors
```
### 3. Complete the Workflow Step by Step
1. Make sure the `Load Diffusion Model` node has loaded `wan2.2_ti2v_5B_fp16.safetensors`
2. Make sure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
3. Make sure the `Load VAE` node has loaded `wan2.2_vae.safetensors`
4. (Optional) For image-to-video generation, press Ctrl+B to enable the `Load image` node and upload an image
5. (Optional) In `Wan22ImageToVideoLatent` you can adjust the output size and the total number of video frames (`length`)
6. (Optional) To modify the prompts (positive and negative), edit the `CLIP Text Encoder` node marked `5`
7. Click the `Run` button, or press `Ctrl(cmd) + Enter` to generate the video
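Besides the Run button, a workflow exported in API format can be queued against a running ComfyUI instance through its `/prompt` HTTP endpoint. This is a minimal sketch assuming a default local install; the server address and `client_id` value are assumptions, and the JSON must be an API-format export, not the regular UI-format workflow file.

```python
import json
import urllib.request

def build_payload(prompt: dict, client_id: str = "docs-example") -> bytes:
    """JSON payload for ComfyUI's POST /prompt endpoint (client_id is arbitrary)."""
    return json.dumps({"prompt": prompt, "client_id": client_id}).encode("utf-8")

def queue_prompt(workflow_path: str, server: str = "http://127.0.0.1:8188") -> dict:
    """Queue an API-format workflow JSON on a local ComfyUI server."""
    with open(workflow_path, encoding="utf-8") as f:
        prompt = json.load(f)
    req = urllib.request.Request(
        f"{server}/prompt",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # contains the queued prompt's id
```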
## Wan2.2 14B T2V Text-to-Video Workflow Example
### 1. Download the Workflow File
Update ComfyUI to the latest version, then open the menu `Workflow` -> `Browse Templates` -> `Video` and find "Wan2.2 14B T2V"
Or, after updating ComfyUI to the latest version, download the workflow below and drag it into ComfyUI to load it
Download the workflow in JSON format
### 2. Download the Models Manually
**Diffusion Model**
* [wan2.2\_t2v\_high\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors)
* [wan2.2\_t2v\_low\_noise\_14B\_fp8\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors
│ │ └─── wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Complete the Workflow Step by Step
1. Make sure the first `Load Diffusion Model` node has loaded `wan2.2_t2v_high_noise_14B_fp8_scaled.safetensors`
2. Make sure the second `Load Diffusion Model` node has loaded `wan2.2_t2v_low_noise_14B_fp8_scaled.safetensors`
3. Make sure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
4. Make sure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
5. (Optional) In `EmptyHunyuanLatentVideo` you can adjust the output size and the total number of video frames (`length`)
6. To modify the prompts (positive and negative), edit the `CLIP Text Encoder` node marked `6`
7. Click the `Run` button, or press `Ctrl(cmd) + Enter` to generate the video
## Wan2.2 14B I2V Image-to-Video Workflow Example
### 1. Workflow File
Update ComfyUI to the latest version, then open the menu `Workflow` -> `Browse Templates` -> `Video` and find "Wan2.2 14B I2V" to load the workflow
Or, after updating ComfyUI to the latest version, download the workflow below and drag it into ComfyUI to load it
Download the workflow in JSON format
You can use the image below as input

### 2. Download the Models Manually
**Diffusion Model**
* [wan2.2\_i2v\_high\_noise\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_i2v_high_noise_14B_fp16.safetensors)
* [wan2.2\_i2v\_low\_noise\_14B\_fp16.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_i2v_low_noise_14B_fp16.safetensors)
**VAE**
* [wan\_2.1\_vae.safetensors](https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors)
**Text Encoder**
* [umt5\_xxl\_fp8\_e4m3fn\_scaled.safetensors](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors)
```
ComfyUI/
├───📂 models/
│ ├───📂 diffusion_models/
│ │ ├─── wan2.2_i2v_low_noise_14B_fp16.safetensors
│ │ └─── wan2.2_i2v_high_noise_14B_fp16.safetensors
│ ├───📂 text_encoders/
│ │ └─── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│ └───📂 vae/
│ └── wan_2.1_vae.safetensors
```
### 3. Complete the Workflow Step by Step
1. Make sure the first `Load Diffusion Model` node has loaded `wan2.2_i2v_high_noise_14B_fp16.safetensors`
2. Make sure the second `Load Diffusion Model` node has loaded `wan2.2_i2v_low_noise_14B_fp16.safetensors`
3. Make sure the `Load CLIP` node has loaded `umt5_xxl_fp8_e4m3fn_scaled.safetensors`
4. Make sure the `Load VAE` node has loaded `wan_2.1_vae.safetensors`
5. Upload the image to use as the starting frame in the `Load Image` node
6. To modify the prompts (positive and negative), edit the `CLIP Text Encoder` node marked `6`
7. (Optional) In `EmptyHunyuanLatentVideo` you can adjust the output size and the total number of video frames (`length`)
8. Click the `Run` button, or press `Ctrl(cmd) + Enter` to generate the video
## Wan2.2 14B FLF2V First-Last-Frame Video Workflow Example
The first-last-frame workflow uses exactly the same model locations as the I2V section
### 1. Workflow and Input Assets
Download the video below, or the workflow in JSON format, and open it in ComfyUI
Download the workflow in JSON format
Download the assets below to use as input
### 2. Complete the Workflow Step by Step
1. Upload the image to use as the first frame in the first `Load Image` node
2. Upload the image to use as the last frame in the second `Load Image` node
3. Adjust the size settings in `WanFirstLastFrameToVideo`
* We set a relatively small size by default so that low-VRAM users do not exhaust their resources
* If you have enough VRAM, try a size around 720P
4. Write prompts that suit your first and last frames
5. Click the `Run` button, or press `Ctrl(cmd) + Enter` to generate the video
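When raising the size toward 720P in step 3, it helps to keep your frames' aspect ratio and snap both sides to a multiple the model accepts. The multiple-of-16 constraint below is a common assumption for video latents, not a documented Wan2.2 requirement; adjust it if the node rejects a size.

```python
def snap(value: float, multiple: int = 16) -> int:
    """Round a dimension to the nearest positive multiple (default 16)."""
    return max(multiple, round(value / multiple) * multiple)

def fit_short_side(src_w: int, src_h: int, target_short: int = 720, multiple: int = 16):
    """Scale so the short side is ~target_short, keeping aspect ratio, snapped to `multiple`."""
    scale = target_short / min(src_w, src_h)
    return snap(src_w * scale, multiple), snap(src_h * scale, multiple)

# A 1024x768 input frame scaled for ~720P output: (960, 720)
```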
## Community Resources
### GGUF Versions
* [bullerwins/Wan2.2-I2V-A14B-GGUF/](https://huggingface.co/bullerwins/Wan2.2-I2V-A14B-GGUF/)
* [bullerwins/Wan2.2-T2V-A14B-GGUF](https://huggingface.co/bullerwins/Wan2.2-T2V-A14B-GGUF)
* [QuantStack/Wan2.2 GGUFs](https://huggingface.co/collections/QuantStack/wan22-ggufs-6887ec891bdea453a35b95f3)
**Custom Nodes**
[City96/ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF)
### WanVideoWrapper
* [Kijai/ComfyUI-WanVideoWrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)
**Wan2.2 models**
* [Kijai/WanVideo\_comfy\_fp8\_scaled](https://hf-mirror.com/Kijai/WanVideo_comfy_fp8_scaled)
**Wan2.1 models**
* [Kijai/WanVideo\_comfy/Lightx2v](https://huggingface.co/Kijai/WanVideo_comfy/tree/main/Lightx2v)
**Lightx2v 4steps LoRA**
* [Wan2.2-T2V-A14B-4steps-lora-rank64-V1](https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-V1)