Inputs
| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| clip_name | The filename of the text encoder model to load. This must be a file located in the ComfyUI/models/text_encoders/ or ComfyUI/models/clip/ directory. | STRING | Yes | List of files found in the text_encoders folder |
| type | The architecture type of the model being loaded. This determines which specific encoder variant to use. The default is "stable_diffusion". | STRING | Yes | "stable_diffusion", "stable_cascade", "sd3", "stable_audio", "mochi", "ltxv", "pixart", "cosmos", "lumina2", "wan", "hidream", "chroma", "ace", "omnigen2", "qwen_image", "hunyuan_image", "flux2", "ovis", "longcat_image", "cogvideox" |
| device | The device to load the model onto. "default" uses the GPU if available, while "cpu" forces CPU loading. This is an advanced option (default: "default"). | STRING | No | "default", "cpu" |
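
The inputs above can be sketched as a node entry in a ComfyUI API-format workflow. This is a minimal illustration, not the node's actual implementation: it assumes the node is registered under the class name `CLIPLoader`, the helper `make_clip_loader_node` is hypothetical, and `example_encoder.safetensors` is a placeholder filename. The allowed values are copied from the Range column of the table.

```python
import json

# Allowed values copied from the Range column of the inputs table.
SUPPORTED_TYPES = [
    "stable_diffusion", "stable_cascade", "sd3", "stable_audio", "mochi",
    "ltxv", "pixart", "cosmos", "lumina2", "wan", "hidream", "chroma",
    "ace", "omnigen2", "qwen_image", "hunyuan_image", "flux2", "ovis",
    "longcat_image", "cogvideox",
]
SUPPORTED_DEVICES = ["default", "cpu"]


def make_clip_loader_node(clip_name, model_type="stable_diffusion", device="default"):
    """Hypothetical helper: build an API-format node entry for the loader.

    Assumes the node class name is "CLIPLoader"; adjust it if your install
    registers the node under a different name.
    """
    if model_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported type: {model_type!r}")
    if device not in SUPPORTED_DEVICES:
        raise ValueError(f"unsupported device: {device!r}")
    return {
        "class_type": "CLIPLoader",
        "inputs": {
            "clip_name": clip_name,   # file in models/text_encoders/ or models/clip/
            "type": model_type,       # selects the encoder variant
            "device": device,         # "default" or "cpu"
        },
    }


# Placeholder filename; replace with a file that exists in your models folder.
node = make_clip_loader_node("example_encoder.safetensors", model_type="wan")
print(json.dumps(node, indent=2))
```

Validating `type` and `device` before submitting a workflow catches typos early, since a value outside the listed range would otherwise be rejected by the node itself.
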
Supported Type-to-Encoder Mappings
The type parameter selects the correct encoder for a given model architecture. The following are common mappings:
| Type | Encoder |
|---|---|
| stable_diffusion | clip-l |
| stable_cascade | clip-g |
| sd3 | t5 xxl / clip-g / clip-l |
| stable_audio | t5 base |
| mochi | t5 xxl |
| cogvideox | t5 xxl (226-token padding) |
| cosmos | old t5 xxl |
| lumina2 | gemma 2 2B |
| wan | umt5 xxl |
| hidream | llama-3.1 (recommended) or t5 |
| omnigen2 | qwen vl 2.5 3B |
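
As a quick reference, the mappings in the table can be expressed as a lookup. This is only a sketch: the dictionary mirrors the rows listed above, the helper name `describe_encoder` is hypothetical, and types that appear in the Range column but not in this table return `None` here because the table covers common mappings only.

```python
# Mirrors the rows of the mapping table; not an exhaustive list of types.
TYPE_TO_ENCODER = {
    "stable_diffusion": "clip-l",
    "stable_cascade": "clip-g",
    "sd3": "t5 xxl / clip-g / clip-l",
    "stable_audio": "t5 base",
    "mochi": "t5 xxl",
    "cogvideox": "t5 xxl (226-token padding)",
    "cosmos": "old t5 xxl",
    "lumina2": "gemma 2 2B",
    "wan": "umt5 xxl",
    "hidream": "llama-3.1 (recommended) or t5",
    "omnigen2": "qwen vl 2.5 3B",
}


def describe_encoder(model_type):
    """Hypothetical helper: return the documented encoder for a type.

    Returns None when the type is not covered by the table.
    """
    return TYPE_TO_ENCODER.get(model_type)


for name in ("wan", "sd3", "flux2"):
    print(name, "->", describe_encoder(name))
```
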
Outputs
| Output Name | Description | Data Type |
|---|---|---|
| clip | The loaded text encoder model, ready to be connected to other nodes for text encoding and conditioning. | CLIP |
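
To illustrate how the clip output connects to a downstream node, the sketch below wires it into a text-encoding node in an API-format workflow. It rests on several assumptions: the class names `CLIPLoader` and `CLIPTextEncode`, the `text` and `clip` input names on the encoding node, and the convention that a link is written as `[source_node_id, output_index]`. The node ids, filename, and prompt are placeholders.

```python
import json

# Assumed API-format workflow: keys are arbitrary node ids chosen for this example.
workflow = {
    "1": {
        "class_type": "CLIPLoader",  # assumed class name for this loader node
        "inputs": {
            "clip_name": "example_encoder.safetensors",  # placeholder filename
            "type": "stable_diffusion",
            "device": "default",
        },
    },
    "2": {
        "class_type": "CLIPTextEncode",  # assumed downstream text-encoding node
        "inputs": {
            "text": "a photo of a lighthouse at dusk",  # placeholder prompt
            # Link to node "1", output index 0 (the clip output).
            "clip": ["1", 0],
        },
    },
}

print(json.dumps(workflow, indent=2))
```

The loader has a single output, so index 0 refers to the clip value described in the table.
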