ByteDance USO ComfyUI Native Workflow example

USO (Unified Style-Subject Optimized) is a model developed by ByteDance’s UXO Team that unifies style-driven and subject-driven generation tasks. Built on the FLUX.1-dev architecture, the model achieves both style similarity and subject consistency through disentangled learning and style reward learning (SRL). USO supports three main approaches:

Subject-Driven: Place subjects into new scenes while maintaining identity consistency
Style-Driven: Apply artistic styles to new content based on reference images
Combined: Use both subject and style references simultaneously

ByteDance USO ComfyUI Native Workflow

Make sure your ComfyUI is updated.

Workflows in this guide can be found in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated (Desktop version updates may lag behind).

If nodes are missing when loading a workflow, possible reasons are:

You are not using the latest ComfyUI version (Nightly version)
Some nodes failed to import at startup

1. Workflow and input

Download the image below and drag it into ComfyUI to load the corresponding workflow.

Download JSON Workflow

Run on Comfy Cloud

Use the image below as the input image.

2. Model links

checkpoints

flux1-dev-fp8.safetensors

loras

uso-flux1-dit-lora-v1.safetensors

model_patches

uso-flux1-projector-v1.safetensors

clip_vision

sigclip_vision_patch14_384.safetensors

Please download all models and place them in the following directories:

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 checkpoints/
│   │   └── flux1-dev-fp8.safetensors
│   ├── 📂 loras/
│   │   └── uso-flux1-dit-lora-v1.safetensors
│   ├── 📂 model_patches/
│   │   └── uso-flux1-projector-v1.safetensors
│   └── 📂 clip_vision/
│       └── sigclip_vision_patch14_384.safetensors
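
If a loader node shows an empty model list, a file has usually landed in the wrong folder. The following is a minimal sketch of a placement check, not part of ComfyUI itself. The function name `find_missing_models` and the `"ComfyUI"` root path are hypothetical, and the folder names follow ComfyUI's stock `models/` layout (including `clip_vision` for the CLIP vision model).

```python
from pathlib import Path

# Folder -> file name, following the model list in this guide.
# Assumption: a stock ComfyUI install, where CLIP vision models
# live in models/clip_vision.
EXPECTED_MODELS = {
    "checkpoints": "flux1-dev-fp8.safetensors",
    "loras": "uso-flux1-dit-lora-v1.safetensors",
    "model_patches": "uso-flux1-projector-v1.safetensors",
    "clip_vision": "sigclip_vision_patch14_384.safetensors",
}


def find_missing_models(comfyui_root, expected=EXPECTED_MODELS):
    """Return relative paths of expected model files that are not on disk."""
    models_dir = Path(comfyui_root) / "models"
    missing = []
    for subdir, filename in expected.items():
        if not (models_dir / subdir / filename).is_file():
            missing.append(f"models/{subdir}/{filename}")
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point it at your own install directory.
    for rel_path in find_missing_models("ComfyUI"):
        print("missing:", rel_path)
```

Run it from the directory that contains your ComfyUI folder. No output means all four files were found.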

3. Workflow instructions

Load models:
- 1.1 Ensure the Load Checkpoint node has flux1-dev-fp8.safetensors loaded
- 1.2 Ensure the LoraLoaderModelOnly node has uso-flux1-dit-lora-v1.safetensors loaded
- 1.3 Ensure the ModelPatchLoader node has uso-flux1-projector-v1.safetensors loaded
- 1.4 Ensure the Load CLIP Vision node has sigclip_vision_patch14_384.safetensors loaded
Content Reference:
- 2.1 Click Upload to upload the input image we provided
- 2.2 The ImageScaleToMaxDimension node scales your input image for content reference. 512px keeps more character features, but if you only use the character’s head as input, the final output image often has issues such as the character taking up too much space. Setting it to 1024px gives much better results.
In the example, we only use the content reference image input. If you want to use the style reference image input, select the marked node group and press Ctrl+B to toggle its bypass state.
Write your prompt or keep default
Set the image size if needed
The EasyCache node accelerates inference at the cost of some quality and detail. You can bypass it (Ctrl+B) if you don’t need it.
Click the Run button, or use the shortcut Ctrl(Cmd) + Enter to run the workflow
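
To get a feel for what the 512px and 1024px settings in step 2.2 mean for a given input, the arithmetic can be sketched as follows. This is an illustration only: it assumes the ImageScaleToMaxDimension node resizes the image so that its longer side equals the chosen value while keeping the aspect ratio, as the node's name suggests. The helper `scale_to_max_dimension` is hypothetical, and the node's own rounding may differ.

```python
def scale_to_max_dimension(width, height, largest_size):
    """Return (new_width, new_height) with the longer side set to largest_size.

    Assumption: aspect ratio is preserved and the shorter side is rounded
    to the nearest pixel.
    """
    if width <= 0 or height <= 0 or largest_size <= 0:
        raise ValueError("width, height and largest_size must be positive")
    if width >= height:
        # Landscape or square: width becomes the target size.
        return largest_size, max(1, round(height * largest_size / width))
    # Portrait: height becomes the target size.
    return max(1, round(width * largest_size / height)), largest_size


if __name__ == "__main__":
    print(scale_to_max_dimension(2048, 1536, 1024))  # (1024, 768)
    print(scale_to_max_dimension(768, 1024, 512))    # (384, 512)
```

Under this assumption, a 2048x1536 reference becomes 1024x768 at the 1024px setting and 512x384 at the 512px setting.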

4. Additional Notes

Style reference only:

The same workflow file also includes a variant that uses only the style reference.

The only difference is that the content reference node is replaced with an Empty Latent Image node.

You can also bypass the whole Style Reference group and use the workflow as a text-to-image workflow, which means this workflow has four variations:

Only use content (subject) reference
Only use style reference
Mixed content and style reference
As a text to image workflow
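
The four variations correspond to two independent switches: whether the content reference is used and whether the style reference is used. A small sketch of that mapping follows. The function `workflow_variant` and its parameter names are hypothetical and exist only for illustration; in ComfyUI you make these choices by bypassing node groups.

```python
def workflow_variant(use_content_reference: bool, use_style_reference: bool) -> str:
    """Name the USO workflow variation for a given pair of reference switches."""
    if use_content_reference and use_style_reference:
        return "mixed content and style reference"
    if use_content_reference:
        return "content (subject) reference only"
    if use_style_reference:
        return "style reference only"
    # Neither reference is active: only the text prompt drives generation.
    return "text to image"


if __name__ == "__main__":
    for content in (True, False):
        for style in (True, False):
            print(content, style, "->", workflow_variant(content, style))
```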
