Lens is an open text-to-image model by Microsoft, licensed under MIT. With 3.8 billion parameters, it uses a dual-stream MMDiT architecture with GPT-OSS-20B text encoder features and the FLUX.2 semantic VAE to deliver competitive image quality with substantially less training compute than larger T2I models.
Model highlights:
- Dual-stream MMDiT architecture — joint image-text processing for coherent generation
- GPT-OSS-20B multi-layer text features — rich text understanding for accurate prompt following
- FLUX.2 semantic VAE — high-fidelity latent representation
- Mixed-resolution training — supports various aspect ratios without cropping
- Compact 3.8B footprint — runs on consumer GPUs with 24 GB VRAM
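To put the 3.8B footprint in perspective, weight memory can be estimated from the parameter count and the bytes stored per parameter. This is a back-of-the-envelope sketch only: the helper name `weight_gib` is hypothetical, and the estimate covers the diffusion model weights alone, not activations, the text encoder, or the VAE.

```python
def weight_gib(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# 3.8 billion parameters stored in BF16 (2 bytes per parameter).
lens_bf16 = weight_gib(3.8e9, 2)
print(f"Lens diffusion weights (BF16): ~{lens_bf16:.1f} GiB")  # ~7.1 GiB
```

Actual VRAM use during generation will be higher, since the text encoder, VAE, and intermediate activations also need memory.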
Lens text-to-image workflow
Both the standard and Turbo variants use a Subgraph node to manage the text-to-image generation pipeline. You can open the subgraph to inspect or customize the internal nodes.
Learn about Subgraph
This workflow uses Subgraph nodes for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.
Lens
Download Workflow
Download JSON or search “Lens” in Template Library
Get started
- Update ComfyUI to the latest version
- Open the Template Library and search for Lens
- Select the Lens workflow
- Download any missing models (see Lens model downloads), enter your prompt, and click Queue
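The steps above use the ComfyUI interface. As an alternative, a workflow can also be queued from a script through the ComfyUI server's `POST /prompt` endpoint. The sketch below assumes a local server on the default address `127.0.0.1:8188` and a workflow that has already been exported in API format; the file name `lens_api.json` and both function names are hypothetical, and whether the Subgraph-based Lens template exports cleanly to API format has not been verified here.

```python
import json
import urllib.request

# Assumption: a local ComfyUI server listening on the default address.
COMFY_URL = "http://127.0.0.1:8188"


def build_queue_request(workflow: dict, base_url: str = COMFY_URL) -> urllib.request.Request:
    """Build a POST /prompt request that queues an API-format workflow."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path: str, base_url: str = COMFY_URL) -> dict:
    """Load an API-format workflow JSON file and send it to the server."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_queue_request(workflow, base_url)) as resp:
        return json.loads(resp.read())


# Example usage (requires a running server and an exported workflow file):
# print(queue_workflow("lens_api.json"))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server.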
Example output
Lens Turbo
Lens Turbo is a distilled variant that generates images in fewer sampling steps for faster inference.
Download Workflow
Download JSON or search “Lens Turbo” in Template Library
Get started
- Update ComfyUI to the latest version
- Open the Template Library and search for Lens Turbo
- Select the Lens Turbo workflow
- Download any missing models (see Lens model downloads), enter your prompt, and click Queue
Example output
Lens model downloads
You can find all model files at Comfy-Org/Lens on Hugging Face.
lens_bf16.safetensors
Diffusion model for Lens (BF16).
lens_turbo_bf16.safetensors
Diffusion model for Lens Turbo (BF16).
gpt_oss_20b_nvfp4.safetensors
Text encoder (GPT-OSS-20B) shared by both Lens and Lens Turbo.
flux2-vae.safetensors
VAE (FLUX.2) shared by both Lens and Lens Turbo.
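Before queuing a workflow, it can help to check which of these files are already on disk. The following is a minimal sketch: the file names come from the list above, but the subfolder assigned to each file (`diffusion_models`, `text_encoders`, `vae`) is an assumption based on ComfyUI's usual `models/` layout, and the function name `missing_models` is hypothetical.

```python
from pathlib import Path

# File names are from the list above. The subfolder for each file is an
# assumption based on ComfyUI's usual models/ directory layout.
LENS_FILES = {
    "lens_bf16.safetensors": "diffusion_models",
    "lens_turbo_bf16.safetensors": "diffusion_models",
    "gpt_oss_20b_nvfp4.safetensors": "text_encoders",
    "flux2-vae.safetensors": "vae",
}


def missing_models(models_dir: str, files: dict = LENS_FILES) -> list:
    """Return relative paths of expected model files that are not on disk."""
    root = Path(models_dir)
    return [
        f"{folder}/{name}"
        for name, folder in files.items()
        if not (root / folder / name).is_file()
    ]


# Example usage (the path is hypothetical; point it at your own install):
# print(missing_models("ComfyUI/models"))
```

Only one of the two diffusion models is needed if you plan to use a single variant, so a missing entry for the other variant is not necessarily a problem.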
Available models
| Model | Description | Inference steps | Link |
|---|---|---|---|
| Lens | Standard 3.8B model — higher quality, more steps | ~50 | Hugging Face |
| Lens Turbo | Distilled variant — faster generation with fewer steps | ~8 | Hugging Face |
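As a rough way to read the step counts in the table, sampling cost can be compared by the ratio of steps. This sketch assumes each sampling step costs about the same for both variants, which the source does not state, and it ignores text encoding and VAE decoding time. The names `APPROX_STEPS` and `relative_sampling_cost` are hypothetical.

```python
# Approximate step counts taken from the table above.
APPROX_STEPS = {"Lens": 50, "Lens Turbo": 8}


def relative_sampling_cost(variant: str, baseline: str = "Lens") -> float:
    """Ratio of sampling steps for a variant versus the baseline variant."""
    return APPROX_STEPS[variant] / APPROX_STEPS[baseline]


print(relative_sampling_cost("Lens Turbo"))  # 0.16
```

Under these assumptions, Lens Turbo needs roughly one sixth of the sampling steps of the standard model; measured end-to-end speedups may differ.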