Gemma 4 ComfyUI workflow example


Make sure your ComfyUI is updated.

The workflows in this guide are available in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated.

If nodes are missing when loading a workflow, possible reasons are:

You are not using the latest ComfyUI version (Nightly version)
Some nodes failed to import at startup
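
One way to check whether the nodes this guide relies on were registered is to ask a running ComfyUI server for its node list. The sketch below is a minimal example, assuming a local server at http://127.0.0.1:8188 (a common default; adjust it for your setup) and the server's `/object_info` endpoint. The helper names `find_missing_nodes` and `fetch_object_info` are illustrative and not part of ComfyUI.

```python
import json
import urllib.request

# Node class names taken from this guide; the server URL is an assumption.
REQUIRED_NODES = ["TextGenerate", "CLIPLoader"]
SERVER_URL = "http://127.0.0.1:8188"


def find_missing_nodes(object_info, required=REQUIRED_NODES):
    """Return the required node class names absent from an /object_info response."""
    return [name for name in required if name not in object_info]


def fetch_object_info(server_url=SERVER_URL):
    """Download the mapping of registered node classes from a running server."""
    with urllib.request.urlopen(f"{server_url}/object_info", timeout=10) as resp:
        return json.load(resp)
```

With a server running, `find_missing_nodes(fetch_object_info())` returns an empty list when both nodes are available. Any names it returns point to an outdated install or a failed import at startup.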

Gemma 4 is the latest generation of lightweight open LLMs from Google DeepMind, built for text generation, image understanding, video analysis, audio transcription, and structured tool use. It is natively supported in ComfyUI as the default Text Generation model.

Model highlights:

Multimodal by design — accepts text, image, video, and audio inputs simultaneously
Three sizes available:
- E2B (2B) — Fast and lightweight, ideal for consumer GPUs
- E4B (4B) — Balanced performance, recommended default
- 31B — Best quality, requires higher VRAM
Thinking mode — Built-in step-by-step reasoning before generating answers
Long context — Up to 128K tokens (E2B/E4B) and 256K tokens (31B)
Multilingual — 35+ languages out of the box, pre-trained on 140+
Function calling — Native support for structured tool use and agentic workflows
ComfyUI native — loaded and run through the built-in TextGenerate and CLIPLoader nodes


Available workflow

Gemma 4: Text Generation

Download Workflow

Download the JSON, or search for “Gemma 4 Text Generation” in the Template Library.

Run on Comfy Cloud

Open in Comfy Cloud

This workflow demonstrates the core text generation capabilities of Gemma 4. It accepts an optional image, audio file, or video as additional context alongside your text prompt, and generates natural language output, with support for reasoning, coding, and multilingual prompts.

Inputs:

Text prompt — your question or instruction
Image (optional) — for visual understanding tasks (OCR, object detection, chart reading, etc.)
Audio (optional) — for speech recognition or transcription
Video (optional) — for video understanding across frames (subsampled to 1 FPS internally)
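
Subsampling to 1 FPS means roughly one frame is kept per second of footage, so a 30 FPS clip contributes about one frame in thirty. The sketch below shows one way to compute which frames survive. The guide does not say how ComfyUI picks frames, so the selection rule and the function name `frames_kept` are assumptions made for illustration.

```python
def frames_kept(total_frames, source_fps, target_fps=1.0):
    """Indices of source frames kept when sampling at target_fps.

    Illustrative rule only: take the first frame of each 1/target_fps interval.
    """
    if total_frames <= 0 or source_fps <= 0 or target_fps <= 0:
        return []
    duration_s = total_frames / source_fps
    # Keep at least one frame, even for clips shorter than one interval.
    count = max(1, int(duration_s * target_fps))
    step = source_fps / target_fps
    return [min(total_frames - 1, int(i * step)) for i in range(count)]


# A 3-second clip recorded at 30 FPS keeps three frames.
print(frames_kept(total_frames=90, source_fps=30))  # [0, 30, 60]
```

This is mainly useful for estimating how many frames a clip adds to the prompt before you run the workflow.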

Key controls:

Max length — maximum number of tokens to generate (default 256)
Sampling mode — toggle sampling on/off and adjust temperature, top-k, top-p, repetition penalty, and seed
Thinking mode — enable step-by-step reasoning before the final answer
Use default template — apply the built-in system prompt for the model
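
To make the sampling controls concrete, the sketch below shows how temperature, top-k, top-p, and seed are commonly combined when choosing the next token. It is a generic illustration in plain Python, not ComfyUI's or Gemma's implementation. The function name `sample_token` is hypothetical, and repetition penalty is left out for brevity.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, seed=None):
    """Pick one token index from raw logits using temperature, top-k, and top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature rescales logits: below 1 sharpens, above 1 flattens.
    scaled = [x / temperature for x in logits]
    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)
    # Top-k keeps only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top-p keeps the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for prob, idx in ranked:
        kept.append((prob, idx))
        cumulative += prob
        if cumulative >= top_p:
            break
    # A fixed seed makes the random draw reproducible.
    rng = random.Random(seed)
    indices = [idx for _, idx in kept]
    probs = [prob for prob, _ in kept]
    return rng.choices(indices, weights=probs, k=1)[0]
```

Setting `top_k=1` reduces this to always picking the most likely token, which is how generation typically behaves when sampling is switched off. Reusing the same seed with the same settings reproduces the same draw.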

Output:

Generated text — the model’s response as a plain text string

Learn about Subgraph

This workflow uses Subgraph nodes for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.

Model Download

Gemma 4 models are loaded as text encoders in ComfyUI. Download the relevant model file and place it in the correct directory:

Gemma 4 2B (E2B IT FP8)

Fast, lightweight. Recommended for consumer GPUs.

Gemma 4 4B (E4B IT FP8)

Balanced performance. The default model in the workflow.

View All Variants

Browse all Gemma 4 model weights.

Place the downloaded .safetensors file in:

📂 ComfyUI/
├── 📂 models/
│   └── 📂 text_encoders/
│       └── gemma4_e4b_it_fp8_scaled.safetensors
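
To confirm the file landed in the right place, a short check like the following can help. It is a minimal sketch: the file name matches the directory tree above, while the ComfyUI root path you pass in and the helper names `model_path` and `is_model_installed` are assumptions for illustration.

```python
from pathlib import Path

# File name as shown in the directory tree above.
MODEL_FILE = "gemma4_e4b_it_fp8_scaled.safetensors"


def model_path(comfy_root, filename=MODEL_FILE):
    """Build the expected location of a Gemma 4 weight file."""
    return Path(comfy_root) / "models" / "text_encoders" / filename


def is_model_installed(comfy_root, filename=MODEL_FILE):
    """Return True when the weight file exists at the expected location."""
    return model_path(comfy_root, filename).is_file()
```

For example, `is_model_installed("ComfyUI")` checks relative to the current working directory. Pass the absolute path of your install if you run the check from elsewhere.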
