Make sure your ComfyUI is updated. Workflows in this guide can be found in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated (updates for the Desktop version may arrive later than other releases).
If nodes are missing when loading a workflow, possible reasons:
  1. You are not using the latest ComfyUI version (Nightly version)
  2. Some nodes failed to import at startup
Stable Audio 3 (GitHub) is Stability AI’s latest open-source audio generation model, trained on fully licensed music data and licensed for commercial use. In ComfyUI it runs through a dedicated subgraph node that produces high-quality stereo audio (music, sound effects, and instruments) from text descriptions, with optional Qwen-powered, category-aware reprompting.
Stable Audio 3 comes in three variants:
  • Small-SFX — Sound effects and short ambiance, up to 2:00. Small enough to run on CPU.
  • Small-Music — Short music loops, on-device-friendly, up to 2:00.
  • Medium — Longer tracks with stronger structure and musicality, up to ~6:20. Requires a GPU.
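The duration limits above can be expressed in seconds. The following is a minimal sketch, assuming the listed limits act as hard caps (the Medium limit is given only approximately). The variant keys and the helpers `to_seconds` and `clamp_duration` are hypothetical names for illustration, not ComfyUI identifiers.

```python
# Illustrative only: the variant keys and helper names below are hypothetical,
# not part of ComfyUI or Stable Audio 3.

def to_seconds(mmss: str) -> int:
    """Convert an 'm:ss' string such as '6:20' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Maximum clip lengths taken from the variant list above.
MAX_DURATION = {
    "small-sfx": to_seconds("2:00"),    # 120 s
    "small-music": to_seconds("2:00"),  # 120 s
    "medium": to_seconds("6:20"),       # 380 s, stated as approximate
}


def clamp_duration(variant: str, requested_seconds: float) -> float:
    """Clamp a requested clip length to the variant's documented maximum."""
    if requested_seconds <= 0:
        raise ValueError("duration must be positive")
    return min(requested_seconds, MAX_DURATION[variant])
```

For example, `clamp_duration("medium", 500)` returns 380, while a 30-second request for any variant is returned unchanged.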

Available workflows

Stable Audio 3 Medium

Download Workflow

Download JSON or search “Stable Audio 3 Medium” in Template Library

Run on Comfy Cloud

Open in Comfy Cloud
The Stable Audio 3 Medium workflow is a full-featured text-to-audio generation pipeline. You provide a short text idea plus an optional duration, seed, and category. The workflow expands your prompt using Qwen with a category-aware reprompt template, then generates stereo audio via the Stable Audio 3 checkpoint.
How to use:
  1. Text idea — Enter a short description of the sound, music, or effect you want (e.g. “upbeat electronic dance track with heavy bass”)
  2. Duration — Set the desired clip length in seconds (default varies)
  3. Seed — Control reproducibility by adjusting the seed value
  4. Category — Choose a reprompt preset: Music, Instrument, SFX, or One-shot
  5. Enable reprompt — Toggle use_reprompt on to let Qwen expand your short idea into a detailed prompt before generation
  6. Click Run (Ctrl/Cmd + Enter) to generate. The audio will be saved to ComfyUI/output/audio/
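The inputs listed above can be summarized as a small record with basic validation. This is a sketch for reasoning about the parameters, not the workflow's actual node interface: only `use_reprompt` and the four category names come from this guide, while the class name, the other field names, and the default values are assumptions.

```python
from dataclasses import dataclass

# Category presets named in the steps above.
CATEGORIES = ("Music", "Instrument", "SFX", "One-shot")


@dataclass
class AudioRequest:
    """Hypothetical container for the Medium workflow inputs."""
    text_idea: str               # short description of the desired sound
    duration_seconds: float      # desired clip length in seconds
    seed: int                    # reuse the same seed to reproduce a result
    category: str = "Music"      # assumed default, not stated in the guide
    use_reprompt: bool = True    # let Qwen expand the short idea first

    def __post_init__(self) -> None:
        if not self.text_idea.strip():
            raise ValueError("text idea must not be empty")
        if self.duration_seconds <= 0:
            raise ValueError("duration must be positive")
        if self.category not in CATEGORIES:
            raise ValueError(f"category must be one of {CATEGORIES}")


request = AudioRequest(
    text_idea="upbeat electronic dance track with heavy bass",
    duration_seconds=30,
    seed=42,
    category="Music",
)
```

Validating the category up front mirrors the fixed list of reprompt presets; a value outside that list raises `ValueError` instead of reaching the workflow.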

Stable Audio 3 Medium Base

Download Workflow

Download JSON or search “Stable Audio 3 Medium Base” in Template Library

Run on Comfy Cloud

Open in Comfy Cloud
A simplified Stable Audio 3 workflow without Qwen reprompt expansion. It expects a complete text prompt and passes it directly to the model. Use this when you already have a detailed prompt and want faster generation.
How to use:
  1. Text prompt — Enter a detailed description of the audio you want
  2. Duration — Set the clip length in seconds
  3. Seed — Control reproducibility
  4. Click Run (Ctrl/Cmd + Enter) to generate

Model download

When loading the workflow, ComfyUI will prompt you with download links for any missing models. To set up manually, download the files below and place them in the correct folders.

Checkpoints

stable_audio_3_medium.safetensors

For the Medium workflow. Place in models/checkpoints/

stable_audio_3_medium_base.safetensors

For the Medium Base workflow. Place in models/checkpoints/
Place checkpoints in:
📂 ComfyUI/
├── 📂 models/
│   └── 📂 checkpoints/
│       ├── stable_audio_3_medium.safetensors
│       └── stable_audio_3_medium_base.safetensors

Text encoders

t5gemma_b_b_ul2.safetensors

Required for all Stable Audio 3 workflows. Place in models/text_encoders/

qwen3.5_2b_bf16.safetensors

Required for the Medium workflow (Qwen reprompt). Place in models/text_encoders/
Place text encoders in:
📂 ComfyUI/
├── 📂 models/
│   └── 📂 text_encoders/
│       ├── t5gemma_b_b_ul2.safetensors
│       └── qwen3.5_2b_bf16.safetensors
After placing the files, press R in ComfyUI to refresh nodes and load the latest models.
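To confirm a manual setup, the folder layout above can be checked with a short script. This is a minimal sketch, assuming the file names and folders listed in this guide; `missing_models` is a hypothetical helper, not a ComfyUI function. It checks the files for both workflows, so a Medium Base setup will report the Qwen encoder as missing even though that workflow does not need it.

```python
from pathlib import Path

# File names and folders as listed in this guide (relative to the ComfyUI root).
EXPECTED_FILES = {
    "models/checkpoints": [
        "stable_audio_3_medium.safetensors",
        "stable_audio_3_medium_base.safetensors",
    ],
    "models/text_encoders": [
        "t5gemma_b_b_ul2.safetensors",
        "qwen3.5_2b_bf16.safetensors",  # only needed for the Medium workflow
    ],
}


def missing_models(comfy_root: str) -> list[str]:
    """Return the relative paths of expected model files that are not present."""
    root = Path(comfy_root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED_FILES.items()
        for name in names
        if not (root / folder / name).is_file()
    ]
```

Calling `missing_models("ComfyUI")` from the directory that contains your ComfyUI folder returns an empty list once all four files are in place.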