Gemini Omni Flash is Google DeepMind’s high-quality, cost-efficient video generation and conversational editing model. First introduced at Google I/O 2026 as part of the Gemini Omni family, it combines Gemini’s multimodal reasoning with native video creation, enabling developers to generate, edit, and remix videos through natural conversation.
To use the Partner Nodes, you need to ensure that you are logged in properly and using a permitted network environment. Please refer to the Partner Nodes Overview section of the documentation to understand the specific requirements for using the Partner Nodes.
Make sure your ComfyUI is updated.
Workflows in this guide can be found in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated. (Updates to the Desktop version may lag behind.)
If nodes are missing when loading a workflow, possible reasons are:
  1. You are not using the latest ComfyUI version (Nightly version)
  2. Some nodes failed to import at startup
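
When a workflow reports missing nodes, it can help to list the node types the workflow file references and compare them with what your install provides. The sketch below assumes the downloaded JSON uses the ComfyUI UI export layout, with a top-level "nodes" list whose entries each carry a "type" field. The function names and the node type names in the usage example are hypothetical, and nodes nested inside subgraph definitions are not inspected.

```python
import json
from collections import Counter


def node_types(workflow: dict) -> Counter:
    """Count node types in a workflow dict.

    Assumption: the workflow has a top-level "nodes" list and each node
    is a dict with a "type" key. Entries without a "type" are skipped.
    """
    return Counter(
        node["type"]
        for node in workflow.get("nodes", [])
        if isinstance(node, dict) and "type" in node
    )


def load_node_types(path: str) -> Counter:
    """Read a workflow JSON file from disk and count its node types."""
    with open(path, encoding="utf-8") as f:
        return node_types(json.load(f))


if __name__ == "__main__":
    # Hypothetical workflow fragment; the type names are placeholders only.
    sample = {
        "nodes": [
            {"id": 1, "type": "ExampleVideoNode"},
            {"id": 2, "type": "ExampleSaveNode"},
            {"id": 3, "type": "ExampleSaveNode"},
        ]
    }
    for name, count in sorted(node_types(sample).items()):
        print(f"{name}: {count}")
```

Any type in the listing that does not appear in the node search of your ComfyUI install is a likely candidate for the missing node.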

What Gemini Omni Flash offers

  • Conversational video editing: Refine and edit videos using natural language — swap characters, relight scenes, alter angles, add or remove objects while maintaining original audio and video tracks
  • Multimodal input: Combine text, images, and video inputs to guide generation. Natively generates synchronized audio with every video output
  • World knowledge and simulation: Combines physics understanding with Gemini’s knowledge of history, science, and cultural context, enabling meaningful storytelling beyond photorealism
  • Text and action synchronization: Render legible text and graphics directly into video, syncing kinetic typography with on-screen movements
  • Pricing: $0.10 per second of video output, matching Veo 3.1 Fast pricing
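
The per-second rate makes it easy to budget a batch before running it. The sketch below only multiplies the listed rate by the requested duration. The function name is hypothetical, and how billing handles rounding, minimum lengths, or failed generations is not covered by this guide, so treat the result as a rough estimate.

```python
# Rate taken from the pricing bullet above: $0.10 per second of video output.
PRICE_PER_SECOND_USD = 0.10


def estimate_cost_usd(duration_seconds: float, clips: int = 1) -> float:
    """Estimate the cost of generating `clips` videos of `duration_seconds` each.

    Assumption: cost scales linearly with output seconds and nothing else.
    """
    if duration_seconds < 0 or clips < 0:
        raise ValueError("duration_seconds and clips must be non-negative")
    return round(duration_seconds * clips * PRICE_PER_SECOND_USD, 2)


if __name__ == "__main__":
    print(estimate_cost_usd(8))           # one 8-second clip
    print(estimate_cost_usd(8, clips=5))  # five 8-second clips
```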

Workflows

Text to Video

Run in Comfy Cloud

Open in Comfy Cloud

Download Workflow

Download JSON or search “Gemini Omni Flash” in Template Library
Generate cinematic video from natural language prompts. Transform text descriptions into video output with world-aware motion, lighting, and sound. Ideal for social media content creation, rapid video prototyping, and iterative visual storytelling.

Image to Video

Run in Comfy Cloud

Open in Comfy Cloud

Download Workflow

Download JSON or search “Gemini Omni Flash” in Template Library

Download Sample Image 1

Get the example input image for this workflow

Download Sample Image 2

Get the second example input image
Generate a video from two images using Gemini Omni Flash. The model interprets natural language prompts, so you can set the duration and aspect ratio directly in your prompt. Perfect for creating short brand clips, dynamic social media content, and iterative video edits through conversational prompting.

Video Edit

Run in Comfy Cloud

Open in Comfy Cloud

Download Workflow

Download JSON or search “Gemini Omni Flash” in Template Library

Download Sample Video

Get the example input video for this workflow
Edit videos with natural language using Gemini Omni Flash. Transform a single input video into one edited output based on your descriptive instructions. Specify the duration and aspect ratio in your prompt. Ideal for quick social media remixes, cinematic scene adjustments, and iterative video refinements.
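
Because duration and aspect ratio are given in the prompt text, a small helper can keep them consistent across iterations. This is a minimal sketch: the function name is hypothetical, and the "Duration: ..." and "Aspect ratio: ..." phrasing is an assumption, not a format documented for the model.

```python
from typing import Optional


def build_prompt(
    instruction: str,
    duration_seconds: Optional[int] = None,
    aspect_ratio: Optional[str] = None,
) -> str:
    """Append optional duration and aspect-ratio hints to an edit instruction.

    Assumption: the model accepts these settings as plain sentences.
    """
    # Normalize the instruction so it ends with exactly one period.
    parts = [instruction.strip().rstrip(".") + "."]
    if duration_seconds is not None:
        parts.append(f"Duration: {duration_seconds} seconds.")
    if aspect_ratio is not None:
        parts.append(f"Aspect ratio: {aspect_ratio}.")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt("Relight the scene at golden hour", 8, "16:9"))
```

Paste the resulting string into the prompt field of the node and adjust the wording if the output ignores a setting.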

Get started

  1. Update ComfyUI to the latest version
  2. Double-click the canvas and search for “Gemini Omni Flash” nodes, or open the Template Library to use a ready-made workflow
  3. Choose the workflow that matches your input type (text, image, or video)
  4. Enter your prompt and generate
For the best results, combine Gemini Omni Flash with Nano Banana 2 Lite: generate images at high speed, then use Gemini Omni Flash to animate them into video.