VOID (Video Object Inpainting and Deletion) is a powerful video inpainting model open-sourced by Netflix. It uses a two-pass diffusion pipeline built on CogVideoX to remove objects from videos and fill the resulting holes with temporally coherent content. VOID removes an object along with all interactions it induces on the scene — not just secondary effects like shadows and reflections, but physical interactions as well. For example, if a person holding a guitar is removed, VOID also removes the person's effect on the guitar, causing it to fall naturally.

VOID is natively supported in ComfyUI (PR #13403), and its complete model weights are available under the Apache 2.0 License.

VOID Model - GitHub | Paper (arXiv) | 🤗 Diffusers Pipeline

Before (left): the original footage with the snowboarder. After (right): the processed result after removing the snowboarder from the scene. VOID removes unwanted objects while maintaining natural motion, lighting, and scene coherence across frames.
Key strengths
- Interaction-aware removal — removes not just the object, but all physical interactions it caused in the scene (shadows, reflections, falling objects)
- Object removal, not single-frame patching — produces coherent motion and lighting across the entire clip
- Two-pass refinement — Pass 2 provides superior temporal stability (fewer jitters and flashes) compared to Pass 1 alone, especially on longer cuts or textured backgrounds
Limitations: Unclear masks, chaotic motion, or targets that dominate the frame may still produce suboptimal results — prompting cannot fix fundamentally wrong segmentation.
VOID Video Inpainting Workflow
1. Download Workflow
Update your ComfyUI to the latest version, then go to Workflow -> Browse Templates and find “VOID: Video Inpainting” under the Utility category.
Download JSON Workflow File
Run on Comfy Cloud
2. Download Models
All models are hosted on the Comfy-Org VOID model repository.

Diffusion Models — the core two-pass inpainting model:
- void_pass1.safetensors — Primary pass
- void_pass2.safetensors — Refinement pass, better temporal stability
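Assuming the standard ComfyUI model layout (the exact root folder may differ for your install), the downloaded weights typically go under models/diffusion_models/:

```
ComfyUI/
└── models/
    └── diffusion_models/
        ├── void_pass1.safetensors
        └── void_pass2.safetensors
```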
3. Using the Workflow
Inputs:
- Source video — Load a video via the Load Video node (place it in the ComfyUI input/ folder)
- Positive prompt (inpaint fill) — Describe the scene after removal. Focus on what remains and how it looks, not on what was removed
  - Example: empty kitchen counter, daylight, tiles visible
- Negative prompt — Optional anti-artifact list; can be left empty
- SAM3 object prompt — A short label for what to mask out. SAM3 uses semantic understanding to create a segmentation mask for the target object.
  - Example: person in blue jacket, red cup on table
  - Max tokens for SAM3 prompts is 32. To prompt multiple subjects separately, separate them with commas and use :N to specify the max objects detected per prompt: eye:2, window panels:4
| Prompt | Role |
|---|---|
| SAM3 object | What is removed (SAM3 creates the mask via semantic segmentation) |
| Positive (inpaint) | How the hole is filled across time |
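The comma-separated, :N-suffixed SAM3 prompt syntax described above can be sketched as a tiny parser. This helper is purely illustrative — it is not a SAM3 or ComfyUI API — and the assumption that an unsuffixed label defaults to one detected object is mine, not the documentation's:

```python
def parse_sam3_prompts(prompt: str, default_max: int = 1):
    """Split a comma-separated SAM3 prompt into (label, max_objects) pairs.

    Illustrative only: mirrors the `label:N` syntax described above.
    Labels without an :N suffix fall back to `default_max` (an assumption).
    """
    targets = []
    for part in prompt.split(","):
        part = part.strip()
        if not part:
            continue
        # rpartition splits on the last ":" so labels may contain colons
        label, sep, count = part.rpartition(":")
        if sep and count.isdigit():
            targets.append((label.strip(), int(count)))
        else:
            targets.append((part, default_max))
    return targets

print(parse_sam3_prompts("eye:2, window panels:4"))
# → [('eye', 2), ('window panels', 4)]
```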
Learn about Subgraph
This workflow uses Subgraph nodes for modular video processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.
Additional Notes
- Mask quality matters — a clean, tight mask around the target object produces the best results
- Prompt writing tip — describe the scene as it should appear naturally after removal, not the removal itself
- Use negative prompt only when you see repeating defects (watermarks, blur, extra limbs)
- Two-pass workflow — the template runs Pass 1 then Pass 2 automatically; you can also run just Pass 1 for faster iterations during testing
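The two-pass flow above can be sketched as plain control flow. `run_void` and the pass callables are hypothetical stand-ins for the template's Pass 1 / Pass 2 subgraphs, not a real API:

```python
from typing import Callable, Sequence

def run_void(
    frames: Sequence,   # decoded video frames
    mask: Sequence,     # per-frame SAM3 segmentation masks
    prompt: str,        # positive (inpaint fill) prompt
    pass1: Callable,    # primary inpainting pass (hypothetical)
    pass2: Callable,    # temporal-stability refinement pass (hypothetical)
    refine: bool = True,
):
    """Sketch of the template's two-pass structure (names are assumptions)."""
    out = pass1(frames, mask, prompt)
    if refine:  # Pass 2 refines Pass 1's output for temporal stability
        out = pass2(out, mask, prompt)
    return out
```

With refine=False this mirrors the faster Pass-1-only iteration mentioned above.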