ComfyUI Depth Anything 3 Introduction

Depth Anything 3 (DA3), from ByteDance Seed, is a vision transformer that recovers spatially consistent geometry from arbitrary visual inputs, with or without known camera poses. A single plain DINO encoder and unified depth-ray representation let one model family cover monocular depth, multi-view depth, camera pose estimation, and 3D reconstruction. Key capabilities:

Unified monocular & multi-view depth: estimate depth from a single image or multiple views
Camera pose estimation: recover camera positions from unordered image sets
3D reconstruction from multi-view inputs
Video depth estimation: per-frame depth sequences for video inputs
Multiple model variants: Small, Base, Mono-Large, and Metric-Large


Make sure your ComfyUI is updated.

The workflows in this guide are available in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated (updates to the Desktop version may lag behind other releases).

If nodes are missing when you load a workflow, the possible reasons are:

You are not using the latest ComfyUI version (Nightly version)
Some nodes failed to import at startup

ComfyUI now natively supports Depth Anything 3 nodes. Make sure you have updated to the latest version of ComfyUI before starting.
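
To tell an outdated install apart from a failed import, one option is to ask a running ComfyUI server which nodes it has registered. The sketch below queries the server's `/object_info` endpoint and reports which of the nodes named in this guide are absent. The server address is the usual local default, and the node names are copied from this guide; the class identifiers registered by your build may differ, so treat the list as an assumption.

```python
import json
from urllib.request import urlopen

# Node names as they appear in this guide. The identifiers actually
# registered by your ComfyUI build may differ (assumption).
REQUIRED_NODES = ["LoadImage", "LoadDA3Model", "GetVideoComponents"]


def missing_nodes(object_info, required=REQUIRED_NODES):
    """Return the required node names that are absent from the registry."""
    return [name for name in required if name not in object_info]


def fetch_object_info(base_url="http://127.0.0.1:8188"):
    """Fetch the node registry (a dict keyed by node name) from the server."""
    with urlopen(f"{base_url}/object_info", timeout=10) as resp:
        return json.load(resp)
```

With the server running, `missing_nodes(fetch_object_info())` returns an empty list when every listed node is registered. A non-empty list suggests either an outdated install or an import error at startup.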

Model Installation

Download the Depth Anything 3 checkpoint(s) and save them to the corresponding ComfyUI folder:

Small (depth_anything_3_small.safetensors) — Lightweight, fast inference
Base (depth_anything_3_base.safetensors) — Balanced performance
Mono-Large (depth_anything_3_mono_large.safetensors) — Best for monocular depth, includes sky detection
Metric-Large (depth_anything_3_metric_large.safetensors) — Metric scale depth in metres, includes sky detection

ComfyUI/
├── models/
│   ├── geometry_estimation/
│   │   ├── depth_anything_3_small.safetensors
│   │   ├── depth_anything_3_base.safetensors
│   │   ├── depth_anything_3_mono_large.safetensors
│   │   └── depth_anything_3_metric_large.safetensors
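
A quick way to confirm the files landed in the right place is to check the folder from Python. This is a minimal sketch that uses the file names and the `models/geometry_estimation` folder listed above; the ComfyUI root path you pass in is your own and is the only assumption.

```python
from pathlib import Path

# Checkpoint file names as listed in this guide.
CHECKPOINTS = {
    "Small": "depth_anything_3_small.safetensors",
    "Base": "depth_anything_3_base.safetensors",
    "Mono-Large": "depth_anything_3_mono_large.safetensors",
    "Metric-Large": "depth_anything_3_metric_large.safetensors",
}


def installed_variants(comfy_root):
    """Map each variant name to True if its checkpoint file is present."""
    model_dir = Path(comfy_root) / "models" / "geometry_estimation"
    return {
        name: (model_dir / filename).is_file()
        for name, filename in CHECKPOINTS.items()
    }
```

For example, `installed_variants("ComfyUI")` run from the directory that contains your ComfyUI folder reports which of the four variants are available. You only need the variants you plan to use.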

Example Workflows

1. Image Depth Estimation

What it does: Upload one image and run Image Depth Estimation (Depth Anything 3) to produce a depth map. The result is shown in Depth Preview, with a side-by-side comparison view of the original image and depth output.

Download Workflow

Download the JSON or search “Depth Anything 3” in the Template Library

Download Sample Image

Get the example input image for this workflow

Steps to Run

LoadImage — load your input image
LoadDA3Model — select a Depth Anything 3 variant
Run — click Queue or press Cmd+Enter (Ctrl+Enter on Windows/Linux)
The workflow outputs a depth map and side-by-side comparison
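
The same steps can also be scripted against a running ComfyUI server through its HTTP `/prompt` endpoint. The sketch below assumes you have exported the workflow in API format (a JSON object mapping node ids to `class_type` and `inputs`) and that the input image is already in ComfyUI's input folder. The server address is the usual local default, and the helper names are illustrative and not part of ComfyUI.

```python
import json
from urllib.request import Request, urlopen


def set_input_image(workflow, filename):
    """Return a copy of an API-format workflow whose LoadImage nodes use `filename`."""
    patched = json.loads(json.dumps(workflow))  # deep copy via JSON round-trip
    for node in patched.values():
        if node.get("class_type") == "LoadImage":
            node.setdefault("inputs", {})["image"] = filename
    return patched


def build_prompt_request(workflow, base_url="http://127.0.0.1:8188"):
    """Build the POST request that queues `workflow` on the server."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow, base_url="http://127.0.0.1:8188"):
    """Send the workflow to the server and return its parsed JSON reply."""
    with urlopen(build_prompt_request(workflow, base_url), timeout=30) as resp:
        return json.load(resp)
```

A typical call would load the exported JSON with `json.load`, pass it through `set_input_image(workflow, "my_photo.png")`, and hand the result to `queue_workflow`. The file name here is a placeholder for your own image.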

Learn about Subgraph

This workflow uses Subgraph nodes for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.

2. Video Depth Estimation

What it does: Upload a video and run Video Depth Estimation (Depth Anything 3) to produce a per-frame depth sequence. Inside the subgraph, GetVideoComponents splits the input video into frames, LoadDA3Model loads the model, and SetVideoComponents reassembles the depth frames back into a video output.

Download Workflow

Download the JSON or search “Depth Anything 3” in the Template Library

Run on Comfy Cloud

Open in Comfy Cloud

Steps to Run

LoadVideo — load your input video
Select Model — choose between Small, Base, Mono-Large, or Metric-Large
Run — click Queue or press Cmd+Enter (Ctrl+Enter on Windows/Linux)
The workflow outputs a video with per-frame depth maps

Learn about Subgraph

This workflow uses Subgraph nodes for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.

Model Variants

Variant	head_type	has_sky	has_confidence	camera_decoder	Best for
Small	dualdpt	❌	✅	✅	Fast inference, mobile/edge
Base	dualdpt	❌	✅	✅	Balanced performance
Mono-Large	dpt	✅	❌	❌	Monocular depth with sky detection
Metric-Large	dpt	✅	❌	❌	Physical metric depth in metres

Small and Base use the dualdpt head type with confidence estimation and camera decoder support for multi-view applications.
Mono-Large and Metric-Large use the dpt head type with sky detection. Metric-Large outputs raw depth in metres.
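
When choosing a variant in a script, it can help to hold the table above as data. The sketch below transcribes the table into a dictionary and filters it by the capabilities you need. The helper name `variants_with` is illustrative and not part of ComfyUI or Depth Anything 3.

```python
# Capability table transcribed from the Model Variants section of this guide.
VARIANTS = {
    "Small": {"head_type": "dualdpt", "has_sky": False,
              "has_confidence": True, "camera_decoder": True},
    "Base": {"head_type": "dualdpt", "has_sky": False,
             "has_confidence": True, "camera_decoder": True},
    "Mono-Large": {"head_type": "dpt", "has_sky": True,
                   "has_confidence": False, "camera_decoder": False},
    "Metric-Large": {"head_type": "dpt", "has_sky": True,
                     "has_confidence": False, "camera_decoder": False},
}


def variants_with(**required):
    """Return the variants whose table entries match every keyword given."""
    return [
        name
        for name, caps in VARIANTS.items()
        if all(caps.get(key) == value for key, value in required.items())
    ]
```

For instance, `variants_with(camera_decoder=True)` narrows the choice to the variants suited to multi-view work, while `variants_with(has_sky=True)` selects the two Large variants.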

Community Resources

Depth Anything 3 GitHub (ByteDance-Seed) — Research paper and code
Comfy-Org/Depth-Anything-3 — Official ComfyUI model weights
