TripoSplat is an open-source model, developed by VAST-AI, that generates 3D Gaussian splat representations directly from a single 2D image. Unlike traditional 3D reconstruction methods that require multiple views or produce meshes as their primary output, TripoSplat outputs a Gaussian splat: a scene representation in which thousands of colored 3D Gaussians are placed in space and rendered by projecting ("splatting") them onto the image. This approach enables fast, high-quality rendering, and the number of Gaussians can be adjusted to trade quality against performance.
Download Workflow
Download JSON or search “TripoSplat” in Template Library
How it works
TripoSplat uses a feed-forward architecture that takes a single RGB image and directly predicts a set of 3D Gaussian primitives. The pipeline involves:
- Image encoding: the input image is processed by a vision encoder (DINOv2)
- Triplane generation: features are decoded into a triplane representation
- Gaussian prediction: the triplane is sampled to produce Gaussian parameters (position, scale, rotation, opacity, color)
- Rendering: Gaussians are rendered from arbitrary viewpoints using differentiable splatting
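The rendering step can be illustrated with the front-to-back alpha compositing used in standard 3D Gaussian splatting. This is a minimal single-pixel sketch, not TripoSplat's renderer: the `Splat` type and `composite_pixel` function are hypothetical names, and each splat's `alpha` is assumed to already include the falloff of its projected 2D Gaussian at that pixel.

```python
from dataclasses import dataclass


@dataclass
class Splat:
    """One Gaussian's contribution at a single pixel (hypothetical helper type)."""
    depth: float   # distance from the camera
    alpha: float   # opacity at this pixel, in [0, 1]
    color: tuple   # (r, g, b), each in [0, 1]


def composite_pixel(splats, background=(0.0, 0.0, 0.0)):
    """Blend splats front to back and return the final RGB for one pixel."""
    rgb = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by nearer splats
    for s in sorted(splats, key=lambda s: s.depth):
        weight = s.alpha * transmittance
        for c in range(3):
            rgb[c] += weight * s.color[c]
        transmittance *= 1.0 - s.alpha
        if transmittance < 1e-4:  # pixel is effectively opaque, stop early
            break
    # Any remaining light comes from the background.
    return tuple(rgb[c] + transmittance * background[c] for c in range(3))
```

Because the weights depend only on opacities and colors, the same blend is differentiable with respect to the Gaussian parameters, which is what makes splatting usable for training.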
Learn about Subgraph
This workflow uses a Subgraph node for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.
Workflow node guide
LoadImage
- Loads your input image (PNG/JPG)
- Sample image: white-hotel-on-rocky-island.png (available in Template Library)
TripoSplat (subgraph)
The main subgraph node processes the image and generates the 3D Gaussian splat. Exposed parameters:
| Parameter | Default | Description |
|---|---|---|
| switch | — | Enable/disable the subgraph |
| num_gaussians | — | Number of Gaussian primitives to generate (controls quality/performance) |
| seed | — | Random seed for reproducibility |
| unet_name | — | TripoSplat diffusion model checkpoint |
| clip_name | — | CLIP vision encoder model |
| vae_name | — | VAE for encoding/decoding (2 entries: one for the main VAE, one for the encoder) |
| bg_removal_name | — | Background removal model |
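To get a feel for how num_gaussians affects output size, the sketch below estimates the raw, uncompressed footprint of a splat. It assumes each Gaussian stores only the parameters listed earlier (position, scale, rotation as a quaternion, opacity, RGB color) as 32-bit floats. The actual storage layout is not documented here, so treat the numbers as rough guidance; `estimate_splat_bytes` is a hypothetical helper and the budgets in the loop are illustrative, not defaults.

```python
# Assumed per-Gaussian layout: position (3) + scale (3) + quaternion rotation (4)
# + opacity (1) + RGB color (3) = 14 floats.
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 3


def estimate_splat_bytes(num_gaussians: int, bytes_per_float: int = 4) -> int:
    """Rough uncompressed size of a splat with the given number of Gaussians."""
    if num_gaussians < 0:
        raise ValueError("num_gaussians must be non-negative")
    return num_gaussians * FLOATS_PER_GAUSSIAN * bytes_per_float


for n in (50_000, 200_000, 1_000_000):  # illustrative budgets only
    print(f"{n:>9,} Gaussians ~ {estimate_splat_bytes(n) / 1e6:.1f} MB")
```

Under these assumptions the size grows linearly at 56 bytes per Gaussian, so doubling num_gaussians doubles the raw data before any compression.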
CreateCameraInfo
- Defines the camera orbit for rendering the result
- Parameters: orbit type, angle, distance, field of view, etc.
- Default: orbit at 35° elevation, 30 distance, 2.5 zoom
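An orbit camera like the one described above can be modelled as a point moving on a sphere around the subject. The following sketch converts an azimuth angle, elevation, and distance into a camera position. It assumes a Y-up coordinate system with the subject at the origin, which may differ from the node's internal convention; the function names are hypothetical and the zoom parameter is not modelled.

```python
import math


def orbit_camera_position(azimuth_deg, elevation_deg=35.0, distance=30.0,
                          target=(0.0, 0.0, 0.0)):
    """Camera position on a sphere around `target` (assumes Y is up)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = target[0] + distance * math.cos(el) * math.sin(az)
    y = target[1] + distance * math.sin(el)
    z = target[2] + distance * math.cos(el) * math.cos(az)
    return (x, y, z)


def orbit_positions(num_frames, **kwargs):
    """Evenly spaced camera positions for one full turn around the target."""
    return [orbit_camera_position(360.0 * i / num_frames, **kwargs)
            for i in range(num_frames)]
```

For example, `orbit_positions(120)` yields one camera position per frame for a 120-frame turntable, all at the same height and distance from the subject.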
RenderSplat
- Renders the Gaussian splat into a 2D image from the defined camera angle
- Parameters: output resolution (default 1024×1024), image quality settings
SplatToMesh
- Converts the Gaussian splat to a mesh (optional)
- Parameters: mesh density, smoothing, simplification
SaveGLB
- Saves the result as a GLB 3D file
SaveVideo
- Saves a video of the rendered 3D scene
SplatToFile3D
- Exports the Gaussian splat in SPZ format
CreateVideo
- Creates a video from rendered frames
Steps to run
- Load an image: use the LoadImage node to load a single 2D image
- Run the TripoSplat subgraph: the model generates a Gaussian splat representation
- Choose an output format: export as GLB, SPZ, or video, or convert the splat to a mesh
- View results: open the exported 3D file or the rendered preview
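The steps above can also be run without the UI by queueing the workflow over ComfyUI's HTTP API. The sketch below assumes a local server on the default address `http://127.0.0.1:8188` and a workflow file exported in API format. Whether this particular template, including its subgraph, exports cleanly to API format has not been verified here, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_prompt_request(workflow: dict, server: str = "http://127.0.0.1:8188"):
    """Build the HTTP request that queues an API-format workflow on a ComfyUI server."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path: str, server: str = "http://127.0.0.1:8188") -> dict:
    """Load an API-format workflow file, submit it, and return the server's JSON reply."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())
```

A call such as `queue_workflow("triposplat_api.json")` (the file name is a placeholder) would submit the job; outputs are then written by the save nodes as in a normal run.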
Output options
| Node | Format | Use case |
|---|---|---|
| SaveGLB | .glb | Standard 3D file format, importable into 3D software |
| SplatToFile3D | .spz | Compressed Gaussian splat format for efficient storage |
| RenderSplat | 2D image | Quick preview of the result from any angle |
| SplatToMesh | Mesh | Convert to traditional mesh for further editing |
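Before importing an exported .glb into other 3D software, it can help to confirm the file is a well-formed binary glTF container. The sketch below checks the 12-byte GLB header defined by the glTF 2.0 specification (the ASCII magic `glTF`, a version number, and the total file length, all little-endian). It does not validate the mesh data itself, and the function names are hypothetical.

```python
import struct


def read_glb_header(data: bytes):
    """Parse the 12-byte GLB header and return (version, declared_length)."""
    if len(data) < 12:
        raise ValueError("file is too short to be a GLB")
    magic, version, length = struct.unpack("<4sII", data[:12])
    if magic != b"glTF":
        raise ValueError("missing 'glTF' magic bytes")
    return version, length


def looks_like_valid_glb(path: str) -> bool:
    """True if the file has a glTF 2.0 binary header whose length matches the file size."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        version, length = read_glb_header(data)
    except ValueError:
        return False
    return version == 2 and length == len(data)
```

A file that fails this check was most likely truncated during saving or copying, or is not a GLB at all.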
Model downloads
Download the TripoSplat model and required files. Place them in the corresponding models/ subdirectories.
TripoSplat diffusion
triposplat_fp16.safetensors — TripoSplat diffusion model checkpoint
TripoSplat VAE decoder
triposplat_vae_decoder_fp16.safetensors — VAE decoder
Flux2 VAE
flux2-vae.safetensors — Flux.2 VAE for latent encoding
DINOv2 CLIP
dino_v3_vit_h.safetensors — CLIP vision encoder (DINOv2)
BiRefNet bg removal
birefnet.safetensors — Background removal model for preprocessing
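After downloading, a quick script can confirm that every file listed above is present somewhere under the models/ directory. This sketch searches recursively instead of assuming specific subdirectory names, since those are not spelled out here. The function names are hypothetical, and the path you pass in depends on where ComfyUI is installed.

```python
from pathlib import Path

# File names as listed in the model downloads above.
REQUIRED_MODEL_FILES = [
    "triposplat_fp16.safetensors",
    "triposplat_vae_decoder_fp16.safetensors",
    "flux2-vae.safetensors",
    "dino_v3_vit_h.safetensors",
    "birefnet.safetensors",
]


def find_model_files(models_dir, names=REQUIRED_MODEL_FILES):
    """Map each required file name to the first match under models_dir, or None."""
    root = Path(models_dir)
    found = {}
    for name in names:
        matches = sorted(root.rglob(name))
        found[name] = matches[0] if matches else None
    return found


def missing_model_files(models_dir, names=REQUIRED_MODEL_FILES):
    """Names of required files that are not present anywhere under models_dir."""
    return [n for n, p in find_model_files(models_dir, names).items() if p is None]
```

Calling `missing_model_files("ComfyUI/models")` (adjust the path to your installation) returns an empty list once all five files are in place. Note that it only checks presence, not that each file sits in the subdirectory its loader node expects.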