SAM3_VideoTrack - ComfyUI Built-in Node Documentation

Overview

Tracks objects across video frames using SAM3's memory-based tracker. The node processes a sequence of video frames and keeps each object's identity consistent from frame to frame. What to track is defined by initial masks, text prompts, or both.

Inputs

Parameter	Description	Data Type	Required	Range
`images`	Video frames as batched images	IMAGE	Yes	Batched video frames
`model`	The SAM3 model to use for tracking	MODEL	Yes	SAM3 model
`initial_mask`	Mask(s) for the first frame to track (one per object). Required if `conditioning` is not provided.	MASK	No	One mask per object
`conditioning`	Text conditioning for detecting new objects during tracking. Required if `initial_mask` is not provided.	CONDITIONING	No	Text conditioning
`detection_threshold`	Score threshold for text-prompted detection	FLOAT	No	0.0 to 1.0 (default: 0.5)
`max_objects`	Max tracked objects. Initial masks count toward this limit. 0 uses the internal cap of 64.	INT	No	0 to 64 (default: 0)
`detect_interval`	Run detection every N frames (1 = every frame). Higher values save compute.	INT	No	1 or greater, no upper limit (default: 1)

Note: At least one of `initial_mask` or `conditioning` must be provided. If both are omitted, the node raises an error.
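
The input rules above can be sketched in plain Python. This is only an illustration of the documented constraints, not the node's actual implementation: the function `resolve_tracking_settings`, its arguments, and the returned dictionary are hypothetical names, and the assumption that detection starts on frame 0 and only runs when `conditioning` is supplied is a guess based on the parameter descriptions.

```python
INTERNAL_CAP = 64  # documented internal cap used when max_objects is 0


def resolve_tracking_settings(
    has_initial_mask: bool,
    has_conditioning: bool,
    max_objects: int = 0,
    detect_interval: int = 1,
    num_frames: int = 0,
) -> dict:
    """Hypothetical helper mirroring the documented input rules."""
    # At least one of initial_mask / conditioning must be present.
    if not (has_initial_mask or has_conditioning):
        raise ValueError("Provide initial_mask, conditioning, or both.")
    # Documented ranges: max_objects is 0 to 64, detect_interval is 1 or greater.
    if not 0 <= max_objects <= INTERNAL_CAP:
        raise ValueError("max_objects must be between 0 and 64.")
    if detect_interval < 1:
        raise ValueError("detect_interval must be at least 1.")

    # A value of 0 falls back to the internal cap of 64.
    effective_cap = INTERNAL_CAP if max_objects == 0 else max_objects

    # Assumption: detection runs on frame 0 and then every detect_interval
    # frames, and only when text conditioning is available.
    if has_conditioning:
        detection_frames = list(range(0, num_frames, detect_interval))
    else:
        detection_frames = []

    return {"max_objects": effective_cap, "detection_frames": detection_frames}
```

For example, with `detect_interval=3` on a 10-frame clip, this sketch would schedule detection on frames 0, 3, 6, and 9, while tracking itself still covers every frame.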

Outputs

Output Name	Description	Data Type
`track_data`	Tracking data containing object masks and metadata across all video frames	SAM3TrackData

This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute!

Source fingerprint (SHA-256): 36ee256c46ea3816be4d06b64d945b79af530032f29e5e4c8741971c7ebf9fae
