ComfyUI SCAIL-2 Character Replacement Workflow Tutorial

SCAIL-2 is an end-to-end character animation model built on Wan2.1. It drives a reference character image with a driving video, enabling both character animation (making a character perform the motion) and in-video character replacement (swapping a tracked person in a video with the reference character).

Key Features:

End-to-End Character Animation: Drive a still character image with motion from a driving video
Two Modes: Animation Mode (character performs the motion) and Replacement Mode (swap tracked person with reference character)
Long Video Support: Chunk-based extended generation with frame overlap between segments
Built-in ComfyUI Nodes: Uses the native WanSCAILToVideo and SCAIL2ColoredMask nodes together with SAM3 tracking; no custom nodes are required, only the standard model downloads

Related Links:

SCAIL-2 Character Replacement Workflow

Run in Comfy Cloud

Open in Comfy Cloud

Download Workflow

Download JSON or search “SCAIL-2” in Template Library

Make sure your ComfyUI is updated.

Workflows in this guide can be found in the Workflow Templates. If you can’t find them there, your ComfyUI may be outdated (updates to the Desktop version may lag behind). If nodes are missing when loading a workflow, possible reasons are:

You are not using the latest ComfyUI version (Nightly version)
Some nodes failed to import at startup

How the Workflow Works

This workflow uses two subgraph nodes — a Base subgraph (first segment) and an Extend subgraph (subsequent segments) — to support character animation for both short and long videos.

Load a driving video (pose_video) and a reference character image
Base subgraph processes the first segment (81 frames by default)
Extend subgraph processes additional segments 2+, chaining previous_frames from the prior segment
Preview the result and save

Learn about Subgraph

This workflow uses Subgraph nodes for modular processing. Check out the Subgraph documentation to learn how to customize and extend the workflow.

Long Videos

For longer videos, calculate the number of segments: ceil(total_frames / 76). Each segment except the first uses the Extend subgraph. Duplicate the Extend node for more segments, chain the previous_frames output, and increment segment_index.

Note: WanSCAILToVideo cannot queue all segments automatically — run each segment manually.
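
The segment arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of ComfyUI: the names `segment_count`, `pose_offset`, and `plan` are hypothetical, and the link between the constant 76 and the defaults (81 frames per segment minus 5 overlap frames) is an assumption based on the parameter defaults.

```python
from __future__ import annotations

import math

FRAME_COUNT = 81           # default frames per segment
PREVIOUS_FRAME_COUNT = 5   # default overlap frames between segments
# Assumption: the guide's constant 76 is frame_count minus the overlap.
STRIDE = FRAME_COUNT - PREVIOUS_FRAME_COUNT


def segment_count(total_frames: int) -> int:
    """Number of segments, following the guide's ceil(total_frames / 76)."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return math.ceil(total_frames / STRIDE)


def pose_offset(segment_index: int) -> int:
    """Pose offset for a 1-based segment_index: 76 * (index - 1)."""
    if segment_index < 1:
        raise ValueError("segment_index starts at 1")
    return STRIDE * (segment_index - 1)


def plan(total_frames: int) -> list[tuple[int, str, int]]:
    """List (segment_index, subgraph, pose_offset) for every segment."""
    return [
        (index, "Base" if index == 1 else "Extend", pose_offset(index))
        for index in range(1, segment_count(total_frames) + 1)
    ]


if __name__ == "__main__":
    # A 200-frame driving video needs one Base run and two Extend runs.
    for index, subgraph, offset in plan(200):
        print(f"segment {index}: {subgraph} subgraph, pose offset {offset}")
```

Each row of the plan corresponds to one manual run: the first uses the Base subgraph, and every later row is one duplicated Extend node with its own `segment_index`.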

Two Modes

Mode	`replace_mode`	Driving mask background	Summary
Replacement	`true` (default)	White	Swap the tracked person in the driving video with the reference character
Animation	`false`	Black	Reference character performs the driving motion

Set the replace_mode parameter on both subgraph nodes.
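
As a rough sketch of the rule above, the following Python checks that both subgraph nodes share one `replace_mode` value and looks up the matching mask background from the table. The dictionary of node settings and both function names are hypothetical stand-ins for illustration; they are not a ComfyUI data structure or API.

```python
from __future__ import annotations


def driving_mask_background(replace_mode: bool = True) -> str:
    """Mask background per the mode table: white for Replacement, black for Animation."""
    return "white" if replace_mode else "black"


def check_replace_mode(nodes: dict[str, dict]) -> bool:
    """Return the shared replace_mode, or raise if the nodes disagree.

    `nodes` maps a node name to its parameter dict (hypothetical layout).
    A missing replace_mode falls back to the documented default, true.
    """
    if not nodes:
        raise ValueError("no subgraph nodes given")
    modes = {name: params.get("replace_mode", True) for name, params in nodes.items()}
    if len(set(modes.values())) > 1:
        raise ValueError(f"replace_mode differs between nodes: {modes}")
    return next(iter(modes.values()))


if __name__ == "__main__":
    settings = {
        "Base": {"replace_mode": False},
        "Extend": {"replace_mode": False},
    }
    mode = check_replace_mode(settings)
    print("Animation" if not mode else "Replacement", driving_mask_background(mode))
```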

Inputs and Parameters

Shared Parameters (Base & Extend)

Parameter	Description
`pose_video`	The driving video containing motion to transfer
`reference_image`	The character image to animate or insert
`prompt`	Output video description
`replace_mode`	`true` = Replacement, `false` = Animation
`segment_index`	`1` for first chunk, `2+` for continuation. Pose offset = `76 × (index − 1)`
`width` / `height`	Output resolution, e.g. `896×512`. Must be divisible by 16
`frame_count`	Frames per segment (default: 81)
`previous_frame_count`	Overlap frames between segments (default: 5)
`pose_strength` / `pose_start` / `pose_end`	Pose conditioning strength and timing
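
The constraints in this table can be checked before a run with a small helper. The sketch below is an assumption-laden illustration: the `SegmentParams` class and `snap_down_to_16` are hypothetical, the field names mirror the table, `896×512` is only the example resolution given above, and the overlap check is an added sanity rule rather than something the guide states.

```python
from dataclasses import dataclass


def snap_down_to_16(value: int) -> int:
    """Round a dimension down to the nearest multiple of 16."""
    return (value // 16) * 16


@dataclass
class SegmentParams:
    # Hypothetical container; field names follow the parameter table.
    pose_video: str
    reference_image: str
    prompt: str = ""
    replace_mode: bool = True        # true = Replacement, false = Animation
    segment_index: int = 1           # 1 for the first chunk, 2+ for continuation
    width: int = 896                 # example resolution from the guide
    height: int = 512
    frame_count: int = 81            # default frames per segment
    previous_frame_count: int = 5    # default overlap frames

    def validate(self) -> None:
        """Raise ValueError if a documented constraint is violated."""
        for name in ("width", "height"):
            value = getattr(self, name)
            if value <= 0 or value % 16 != 0:
                raise ValueError(f"{name}={value} must be a positive multiple of 16")
        if self.segment_index < 1:
            raise ValueError("segment_index starts at 1")
        # Assumption: the overlap must be shorter than one segment.
        if not 0 <= self.previous_frame_count < self.frame_count:
            raise ValueError("previous_frame_count must be smaller than frame_count")


if __name__ == "__main__":
    params = SegmentParams("dance.mp4", "hero.png", width=snap_down_to_16(900))
    params.validate()
    print(params.width, params.height)
```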

SAM3 Tracking (two inputs)

The sam3_video_object and sam3_image_object inputs control the SAM3 mask tracking — not the SCAIL-2 output prompt. These determine which objects are tracked for the colored masks:

Input	Target	Output
`sam3_video_object`	Driving video	`pose_video_mask`
`sam3_image_object`	Reference image	`reference_image_mask`

Use open-vocabulary text (default: human)
Use the same term when the subject is the same across video and reference
Use different terms if the video and reference need different focus (e.g., crowded scenes)

Model Installation

Update ComfyUI to the latest version first for the built-in WanSCAILToVideo and SCAIL2ColoredMask nodes.

Required Models

diffusion_models

wan2.1_14B_SCAIL_2_fp16.safetensors

text_encoders (choose one)

umt5_xxl_fp8_e4m3fn_scaled.safetensors

clip_vision

clip_vision_h.safetensors

vae

Wan2_1_VAE_bf16.safetensors

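loras

lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors
wan2.1_SCAIL_2_DPO_lora_bf16.safetensors
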
checkpoints

sam3.1_multiplex_fp16.safetensors

File Storage Locations

ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── wan2.1_14B_SCAIL_2_fp16.safetensors
│   ├── text_encoders/
│   │   └── umt5_xxl_fp8_e4m3fn_scaled.safetensors
│   ├── clip_vision/
│   │   └── clip_vision_h.safetensors
│   ├── vae/
│   │   └── Wan2_1_VAE_bf16.safetensors
│   ├── loras/
│   │   ├── lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors
│   │   └── wan2.1_SCAIL_2_DPO_lora_bf16.safetensors
│   └── checkpoints/
│       └── sam3.1_multiplex_fp16.safetensors
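
To confirm that every file sits where the tree above expects it, a short script along these lines may help. It is a sketch, not an official tool: the `missing_models` function is hypothetical, and the `ComfyUI` root path in the usage part is an assumption to adjust for your installation.

```python
from __future__ import annotations

from pathlib import Path

# Folder -> file names, copied from the storage layout above.
REQUIRED_MODELS = {
    "diffusion_models": ["wan2.1_14B_SCAIL_2_fp16.safetensors"],
    "text_encoders": ["umt5_xxl_fp8_e4m3fn_scaled.safetensors"],
    "clip_vision": ["clip_vision_h.safetensors"],
    "vae": ["Wan2_1_VAE_bf16.safetensors"],
    "loras": [
        "lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors",
        "wan2.1_SCAIL_2_DPO_lora_bf16.safetensors",
    ],
    "checkpoints": ["sam3.1_multiplex_fp16.safetensors"],
}


def missing_models(comfyui_root: str) -> list[str]:
    """Return 'folder/file' entries that are absent under <root>/models."""
    models_dir = Path(comfyui_root) / "models"
    return [
        f"{folder}/{name}"
        for folder, names in REQUIRED_MODELS.items()
        for name in names
        if not (models_dir / folder / name).is_file()
    ]


if __name__ == "__main__":
    # Assumed install location; change to your ComfyUI directory.
    missing = missing_models("ComfyUI")
    if missing:
        for entry in missing:
            print("missing:", entry)
    else:
        print("all required model files found")
```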
