- 3.5B Parameter Model: Efficient yet powerful model size for high-quality anime generation
- Next-DiT Architecture: Builds on research from the Lumina architecture, combined with a newly designed NewBie architecture
- Dual Text Encoders: Uses Gemma3-4B-it as primary encoder with Jina CLIP v2 for improved prompt understanding
- FLUX VAE: Utilizes FLUX.1-dev 16-channel VAE for richer colors and finer texture details
- XML Structured Prompts: Supports XML format for better attention binding and attribute disentanglement
NewBie-image text-to-image workflow
Download JSON Workflow File
Run on ComfyUI Cloud
Model links
text_encoders diffusion_models vae Model Storage Location

Prompt format
NewBie-image is an anime image generation model optimized for character generation. It was trained on XML structured prompts, in which each opening tag (such as <appearance> or <clothing>) names a category and the matching closing tag (such as </appearance>) ends it. The content inside each tag is standard Danbooru tags. This structure enables precise control over multi-character scenes with better attribute binding.
For the complete prompt writing guide, see the official documentation.
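Because every category must be opened and closed by a matching pair of tags, it can help to check a prompt for unbalanced tags before sending it to the model. The following is a minimal sketch, not part of NewBie-image or ComfyUI; the helper names `TAG_RE` and `check_balanced` are hypothetical, and the check only looks at tag pairing, not at which tag names the model accepts.

```python
import re

# Matches an opening tag like <appearance> or a closing tag like </appearance>.
TAG_RE = re.compile(r"<(/?)([A-Za-z_][\w-]*)>")


def check_balanced(prompt: str) -> list[str]:
    """Return a list of pairing problems; an empty list means all tags match."""
    problems = []
    stack = []  # names of tags that are currently open
    for match in TAG_RE.finditer(prompt):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if not is_closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected </{name}>")
    # Anything still on the stack was opened but never closed.
    problems.extend(f"unclosed <{name}>" for name in stack)
    return problems


if __name__ == "__main__":
    good = "<appearance>long hair, blue eyes</appearance>"
    bad = "<appearance>long hair, blue eyes"
    print(check_balanced(good))
    print(check_balanced(bad))
```

A stack is used so that nested tags are also handled: a closing tag must match the most recently opened one.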
NewBie-image-Exp0.1 supports three prompt formats:
- Natural language: Standard text descriptions
- Tags: Danbooru-style tags
- XML structured format: Recommended for multi-character scenes
XML structured prompt
For multi-character scenes, XML structured prompts typically produce more accurate images, with better attention binding and attribute disentanglement.

XML tag reference
| Tag | Description |
|---|---|
| `<n>` | Character name/identifier |
| `<gender>` | Character gender (1girl, 1boy, etc.) |
| `<appearance>` | Physical features (hair, eyes, body type) |
| `<clothing>` | Outfit and accessories |
| `<expression>` | Facial expression |
| `<action>` | Pose and actions |
| `<position>` | Position in the image |
| `<count>` | Number of characters |
| `<style>` | Art style |
| `<background>` | Background description |
| `<atmosphere>` | Overall mood |
| `<quality>` | Quality tags |
| `<objects>` | Objects in the scene |
| `<other>` | Additional tags |
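
The tags in the table can be assembled into a prompt programmatically. The sketch below is illustrative only: the wrapper element names `character_N` and `general_tags`, the split of tags into per-character and scene-level groups, and the helper names `render_block` and `build_prompt` are all assumptions, not confirmed by this page. Check the official prompt writing guide for the exact layout the model expects.

```python
# Assumed grouping of the tags from the reference table (not confirmed here).
CHARACTER_TAGS = ("n", "gender", "appearance", "clothing",
                  "expression", "action", "position")
GENERAL_TAGS = ("count", "style", "background", "atmosphere",
                "quality", "objects", "other")


def render_block(wrapper, fields, allowed):
    """Render one wrapper element containing the given category tags."""
    unknown = set(fields) - set(allowed)
    if unknown:
        raise ValueError(f"unknown tags in <{wrapper}>: {sorted(unknown)}")
    lines = [f"<{wrapper}>"]
    for tag in allowed:  # emit tags in a fixed, predictable order
        if tag in fields:
            value = fields[tag]
            # Accept a single string or a list of Danbooru tags.
            text = value if isinstance(value, str) else ", ".join(value)
            lines.append(f"  <{tag}>{text}</{tag}>")
    lines.append(f"</{wrapper}>")
    return "\n".join(lines)


def build_prompt(characters, general):
    """Build an XML structured prompt from character dicts and scene tags."""
    blocks = [
        render_block(f"character_{i}", fields, CHARACTER_TAGS)
        for i, fields in enumerate(characters, start=1)
    ]
    blocks.append(render_block("general_tags", general, GENERAL_TAGS))
    return "\n".join(blocks)


if __name__ == "__main__":
    prompt = build_prompt(
        characters=[
            {"n": "alice", "gender": "1girl",
             "appearance": ["long hair", "blue eyes"],
             "clothing": ["school uniform"], "position": "left"},
            {"n": "beth", "gender": "1girl",
             "appearance": ["short hair", "red eyes"],
             "expression": ["smile"], "position": "right"},
        ],
        general={"count": "2girls", "style": ["anime"],
                 "background": ["classroom"], "quality": ["masterpiece"]},
    )
    print(prompt)
```

Keeping each character's attributes inside its own element is what lets the prompt state which hair color or outfit belongs to which character, which is the attribute binding benefit described above.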