Inputs
| Parameter | Description | Data Type | Required | Range |
|---|---|---|---|---|
| seconds | The duration of the audio in seconds (default: 47.6) | FLOAT | Yes | 1.0 - 1000.0 |
| batch_size | The number of latent images in the batch (default: 1) | INT | Yes | 1 - 4096 |
Outputs
| Output Name | Description | Data Type |
|---|---|---|
| LATENT | Returns an empty latent tensor for audio processing with the specified duration and batch size. The tensor has a shape of [batch_size, 64, length], where length is calculated from the audio duration and sample rate. The output also includes metadata indicating the type is “audio” and a temporal downscale ratio of 2048. | LATENT |
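
To make the shape calculation concrete, here is a minimal sketch of how `length` could be derived from `seconds`. The function name `empty_latent_audio_shape` is hypothetical, and the 44100 Hz sample rate and the rounding to an even number of frames are assumptions that the table above does not state; only the 64 channels, the downscale ratio of 2048, and the input ranges come from this page.

```python
def empty_latent_audio_shape(seconds, batch_size=1,
                             sample_rate=44100,      # assumption: not stated on this page
                             downscale_ratio=2048,   # temporal downscale ratio from the Outputs table
                             channels=64):           # channel count from the Outputs table
    """Return the (batch_size, channels, length) shape of the empty audio latent."""
    # Enforce the documented input ranges.
    if not 1.0 <= seconds <= 1000.0:
        raise ValueError("seconds must be between 1.0 and 1000.0")
    if not 1 <= batch_size <= 4096:
        raise ValueError("batch_size must be between 1 and 4096")

    # Number of latent frames: audio samples divided by the downscale ratio.
    # Rounding to the nearest even number is an assumption of this sketch.
    length = round(seconds * sample_rate / downscale_ratio / 2) * 2
    return (batch_size, channels, length)


# Shape for the default inputs (seconds=47.6, batch_size=1).
print(empty_latent_audio_shape(47.6, 1))
```

A tensor of zeros with this shape, together with the "audio" type metadata, would correspond to the LATENT output described above.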