# LTXVReferenceAudio - ComfyUI Built-in Node Documentation

> Complete documentation for the LTXVReferenceAudio node in ComfyUI. Learn its inputs, outputs, parameters and usage.

The LTXV Reference Audio node performs speaker identity transfer in audio generation. It encodes a reference audio clip with the LTXV Audio VAE and adds the result to the positive and negative conditioning, so the generated audio adopts the reference speaker's voice characteristics. It can also patch the model with identity guidance, which runs an extra forward pass without the reference at each sampling step to amplify the speaker identity effect.

## Inputs

| Parameter                 | Data Type    | Required | Range       | Description                                                                                                                                                            |
| ------------------------- | ------------ | -------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`                   | MODEL        | Yes      | -           | The model to be patched with identity guidance.                                                                                                                        |
| `positive`                | CONDITIONING | Yes      | -           | The positive conditioning input.                                                                                                                                       |
| `negative`                | CONDITIONING | Yes      | -           | The negative conditioning input.                                                                                                                                       |
| `reference_audio`         | AUDIO        | Yes      | -           | Reference audio clip whose speaker identity is transferred. About 5 seconds is recommended (the training duration); shorter or longer clips may degrade the transfer.  |
| `audio_vae`               | VAE          | Yes      | -           | LTXV Audio VAE for encoding the reference audio.                                                                                                                       |
| `identity_guidance_scale` | FLOAT        | No       | 0.0 - 100.0 | Strength of identity guidance. Adds a forward pass without the reference at each step to amplify speaker identity. Set to 0 to disable (no extra pass). (default: 3.0) |
| `start_percent`           | FLOAT        | No       | 0.0 - 1.0   | Start of the sigma range where identity guidance is active. (default: 0.0)                                                                                             |
| `end_percent`             | FLOAT        | No       | 0.0 - 1.0   | End of the sigma range where identity guidance is active. (default: 1.0)                                                                                               |
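
The interaction of `identity_guidance_scale`, `start_percent`, and `end_percent` can be illustrated with a minimal sketch. It assumes a classifier-free-guidance-style extrapolation between the prediction made with the reference and the one made without it. This page does not state the exact formula the node uses, so the arithmetic, the function names `guidance_active` and `apply_identity_guidance`, and the linear `progress` value are illustrative assumptions, not the node's implementation.

```python
from typing import List


def guidance_active(progress: float, start_percent: float = 0.0,
                    end_percent: float = 1.0) -> bool:
    """Return True when identity guidance should run at this point.

    `progress` is a hypothetical 0.0-1.0 position in the sampling schedule.
    The node itself expresses the window as a sigma range; the linear
    mapping used here is an assumption made for illustration only.
    """
    return start_percent <= progress <= end_percent


def apply_identity_guidance(pred_with_ref: List[float],
                            pred_without_ref: List[float],
                            scale: float) -> List[float]:
    """Push the prediction away from the no-reference result (assumed form).

    With scale == 0 the guidance is disabled and the with-reference
    prediction is returned unchanged; a caller would then skip the extra
    forward pass entirely.
    """
    if scale == 0:
        return list(pred_with_ref)
    return [w + scale * (w - wo)
            for w, wo in zip(pred_with_ref, pred_without_ref)]


# Toy values standing in for model predictions.
with_ref = [1.0, 2.0]
without_ref = [0.5, 2.0]

if guidance_active(progress=0.5, start_percent=0.0, end_percent=1.0):
    guided = apply_identity_guidance(with_ref, without_ref, scale=3.0)
else:
    guided = list(with_ref)
```

Under this assumed form, a larger scale exaggerates whatever the reference contributes to the prediction, and the extra pass is only needed while the sampler is inside the configured window.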

## Outputs

| Output Name | Data Type    | Description                                                                 |
| ----------- | ------------ | --------------------------------------------------------------------------- |
| `model`     | MODEL        | The model patched with the identity guidance function.                      |
| `positive`  | CONDITIONING | The positive conditioning, now containing the encoded reference audio data. |
| `negative`  | CONDITIONING | The negative conditioning, now containing the encoded reference audio data. |
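
As a rough illustration of how the inputs and outputs above fit together, the following sketch builds a workflow fragment in ComfyUI's API (prompt) format, where each link is written as `[source_node_id, output_index]`. The node IDs, the upstream nodes they point to, and the assumption that the `class_type` matches the page title and that output indices follow the order of the Outputs table are all hypothetical; check them against a workflow exported from your own ComfyUI instance.

```python
import json

# Hypothetical IDs of upstream nodes that provide each input.
MODEL_NODE, POSITIVE_NODE, NEGATIVE_NODE = "1", "2", "3"
AUDIO_NODE, AUDIO_VAE_NODE = "4", "5"
REFERENCE_NODE = "10"  # hypothetical ID for the LTXVReferenceAudio node

workflow_fragment = {
    REFERENCE_NODE: {
        "class_type": "LTXVReferenceAudio",  # assumed to match the page title
        "inputs": {
            "model": [MODEL_NODE, 0],
            "positive": [POSITIVE_NODE, 0],
            "negative": [NEGATIVE_NODE, 0],
            "reference_audio": [AUDIO_NODE, 0],
            "audio_vae": [AUDIO_VAE_NODE, 0],
            # Optional inputs, shown with their documented defaults.
            "identity_guidance_scale": 3.0,
            "start_percent": 0.0,
            "end_percent": 1.0,
        },
    }
}

# Links a downstream node would use, assuming outputs are indexed in the
# order of the Outputs table: model, positive, negative.
patched_model = [REFERENCE_NODE, 0]
positive_out = [REFERENCE_NODE, 1]
negative_out = [REFERENCE_NODE, 2]

print(json.dumps(workflow_fragment, indent=2))
```

All three outputs should feed the downstream sampler together: the conditioning carries the encoded reference audio, while the patched model carries the identity guidance function.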

***

**Source fingerprint (SHA-256):** `a25e24a08df73b8a34fd476544634e396a0eec5b6dc630e911c371f1b16931b8`
