The Generate Voice Audio Preview API lets you synthesize a short audio clip using a selected voice and sample text, allowing you to test tone, pronunciation, and pacing before finalizing your voice selection for video generation.

📘
Note:
This API is Enterprise-only and consumes API credits.

1. Provide a Voice ID and Sample Text

To generate an audio preview, you need to provide:

A valid voice_id, which can be retrieved from the List All Voices (V2) endpoint.
A short text sample (or SSML input) that you’d like to preview.

The API will generate an audio clip and return a playback URL along with additional details such as the duration and word-level timestamps.

2. Generate the Preview

Once you’ve selected your preferred voice from the list of available voices, you can use its voice_id to generate a preview. In the request body, provide:

The voice_id of the chosen voice.
The text you want to synthesize.

Optionally, specify text_type as plain or ssml depending on your input format.
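As a rough illustration, the request body described above could be assembled in Python as follows. This is a minimal sketch, not the official client: the endpoint URL, the `X-Api-Key` header name, and the `text` field name are assumptions made for the example, so check the API reference for the actual values. Only `voice_id` and `text_type` are taken from this guide.

```python
import json
import urllib.request

# Hypothetical placeholders: take the real URL and auth header from the API reference.
PREVIEW_URL = "https://api.example.com/voices/audio-preview"
API_KEY = "YOUR_API_KEY"

ALLOWED_TEXT_TYPES = {"plain", "ssml"}


def build_preview_payload(voice_id, text, text_type=None):
    """Build the JSON body for a preview request, validating inputs locally."""
    if not voice_id:
        raise ValueError("voice_id is required")
    if not text:
        raise ValueError("text is required")
    # "text" is an assumed field name for the sample text.
    payload = {"voice_id": voice_id, "text": text}
    if text_type is not None:
        if text_type not in ALLOWED_TEXT_TYPES:
            raise ValueError("text_type must be 'plain' or 'ssml'")
        payload["text_type"] = text_type
    return payload


def build_preview_request(payload):
    """Wrap the payload in a POST request object without sending it."""
    return urllib.request.Request(
        PREVIEW_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": API_KEY},
        method="POST",
    )


payload = build_preview_payload(
    "example-voice-id", "Hello, this is a preview.", text_type="plain"
)
request = build_preview_request(payload)
# The request is only constructed here; urllib.request.urlopen(request) would send it.
```

Validating `text_type` before sending helps avoid spending API credits on a request that would be rejected anyway.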

The API responds with an audio_url for the generated preview and metadata such as preview duration, timestamps, and job status.
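The response handling might look like the sketch below. Only `audio_url` is named in this guide. The `duration`, `timestamps`, and `status` keys, the flat response layout, and the sample body are all assumptions made for illustration, so the real response may be shaped differently.

```python
import json


def summarize_preview(body):
    """Extract the playback URL and optional metadata from a JSON response body."""
    data = json.loads(body)
    if "audio_url" not in data:
        raise ValueError("response does not contain audio_url")
    return {
        "audio_url": data["audio_url"],
        # The keys below are assumed names; missing keys simply yield None or 0.
        "duration": data.get("duration"),
        "word_count": len(data.get("timestamps") or []),
        "status": data.get("status"),
    }


# Illustrative body only, not real API output.
sample_body = json.dumps(
    {
        "audio_url": "https://example.com/preview.mp3",
        "duration": 1.2,
        "timestamps": [
            {"word": "Hello", "start": 0.0, "end": 0.5},
            {"word": "world", "start": 0.6, "end": 1.2},
        ],
        "status": "completed",
    }
)

summary = summarize_preview(sample_body)
print(summary["audio_url"], summary["word_count"])
```

Reading the optional fields with `.get` keeps the code tolerant of metadata that is absent or named differently in the actual response.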

For more information about request and response parameters, see the detailed API reference: Generate Voice Audio Preview.