This API now generates videos with our new AI Studio backend.
Request Body
Field | Type | Description |
---|---|---|
caption | bool (optional) | Whether to add a caption to the video. Default is |
title | str (optional) | Title for the video. |
callback_id | str (optional) | A custom ID for callback purposes. |
video_inputs | List[VideoInput] | List of video input settings (scenes). Must contain between 1 and 50 items. Each video input describes the avatar, background, voice, and script, which together make up a scene. |
dimension | | The dimensions of the output video. |
folder_id | str (optional) | Specifies the destination folder for the output video. |
callback_url | str (optional) | An optional callback URL. This is useful if your callback endpoint is dynamic and each video has its own callback URL. |
VideoInput
Field | Type | Description |
---|---|---|
character | AvatarSettings or TalkingPhotoSettings (optional) | Character settings. |
voice | TextVoiceSettings or AudioVoiceSettings or SilenceVoiceSettings | Voice settings. |
background | ColorBackground or ImageBackground or VideoBackground (optional) | Background settings. |
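As a rough illustration of how the request body and a single scene fit together, the sketch below assembles a payload as a plain Python dictionary. The helper name `build_request_body` and the placeholder IDs are hypothetical, and the sketch assumes that fields with documented defaults may be omitted. It does not send any request.

```python
import json


def build_request_body(video_inputs, title=None, caption=None, callback_id=None):
    """Assemble a request body dict (hypothetical helper, not part of the API)."""
    # The table above states that video_inputs must hold between 1 and 50 scenes.
    if not 1 <= len(video_inputs) <= 50:
        raise ValueError("video_inputs must contain between 1 and 50 items")
    body = {"video_inputs": list(video_inputs)}
    # Only include optional fields the caller actually set.
    if title is not None:
        body["title"] = title
    if caption is not None:
        body["caption"] = caption
    if callback_id is not None:
        body["callback_id"] = callback_id
    return body


# One scene: character, voice, and background (IDs are placeholders).
scene = {
    "character": {"type": "avatar", "avatar_id": "YOUR_AVATAR_ID"},
    "voice": {"type": "text", "voice_id": "YOUR_VOICE_ID", "input_text": "Hello!"},
    "background": {"type": "color", "value": "#f6f6fc"},
}

body = build_request_body([scene], title="Demo video")
print(json.dumps(body, indent=2))
```

Checking the scene count locally is a convenience only; the service remains the authority on what it accepts.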
Character Settings
AvatarSettings
Field | Type | Description |
---|---|---|
type | Literal["avatar"] | Indicates that this is an avatar character setting. |
avatar_id | str | Avatar ID. Note that this is not the same as the Avatar Group ID. |
scale | float | Avatar scale, value between 0 and 5.0. Default is 1.0. Use the Avatar Positioning tool for easier adjustment. |
avatar_style | CharacterRenderType (optional) | Avatar style. Supported values are: circle, normal, closeUp. |
offset | Offset | Avatar offset. Default is { "x": 0.0, "y": 0.0 }. Use the Avatar Positioning tool for easier adjustment. |
matting | bool (optional) | Whether to apply matting. |
circle_background_color | str (optional) | Background color inside the circle when using the circle style. |
Note: Currently, background-removed custom avatars are not supported in the API.
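The constraints in the AvatarSettings table can be checked before a request is built. The following is a minimal sketch; the function name `avatar_settings` is hypothetical, and it assumes the documented scale range is inclusive at both ends.

```python
AVATAR_STYLES = {"circle", "normal", "closeUp"}


def avatar_settings(avatar_id, scale=1.0, avatar_style=None, offset=None):
    """Build an AvatarSettings dict (hypothetical helper)."""
    # Assumption: the 0 to 5.0 range is inclusive.
    if not 0 <= scale <= 5.0:
        raise ValueError("scale must be between 0 and 5.0")
    if avatar_style is not None and avatar_style not in AVATAR_STYLES:
        raise ValueError(f"avatar_style must be one of {sorted(AVATAR_STYLES)}")
    settings = {
        "type": "avatar",
        "avatar_id": avatar_id,
        "scale": scale,
        # Fall back to the documented default offset.
        "offset": offset if offset is not None else {"x": 0.0, "y": 0.0},
    }
    if avatar_style is not None:
        settings["avatar_style"] = avatar_style
    return settings


character = avatar_settings("YOUR_AVATAR_ID", scale=1.2, avatar_style="circle")
```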
TalkingPhotoSettings
Field | Type | Description |
---|---|---|
type | Literal["talking_photo"] | Indicates that this is a talking photo character setting. |
talking_photo_id | str | Talking Photo ID. |
scale | float | Talking Photo scale, value between |
talking_photo_style | TACropStyle (optional) | Talking Photo crop style. Supported values are: |
offset | Offset | Talking Photo offset. |
talking_style | TPExpression | Talking Photo talking style. Default is |
expression | TPExpressionStyle | Talking Photo expression style. Default is |
super_resolution | bool (optional) | Whether to enhance this photar image. |
matting | bool (optional) | Whether to do matting. |
circle_background_color | str (optional) | Background color inside the circle or square when using the circle or square style. |
Voice Settings
TextVoiceSettings
Field | Type | Description |
---|---|---|
type | Literal["text"] | Indicates that this is a text voice setting. |
voice_id | str | Voice ID. |
input_text | str | Input text. |
speed | float (optional) | Voice speed, value between 0.5 and 1.5. Default is 1. |
pitch | int (optional) | Voice pitch, value between -50 and 50. Default is 0. |
emotion | str (optional) | Voice emotion, if the voice supports emotion. Supported values are: Excited, Friendly, Serious, Soothing, Broadcaster. |
locale | str (optional) | Specifies the accent or locale for multilingual voices (e.g., en-US, en-IN, pt-PT, pt-BR). |
elevenlabs_settings | ElevenLabsSettings object (optional) | ElevenLabs specific voice settings. |
ElevenLabsSettings
Field | Type | Description |
---|---|---|
model | str | The ElevenLabs model to use. Valid options: eleven_monolingual_v1, eleven_multilingual_v1, eleven_multilingual_v2, eleven_turbo_v2, eleven_turbo_v2_5. |
similarity_boost | float | Controls how similar the generated speech should be to the original voice. Range: 0.0 to 1.0 |
stability | float | Controls the stability of the voice generation. Higher values result in more consistent and stable output. Range: 0.0 to 1.0 |
style | float | Controls the style intensity of the generated speech. Range: 0.0 to 1.0 |
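The numeric ranges and model names listed for TextVoiceSettings and ElevenLabsSettings lend themselves to a small client-side check. This is a sketch only: the helper names are hypothetical, and the ranges are assumed to be inclusive.

```python
ELEVENLABS_MODELS = {
    "eleven_monolingual_v1",
    "eleven_multilingual_v1",
    "eleven_multilingual_v2",
    "eleven_turbo_v2",
    "eleven_turbo_v2_5",
}


def check_range(name, value, low, high):
    """Raise ValueError if value falls outside [low, high] (assumed inclusive)."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}")


def text_voice_settings(voice_id, input_text, speed=1.0, pitch=0, elevenlabs=None):
    """Build a TextVoiceSettings dict (hypothetical helper)."""
    check_range("speed", speed, 0.5, 1.5)
    check_range("pitch", pitch, -50, 50)
    voice = {
        "type": "text",
        "voice_id": voice_id,
        "input_text": input_text,
        "speed": speed,
        "pitch": pitch,
    }
    if elevenlabs is not None:
        if elevenlabs.get("model") not in ELEVENLABS_MODELS:
            raise ValueError("unknown ElevenLabs model")
        # similarity_boost, stability, and style all share the 0.0 to 1.0 range.
        for key in ("similarity_boost", "stability", "style"):
            if key in elevenlabs:
                check_range(key, elevenlabs[key], 0.0, 1.0)
        voice["elevenlabs_settings"] = dict(elevenlabs)
    return voice


voice = text_voice_settings(
    "YOUR_VOICE_ID",
    "Welcome to the demo.",
    speed=1.1,
    elevenlabs={"model": "eleven_turbo_v2_5", "stability": 0.6},
)
```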
AudioVoiceSettings
Field | Type | Description |
---|---|---|
type | Literal["audio"] | Indicates that this is an audio voice setting. |
audio_url | str (optional) | Audio URL. |
audio_asset_id | str (optional) | Audio asset ID. Either audio_url or audio_asset_id must be provided. |
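Because AudioVoiceSettings needs either `audio_url` or `audio_asset_id`, a builder can refuse to produce a setting that has neither. The sketch below uses a hypothetical helper name; how the service treats a request that supplies both fields is not stated above, so the helper simply passes through whatever is given.

```python
def audio_voice_settings(audio_url=None, audio_asset_id=None):
    """Build an AudioVoiceSettings dict (hypothetical helper)."""
    if audio_url is None and audio_asset_id is None:
        raise ValueError("either audio_url or audio_asset_id must be provided")
    voice = {"type": "audio"}
    if audio_url is not None:
        voice["audio_url"] = audio_url
    if audio_asset_id is not None:
        voice["audio_asset_id"] = audio_asset_id
    return voice


# Placeholder URL for illustration only.
voice_from_url = audio_voice_settings(audio_url="https://example.com/narration.mp3")
```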
SilenceVoiceSettings
Field | Type | Description |
---|---|---|
type | Literal["silence"] | Indicates that this is a silence voice setting. |
duration | float | Duration of silence, value between 1.0 and 100.0. Default is 1.0. |
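One plausible use of a silence voice is a pause scene placed between two spoken scenes. The sketch below assumes the duration range is inclusive and that a scene may omit `character` (it is marked optional in VideoInput); the helper name and placeholder IDs are hypothetical.

```python
def silence_voice_settings(duration=1.0):
    """Build a SilenceVoiceSettings dict (hypothetical helper)."""
    # Assumption: the 1.0 to 100.0 range is inclusive.
    if not 1.0 <= duration <= 100.0:
        raise ValueError("duration must be between 1.0 and 100.0")
    return {"type": "silence", "duration": duration}


def spoken_scene(text):
    """A text-voice scene with placeholder IDs."""
    return {
        "character": {"type": "avatar", "avatar_id": "YOUR_AVATAR_ID"},
        "voice": {"type": "text", "voice_id": "YOUR_VOICE_ID", "input_text": text},
    }


# Two spoken scenes separated by a two-second pause scene.
video_inputs = [
    spoken_scene("First point."),
    {"voice": silence_voice_settings(2.0)},
    spoken_scene("Second point."),
]
```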
Background Settings
ColorBackground
Field | Type | Description |
---|---|---|
type | Literal["color"] | Indicates that this is a color background setting. Default is color. |
value | str | Color value in hex format. Default is #f6f6fc. |
ImageBackground
Field | Type | Description |
---|---|---|
type | Literal["image"] | Indicates that this is an image background setting. |
url | str (optional) | Image URL. |
image_asset_id | str (optional) | Image asset ID. Either url or image_asset_id must be provided. |
fit | str (optional) | How the background image fits the screen. Choose among cover, crop, contain, and none. Default is cover. |
VideoBackground
Field | Type | Description |
---|---|---|
type | Literal["video"] | Indicates that this is a video background setting. |
url | str (optional) | Video URL. |
video_asset_id | str (optional) | Video asset ID. Either url or video_asset_id must be provided. |
play_style | VideoPlayback | Video play style. Supported values are: fit_to_scene, freeze, loop, once. |
fit | str (optional) | How the background video fits the screen. Choose among cover, crop, contain, and none. Default is cover. |
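Image and video backgrounds share most of their fields, so a single builder can cover both. This is a sketch under assumptions: the helper name `media_background` is hypothetical, and `play_style` is treated as required for video backgrounds because the table does not mark it optional or give a default.

```python
FITS = {"cover", "crop", "contain", "none"}
PLAY_STYLES = {"fit_to_scene", "freeze", "loop", "once"}


def media_background(kind, url=None, asset_id=None, fit="cover", play_style=None):
    """Build an ImageBackground or VideoBackground dict (hypothetical helper)."""
    if kind not in ("image", "video"):
        raise ValueError("kind must be 'image' or 'video'")
    if url is None and asset_id is None:
        raise ValueError(f"either url or {kind}_asset_id must be provided")
    if fit not in FITS:
        raise ValueError(f"fit must be one of {sorted(FITS)}")
    background = {"type": kind, "fit": fit}
    if url is not None:
        background["url"] = url
    if asset_id is not None:
        # The asset field is image_asset_id or video_asset_id, depending on kind.
        background[f"{kind}_asset_id"] = asset_id
    if kind == "video":
        if play_style not in PLAY_STYLES:
            raise ValueError(f"play_style must be one of {sorted(PLAY_STYLES)}")
        background["play_style"] = play_style
    return background


image_bg = media_background("image", url="https://example.com/office.jpg", fit="contain")
video_bg = media_background("video", asset_id="YOUR_VIDEO_ASSET_ID", play_style="loop")
```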
Response
Field | Type | Description |
---|---|---|
video_id | str | ID of the generated video. |
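A caller will usually want to pull `video_id` out of the response and keep it for later use. The sketch below assumes the field appears at the top level of a JSON response body; if the service wraps it in an envelope object, the lookup would need adjusting. The function name is hypothetical.

```python
import json


def extract_video_id(response_text):
    """Return video_id from a JSON response body (hypothetical helper)."""
    payload = json.loads(response_text)
    video_id = payload.get("video_id") if isinstance(payload, dict) else None
    if not isinstance(video_id, str) or not video_id:
        raise ValueError("response does not contain a video_id")
    return video_id


# Sample body for illustration; the ID is made up.
sample_id = extract_video_id('{"video_id": "abc123"}')
print(sample_id)  # prints abc123
```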