New Session

This endpoint is used to initiate a new streaming session with an Interactive Avatar. It sets up a fresh session, allowing for real-time interactions and communication.


Request Body

Field

Type

Description

quality

string

The quality of the data to be retrieved. Can be "high", "medium", or "low".
high: 2000kbps and 720p.
medium: 1000kbps and 480p.
low: 500kbps and 360p.

avatar_id

string (optional)

The ID of the Interactive Avatar to use. If not provided, a default avatar will be chosen. Default: default

voice

VoiceSetting (optional)

The settings for the Interactive Avatar's voice.

stt_settings

STTSettings (optional) beta

The speech-to-text settings control how audio is converted into text.

video_encoding

string (optional)

Specifies the encoding format for streaming video. Can be "H264" or "VP8". Default: VP8.
Note: Choosing "H264" may offer better compatibility with certain devices and platforms.

knowledge_base

string (optional)

Knowledge Base prompt used for chat task type.

version

string (optional) beta

Specifies the version to use. Currently, the only valid value is v2.
Default: Not specified (uses v1).

knowledge_base_id

string (optional) beta

The ID of the knowledge base to use for the avatar's responses. Only applicable when version is set to v2.

disable_idle_timeout

boolean (optional)

By default, a session has a 2-minute idle timeout. Setting this to true disables it.
⚠️ Do not use this feature without proper session management, as open sessions can consume your API credits!

activity_idle_timeout

integer (optional)

Specifies the maximum idle time in seconds allowed after the last user activity before the session is considered inactive. Default value: 120 seconds. Min value: 30 seconds, Max value: 3600 seconds.

livekit_settings

LiveKitSettings (optional) beta

Settings for connecting to your own LiveKit instance.
⚠️ Only provide this if you are using your own LiveKit, not HeyGen-managed LiveKit.
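
As a rough illustration of how the fields above fit together, the following Python sketch assembles a request body and checks the documented constraints before anything is sent. The helper name `build_new_session_body` is hypothetical, and `quality` is treated as required only because the table does not mark it optional. The sketch only builds the JSON payload; the endpoint URL and authentication header are not given in this section, so they are left out.

```python
import json

# Allowed values taken from the Request Body table above.
QUALITIES = {"high", "medium", "low"}
VIDEO_ENCODINGS = {"H264", "VP8"}


def build_new_session_body(quality, avatar_id=None, video_encoding=None,
                           activity_idle_timeout=None,
                           disable_idle_timeout=None,
                           version=None, knowledge_base_id=None):
    """Hypothetical helper: assemble a New Session request body as a dict."""
    if quality not in QUALITIES:
        raise ValueError(f"quality must be one of {sorted(QUALITIES)}")
    if video_encoding is not None and video_encoding not in VIDEO_ENCODINGS:
        raise ValueError(
            f"video_encoding must be one of {sorted(VIDEO_ENCODINGS)}")
    if activity_idle_timeout is not None and not (
            30 <= activity_idle_timeout <= 3600):
        raise ValueError(
            "activity_idle_timeout must be between 30 and 3600 seconds")
    if knowledge_base_id is not None and version != "v2":
        raise ValueError("knowledge_base_id only applies when version is 'v2'")

    body = {"quality": quality}
    optional = {
        "avatar_id": avatar_id,
        "video_encoding": video_encoding,
        "activity_idle_timeout": activity_idle_timeout,
        "disable_idle_timeout": disable_idle_timeout,
        "version": version,
        "knowledge_base_id": knowledge_base_id,
    }
    # Omit unset fields so the server-side defaults apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_new_session_body("medium", video_encoding="H264",
                              activity_idle_timeout=300)
print(json.dumps(body, indent=2))
```

Leaving unset fields out of the payload, rather than sending explicit nulls, is a design choice of this sketch so that the documented defaults remain in effect.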

VoiceSetting

Field

Type

Description

voice_id

string (optional)

Voice for your Interactive Avatar. See the available voices by calling the List Voices endpoint. Note: Not every voice is supported in the streaming API.

rate

float (optional)

Voice speed rate. Default is 1.

emotion

string (optional)

Emotion to use for Emotional voices. Available emotions are Excited, Serious, Friendly, Soothing, Broadcaster.

elevenlabs_settings

ElevenlabsSettings (optional)

Voice settings to pass over if the voice provider for the session is Elevenlabs.

ElevenlabsSettings

Field

Type

Description

stability

float (optional)

Default is 0.75.

model_id

string (optional)

Voice model ID. The available ElevenLabs models are eleven_flash_v2_5 and eleven_multilingual_v2. Default: eleven_flash_v2_5.

similarity_boost

float (optional)

Default is 0.75.

style

float (optional)

Default is 0.0.

use_speaker_boost

bool (optional)

Default is true.
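
The nesting of ElevenlabsSettings inside VoiceSetting can be sketched with Python dataclasses whose defaults mirror the values documented above. The class layout, the `to_payload` helper, and the `YOUR_VOICE_ID` placeholder are assumptions made for illustration, not part of the API.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Emotions listed in the VoiceSetting table.
EMOTIONS = {"Excited", "Serious", "Friendly", "Soothing", "Broadcaster"}


@dataclass
class ElevenlabsSettings:
    # Defaults copied from the ElevenlabsSettings table.
    stability: float = 0.75
    model_id: str = "eleven_flash_v2_5"
    similarity_boost: float = 0.75
    style: float = 0.0
    use_speaker_boost: bool = True


@dataclass
class VoiceSetting:
    voice_id: Optional[str] = None
    rate: float = 1.0
    emotion: Optional[str] = None
    elevenlabs_settings: Optional[ElevenlabsSettings] = None


def to_payload(voice):
    """Hypothetical helper: convert a VoiceSetting into a JSON-ready dict."""
    if voice.emotion is not None and voice.emotion not in EMOTIONS:
        raise ValueError(f"emotion must be one of {sorted(EMOTIONS)}")
    # asdict() also converts the nested ElevenlabsSettings into a dict.
    raw = asdict(voice)
    return {key: value for key, value in raw.items() if value is not None}


voice = VoiceSetting(
    voice_id="YOUR_VOICE_ID",  # placeholder, not a real voice ID
    emotion="Friendly",
    elevenlabs_settings=ElevenlabsSettings(stability=0.5),
)
voice_payload = to_payload(voice)
print(voice_payload)
```

The resulting dict would be passed as the `voice` field of the request body. Sending the default values explicitly, as this sketch does, should be equivalent to omitting them.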

STTSettings

Field

Type

Description

provider

string (optional) beta

STT model. Allowed values: deepgram, gladia, assembly_ai. assembly_ai is the default provider for English language transcription.

confidence

float (optional)

Default is 0.55.
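
A minimal sketch of building the `stt_settings` object follows. The helper name `build_stt_settings` is hypothetical, and the sketch does not range-check `confidence` because this section does not state its valid range.

```python
# Provider names from the STTSettings table; "assembly_ai" is assumed to be
# the intended spelling of the AssemblyAI option.
STT_PROVIDERS = {"deepgram", "gladia", "assembly_ai"}


def build_stt_settings(provider=None, confidence=None):
    """Hypothetical helper: build stt_settings, omitting unset fields."""
    settings = {}
    if provider is not None:
        if provider not in STT_PROVIDERS:
            raise ValueError(f"provider must be one of {sorted(STT_PROVIDERS)}")
        settings["provider"] = provider
    if confidence is not None:
        settings["confidence"] = float(confidence)
    return settings


stt_settings = build_stt_settings(provider="deepgram", confidence=0.7)
print(stt_settings)
```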

LiveKitSettings

Field

Type

Description

room

string (optional)

The LiveKit room to join. Must match the token’s room claim.

url

string (optional)

Your LiveKit server URL. Example: wss://mylivekit.example.com

token

string (optional)

A valid LiveKit access token with the required permissions.
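
Because `room` must match the token's room claim, it can help to check the two locally before creating a session. The sketch below assumes the token is a standard three-segment JWT that carries the room name under a `video.room` claim, which is how LiveKit access tokens are usually laid out. It reads the payload without verifying the signature, so it is a sanity check only. All helper names are hypothetical, and the demo token is unsigned and not valid for any real server.

```python
import base64
import json


def _b64url_decode(segment):
    # JWT segments drop base64 padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def token_room(token):
    """Read the room name from a JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    claims = json.loads(_b64url_decode(parts[1]))
    return claims.get("video", {}).get("room")


def build_livekit_settings(room, url, token):
    """Hypothetical helper: build livekit_settings after a local room check."""
    if token_room(token) != room:
        raise ValueError("room does not match the token's room claim")
    return {"room": room, "url": url, "token": token}


def make_demo_token(room):
    """Build an unsigned, demo-only token with the assumed claim layout."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(
        json.dumps({"video": {"room": room, "roomJoin": True}}).encode())
    return f"{header}.{payload}.unsigned-demo"


livekit_settings = build_livekit_settings(
    "demo-room", "wss://mylivekit.example.com", make_demo_token("demo-room"))
print(livekit_settings["room"], livekit_settings["url"])
```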

Response

Field

Type

Description

code

integer

The response status code.

message

string

A message providing more details about the request's result, typically explaining success or error.

data

object

Contains the main data for the response.

data.session_id

string

A unique identifier for the session.

data.url

string

The WebSocket URL for accessing the LiveKit room (e.g., wss://heygen-feapbkvq.livekit.cloud).

data.access_token

string

The access token required to authenticate and join the LiveKit room.

data.session_duration_limit

integer

The maximum allowed duration (in seconds) for the session, indicating any session time limits.

data.is_paid

boolean

A flag indicating whether the session is part of a paid plan.

data.realtime_endpoint

string

The real-time endpoint URL for alpha implementations, which could be used for experimental features (e.g., wss://webrtc-signaling.heygen.io/v2-alpha/...).

data.livekit_agent_token

string

Only returned when using HeyGen-managed LiveKit.
This token is for HeyGen’s audio agents (e.g., Pipecat / LiveKit Agent) to join the room and provide audio services.
Not needed when using your own LiveKit instance.
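
To show how the response fields above might be consumed, the following sketch parses a response body into a small dataclass. The sample JSON uses illustrative placeholder values and is not real output. The `code` value in particular is a placeholder, since this section does not state which value signals success. `SessionInfo` and `parse_new_session_response` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionInfo:
    session_id: str
    url: str
    access_token: str
    session_duration_limit: int
    is_paid: bool
    # Treated as optional here: livekit_agent_token is only returned for
    # HeyGen-managed LiveKit, and realtime_endpoint is described as alpha.
    realtime_endpoint: Optional[str] = None
    livekit_agent_token: Optional[str] = None


def parse_new_session_response(raw):
    """Hypothetical helper: extract the session details from a response body."""
    payload = json.loads(raw)
    data = payload.get("data")
    if not isinstance(data, dict):
        raise ValueError(f"no session data in response: {payload.get('message')}")
    return SessionInfo(
        session_id=data["session_id"],
        url=data["url"],
        access_token=data["access_token"],
        session_duration_limit=data["session_duration_limit"],
        is_paid=data["is_paid"],
        realtime_endpoint=data.get("realtime_endpoint"),
        livekit_agent_token=data.get("livekit_agent_token"),
    )


# Placeholder response for illustration only.
sample_response = json.dumps({
    "code": 100,  # placeholder; the success value is not specified here
    "message": "success",
    "data": {
        "session_id": "SESSION_ID_PLACEHOLDER",
        "url": "wss://heygen-feapbkvq.livekit.cloud",
        "access_token": "ACCESS_TOKEN_PLACEHOLDER",
        "session_duration_limit": 600,
        "is_paid": True,
    },
})

session = parse_new_session_response(sample_response)
print(session.session_id, session.url)
```

The `url` and `access_token` values are what a LiveKit client would then use to join the room.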
