
Urgent: Default Language Setting & SDK vs Streaming API for Interactive Avatar with Custom LLM Integration

Hello HeyGen Team,

We are currently integrating your interactive avatar feature into our application, driven by a custom LLM on the backend, to support a deeply customized conversational workflow.

We have a few urgent clarifications regarding this integration:

Default Language Setting:
Is there a way to set a default language for the avatar (e.g., English, Hindi, etc.) programmatically during initialization, or should this be managed on the speech input/output level via the Azure Speech SDK?
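For context, here is a sketch of what we would like to do at session creation. The payload field names below (`avatar_name`, `quality`, and especially `language`) are our assumptions about what the streaming session endpoint might accept, not confirmed parameters; please correct us if the language must instead be handled entirely in the speech layer.

```python
# Hypothetical sketch -- field names are our assumptions about the
# session-creation payload and need confirmation from the HeyGen team.

def build_session_payload(avatar_name: str, language: str = "en") -> dict:
    """Build the JSON body we imagine POSTing to create a streaming session."""
    return {
        "avatar_name": avatar_name,
        "quality": "high",
        # Is a top-level default-language field like this supported, or must
        # language be managed in the Azure Speech STT/TTS layer instead?
        "language": language,
    }

payload = build_session_payload("my_avatar", language="hi")
```

If a field like this exists, we would set it once at initialization rather than switching languages per utterance.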

Speech SDK vs Streaming API:

For sending and receiving audio, would you recommend using the Azure Speech SDK, or is it better to work with your Streaming API?

Are there specific advantages or limitations to each, especially for low-latency, real-time audio?
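To make the question concrete: on the backend we plan to forward each reply from our custom LLM to the avatar over your Streaming API. The sketch below builds (but does not send) such a request; the endpoint path, `task_type` value, and header name are our assumptions from the docs we have seen and need confirmation.

```python
import json
import urllib.request

# Assumed base URL and endpoint -- please confirm.
HEYGEN_API = "https://api.heygen.com/v1"

def speak_task_request(session_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) the request that would forward our custom
    LLM's reply to an active avatar session. task_type="repeat" is our
    assumption for 'speak this text verbatim'."""
    body = json.dumps({
        "session_id": session_id,
        "text": text,
        "task_type": "repeat",
    }).encode("utf-8")
    return urllib.request.Request(
        f"{HEYGEN_API}/streaming.task",
        data=body,
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )
```

Our main concern is whether this request/response loop adds meaningful latency compared to streaming audio through the Azure Speech SDK directly.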

SDK/API Usage Restrictions:

Are there any limitations or restrictions on using HeyGen's SDK or API in production apps (e.g., mobile support, commercial use, concurrent sessions)?

Are there cases where you would recommend not using the SDK and instead opting for the API or another approach?

Our goal is to ensure a seamless, real-time avatar interaction experience tightly integrated with our backend logic. Please advise on the best architectural path forward, especially considering language control and voice stream handling.

We are finalizing our architecture this week, so a prompt reply would be greatly appreciated.

Thank you!