Using my own audio track for a streaming avatar

Hi,

I'm currently using an interactive avatar integrated with third-party STT and LLM services, so I only use the POST /v1/streaming.task endpoint to make the avatar speak the pre-generated text.
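
For context, this is roughly how I send text to the avatar today (a simplified TypeScript sketch; session creation is omitted, and the exact request fields such as `task_type` reflect my own understanding rather than anything official):

```typescript
// Simplified sketch of how I send pre-generated text to the avatar.
// Session creation / WebRTC setup is omitted; field names reflect my current
// understanding of the streaming API and may not be exact.
const HEYGEN_API_KEY = process.env.HEYGEN_API_KEY!;

async function sendTextToAvatar(sessionId: string, text: string): Promise<void> {
  const res = await fetch("https://api.heygen.com/v1/streaming.task", {
    method: "POST",
    headers: {
      "x-api-key": HEYGEN_API_KEY,
      "content-type": "application/json",
    },
    body: JSON.stringify({
      session_id: sessionId,
      text,                // text already produced by my own STT + LLM pipeline
      task_type: "repeat", // the avatar should just speak the text verbatim
    }),
  });
  if (!res.ok) {
    throw new Error(`streaming.task failed: ${res.status} ${await res.text()}`);
  }
}
```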

However, I've encountered issues with the speech quality produced by HeyGen for certain non-English languages.

As a result, I'm considering integrating a third-party TTS service (ElevenLabs) into my project to produce more natural-sounding voices.
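
For example, this is roughly how I would pre-generate the audio clips on the ElevenLabs side (just a sketch; the voice ID and model name are placeholders):

```typescript
// Sketch of pre-generating speech audio with ElevenLabs TTS.
// VOICE_ID is a placeholder for whichever voice I end up using.
import { writeFile } from "node:fs/promises";

const ELEVENLABS_API_KEY = process.env.ELEVENLABS_API_KEY!;
const VOICE_ID = "your-voice-id"; // placeholder

async function synthesizeSpeech(text: string, outPath: string): Promise<void> {
  const res = await fetch(
    `https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`,
    {
      method: "POST",
      headers: {
        "xi-api-key": ELEVENLABS_API_KEY,
        "content-type": "application/json",
      },
      body: JSON.stringify({
        text,
        model_id: "eleven_multilingual_v2", // multilingual model for non-English speech
      }),
    }
  );
  if (!res.ok) {
    throw new Error(`TTS request failed: ${res.status} ${await res.text()}`);
  }
  // The response body is the encoded audio clip (MP3 by default).
  const audio = Buffer.from(await res.arrayBuffer());
  await writeFile(outPath, audio);
}
```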

My question is: is there any solution that would allow me to create a lip-synced interactive avatar that plays back my own pre-generated speech audio? In other words, I'm looking for a way to drive the avatar's lip movements with custom audio clips generated from my text.

Thank you!