Inquiry about compatibility, costs, and options for Whisper models in the HeyGen SDK

Hello, I am working on a custom project using the Interactive Avatar SDK with Vite, based on the "Create a Vite Project with Streaming SDK" demo. I am currently using the audio-handler and adding the model like this:
formData.append("model", "whisper-1");
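
For context, the request around that line can be sketched roughly as follows. This is only an illustration, assuming the audio-handler posts the recorded audio to OpenAI's transcription endpoint; the function names `buildTranscriptionForm` and `transcribe`, the `apiKey` parameter, and the file name `audio.webm` are hypothetical and not taken from the demo.

```typescript
// Hypothetical helper: builds the multipart body for a transcription request.
// The "model" field is the value this question is about.
function buildTranscriptionForm(audio: Blob, model: string = "whisper-1"): FormData {
  const formData = new FormData();
  formData.append("file", audio, "audio.webm"); // file name is an assumption
  formData.append("model", model);
  return formData;
}

// Hypothetical helper: sends the form to OpenAI's transcription endpoint
// and returns the recognized text. apiKey is supplied by the caller.
async function transcribe(audio: Blob, apiKey: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio),
  });
  if (!response.ok) {
    throw new Error(`Transcription request failed with status ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Keeping the model name as a parameter means switching models, if another one turns out to be supported, would only touch the call site.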

I have a few questions and would appreciate your help in answering them:

Is it possible to use more recent versions of the model, such as "whisper-2" or "whisper-3", instead of "whisper-1"?

If compatible, what are the key differences between "whisper-1", "whisper-2", and "whisper-3" in terms of capacity, performance, and features?

Are these models paid or free to use in the project? Is there a credit system or usage fees associated with them?

If these versions cannot be used or there are restrictions, what alternative options do you recommend for efficiently integrating audio with the avatar?

Do the models "whisper-2" or "whisper-3" require additional configuration or code adjustments to integrate them correctly with the SDK?

Is it possible to use multiple Whisper versions simultaneously in a single project, or should only one specific version be chosen?

Are there examples or official documentation on how to integrate newer versions of Whisper into the HeyGen SDK?

I appreciate your time and assistance.