Streaming Avatar API Interrupt Not Operational
We are using the HeyGen V1 Streaming Avatar API (endpoints like /v1/streaming.new, /v1/streaming.start, /v1/streaming.task) via direct HTTP calls to power an interactive avatar. We are encountering an issue specifically with interrupting the avatar's speech.
Our Implementation for Interrupting Speech:
A session is established using /v1/streaming.create_token, /v1/streaming.new, and /v1/streaming.start. This part works fine, and the avatar starts.
While the avatar is speaking and we need it to stop and say something new (e.g., in response to a user's barge-in), our client-side JavaScript makes a POST request to:
https://api.heygen.com/v1/streaming.task
The JSON payload for this "interrupting" request is structured as follows (session_id is the active session ID obtained from /v1/streaming.new):
{
  "session_id": "YOUR_ACTIVE_V1_SESSION_ID",
  "text": "This is the new text the avatar should say immediately.",
  "interrupt": true
}
Our application logs confirm that this request is sent successfully to your API, and we typically receive a success response (e.g., HTTP 200, with {"code": 100, "message": "SUCCESS"}).
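The request described above can be sketched roughly as follows. This is an illustration under stated assumptions, not our exact code: the Bearer authorization header, the `token` argument, and the helper names `buildInterruptTask` and `sendInterruptTask` are hypothetical, while the URL, the payload fields, and the success response shape are taken from this post.

```javascript
// Endpoint as given in the post.
const TASK_URL = "https://api.heygen.com/v1/streaming.task";

// Build the "interrupting" payload with the three fields described above.
function buildInterruptTask(sessionId, text) {
  if (!sessionId) throw new Error("sessionId is required");
  return { session_id: sessionId, text: text, interrupt: true };
}

// Send the task and report whether the API acknowledged it.
// ASSUMPTION: the session token is passed as a Bearer token; adjust the
// headers to whatever authentication your integration actually uses.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function sendInterruptTask(token, sessionId, text, fetchImpl = fetch) {
  const sentAt = Date.now(); // timestamp for comparing against when speech actually stops
  const response = await fetchImpl(TASK_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // assumption, see note above
    },
    body: JSON.stringify(buildInterruptTask(sessionId, text)),
  });
  const body = await response.json();
  return {
    // HTTP 200 with code 100 is the success response reported in this post.
    accepted: response.status === 200 && body.code === 100,
    sentAt,
    body,
  };
}
```

Recording `sentAt` makes it possible to compare the moment the request was sent with the moment the avatar audibly stops, which helps quantify the delay described below.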
The Observed Behavior (The Problem):
Despite sending the new task with interrupt: true, the avatar does not immediately stop its current speech. Instead, it completes its entire ongoing sentence or utterance before it begins to speak the new text sent in the "interrupting" task.
For example, if the avatar is saying, "Hello, welcome to our service, we offer many features including A, B, and C," and the user interrupts after "welcome to our service," our system sends a new task like {"text": "How can I help you?", "interrupt": true}. The avatar, however, finishes saying "...we offer many features including A, B, and C" before starting "How can I help you?".
Our Questions:
Could you please help us understand why setting interrupt: true in the /v1/streaming.task payload is not resulting in an immediate cessation of the avatar's current speech?
Is there a specific nuance to how interrupt: true functions within the V1 streaming.task API that we might be misunderstanding?
Is there a minimum processing time or buffering that would cause the current utterance to complete regardless of the interrupt: true flag?
Are there other parameters or a different V1 endpoint/method we should be using in conjunction with or instead of interrupt: true in the streaming.task to achieve an immediate, mid-utterance cut-off?
Our goal is to create a fluid conversational experience where the avatar can be promptly interrupted. Any detailed explanation or guidance on the correct usage of the V1 API to achieve this would be immensely helpful.
Thank you for your assistance.