Streaming Avatar: strategies to optimize latency

Hi,

I'm only using the speak task and was wondering how I can optimize latency. My flow is:

  • Send a block of text (approx. 150 words)
  • Pause, waiting for the user to react
  • Send another block of text

If I send a large text block, the latency seems higher than if I send a small one.
So I cut the text into phrases (splitting on punctuation).

I have not measured a huge difference in latency: it looks better, but not by much.
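
To make the approach concrete, the splitting described above can be sketched roughly as follows. This is an illustration in Python, not my exact code. `speak` is a hypothetical stand-in for whatever call sends one speak task (it is not an SDK function), and the timing only covers how long each call takes to return, which may differ from the delay before the avatar actually starts talking.

```python
import re
import time
from typing import Callable, List


def split_into_phrases(text: str) -> List[str]:
    """Split text into phrases, cutting after . ! ? ; or : when followed by whitespace."""
    # The lookbehind keeps the punctuation attached to the phrase it ends.
    parts = re.split(r"(?<=[.!?;:])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def send_in_chunks(text: str, speak: Callable[[str], None]) -> List[float]:
    """Send each phrase in order and return how long each call took, in seconds.

    `speak` is a placeholder (assumption): pass in your own function that
    submits one speak task for the given text.
    """
    timings: List[float] = []
    for phrase in split_into_phrases(text):
        start = time.perf_counter()
        speak(phrase)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `send_in_chunks(block, my_speak)` and then `my_speak(block)` on the same text, each wrapped in the same timer, gives a rough side-by-side comparison of the two strategies.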

My questions are:

  • Do I lose quality doing this? (The voice inference is then done on only part of the context, not the full text.)
  • Have you run such tests, and do you see a speed improvement?

Cheers