[Streaming API] Better way to design the response mechanism.

When building an avatar UI with the Streaming API, I found there are two approaches to designing the response mechanism:

  1. Using a 'Knowledge Base' when creating the streaming session:
    This approach builds the whole response mechanism on the built-in LLM at session-creation time. However, it doesn't let us dynamically change the prompt (knowledge base) to include the user's question and answer, nor retrieve the text generated on the HeyGen server.
  2. Using another LLM solution:
    The other approach is to use an external LLM such as OpenAI. The prompt is built separately, the question is sent to the LLM, and the generated response is then passed to HeyGen to drive the avatar video (see the sketch after this list). However, this involves multiple requests and therefore higher overall latency.
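For reference, here is a minimal sketch of the second approach in TypeScript. The endpoint paths, field names, and environment variable names are assumptions based on my reading of the public OpenAI and HeyGen streaming docs, so please verify them against the current API references before relying on this:

```typescript
// Rough sketch of approach 2: external LLM generates the answer text,
// then an existing HeyGen streaming session speaks it.
// Endpoint/field names are assumptions -- check the official docs.

const HEYGEN_API_KEY = process.env.HEYGEN_API_KEY!; // hypothetical env vars
const OPENAI_API_KEY = process.env.OPENAI_API_KEY!;

// 1. Ask the external LLM (OpenAI chat completions) for the answer text.
async function askLLM(question: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [
        { role: "system", content: "You are a helpful avatar assistant." },
        { role: "user", content: question },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

// 2. Send the generated text to an already-created HeyGen streaming session.
//    `sessionId` is assumed to come from streaming.new / streaming.start.
async function speak(sessionId: string, text: string): Promise<void> {
  await fetch("https://api.heygen.com/v1/streaming.task", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Api-Key": HEYGEN_API_KEY,
    },
    body: JSON.stringify({ session_id: sessionId, text }),
  });
}

// Wiring the two together: the extra round trip to the LLM is where
// the additional latency in approach 2 comes from.
async function handleUserQuestion(sessionId: string, question: string) {
  const answer = await askLLM(question);
  await speak(sessionId, answer);
}
```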

What's your choice? Or is there something I missed when using the API?