Custom user text processing
5 months ago by null
Looking at other discussions, this seems to be a general situation.
The voice recognition module cannot be used with anything other than the provided LLM solution (one could describe the problem as "stream voice processing and events are not connected enough").
What would be perfect (pseudocode):
avatar.current?.on(StreamingEvents.USER_TALKING_MESSAGE, async (message) => {
  console.info('---> USER_TALKING_MESSAGE:', message);
  // custom LLM call, other stuff
  const textToSpeak = await handleUserMessage(message);
  // have the avatar repeat the custom reply verbatim
  await avatar.current?.speak({ text: textToSpeak, taskType: TaskType.REPEAT, taskMode: TaskMode.SYNC });
});
And, for example, when creating the avatar:
const res = await avatar.current.createStartAvatar({
  quality: AvatarQuality.Low,
  avatarName: avatarId,
  customLLM: true, // the option I am asking for
  knowledgeId: knowledgeId,
  voice: {
    rate: 1.5, // 0.5 ~ 1.5
    emotion: VoiceEmotion.EXCITED,
  },
  language: language,
});
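To make the request more concrete, here is a minimal, self-contained sketch of the flow I have in mind. It does not use the real SDK: `AvatarLike`, `SpeakRequest`, `FakeAvatar`, `wireCustomLLM` and the string constant `USER_TALKING_MESSAGE` are hypothetical stand-ins modeled on the names in the pseudocode above, and it assumes the event delivers a complete utterance as a plain string, which may not match the actual payload. The only addition is a promise chain so that overlapping utterances are answered in the order they arrived.

```typescript
// Hypothetical minimal shapes, modeled on the names used in the pseudocode.
// They stand in for the real SDK types so the sketch can run on its own.
type Listener = (message: string) => void | Promise<void>;

interface SpeakRequest {
  text: string;
  taskType: "repeat"; // stand-in for TaskType.REPEAT
  taskMode: "sync"; // stand-in for TaskMode.SYNC
}

interface AvatarLike {
  on(event: string, listener: Listener): void;
  speak(request: SpeakRequest): Promise<void>;
}

// Stand-in for StreamingEvents.USER_TALKING_MESSAGE (assumed value).
const USER_TALKING_MESSAGE = "user_talking_message";

// Routes every recognized user utterance through a custom handler
// (for example, your own LLM call) and makes the avatar repeat the reply.
function wireCustomLLM(
  avatar: AvatarLike,
  handleUserMessage: (text: string) => Promise<string>,
): void {
  // Each turn is chained onto the previous one, so replies keep arrival order
  // even if a later LLM call would finish sooner.
  let queue: Promise<void> = Promise.resolve();

  avatar.on(USER_TALKING_MESSAGE, (message) => {
    queue = queue
      .then(async () => {
        const reply = await handleUserMessage(message);
        await avatar.speak({ text: reply, taskType: "repeat", taskMode: "sync" });
      })
      .catch((err) => console.error("custom LLM turn failed:", err));
    return queue;
  });
}

// In-memory stand-in for an avatar session, useful for trying the wiring
// without a real streaming connection.
class FakeAvatar implements AvatarLike {
  spoken: string[] = [];
  private listeners = new Map<string, Listener[]>();

  on(event: string, listener: Listener): void {
    const existing = this.listeners.get(event) ?? [];
    this.listeners.set(event, [...existing, listener]);
  }

  // Simulates the SDK firing an event; resolves once all listeners finish.
  async emit(event: string, message: string): Promise<void> {
    const registered = this.listeners.get(event) ?? [];
    await Promise.all(registered.map((listener) => listener(message)));
  }

  async speak(request: SpeakRequest): Promise<void> {
    this.spoken.push(request.text);
  }
}
```

With something like this, calling `wireCustomLLM(avatar, handleUserMessage)` once after the session starts would be all the client code needs, provided the built-in LLM reply can be switched off.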
For now, something similar can be achieved by having the LLM on the HeyGen side always answer something like "let me think for a while", but that looks like unnecessary overhead.
Do you plan to add something like this, or should we always perform STT separately? (That also looks like unnecessary overhead, given that STT is already available here out of the box.)