Integrating OpenAI Assistant with Streaming SDK

The OpenAI Assistants API enables you to create AI-powered assistants directly within your applications. An assistant can follow specific instructions and use models, tools, and files to respond to user queries effectively.

In this example, we will create a basic assistant to generate responses and integrate it with an avatar. To achieve this, we will build on the Vite + SDK project created in the previous step.

Step 1: Install OpenAI Package

Add the OpenAI library to your project:

npm install openai

Step 2: Create openai-assistant.ts

This file contains a class to interact with OpenAI's API. Place it in the src folder.

import OpenAI from "openai";

export class OpenAIAssistant {
  private client: OpenAI;
  private assistant: any;
  private thread: any;

  constructor(apiKey: string) {
    this.client = new OpenAI({ apiKey, dangerouslyAllowBrowser: true });
  }

  async initialize(
    instructions: string = `You are an English tutor. Help students improve their language skills by:
    - Correcting mistakes in grammar and vocabulary
    - Explaining concepts with examples
    - Engaging in conversation practice
    - Providing learning suggestions
    Be friendly, adapt to student's level, and always give concise answers.`
  ) {
    // Create an assistant
    this.assistant = await this.client.beta.assistants.create({
      name: "English Tutor Assistant",
      instructions,
      tools: [],
      model: "gpt-4-turbo-preview",
    });

    // Create a thread
    this.thread = await this.client.beta.threads.create();
  }

  async getResponse(userMessage: string): Promise<string> {
    if (!this.assistant || !this.thread) {
      throw new Error("Assistant not initialized. Call initialize() first.");
    }

    // Add user message to thread
    await this.client.beta.threads.messages.create(this.thread.id, {
      role: "user",
      content: userMessage,
    });

    // Create and run the assistant
    const run = await this.client.beta.threads.runs.createAndPoll(
      this.thread.id,
      { assistant_id: this.assistant.id }
    );

    if (run.status === "completed") {
      // Get the assistant's response
      const messages = await this.client.beta.threads.messages.list(
        this.thread.id
      );

      // messages.list returns newest first, so the first assistant
      // message in the list is the latest response
      const lastMessage = messages.data.filter(
        (msg) => msg.role === "assistant"
      )[0];

      if (lastMessage && lastMessage.content[0].type === "text") {
        return lastMessage.content[0].text.value;
      }
    }

    return "Sorry, I couldn't process your request.";
  }
}
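
On its own, the class can be exercised like this (a minimal sketch; the sample question and logging are illustrative, and the key is read the same way as in Step 4):

import { OpenAIAssistant } from "./openai-assistant";

const assistant = new OpenAIAssistant(import.meta.env.VITE_OPENAI_API_KEY);
await assistant.initialize();

// Ask a question and log the tutor's reply (top-level await works in Vite modules)
const reply = await assistant.getResponse(
  "What is the difference between 'affect' and 'effect'?"
);
console.log(reply);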

The dangerouslyAllowBrowser: true option is required to run the OpenAI client in the browser at all; without it the SDK refuses to start client-side, since the API key is visible to anyone who inspects the page.

Best Practice: For security, perform these calls on your backend instead of exposing the API key in the browser. This implementation is kept simple for demonstration.
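
If you later move these calls server-side, a minimal proxy could look like the sketch below. It assumes a Node backend with Express, an OPENAI_API_KEY environment variable, and a pre-created assistant whose ID is stored in ASSISTANT_ID; the /api/chat route and request/response shapes are hypothetical:

import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());

// The API key never leaves the server; the browser only calls /api/chat.
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post("/api/chat", async (req, res) => {
  try {
    const { threadId, message } = req.body;
    // Reuse the caller's thread, or start a new one on the first request
    const thread = threadId
      ? { id: threadId }
      : await client.beta.threads.create();

    await client.beta.threads.messages.create(thread.id, {
      role: "user",
      content: message,
    });

    const run = await client.beta.threads.runs.createAndPoll(thread.id, {
      assistant_id: process.env.ASSISTANT_ID!,
    });

    if (run.status !== "completed") {
      res.status(502).json({ error: `Run ended with status ${run.status}` });
      return;
    }

    // Same retrieval logic as getResponse above, just on the server
    const messages = await client.beta.threads.messages.list(thread.id);
    const last = messages.data.find((msg) => msg.role === "assistant");
    const text =
      last && last.content[0].type === "text" ? last.content[0].text.value : "";
    res.json({ threadId: thread.id, reply: text });
  } catch (error) {
    res.status(500).json({ error: String(error) });
  }
});

app.listen(3000);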

Step 3: Update main.ts

  1. Import and Declare the Assistant:

import { OpenAIAssistant } from "./openai-assistant";

let openaiAssistant: OpenAIAssistant | null = null;

  2. Update initializeAvatarSession:

Add the OpenAI assistant initialization:

// Initialize streaming avatar session
async function initializeAvatarSession() {
  // Disable start button immediately to prevent double clicks
  startButton.disabled = true;

  try {
    const token = await fetchAccessToken();
    avatar = new StreamingAvatar({ token });

    // Initialize OpenAI Assistant
    const openaiApiKey = import.meta.env.VITE_OPENAI_API_KEY;
    openaiAssistant = new OpenAIAssistant(openaiApiKey);
    await openaiAssistant.initialize();

    sessionData = await avatar.createStartAvatar({
      quality: AvatarQuality.Medium,
      avatarName: "Wayne_20240711",
      language: "English",
    });

    console.log("Session data:", sessionData);

    // Enable end button
    endButton.disabled = false;

    avatar.on(StreamingEvents.STREAM_READY, handleStreamReady);
    avatar.on(StreamingEvents.STREAM_DISCONNECTED, handleStreamDisconnected);
  } catch (error) {
    console.error("Failed to initialize avatar session:", error);
    // Re-enable start button if initialization fails
    startButton.disabled = false;
  }
}
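
For reference, the fetchAccessToken helper comes from the previous step; a typical version looks roughly like this (the endpoint and environment variable name are assumptions based on HeyGen's docs):

// Sketch of the token helper from the previous step
async function fetchAccessToken(): Promise<string> {
  const apiKey = import.meta.env.VITE_HEYGEN_API_KEY;
  const response = await fetch(
    "https://api.heygen.com/v1/streaming.create_token",
    { method: "POST", headers: { "x-api-key": apiKey } }
  );
  const { data } = await response.json();
  return data.token; // short-lived session token for StreamingAvatar
}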

  3. Update handleSpeak:

This function passes the user's input to OpenAI, retrieves the response, and instructs the avatar to speak it aloud. TaskType.REPEAT makes the avatar repeat the supplied text verbatim rather than generating its own reply; if TaskType is not yet imported in your project, see the import sketch after the function.

// Handle speaking event
async function handleSpeak() {
  if (avatar && openaiAssistant && userInput.value) {
    try {
      const response = await openaiAssistant.getResponse(userInput.value);
      await avatar.speak({
        text: response,
        taskType: TaskType.REPEAT,
      });
    } catch (error) {
      console.error("Error getting response:", error);
    }
    userInput.value = ""; // Clear input after speaking
  }
}
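
If TaskType is not already imported in main.ts, extend the SDK import from the previous step (a sketch; your existing import line may already include some of these names):

import StreamingAvatar, {
  AvatarQuality,
  StreamingEvents,
  TaskType,
} from "@heygen/streaming-avatar";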

Step 4: Update Environment Variables

Add your OpenAI API key to the .env file. Vite exposes only variables prefixed with VITE_ to client code, which is why main.ts reads the key via import.meta.env.VITE_OPENAI_API_KEY:

VITE_OPENAI_API_KEY=your-key

How It Works

  1. User Input: The user enters a query.
  2. OpenAI Interaction: The query is sent to OpenAI's Assistant via OpenAIAssistant.getResponse.
  3. Avatar Response: The response from OpenAI is passed to the HeyGen avatar via avatar.speak.

Example Workflow

  1. User enters a question: "What is the difference between affect and effect?"
  2. OpenAI processes and responds.
  3. The avatar speaks the response aloud.

Conclusion

With these updates, your HeyGen avatar can use the OpenAI Assistants API for interactive, intelligent conversations. This example prioritizes simplicity; before deploying to production, move the OpenAI calls behind a backend such as the proxy sketched above so the API key never reaches the browser.