We’re excited to launch the Video Agent endpoint (/v1/video_agent/generate), a powerful "one-shot" tool that creates high-quality avatar videos from simple, natural language prompts.

Instead of manually configuring every scene, script, and asset, you can now provide a description of what you want. The Video Agent handles the heavy lifting—from scriptwriting to visual assembly—allowing you to scale video production with minimal overhead.

What’s New:

  • Prompt-to-Video: Generate a complete video from a single text description.
  • Reduced Complexity: No need to orchestrate complex JSON structures for scenes or assets.
  • Rapid Automation: Ideal for personalized messaging, social content, and internal comms at scale.

📘 Ready to test it out?

Head over to the Video Agent API Reference to try a request directly from your browser.
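As a quick illustration, here is a minimal Python sketch of calling the endpoint. The prompt field name and the X-Api-Key header are assumptions, so check the Video Agent API Reference above for the exact request schema.

```python
import requests

API_KEY = "your_heygen_api_key"

# Hypothetical request body: the "prompt" field name is an assumption;
# see the Video Agent API Reference for the exact schema.
resp = requests.post(
    "https://api.heygen.com/v1/video_agent/generate",
    headers={"X-Api-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "prompt": "A 30-second welcome video where a friendly avatar "
                  "introduces our new onboarding portal to new hires."
    },
)
resp.raise_for_status()
print(resp.json())  # typically returns an ID you can use to retrieve the finished video
```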


We now support voice configuration for the ElevenLabs V3 voice model in the Create Avatar Video API.

This is in addition to the existing supported models: eleven_monolingual_v1, eleven_multilingual_v1, eleven_multilingual_v2, eleven_turbo_v2, and eleven_turbo_v2_5 for video generation.

To use the ElevenLabs V3 model, set:

  • voice.elevenlabs_settings.model (string, optional): set to eleven_v3
  • voice.elevenlabs_settings.stability (float, optional): defaults to 1.0; allowed values are 0, 0.5, and 1.0
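For illustration, a minimal sketch of a Create Avatar Video request using the V3 model. Only the elevenlabs_settings fields relate to this update; the surrounding payload (avatar_id, voice_id, input_text, dimensions) is illustrative, so consult the Create Avatar Video reference for the exact schema.

```python
import requests

API_KEY = "your_heygen_api_key"

# Sketch of a Create Avatar Video request using the ElevenLabs V3 model.
# Only voice.elevenlabs_settings is the subject of this update; the rest of
# the payload is illustrative.
payload = {
    "video_inputs": [
        {
            "character": {"type": "avatar", "avatar_id": "your_avatar_id"},
            "voice": {
                "type": "text",
                "input_text": "Hello! This voice uses the ElevenLabs V3 model.",
                "voice_id": "your_voice_id",
                "elevenlabs_settings": {
                    "model": "eleven_v3",
                    "stability": 0.5,  # must be one of 0, 0.5, or 1.0
                },
            },
        }
    ],
    "dimension": {"width": 1280, "height": 720},
}

resp = requests.post(
    "https://api.heygen.com/v2/video/generate",
    headers={"X-Api-Key": API_KEY, "Content-Type": "application/json"},
    json=payload,
)
print(resp.json())
```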

We’ve introduced a new Video Translation mode, quality, designed to deliver significantly more natural lip-sync and context-aware translations. This new mode is optional and available via the /v2/video_translate endpoint using the new mode parameter.

Available Modes

Fast Mode (default)

  • mode: fast
  • Cost: 3 credits per minute
  • Optimized for speed and standard translations, especially effective for videos with limited facial movement.

Quality Mode (new)

  • mode: quality
  • Cost: 6 credits per minute
  • Produces highly natural, context-aware lip-sync for premium translation results.

You may now specify the desired mode when calling the /v2/video_translate endpoint. If no mode is provided, it will default to fast.
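A minimal sketch of requesting the new mode in Python; the video_url and output_language field names are assumptions, so refer to the reference below for the full request schema.

```python
import requests

API_KEY = "your_heygen_api_key"

# Request Quality Mode explicitly; omitting "mode" falls back to "fast".
# "video_url" and "output_language" are illustrative field names; see the
# video_translate reference for the exact schema.
resp = requests.post(
    "https://api.heygen.com/v2/video_translate",
    headers={"X-Api-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "video_url": "https://example.com/source.mp4",
        "output_language": "Spanish",
        "mode": "quality",  # 6 credits/min; "fast" (default) is 3 credits/min
    },
)
print(resp.json())
```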


Reference: https://docs.heygen.com/reference/video-translate#/

We’ve added a new endpoint to retrieve dynamic and custom variables defined in your AI Studio templates.

Endpoint: /v3/template/<template_id>

This endpoint allows developers to programmatically access all variable definitions that can be dynamically populated during video generation, including avatar, text, and media elements (e.g., image, video). It enables full integration of AI-powered avatar video creation workflows using your own template schema.
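As a sketch, retrieving and listing a template's variables might look like this in Python. The response traversal (data.variables) is an assumption; see the reference below for the exact response shape.

```python
import requests

API_KEY = "your_heygen_api_key"
TEMPLATE_ID = "your_template_id"

# Fetch the template definition, including its dynamic and custom variables.
resp = requests.get(
    f"https://api.heygen.com/v3/template/{TEMPLATE_ID}",
    headers={"X-Api-Key": API_KEY},
)
resp.raise_for_status()

template = resp.json()
# The "data.variables" shape is an assumption for illustration; consult the
# get-template-v3 reference for the documented response structure.
for name, definition in template.get("data", {}).get("variables", {}).items():
    print(name, definition.get("type"))
```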

Reference: https://docs.heygen.com/reference/get-template-v3#/

Endpoint: /v1/streaming.new
Reference: https://docs.heygen.com/reference/new-session

A new parameter, activity_idle_timeout, has been added to the streaming.new endpoint.
It specifies the maximum idle time (in seconds) after the last user interaction before the session is marked as inactive. Default: 120 seconds; range: 30 to 3600 seconds. This gives you greater control over session lifecycle management and resource usage.
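A minimal sketch of setting the new parameter when creating a session; any other fields you normally pass to streaming.new are unchanged, and the new-session reference above documents the full request body.

```python
import requests

API_KEY = "your_heygen_api_key"

# Create a new streaming session that is marked inactive after 5 minutes
# without user interaction (default: 120 seconds; allowed range: 30-3600).
resp = requests.post(
    "https://api.heygen.com/v1/streaming.new",
    headers={"X-Api-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "activity_idle_timeout": 300,
    },
)
print(resp.json())
```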