What's New


1. Video Agent API — Price Reduced to 2 Credits per Minute

We heard your feedback: the previous pricing of 6 credits per minute for the Video Agent API was too steep. Effective immediately, Video Agent API usage is now billed at 2 credits per minute — a 3× reduction in cost.

Video Agent API Reference · Video Agent Developer Guide


2. Starfish Text-to-Speech Model — Now Available via API

HeyGen's in-house TTS model, Starfish, is now accessible directly through the API. Previously available only in AI Studio on the web, it can now be used to programmatically generate natural-sounding speech from text.

How it works:

  1. List compatible voices: GET /v1/audio/voices retrieves the available voice IDs.
  2. Generate speech: POST /v1/audio/text_to_speech takes your text and a voice_id and returns generated audio.
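
For illustration, here is a minimal Python sketch of the two-step flow. The X-Api-Key header matches HeyGen's usual API authentication, but the request and response shapes shown are assumptions; confirm them in the references below:

```python
import requests

API_KEY = "YOUR_HEYGEN_API_KEY"
BASE_URL = "https://api.heygen.com"
headers = {"X-Api-Key": API_KEY}

# Step 1: list compatible voices and pick a voice_id.
voices = requests.get(f"{BASE_URL}/v1/audio/voices", headers=headers)
voices.raise_for_status()
voice_id = voices.json()["data"]["voices"][0]["voice_id"]  # response shape is an assumption

# Step 2: send text plus the chosen voice_id to generate speech.
speech = requests.post(
    f"{BASE_URL}/v1/audio/text_to_speech",
    headers=headers,
    json={"text": "Hello from Starfish!", "voice_id": voice_id},  # field names are assumptions
)
speech.raise_for_status()
print(speech.json())
```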

Starfish Developer Guide · Text-to-Speech API Reference · List Compatible Voices


3. Video Translation API — Now Open to All Tiers

The Video Translation API, including both Fast and Quality modes, is now available across all API plan tiers — Free, Pro, Scale, and Enterprise.

| Mode | Description | Cost |
|------|-------------|------|
| Fast (default) | Optimized for speed; ideal for videos with limited facial movement | 3 credits / min |
| Quality | Natural lip-sync with context-aware translations for premium results | 6 credits / min |

Video Translate Developer Guide · Translate Video API Reference · Supported Languages


Happy creating with HeyGen! 🎬

We’re excited to announce that Avatar IV is now supported for Talking Photos in the Create Avatar Video API. Draft v4 for /v2/video/generate has been rolled out to 100% of users.

What’s new:

Avatar IV for Talking Photos: You can now enable the Avatar IV motion engine when generating videos with Talking Photos by using use_avatar_iv_model in /v2/video/generate.

Improved motion quality: Avatar IV provides more expressive facial motion and more natural head movement than the previous Unlimited motion engine.

Default resolution documented as 1080p: The default resolution shown in the documentation is now 1080p (previously 720p), reflecting the current supported behavior for this endpoint.
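
For illustration, here is a hedged Python sketch of such a request. The talking-photo payload follows the familiar /v2/video/generate shape, but the exact placement of use_avatar_iv_model (shown top-level here) and the placeholder IDs are assumptions; the API reference below is authoritative:

```python
import requests

API_KEY = "YOUR_HEYGEN_API_KEY"

payload = {
    # Enable the Avatar IV motion engine. The placement of this flag
    # in the payload is an assumption; check the API reference.
    "use_avatar_iv_model": True,
    "video_inputs": [
        {
            "character": {
                "type": "talking_photo",
                "talking_photo_id": "<your_talking_photo_id>",  # hypothetical placeholder
            },
            "voice": {
                "type": "text",
                "input_text": "Hello from Avatar IV!",
                "voice_id": "<your_voice_id>",  # hypothetical placeholder
            },
        }
    ],
    "dimension": {"width": 1920, "height": 1080},  # 1080p, per this release
}

resp = requests.post(
    "https://api.heygen.com/v2/video/generate",
    headers={"X-Api-Key": API_KEY},
    json=payload,
)
resp.raise_for_status()
print(resp.json())
```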


📘 Head over to the Create Avatar Video API reference to see the updated parameters and examples.

We’re excited to launch the Video Agent endpoint (/v1/video_agent/generate), a powerful "one-shot" tool that creates high-quality avatar videos from simple, natural language prompts.

Instead of manually configuring every scene, script, and asset, you can now provide a description of what you want. The Video Agent handles the heavy lifting—from scriptwriting to visual assembly—allowing you to scale video production with minimal overhead.

What’s New:

  • Prompt-to-Video: Generate a complete video from a single text description.
  • Reduced Complexity: No need to orchestrate complex JSON structures for scenes or assets.
  • Rapid Automation: Ideal for personalized messaging, social content, and internal comms at scale.
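
As a sketch of what a one-shot call might look like in Python, assuming the endpoint accepts a single natural-language prompt field (the request schema here is an assumption; the API Reference below has the actual parameters):

```python
import requests

resp = requests.post(
    "https://api.heygen.com/v1/video_agent/generate",
    headers={"X-Api-Key": "YOUR_HEYGEN_API_KEY"},
    json={
        # A single natural-language description replaces manual scene,
        # script, and asset configuration. Field name is an assumption.
        "prompt": (
            "A 30-second product update video: a friendly presenter "
            "announces our new dashboard and invites viewers to try it."
        ),
    },
)
resp.raise_for_status()
print(resp.json())
```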

📘 Ready to test it out? Head over to the Video Agent API Reference to try a request directly from your browser.


We now support voice configuration for the ElevenLabs V3 voice model in the Create Avatar Video API.

This is in addition to the existing supported models: eleven_monolingual_v1, eleven_multilingual_v1, eleven_multilingual_v2, eleven_turbo_v2, and eleven_turbo_v2_5 for video generation.

To use the ElevenLabs V3 model, set:

voice.elevenlabs_settings.model (string, optional): set to eleven_v3

voice.elevenlabs_settings.stability (float, optional): defaults to 1.0; allowed values are 0.0, 0.5, and 1.0
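
Put together, the voice block of a /v2/video/generate request might look like the sketch below; the voice_id and input_text are placeholders:

```python
# Voice configuration for ElevenLabs V3 inside a /v2/video/generate request.
voice = {
    "type": "text",
    "input_text": "This line is spoken with the ElevenLabs V3 model.",
    "voice_id": "<your_voice_id>",  # hypothetical placeholder
    "elevenlabs_settings": {
        "model": "eleven_v3",  # selects the V3 voice model
        "stability": 0.5,      # allowed: 0.0, 0.5, 1.0 (default 1.0)
    },
}
```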

We’ve introduced a new Video Translation mode, quality, designed to deliver significantly more natural lip-sync and context-aware translations. This mode is optional and available via the /v2/video_translate endpoint using the new mode parameter.

Available Modes

Fast Mode (default)

mode: fast

Cost: 3 credits per minute

Optimized for speed and standard translations, especially effective for videos with limited facial movement.

Quality Mode (new)

mode: quality

Cost: 6 credits per minute

Produces highly natural, context-aware lip-sync for premium translation results.

You may now specify the desired mode when calling the /v2/video_translate endpoint. If no mode is provided, it will default to fast.
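
Here is a minimal Python sketch of a Quality-mode request; video_url and output_language are assumed field names, so verify them against the reference below:

```python
import requests

resp = requests.post(
    "https://api.heygen.com/v2/video_translate",
    headers={"X-Api-Key": "YOUR_HEYGEN_API_KEY"},
    json={
        "video_url": "https://example.com/source.mp4",  # placeholder URL
        "output_language": "Spanish",                   # assumed field name
        "mode": "quality",  # omit to use the default "fast" mode
    },
)
resp.raise_for_status()
print(resp.json())
```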


Reference: https://docs.heygen.com/reference/video-translate#/

We’ve added a new endpoint to retrieve dynamic and custom variables defined in your AI Studio templates.

Endpoint: GET /v3/template/<template_id>

This endpoint lets developers programmatically access all variable definitions that can be dynamically populated during video generation, including avatar, text, and media elements (e.g., image, video). It enables full integration of AI-powered avatar video creation workflows using your own template schema.
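
For illustration, here is a minimal Python sketch of fetching a template's variable definitions; the exact response shape is an assumption:

```python
import requests

template_id = "<your_template_id>"  # hypothetical placeholder
resp = requests.get(
    f"https://api.heygen.com/v3/template/{template_id}",
    headers={"X-Api-Key": "YOUR_HEYGEN_API_KEY"},
)
resp.raise_for_status()
# The response is assumed to include the template's variable definitions
# (avatar, text, and media elements) keyed by variable name.
print(resp.json())
```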

Reference: https://docs.heygen.com/reference/get-template-v3#/