Avatars Overview

Avatars are animated presenters for your content. They can stand in for a talking head and speak using a stock AI voice or your custom voice clone. Just designate a speaker, assign an avatar, and Descript will generate an animated presenter to deliver your content onscreen.

Use a pre-made avatar from the Descript gallery or create your own by uploading a photo, generating an avatar image from a text prompt, or combining both. Learn how to create a custom avatar.

Avatars work with both text-to-speech (TTS) and your own recorded content. Each speaker in the composition can have one avatar assigned at a time, which applies consistently across the entire project.

Usage note

On current plans, this feature uses AI Credits. Learn more about tracking your Media Minutes and AI Credits.

Legacy and Sunset plans track usage differently. See our Understanding your Legacy and Sunset plan guide for details.

How to use avatars in your project

The workflow depends on where your audio comes from. You can:

  • Write a script and generate audio with text-to-speech.
  • Record your own voice directly in Descript.
  • Import pre-recorded audio, like a podcast or screen recording.

For step-by-step instructions for each workflow, see How to use avatars in Descript.

Manage avatars in a project

Once assigned, an avatar follows its speaker anywhere that speaker appears in your composition. You can:

  • Update or remove avatars by clicking the speaker label and choosing Update speaker’s avatar.
  • Resize, reposition, and crop the avatar in the Scene panel.
  • Use avatars with visual effects like Green Screen.
  • Regenerate the avatar at any time. (Note: changes to the script or speaker label require regeneration.)

[Image: Avatar gallery showing a selection of avatars]

Avatar generation and limits

  • When you click Generate avatar, you’ll see your remaining avatar minutes and the estimated generation time.
  • Avatars generate in the background; you’ll get an email when the render is complete.
  • The maximum avatar generation length is currently 12 minutes.

Tips for using avatars

  • Generate after you’ve finalized your script and speaker labels, since changes to either require regeneration.
  • Use Replace media to swap a camera layer for an avatar.
  • If an avatar isn’t visible, confirm it’s above other visuals in the layer stack; see layer order.