Avatars Overview


Avatars make it easy to create videos—no camera required. Just assign a speaker to an avatar, and Descript will generate a video featuring a lifelike, animated speaker delivering your content on screen.

What are avatars?

An avatar in Descript is an animated speaker that appears onscreen and syncs with your script. It can act as a digital stand-in for a real person, speaking either with a stock speaker or your custom AI speaker.

  • Select a stock avatar from Descript’s gallery or create your own custom avatar by uploading a photo.
  • Avatars work with both text-to-speech (TTS) and existing recorded speech audio.
  • Each speaker can have only one avatar assigned at a time, and the avatar appears consistently throughout the project.

The avatar gallery is the easiest way to get started quickly. Stock avatars are professionally designed to generate reliably and work well in a variety of settings.

When you assign an avatar, Descript adds an avatar layer to the scene editor. This layer isn’t visible in the timeline but is accessible via the Layer panel. You can position, crop, and layer your avatar like any other visual element in the scene editor.

Why use avatars?

  • Save time by skipping recording sessions: just type your script and generate.
  • Keep your content flexible. If your script changes, you can regenerate the avatar instead of re-recording.
  • Add a visual and human element to voiceovers or audio-only projects.
  • Quickly prototype or test video ideas without needing production equipment.

Avatar generation minutes are separate from Text-to-Speech minutes

If you pair your avatar with an existing voice track, only avatar minutes are used.
If you generate an AI voice for your avatar, you’ll use both avatar minutes and Text-to-Speech minutes. Learn more about these limits in our Understanding AI Limitations, Features, and Usage Tracking guide.
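
The split between the two balances can be sketched as a small calculation. This is only an illustration: the function name `minutes_used` is hypothetical, and it assumes both balances are charged for the full duration of the spoken audio with no rounding, which this guide does not specify.

```python
def minutes_used(audio_minutes: float, uses_ai_voice: bool) -> dict:
    """Return which balances a generation draws from (illustrative only)."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return {
        # Avatar minutes are used whenever an avatar is generated.
        "avatar_minutes": audio_minutes,
        # Text-to-Speech minutes are used only when the voice is AI-generated.
        # Assumption: charged for the same duration as the audio.
        "tts_minutes": audio_minutes if uses_ai_voice else 0.0,
    }


# Existing voice track: only avatar minutes are drawn.
print(minutes_used(3.0, uses_ai_voice=False))
# AI-generated voice: both balances are drawn.
print(minutes_used(3.0, uses_ai_voice=True))
```

Check your actual balances in Descript rather than relying on an estimate like this one.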

Creating avatars in a blank project

Avatar compositions default to 4K

When you start a project with an avatar, Descript sets the resolution to 4K to match the avatar’s source image. The generated avatar itself will be lower than 4K, and you can change the project resolution in Video settings at any time.

  1. Open a new project and click Create with AI speaker.
  2. Select an AI speaker. You can preview our available stock voices by clicking the play button.
    Using a custom AI speaker?

    To assign an avatar to a custom AI speaker, you’ll need to add the speaker to your project first. Follow the steps in Adding avatars to existing media to add your custom AI speaker, then assign an avatar.

  3. Click Choose speaker to make your selection.
  4. Next, choose an avatar. In this step, you can select one from the gallery or upload your own headshot. (If you upload a custom image, follow the best practices below for the best result.)
  5. Click Assign avatar to confirm your selection.
  6. Type your script in Write mode. Once you've finished writing your script, click Done writing in the top left corner of the script panel.
  7. Once all your edits have been made, click the Generate avatar button at the top of the scene editor to animate the avatar.

    Once the generation is complete, click Continue and your avatar will appear in the scene, synced to the script.

Generate avatars only as your final step. While an avatar is generating, you can't edit the content, and any later edit to your script requires a full regeneration of the avatar. Generation can take a while, especially for longer videos.

Adding avatars to existing media

  1. Open your project and add or assign a speaker label to the transcript or script track.
  2. Click the speaker label and select Assign avatar.
  3. Choose an avatar from the gallery or upload your own photo.
  4. You may need to reveal the avatar layer in the Scene panel by clicking the “show layer” icon. Then, drag the avatar above any existing layer to make it visible.
  5. Click Generate avatar to create the visual animation synced with your script.

If your existing video content is covering the avatar, adjust the layer order or use Replace media to change your media source to the avatar layer.

Custom avatar best practices

Uploading your own photo gives you complete creative control, but you’ll want to follow these guidelines for the best results:

  • Use a 16:9 image with the face in the upper portion and shoulders just slightly visible.
  • Ensure the subject is facing directly toward the camera—avoid side angles.
  • Use soft, even lighting to reduce shadows. Avoid reflections on glasses.
  • Clear, neutral backgrounds work best. Remove people, pets, or clutter.
  • Keep facial expressions neutral and accessories simple.
  • For more tips, see: Best practices for using custom avatars

Only JPEG, PNG, and WEBP file types are supported. Images are subject to automated safety checks and may be rejected if they include celebrities or inappropriate content.
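
Before uploading, you could pre-check a photo against the file-type and aspect-ratio guidelines above with a sketch like the following. The function name `check_headshot`, the extension list used to recognize JPEG files, and the 1% tolerance are assumptions made for illustration. The sketch does not reproduce Descript's own validation or safety checks, and you supply the image dimensions yourself.

```python
from pathlib import Path

# File types listed in this guide; ".jpg" and ".jpeg" are both treated as JPEG.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def check_headshot(filename: str, width: int, height: int,
                   tolerance: float = 0.01) -> list[str]:
    """Return a list of guideline problems; an empty list means none found."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file type")
    if width <= 0 or height <= 0:
        problems.append("invalid dimensions")
    elif abs(width / height - 16 / 9) > tolerance:
        # Hypothetical tolerance: allows small deviations from exactly 16:9.
        problems.append("aspect ratio is not 16:9")
    return problems


print(check_headshot("headshot.png", 1920, 1080))
print(check_headshot("headshot.gif", 1080, 1080))
```

Lighting, framing, and background still need a visual check; this only covers what can be measured from the file name and dimensions.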

Managing avatars

Once assigned, an avatar spans the entire composition and will appear wherever that speaker is present. You can:

  • Update an avatar by clicking the speaker label and choosing Update speaker's avatar.
  • Update or delete an avatar at any time, but this will require regeneration.
  • Position, crop, and resize the avatar layer using scene editor tools.
  • Use avatar layers alongside other effects like greenscreen or scene transitions.

Generated avatars are not tied to individual scenes; they apply across the entire composition. If you change the speaker, script, or avatar, Descript will prompt you to regenerate your avatar.

Avatar generation and limits

  • Clicking Generate avatar displays a modal with your remaining avatar generation minutes and the estimated time needed to complete your current generation.
  • The number of avatar minutes used is calculated from the total duration of the spoken audio, not just a selected portion.
  • Avatars will continue to generate in the background even if you leave the project. You’ll receive an email once your avatar has generated successfully.
  • At this time, the maximum length of an avatar generation is 12 minutes.

You can check your avatar time balance in the usage tab of your account settings.
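
As a rough pre-flight check, the limits above can be expressed as a short function. This is a sketch under stated assumptions: `can_generate` is a hypothetical name, the 12-minute cap is the figure given in this guide at the time of writing, and treating a generation of exactly 12 minutes as allowed is a guess. Descript's Generate avatar modal remains the authoritative source.

```python
MAX_GENERATION_MINUTES = 12  # current maximum stated in this guide


def can_generate(total_audio_minutes: float,
                 remaining_avatar_minutes: float) -> tuple[bool, str]:
    """Check the whole composition's spoken audio against the stated limits."""
    # Usage is based on the total spoken audio, not a selected portion.
    if total_audio_minutes > MAX_GENERATION_MINUTES:
        return False, "exceeds the 12-minute maximum"
    if total_audio_minutes > remaining_avatar_minutes:
        return False, "not enough avatar minutes remaining"
    return True, "ok"


print(can_generate(8.0, 30.0))
print(can_generate(12.5, 30.0))
print(can_generate(5.0, 3.0))
```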

Best practices for using avatars

  • Assign avatars after you've finalized your script and layout.
  • Use Replace media to easily substitute an avatar for video layers.
  • Position your avatar in the scene editor to ensure clear framing and visibility across scenes.

Coming soon

  • Create custom avatars using text prompts.
  • Preview avatars in motion before applying them.
  • Save and reuse custom avatars across projects.