Custom Avatars

Descript lets you create AI avatars that animate in sync with your speaker's voice. You can upload a photo of a human face or generate an avatar using a text prompt—no camera required. It's a flexible way to personalize your content while working within Descript's text-based editing workflow.

Create a custom avatar with an image

  1. Assign a speaker to your script.
  2. Click the speaker label and choose Assign avatar.
  3. Click Upload photo, then click the panel or drag and drop a supported image file (.jpeg, .png, or .webp).
  4. Preview your image and click Assign avatar to apply it.
  5. Once your script is finalized, click Generate avatar to animate your image and sync it with the speaker.

If your photo doesn't meet formatting or safety requirements, Descript will prompt you to upload a different one.

Create a custom avatar from a text prompt

  1. Assign a speaker to your script.
  2. Click the speaker label and choose Assign avatar to [speaker name].
  3. In the Text prompt tab, enter a description to generate an avatar, or click the Inspire button to use a suggested prompt. For best results, describe a realistic, human-like figure.
  4. Descript will generate three avatar options. You can:
    • Select one of the images to apply, or
    • Continue iterating on your prompt. Previous generations remain visible as long as the avatar modal stays open.
    Note: Closing the modal will clear your avatar history.
  5. Click Assign avatar to apply your selection.
  6. Once your script is finalized, click Generate avatar to animate the image and sync it with the speaker.

Generation workflow and timing

  • Avatar generation uses avatar minutes, not AI voice minutes.
  • Minutes are calculated based on total spoken audio duration.
  • You’ll see a modal showing your remaining avatar minutes and estimated usage before confirming.
  • The current max length of an avatar generation is 12 minutes.
  • Avatar generation continues in the background if you close the project. You’ll receive an email when it’s ready.

Only generate avatars as your final step. You won't be able to edit the script during generation, and any script edits afterward will require a full re-generation.
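
The minute accounting described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Descript's implementation: the function name, the input format (a list of spoken-audio durations in seconds), and the lack of any rounding are all assumptions. The confirmation modal in Descript remains the authoritative estimate.

```python
# Hypothetical helper: estimates avatar-minute usage before generating.
# Only the 12-minute limit and "minutes = total spoken audio" come from
# the article; everything else here is an assumption for illustration.

MAX_GENERATION_MINUTES = 12.0  # current max length of one avatar generation


def estimate_avatar_usage(spoken_seconds, remaining_minutes):
    """Return the estimated minutes and whether the generation can proceed.

    spoken_seconds: durations (in seconds) of the spoken audio segments.
    remaining_minutes: avatar minutes left on the account.
    """
    minutes_needed = sum(spoken_seconds) / 60  # no rounding applied (assumption)
    return {
        "minutes_needed": minutes_needed,
        "within_length_limit": minutes_needed <= MAX_GENERATION_MINUTES,
        "within_balance": minutes_needed <= remaining_minutes,
    }


if __name__ == "__main__":
    # Three segments totalling 300 seconds of speech, 10 avatar minutes left.
    print(estimate_avatar_usage([90, 150, 60], remaining_minutes=10))
```

If either check comes back false, shortening the script before generating avoids spending minutes on a run that would need to be redone.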

Managing and updating custom avatars

To update a custom avatar, click the speaker label and choose Update speaker’s avatar. This updates the avatar across your project while preserving its size, position, and visibility in the scene editor.

If the avatar doesn’t appear in the scene:

  • Open the Scene panel and click the "show layer" icon.
  • Drag the avatar layer above other visuals if needed.

Or use Replace media in the scene editor to swap it back into view.

Avatar layers aren’t visible in the timeline but can be cropped, styled, or repositioned in the scene editor. You can also apply effects like greenscreen.

Best practices for using custom avatars

Whether you’re uploading a photo or generating an avatar from a text prompt, use a clear, human-like face with even lighting and a relaxed, natural expression. Avoid using images or prompts that include animals, objects, or abstract characters—they won’t animate reliably.

Head position and framing

  • Framing: Use a close-up headshot with shoulders slightly visible. The subject should be centered and squared to the camera.
  • Head position: Face forward with the head upright. Avoid angled or three-quarter views that distort motion.

Glasses, mouth, and eyes

  • Glasses: Avoid heavy reflections. Eyes must be clearly visible.
  • Mouth: Ensure the mouth is unobstructed so the AI can sync lip shapes accurately.
  • Eyes: Eyes must be open and clearly visible. Avoid shadows or squinting that obscure the eye shape.

Background, foreground, and lighting

  • Background: Remove people or animals in the background; they won’t animate and may distract from the avatar.
  • Foreground: Keep the subject unobstructed. Avoid props or objects that block the face or shoulders.
  • Lighting: Use soft, even lighting to avoid harsh shadows.
  • Contrast: Ensure good contrast between the subject and the background for clean separation.

File upload requirements

  • Image content: Photos may be rejected if they contain celebrities or inappropriate content, such as nudity.
  • File types: Supported formats are JPEG, PNG, and WEBP.
  • File size: Maximum file size is 10MB.
  • Aspect ratio: Use a 16:9 aspect ratio for best compatibility with Descript’s scene editor.
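
If you prepare many photos, a quick local pre-check can catch obvious problems before you upload. The Python sketch below is one possible way to do that and is not part of Descript. The function name is hypothetical, and it makes three assumptions: .jpg files count as JPEG, "10MB" means 10 × 1024 × 1024 bytes, and you already know the image's pixel dimensions. It cannot check image content, which Descript reviews on upload.

```python
from pathlib import PurePath

# Assumption: ".jpg" is treated the same as ".jpeg".
ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".webp"}
# Assumption: 10MB is interpreted as 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024
RECOMMENDED_RATIO = 16 / 9


def check_avatar_photo(filename, size_bytes, width, height):
    """Return (problems, warnings) for a candidate avatar photo.

    problems: issues that conflict with the stated upload requirements.
    warnings: deviations from recommendations (16:9 is a best practice).
    """
    problems, warnings = [], []

    extension = PurePath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or 'none'}")

    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10MB")

    if width <= 0 or height <= 0:
        problems.append("invalid image dimensions")
    elif abs(width / height - RECOMMENDED_RATIO) > 0.01:
        warnings.append("aspect ratio is not 16:9")

    return problems, warnings


if __name__ == "__main__":
    print(check_avatar_photo("headshot.png", 2_000_000, 1920, 1080))
```

The aspect-ratio result is reported as a warning because the article recommends 16:9 for compatibility and does not state that other ratios are rejected.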