Custom avatars let you bring your own face, or someone else’s, into Descript videos. By uploading a headshot photo, you can generate a visual avatar that syncs with your script and speaker’s voice. It’s a powerful way to personalize your content while maintaining the flexibility of Descript’s text-based editing.
Upload and assign a custom avatar
- Assign a speaker to your script. This can be a stock AI voice or a human-recorded track.
- Click the speaker label and choose Assign avatar.
- Click Upload photo and select a supported image file (.jpeg, .png, or .webp).
- Preview your image and click Assign avatar to apply it.
- Once your script is finalized, click Generate avatar to animate your image and sync it with the speaker.
If your photo doesn’t meet formatting or safety standards, Descript will prompt you to upload a different one.
Generation workflow and timing
- Avatar generation uses avatar minutes, not AI voice minutes.
- The number of avatar minutes used is based on the total duration of the spoken audio, not just a selected portion.
- You’ll see a modal displaying your remaining avatar minutes and estimated usage before confirming generation.
- At this time, the maximum length of an avatar generation is 12 minutes.
- Avatars continue generating in the background if you close the project. You’ll receive an email when generation is complete.
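The usage rules above can be sketched as a quick pre-check before generating. This is only an illustration, not a Descript API: the names `estimate_avatar_usage` and `UsageEstimate` are hypothetical, and the sketch assumes usage is counted in fractional minutes with no rounding, which this article does not specify.

```python
from dataclasses import dataclass

# Limit stated in this article "at this time"; it may change.
MAX_GENERATION_MINUTES = 12.0


@dataclass
class UsageEstimate:
    minutes_needed: float       # avatar minutes the generation would use
    within_length_limit: bool   # True if at or under the 12-minute maximum
    within_balance: bool        # True if remaining avatar minutes cover it


def estimate_avatar_usage(total_audio_seconds: float,
                          remaining_minutes: float) -> UsageEstimate:
    """Hypothetical helper: estimate avatar minutes for one generation.

    Usage is based on the total spoken audio, not a selected portion,
    so pass the full duration of the speaker's audio.
    """
    if total_audio_seconds < 0 or remaining_minutes < 0:
        raise ValueError("durations must be non-negative")
    # Assumption: fractional minutes, no rounding applied.
    minutes_needed = total_audio_seconds / 60.0
    return UsageEstimate(
        minutes_needed=minutes_needed,
        within_length_limit=minutes_needed <= MAX_GENERATION_MINUTES,
        within_balance=minutes_needed <= remaining_minutes,
    )
```

For example, `estimate_avatar_usage(450, 10)` describes a 7.5-minute script checked against a 10-minute balance. The confirmation modal in Descript remains the authoritative source for remaining minutes and estimated usage.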
Managing and updating custom avatars
To update a custom avatar, click the speaker label and choose Update speaker’s avatar. This updates the avatar across your entire project while maintaining its visibility, size, and position on the scene editor.
If your avatar doesn’t appear in the scene, open the Scene panel and click the “show layer” icon to make it visible. If needed, drag the avatar layer above other visual layers, or use Replace media in the scene editor toolbar to swap it into view.
Avatar layers are not visible in the timeline but can be repositioned, cropped, or styled directly in the scene editor using standard tools. You can also apply effects like greenscreen to avatar layers.
Best practices for using custom avatars
To ensure your custom avatar generates properly and performs well in video, follow these guidelines when uploading your photo. These tips help Descript animate your avatar with more accurate mouth, eye, and head movements. Along with these suggestions, use a natural, relaxed expression rather than an exaggerated one. For best results, upload a clear image of a human face; we do not recommend using images of animals or objects.
Head Position & Framing
Attribute | Best Practices |
---|---|
Framing | Use a close-up headshot with shoulders just slightly visible. The subject should be centered and squared to the camera. |
Head Position | The head should be upright and facing forward. Avoid angled or three-quarter views, which can distort motion. |
Glasses, Mouth, and Eyes
Attribute | Best Practices |
---|---|
Glasses | Glasses are not recommended. If the subject wears them, reduce reflections so the eyes are clearly visible. |
Mouth | Ensure the mouth is visible and unobscured. It should be easy for the AI to interpret lip shapes for syncing. |
Eyes | Eyes must be clearly visible and open. Avoid shadows or squinting that might hide the eye shape. |
Background, Foreground, and Lighting
Attribute | Best Practices |
---|---|
Background | Consider removing people or animals from the background; they will not be animated and will remain still throughout the video. |
Foreground | A clean foreground ensures better subject recognition. Objects obstructing your avatar might cause issues with the final generation. |
Lighting Setup | Use soft, even lighting to minimize harsh shadows. |
Contrast | Maintain good contrast between the subject and background so the subject is easier to recognize. |
File Upload Requirements
Attribute | Best Practices |
---|---|
Image Content | Photos may be rejected if they contain celebrities or inappropriate content such as nudity. This is due to automated safety checks. |
File Types | Only JPEG, PNG, and WEBP formats are supported. |
File Size | Only files up to 10MB in size are accepted. |
Aspect Ratio | Use a 16:9 aspect ratio for best compatibility with Descript’s scene editor. |
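
The upload requirements above can be checked locally before you upload. The following is a minimal sketch, not Descript's actual validation: `check_photo` is a hypothetical helper, "10MB" is assumed to mean 10 × 1024 × 1024 bytes, and the `.jpg` extension is assumed to be accepted as a JPEG file. The 16:9 ratio is treated as a recommendation (a warning) rather than a hard requirement, and content checks such as celebrity or nudity detection are out of scope.

```python
from pathlib import Path

# Formats listed in this article; ".jpg" is an assumption (common JPEG extension).
ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".webp"}
# Assumption: "10MB" interpreted as 10 MiB.
MAX_BYTES = 10 * 1024 * 1024
# Recommended aspect ratio for the scene editor.
TARGET_RATIO = 16 / 9


def check_photo(filename, size_bytes, width, height, ratio_tolerance=0.01):
    """Hypothetical pre-upload check. Returns (errors, warnings) as lists of strings."""
    errors, warnings = [], []

    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported file type: {ext or '(none)'}")

    if size_bytes > MAX_BYTES:
        errors.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")

    if width <= 0 or height <= 0:
        errors.append("image dimensions must be positive")
    elif abs(width / height - TARGET_RATIO) > ratio_tolerance:
        # Recommendation only, so report as a warning rather than an error.
        warnings.append(f"aspect ratio {width}:{height} is not 16:9")

    return errors, warnings
```

A 1920×1080 PNG of about 2 MB passes with no errors or warnings, while a square 11 MiB GIF yields two errors (type and size) and one aspect-ratio warning.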