This page answers some of the most common questions and issues when using AI Speakers in Descript.
If you're not happy with the results, create a new AI Speaker with a fresh recording. There's no way to add more voice samples or training data to an existing AI Speaker.
When re-recording:
- Choose the tone and style you want — casual and friendly, formal and authoritative, or anything else.
- Speak clearly and intentionally; your delivery during the consent script directly shapes how your AI Speaker will sound.
- Use your usual microphone and recording setup.
Descript doesn't support style variations within a single AI Speaker. However, you can create multiple AI Speakers using the same process described above. For example:
- Create one AI Speaker with a casual, conversational tone.
- Create another AI Speaker with a more formal, professional delivery.
- Create additional AI Speakers with different emotional qualities or speaking styles.
Each AI Speaker will appear as a separate option in your speaker dropdown menu, allowing you to choose the most appropriate voice for different sections of your project.
Sometimes, an AI Speaker might mispronounce a word or phrase. We've created a guide with tips on getting the pronunciation right.
If the generated audio extends slightly beyond the words you edited, that's normal. AI Speakers are designed to generate a little more on either side to give you editing flexibility in case words don't transition perfectly. You can use the trim tool to line things up and a cross-fade to blend them together.
Text-to-speech and Regenerate currently do not work over sequences. When used on a sequence, the video will be removed. We're working on improvements (see more info).
In the meantime, try this workaround:
- Convert the AI voice clip into an audio layer.
- Select the AI audio clip and cut (Cmd + X or Ctrl + X).
- Use the Trim tool to restore the original audio/video in the Script track.
- Paste the AI audio clip as a layer above the original audio.
- Use the Blade tool to split the Script track, then mute the replaced portion in the Layer panel.
- Adjust as needed.
If you're typing text and no audio generates, try the following:
- Make sure speech generation is enabled for your speaker. If it isn't, create a new AI Speaker.
- Duplicate the project and check again.
- Create a new project to see if the issue persists.
AI speech uses a predefined voice model, so cadence and pacing are consistent. Here are some tips that might help.
The current model is based on US English pronunciation. We're exploring broader accent support in the future — upvote this on our feedback board. In the meantime, try using the Translate feature to create a version of your voice in another language.
To adjust the speed of AI-generated clips, you must first click Convert to audio. Once converted, you can adjust speed using the selection toolbar.
Unexpected audio artifacts are usually caused by issues in the source media. Try to eliminate:
- Static or sudden loud sounds
- Background noise (appliances, traffic, music)
- Excessive mouth noise or breathing
AI Speakers currently support English only. Support for additional languages is in development — share your feedback here.