Descript lets you choose which AI model powers your generated images and videos. Different models are better at different things: some work faster, others create more polished results, and some handle complex instructions better. This article explains the available generative media models in plain language, so you can quickly decide "Which model should I use?" without needing to be an AI expert.
Availability & usage
Model availability depends on your plan. Some premium models use more AI Credits than others. You can switch models anytime; choosing a new model only affects generations made after the switch.
Where to pick a model
Choose a model for each type of generative media in your App Settings, under the General tab, in the Advanced section.
Clicking this link will take you directly to your App Settings in Descript
If you only need to change the model for a single image or video generation, and don't want to modify your account-wide settings, select a model for that specific generation directly in the AI tool you're using.
Changing the model in this menu does not change your App Settings.
A note about model descriptions
The model descriptions below are a good-faith effort to characterize each model's strengths based on provider documentation and community feedback. Treat them as general guidance rather than precise specifications.
Image models
| Model | Overview | Provider | Model release date |
|---|---|---|---|
| Nano Banana (default) | Versatile all-purpose model for complex prompts and precise editing. Balances speed and quality across various tasks. | Google DeepMind | November 2025 |
| Flux 2 Pro | Balances speed and quality with enhanced text rendering and image restyling capabilities. Suitable for most image generation needs. | Black Forest Labs (BFL) | November 2025 |
| Qwen | Works well for readable text over artistic imagery. Supports Chinese, Japanese, and English with accurate multi-line layouts. | Alibaba / Qwen team | August 2025 |
| Nano Banana Pro | Renders lighting and textures with greater detail for both artistic and photorealistic prompts. | Google DeepMind | November 2025 |
| Flux Kontext [pro] | Maintains visual consistency across edits and generations. Balances quality and cost for projects requiring visual coherence. | Black Forest Labs (BFL) | May 2025 |
| GPT Image 1 | Designed for complex editing with nuanced instructions. | OpenAI | April 2025 |
| Flux [dev] | Fast, budget-friendly model for quick generation. Excels at creative, fantasy-style imagery and concept work. Uses fewer credits for rapid iteration. | Black Forest Labs (BFL) | August 2024 |
Video models
| Model | Overview | Provider | Model release date |
|---|---|---|---|
| Pixverse v5 (default) | Well-balanced for most video generation needs. Offers smooth motion, stable camera work, and good lighting with balanced speed, quality, and cost. | PixVerse AI | August 2025 |
| Kling O1 | Generates high-resolution, cinematic visuals up to 10 seconds long. Strong for visual quality and production value. | Kling AI (Kuaishou) | December 2025 |
| Veo 3.1 | High-end model with integrated audio. Features enhanced realism, synchronized dialogue and sound effects, and cinematic output. | Google DeepMind | October 2025 |
| Veo 3.1 [fast] | Faster version of Veo 3.1 with similar visual quality. Cost-effective for iteration. | Google DeepMind | October 2025 |
| Kling Video 2.5 Turbo Pro | Optimized for character animation and motion fluidity with fast rendering. | Kling AI (Kuaishou) | September 2025 |
| Sora 2 | Advanced cinematic model with exceptional realism, accurate physics simulation, and synchronized audio. Produces consistent multi-shot sequences. Note: cannot accept image inputs containing human faces, but can generate humans from scratch. | OpenAI | September 2025 |
| Hailuo 02 | Physics-focused model excelling at realistic movement and character consistency. Handles complex actions with high fidelity. | MiniMax | June 2025 |
| Wan v2.2 [turbo] | Speed-optimized for rapid experimentation and concept validation. Efficient resource usage ideal for early-stage work. | Alibaba | March 2025 |
Avatar generation models
| Model | Overview | Provider | Model release date |
|---|---|---|---|
| Hedra Character 2 (default) | Reliable general-purpose talking-head model. Good baseline for most projects where you want stable, straightforward delivery. 480p resolution. Maximum generation length of 12 minutes. | Hedra | October 2024 |
| Kling Avatar v2 | Balanced 720p avatar model for everyday use. Good mix of realism, motion, and stability for most production work. Supports additional attitude prompting. | Kling AI (Kuaishou) | December 2025 |
| Kling Avatar v2 Pro | High-fidelity 1080p avatar model. Strong choice for photorealism and expressive motion. Supports additional attitude prompting. | Kling AI (Kuaishou) | December 2025 |
FAQ
Which model should I use?
It depends! The tables above can help you match your goal (speed, polish, or handling of complex prompts) to an available model. You can always choose a different model in your App Settings and regenerate your content to see what works best for you.
Do premium models use more AI Credits?
Often, yes. Higher-end models and longer videos usually consume more credits. If you're exploring or drafting, try a "fast/lean" model first; switch to a premium model when you're ready to finalize. Learn more about how to track and understand your Media Minutes and AI Credits.
If I change models, will my previous generations update?
No. Model changes only affect new generations. Existing images/videos stay as-is.
Why don't I see a specific model?
Availability can vary by plan. If a model isn't available from your App Settings, check your plan details.
I'm seeing an error message. What does it mean?
These errors come directly from our model providers and can occur for a few different reasons. See Troubleshooting AI image and video generation errors.