More Voices, Better Audio, and AI Cover Art

This week brings major improvements to voice quality with AI-powered voice direction, an expanded library of 29 Gemini voices, and smarter podcast cover generation. Plus critical audio fixes and infrastructure upgrades.

We've been focused on making your podcasts sound more natural and professional. This week's updates add AI-powered voice coaching that helps hosts sound distinct and authentic, along with a much larger voice library to choose from.

New Features

AI Voice Direction for Natural-Sounding Hosts

Your podcast hosts can now sound more authentic and distinct from each other. The system generates a personalized voice profile for each host based on their personality traits and accent, then provides director-style coaching to guide the performance. Each episode segment also gets mood hints that help hosts match their energy to the content: excited for breaking news, thoughtful for deep dives.

You'll find expandable Voice Direction controls in your host settings where you can review and regenerate the AI-generated coaching notes. The Scene Description field on your podcast's general settings lets you set the overall tone and atmosphere.

Choose Your Text-to-Speech Provider

You can now pick between two text-to-speech engines for each podcast: Gemini (our new default with better rate limits) or ElevenLabs (for premium production quality). The Gemini voice browser lets you preview and select from 29 available voices, making it easy to find the perfect match for your hosts.

Switch providers anytime in your podcast's Hosts & Voices settings tab.

Expanded Voice Library

We've grown the Gemini voice library from 6 curated options to all 29 official Google Cloud voices. Each voice is now tagged by gender to help you filter and find the right fit faster.

Smarter Cover Art Generation

The AI cover generator now produces more authentic, intentional illustrations. We've refined the prompts to avoid generic AI-generated looks by prescribing specific illustration constraints, limited color palettes, and medium-specific authenticity markers.

Improvements

Better Script Direction Support

Script generation now uses a more natural direction format that works better with Gemini. Stage directions like pauses and reactions are formatted in a way that helps the AI understand and deliver them more naturally.

More Reliable Audio Processing

Fixed a critical bug where Gemini-generated audio files appeared to be the correct size but only played the first few seconds. The cause was how we combined multiple audio chunks: we now properly handle the raw audio data before creating the final file.
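
For the technically curious, this symptom is typical when each chunk carries its own file header and the chunks are joined byte-for-byte. The file has the full size, but players trust the first header and stop where the first chunk ends. The sketch below shows one common way to avoid that: join the raw samples first, then write a single header. It is an illustration only, not our production code. The function name and the audio format (16-bit mono PCM at 24 kHz) are assumptions.

```python
import io
import wave


def pcm_chunks_to_wav(chunks, sample_rate=24000, channels=1, sample_width=2):
    """Join raw PCM chunks and wrap them in one WAV container.

    Assumes every chunk is headerless PCM in the same format
    (the defaults here are an assumption, not a documented spec).
    """
    pcm = b"".join(chunks)  # concatenate samples only, no per-chunk headers
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)  # one header whose length covers all the audio
    return buf.getvalue()
```

Because the header is written once over the combined data, the reported duration matches the audio that is actually in the file.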

When Gemini text-to-speech fails and automatically falls back to ElevenLabs, the system now correctly maps voice settings to prevent errors.
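
A fallback like this can be sketched as follows. It assumes the two providers use different voice identifiers, so the voice has to be translated before the second attempt. The function names, the mapping table, and the placeholder ElevenLabs IDs are all hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from Gemini voice names to fallback voice IDs.
# The right-hand values are placeholders, not real ElevenLabs voice IDs.
GEMINI_TO_ELEVENLABS = {
    "Kore": "elevenlabs-voice-a",
    "Puck": "elevenlabs-voice-b",
}
DEFAULT_FALLBACK_VOICE = "elevenlabs-voice-default"


def synthesize_with_fallback(text, gemini_voice, gemini_tts, elevenlabs_tts):
    """Try the primary provider, then retry on the fallback with a mapped voice.

    gemini_tts and elevenlabs_tts are caller-supplied callables taking
    (text, voice) and returning audio bytes.
    """
    try:
        return gemini_tts(text, gemini_voice)
    except Exception:
        # Never pass a Gemini voice name straight to the other provider.
        fallback_voice = GEMINI_TO_ELEVENLABS.get(gemini_voice, DEFAULT_FALLBACK_VOICE)
        return elevenlabs_tts(text, fallback_voice)
```

The default entry means a voice missing from the table still produces audio instead of an error.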

Improved Script Preprocessing

Section markers like "INTRO" and "SEGMENT 1" are now automatically removed from scripts before they're sent to text-to-speech, so they won't accidentally be read aloud in your episodes.
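
As a rough illustration of this kind of cleanup, the sketch below removes marker lines with a regular expression. It assumes markers sit alone on their own line, optionally wrapped in brackets or followed by a colon. Only "INTRO" and "SEGMENT 1" come from the description above; "OUTRO" and the function name are assumptions.

```python
import re

# Matches a whole line that contains only a section marker.
# "OUTRO" is an assumed marker, added for illustration.
SECTION_MARKER = re.compile(
    r"^[ \t]*[\[(]?(?:INTRO|OUTRO|SEGMENT[ \t]+\d+)[\])]?:?[ \t]*(?:\r?\n|$)",
    re.MULTILINE,
)


def strip_section_markers(script: str) -> str:
    """Drop marker-only lines so they are never sent to text-to-speech."""
    return SECTION_MARKER.sub("", script)
```

Anchoring the pattern to a full line keeps ordinary sentences intact, so a host can still say "the INTRO" in dialogue without it being cut.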

Bug Fixes

Fixed an issue where podcast audio assets (intro jingles, transition sounds, outro music) were missing from Gemini-generated episodes. Your branded audio now plays correctly regardless of which text-to-speech provider you use.

Knowledge graph isolation now works correctly per podcast. Previously, entities and topics from one podcast could leak into suggestions for another podcast in the same account.
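
Conceptually, per-podcast isolation means every lookup is scoped by the podcast it belongs to. The sketch below shows the idea with an in-memory list. The Entity type, its fields, and the function name are hypothetical, and a real knowledge graph would apply the same filter at the query level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    podcast_id: str  # owner of this entity (hypothetical field name)
    name: str


def suggest_entities(entities, podcast_id, prefix=""):
    """Return entity names for one podcast only, optionally filtered by prefix."""
    wanted = prefix.lower()
    return [
        e.name
        for e in entities
        if e.podcast_id == podcast_id and e.name.lower().startswith(wanted)
    ]
```

Filtering on the podcast ID before any other matching is what keeps one show's topics out of another show's suggestions.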