We've been focused on making your podcasts sound more natural and professional. This week's updates bring AI-powered voice coaching that helps hosts sound distinct and authentic, plus we've unlocked a much larger voice library to choose from.
New Features
AI Voice Direction for Natural-Sounding Hosts
Your podcast hosts can now sound more authentic and distinct from each other. The system generates a personalized voice profile for each host based on their personality traits and accent, then provides director-style coaching to guide the performance. Each episode segment gets mood hints that help hosts match their energy to the content: excited for breaking news, thoughtful for deep dives.
You'll find expandable Voice Direction controls in your host settings where you can review and regenerate the AI-generated coaching notes. The Scene Description field on your podcast's general settings lets you set the overall tone and atmosphere.
Choose Your Text-to-Speech Provider
You can now pick between two text-to-speech engines for each podcast: Gemini (our new default with better rate limits) or ElevenLabs (for premium production quality). The Gemini voice browser lets you preview and select from 29 available voices, making it easy to find the perfect match for your hosts.
Switch providers anytime in your podcast's Hosts & Voices settings tab.
Expanded Voice Library
We've grown the Gemini voice library from 6 curated options to all 29 official Google Cloud voices. Each voice is now tagged by gender to help you filter and find the right fit faster.
Smarter Cover Art Generation
The AI cover generator now produces illustrations that look more authentic and intentional. The refined prompts specify illustration constraints, limited color palettes, and medium-specific authenticity markers, which steers results away from a generic AI-generated look.
Improvements
Better Script Direction Support
Script generation now uses a more natural direction format that works better with Gemini. Stage directions like pauses and reactions are formatted in a way that helps the AI understand and deliver them more naturally.
More Reliable Audio Processing
Fixed a critical bug where Gemini-generated audio files appeared to be the correct size but only played the first few seconds. The issue was in how we combined multiple audio chunks; we now properly handle the raw audio data before creating the final file.
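For readers curious about this class of bug, here is a minimal sketch of one way to combine audio chunks safely. It assumes each chunk is headerless 16-bit mono PCM at 24 kHz; those parameters and the function name combine_pcm_chunks are illustrative assumptions, not a description of our actual pipeline. The idea is to join the raw samples first and write one container header for the whole file, so the header's declared length covers every chunk.

```python
import io
import wave


def combine_pcm_chunks(chunks, sample_rate=24000, channels=1, sample_width=2):
    """Join raw PCM chunks and wrap them in a single WAV container.

    Assumed format (hypothetical defaults): 16-bit mono PCM at 24 kHz.
    """
    # Concatenate the raw sample data first, before any header exists.
    pcm = b"".join(chunks)

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        # One header is written for the full payload, so players see the
        # complete duration rather than only the first chunk's length.
        wav.writeframes(pcm)
    return buf.getvalue()
```

By contrast, gluing together chunks that each carry their own header tends to produce a file with the right byte size whose first header describes only the first chunk, which matches the symptom described above.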
When Gemini text-to-speech fails and automatically falls back to ElevenLabs, the system now correctly maps voice settings to prevent errors.
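The fallback behavior can be pictured with a small sketch. Everything here is hypothetical: the mapping table, the placeholder voice IDs, and the synthesize_with_fallback function are illustrative assumptions, and the two providers are passed in as plain callables rather than real SDK clients.

```python
from typing import Callable

# Hypothetical lookup from Gemini voice names to ElevenLabs voice IDs.
# The IDs below are placeholders, not real ElevenLabs identifiers.
GEMINI_TO_ELEVENLABS = {
    "Kore": "elevenlabs-voice-id-a",
    "Puck": "elevenlabs-voice-id-b",
}
DEFAULT_ELEVENLABS_VOICE = "elevenlabs-default-voice-id"

# A synthesizer takes (text, voice) and returns audio bytes.
Synth = Callable[[str, str], bytes]


def synthesize_with_fallback(
    text: str, gemini_voice: str, gemini_tts: Synth, elevenlabs_tts: Synth
) -> bytes:
    """Try the primary provider; on failure, retry with a translated voice."""
    try:
        return gemini_tts(text, gemini_voice)
    except Exception:
        # Translate the voice before retrying. Passing a Gemini voice name
        # straight to the other provider is the kind of mismatch that fails.
        fallback_voice = GEMINI_TO_ELEVENLABS.get(
            gemini_voice, DEFAULT_ELEVENLABS_VOICE
        )
        return elevenlabs_tts(text, fallback_voice)
```

Falling back to a default voice for unmapped names keeps the episode generating even when a host uses a voice with no configured equivalent.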
Improved Script Preprocessing
Section markers like "INTRO" and "SEGMENT 1" are now automatically removed from scripts before they're sent to text-to-speech, so they won't accidentally be read aloud in your episodes.
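As a rough illustration of this kind of preprocessing, the sketch below drops lines that consist only of a section marker. The pattern is an assumption: it covers "INTRO" and "SEGMENT 1" from the examples above, plus "OUTRO" and optional brackets or a trailing colon, which are guesses about what a script might contain. The name strip_section_markers is likewise hypothetical.

```python
import re

# Hypothetical pattern: a line made up solely of a marker such as
# "INTRO", "OUTRO", or "SEGMENT 1", optionally in brackets or with a colon.
SECTION_MARKER = re.compile(r"^\s*\[?(?:INTRO|OUTRO|SEGMENT\s+\d+)\]?:?\s*$")


def strip_section_markers(script: str) -> str:
    """Return the script with marker-only lines removed."""
    kept = [
        line for line in script.splitlines()
        if not SECTION_MARKER.match(line)
    ]
    return "\n".join(kept)
```

Matching whole lines only means ordinary dialogue that happens to mention an intro or a segment is left untouched.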
Bug Fixes
Fixed an issue where podcast audio assets (intro jingles, transition sounds, outro music) were missing from Gemini-generated episodes. Your branded audio now plays correctly regardless of which text-to-speech provider you use.
Knowledge graph isolation now works correctly per podcast. Previously, entities and topics from one podcast could leak into suggestions for another podcast in the same account.