Google has updated the voice it uses for text-to-speech and search in Google Now and the Search app on iOS and Android; the new voice is meant to sound more natural and human. YouTubers Nat and Lo, who have over 32,000 subscribers, have posted behind-the-scenes footage of how the voice was developed.
One of the key differences between the new voice and the old one is that Google focused on human intonation (pitch variation) and prosody (intonation, tone, stress, and rhythm). Basically, the new Google voice does a better job of understanding and mimicking pitch, tone, and other melodic elements of speech, so it sounds more like a person.
Google assembled a team including a voice coach, a linguist, and the voice talent (who didn’t show her face on camera) to record a combination of phrases and words to upload to a database. Once this process was done, they spliced together chunks of recordings and speech patterns to create more natural-sounding speech.
YouTubers Nat and Lo, who produced short documentaries at Google, have posted a behind-the-scenes look at the team improving Google’s voice: