We've come a long way since an IBM 704 first croaked its way through Daisy Bell. Now we've got Siri copping an attitude when we ask a stupid question and Google Now feeding us information in an incredibly realistic-sounding voice. AT&T has its own initiative, dubbed Natural Voices. At this morning's Foundry event, one demo involved using the voice synthesis engine to read a children's book -- specifically Goldilocks and the Three Bears. This isn't just another text-to-speech demo, though: StorEbook uses the impressive and appropriately named library of sampled phonemes to speak in a unique, realistic voice for each character. What's more, the web-based app automatically chooses the most appropriate voice from a library of dozens, based on character traits input by the developer, Taniya Mishra.
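To make the voice-matching idea concrete, here is a minimal sketch of how an app might pick a voice from hand-entered character traits. It is not AT&T's method, which wasn't detailed at the demo; the voice names, trait labels and the simple overlap score are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    name: str            # hypothetical voice identifier
    traits: frozenset    # hypothetical trait labels describing the voice


def pick_voice(character_traits, voices):
    """Return the voice sharing the most traits with the character.

    Ties go to whichever voice appears first in the list.
    """
    wanted = set(character_traits)
    return max(voices, key=lambda v: len(wanted & v.traits))


# Made-up voice library (the real one reportedly holds dozens of voices).
VOICES = [
    Voice("voice_a", frozenset({"adult", "male", "gruff"})),
    Voice("voice_b", frozenset({"adult", "female", "warm"})),
    Voice("voice_c", frozenset({"child", "female", "bright"})),
]

# Traits a developer might enter by hand for each character.
CHARACTERS = {
    "Papa Bear": {"adult", "male", "gruff"},
    "Mama Bear": {"adult", "female", "warm"},
    "Goldilocks": {"child", "female", "curious"},
}

casting = {
    name: pick_voice(traits, VOICES).name
    for name, traits in CHARACTERS.items()
}
print(casting)
```

A real system would presumably weigh traits differently and draw on far richer voice descriptions, but the basic shape (score every voice against the character, keep the best) is likely similar.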
In the future, she envisions a system smart enough to analyze the text of a story, pick out each character's salient traits on its own and assign a voice to that character. It might even use algorithms to modify vocal features to convey emotion or age a character. Perhaps the most ambitious idea is to create personalized voices. A child could then have a story read to him or her, virtually, by a parent or grandparent. A mother would need to create a database of her voice first, by reading a few hundred sentences. This wouldn't necessarily mean sitting down and reading through 100 sample sentences in one shot, though. Theoretically, the necessary data could be collected over time through recorded voice searches, commands or conversations (if you're willing to accept something that intrusive and creepy). There are still some rough edges, and no one is going to mistake Natural Voices for actual natural voices. But Mishra's goals aren't as far-fetched as you might imagine -- the era of the vocal computer is upon us, friends.