IBM develops computerized voice that actually sounds human
If there's one thing that still grates on our nerves, it's automated calling systems. Or, more specifically, the robotic beings that simply fail to understand our slang and incomprehensible rants. IBM's working hard and fast to change all that, with a team at the company's Thomas J. Watson research division developing and patenting a computerized voice that can utter "um," "er" and "yes, we're dead serious." The sophisticated system adds in the minutiae that make conversation believable to Earthlings, and it's even programmed to learn new nuances and react to phrases such as "shh." The technology has been awkwardly dubbed "generating paralinguistic phenomena via markup in text-to-speech syntheses," and while exact end uses have yet to be discussed publicly, we can certainly imagine a brave new world of automated CSRs.

I welcome our... er... human sounding... um... robotic overlords (yes I'm dead serious)
Beat me to it by mere minutes
Hasta la vista... um... baby! I'll be... er... back!
Now all we need is the dyslexia plug-in and the stutter/lisp filter and it will be perfect.
Er, um...apparently providing a video or audio sample still evades IBM.
This is perfect for the new robot I'm developing that farts and has bad breath.
"Oh, E71, I am so hot for you. Uh, Uh, Uh..."
Next step...cybernetic organisms!
And then... The WORLD!!!
ooo...definitely took a double take on that one. thank you.
How is this news? We already have this technology built into AHNOLD SCHWARZENEGGAH. rofl
Would...you...like...to play...a game?
Tic Tac Toe?
Mr. McKitrick, after careful consideration I have come to the conclusion that your defense system sucks.
Global.... thermonuclear war?
One word: proofread
One word: proofreak
I knew it. Towards the end of the century we won't be driven to extinction by terminators, Skynet, or insect-like hover drones. No, our fate will be sealed by an army of semi-autonomous Rosie Perez (plural: Rosies Perez???) that speaks worse English than we do.
Telemarketers all just got scared about losing their jobs.
YAY!
Better yet, 900 operators are scared about losing their jobs ( =
This article is useless without an example :(
microsoft sam better polish up his resume
I wouldn't want it to sound too real though.
I'll be more impressed when they've got it speaking Mumbai-accented English.
http://www.research.att.com/~ttsweb/tts/demo.php
choose Anjali...
SOI
Actually the Sprint automated voice is pretty good. I forgot her name but it's definitely the best I've heard. Not believable but close enough.
this article has inspired me to start shhh'ing tele-marketers.
Exactly what I was thinking...
"Please state your name followed by the, whatchamakallit... ah, hash mark."
"Your call is now being directed to a er, specialized operator."
Robot voice, robot voice...all the kids on the streets love the robot voice.
Of particular interest is the image of the guy trying to insert some admittedly small, but clearly not small enough, black thing in his ear and frowning at the insurmountable difficulties the problem presents.
big boy... BIG BOY!
This is your president, John Henry Eden.
hahahahaha!!!!! uprated for the fantastic, perfectly placed yet still subtle reference to an amazing game!!!!!!! :-D
I hereby declare your comment as the best on this article.
It is not possible to outdo you.
Polish synth is still better:
http://www.ivona.com/
"We robots can try to help with your queries if you hold on, but if you still want to speak to a human press 0".
I have to say I think TomTom did this first.
Anyone ever heard the default computer voices?
What do you think? Routing quality aside.
IBM is the real world Cybernet.
1. Audio or it didn't happen.
2. I imagine this could get annoying. I get pissed when it takes more than 2 seconds for a page to load. Imagine waiting while your GPS system "um"s and "er"s through a set of driving directions.
Can they give it a Bengali accent?
All your base are belong to us
Yamaha already has computer voices that sound human and can sing!
http://www.vocaloid.com/
One step closer to Skynet.
Had to be said.
Alex sounds OKay to me.
Can you choose the Pierce Brosnan voice?
No one can find a sample of this?
Come on engadget! :)
It sounds like this research will also improve voice recognition; think Dragon Naturally Speaking...
Okay, so we get an article about the voice, yet no sound clip? I know this isn't engadget's fault but someone must be blamed for this!
and the plus of this is? this is just a gift to telemarketers, i'm still going to have to wait hours on those helplines regardless.
Nick Ross
www.jazdtech.com
I still heart microsoft sam
Arsenal Gear anyone?
Does it also understand when I say F you? That would increase responsiveness when I'm talking to automated systems for phone companies.
Is it better than AT&T's speech lab demo?
http://www.research.att.com/~ttsweb/tts/demo.php
"I am afraid, Dave"
This should be unnerving in the near future- you get through a service support call, you really think you've gotten through to the person on the other side of the phone, then you detect something strange in one response- wait, am I talking to a human or not? What a customer service flub! The feeling of deception created by this possibility is sure to shove this genie back in the bottle for a little while longer.
Actually, the guy in the photo is an IBM employee in R&D. Or rather, he was. You see, he's just taken a call from the robotic voice that he just helped to create, which has just told him he's being let go, despite the better-than-expected earnings last year.
Hey! For the record, that's a picture of me from an online video about speaking to a sentient automated calling system. It's funny and should be watched immediately:
http://www.youtube.com/watch?v=N1H4JlJcIhc
PS. What would YOU say if a phone system started talking about "donkey balls?"
It's not that computer generated voices are getting better, it's just that spoken English is getting worse.
Hello, and again, welcome to the Aperture Science Computer Aided Enrichment Center. We hope your brief detention in the relaxation vault has been a pleasant one. Your specimen has been processed, and we are now ready to begin the test proper.
Yeah, that's just what we need, a computer CSR agent that picks up new words and phrases of speech from all the idiots across America that call in to bitch about their products/services..
"Uh, Sir, would you like to suck my tits"
"What, wait, all I want to do is extend my warranty, but if you insist"
Cops slam in doorway, "Get your muthafuckin hands up" "You just been set up by the Po-Leece"
whatever, fuck it
"Ahem... errr... Excuse me... SHUSH! Thanks. Take the next, errr, left. Damn we've gone past it.
This screenshot is from a short, fun vid showing off some of this technology that was developed at Carnegie Mellon. Check it out: http://www.youtube.com/watch?v=QTnyAg2Mho4