SpeechRecognition

Latest

  • Nuance gobbles up Vlingo, yearns to transcribe its own announcement

    by 
    Dante Cesa
    12.21.2011

    Apparently, if you can't (legally) beat them, you buy them. Such is the thinking over at Nuance, which has decided to acquire its competitor and former courtroom dance partner, Vlingo. Should make for some nice additions to the former's voice recognition tubes -- technology which powers everything from Apple's Siri to Dragon dictation and even various autos. No indications as to how many greenbacks changed hands, but the newlyweds were happy to boast their "complementary research and development efforts" will result in a company "stronger together than alone." We'll have to see about that. PR after the break.

  • Programmable robots coming to Korean stores, will assimilate your Android phone

    by 
    Mat Smith
    12.09.2011

    South Korea loves its robots. While the country prepares them to teach the kids and guard its prisons, smartphone-compatible models are now propping up shelves in hobbyist shops. Dongbu Robot (previously Dasarobot) is launching several new products for wannabe bot engineers, but it's the Google OS-compatible HOVIS kits that caught our eye. While we already know Android-powered bots can make a mean cocktail, these kits will get new features programmed into them through a phone's Bluetooth and WiFi connections. The basic wheeled model can be upgraded to fully-fledged legs, while Dongbu Robot is working alongside the country's SK Telecom network to offer speech recognition as the first software add-on, with plans for education and home security all in the pipeline. The price of sowing the seeds of the Robopocalypse? Around $620 for the starter model. Sound like too much? Well, there's always Romo.

  • Developer teases voice control of Zune, using PC and Windows Phone (video)

    by 
    Zachary Lutz
    11.29.2011

    The great thinkers of the world have long known a secret that we're now happy to disclose: it's not necessity that's the mother of invention, but rather laziness. Fortunately, expending a great deal of effort on a project -- simply to perform a task effortlessly -- sometimes brings very cool results. A concept app known as ZuneVoice, which is used to control Zune software on the PC with only a standard microphone and spoken commands, easily passes muster in this realm. As you can see in the demo video, its creator, keyboardp, is able to play individual songs, issue commands such as "pause" or "next song", and even display full-screen music videos from YouTube. The developer even crafted an app for his Lumia 800 known as PhoneZune, which serves as a remote control for times when he's away from the box. Neither application is yet publicly available, though feedback is welcome. Next, we're told to expect Kinect integration. Perhaps one day, these gems will see the light of day.

  • Jailbroken iOS 5 devices get Siri0us, tap into Nuance's dictation servers (video) (update)

    by 
    Zachary Lutz
    11.29.2011

    Sure, it's leaps and bounds away from all the parlor tricks that Siri is able to perform, but now, jailbroken iPhone 4, iPhone 3GS and iPod Touch devices -- that have been upgraded to iOS 5 -- may access the dictation portion of Siri's prowess. Thanks to Siri0us, the free app available through Cydia, users will gain the option to speak messages and search queries rather than type them, which could be a huge time saver -- unless there's a series of mistakes, anyway. Rather than accessing Apple's own system, the app works by tapping into Nuance's Dragon Go servers for speech recognition. Rather subversive, don't you think? If you'd like to get in on the fun (before Nuance breaks up the party), just check the video following the break. Update: Well, who didn't see this one coming? Nuance has pulled the rug out from under Siri0us, and the app has been yanked from Cydia while the developer searches for another speech recognition server. Happy hunting, dude.

  • Nuance Dragon Dictate 2.5 for Mac review

    by 
    Darren Murph
    09.06.2011

    Voice recognition. Or, more specifically, speech recognition. It's one of those technological wonders that we all seem to take for granted, while simultaneously throwing laughter its way for not being nearly sophisticated enough. Anyone who's used an early generation Ford SYNC system -- or pretty much any vehicular voice command system -- knows exactly what we're getting at. While processing speeds and user interfaces have made great strides in the past handful of years, voice recognition has managed to continually disappoint. It's not that things aren't improving, it's just that they aren't improving at the same rate as the hardware and software surrounding them. Even today, most new automobiles have to be spoken to loudly, pointedly and directly, and even then it's a crapshoot as to whether or not your command will be recognized and acted upon. For as much as we complain, we totally get it. Teaching a computer program how to recognize, understand and act upon the movement of human vocal cords is a Herculean task. Throw in nearly unlimited amounts of dialect and regional variation within even a single language, and it's a wonder that programs such as Nuance's Dragon Dictate even exist. Teaching a vehicle how to route calls, adjust volume and tweak a radio station is one thing, but having a program that turns actual speech into presentable documents requires a heightened level of accuracy. The newest build of Dragon Dictate for Mac (v2.5) allows users to seamlessly combine dictation with mouse and keyboard input in Microsoft Word 2011; it also gives yappers the ability to more finely control how Dragon formats text such as dates, times, numbers and addresses, while a free iOS app turns your iPhone, iPad or iPod touch into a wireless microphone. We recently pushed our preconceived notions about this stuff aside in order to spend a solid week relying on our voice instead of our fingertips -- read on to see how it turned out.

  • Leak: future iOS update to introduce Siri-based voice control

    by 
    Sean Buckley
    07.25.2011

    When Apple snatched up Siri back in April of last year, we had to wonder exactly what Cupertino was planning for the voice controlled virtual assistant. The answer, according to a new leak, is unsurprisingly obvious: iOS integration. A screenshot leaked to 9to5Mac flaunts an "Assistant" feature presumably built into a firmware update. To back up the screenshot, the aforesaid site dove into the iOS SDK and uncovered code describing Siri-like use of the iPhone's location, contact list, and song metadata. The code also outlined a "speaker" feature, opening a door for further Nuance integration in Apple products. Sound awesome? Sure it does, but keep it salty: 9to5's source says the assistant feature only just went into testing, and may not be ready in time for Apple's next big handset upgrade. Hit the source link to see the code and conjecture for yourself.

  • Dragon Go! is a must-have voice search app for your iPhone

    by 
    Mel Martin
    07.14.2011

    Like the proverbial genie in the bottle, you can ask a lot of Dragon Go! and have a pretty good chance of the app granting your wish. Dragon Go! is the latest free app from Nuance, creators of Dragon Dictate for the Mac and Dragon Dictation for iOS devices. In this latest app, Nuance has delivered what they consider the next generation of voice search, and after several days of testing I have no reason to doubt it.

    Here's the deal. Speak just about anything to Dragon Go! and it will try to parse your meaning and bring up the right set of tools to complete your search. Ask for news about Libya, or news about Libya from the New York Times, and the app complies. Ask for reservations for two at a favorite restaurant and OpenTable is queried. Directions from your current location to the nearest hospital will launch Google Maps with the route. Say a product name, like JBL speakers, and an Amazon page comes up with the JBL speakers Amazon sells. It gets better. Ask it to play an artist on Pandora, and if you have the app installed it will launch and start playing the artist you asked for. Say "Play the Beatles" and if you have the Beatles on your device the music will play. You can also direct a query to a particular site. I tried "stories about Apple TV on TUAW" and it brought up a list from our website. Then a tough test. I asked to see pictures of obscure character actor Whit Bissell and the images popped up right on cue. Check our gallery. Holy Moly!

    No app is perfect, and every so often Dragon Go! botched a search, but most questions I asked delivered useful answers. It may seem like the app has a bit of overlap with Siri, which is also powered by Nuance technology. There is some, but Dragon Go! reaches deeper and takes you to the appropriate place on the web, rather than trying to contain the info within the app itself. The sources Dragon Go! is using are displayed at the top of the screen. You can change those sources manually if you want. The default search engine is Google, but Bing and Yahoo! are fine if you'd rather use them. I found Dragon Go! an extraordinarily useful app in day-to-day use. I can only scratch the surface of its capabilities in this review. You must try it for yourself. I was often wishing this kind of technology was built into my iPhone at the system level, and I'll bet Nuance wishes it were too. Of course with Apple buying Siri, we may see something similar.

    Dragon Go! is free, and iPhone-only at this point. According to Matt Revis, VP of Product Management at Nuance, the app is US English for now. It will come to Android sometime in the future, and also to the iPad. For all intents and purposes it replaces Dragon Search, which is not as full featured. The app will continue to function, but it won't be downloadable from the US app store. My guess is that most people will replace it with Dragon Go! anyway. I'd seriously recommend you download and give the app a test drive. It's a great iPhone demo, and I think it will work its way into your daily routine. Share your experiences with us, and tell us what you like and what you don't like.

  • Pioneer solicits Whodoo guinea pigs for speech-based Android assistant (video)

    by 
    Zachary Lutz
    07.13.2011

    Ever wish you could have a personal attendant living inside your Android smartphone? You know... one you can boss around without incurring human rights or labor law violations? Apparently Pioneer shares your vision, because its voice-controlled social assistant named Whodoo is seemingly ready to "hop to" at a moment's notice -- willing to locate a restaurant and send it to friends, route the appropriate directions, and announce your intentions to Facebook or Twitter -- all based on your verbal commands (and ostensibly perfect for in-dash navigation). The company is seeking bossy applicants for its closed beta experiment, which involves completing a lengthy application, providing considerable feedback, and submitting audio samples that are gathered by Whodoo. Think you've got the chops? Just follow the source, where you're free to convince Pioneer of the same.

  • Nuance buys SVOX ahead of iOS 5 release

    by 
    Mike Schramm
    06.16.2011

    There's a whole trail of rumors hinting at an upcoming deal between speech recognition company Nuance and Apple. For quite a while now (ever since Apple picked up personal assistant software maker Siri), the scuttlebutt has claimed that the folks in Cupertino would make a deal with Nuance for some kind of speech recognition, most likely an iOS-level integration that would allow you to ask your iOS device for whatever you want, and get it quickly and easily. But even if that deal is on, that hasn't slowed Nuance down. The company has acquired another speech recognition firm, SVOX, the creators of high-end speech recognition and text-to-speech services. That's a natural fit for Nuance, of course, and the release says that the new deal "will advance the proliferation of voice in the automotive market, and accelerate the development of new voice capabilities that enable natural, conversational interactions between consumers and their connected cars, mobile phones, and other consumer devices." Sounds exciting to us. We didn't actually get to see either Siri or an updated voice control service show up during the iOS 5 announcement at WWDC, but that doesn't mean it's completely off the table yet. Maybe a deal like this is just what Nuance needs to set up the partnership that Apple's reportedly been seeking for a while.

  • Dragon NaturallySpeaking 11.5 updates your Facebook, turns your iPhone into a wireless mic

    by 
    Terrence O'Brien
    06.15.2011

    All your sci-fi dreams of being able to talk to your gadgets and have them do your bidding are slowly becoming a reality. Nuance, the company behind Dragon NaturallySpeaking, has been at the forefront of the technology since 1997 and, with the release of 11.5, it has added a few neat tricks to its dictation-taking repertoire. On the desktop side, new widgets allow you to post updates to your Facebook and Twitter accounts simply by saying "post to" your social network of choice before spouting off your status update -- perfect for drunk tweeting when those beer goggles make it hard to hit the keys. Nuance also released the Dragon Remote Mic App for iOS, which turns your Apple device into a wireless mic that beams commands and dictated notes straight to your PC. We're pretty excited for all this voice control stuff -- so long as our computers don't start refusing our requests in a detached monotone. Check out the PR after the break.

  • Chaufr lets you shout searches, yell URLs at Chrome

    by 
    Terrence O'Brien
    05.31.2011

    Generally, shouting commands at the internet isn't going to get you very far but, if you're just yelling a few destinations and search terms, Chrome extension Chaufr can take you where you need to go. A previous add-on, Speechify, let you speak to fill input fields, but couldn't help you actually navigate the web. Chaufr, on the other hand, lets you simply say the magic word -- "Engadget" -- and it drops you right at our online doorstep. You can also use it to perform searches by saying Wikipedia, Google, Amazon, YouTube, or Yahoo followed by whatever it is you're looking for. It worked well enough in our brief hands-on, but we do have one nit to pick -- activating voice input requires you to click on an icon in the toolbar, then click on a microphone in the drop-down menu. (Can't a brother get a keyboard shortcut?) You can try it out for yourself by clicking on the source link.

  • NTT DoCoMo exhibits on-the-fly speech translation, lets both parties just talk (video)

    by 
    Sharif Sakr
    05.30.2011

    The race to smash linguistic barriers with simultaneous speech-to-speech translation is still wide open, and Japanese mobile operator NTT DoCoMo has just joined Google Translate and DARPA on the track. Whereas Google Translate's Conversation Mode was a turn-based affair when it was demoed back in January, requiring each party to pause awkwardly between exchanges, NTT DoCoMo's approach seems a lot more natural. It isn't based on new technology as such, but brings together a range of existing cloud-based services that recognize your words, translate them and then synthesize new speech in the other language -- hopefully all before your cross-cultural buddy gets bored and hangs up. As you'll see in the video after the break, this speed comes with the sacrifice of accuracy and it will need a lot of work after it's trialled later in the year. But hey, combine NTT DoCoMo's system with a Telenoid robot or kiss transmission device and you can always underline your meaning physically.

  • IBM's Jeopardy-winning supercomputer headed to hospitals. Dr. Watson, we presume?

    by 
    Amar Toor
    05.24.2011

    We always knew that Watson's powers extended well beyond the realm of TV trivia, and now IBM has provided a little more insight into how its supercomputer could help doctors treat and diagnose their patients. Over the past few months, researchers have been stockpiling Watson's database with information from journals and encyclopedias, in an attempt to beef up the device's medical acumen. The idea is to eventually sync this database with a hospital's electronic health records, allowing doctors to remotely consult Watson via cloud computing and speech-recognition technology. The system still has its kinks to work out, but during a recent demonstration for the AP, IBM's brainchild accurately diagnosed a fictional patient with Lyme disease using only a list of symptoms. It may be another two years, however, before we see Watson in a white coat, as IBM has yet to set a price for its digitized doc. But if it's as sharp in the lab as it was on TV, we may end up remembering Watson for a lot more than pwning Ken Jennings. Head past the break for a video from the University of Maryland School of Medicine, which, along with Columbia University, has been directly involved in IBM's program.

  • Nuance voices found in OS X Lion, patent application suggests new iPhone speech / text capabilities

    by 
    Donald Melanson
    05.16.2011

    Apple's certainly no stranger to speech recognition, but it looks like it may have enlisted a bit of outside help for the next version of OS X, otherwise known as Lion. As Netputing reports, some of the text-to-speech voice options available in the developer preview of Lion just so happen to match the voices available from Nuance -- which would seem to suggest a partnership or licensing agreement of some sort, as the voices themselves cost $45 apiece directly from Nuance. In somewhat related news, Apple has also recently filed a patent application that would bring some fairly extensive new speech recognition options to the iPhone -- if it ever actually moves beyond a patent application, that is. In short, it would let you either instantly have a phone call converted to text, or send some text and have it converted to voice on the other end -- which the application notes could come in handy both in noisy environments and in situations where you simply aren't able to talk. It would even apparently incorporate a noise meter that could automatically trigger various options when the ambient noise hits a certain level. Hit up the source link below for a closer look at how it would work. [Thanks to everyone who sent this in]

  • iOS 5 speech recognition concept showcased in video

    by 
    Kelly Hodgkins
    05.16.2011

    Recent rumors and a patent application suggest an upcoming version of iOS will include some form of speech recognition. Inspired by these revelations, graphic designer Jan-Michael Cart created a short video that shows how Apple could add this speech-to-text functionality to iOS 5. His conceptualization takes speech recognition one step further than the patent, which focuses on calling only. Cart envisions a world where speech is incorporated into the core of iOS and used throughout the user interface. For example, a long-press of the home button would launch the speech recognition module and let you create text messages. An API could be made available to developers so that they could add speech recognition to their applications. It's an interesting concept that would make many users happy if Apple implements speech-to-text in this way. Read on for Jan-Michael Cart's concept video. [Via iPhoneDownloadBlog]

  • Nuance voices found in Lion Developer Preview 3

    by 
    Michael Grothaus
    05.14.2011

    Yesterday, we told you that Apple had seeded Lion Developer Preview 3 to developers. We noted at the time that the new features in Developer Preview 3 included a new boot animation, new graphical elements in the Finder's toolbar, new desktop wallpapers and a newly enabled Reading List in Safari. Other details of the latest Lion preview have emerged, but perhaps the most important is that Nuance voices, shown in the image above, have been discovered in the OS itself. Nuance is, of course, rumored to have entered into a major partnership with Apple to incorporate its speech recognition technology into iOS 5. But now it appears Apple is going to be pushing speech recognition as a feature across all of its operating systems. As discovered by NetPuting, a quick check of Lion's speech preferences finds that a number of voices from Nuance's RealSpeak Solo software are now integrated directly into Mac OS X Lion. Earlier today an Apple patent emerged describing a way Nuance-like speech recognition software could be used in iOS to help make it easier for iPhone users to communicate in loud or quiet environments.

  • Nuance-like Apple speech recognition patent emerges

    by 
    Michael Grothaus
    05.14.2011

    Rumors have been flying that Apple has entered into some kind of agreement with speech recognition company Nuance. Now Patently Apple has published an Apple patent that shows a possible use for Nuance's technology in the iPhone. The patent covers text-to-speech and speech-to-text conversion. In the patent, Apple lists two situations in which it might be hard for someone to answer their phone in the usual way: communicating in noisy environments and being unable to communicate during a meeting. In the first situation Apple says the user might try shouting to overcome the noise, but shouting frequently renders the voice signal unintelligible. Likewise in a quiet environment, such as a meeting where the user doesn't want to disrupt what's going on around him, he might try whispering into his phone, but again whispering frequently renders the voice signal unintelligible. Apple proposes to get around these limitations by running text-to-speech and speech-to-text conversion on the fly. Instead of shouting or whispering into the phone in a noisy or quiet environment, respectively, the user could type a text message while live on the call and it would be read aloud to the person on the other end of the line.

  • Is a Nuance and Apple deal in the works?

    by 
    Michael Grothaus
    05.07.2011

    TechCrunch is reporting that Apple is in the process of some sort of deal with Nuance Communications, one of the leading companies in the field of speech recognition. Many readers may be familiar with Nuance's Dragon NaturallySpeaking software; however, the Dragon speech engine is also licensed and used in a number of apps for Windows, OS X, iOS, and Android. What could the deal be? The most obvious choice is an acquisition, but as TC points out, it would cost Apple at least US$6 billion to buy the company. Apple's got the cash, but even for them that would be quite a purchase. TechCrunch thinks it's most likely the two companies are entering into some sort of partnership "that will be vital to both companies and could shape the future of iOS." Speech recognition has been rumored to be a big part of the future of iOS. Last year, Apple bought another speech recognition company, Siri, which itself is powered by Nuance technology. Perhaps with the release of iOS 5 we'll be talking to our phones more than using them to talk to people.

  • Chrome 11 goes beta with speech-to-text capabilities

    by 
    Donald Melanson
    03.23.2011

    Well, it looks like Google is unsurprisingly adding more than just a new logo to the latest version of its Chrome browser -- the just-released beta of Chrome 11 also now boasts speech-to-text capabilities. That comes in the form of support for the HTML5 speech input API, which web developers will be able to take advantage of to let folks simply talk to websites and have their speech magically transcribed to text. Also making a first appearance in the beta is support for GPU-accelerated 3D CSS, which will let developers apply all sorts of 3D effects to websites -- Blingee will never be the same, surely. Hit up the link below to try it out for yourself.

  • PALRO robot masters English, will never shut up again (video)

    by 
    Tim Stevens
    01.21.2011

    When first we saw Fujisoft's PALRO robot doing its thing we were charmed but, as it didn't speak English, we had to adore it from afar. No longer. The little critter has obviously mastered our language quite quickly and can be seen below chatting with an even more robotic humanoid about such idle things as the weather, career aspirations, and just how great PALRO is. How great is PALRO? PALRO is really great -- but humble. Inside that barrel chest is a full-fledged PC with an Atom Z530 processor, 4GB of flash storage, and an Ubuntu kernel keeping everything in check. It's available as ever for educational and research institutions for about $3,600, but we're trying to get one ourselves. If we can get it to type, prepare yourselves for many more posts about software based on real Japanese cutting-edge technology.