SpeechRecognition

Latest

  • MACH system from MIT can coach those with social anxiety

    by 
    Terrence O'Brien
    06.15.2013

    Plenty of people out there have a serious phobia of public speaking and there are tons of other disorders, such as Asperger's, that severely limit a person's ability to handle even simple social interactions. M. Ehsan Hoque, a student at the MIT Media Lab, has made these subjects the focus of his latest project: MACH (My Automated Conversation coacH). At the heart of MACH is a complex system of facial and speech recognition algorithms that can detect subtle nuances in intonation while tracking smiles, head nods and eye movement. The latter is especially important since the front end of MACH is a computer-generated avatar that can tell when you break eye contact and shift your attention elsewhere. The software then provides feedback about your performance, helping to prep you for that big presentation or just guide you out of your shell. Experimental data suggests that coaching from MACH could even help you perform better in a job interview. What's particularly exciting is that the program requires no special hardware; it's designed to be used with a standard webcam and microphone on a laptop. So it might not be too long before we start seeing apps designed to help users through social awkwardness. Before you go, make sure to check out the video after the break.

  • Nuance Dragon Notes brings quick, spoken memos and messages to Windows 8

    by 
    Jon Fingas
    05.14.2013

    Sometimes, the smallest and simplest apps make the most sense. Take Nuance's new Dragon Notes for Windows 8, for example. Unlike its NaturallySpeaking cousin, it's not a universal tool: instead, it's narrowly focused on the voice dictation of memos, email, social networking updates and web searches. That limited scope leads to a very simple interface, however, and slims down the price from $100 to a far more accessible $20. Fans of minimalism can grab Dragon Notes directly from Nuance on May 15th, although they'll need to spend $10 for every language they speak beyond English.

  • Microsoft demos improvements to Bing voice recognition for Windows Phone

    by 
    Joseph Volpe
    03.21.2013

    "Nothing says fun like a speech demo." Those are the words of Microsoft's CTSO Eric Rudder, not ours -- although we do have to agree. As you'll see for yourself in the video after the break, Microsoft held a private event for its employees a few weeks ago showcasing some of the advancements it's achieved with Bing's voice search for Windows Phone. Thanks to the work of MS' Research arm and the folks at Advanced Technology Group, voice recognition accuracy on a prototype build has now been improved by up to 15 percent on the back end and should even see a further 10 to 15 percent performance boost. In addition to this decreased error rate, the team's also greatly enhanced the speed at which the app delivers relevant results. So when can you expect this new and improved Bing app for WP? That part's unclear, but it appears Microsoft's already implementing changes on the back end to bolster current use.

  • Dragon Mobile Assistant 3.0 can share locations, call meeting numbers for you

    by 
    Jon Fingas
    03.12.2013

    Nuance has long wanted Dragon Mobile Assistant to do as much of the heavy lifting as possible for common Android phone tasks. The newly available 3.0 beta is shouldering even more of the load, including responsibilities that can still involve separate apps with rivals. It's now possible to share map coordinates, or ask for someone else's location, through simple requests. The refresh will also skip the drudgery needed to dial a conference call or an important friend: set a calendar event with phone numbers and passcodes attached and Dragon can punch in the numbers itself, right on cue. As a final touch, the upgrade brings truly hands-free text messaging that includes both spoken incoming messages and voice-dictated replies. The beta remains free and will work with Android 2.3 or above; if Google Now and S Voice aren't pulling enough weight, there might be some relief through the source link.

  • QNX builds in-car speech framework with AT&T's Watson, knows our true intentions

    by 
    Jon Fingas
    01.07.2013

    QNX wants to put an end to in-car voice systems that require an awkward-sounding syntax to get the job done. As part of its CES launches, it's rolling out a framework for its speech recognition technology leaning on AT&T's Watson engine. By offloading the phrase interpretation to AT&T's servers, any infotainment system with the framework inside can focus on deciphering the speaker's intent -- letting drivers spend more time navigating or playing music, instead of remembering the necessary magic words. QNX will roll out the voice element as part of its CAR platform at an unspecified point in 2013. We'll have to wait until car and head-end unit designers implement the platform in tangible hardware, but the new speech system will hopefully lead to more organic-sounding conversations with our cars. Follow all the latest CES 2013 news at our event hub.

  • Honda's HEARBO robot can separate and locate four sound sources at once (video)

    by 
    Jamie Rigg
    11.20.2012

    Robots are already adept at all manner of things, from hunting to feeling, but over at Honda's Research Institute, one team is focused on an ability bots aren't so hot at yet -- hearing. Puny humans can quickly deduce the direction of a sound and assess its significance, while also ignoring unimportant background noise. Honda is trying to replicate these traits with HEARBO, a robot with eight microphones hidden in its head. Using its HARK software system, HEARBO can distinguish between and locate the position of up to four unique sound sources simultaneously to within one degree of accuracy. It can also filter out din generated by its own 17 motors with a method called "ego-noise suppression." HEARBO's sound localization skills are shown in the first video below, while the second proves it can beat match, dance poorly, and isolate voice commands when music is playing and motors are whirring. The overall goal of Honda's efforts is to generally advance intelligent speech and sound recognition technology. We can't help but wonder, however, if bots will just end up using it to pinpoint our screams when the inevitable occurs.

  • Telenav's Scout gives iOS users offline navigation in exchange for ten bucks

    by 
    Michael Gorman
    08.14.2012

    We know that iOS 6 will bless iPhone users with some in-house-made mapping, but that hasn't stopped Telenav from bettering its own Scout navigation offering for Apple's favorite handsets. Scout now does offline navigation by letting users download maps of the west, central or eastern United States over WiFi only. Plus, Scout now takes voice commands, so on your next road trip you can tell it to find the nearest Whataburger whether you have cell signal or not. Interested parties can head on over to the App Store to get their download on, but you'll pay for the privilege -- offline navigation costs $9.99 a year or $2.99 a month, though the free, data-dependent version of Scout for iPhone still includes speech recognition. Still not sold? Perhaps the video after the break will persuade you.

  • Mountain Lion 101: Dictation

    by 
    Steve Sande
    07.25.2012

    What can I say about my love of Mountain Lion's new Dictation feature? I've wanted to be able to talk and have my words transcribed to text ever since I saw the original "Assignment: Earth" episode of Star Trek back in 1968 (image at top of post). That's actress Teri Garr talking to a typewriter, and it's transcribing her words. Now it's finally happening, and I think that's pretty cool. I know that a lot of people are unimpressed by the dictation capabilities of Mountain Lion, the iPhone 4S, and the third-generation iPad, but I'm one of those people who is both blessed with a voice that seems to be made for Siri (the brains behind Dictation) and who has practiced dictating to my Mac and iOS devices. Unlike Rich Gaywood, who stated in his big Mountain Lion review that Dictation was having trouble cutting through his Welsh accent, I seem to be having very few problems. As you'd expect, I am dictating this post on my Mountain Lion-equipped MacBook Air. By default, Dictation is turned on in Mountain Lion. To shut it off permanently or change other settings, use the new Dictation & Speech pref in System Preferences. With the pref it's possible to select the microphone used by Dictation, set the key(s) to press to activate Dictation (by default, you press the fn key on your keyboard twice), or learn more about Dictation and privacy. That last feature comes courtesy of a button on the bottom of the preference pane. Click it, and you're basically told that anything you dictate is recorded and sent to Apple to convert into text. That's right; it won't work without a live Internet connection. The Apple privacy statement also says that your computer will send Apple "other information, such as your first name and nickname; and the names, nicknames, and relationship with you (for example, "my dad") of your address book contacts." Enough about the preferences panel. Let's talk about how accurate dictation really is.
If I stop and think about what I'm trying to say to my Mac, and then speak clearly and a little bit slowly, then the accuracy rate is almost 100 percent. On the other hand, if I just start talking and stumble over what I'm saying, my accuracy suffers. Don't expect to be able to talk to your Mac for an hour and have a perfectly-typed term paper ready to submit at the end. Dictation works in 30-second chunks; any more than that and it will chime to let you know that it's done. I've found that the response time for Dictation is very fast compared to that on the iPhone 4S and third-generation iPad. In our book, "Talking to Siri", Erica Sadun and I discuss ways of improving accuracy of Siri dictation. We also talk about how to add caps and punctuation to your dictation, but you'll find that some of those commands don't work quite the same in Mountain Lion. For example, it was previously possible to say "My cat is named cap emerald" to have Siri type out "My cat is named Emerald." You no longer need to say "cap" to get Dictation to capitalize the proper name. However, none of the capitalization commands work any more. Likewise, spacing commands -- "space" and "no space" -- that used to add or eliminate spaces between words no longer work. All punctuation commands seem to be enabled from the testing I've been able to do. Dictation is one of those Mountain Lion features that you're either going to love or hate -- I'm not sure there's much of an in-between. Personally, I find it to be extremely useful, especially in combination with Messages. There's nothing more satisfying than tapping the function key twice, dictating a quick response to my wife, and then getting back to work. I'd suggest to anyone who is upgrading to Mountain Lion to at least give Dictation a try. You might find out that it works better than you think.

  • iSpeech intros voice recognition platform for connected homes, enables vocal control of TVs and appliances

    by 
    Edgar Alvarez
    07.19.2012

    We've been seeing the growing trend of peculiar services like Cupertino's Siri, Samsung's S Voice and Google Now on mobile devices, but up until now, we have yet to spot something similar in the world of connected homes. Having previous experience in the text-to-speech department, iSpeech is hoping to be able to do just that with the world debut of its voice recognition platform for smart households. With iSpeech Home, the company's aiming to give OEMs and manufacturers a canvas where they can implement voice recognition software into TVs, home entertainment systems, lighting, refrigerators and even washers and dryers -- which would, according to iSpeech, open the doors to natural language commands such as "Watch ESPN" or "Turn off the lights in the living room." As exciting as it all sounds, the company's COO Yaron Oren did tell us there aren't any official partners on board at the moment, but that he does expect to have iSpeech Home-powered products within the next 6-12 months.

  • AT&T officially releases Watson speech API, gives devs a bit of babel fish for their apps

    by 
    Michael Gorman
    07.10.2012

    Ma Bell's been hard at work on its Watson speech recognition system for years, and 2012 has seen the tech show up in an automobile and a real-time translator app. Months after announcing it would grant Watson's skills to the developer masses, AT&T has made good on its promise and officially released its Speech API. In case you forgot, AT&T's Nuance competitor's been tailored for different use cases -- including voice web search, voicemail-to-text and talk-to-text -- so that it can offer contextually accurate results in any app. If you're among the coders itchin' to test out Watson's capabilities, head on past the break for a promotional video, then click the source below to sign up for access.

  • HTC teases voice control and/or dog translator for Sense

    by 
    Terrence O'Brien
    06.22.2012

    HTC might be overselling it a bit with the top secret stamp, and the footnote sort of indicates that your next One device won't be interpreting Fido's barks. So, really, that only leaves one logical conclusion -- HTC is working on a voice control app. It shouldn't come as any surprise if you've been paying any attention to the mobile landscape these past few years. Google kicked off the party with Voice Actions and Apple gave the speech recognition tech some personality with Siri. Now Samsung has S-Voice and LG has Quick Voice... what's a Taiwanese manufacturer to do? Presumably make your own speech-driven virtual assistant. When will it debut, what will it be called? Who knows, but judging from the image above it seems safe to assume that HTC's new tool will be delivered as a software update to at least some existing handsets. [Thanks, Naman] Update: HTC tells us that it never intended to hint at a new voice service -- the image was just the punchline to a week of pet-related smartphone tips it featured on Facebook.

  • Voice Answer updated with more features for people locked out of Siri

    by 
    Mel Martin
    06.12.2012

    Owners of the Apple iPad 1 and 2 won't get Siri on iOS 6 when it ships. Neither will those who use an iPhone older than the 4S. That leaves people looking for alternatives, and fortunately there are a few. I took a look at Evi some months back, and now newcomer Voice Answer, which is US$3.99 in the App Store. It runs on any iOS device with iOS 4.3 or greater installed. The new version, just released, adds voice messaging, email dictation and calling from the iPhone contact list. The app can translate into 54 languages, update Facebook and publish Tweets. It also can set reminders, but not through the iOS Reminders app. Instead, it goes on your calendar and adds an alert, which is fine. Despite the improvements, Voice Answer just doesn't seem as sharp as Siri. In my last review I asked the app for driving directions to Phoenix, and got a "Your question is not clear to me" response. I tried with the new version and got, "I do not understand, sorry." Voice Answer did find the nearest golf courses, something the older version could not do. While the app says it can make calls from your contacts, that didn't work very well in my tests. When I asked the app to call my friend John (I gave the first and last name) it gave me a list of every John in my contact list, none of which had a similar sounding last name. Another 2 points for Siri. Sending email had some of the same problems as calling someone. I would give Voice Answer a unique name, and it asked me to rephrase the query. This new version has an articulated animated robot on screen, which is where you direct your questions. I found the experience a bit creepy and the look of the robot was kind of grotesque. Thankfully, you can opt for the older, Siri-like GUI and turn the robot animation off. Despite the less than perfect recognition, I think Voice Answer is a credible substitute for people who can't get Siri. Sure, it has some rough edges, but so does Siri, which remains in beta. 
If you need a quick check on the weather, or a web search, or information on a variety of subjects, Voice Answer does very well. It seems to trip up on the newer features, like calling and messaging, but I'm hoping those things will improve over time.

  • Grocery iQ for iOS adds speech recognition

    by 
    Mel Martin
    05.09.2012

    There's no shortage of grocery list apps for the iPhone and iPad. One of the more popular apps is the free Grocery iQ, which has added speech recognition from Nuance in an update available today. That addition hoists Grocery iQ a bit above the average list-maker. You can add items to your shopping list by typing, scanning barcodes, and now by simply talking to your iPhone. The app lets you access coupons (the app is provided by Coupons.com) and also find nearby grocery stores. The app database accesses millions of items, so you're unlikely to be stuck with an item that is unknown. When you first start up the app your list is already populated, and I saw some ads from Hormel with accompanying coupons. I didn't find it a distraction, but I would have preferred to start with an empty list, rather than one partly filled out for me. The speech recognition was excellent, and I tested it with some obscure locally sold brands and all were identified quickly, which was an impressive feat. It has been suggested, but not confirmed, that the Nuance speech recognition engine is the same that powers Siri on the iPhone 4S. While this new version has added speech and multi-barcode scanning, it has also taken away certain features (favorite lists for a particular store and aisle layout, for example) and rankled some users. Still, I found the app very useful, and the addition of highly accurate speech recognition is a real time saver. If you're not married to your current grocery list app, I'd take a look at Grocery iQ. It's a universal app, and requires iOS 4.2.1 or greater.

  • Voice Answer is an interesting solution for those that don't have Siri

    by 
    Mel Martin
    04.19.2012

    Siri is very clever, and has had a significant impact on how iPhone users interact with the world of information. The problem is, Siri is iPhone 4S only, and even the new iPad has only voice dictation. We've looked at Evi, which is a third-party solution, but it's had some reliability issues (so has Siri), and I think iOS users are looking for a bit more. Well, I can suggest another solution called Voice Answer, which is in the app store as of today. It's not perfect, but it appears to give more thorough answers and more graphical answers to queries. The app is universal and sells for US$3.99. It supports direct questions about hundreds of topics on history, the current weather, nutrition, physics, solving math problems, stock questions, etc. There is a 'chat box' function which allows you to engage in conversation, although that is a bit hit and miss. The developers say they are finalizing an update that will include messaging and email, along with some unannounced features. I find the answers generally more complete than Evi, and it seems to answer more questions directly than Siri can. Often, Siri throws you to the web, while Voice Answer gets you the information without that step. Like Siri and Evi, one of the biggest information providers is Wolfram Alpha. Like all apps, it is not perfect. I asked where the nearest golf course is and the answer was "I don't know." Asking about directions to Phoenix got a "Your question is not clear to me." The directions to Phoenix didn't work very well in Evi either. It told me to try Google route planner. Thanks. Siri handled that query just fine. Still, Voice Answer works well on a lot of questions you want a quick answer to. I like the graphical displays, and I saw no issues with server side reliability. Of course that could change if Voice Answer gets more popular. If you're without Siri I'd give this app a serious look. 
Even if you have Siri, which I do, I found the app useful, especially on my iPad, and it will likely get better. Since no iPad has Siri, iPad owners will want to check this app out too. Some screen shots are in the gallery below.

  • QNX's Watson-connected Porsche 911, hands-on (video)

    by 
    Terrence O'Brien
    04.19.2012

    Remember that QNX-loaded Porsche 911 we sat down with (in?) at CES? Well, it's back and it learned a few new tricks en route to New York City. The car-friendly software company got its hooks into AT&T's Watson Speech API and used it to power a new voice-command system for its own take on the "virtual assistant." Using the new speech recognition tool and Ma Bell's LTE network, QNX was able to pull up websites, find a Starbucks (though, in New York City you'd have to be blind to not find one) and place calls. All in all, the demo wasn't too different from what we saw in Vegas in January -- in fact, we wouldn't be surprised to find out that Porsche was also utilizing Watson, long before it was announced. For a familiar, but still interesting demo, check out the video after the break.

  • AT&T Translator app hands-on: smashing the language barrier (video)

    by 
    Terrence O'Brien
    04.19.2012

    Translation apps aren't exactly the newest or sexiest thing in the world of technology, but we've got to hand it to AT&T for whipping up a rather impressive demo. The company showed off a next-gen version of its AT&T Translator app, which may one day allow people to communicate in real time regardless of their spoken language. The app uses the carrier's new Watson Speech API, in this case via a VoIP call on a pair of iPads, to not only transcribe dialog, but translate it from English to Spanish (and vice-versa), then play it back in the target tongue using a computer-generated voice. This isn't like the Google Translate app on your phone -- the translation happens in near real time, with only a slight latency as your words are fed through the system. The demo wasn't without its hitches (the room was noisy and filled with bloggers toting wireless devices), but it went more or less as planned, and our gracious hosts were able to complete a call requesting a taxi cab. One day AT&T hopes to make this a standard feature of its services, eliminating the language barrier once and for all. To see the app in action check out the video after the break.

  • Microsoft strikes deal with 24/7, promises to 'redefine' customer service

    by 
    Donald Melanson
    02.07.2012

    A partnership between Microsoft and customer service company 24/7 may not exactly sound like the most exciting proposition on the face of things, but the two are making some fairly lofty promises, and Microsoft seems to be making a serious investment in the initiative. As ZDNet's Mary Jo Foley reports, part of the deal will see Microsoft send at least some of the 400 employees it brought on in its 2007 acquisition of TellMe Networks to 24/7, and it will also license some of its speech-related IP to the company (in addition to taking an equity stake in it). The goal there being to combine natural user interfaces with a cloud-based customer service platform, which Microsoft promises will "redefine what customer service looks like." To that end, it gives the example of a credit card company getting in touch with you to report suspicious behavior; rather than a phone call, you could get a notification with all the pertinent details sent directly to your phone, which could anticipate a number of potential actions and let you respond by voice (or touch, presumably). Unfortunately, while the two are talking plenty about the future of customer service, there's not a lot of word as to when that might arrive.

  • Siri clone Evi is off to a very bad start

    by 
    Mel Martin
    01.24.2012

    Siri has been a big hit for Apple, but as we all know, it runs only on an iPhone 4S. I've been expecting some Siri knock-offs to appear, and now one has that can be used on any iPhone and even the iPad if you don't mind not seeing it full screen. The app is called Evi by True Knowledge. It's US$0.99 and runs on any iDevice with iOS 4.0 or greater. "Run" is a bit of a misnomer. Evi's speech recognition is powered by Nuance, just like Siri, and the recognition part is first rate. But that's where the good news ends. Evi has not successfully responded to a single spoken query I've made since yesterday afternoon. Generally the app sits there for a while, then reports that it is "Thinking about it," followed by "Let me see" and then, inevitably, "I'm having trouble getting a response from my servers. You might want to try again in a minute." Actually, I don't ever want to try again. Ever. Reviews at the app store are ugly, with the majority being negative and some outright hostile. You would think an app maker would have some degree of preparation for what is sure to be a popular offering. I can understand some failures, even Siri fails on a semi-regular basis, but Siri was labelled beta when it came out. Evi is supposed to be ready to go. This is an app that Apple should quickly pull, not because it competes with Siri (hardly), but because it is simply a complete and utter failure. In frustration I asked Evi if I can get my $0.99 back. Evi replied, "Bear with me" followed by "hang on," "I'm on it" and finally the server failure warning. I guess that would be a "no." Remember, you can't spell 'evil' without Evi. Check the gallery for some screen grabs of Evi not answering any of my questions.

  • Nuance launches Dragon Go! for Android, available today for free

    by 
    Brad Molen
    01.10.2012

    As if its acquisition of Swype wasn't enough indication, Nuance has been working on its goal of dominating the Android speech recognition market, one step at a time. Today the company's pressing forward once again by introducing its Dragon Go! app for Google's mobile OS. The app focuses on verbal commands, giving you the ability to ask it to perform internet searches, make dinner reservations, buy movie tickets, play music on services like Pandora and Spotify and the list goes on. If you crave the specific details, make your way beneath the break and have a gander at the press release below.

  • Nuance's Dragon TV offers voice recognition platform for connected televisions

    by 
    Brad Molen
    01.09.2012

    Nuance isn't skipping a beat in Las Vegas, as the speech recognition company is busy launching a brand new platform that focuses on bringing its technology to connected TVs. According to the company, the platform, called Dragon TV, can be used to build customized voice and touch apps that run on televisions, set-top boxes, phones and tablets. Essentially, the technology will allow the viewer to use their voice to conduct searches, send messages and access plenty of other features, and mobile devices can be used to act as a remote to control the TV. Nuance's new platform is available now for OEMs, developers and operators to take advantage of, and supports Linux, Android and iOS as well as all major TV, set-top box and remote control standards. Head past the break for the full press release, and make your way to the company's site below to get more details.