voice recognition
Latest
Dragon Dictate 2.5 offers support for Microsoft Word 2011
Nuance has announced Dragon Dictate 2.5, a free upgrade to the company's Mac voice control/input app for version 2.0 customers. The new version dramatically improves mixing voice input with mouse and keyboard entry in Microsoft Word 2011, among other features. According to Nuance, Word 2011 is the most commonly used app for Dragon Dictate customers, so it makes sense that the company would put emphasis on adding more dictation functionality for the word processing market leader. Earlier versions of Dictate would get confused about where the insertion point or document elements were located when users switched between voice and mouse input (except in the company's own Notepad app or in TextEdit, where Dragon supported more complex behaviors). The recommendation against mixing dictation and keyboard/mouse editing has been so ingrained in the product's DNA that Dragon refers to it informally as the Golden Rule. Meanwhile, users of the corresponding Dragon NaturallySpeaking app on the Windows platform had far fewer restrictions. With 2.5 and Microsoft Word 2011, the Golden Rule is history; users can easily switch between voice and keyboard input at will, or between dictation and command mode within Dictate itself, all without disrupting Dictate's internal model of the document. This lends itself to a far more natural and workflow-friendly way of using Dragon; instead of having to stop and start between dictation and editing phases, just keep on going. Dragon SVP/general manager Peter Mahoney told TUAW that there's nothing specific to announce about enhanced support for Apple's Pages or other popular Mac productivity apps, but the company is looking at other integrations. "This is the first time that we've done this [on the Mac] for a meaningful application, and there was a lot of new invention in the way we created these integration models," he said. "Some of the approach we used in Word 2011 will benefit the Windows product, too... It's certainly something that we plan to expand to other applications over time."
Version 2.5 adds the ability to dictate without distraction from the mouse and keyboard, and also adds a microphone option in the form of an iOS app -- Dragon Remote Microphone (Free) -- for those situations where you'd rather not be tied to a traditional headset, but where you do share a Wi-Fi network between your computer and your phone. There are new capabilities for controlling how Dragon Dictate formats text, and new voice commands even allow posting to Facebook and Twitter. Even doing searches on Google, Bing, Yahoo! or with Spotlight on the Mac can be accomplished with a voice command. The microphone can now be set to automatically "sleep" after a preset amount of time so that it won't recognize speech until you specifically wake it. For new users, a digital download of Dragon Dictate 2.5 for Mac is available for $179.99 through the Nuance website; owners of the Windows NaturallySpeaking product can cross-grade for $99. Check out the slideshow below for a demonstration of some of the Word commands that are available in the upgrade.
Steve Sande, 07.25.2011
Pioneer solicits Whodoo guinea pigs for speech-based Android assistant (video)
Ever wish you could have a personal attendant living inside your Android smartphone? You know... one you can boss around without incurring human rights or labor law violations? Apparently Pioneer shares your vision, because its voice-controlled social assistant named Whodoo is seemingly ready to "hop to" at a moment's notice -- willing to locate a restaurant and send it to friends, route the appropriate directions, and announce your intentions to Facebook or Twitter -- all based on your verbal commands (and ostensibly perfect for in-dash navigation). The company is seeking bossy applicants for its closed beta experiment, which involves completing a lengthy application, providing considerable feedback, and submitting audio samples that are gathered by Whodoo. Think you've got the chops? Just follow the source, where you're free to convince Pioneer of the same.
Zachary Lutz, 07.13.2011
Windows Phone 7.5 Mango in-depth preview (video)
Make no mistake, Microsoft isn't playing coy in the smartphone market any longer. The folks in Redmond are making a significant jump forward in the mobile arena, announcing that the upcoming version of Windows Phone, codenamed "Mango," will be heading to a device near you in time for the holidays. As its competitors have raised the bar of expectations to a much higher level, Microsoft followed suit by adding at least 500 features to its mobile investment, which the company hopes will plug all of the gaping holes the first two versions left open. We received a Samsung Focus preloaded with the most recent developer build (read: not even close to the market release version) and we had a few good days to put it through its paces. It's still far from completion, as there were several key features that we couldn't test out; some weren't fully implemented, and others involved third-party apps that won't be updated until closer to launch. Yet we don't want to call this build half-baked -- in fact, it was surprisingly smooth for software that still has at least four months to go before it's available for public consumption. At the risk of sounding ridiculously obvious, we're mighty interested in seeing the final result when all is said and done this holiday season. As a disclaimer, we can't guarantee that the stuff we cover here will actually look or act the same when it's ready to peek out and make its official introduction in Q4; as often happens, features and UI enhancements are subject to change by the Windows Phone team as Mango gets closer and closer to release. Let's get straight to brass tacks, since there are a lot of details to dive into. It'd be best to grab a large beverage (we'd recommend a Big Gulp, at least), find your most comfortable chair, and meet us after the break.
Brad Molen, 06.27.2011
Nuance buys SVOX ahead of iOS 5 release
There's a whole trail of rumors hinting at an upcoming deal between speech recognition company Nuance and Apple. For quite a while now (ever since Apple picked up personal assistant software maker Siri), the scuttlebutt has claimed that the folks in Cupertino would make a deal with Nuance for some kind of speech recognition, most likely an iOS-level integration that would allow you to ask your iOS device for whatever you want, and get it quickly and easily. But even if that deal is on, that hasn't slowed Nuance down. The company has acquired another speech recognition firm, SVOX, the creators of high-end speech recognition and text-to-speech services. That's a natural fit for Nuance, of course, and the release says that the new deal "will advance the proliferation of voice in the automotive market, and accelerate the development of new voice capabilities that enable natural, conversational interactions between consumers and their connected cars, mobile phones, and other consumer devices." Sounds exciting to us. We didn't actually get to see either Siri or an updated voice control service show up during the iOS 5 announcement at WWDC, but that doesn't mean it's completely out of the cards yet. Maybe a deal like this is just what Nuance needs to set up the partnership that Apple's reportedly been seeking for a while.
Mike Schramm, 06.16.2011
Weekend rumor roundup: Apple Retail event, new MacBook Airs, unlocked iPhones, more
Several rumors with varying degrees of credibility came up over the weekend. According to AppleInsider, Twitter user @chronicwire (reportedly a source of past Apple leaks) reports that Apple's retail stores are setting up to launch Apple's annual Back to School promotion on Wednesday. The same source initially reported that the Back to School promo will coincide with the launch of new MacBook Airs, but he has since retracted that claim. Instead, Chronic claims the part numbers he initially thought represented new MacBook Airs indicate that Apple will start selling versions of the GSM iPhone 4 that are not carrier-locked to AT&T. Although the MacBook Air is widely expected to have a refresh soon, this is the first we've heard of unlocked iPhones being offered for sale in the U.S., and it's something we'll file under "We'll believe it when we see it." The iPhone is already sold free and clear of carrier locks in several markets, but GSM model iPhones sold in the U.S. remain carrier-locked to AT&T unless you jailbreak. Chronic has also released screenshots that supposedly come from an "internal build" of iOS 5. These screenshots show that Nuance voice recognition, expected to be integrated in iOS 5 but not discussed at WWDC, is still in development. Other sources have claimed these voice recognition features weren't ready to be shown off at WWDC but should be good to go by the time iOS 5 launches this fall. Finally, a reader has informed us that New Zealand's online Apple Store is now showing shipping times of 5-7 business days for the 1 TB Time Capsule and 1-2 weeks for the 2 TB model. These extended shipping times are also showing up in Apple's Australian and UK stores, and the Canadian Apple Store is showing a 1-2 week delay for the 1 TB Time Capsule. The U.S. store and most international stores are not showing the same delay, but the delays that have appeared are further indicative of the Time Capsule supply constraints we reported last week, which may mean a product refresh is imminent.
We'll be keeping a very close eye on Apple's online store on the Tuesday overnight shift, and we'll let you know if anything new comes up.
Chris Rawson, 06.12.2011
Russian ATM uses voice analysis to tell when you're lying
Credit card applications via automated teller are all the rage abroad these days. That's why Russia's Sberbank is using Speech Technology Center's voice recognition system in its new ATM to tell when you fudge your financials to get approved. Like a polygraph, the technology senses involuntary stress cues to ferret out fib-filled statements -- only instead of using wired sensors, it listens to your angst-ridden voice. Designed using samples from Russian police interrogation recordings where subjects were found to be lying, the system is able to detect the changes in speech patterns when a person isn't telling the truth. Of course, it's not completely accurate, so the biometric voice data is combined with credit history and other info before the ATM can crush an applicant's credit dreams. And to assuage the public's privacy concerns, patrons' voice prints will be kept on chips in their credit cards instead of a bank database. So, we don't have to worry about hackers stealing our biometric info, but we're slightly concerned that we'll no longer be able to deceive our robot overlords should the need arise.
Michael Gorman, 06.11.2011
Kinect integration in Ghost Recon: Future Soldier, hands-off (video)
Microsoft's E3 keynote may have exploded with deeper Kinect support, but nothing caught our eyes quite as sharply as Ghost Recon: Future Soldier's rifle-exploding Gunsmith demo. A Ubisoft representative showed us how it's done: separating your arms separates your deadly firearm into a gorgeous display of floating screws, components, and accessories, which can be effortlessly modified, swapped, and replaced with gesture and voice commands. Too picky to decide for yourself? Then don't: just tell Gunsmith what you're looking for. For instance, saying "Optimize for range" produces a weapon any sniper should be proud of -- even better, we found that commanding Gunsmith to "optimize for awesome" birthed a rifle (pictured above) sporting an underbarrel shotgun attachment. A gun attached to a gun? Yeah, that works. Weapons can be tested in Gunsmith's gesture-controlled firing range, an engaging shooting mode exclusive to the Gunsmith weapon editor and not usable in regular gameplay. Head past the break for a hands-on (figuratively speaking) video.
Sean Buckley, 06.08.2011
Mass Effect 3 gets Kinect support with voice recognition
Mass Effect 3 will support Kinect voice recognition, BioWare announced this morning during Microsoft's E3 2011 press conference. CEO Ray Muzyka took to the stage to reveal the added functionality -- rumored just this past week -- describing it as voice commands used to various effect. Choosing whether characters live or die (by voice) and ordering your squad around the battlefield were both shown off, though it's possible more features will be added before the game's planned early 2012 launch. We're not exactly sure why Kinect is required for this functionality where an Xbox 360 headset would work (a la Ruse), but it sure is neat being able to command someone's death with nothing more than the sound of our voice.
Ben Gilbert, 06.06.2011
Chaufr lets you shout searches, yell URLs at Chrome
Generally, shouting commands at the internet isn't going to get you very far, but if you're just yelling a few destinations and search terms, Chrome extension Chaufr can take you where you need to go. A previous add-on, Speechify, let you speak to fill input fields, but couldn't help you actually navigate the web. Chaufr, on the other hand, lets you simply say the magic word -- "Engadget" -- and it drops you right at our online doorstep. You can also use it to perform searches by saying Wikipedia, Google, Amazon, YouTube, or Yahoo followed by whatever it is you're looking for. It worked well enough in our brief hands-on, but we do have one nit to pick -- activating voice input requires that you click an icon in the toolbar, then click a microphone in the drop-down menu. (Can't a brother get a keyboard shortcut?) You can try it out for yourself by clicking on the source link.
Terrence O'Brien, 05.31.2011
Garmin nuLink! 2390 torn apart by FCC, put back together again on US site
Last week Garmin announced the latest member of its high-end GPS navigator family, the nuLink! 2390. Sadly, it was a Europe-only affair, leaving American consumers wondering why the company was giving us the cold shoulder. (Whatever it was baby, we're sorry, come back.) Then we spotted an unnamed 4.3-inch Garmin making its way through the FCC that matches up quite nicely, size- and feature-wise, with the 2390. The newest nuLink-enabled device is even showing its face over at the company's US website (you really do love us!), though it's not available to order and you'll have to do some serious digging to unearth it. Whenever it does hit American shores, you'll be able to pull in 3D traffic data and search Google thanks to its GSM radio, and tether your phone to it using Bluetooth for hands-free calls. It also has voice recognition software so you can furiously bark commands at it when you miss a turn, and a tracking feature for keeping tabs on unruly teens. If you're into seeing gadgets splayed open like an organ transplant patient, check out the gallery below.
Terrence O'Brien, 05.17.2011
TomTom sends HD Traffic update to all Live models, extends Traffic Manifesto to US (video)
TomTom CEO Harold Goddijn announced at a NYC event last night that the company's HD Traffic service, previously only included with the Go 2535 M Live, would be available on all U.S. Live models, including the Go 740 Live and XL 340 Live. Traffic updates will be one component of the subscription-based Live, which will also see a 50 percent price drop, to $60 per year. This is all part of TomTom's grand Traffic Manifesto, which aims to cut traffic by five percent overall. Achieving this rather lofty goal in the U.S. would require 10 percent of the country's drivers to be using Live, which transmits real-time traffic data using a dedicated AT&T SIM. The company says drivers using the service themselves can expect to see travel times reduced by up to 15 percent. Our commute often involves a pajama-clad hike from the bed to the desk, so if you're currently a subscriber who drives to work, let us know if Traffic is making a dent in your travels.
Zach Honig, 05.12.2011
Windows Phone 7 updates Bing to find music and barcodes, provide turn-by-turn directions and send speech-to-text SMS?
Developers are getting plenty of toys alongside Windows Phone 7's "Mango" release, but there may be extra baubles for regular users, too -- Microsoft will reportedly add a few features to Bing in the near future which could prove particularly useful. According to the latest episode of the Windows Phone Dev Podcast -- which hosted Microsoft's Brandon Watson as a guest -- a new function called Bing Audio will act like a Shazam for recognizing music (and will sell you Zune tracks) while Bing Vision will use your smartphone's camera to read barcodes and do optical character recognition, plus potentially provide support for augmented reality apps. There are also allegedly turn-by-turn voice directions for Bing Maps and a native podcast player, and one more potentially exciting thing -- voice-to-text for sending SMS messages without lifting a finger. Hear all about the rumor at our source link, at just about the 40-minute mark. [Thanks to everyone who sent this in]
Sean Hollister, 05.08.2011
Is a Nuance and Apple deal in the works?
TechCrunch is reporting that Apple is in the process of some sort of deal with Nuance Communications, one of the leading companies in the field of speech recognition. Many readers may be familiar with Nuance's Dragon NaturallySpeaking software; however, the Dragon speech engine is also licensed and used in a number of apps for Windows, OS X, iOS, and Android. What could the deal be? The most obvious choice is an acquisition, but as TC points out, it would cost Apple at least US$6 billion to buy the company. Apple's got the cash, but even for them that would be quite a purchase. TechCrunch thinks it's most likely the two companies are entering into some sort of partnership "that will be vital to both companies and could shape the future of iOS." Speech recognition has been rumored to be a big part of the future of iOS. Last year, Apple bought another speech recognition company, Siri, which itself is powered by Nuance technology. Perhaps with the release of iOS 5 we'll be talking to our phones more than using them to talk to people.
Michael Grothaus, 05.07.2011
Voice-controlled Japanese robot assists with eating, makes veggies more fun (video)
Isao Wakabayashi, a student at Chukyo University in Japan, seems to have made the arduous chore of eating easier. Using a customized version of a Robix robot kit, Wakabayashi coded a program that makes the feeder recognize individual food items and feed them to you. The meal-assistant features two arms, dexterous enough to handle utensils, and can be controlled using your voice. In theory, this system would be ideal for the elderly, folks that currently have trouble eating by themselves, or you know -- for those that may or may not be too lazy to bring food to their face.
Sam Sheffer, 03.23.2011
Japanese elevators get voice recognition, Japanese elevator rides get even more awkward
We here at Engadget are all about helping the less fortunate, so Mitsubishi Electric's latest innovation in elevator tech has us all warm and fuzzy. The new interface allows for blind users -- and presumably lazy users -- to select their destination floor by voice, with a subsequent announcement when they arrive. Additionally, the system kicks in whenever it detects a wheelchair, replacing the potentially difficult process of reaching high buttons with the simple act of speaking. No word on whether the system works in English just yet or if it'll make it to the States, but you might want to brush up on your Japanese either way.
Jacob Schulman, 03.08.2011
TomTom's GO 2435 / 2535 PNDs get quiet teaser, we're left wondering what's new
The very busy folks over at TomTom have just squeezed out two new sets of PNDs sporting touchscreens, voice recognition, and a "new, intuitive user interface," but despite the company's high profile on the GPS market, the GO 2435, which works a 4.3-inch screen, and the GO 2535, a 5-inch iteration, slipped out without much ado. Both PNDs come in three versions: the "T" series supports lifetime traffic updates, the "M" line offers lifetime map updates, and the "MT" edition features -- you guessed it -- lifetime traffic and map updates. Thus far, the basic specs resemble those of previous GO PNDs -- both tout Bluetooth calling, 4GB flash storage, and 3 hours of battery life -- leaving us to wonder what's up with this "new, intuitive user interface?" Among other things, TomTom is still mum on price and availability, which means we'll have to wait until they speak up to give you all the dirty details.
Christopher Trout, 02.24.2011
Nuance opens Dragon Mobile SDK to app developers, we see end to embarrassing dictation
There are some messages that are just too embarrassing to dictate to a human being. Lucky for us and the retired circus contortionist we hired to type up our missives, Nuance is expanding the reach of its transcription software by making its Dragon Mobile SDK available to developers for use in iOS and Android applications. The SDK, which is free to members of the Nuance Mobile Developer Program, sports speech-to-text capabilities in eight languages and text-to-speech in 35. There are already apps out there that can do the job, including Nuance's own Dragon Dictation, but we welcome new advances in automated transcription. You know, it's not exactly a walk in the park dictating an entire Clay Aiken Fan Club newsletter to a guy named Sid the Human Pretzel.
Christopher Trout, 01.23.2011
Apple looking to hire voice technology, speech recognition specialists
Apple is looking to hire several voice and speech recognition experts for iOS, according to four new postings on Apple's job board. The new hires would join the iOS Application Framework team and would be working on "speech-related development activities." These multiple listings come hot on the heels of the public release of several patents detailing contextual voice commands for the iPhone. If you add in Apple's acquisition of Siri, you have the possible beginning of a robust voice control system for the iPhone and other iOS devices. Improved speech recognition would be a welcome addition to the iOS platform. In my experience, the current implementation of voice command on the iPhone is mediocre. While third-party applications like Dragon Dictation are superb, native speech control is wrong as often as it is right. Apple needs to refresh this portion of its mobile OS, as it is falling behind Android in voice control technology. Google recently added support for personalized recognition to its popular Voice Search application for Android. In its current form, the app can be fine-tuned to your voice so that you can dictate a text, navigate to a destination or place a call with increasing accuracy. [via 9to5 & CNET]
Kelly Hodgkins, 12.22.2010
Google Voice Search update helps you personalize your results, helps Google build another database to take over the world
Google Voice Actions was the first step towards our Star Trek dreams of lassoing the world with naught but vocal cords, and today Google's taken a second hop towards that inevitable future by letting Android devices record our every utterance. Yes, if you've got a handset running Froyo or better, you can download an update for Google Voice Search right now, which will let your phone dynamically personalize its speech-to-text engine to better recognize your voice most every time you use it. Of course, by so doing you're giving Google permission to record your sentences -- anonymously, of course -- to use in future products, but whether that's a problem or just a happy coincidence depends on whether you take Google at its word. We hit the "yes" button, in case you're curious. Find it on Android Market, or just use the handy-dandy QR code below.
Sean Hollister, 12.14.2010
LG's Optimus 7 gets previewed by Korean newspaper, has voice to text feature?
You know how we abhor machine translation, but this rumor was too juicy to pass up -- the Korea Economic Daily reportedly got hands-on with LG's Optimus 7 (aka E900) way ahead of release, and if we're reading this right, the Windows Phone 7 device will be capable of writing your text messages, emails and status updates just by hearing you speak. The publication also reports it's got a 3.8-inch, 800 x 480 screen (rather than the 3.5 or 3.7 inches we've heard before), a 1500 mAh battery, 16GB of built-in storage and a 1GHz processor. There's also apparently an "automatic panorama" feature where you simply pan the camera to take stills and stitch them together, which sounds a lot like the Sweep Panorama dealie Sony recently added to its Cyber-Shot lineup. Can we expect a US version to have these features? Hard to say. Even should this preview be wholly legit, speech-to-text would probably need quite the overhaul to tell English from Korean -- and let's not even get started on Engrish.
Sean Hollister, 09.28.2010