voice recognition

Latest

  • Dragon Dictate 2.5 offers support for Microsoft Word 2011

    by 
    Steve Sande
    07.25.2011

    Nuance has announced Dragon Dictate 2.5, a free upgrade to the company's Mac voice control/input app for version 2.0 customers. The new version dramatically improves mouse and keyboard entry in Microsoft Word 2011, among other features. According to Nuance, Word 2011 is the most commonly used app for Dragon Dictate customers, so it makes sense that the company would put emphasis on adding more dictation functionality for the word processing market leader. Earlier versions of Dictate would get confused about where the insertion point or document elements were located when users switched between voice and mouse input (except in the company's own Notepad app or in TextEdit, where Dragon supported more complex behaviors). The recommendation against mixing dictation and keyboard/mouse editing has been so ingrained in the product's DNA that Dragon refers to it informally as the Golden Rule. Meanwhile, users of the corresponding Dragon NaturallySpeaking app on the Windows platform had far fewer restrictions. With 2.5 and Microsoft Word 2011, the Golden Rule is history; users can easily switch between voice and keyboard input at will, or between dictation and command mode within Dictate itself, all without disrupting Dictate's internal model of the document. This lends itself to a far more natural and workflow-friendly way of using Dragon; instead of having to stop and start between dictation and editing phases, just keep on going. Dragon SVP/general manager Peter Mahoney told TUAW that there's nothing specific to announce about enhanced support for Apple's Pages or other popular Mac productivity apps, but the company is looking at other integrations. "This is the first time that we've done this [on the Mac] for a meaningful application, and there was a lot of new invention in the way we created these integration models," he said. "Some of the approach we used in Word 2011 will benefit the Windows product, too... 
    It's certainly something that we plan to expand to other applications over time." Version 2.5 adds the ability to dictate without distraction from the mouse and keyboard, and also adds a microphone option in the form of an iOS app -- Dragon Remote Microphone (Free) -- for those situations where you'd rather not be tied to a traditional headset, but where you do share a Wi-Fi network between your computer and your phone. There are new capabilities for controlling how Dragon Dictate formats text, and new voice commands even allow posting to Facebook and Twitter. Searches on Google, Bing, Yahoo! or with Spotlight on the Mac can also be run with a voice command. The microphone can now be set to automatically "sleep" after a preset amount of time so that it won't recognize speech until you specifically wake it. For new users, a digital download of Dragon Dictate 2.5 for Mac is available for $179.99 through the Nuance website; owners of the Windows NaturallySpeaking product can cross-grade for $99. Check out the slideshow below for a demonstration of some of the Word commands that are available in the upgrade.

  • Pioneer solicits Whodoo guinea pigs for speech-based Android assistant (video)

    by 
    Zachary Lutz
    07.13.2011

    Ever wish you could have a personal attendant living inside your Android smartphone? You know... one you can boss around without incurring human rights or labor law violations? Apparently Pioneer shares your vision, because its voice-controlled social assistant named Whodoo is seemingly ready to "hop to" at a moment's notice -- willing to locate a restaurant and send it to friends, route the appropriate directions, and announce your intentions to Facebook or Twitter -- all based on your verbal commands (and ostensibly perfect for in-dash navigation). The company is seeking bossy applicants for its closed beta experiment, which involves completing a lengthy application, providing considerable feedback, and submitting audio samples that are gathered by Whodoo. Think you've got the chops? Just follow the source, where you're free to convince Pioneer of the same.

  • Windows Phone 7.5 Mango in-depth preview (video)

    by 
    Brad Molen
    06.27.2011

    Make no mistake, Microsoft isn't playing coy in the smartphone market any longer. The folks in Redmond are making a significant jump forward in the mobile arena, announcing that the upcoming version of Windows Phone, codenamed "Mango," will be heading to a device near you in time for the holidays. As its competitors have raised the bar of expectations to a much higher level, Microsoft followed suit by adding at least 500 features to its mobile investment, which the company hopes will plug all of the gaping holes the first two versions left open. We received a Samsung Focus preloaded with the most recent developer build (read: not even close to the market release version) and we had a few good days to put it through its paces. It's still far from completion, as there were several key features that we couldn't test out; some weren't fully implemented, and others involved third-party apps that won't be updated until closer to launch. Yet we don't want to call this build half-baked -- in fact, it was surprisingly smooth for software that still has at least four months to go before it's available for public consumption. At the risk of sounding ridiculously obvious, we're mighty interested in seeing the final result when all is said and done this holiday season. As a disclaimer, we can't guarantee that the stuff we cover here will actually look or act the same when it's ready to peek out and make its official introduction in Q4; as often happens, features and UI enhancements are subject to change by the Windows Phone team as Mango gets closer and closer to release. Let's get straight to brass tacks, since there are a lot of details to dive into. It'd be best to grab a large beverage (we'd recommend a Big Gulp, at least), find your most comfortable chair, and meet us after the break.

  • Nuance buys SVOX ahead of iOS 5 release

    by 
    Mike Schramm
    06.16.2011

    There's a whole trail of rumors hinting at an upcoming deal between speech recognition company Nuance and Apple. For quite a while now (ever since Apple picked up personal assistant software maker Siri), the scuttlebutt has claimed that the folks in Cupertino would make a deal with Nuance for some kind of speech recognition, most likely an iOS-level integration that would allow you to ask your iOS device for whatever you want, and get it quickly and easily. But even if that deal is on, it hasn't slowed Nuance down. The company has acquired another speech recognition firm, SVOX, the creators of high-end speech recognition and text-to-speech services. That's a natural fit for Nuance, of course, and the release says that the new deal "will advance the proliferation of voice in the automotive market, and accelerate the development of new voice capabilities that enable natural, conversational interactions between consumers and their connected cars, mobile phones, and other consumer devices." Sounds exciting to us. We didn't actually get to see either Siri or an updated voice control service show up during the iOS 5 announcement at WWDC, but that doesn't mean it's completely out of the question yet. Maybe a deal like this is just what Nuance needs to set up the partnership that Apple's reportedly been seeking for a while.

  • Weekend rumor roundup: Apple Retail event, new MacBook Airs, unlocked iPhones, more

    by 
    Chris Rawson
    06.12.2011

    Several rumors with varying degrees of credibility came up over the weekend. According to AppleInsider, Twitter user @chronicwire (reportedly a source of past Apple leaks) reports that Apple's retail stores are setting up to launch Apple's annual Back to School promotion on Wednesday. The same source initially reported that the Back to School promo will coincide with the launch of new MacBook Airs, but he has since retracted that claim. Instead, Chronic claims the part numbers he initially thought represented new MacBook Airs indicate that Apple will start selling versions of the GSM iPhone 4 that are not carrier-locked to AT&T. Although the MacBook Air is widely expected to have a refresh soon, this is the first we've heard of unlocked iPhones being offered for sale in the U.S., and it's something we'll file under "We'll believe it when we see it." The iPhone is already sold free and clear of carrier locks in several markets, but GSM model iPhones sold in the U.S. remain carrier-locked to AT&T unless you jailbreak. Chronic has also released screenshots that supposedly come from an "internal build" of iOS 5. These screenshots show that Nuance voice recognition, expected to be integrated in iOS 5 but not discussed at WWDC, is still in development. Other sources have claimed these voice recognition features weren't ready to be shown off at WWDC but should be good to go by the time iOS 5 launches this fall. Finally, a reader has informed us that New Zealand's online Apple Store is now showing shipping times of 5-7 business days for the 1 TB Time Capsule and 1-2 weeks for the 2 TB model. These extended shipping times are also showing up in Apple's Australian and UK stores, and the Canadian Apple Store is showing a 1-2 week delay for the 1 TB Time Capsule. The U.S. 
    store and most international stores are not showing the same delay, but the delays that have appeared are further evidence of the Time Capsule supply constraints we reported last week, which may mean a product refresh is imminent. We'll be keeping a very close eye on Apple's online store on the Tuesday overnight shift, and we'll let you know if anything new comes up.

  • Russian ATM uses voice analysis to tell when you're lying

    by 
    Michael Gorman
    06.11.2011

    Credit card applications via automated teller are all the rage abroad these days. That's why Russia's Sberbank is using Speech Technology Center's voice recognition system in its new ATM to tell when you fudge your financials to get approved. Like a polygraph, the technology senses involuntary stress cues to ferret out fib-filled statements -- only instead of using wired sensors, it listens to your angst-ridden voice. Designed using samples from Russian police interrogation recordings where subjects were found to be lying, the system is able to detect the changes in speech patterns when a person isn't telling the truth. Of course, it's not completely accurate, so the biometric voice data is combined with credit history and other info before the ATM can crush an applicant's credit dreams. And to assuage the public's privacy concerns, patrons' voice prints will be kept on chips in their credit cards instead of a bank database. So, we don't have to worry about hackers stealing our biometric info, but we're slightly concerned that we'll no longer be able to deceive our robot overlords should the need arise.

  • Kinect integration in Ghost Recon: Future Soldier, hands-off (video)

    by 
    Sean Buckley
    06.08.2011

    Microsoft's E3 keynote may have exploded with deeper Kinect support, but nothing caught our eyes quite as sharply as Ghost Recon: Future Soldier's rifle-exploding Gunsmith demo. A Ubisoft representative showed us how it's done: separating your arms separates your deadly firearm into a gorgeous display of floating screws, components, and accessories, which can be effortlessly modified, swapped, and replaced with gesture and voice commands. Too picky to decide for yourself? Then don't: just tell Gunsmith what you're looking for. For instance, saying "Optimize for range" produces a weapon any sniper should be proud of -- even better, we found that commanding Gunsmith to "optimize for awesome" birthed a rifle (pictured above) sporting an underbarrel shotgun attachment. A gun attached to a gun? Yeah, that works. Weapons can be tested in Gunsmith's gesture-controlled firing range, an engaging shooting mode exclusive to the Gunsmith weapon editor and not usable in regular gameplay. Head past the break for a hands-on (figuratively speaking) video.

  • Mass Effect 3 gets Kinect support with voice recognition

    by 
    Ben Gilbert
    06.06.2011

    Mass Effect 3 will allow you to choose by voice whether various characters live or die, BioWare announced this morning during Microsoft's E3 2011 press conference. CEO Ray Muzyka took to the stage to reveal the added Kinect functionality -- rumored just this past week -- describing it as voice commands for various effects. Choosing whether characters live or die (by voice) and ordering your squad around the battlefield were both shown off, though it's possible more features will be added before the game's planned early 2012 launch. We're not exactly sure why Kinect is required for this functionality where an Xbox 360 headset would work (a la Ruse), but it sure is neat being able to command someone's death with nothing more than the sound of our voice.

  • Chaufr lets you shout searches, yell URLs at Chrome

    by 
    Terrence O'Brien
    05.31.2011

    Generally, shouting commands at the internet isn't going to get you very far, but if you're just yelling a few destinations and search terms, Chrome extension Chaufr can take you where you need to go. A previous add-on, Speechify, let you speak to fill input fields, but couldn't help you actually navigate the web. Chaufr, on the other hand, lets you simply say the magic word -- "Engadget" -- and it drops you right at our online doorstep. You can also use it to perform searches by saying Wikipedia, Google, Amazon, YouTube, or Yahoo followed by whatever it is you're looking for. It worked well enough in our brief hands-on, but we do have one nit to pick -- activating voice input requires you to click on an icon in the toolbar, then click on a microphone in the drop-down menu. (Can't a brother get a keyboard shortcut?) You can try it out for yourself by clicking on the source link.

  • Garmin nuLink! 2390 torn apart by FCC, put back together again on US site

    by 
    Terrence O'Brien
    05.17.2011

    Last week Garmin announced the latest member of its high-end GPS navigator family, the nuLink! 2390. Sadly, it was a Europe-only affair, leaving American consumers wondering why the company was giving us the cold shoulder. (Whatever it was baby, we're sorry, come back.) Then we spotted an unnamed 4.3-inch Garmin making its way through the FCC that matches up quite nicely, size- and feature-wise, with the 2390. The newest nuLink-enabled device is even showing its face over at the company's US website (you really do love us!), though it's not available to order and you'll have to do some serious digging to unearth it. Whenever it does hit American shores, you'll be able to pull in 3D traffic data and search Google thanks to its GSM radio, and tether your phone to it over Bluetooth for hands-free calls. It also has voice recognition software, so you can furiously bark commands at it when you miss a turn, and a tracking feature for keeping tabs on unruly teens. If you're into seeing gadgets splayed open like an organ transplant patient, check out the gallery below.

  • TomTom sends HD Traffic update to all Live models, extends Traffic Manifesto to US (video)

    by 
    Zach Honig
    05.12.2011

    TomTom CEO Harold Goddijn announced at a NYC event last night that the company's HD Traffic service, previously only included with the Go 2535 M Live, would be available on all U.S. Live models, including the Go 740 Live and XL 340 Live. Traffic updates will be one component of the subscription-based Live, which will also see a 50 percent price drop, to $60 per year. This is all part of TomTom's grand Traffic Manifesto, which aims to cut traffic by five percent overall. Achieving this rather lofty goal in the U.S. would require 10 percent of the country's drivers to be using Live, which transmits real-time traffic data using a dedicated AT&T SIM. The company says drivers using the service themselves can expect to see travel times reduced by up to 15 percent. Our commute often involves a pajama-clad hike from the bed to the desk, so if you're currently a subscriber who drives to work, let us know if Traffic is making a dent in your travels.

  • Windows Phone 7 updates Bing to find music and barcodes, provide turn-by-turn directions and send speech-to-text SMS?

    by 
    Sean Hollister
    05.08.2011

    Developers are getting plenty of toys alongside Windows Phone 7's "Mango" release, but there may be extra baubles for regular users, too -- Microsoft will reportedly add a few features to Bing in the near future which could prove particularly useful. According to the latest episode of the Windows Phone Dev Podcast -- which hosted Microsoft's Brandon Watson as a guest -- a new function called Bing Audio will act like a Shazam for recognizing music (and will sell you Zune tracks) while Bing Vision will use your smartphone's camera to read barcodes and do optical character recognition, plus potentially provide support for augmented reality apps. There are also allegedly turn-by-turn voice directions for Bing Maps and a native podcast player, and one more potentially exciting thing -- voice-to-text for sending SMS messages without lifting a finger. Hear all about the rumor at our source link, at just about the 40-minute mark. [Thanks to everyone who sent this in]

  • Is a Nuance and Apple deal in the works?

    by 
    Michael Grothaus
    05.07.2011

    TechCrunch is reporting that Apple is in the process of some sort of deal with Nuance Communications, one of the leading companies in the field of speech recognition. Many readers may be familiar with Nuance's Dragon NaturallySpeaking software; however, the Dragon speech engine is also licensed and used in a number of apps for Windows, OS X, iOS, and Android. What could the deal be? The most obvious choice is an acquisition, but as TC points out, it would cost Apple at least US$6 billion to buy the company. Apple's got the cash, but even for Apple that would be quite a purchase. TechCrunch thinks it's most likely the two companies are entering into some sort of partnership "that will be vital to both companies and could shape the future of iOS." Speech recognition has been rumored to be a big part of the future of iOS. Last year, Apple bought another speech recognition company, Siri, which itself is powered by Nuance technology. Perhaps with the release of iOS 5 we'll be talking to our phones more than using them to talk to people.

  • Voice-controlled Japanese robot assists with eating, makes veggies more fun (video)

    by 
    Sam Sheffer
    03.23.2011

    Isao Wakabayashi, a student at Chukyo University in Japan, seems to have made the arduous chore of eating easier. Using a customized version of a Robix robot kit, Wakabayashi coded a program that makes the feeder recognize individual food items and feed them to you. The meal-assistant features two arms, dexterous enough to handle utensils, and can be controlled using your voice. In theory, this system would be ideal for the elderly, folks that currently have trouble eating by themselves, or you know -- for those that may or may not be too lazy to bring food to their face.

  • Japanese elevators get voice recognition, Japanese elevator rides get even more awkward

    by 
    Jacob Schulman
    03.08.2011

    We here at Engadget are all about helping the less fortunate, so Mitsubishi Electric's latest innovation in elevator tech has us all warm and fuzzy. The new interface allows for blind users -- and presumably lazy users -- to select their destination floor by voice, with a subsequent announcement when they arrive. Additionally, the system kicks in whenever it detects a wheelchair, replacing the potentially difficult process of reaching high buttons with the simple act of speaking. No word on whether the system works in English just yet or if it'll make it to the States, but you might want to brush up on your Japanese either way.

  • TomTom's GO 2435 / 2535 PNDs get quiet teaser, we're left wondering what's new

    by 
    Christopher Trout
    02.24.2011

    The very busy folks over at TomTom have just squeezed out two new sets of PNDs sporting touchscreens, voice recognition, and a "new, intuitive user interface," but despite the company's high profile in the GPS market, the GO 2435, which has a 4.3-inch screen, and the GO 2535, a 5-inch iteration, slipped out without much ado. Both PNDs come in three versions: the "T" series supports lifetime traffic updates, the "M" line offers lifetime map updates, and the "MT" edition features -- you guessed it -- lifetime traffic and map updates. Thus far, the basic specs resemble those of previous GO PNDs -- both tout Bluetooth calling, 4GB flash storage, and 3 hours of battery life -- leaving us to wonder what's up with this "new, intuitive user interface"? Among other things, TomTom is still mum on price and availability, which means we'll have to wait until the company speaks up to give you all the dirty details.

  • Nuance opens Dragon Mobile SDK to app developers, we see end to embarrassing dictation

    by 
    Christopher Trout
    01.23.2011

    There are some messages that are just too embarrassing to dictate to a human being. Lucky for us and the retired circus contortionist we hired to type up our missives, Nuance is expanding the reach of its transcription software by making its Dragon Mobile SDK available to developers for use in iOS and Android applications. The SDK, which is free to members of the Nuance Mobile Developer Program, sports speech-to-text capabilities in eight languages and text-to-speech in 35. There are already apps out there that can do the job, including Nuance's own Dragon Dictation, but we welcome new advances in automated transcription. You know, it's not exactly a walk in the park dictating an entire Clay Aiken Fan Club newsletter to a guy named Sid the Human Pretzel.

  • Apple looking to hire voice technology, speech recognition specialists

    by 
    Kelly Hodgkins
    12.22.2010

    Apple is looking to hire several voice and speech recognition experts for iOS, according to four new postings on Apple's job board. The new hires would join the iOS Application Framework team and would be working on "speech-related development activities." These multiple listings come hot on the heels of the public release of several patents detailing contextual voice commands for the iPhone. If you add in Apple's acquisition of Siri, you have the possible beginning of a robust voice control system for the iPhone and other iOS devices. Improved speech recognition would be a welcome addition to the iOS platform. In my experience, the current implementation of voice command on the iPhone is mediocre. While third-party applications like Dragon Dictation are superb, native speech control is wrong as often as it is right. Apple needs to refresh this portion of its mobile OS, as it is falling behind Android in voice control technology. Google recently added support for personalized recognition to its popular Voice Search application for Android. In its current form, the app can be fine-tuned to your voice so that you can dictate a text, navigate to a destination or place a call with increasing accuracy. [via 9to5 & CNET]

  • Google Voice Search update helps you personalize your results, helps Google build another database to take over the world

    by 
    Sean Hollister
    12.14.2010

    Google Voice Actions was the first step towards our Star Trek dreams of lassoing the world with naught but vocal cords, and today Google's taken a second hop towards that inevitable future by letting Android devices record our every utterance. Yes, if you've got a handset running Froyo or better, you can download an update for Google Voice Search right now, which will let your phone dynamically personalize its speech-to-text engine to better recognize your voice most every time you use it. Of course, by so doing you're giving Google permission to record your sentences -- anonymously, of course -- to use in future products, but whether that's a problem or just a happy coincidence depends on whether you take Google at its word. We hit the "yes" button, in case you're curious. Find it on Android Market, or just use the handy-dandy QR code below.

  • LG's Optimus 7 gets previewed by Korean newspaper, has voice to text feature?

    by 
    Sean Hollister
    09.28.2010

    You know how we abhor machine translation, but this rumor was too juicy to pass up -- the Korea Economic Daily reportedly got hands-on with LG's Optimus 7 (aka E900) way ahead of release, and if we're reading this right, the Windows Phone 7 device will be capable of writing your text messages, emails and status updates just by hearing you speak. The publication also reports it's got a 3.8-inch, 800 x 480 screen (rather than the 3.5 or 3.7 inches we've heard before), a 1500 mAh battery, 16GB of built-in storage and a 1GHz processor. There's also apparently an "automatic panorama" feature where you simply pan the camera to take stills and stitch them together, which sounds a lot like the Sweep Panorama dealie Sony recently added to its Cyber-Shot lineup. Can we expect a US version to have these features? Hard to say. Even should this preview be wholly legit, speech-to-text would probably need quite the overhaul to tell English from Korean -- and let's not even get started on Engrish.