voice-recognition
Latest
Android's Andy Rubin is not a fan of Siri
Siri is the talk of the town now that the iPhone 4S is in the hands of over four million customers. There has been a deluge of articles about using Siri, funny phrases it says and even clever hacks that let third-party companies tap into the service. Apple and its fans may be excited by the voice recognition technology, but one of Google's executives is not overly impressed. Speaking in Hong Kong at the AsiaD conference, Google's Android chief, Andy Rubin, was sour on the utility of Siri. Rubin said, "I don't believe that your phone should be an assistant. Your phone is a tool for communicating. You shouldn't be communicating with the phone; you should be communicating with somebody on the other side of the phone." Rubin may not look favorably on Siri, but he does give Apple credit for waiting until the technology was mature before rolling it out on the iPhone 4S. He noted, "In projecting the future, I think Apple did a good job of figuring out when the technology was ready to be consumer-grade." Though Rubin claims not to be fond of voice recognition on a mobile phone, he does oversee Android's development at Google and has allowed advanced voice recognition features to be built into this mobile OS.
Siri gets lost internationally, promises to do better next year
The iPhone 4S' Siri integration may be a potential game changer, but she's not quite the world traveler some of us would like her to be. In fact, it seems she's as lost outside of US borders as any unprepared tourist. Looking for a pub in London? Better find a traditional map. Need to know the time of day in Canada? Siri admits she has no idea, go buy a watch. Business search (via Yelp), directions, and traffic data search all appear to be US-only features for now, and Wolfram Alpha only works in English-speaking countries. The automated assistant's international failings aren't too big of a surprise, however -- Apple's own Siri page outs the service as a beta, noting that some features may vary by area. Stuck with sub-par international support? Sit tight, it's coming: Apple's Siri FAQ states that additional language support (including Japanese, Chinese, Korean, Italian and Spanish), maps and local search content are set to go international in 2012. Update: Wolfram Alpha works outside the US in English speaking countries, thanks to everyone in the comments for the clarification.
Microsoft reportedly preparing Silverlight-like app framework ahead of Xbox Live update
Earlier this month, Microsoft announced a new slate of Xbox Live partnerships with Verizon, Comcast, and a host of other content providers. Now, new details have emerged about the code upon which these new apps will run. Sources close to the situation tell GigaOM that the new framework, code-named "Lakeview," will be based on Silverlight, but will also bring a few new features from Xbox Kinect, including voice recognition and gesture-based controls. More intriguing, perhaps, are insider claims that Microsoft's new content partners stream video using Apple's HTTP Live Streaming, rather than Redmond's Smooth Streaming. GigaOM's sources went on to say that Microsoft has been introducing major changes to the platform over the past few weeks, in the hopes of having it ready for third-party developers once the Xbox Live update rolls out. Spokespersons for Xbox and Silverlight said they have "nothing to announce" about the new framework, though GigaOM reports that Redmond is aiming to release the update on Black Friday.
Dragon Go for iPhone gets smarter
Dragon Go!, the all-purpose voice recognition search app from Nuance, is getting a significant upgrade today. In fact, it's almost a preview of some of the functionality we suspect will be in iOS 5. The free app lets you speak conversationally with your iPhone, iPod touch or iPad. Say things like "What's the best steakhouse in Kansas City", or "Find me some pictures of Lady Gaga", and the app will parse what you said and nearly always return usable results. The update, which should hit the App Store today, adds many more options, including the ability to launch popular movie and TV streaming services; get direct access to more of the most popular names in mobile content, like Spotify; get answers to the toughest of questions from Wolfram|Alpha and Ask.com; and find friends on Google+. I tried some of the new functions, and was impressed. For instance, I said "Watch Mad Men on Netflix," and Dragon Go! initiated a Google search. When I clicked on the resulting link, my Netflix app launched and the show started. I also successfully searched TUAW for articles and had it define words using Dictionary.com. For apps that require a login, you'll have to set up Dragon Go! to link with those apps, but that's not a difficult task. Vlad Sejnoha, chief technology officer at Nuance, said, "We're deeply invested in continuing to evolve Dragon Go! with new features, more content providers and richer app integration, and ultimately opening new doors for the consumer mobile destination experience. This is another step towards the mobile semantic web, and we've just gotten started." These new services join Google, Bing, Yahoo, Wikipedia, Twitter, YouTube and many others that were already built into the app. I find Dragon Go! and Siri (now owned by Apple) to be two of the best demos for the iPhone around. If you already have Dragon Go! you should see the update today. If you don't have it, download it and impress yourself and your friends.
Xbox Live Fall 2011 Dashboard update preview: Bing search, voice control, and a Metro overhaul
Autumn is fast approaching -- and you know what that means: it's round about time for an Xbox Dashboard update. Sure, we got a peek of Microsoft's upcoming harvest back at E3, but the good folks from Redmond invited us to take a closer look at what they're calling the "most significant update to the Dashboard since NXE." Senior Project Manager Terry Ferrell was on-site to walk us through an early engineering beta and show us how an updated Metro UI, Bing search and deeper Kinect integration are going to change the way folks manage their entertainment content.
Apple patent application imagines iPhones that learn the sweet sound of your voice
Button-loathing Apple really wants people to stop dirtying its devices with sticky fingerprints. That's why it's applied for a patent that should improve the frustrating experience of using iOS's voice control -- precisely the kind of update we've been awaiting since Apple bought Siri last year. With the help of a technology billed as "User profiling for voice input processing," your device would identify your voice and check what you say against a library of words associated with you, without having to trawl through its entire dictionary. We just hope Apple doesn't do away with physical inputs entirely -- we'd hate to broadcast to the world all the guilty pleasures we have loaded on our iPods.
Dragon Dictate 2.5 offers support for Microsoft Word 2011
Nuance has announced Dragon Dictate 2.5, a free upgrade to the company's Mac voice control/input app for version 2.0 customers. The new version dramatically improves mouse and keyboard entry in Microsoft Word 2011, among other features. According to Nuance, Word 2011 is the most commonly used app for Dragon Dictate customers, so it makes sense that the company would put emphasis on adding more dictation functionality for the word processing market leader. Earlier versions of Dictate would get confused about where the insertion point or document elements were located when users switched between voice and mouse input (except in the company's own Notepad app or in TextEdit, where Dragon supported more complex behaviors). The recommendation against mixing dictation and keyboard/mouse editing has been so ingrained in the product's DNA that Dragon refers to it informally as the Golden Rule. Meanwhile, users of the corresponding Dragon NaturallySpeaking app on the Windows platform had far fewer restrictions. With 2.5 and Microsoft Word 2011, the Golden Rule is history; users can easily switch between voice and keyboard input at will, or between dictation and command mode within Dictate itself, all without disrupting Dictate's internal model of the document. This lends itself to a far more natural and workflow-friendly way of using Dragon; instead of having to stop and start between dictation and editing phases, just keep on going. Dragon SVP/general manager Peter Mahoney told TUAW that there's nothing specific to announce about enhanced support for Apple's Pages or other popular Mac productivity apps, but the company is looking at other integrations. "This is the first time that we've done this [on the Mac] for a meaningful application, and there was a lot of new invention in the way we created these integration models," he said. "Some of the approach we used in Word 2011 will benefit the Windows product, too... 
It's certainly something that we plan to expand to other applications over time." Version 2.5 adds the ability to dictate without distraction from the mouse and keyboard, and also adds a microphone option in the form of an iOS app -- Dragon Remote Microphone (Free) -- for those situations where you'd rather not be tied to a traditional headset, but where you do share a Wi-Fi network between your computer and your phone. There are new capabilities for controlling how Dragon Dictate formats text, and new voice commands even allow posting to Facebook and Twitter. Even searches on Google, Bing, Yahoo! or with Spotlight on the Mac can be accomplished with a voice command. The microphone can now be set to automatically "sleep" after a preset amount of time so that it won't recognize speech until you specifically wake it. For new users, a digital download of Dragon Dictate 2.5 for Mac is available for $179.99 through the Nuance website; owners of the Windows NaturallySpeaking product can cross-grade for $99. Check out the slideshow below for a demonstration of some of the Word commands that are available in the upgrade.
Windows Phone 7.5 Mango in-depth preview (video)
Make no mistake, Microsoft isn't playing coy in the smartphone market any longer. The folks in Redmond are making a significant jump forward in the mobile arena, announcing that the upcoming version of Windows Phone, codenamed "Mango," will be heading to a device near you in time for the holidays. As its competitors have raised the bar of expectations to a much higher level, Microsoft followed suit by adding at least 500 features to its mobile investment, which the company hopes will plug all of the gaping holes the first two versions left open. We received a Samsung Focus preloaded with the most recent developer build (read: not even close to the market release version) and we had a few good days to put it through its paces. It's still far from completion, as there were several key features that we couldn't test out; some weren't fully implemented, and others involved third-party apps that won't be updated until closer to launch. Yet we don't want to call this build half-baked -- in fact, it was surprisingly smooth for software that still has at least four months to go before it's available for public consumption. At the risk of sounding ridiculously obvious, we're mighty interested in seeing the final result when all is said and done this holiday season. As a disclaimer, we can't guarantee that the stuff we cover here will actually look or act the same when it's ready to peek out and make its official introduction in Q4; as often happens, features and UI enhancements are subject to change by the Windows Phone team as Mango gets closer and closer to release. Let's get straight to brass tacks, since there are a lot of details to dive into. It'd be best to grab a large beverage (we'd recommend a Big Gulp, at least), find your most comfortable chair, and meet us after the break.
Nuance buys SVOX ahead of iOS 5 release
There's a whole trail of rumors hinting at an upcoming deal between speech recognition company Nuance and Apple. For quite a while now (ever since Apple picked up personal assistant software maker Siri), the scuttlebutt has claimed that the folks in Cupertino would make a deal with Nuance for some kind of speech recognition, most likely an iOS-level integration that would allow you to ask your iOS device for whatever you want, and get it quickly and easily. But even if that deal is on, that hasn't slowed Nuance down. The company has acquired another speech recognition firm, SVOX, the creators of high-end speech recognition and text-to-speech services. That's a natural fit for Nuance, of course, and the release says that the new deal "will advance the proliferation of voice in the automotive market, and accelerate the development of new voice capabilities that enable natural, conversational interactions between consumers and their connected cars, mobile phones, and other consumer devices." Sounds exciting to us. We didn't actually get to see either Siri or an updated voice control service show up during the iOS 5 announcement at WWDC, but that doesn't mean it's completely off the table yet. Maybe a deal like this is just what Nuance needs to set up the partnership that Apple's reportedly been seeking for a while.
Weekend rumor roundup: Apple Retail event, new MacBook Airs, unlocked iPhones, more
Several rumors with varying degrees of credibility came up over the weekend. According to AppleInsider, Twitter user @chronicwire (reportedly a source of past Apple leaks) reports that Apple's retail stores are setting up to launch Apple's annual Back to School promotion on Wednesday. The same source initially reported that the Back to School promo will coincide with the launch of new MacBook Airs, but he has since retracted that claim. Instead, Chronic claims the part numbers he initially thought represented new MacBook Airs indicate that Apple will start selling versions of the GSM iPhone 4 that are not carrier-locked to AT&T. Although the MacBook Air is widely expected to have a refresh soon, this is the first we've heard of unlocked iPhones being offered for sale in the U.S., and it's something we'll file under "We'll believe it when we see it." The iPhone is already sold free and clear of carrier locks in several markets, but GSM model iPhones sold in the U.S. remain carrier-locked to AT&T unless you jailbreak. Chronic has also released screenshots that supposedly come from an "internal build" of iOS 5. These screenshots show that Nuance voice recognition, expected to be integrated in iOS 5 but not discussed at WWDC, is still in development. Other sources have claimed these voice recognition features weren't ready to be shown off at WWDC but should be good to go by the time iOS 5 launches this fall. Finally, a reader has informed us that New Zealand's online Apple Store is now showing shipping times of 5-7 business days for the 1 TB Time Capsule and 1-2 weeks for the 2 TB model. These extended shipping times are also showing up in Apple's Australian and UK stores, and the Canadian Apple Store is showing a 1-2 week delay for the 1 TB Time Capsule. The U.S. store and most international stores are not showing the same delay, but the extended times are further evidence of the Time Capsule supply constraints we reported last week, which may mean a product refresh is imminent.
We'll be keeping a very close eye on Apple's online store on the Tuesday overnight shift, and we'll let you know if anything new comes up.
Russian ATM uses voice analysis to tell when you're lying
Credit card applications via automated teller are all the rage abroad these days. That's why Russia's Sberbank is using Speech Technology Center's voice recognition system in its new ATM to tell when you fudge your financials to get approved. Like a polygraph, the technology senses involuntary stress cues to ferret out fib-filled statements -- only instead of using wired sensors, it listens to your angst-ridden voice. Designed using samples from Russian police interrogation recordings where subjects were found to be lying, the system is able to detect the changes in speech patterns when a person isn't telling the truth. Of course, it's not completely accurate, so the biometric voice data is combined with credit history and other info before the ATM can crush an applicant's credit dreams. And to assuage the public's privacy concerns, patrons' voice prints will be kept on chips in their credit cards instead of a bank database. So, we don't have to worry about hackers stealing our biometric info, but we're slightly concerned that we'll no longer be able to deceive our robot overlords should the need arise.
Kinect integration in Ghost Recon: Future Soldier, hands-off (video)
Microsoft's E3 keynote may have exploded with deeper Kinect support, but nothing caught our eyes quite as sharply as Ghost Recon: Future Soldier's rifle-exploding Gunsmith demo. A Ubisoft representative showed us how it's done: separating your arms separates your deadly firearm into a gorgeous display of floating screws, components, and accessories, which can be effortlessly modified, swapped, and replaced with gesture and voice commands. Too picky to decide for yourself? Then don't: just tell Gunsmith what you're looking for. For instance, saying "Optimize for range" produces a weapon any sniper should be proud of -- even better, we found that commanding Gunsmith to "optimize for awesome" birthed a rifle (pictured above) sporting an underbarrel shotgun attachment. A gun attached to a gun? Yeah, that works. Weapons can be tested in Gunsmith's gesture-controlled firing range, an engaging shooting mode exclusive to the Gunsmith weapon editor and not usable in regular gameplay. Head past the break for a hands-on (figuratively speaking) video.
Mass Effect 3 gets Kinect support with voice recognition
Mass Effect 3 will support Kinect, letting you choose by voice whether various characters live or die, BioWare announced this morning during Microsoft's E3 2011 press conference. CEO Ray Muzyka took to the stage to reveal the added functionality -- rumored just this past week -- describing it as voice commands for various effects. Choosing whether characters live or die (by voice) and ordering your squad around the battlefield were both shown off, though it's possible more features will be added before the game's planned early 2012 launch. We're not exactly sure why Kinect is required for this functionality where an Xbox 360 headset would work (a la Ruse), but it sure is neat being able to command someone's death with nothing more than the sound of our voice.
Chaufr lets you shout searches, yell URLs at Chrome
Generally, shouting commands at the internet isn't going to get you very far, but if you're just yelling a few destinations and search terms, Chrome extension Chaufr can take you where you need to go. A previous add-on, Speechify, let you speak to fill input fields, but couldn't help you actually navigate the web. Chaufr, on the other hand, lets you simply say the magic word -- "Engadget" -- and it drops you right at our online doorstep. You can also use it to perform searches by saying Wikipedia, Google, Amazon, YouTube, or Yahoo followed by whatever it is you're looking for. It worked well enough in our brief hands-on, but we do have one nit to pick -- activating voice input requires that you click on an icon in the toolbar, then click on a microphone in the drop-down menu. (Can't a brother get a keyboard shortcut?) You can try it out for yourself by clicking on the source link.
Garmin nuLink! 2390 torn apart by FCC, put back together again on US site
Last week Garmin announced the latest member of its high-end GPS navigator family, the nuLink! 2390. Sadly, it was a Europe-only affair, leaving American consumers wondering why the company was giving us the cold shoulder. (Whatever it was baby, we're sorry, come back.) Then we spotted an unnamed 4.3-inch Garmin making its way through the FCC that matches up quite nicely, size- and feature-wise, with the 2390. The newest nuLink-enabled device is even showing its face over at the company's US website (you really do love us!), though it's not available to order and you'll have to do some serious digging to unearth it. Whenever it does hit American shores, you'll be able to pull in 3D traffic data and search Google thanks to its GSM radio, and tether your phone to it using Bluetooth for hands-free calls. It also has voice recognition software so you can furiously bark commands at it when you miss a turn, and a tracking feature for keeping tabs on unruly teens. If you're into seeing gadgets splayed open like an organ transplant patient, check out the gallery below.
Windows Phone 7 updates Bing to find music and barcodes, provide turn-by-turn directions and send speech-to-text SMS?
Developers are getting plenty of toys alongside Windows Phone 7's "Mango" release, but there may be extra baubles for regular users, too -- Microsoft will reportedly add a few features to Bing in the near future which could prove particularly useful. According to the latest episode of the Windows Phone Dev Podcast -- which hosted Microsoft's Brandon Watson as a guest -- a new function called Bing Audio will act like a Shazam for recognizing music (and will sell you Zune tracks) while Bing Vision will use your smartphone's camera to read barcodes and do optical character recognition, plus potentially provide support for augmented reality apps. There are also allegedly turn-by-turn voice directions for Bing Maps and a native podcast player, and one more potentially exciting thing -- voice-to-text for sending SMS messages without lifting a finger. Hear all about the rumor at our source link, at just about the 40-minute mark. [Thanks to everyone who sent this in]
Is a Nuance and Apple deal in the works?
TechCrunch is reporting that Apple is in the process of some sort of deal with Nuance Communications, one of the leading companies in the field of speech recognition. Many readers may be familiar with Nuance's Dragon NaturallySpeaking software; however, the Dragon speech engine is also licensed and used in a number of apps for Windows, OS X, iOS, and Android. What could the deal be? The most obvious choice is an acquisition, but as TC points out, it would cost Apple at least US$6 billion to buy the company. Apple's got the cash, but even for them that would be quite a purchase. TechCrunch thinks it's most likely the two companies are entering into some sort of partnership "that will be vital to both companies and could shape the future of iOS." Speech recognition has been rumored to be a big part of the future of iOS. Last year, Apple bought another speech recognition company, Siri, which itself is powered by Nuance technology. Perhaps with the release of iOS 5 we'll be talking to our phones more than using them to talk to people.
Japanese elevators get voice recognition, Japanese elevator rides get even more awkward
We here at Engadget are all about helping the less fortunate, so Mitsubishi Electric's latest innovation in elevator tech has us all warm and fuzzy. The new interface allows blind users -- and presumably lazy users -- to select their destination floor by voice, with a subsequent announcement when they arrive. Additionally, the system kicks in whenever it detects a wheelchair, replacing the potentially difficult process of reaching high buttons with the simple act of speaking. No word on whether the system works in English just yet or if it'll make it to the States, but you might want to brush up on your Japanese either way.
TomTom's GO 2435 / 2535 PNDs get quiet teaser, we're left wondering what's new
The very busy folks over at TomTom have just squeezed out two new sets of PNDs sporting touchscreens, voice recognition, and a "new, intuitive user interface," but despite the company's high profile in the GPS market, the GO 2435, which works a 4.3-inch screen, and the GO 2535, a 5-inch iteration, slipped out without much ado. Both PNDs come in three versions: the "T" series supports lifetime traffic updates, the "M" line offers lifetime map updates, and the "MT" edition features -- you guessed it -- lifetime traffic and map updates. Thus far, the basic specs resemble those of previous GO PNDs -- both tout Bluetooth calling, 4GB flash storage, and 3 hours of battery life -- leaving us to wonder what's up with this "new, intuitive user interface." Among other things, TomTom is still mum on price and availability, which means we'll have to wait until they speak up to give you all the dirty details.
Nuance opens Dragon Mobile SDK to app developers, we see end to embarrassing dictation
There are some messages that are just too embarrassing to dictate to a human being. Lucky for us and the retired circus contortionist we hired to type up our missives, Nuance is expanding the reach of its transcription software by making its Dragon Mobile SDK available to developers for use in iOS and Android applications. The SDK, which is free to members of the Nuance Mobile Developer Program, sports speech-to-text capabilities in eight languages and text-to-speech in 35. There are already apps out there that can do the job, including Nuance's own Dragon Dictation, but we welcome new advances in automated transcription. You know, it's not exactly a walk in the park dictating an entire Clay Aiken Fan Club newsletter to a guy named Sid the Human Pretzel.