speech recognition

Latest

  • PALRO robot masters English, will never shut up again (video)

    by 
    Tim Stevens
    01.21.2011

    When first we saw Fujisoft's PALRO robot doing its thing we were charmed but, as it didn't speak English, we had to adore it from afar. No longer. The little critter has obviously mastered our language quite quickly and can be seen below chatting with an even more robotic humanoid about such idle things as the weather, career aspirations, and just how great PALRO is. How great is PALRO? PALRO is really great -- but humble. Inside that barrel chest is a full-fledged PC with an Atom Z530 processor, 4GB of flash storage, and an Ubuntu kernel keeping everything in check. It's available as ever for educational and research institutions for about $3,600, but we're trying to get one ourselves. If we can get it to type, prepare yourselves for many more posts about software based on real Japanese cutting-edge technology.

  • Google Voice Search update helps you personalize your results, helps Google build another database to take over the world

    by 
    Sean Hollister
    12.14.2010

    Google Voice Actions was the first step towards our Star Trek dreams of lassoing the world with naught but vocal cords, and today Google's taken a second hop towards that inevitable future by letting Android devices record our every utterance. Yes, if you've got a handset running Froyo or better, you can download an update for Google Voice Search right now, which will let your phone dynamically personalize its speech-to-text engine to better recognize your voice most every time you use it. Of course, by so doing you're giving Google permission to record your sentences -- anonymously, of course -- to use in future products, but whether that's a problem or just a happy coincidence depends on whether you take Google at its word. We hit the "yes" button, in case you're curious. Find it on Android Market, or just use the handy-dandy QR code below.

  • Patent application suggests contextual voice commands for iPhone

    by 
    David Quilty
    12.09.2010

    A patent application filed by Apple in 2009 but just released to the public last week shows that they want to improve on the voice command abilities of the iPhone. As reported by AppleInsider, the patent looks as though it would make voice commands available in individual applications rather than system-wide, narrowing down the possibilities to a chosen few commands and drastically reducing the chances of the iPhone making a mistake. The patent also mentions allowing third-party apps to make use of voice commands, and that users could be audibly notified of what app they had selected along with a list of corresponding voice commands. This could come in really handy when driving the car or riding a bicycle, when one's eyes should be on the road and not staring at an iPhone screen. Now I don't know about you, but I have never been able to reliably use the voice command feature on my iPhone. The few times I've tried to use it, I ended up calling an ex-girlfriend when I meant to call the current one, or I called my grandmother instead of my brother. So any improvements Apple could make to voice command would be more than welcome. I have used other voice command apps like Dragon Dictation and Apple's recent acquisition Siri, but a context-based voice command system would be a great addition to the iPhone's abilities.
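
    A rough sketch of why a per-app command list cuts down on mistakes: if the recognizer only has to choose among the handful of commands the active app accepts, a slightly garbled utterance can still land on the right one. The app names, command lists, and the use of Python's difflib for fuzzy matching are all assumptions for illustration; the patent application's actual method is not described here.

```python
from difflib import get_close_matches

# Hypothetical per-app command sets; names and commands are illustrative only.
APP_COMMANDS = {
    "music": ["play", "pause", "next track", "previous track"],
    "phone": ["call", "redial", "voicemail"],
}

def match_command(active_app, heard, cutoff=0.6):
    """Return the closest command allowed in the active app, or None."""
    candidates = APP_COMMANDS.get(active_app, [])
    # Only the active app's short list is searched, not every command on the phone.
    matches = get_close_matches(heard.lower(), candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

if __name__ == "__main__":
    print(match_command("music", "next trak"))
    print(match_command("phone", "pause"))
```

    With the music app active, a misheard "next trak" still resolves to a music command, while "pause" spoken in the phone app matches nothing and can simply be rejected.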

  • Amazon releases Price Check app just in time for holiday shopping

    by 
    Mel Martin
    11.22.2010

    Amazon has released Price Check, a free app that allows you to do some comparison shopping. You start by saying the name of your target product, scanning its barcode, typing its name, or snapping a photo. You'll get several results; tap one to see who's offering it with prices and shipping costs listed. Of course, Amazon hopes it has the lowest price, but it doesn't always win. I tried the app in a local store and found the barcode scanning and voice recognition worked well. I took a picture of some DVDs, and the app figured out what the movie was and offered meaningful price comparisons to other retailers. I used to use the Amazon Mobile app in bookstores while I browsed, and I often just ordered from Amazon while I was in a brick-and-mortar bookstore. I felt bad, but sometimes the price differential was too significant to resist. You can share your pricing info via email, text message, Facebook, or Twitter. The Price Check app is listed for the iPhone only, but it will run on the iPad and the iPod touch. Amazon also has an iPad app called Windowshop, but it doesn't do price comparisons and is just an easy way to shop at Amazon. Price Check is so good that I can't conceive of shopping without it. If you want to give it a go, be careful -- there are a bunch of apps with Price Check as part of the name. Price Check requires iOS 3.1 or later.

  • AT&T Navigator for iPhone updated, features direct speech recognition

    by 
    Steve Sande
    10.01.2010

    If you're a subscriber to the free AT&T Navigator app and the associated service, then you'll want to load the latest update ASAP. AT&T Navigator v1.7i is the newest version of the TeleNav-powered app, and it's now the first iPhone GPS navigation app that incorporates direct speech recognition. As you can see in the video above, all you need to do is tap an icon, speak your destination, and the app will display appropriate destination addresses. Tap one of the addresses, and navigation begins. The new version also provides alerts for traffic cameras, works in landscape mode, and has a lane-assist function that shows you which lane you need to be in before you get to an intersection. When you need directions back to your home location, there's a new "shake to go home" function -- just shake the iPhone, and the app knows you want directions back home. The free app works with a US$9.99 monthly service that appears on your AT&T iPhone bill. You can choose a monthly or annual subscription, and you can cancel at any time. Note that navigation is only possible in areas where you have cellular data coverage, as the maps are downloaded on demand.

  • Dragon Dictation updated with iOS 4 support and some new features

    by 
    Mel Martin
    07.23.2010

    Dragon Dictation is one of the most popular free apps on the iPhone and iPad, and now it has been updated to support iOS 4. Nuance Communications, creators of the app, have added a pop-up toolbar that allows you to speak a status update and send it directly to Facebook or Twitter. You can also speak and send the text to the clipboard. As in the original version, you can dictate emails and text messages. Another nice feature is that the app now saves your dictated text if you are interrupted by a phone call. This latest version also supports U.K. English now, as well as German. The app already supports Spanish, Italian and French. I've had an advance copy of the app for a week, and I can confirm it works as advertised, although my high school German is a bit rusty so I didn't try that feature. Someday, I hope, Apple will build complete speech recognition into the iPhone and iPad. If they do, I hope they use the Nuance speech engine, which is very accurate and easy to use. Until then, Dragon Dictation is a must-download for use with email, social networking and texting. The app runs on iOS 3.1 or later on the iPhone, iPod touch and iPad.

  • Speak4it is yet another voice activated destination finder

    by 
    Mel Martin
    06.30.2010

    There have been some impressive apps for finding nearby services released lately. AT&T didn't want to get left out, so they are offering Speak4it, a free app that lets your vocal cords do the walking. Say something like "Chinese restaurant," and the app will mark relevant hits on a map, on a list, or produce an augmented reality view using your iPhone camera, pointing you in the right direction. Speak4it does things that Siri, Google and others do, but the execution is great and it certainly beats typing. Speak4it also has a unique feature where you can draw a circle on the map and the app will show businesses just within that area. If you draw a line, it will find places along that route.

  • Would you buy a voice-controlled camera, or perhaps a DSLR with touchscreen?

    by 
    Sean Hollister
    06.09.2010

    Do you talk to your digital camera? Perhaps stroke its glossy LCD? If a pair of recent patent applications are any indication, those mildly creepy gestures might one day actually do something. Sony's just laid claim to a DSLR touchscreen that can be manipulated by thumb even while the rest of one's face is smushed up against the viewfinder, and Canon's got its eye on technology that lets shooters activate advanced camera functions using simple voice control. The latter wouldn't be limited to "fire," but could potentially be directed to switch modes, stops and even zoom in and out of the frame. It wouldn't necessarily substitute for a remote as there are just two modes, "close-talking" for speech uttered when using the viewfinder, and "non-close-talking" when you line up shots on the LCD display. Neat as they are, these alternatives to physical controls make some at Engadget HQ quite sad, but we understand that minimalism is the word of the day.

  • Rumor: Natal test kit photos reveal 'motorized tilt mechanism,' power cord

    by 
    Ludwig Kietzmann
    04.20.2010

    Motion camera meta-voyeurism news now, with an alleged Project Natal test kit capturing every movement of a man ... taking pictures of it. According to Italian gaming site Multiplayer.it, the photos originate from a tester, who was tasked with having an unreciprocated conversation with the Xbox 360 peripheral in order to test its speech recognition capabilities. The surprisingly cheerful documentation included with the supposed test kit explicitly warns against tilting the camera manually, as it's already equipped with a "motorized tilt mechanism" -- all the better to see you with, my dear. The "Quick Start Guide" also shows how the early model of the camera connects to an Xbox 360 development kit via USB and a power outlet via a split cable. It's not known how representative these photos are of early Project Natal development kits, nor how much of it will change by the time the final product arrives this holiday. Microsoft did not comment on the veracity of the images, with a representative telling Joystiq: "We announced earlier this year that Project Natal will launch this holiday, and our teams are working hard to bring the best experiences to life. We have nothing further to announce at this time." [Via Engadget]

  • Dragon Dictation comes to the iPad

    by 
    Mel Martin
    04.02.2010

    This should get a lot of iPad owners excited. Dragon Dictation, which has been so popular on the iPhone, is now available for the iPad. The app is free for a limited time, and adds some new features to the original iPhone version. The app also includes a new Dragon Dictation Notes feature that lets users speak and save drafts of documents, emails, to-do lists, social media status updates, and more. "Dragon Dictation has proven to be a must-have app for iPhone and iPod touch, so it made sense to immediately extend those benefits to iPad," said Michael Thompson, senior vice president and general manager, Nuance Mobile. "The iPad is a unique and remarkable device, and with the Dragon Dictation App users can experience added flexibility and convenience as they quickly convert speech into text." Dragon Dictation has also been updated for the iPhone today. The update note says it has bug fixes.

  • Siri updated for iPod touch and gets some new features

    by 
    Mel Martin
    03.05.2010

    Siri for the iPhone was quite a hit when it came out earlier this year. You could ask it questions like, "where is the best pizza nearby?" and Siri would find the answer. My favorite response was when I asked if there was a God, and Siri gave me directions to the nearest churches. As much as people loved the app, iPod touch owners were left out in the cold. Not now. The app has been updated to run on the iPod touch with OS version 3.0 or above. And if you've already been using it, the app has been improved with more data, a larger vocabulary and some improvements to its reasoning algorithms. You can also give it integer math problems and you'll get an answer. The app uses Microsoft Instant Answers from Bing for the heavy lifting. Siri uses the speech recognition from Nuance Communications, which also powers the Dragon Dictation and Dragon Search apps for the iPhone and iPod touch. It is uncannily accurate in my daily use, so Siri has gotten a bit smarter and learned a few new tricks. For free, it's a must have.

  • $2 Sensory chip could give toys (and other products) improved speech recognition, additional capabilities

    by 
    Donald Melanson
    02.17.2010

    Sensory Inc. may stay behind the scenes most of the time, but the company's speech recognition chips are already used in toys from JVC, Mattel, Hasbro and others, and it's now announced a new chip that could lead to toys with some significantly improved capabilities. Costing just $2 apiece (in quantities over 100K/year), the company's NLP-5X chip not only boasts support for speech recognition and text-to-speech that lets it "generate thousands of voices on the fly," but support for sound samples and MIDI playback as well. What's more, the chip uses what's described as an "incredible algorithm" that allows it to be on all the time and simply listen and activate itself when needed -- or when you least suspect it. Of course, while toys are one application, the company also sees the chip being used in a whole range of other consumer electronics -- Sensory even gives the example of an internet-connected oven that could let you look up a recipe and then have a conversation with your oven about how you'd like to cook it.

  • Nuance acquires MacSpeech

    by 
    Mel Martin
    02.16.2010

    Nuance Communications, the company behind Dragon Dictation and Dragon Search for the iPhone, has acquired MacSpeech, the company that makes MacSpeech Dictate and other voice recognition apps for the Mac platform. The first product from MacSpeech was iListen, which was available until 2008. At that time it was the only speech recognition app that could provide dictation services for the Mac after IBM discontinued ViaVoice. iListen was replaced with MacSpeech Dictate, and the company licensed the Dragon recognition engine created by Nuance for the program. MacSpeech Dictate was a big improvement over iListen, but it still wasn't as powerful or as full-featured as the Dragon versions running on the Windows platform. That's all going to change. Last week I talked with Peter Mahoney, a Senior Vice President at Nuance, who told me the acquisition of MacSpeech will speed up the flow of new features to MacSpeech Dictate. At some point the program will acquire the Dragon name. Mahoney told me we can expect to see a macro scripting language, integrated support for digital recorders, and accuracy improvements. Nuance made a big splash on the iPhone platform with Dragon Dictation [iTunes link] and Dragon Search [iTunes link]. Nuance also provided the speech recognition for Siri [iTunes link], which has received rave reviews.

  • Google's Nexus One censors your voice-to-text input, we #### you not

    by 
    Richard Lai
    01.24.2010

    It'd be kinda funny if someone was live-bleeping your profanity, right? Sure, but five minutes later you'd sober up to regret and lingering annoyance. Turns out the Nexus One does it for real, courtesy of Google's speech-to-text engine -- it replaces notorious curses like the F and S words with a '####,' which is a more dramatic take on the Zune HD's now-obsolete Twitter censorship. As silly as this sounds, Google has come up with a good reason: "We filter potentially offensive or inappropriate results because we want to avoid situations whereby we might misrecognize a spoken query and return profanity when, in fact, the user said something completely innocent." Kudos for caring, but it wouldn't hurt to have an on / off option either -- after all, it's not like we're asking for pinch-to-zoom here, and we'll promise to use a swear jar.

  • Dragon Dictation and Search now updated, supports iPod touch

    by 
    Mel Martin
    01.11.2010

    If you lust after Dragon Dictation [iTunes link] and Dragon Search [iTunes link] and own an iPod touch, your prayers have been answered. Nuance, the creator of both apps, now has updated versions of the free apps that allow 2nd and 3rd generation iPod touch devices to dictate and search all they want. Of course, you'll need a microphone if you don't use the Apple-included headset/mic. iPod touch users were sorely disappointed when the Dragon apps came out last month, but they should be happy now. In addition to the iPod touch support, the new version of Dragon Dictation has an enhanced UI, and now the app can figure out that you are done dictating when there is silence. This is configured on the iPhone settings menu, rather than in the Dictation app itself. There is also an opt-out button if you don't want the app to send your list of contacts to the Nuance server for enhanced recognition. Dragon Search also has an updated UI and sports some bug fixes. I think the major complaint against the Dictation app is the 20-second limit on length of the audio clip that will be processed into text. That may be to keep the bandwidth to the Nuance servers low, but I think it is the only real weakness the app has. I think it's likely we'll see more updates of these apps with extended features. The apps are free for now, but Nuance has said they may not be free forever, so if you crave an app to send a quick email or text, or search the web using only your voice, best to get off the dime and download these puppies.

  • January 1 reflections on my favorite things

    by 
    Mel Martin
    01.01.2010

    January 1 is always a little strange. A quiet time after a night out, a time to take the tree down and deal with all the green light cords that started out so neatly applied and wind up a tangled maze of complexity. Time to get rid of all the holiday wrappings and hope the trash pickup is soon. It is also a time to reflect on all things Apple and how the ecosystem of products has changed our lives in ways we sometimes forget or are barely aware of. This morning I was in a melancholy mood and needed some music to match. I thought some music by Eric Ewazen [iTunes link], who writes some pretty deep and mystical compositions, would be a good idea. I had already bought some of his tracks from the Apple Store. In the old days, waking up on a holiday and craving some music you don't have was a lost cause. Now I can get what is admittedly an obscure album of music, download it to my computer, put it on my Sonos system with a few clicks and sync it to my iPhone for my morning jog. Basking in the early morning Arizona light, I loved hearing Ewazen's 'Hymn for the Lost and the Living' while contemplating a new year with new challenges. Apple enabled much of what I was able to do, and we take it for granted, but when you stand back from it all you can see how much our lives have changed. Some of my other favorite things from this year include MacSpeech Dictate, software that allows me to reliably dictate my emails, some longer reports, and even some of my TUAW posts. It's truly science fiction in the here and now (or is it 'hear' and now?) and some updates in 2009 made it easier to use and far more accurate.

  • ShoutOUT TXT brings voice recognition to SMS messaging

    by 
    Mel Martin
    12.24.2009

    ShoutOUT TXT is a new app for the iPhone [iTunes link] that lets you dictate text messages to your iPhone and send them just as you would with regular SMS text messaging. You set up an account and text away, using your existing contact list, or entering any phone number that can be texted. After a quick setup, I could see that the voice recognition was pretty good. The app is U.S. $0.99 and you get 25 text messages free. The catch is that you then have to make in-app purchases for messages beyond the free 25, paying $4.99 for 250 messages, or $1.99 for 50. At those levels, those rates are cheaper than the AT&T rates, so if you don't have a text plan, or are maxed out, it isn't a bad deal. On the other hand, AT&T charges $15.00 for 1500 text messages (with no voice recognition of course). 1500 dictated messages on this app would be about $30.00 (six 250-message packs at $4.99). If you just want to send typed text messages there is no charge, which is certainly cheaper than AT&T. Since this app uses its own server, AT&T is bypassed. It's a bit of a mystery how this app got approved. It certainly duplicates some basic functions of the iPhone, and AT&T can't be all that happy about it. It gets harder and harder to understand the app store rules, which seem to be in a state of perpetual flux. Who is this app for? Heavy texters who don't have a plan now, or keep running over their AT&T allotment. Of course, if you want speech-to-text you can use Dragon Dictation, which is free and supports text messaging, but you'll still be paying for every message you send. In my tests the app worked as advertised, with good recognition, and I was notified of incoming texts. If you are texting to unusual names, it probably won't recognize them, but you can edit any text before it goes out. I have a basic AT&T plan, and don't see the need to add something like this, but I think it would work well for some. I don't have any feeling on how reliable the servers that power this app are. If they are good, it could be a winner for many iPhone users. The app currently supports North American English. It works on AT&T in the states, and on Bell Mobility, Rogers, and Virgin Mobile in Canada. Here's an FAQ if you want to learn more.
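
    For anyone who wants to check the pricing comparison, here is a minimal sketch of the arithmetic using only the figures quoted in the post. It assumes messages are bought in whole packs and ignores the $0.99 app price and the 25 free messages; the function name is made up for illustration. Prices are kept in cents to avoid floating-point rounding.

```python
def cost_cents(messages, pack_size, pack_price_cents):
    """Cost in cents of buying enough whole packs to cover `messages`."""
    packs = -(-messages // pack_size)  # ceiling division
    return packs * pack_price_cents

if __name__ == "__main__":
    messages = 1500
    options = {
        "ShoutOUT TXT, 250-message packs": cost_cents(messages, 250, 499),
        "ShoutOUT TXT, 50-message packs": cost_cents(messages, 50, 199),
        "AT&T 1500-message plan": 1500,  # $15.00, as quoted in the post
    }
    for name, cents in options.items():
        per_message = cents / messages
        print(f"{name}: ${cents / 100:.2f} total, {per_message:.2f} cents per message")
```

    Six 250-message packs come to $29.94, which is where the roughly $30.00 figure comes from; covering the same 1500 messages with 50-message packs would cost about twice that.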

  • Dragon Dictation comes to the iPhone. Wow.

    by 
    Mel Martin
    12.08.2009

    Put this into the 'I didn't think they could ever get this to work on an iPhone' category. I'm talking about Dragon Dictation [iTunes link] from Nuance, the developers of the very popular Dragon NaturallySpeaking for the PC. Nuance also provides the speech recognition engine for MacSpeech Dictate on the Mac platform. To dictate on the iPhone you just launch the app, press the record button, and start talking. Your dictation can be a brief sentence, or a much longer treatise. Once the text has been created from your speech, it's possible to email it, send it as a text message, or put the result in your clipboard. After recording your message, you can edit the resulting text before you send it off for others to read. It's pretty slick! When you record your message, it is quickly transmitted to Nuance servers where a speech recognition algorithm is run against your data. The resulting text is returned to your iPhone very quickly; my informal benchmarks showed that it took about a second for text to be processed on a Wi-Fi network, and less than 5 seconds over 3G. You'll need a data connection for the app to work, but having this speech-to-text capability is going to be very important to a lot of people, who will find all sorts of uses for it. I tested the app for about a week and found the accuracy to be very good. Accuracy diminishes if you are in a very noisy environment, as I found when I tried some dictation while being driven down the interstate. There were a few errors, but they were easy to correct. To add punctuation to your text, you can say 'period', 'question mark', or 'new paragraph,' and Dragon Dictation adds the appropriate punctuation.
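
    To picture how spoken punctuation might be handled, here is a minimal sketch that post-processes a raw transcript, swapping the three spoken commands mentioned above for the marks they stand for. This is only an illustration under simple assumptions (plain text in, regular-expression substitution), not how Nuance's engine actually works, and the function name is made up.

```python
import re

# The three spoken commands named in the post, mapped to what they insert.
SPOKEN_PUNCTUATION = {
    "new paragraph": "\n\n",
    "question mark": "?",
    "period": ".",
}

def apply_spoken_punctuation(transcript):
    """Replace spoken punctuation commands in a transcript with symbols."""
    text = transcript
    for phrase, symbol in SPOKEN_PUNCTUATION.items():
        # Swallow the space before the command so "fine period" becomes "fine."
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, symbol, text, flags=re.IGNORECASE)
    # Drop any spaces left at the start of a new paragraph.
    text = re.sub(r"\n\n\s+", "\n\n", text)
    return text.strip()

if __name__ == "__main__":
    print(apply_spoken_punctuation("how are you question mark I am fine period"))
```

    A real dictation engine also has to tell a command apart from the ordinary word (as in "the Victorian period") and handle capitalization, neither of which this sketch attempts.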

  • Voice on the Go makes your cellphone safer in the car

    by 
    Mel Martin
    10.30.2009

    Voice on the Go has been out for quite a long time, and I'm surprised we never reviewed it. Imagine getting your emails and texts read to you while you drive, and creating and sending emails and texts while never touching your cellphone. Recently a friend suggested I give it a try, so I did and found there was actually a new iPhone app [iTunes link] that supported it. Here's what Voice on the Go is all about. You sign up, choose a local number to connect to them, and assign yourself a 4-digit passcode. If you live in a smaller town and there isn't a local number for Voice on the Go, you can call any of the numbers. If you're on a national cell plan there won't be any extra cost. You then go to the Voice on the Go website and put in your email details, and you can upload a CSV file that contains your contacts. This is much easier if you have an iPhone, so more on that later. Once you are set up and in the car, you can call Voice on the Go, and an automated attendant will ask for your passcode. You'll then be told if you have any emails or SMS messages. You can listen to them, skip them, delete them, or, the really nice feature, respond to them. You do it all by voice, with simple and obvious commands. You dictate your mail, and the Voice on the Go software turns it into text and sends it off to the proper destination. As an added feature, your email gets an audio attachment so the person can listen to what you said. How accurate is the transcription? Very. I sent about a dozen emails and every word was correct. That was calling from a noisy moving car using the Bluetooth speakerphone. A couple of times, when I was on a rough patch of road and issued a command, the attendant would ask me to repeat something, but the system always got it on the second try.

  • iSpeak: Voice dialing for iPhone 3G

    by 
    Steve Sande
    07.09.2008

    Sunday night on the TUAW Talkcast, we were discussing how much fun this week was going to be from the iPhone software perspective. This announcement from Fonix Speech is exactly what we were talking about. Fonix iSpeak is a voice activation application for the iPhone 3G. There are a couple of operations that you'll be able to accomplish just by speaking a command. You can dial someone by saying a phone number or the name of a person in your Contacts list. You'll also be able to whiz through your music library, play a song, or start up a playlist by saying the name of an artist, song, or playlist. According to Fonix Speech, Fonix iSpeak "includes a run-time engine that sits on the phone allowing users to interact with the personal contents of their Apple iPhone™. Unlike other voice applets that enable voice search of the Internet by sending commands over the airwaves, this client-side application gives users the power of voice interaction with their personal content and eliminates network latency." There's no word on when the app will actually be available nor is there a price on the website, and the company didn't respond to a phone call. Fonix Speech says that they'll be selling it "directly" and through "traditional Apple distribution channels" -- the App Store, perhaps?