SpeechRecognition

Latest

  • Google Voice Search update helps you personalize your results, helps Google build another database to take over the world

    by 
    Sean Hollister
    12.14.2010

    Google Voice Actions was the first step towards our Star Trek dreams of lassoing the world with naught but vocal cords, and today Google's taken a second hop towards that inevitable future by letting Android devices record our every utterance. Yes, if you've got a handset running Froyo or better, you can download an update for Google Voice Search right now, which will let your phone dynamically personalize its speech-to-text engine to better recognize your voice most every time you use it. Of course, by so doing you're giving Google permission to record your sentences -- anonymously, of course -- to use in future products, but whether that's a problem or just a happy coincidence depends on whether you take Google at its word. We hit the "yes" button, in case you're curious. Find it on Android Market, or just use the handy-dandy QR code below.

  • Patent application suggests contextual voice commands for iPhone

    by 
    David Quilty
    12.09.2010

    A patent application filed by Apple in 2009 but just released to the public last week shows that they want to improve on the voice command abilities of the iPhone. As reported by AppleInsider, the patent looks as though it would make voice commands available in individual applications rather than system-wide, narrowing down the possibilities to a chosen few commands and drastically reducing the chances of the iPhone making a mistake. The patent also mentions allowing third-party apps to make use of voice commands, and that users could be audibly notified of what app they had selected along with a list of corresponding voice commands. This could come in really handy when driving the car or riding a bicycle, when one's eyes should be on the road and not staring at an iPhone screen. Now I don't know about you, but I have never been able to reliably use the voice command feature on my iPhone. The few times I've tried to use it, I ended up calling an ex-girlfriend when I meant to call the current one, or I called my grandmother instead of my brother. So any improvements Apple could make to voice command would be more than welcome. I have used other voice command apps like Dragon Dictation and Apple's recent acquisition Siri, but a context-based voice command system would be a great addition to the iPhone's abilities.
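
    The idea of narrowing recognition to a handful of per-app commands can be sketched roughly as follows. This is only an illustration, not Apple's method: the app names, command lists, and the `match_command` helper are all hypothetical, and fuzzy string matching from Python's standard `difflib` stands in for a real speech recognizer.

```python
from difflib import get_close_matches

# Hypothetical per-app command sets, invented for illustration only.
APP_COMMANDS = {
    "music": ["play", "pause", "next track", "previous track"],
    "phone": ["call", "redial", "hang up"],
}


def match_command(heard, active_app, cutoff=0.6):
    """Match recognized text against the active app's commands only.

    Returns the closest command, or None when nothing is similar enough.
    """
    candidates = APP_COMMANDS.get(active_app, [])
    matches = get_close_matches(heard.lower(), candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # The same utterance is accepted in one app and rejected in another.
    print(match_command("pause", "music"))
    print(match_command("pause", "phone"))
```

    Because the candidate list is small and tied to the app in the foreground, an utterance that resembles nothing in that list is rejected rather than mapped to some unrelated system-wide command.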

  • AT&T Navigator for iPhone updated, features direct speech recognition

    by 
    Steve Sande
    10.01.2010

    If you're a subscriber to the free AT&T Navigator app and the associated service, then you'll want to load the latest update ASAP. AT&T Navigator v1.7i is the newest version of the TeleNav-powered app, and it's now the first iPhone GPS navigation app that incorporates direct speech recognition. As you can see in the video above, all you need to do is tap an icon, speak your destination, and the app will display appropriate destination addresses. Tap one of the addresses, and navigation begins. The new version also provides alerts for traffic cameras, works in landscape mode, and has a lane-assist function that shows you which lane you need to be in before you get to an intersection. When you need directions back to your home location, there's a new "shake to go home" function -- just shake the iPhone, and the app knows you want directions back home. The free app works with a US$9.99 monthly service that appears on your AT&T iPhone bill. You can choose a monthly or annual subscription, and you can cancel at any time. Note that navigation is only possible in areas where you have cellular data coverage, as the maps are downloaded on demand.

  • Would you buy a voice-controlled camera, or perhaps a DSLR with touchscreen?

    by 
    Sean Hollister
    06.09.2010

    Do you talk to your digital camera? Perhaps stroke its glossy LCD? If a pair of recent patent applications are any indication, those mildly creepy gestures might one day actually do something. Sony's just laid claim to a DSLR touchscreen that can be manipulated by thumb even while the rest of one's face is smushed up against the viewfinder, and Canon's got its eye on technology that lets shooters activate advanced camera functions using simple voice control. The latter wouldn't be limited to "fire," but could potentially be directed to switch modes, stops and even zoom in and out of the frame. It wouldn't necessarily substitute for a remote as there are just two modes, "close-talking" for speech uttered when using the viewfinder, and "non-close-talking" when you line up shots on the LCD display. Neat as they are, these alternatives to physical controls make some at Engadget HQ quite sad, but we understand that minimalism is the word of the day.

  • Dragon Dictation comes to the iPad

    by 
    Mel Martin
    04.02.2010

    This should get a lot of iPad owners excited. Dragon Dictation, which has been so popular on the iPhone, is now available for the iPad. The app is free for a limited time, and adds some new features to the original iPhone version. The app also includes a new Dragon Dictation Notes feature that lets users speak and save drafts of documents, emails, to-do lists, social media status updates, and more. "Dragon Dictation has proven to be a must-have app for iPhone and iPod touch, so it made sense to immediately extend those benefits to iPad," said Michael Thompson, senior vice president and general manager, Nuance Mobile. "The iPad is a unique and remarkable device, and with the Dragon Dictation App users can experience added flexibility and convenience as they quickly convert speech into text." Dragon Dictation has also been updated for the iPhone today. The update note says it has bug fixes.

  • $2 Sensory chip could give toys (and other products) improved speech recognition, additional capabilities

    by 
    Donald Melanson
    02.17.2010

    Sensory Inc. may stay behind the scenes most of the time, but the company's speech recognition chips are already used in toys from JVC, Mattel, Hasbro and others, and it's now announced a new chip that could lead to toys with some significantly improved capabilities. Costing just $2 apiece (in quantities over 100K/year), the company's NLP-5X chip not only boasts support for speech recognition and text-to-speech that lets it "generate thousands of voices on the fly," but support for sound samples and MIDI playback as well. What's more, the chip uses what's described as an "incredible algorithm" that allows it to be on all the time and simply listen and activate itself when needed -- or when you least suspect it. Of course, while toys are one application, the company also sees the chip being used in a whole range of other consumer electronics -- Sensory even gives the example of an internet-connected oven that could let you look up a recipe and then have a conversation with your oven about how you'd like to cook it.

  • Nuance acquires MacSpeech

    by 
    Mel Martin
    02.16.2010

    Nuance Communications, the company behind Dragon Dictation and Dragon Search for the iPhone, has acquired MacSpeech, the company that makes MacSpeech Dictate and other voice recognition apps for the Mac platform. The first product from MacSpeech was iListen, which was available until 2008. At that time it was the only speech recognition app that could provide dictation services for the Mac after IBM discontinued ViaVoice. iListen was replaced with MacSpeech Dictate, and the company licensed the Dragon recognition engine created by Nuance for the program. MacSpeech Dictate was a big improvement over iListen, but it still wasn't as powerful or as full-featured as the Dragon versions running on the Windows platform. That's all going to change. Last week I talked with Peter Mahoney, a Senior Vice President at Nuance, who told me the acquisition of MacSpeech will speed up the flow of new features to MacSpeech Dictate. At some point the program will acquire the Dragon name. Mahoney told me we can expect to see a macro scripting language, integrated support for digital recorders, and accuracy improvements. Nuance made a big splash on the iPhone platform with Dragon Dictation [iTunes link] and Dragon Search [iTunes link]. Nuance also provided the speech recognition for Siri [iTunes link], which has received rave reviews.

  • Google's Nexus One censors your voice-to-text input, we #### you not

    by 
    Richard Lai
    01.24.2010

    It'd be kinda funny if someone was live-bleeping your profanity, right? Sure, but five minutes later you'd sober up to regret and lingering annoyance. Turns out the Nexus One does it for real, courtesy of Google's speech-to-text engine -- it replaces notorious curses like the F and S words with a '####,' which is a more dramatic take on the Zune HD's now-obsolete Twitter censorship. As silly as this sounds, Google has come up with a good reason: "We filter potentially offensive or inappropriate results because we want to avoid situations whereby we might misrecognize a spoken query and return profanity when, in fact, the user said something completely innocent." Kudos for caring, but it wouldn't hurt to have an on / off option either -- after all, it's not like we're asking for pinch-to-zoom here, and we'll promise to use a swear jar.

  • January 1 reflections on my favorite things

    by 
    Mel Martin
    01.01.2010

    January 1 is always a little strange. A quiet time after a night out, a time to take the tree down and deal with all the green light cords that started out so neatly applied and wind up a tangled maze of complexity. Time to get rid of all the holiday wrappings and hope the trash pickup is soon. It is also a time to reflect on all things Apple and how the ecosystem of products has changed our lives in ways we sometimes forget or are barely aware of. This morning I was in a melancholy mood and needed some music to match. I thought a good idea was for some music by Eric Ewazen, [iTunes link] who writes some pretty deep and mystical compositions. I had already bought some of his tracks from the Apple Store. In the old days, waking up on a holiday and craving some music you don't have was a lost cause. Now I can get what is admittedly an obscure album of music, download it to my computer, put it on my Sonos system with a few clicks and sync it to my iPhone for my morning jog. Basking in the early morning Arizona light I loved hearing Ewazen's 'Hymn for the Lost and the Living' while contemplating a new year with new challenges. Apple enabled much of what I was able to do, and we take it for granted, but when you stand back from it all you can see how changed our lives are. Some of my other favorite things from this year include MacSpeech Dictate, software that allows me to reliably dictate my emails, some longer reports, and even some of my TUAW posts. It's truly science fiction in the here and now (or is it 'hear' and now?) and some updates in 2009 made it easier to use and far more accurate.

  • ShoutOUT TXT brings voice recognition to SMS messaging

    by 
    Mel Martin
    12.24.2009

    ShoutOUT TXT is a new app for the iPhone [iTunes link] that lets you dictate text messages to your iPhone and send them just as you would with regular SMS text messaging. You set up an account and text away, using your existing contact list, or entering any phone number that can be texted. After a quick setup, I could see that the voice recognition was pretty good. The app is U.S. $0.99 and you get 25 text messages free. The catch is that you then have to make in-app purchases for messages beyond the free 25, paying $4.99 for 250 messages, or $1.99 for 50. At those levels, those rates are cheaper than the AT&T rates, so if you don't have a text plan, or are maxed out, it isn't a bad deal. On the other hand, AT&T charges $15.00 for 1500 text messages (with no voice recognition of course). 1500 speech-to-text messages on this app would be about $30.00. If you just want to send typed text messages there is no charge, which is certainly cheaper than AT&T. Since this app uses its own server, AT&T is bypassed. It's a bit of a mystery how this app got approved. It certainly duplicates some basic functions of the iPhone, and AT&T can't be all that happy about it. It gets harder and harder to understand the app store rules, which seem to be in a state of perpetual flux. Who is this app for? Heavy texters who don't have a plan now, or keep running over their AT&T allotment. Of course, if you want speech-to-text you can use Dragon Dictation, which is free and supports text messaging, but you'll still be paying for every message you send. In my tests the app worked as advertised, with good recognition, and I was notified of incoming texts. If you are texting to unusual names, it probably won't recognize them, but you can edit any text before it goes out. I have a basic AT&T plan, and don't see the need to add something like this, but I think it would work well for some. I don't have any feeling on how reliable the servers that power this app are. If they are good, it could be a winner for many iPhone users. The app currently supports North American English. It works on AT&T in the States, and on Bell Mobility, Rogers, and Virgin Mobile in Canada. Here's an FAQ if you want to learn more.
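
    The pricing comparison in the post can be checked with a few lines of arithmetic. The sketch below uses only the prices quoted above; the function names are made up for illustration, and it assumes messages can only be bought in whole packs.

```python
from math import ceil

# Prices quoted in the post, in US dollars.
PACKS = {250: 4.99, 50: 1.99}   # in-app message packs: size -> price
ATT_PLAN_PRICE = 15.00          # AT&T plan price quoted in the post
ATT_PLAN_MESSAGES = 1500        # messages included in that plan


def per_message(price, messages):
    """Cost of a single message, in dollars."""
    return price / messages


def cost_with_pack(total_messages, pack_size):
    """Total cost of covering total_messages with whole packs of one size."""
    packs_needed = ceil(total_messages / pack_size)
    return round(packs_needed * PACKS[pack_size], 2)


if __name__ == "__main__":
    for size, price in PACKS.items():
        print(f"{size}-message pack: ${per_message(price, size):.4f} per message")
    att_rate = per_message(ATT_PLAN_PRICE, ATT_PLAN_MESSAGES)
    print(f"AT&T 1500-message plan: ${att_rate:.4f} per message")
    print(f"1500 messages in 250-packs: ${cost_with_pack(1500, 250):.2f}")
```

    Six 250-message packs come to $29.94, which is where the roughly $30.00 figure comes from, about twice the quoted AT&T plan price for the same 1500 messages.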

  • iSpeak: Voice dialing for iPhone 3G

    by 
    Steve Sande
    07.09.2008

    Sunday night on the TUAW Talkcast, we were discussing how much fun this week was going to be from the iPhone software perspective. This announcement from Fonix Speech is exactly what we were talking about. Fonix iSpeak is a voice activation application for the iPhone 3G. There are a couple of operations that you'll be able to accomplish just by speaking a command. You can dial someone by saying a phone number or the name of a person in your Contacts list. You'll also be able to whiz through your music library, play a song, or start up a playlist by saying the name of an artist, song, or playlist. According to Fonix Speech, Fonix iSpeak "includes a run-time engine that sits on the phone allowing users to interact with the personal contents of their Apple iPhone™. Unlike other voice applets that enable voice search of the Internet by sending commands over the airwaves, this client-side application gives users the power of voice interaction with their personal content and eliminates network latency." There's no word on when the app will actually be available nor is there a price on the website, and the company didn't respond to a phone call. Fonix Speech says that they'll be selling it "directly" and through "traditional Apple distribution channels" -- the App Store, perhaps?

  • FineDigital gets official with speech-recognizing Bio GPS

    by 
    Donald Melanson
    07.01.2008

    FineDigital was showing off one iteration of a speech-recognizing GPS unit only last month, but it looks like it's already turned out a more refined version, complete with a spiffy new name. Now dubbed the FineDrive Bio, this one packs the usual 7-inch touchscreen, along with DMB mobile TV support, dual SD card slots for some added storage, and FineDigital's FineSR speech-recognition technology, which will supposedly recognize up to 450,000 words. Look for this one to hit Korea on July 7th in both 2GB and 4GB versions for 499,000 won and 549,000 won, respectively (or about $475 and $520). [Via Tech Digest]

  • Garmin's pricey nuvi 850 shows up fashionably late

    by 
    Darren Murph
    01.19.2008

    Quite frankly, we were a touch overwhelmed by the sheer quantity of new nüvis announced for CES, but apparently, Garmin has managed to recuperate from its own outpouring and is dishing out yet another newcomer. On the docket today is the nüvi 850, a Bluetooth-less navigator that attempts to compensate for its lack of handsfree support by featuring a 4.3-inch 480 x 272 resolution touchscreen, a rechargeable Li-ion good for around four hours, a microSD slot, speech recognition, a 3D map view, support for MSN Direct and a built-in media player. Additionally, you'll find an FM transmitter, audio out and an internal (read: non flip-up) antenna to ratchet the style factor up a notch. According to Garmin, this fairly potent device will be up for grabs in Q2 for upwards of $800. [Via NaviGadget]

  • Microsoft adds speech recognition to Live Search

    by 
    Michael Caputo
    11.06.2007

    If your hands and fingers are beat up from too much typing on your Windows Mobile piece, we may (or may not) have just the solution for you. Microsoft has released an updated version of Live Search, available now for both Windows Mobile 5 and 6, that incorporates speech recognition for business listings and locations. Other gee-whiz features include searches for gas prices, hours of operation for businesses, and even being able to connect to your GPS for location-based stuff.

  • IBM's SiSi virtually translates speech to sign language

    by 
    Darren Murph
    09.13.2007

    We've seen a wide array of devices designed to help the deaf communicate and experience life more fully, and IBM is hoping to make yet another advancement in the field with its SiSi (Say It Sign It) system. Developed at an IBM research center in Hursley, England, the technology works "by using speech recognition to convert a conversation into text," after which SiSi "translates the text into the gestures used in sign language and animates a customizable avatar that carries them out." Currently, the system is still labeled a prototype and only works with British Sign Language, but there are already plans to commercialize the invention in due time. For a better look at exactly what SiSi can do, take a peek at the video demonstration waiting after the jump.

  • VoiceSignal ports voice recognition software to iPhone

    by 
    Darren Murph
    08.24.2007

    Those not preoccupied with unlocking their iPhone may be interested in what VoiceSignal's talking about, as it has apparently ported several of its applications to Apple's handset. Currently deemed "proof-of-concept applications," both VSearch (speak for search keywords) and VTunes (speak a band you'd like to hear) enable users to simply talk to their mobile and allow the software to handle the rest. Of course, speech recognition apps can be explained much better with, you know, sound, so be sure and check out the video of VTunes in action after the break.

  • XNA Challenge: Abdux

    by 
    Kyle Orland
    03.07.2007

    Andre Furtado isn't an artist, as he's quick to tell me when showing off the somewhat simple drawings and animations of his XNA Challenge entry Abdux. But while visual art might not be his specialty, Furtado's work shows a certain artistry in the simple, natural input it uses. Furtado first made his mark on the XNA development world with a speech recognition modification to the platform's built-in Space War game. The mod used simple spoken commands like "move" and "fire" to control a pair of helper ships and won Furtado a Brazilian XNA competition. He hasn't gotten similar speech commands into his new alien abduction game yet, but he says he plans to let people create plagues like earthquakes with just the sound of their voice. "Perhaps in the future we will be made fun of for using keyboards and mice and gamepads to control games," he tells me. Furtado isn't in the competition for personal glory, but for experience and knowledge that he can take back to his fellow students in Brazil. "With technology like XNA, students and organizations can easily build a roadmap to game development without much knowledge of programming," Furtado said. Check out some early video of his latest creation after the jump.

  • Remote "exploit" of Vista Speech reveals fatal flaw

    by 
    Paul Miller
    02.01.2007

    Run for the hills, everybody, Windows Vista has been proven vulnerable to the hax0rs mere days after its release -- Steve Ballmer should clearly just give up now and resign while he still has a bit of dignity left. Or not. The vulnerability in question is hardly a hack at all, at least of the traditional variety; instead, this one relies on you turning up your speakers and leaving your microphone on. See, the new Windows Speech Recognition in Windows Vista has all sorts of new abilities, but unlike Mac OS speech recognition of yore, no keyword is required to make your computer start listening to what you have to say, meaning any stray word could be interpreted as a command by Windows if it has the right tone and is within Vista's repertoire. Microsoft also hasn't done anything to ensure speech recognition doesn't listen to the sounds coming out of your computer via the speakers, all of which means that if you visit a malicious website with the speakers turned up and the mic turned on (and Speech Recognition loaded, of course) an audio file could wake SR, open Windows Explorer, delete the documents folder and then empty the recycle bin. Not exactly the most likely of occurrences, but certain security types are already up in arms, and Microsoft has confirmed the potential problem, but merely recommends users turn off their speakers and/or microphone, along with killing any apps trying to attack them with such verbiage. Not the greatest vote of confidence, so perhaps we'll be seeing a fix for this from Microsoft before too long. [Via Slashdot] Read - Vista Speech Command exposes remote exploit. Read - Microsoft confirms

  • Better speech recognition through chipsets

    by 
    Donald Melanson
    08.23.2006

    Researchers at Carnegie Mellon University are hoping to do for speech recognition what graphics cards did for gaming -- that is, make it better. The idea is to use specialized computer chips in order to overcome the problems inherent with software-based solutions, which'll also lower the power consumption required and allow for better speech recognition in things like cellphones. And according to the researchers, it's working, although currently limited to a not-very-practical 1,000 word vocabulary. The technology isn't just for dictating your email though; one of the many possible applications noted is making a specific piece of dialog from a movie searchable. Wonder if they've been having any secret back room meetings with a certain rumor-happy video game company?