speech recognition

Latest

  • Facebook's latest AI can learn speech without human transcriptions

    by 
    Saqib Shah
    05.21.2021

    Facebook has developed a machine learning system that can recognize speech without the need for transcribed data.

  • Microsoft is reportedly close to buying speech tech giant Nuance

    by 
    Jon Fingas
    04.11.2021

    Microsoft is reportedly in late talks to buy Nuance for $16 billion, giving it advantages in speech tech and AI.

  • Google wants you to train its AI by lip syncing 'Dance Monkey' by Tones and I

    by 
    Christine Fisher
    09.24.2020

    Google is asking users to lip sync 'Dance Monkey' by Tones and I in order to train its AI.

  • Researchers highlight racial bias in speech recognition systems

    by 
    Rachel England
    03.24.2020

    Researchers have identified significant racial disparities in speech recognition systems from five of the world's biggest tech companies. According to a study from Stanford University, systems from Amazon, Apple, Google, IBM and Microsoft make far more errors with users who are black than those who are white.

  • Facebook will pay for user recordings to improve speech recognition

    by 
    Christine Fisher
    02.20.2020

    Facebook may have stopped listening to and transcribing Messenger voice chats, but it still needs voice recordings to improve its speech recognition technology. So the company is going to pay select users to record snippets of audio through a new program called "Pronunciations," The Verge reports. In exchange, users can earn up to $5.

  • Google will help you pronounce difficult words

    by 
    Christine Fisher
    11.14.2019

    Google wants to make it easier to learn word pronunciations. Today, it introduced a new Search feature that will let users practice saying tricky words. When you look up a pronunciation, Google will provide an answer, and when you say the word into your phone's microphone, Search will let you know if you said it correctly.

  • Here's all the important stuff Google announced at I/O 2019

    by 
    Amrita Khalid
    05.07.2019

    A better, faster, stronger Google is in store for 2019. During its I/O developer conference on Tuesday, the company unveiled dozens of updates to every corner of the Google ecosystem, from search and Google Assistant to the next generation of Android. In the keynote, Google CEO Sundar Pichai said the company is shifting from one that "helps you find answers" to one that "helps you get things done." Whether it's hailing a Lyft, translating foreign languages or transcribing video in real time, the theme today was how Google can help users perform more tasks than ever before.

  • Google trains its AI to accommodate speech impairments

    by 
    Christine Fisher
    05.07.2019

    For most users, voice assistants are helpful tools. But for the millions of people with speech impairments caused by neurological conditions, voice assistants can be yet another frustrating challenge. Google wants to change that. At its I/O developer conference today, Google revealed that it's training AI to better understand diverse speech patterns, such as impaired speech caused by brain injury or conditions like ALS.

  • Google Assistant will finally work with business G Suite accounts

    by 
    Christine Fisher
    04.10.2019

    Google has been steadily rolling out G Suite updates like AI grammar suggestions in Google Docs, streamlined two-step verification, new Tasks features and shortcuts to make Google Doc and Sheet creation faster. Today, at the Cloud Next '19 event, Google announced its newest batch of G Suite changes.

  • iTranslate Voice delivers version 2.0 with new features and faster performance

    by 
    Mel Martin
    02.13.2014

    When I looked at iTranslate Voice in 2012, I thought it was like science fiction. It lets you speak to your iOS device in one language, and hear any of 36 other languages come back. It's a powerful application for world travelers and anyone who must speak or interpret a foreign language. Version 2.0 (US$0.99) is now available. It's been redesigned for iOS 7 and is a universal binary. The app supports the new iOS 7 offline voices, making translations faster. This update also improves AirTranslate, which uses Bluetooth to communicate with another device running the app. Voice recognition is claimed to be faster and more accurate. Other useful features include the ability to repeat a spoken translation, and copying the text for use in an email or SMS. Text can be edited if need be to make changes. Voices seem very clear in this latest version, and speed is good on WiFi, 3G or LTE. Note that the app requires an internet connection to reach the translation server. I tried the app and found the speed of translation quite fast. I speak some German, and the app seemed to get the translation right in both directions. Reviews of the previous versions of the app have been strong. Of course, the app is designed for short sentences. Don't expect uninterrupted translation as someone delivers a speech, for example. There are quite a few translators out there, including a free one from Google. The Google app doesn't support cut and paste, and doesn't have iOS 7 optimizations as yet. Still, for free, it's powerful. iTranslate Voice has always been a top-rated translator, and with version 2 it's even better.

  • Rumor: New Xbox has natural speech detection, speech-to-text

    by 
    Richard Mitchell
    02.07.2013

    Microsoft's next Xbox, AKA Durango, will have much better speech detection, according to sources at The Verge. That includes natural language detection, which will allow the Xbox to process normal speech patterns, similar to Apple's Siri. So, for example, instead of saying "Xbox, play 3," you might simply say, "Xbox, where can I watch The X-Files?" Furthermore, users may be able to turn the new Xbox on using only voice. Speech-to-text is also a possibility, which would allow users to, say, compose Xbox Live messages using only voice. The Verge also mentions a possible function that would allow the Kinect to detect the number of people in a room and suggest suitable multiplayer games (hopefully not used games). An improved Kinect, possibly bundled with or built into the next Xbox, has been expected for quite some time now. As such, it's not much of a stretch to assume the upgraded device would feature improved voice recognition in addition to improved physical recognition.

  • Telenav's Scout gives iOS users offline navigation in exchange for ten bucks

    by 
    Michael Gorman
    08.14.2012

    We know that iOS 6 will bless iPhone users with some in-house-made mapping, but that hasn't stopped Telenav from bettering its own Scout navigation offering for Apple's favorite handsets. Scout now does offline navigation by letting users download maps of the west, central or eastern United States over WiFi only. Plus, Scout now takes voice commands, so on your next road trip you can tell it to find the nearest Whataburger whether you have cell signal or not. Interested parties can head on over to the App Store to get their download on, but you'll pay for the privilege -- offline navigation costs $9.99 a year or $2.99 a month, though the free, data-dependent version of Scout for iPhone still includes speech recognition. Still not sold? Perhaps the video after the break will persuade you.

  • Mountain Lion 101: Dictation

    by 
    Steve Sande
    07.25.2012

    What can I say about my love of Mountain Lion's new Dictation feature? I've wanted to be able to talk and have my words transcribed to text ever since I saw the original "Assignment: Earth" episode of Star Trek back in 1968 (image at top of post). That's actress Teri Garr talking to a typewriter, and it's transcribing her words. Now it's finally happening, and I think that's pretty cool. I know that a lot of people are unimpressed by the dictation capabilities of Mountain Lion, the iPhone 4S, and the third-generation iPad, but I'm one of those people who is both blessed with a voice that seems to be made for Siri (the brains behind Dictation) and who has practiced dictating to my Mac and iOS devices. Unlike Rich Gaywood, who stated in his big Mountain Lion review that Dictation was having trouble cutting through his Welsh accent, I seem to be having very few problems. As you'd expect, I am dictating this post on my Mountain Lion-equipped MacBook Air. By default, Dictation is turned on in Mountain Lion. To shut it off permanently or change other settings, use the new Dictation & Speech pref in System Preferences. With the pref it's possible to select the microphone used by Dictation, set the key(s) to press to activate Dictation (by default, you press the fn key on your keyboard twice), or learn more about Dictation and privacy. That last feature comes courtesy of a button on the bottom of the preference pane. Click it, and you're basically told that anything you dictate is recorded and sent to Apple to convert into text. That's right; it won't work without a live Internet connection. The Apple privacy statement also says that your computer will send Apple "other information, such as your first name and nickname; and the names, nicknames, and relationship with you (for example, "my dad") of your address book contacts." Enough about the preferences panel. Let's talk about how accurate dictation really is.
If I stop and think about what I'm trying to say to my Mac, and then speak clearly and a little bit slowly, then the accuracy rate is almost 100 percent. On the other hand, if I just start talking and stumble over what I'm saying, my accuracy suffers. Don't expect to be able to talk to your Mac for an hour and have a perfectly-typed term paper ready to submit at the end. Dictation works in 30-second chunks; any more than that and it will chime to let you know that it's done. I've found that the response time for Dictation is very fast compared to that on the iPhone 4S and third-generation iPad. In our book, "Talking to Siri", Erica Sadun and I discuss ways of improving accuracy of Siri dictation. We also talk about how to add caps and punctuation to your dictation, but you'll find that some of those commands don't work quite the same in Mountain Lion. For example, it was previously possible to say "My cat is named cap emerald" to have Siri type out "My cat is named Emerald." You no longer need to say "cap" to get Dictation to capitalize the proper name. However, none of the capitalization commands work any more. Likewise, spacing commands -- "space" and "no space" -- that used to add or eliminate spaces between words no longer work. All punctuation commands seem to be enabled from the testing I've been able to do. Dictation is one of those Mountain Lion features that you're either going to love or hate -- I'm not sure there's much of an in-between. Personally, I find it to be extremely useful, especially in combination with Messages. There's nothing more satisfying than tapping the function key twice, dictating a quick response to my wife, and then getting back to work. I'd suggest to anyone who is upgrading to Mountain Lion to at least give Dictation a try. You might find out that it works better than you think.

  • iSpeech intros voice recognition platform for connected homes, enables vocal control of TVs and appliances

    by 
    Edgar Alvarez
    07.19.2012

    We've been seeing the growing trend of peculiar services like Cupertino's Siri, Samsung's S Voice and Google Now on mobile devices, but up until now, we have yet to spot something similar in the world of connected homes. Having previous experience in the text-to-speech department, iSpeech is hoping to be able to do just that with the world debut of its voice recognition platform for smart households. With iSpeech Home, the company's aiming to give OEMs and manufacturers a canvas where they can implement voice recognition software into TVs, home entertainment systems, lighting, refrigerators and even washers and dryers -- which would, according to iSpeech, open the doors to natural language commands such as "Watch ESPN" or "Turn off the lights in the living room." As exciting as it all sounds, the company's COO Yaron Oren did tell us there aren't any official partners on board at the moment, but that he does expect to have iSpeech Home-powered products within the next 6-12 months.

  • AT&T officially releases Watson speech API, gives devs a bit of babel fish for their apps

    by 
    Michael Gorman
    07.10.2012

    Ma Bell's been hard at work on its Watson speech recognition system for years, and 2012 has seen the tech show up in an automobile and a real-time translator app. Months after announcing it would grant Watson's skills to the developer masses, AT&T has made good on its promise and officially released its Speech API. In case you forgot, AT&T's Nuance competitor's been tailored for different use cases -- including voice web search, voicemail-to-text and talk-to-text -- so that it can offer contextually accurate results in any app. If you're among the coders itchin' to test out Watson's capabilities, head on past the break for a promotional video, then click the source below to sign up for access.

  • HTC teases voice control and/or dog translator for Sense

    by 
    Terrence O'Brien
    06.22.2012

    HTC might be overselling it a bit with the top secret stamp, and the footnote sort of indicates that your next One device won't be interpreting Fido's barks. So, really, that only leaves one logical conclusion -- HTC is working on a voice control app. It shouldn't come as any surprise if you've been paying any attention to the mobile landscape these past few years. Google kicked off the party with Voice Actions and Apple gave the speech recognition tech some personality with Siri. Now Samsung has S-Voice and LG has Quick Voice... what's a Taiwanese manufacturer to do? Presumably make your own speech-driven virtual assistant. When will it debut, what will it be called? Who knows, but judging from the image above it seems safe to assume that HTC's new tool will be delivered as a software update to at least some existing handsets. [Thanks, Naman] Update: HTC tells us that it never intended to hint at a new voice service -- the image was just the punchline to a week of pet-related smartphone tips it featured on Facebook.

  • QNX's Watson-connected Porsche 911, hands-on (video)

    by 
    Terrence O'Brien
    04.19.2012

    Remember that QNX-loaded Porsche 911 we sat down with (in?) at CES? Well, it's back and it learned a few new tricks en route to New York City. The car-friendly software company got its hooks into AT&T's Watson Speech API and used it to power a new voice-command system for its own take on the "virtual assistant." Using the new speech recognition tool and Ma Bell's LTE network, QNX was able to pull up websites, find a Starbucks (though, in New York City you'd have to be blind to not find one) and place calls. All in all, the demo wasn't too different from what we saw in Vegas in January -- in fact, we wouldn't be surprised to find out that Porsche was also utilizing Watson, long before it was announced. For a familiar, but still interesting demo, check out the video after the break.

  • AT&T Translator app hands-on: smashing the language barrier (video)

    by 
    Terrence O'Brien
    04.19.2012

    Translation apps aren't exactly the newest or sexiest thing in the world of technology, but we've got to hand it to AT&T for whipping up a rather impressive demo. The company showed off a next-gen version of its AT&T Translator app, which may one day allow people to communicate in real time regardless of their spoken language. The app uses the carrier's new Watson Speech API, in this case via a VoIP call on a pair of iPads, to not only transcribe dialog, but translate it from English to Spanish (and vice-versa), then play it back in the target tongue using a computer-generated voice. This isn't like the Google Translate app on your phone -- the translation happens in near real time, with only a slight latency as your words are fed through the system. The demo wasn't without its hitches (the room was noisy and filled with bloggers toting wireless devices), but it went more or less as planned, and our gracious hosts were able to complete a call requesting a taxi cab. One day AT&T hopes to make this a standard feature of its services, eliminating the language barrier once and for all. To see the app in action check out the video after the break.

  • Microsoft strikes deal with 24/7, promises to 'redefine' customer service

    by 
    Donald Melanson
    02.07.2012

    A partnership between Microsoft and customer service company 24/7 may not exactly sound like the most exciting proposition on the face of things, but the two are making some fairly lofty promises, and Microsoft seems to be making a serious investment in the initiative. As ZDNet's Mary Jo Foley reports, part of the deal will see Microsoft send at least some of the 400 employees it brought on in its 2007 acquisition of TellMe Networks to 24/7, and it will also license some of its speech-related IP to the company (in addition to taking an equity stake in it). The goal there being to combine natural user interfaces with a cloud-based customer service platform, which Microsoft promises will "redefine what customer service looks like." To that end, it gives the example of a credit card company getting in touch with you to report suspicious behavior; rather than a phone call, you could get a notification with all the pertinent details sent directly to your phone, which could anticipate a number of potential actions and let you respond by voice (or touch, presumably). Unfortunately, while the two are talking plenty about the future of customer service, there's not a lot of word as to when that might arrive.

  • Nuance launches Dragon Go! for Android, available today for free

    by 
    Brad Molen
    01.10.2012

    As if its acquisition of Swype wasn't enough indication, Nuance has been working on its goal of dominating the Android speech recognition market, one step at a time. Today the company's pressing forward once again by introducing its Dragon Go! app for Google's mobile OS. The app focuses on verbal commands, giving you the ability to ask it to perform internet searches, make dinner reservations, buy movie tickets, play music on services like Pandora and Spotify, and the list goes on. If you crave the specific details, make your way beneath the break and have a gander at the press release below.