text-to-speech

Latest

  • Amazon bringing Voice Guide and Explore by Touch features to Kindle Fires for vision-impaired users (update)

    by 
    Michael Gorman
    12.06.2012

    Amazon's been attuned to the needs of its vision-impaired customers for years, first rolling out text-to-speech technology on its original Kindle e-reader. Today the company revealed plans to add to that feature set in its Kindle Fire and Fire HD (7-inch) tablets with Voice Guide and Explore by Touch technology. Voice Guide's an improvement upon regular text-to-speech tech that reads aloud any action performed by users -- things like announcing app names and book titles when they're selected. Explore by Touch lets folks swipe their fingers across their Fire's display and identifies each onscreen item as their fingers pass over it. Once they know which app or piece of content they're touching, a simple tap opens the item. Ready for the new assisted navigation experience right now? Well, all you anxious Fire owners will have to wait: the update doesn't land until early next year. Update: The good folks at Amazon reached out to let us know that the Kindle Fire HD 8.9 already has both Explore by Touch and Voice Guide.

  • Evernote adds text-to-speech to Clearly Chrome extension, for Premium members only

    by 
    Nicole Lee
    11.28.2012

    If you've ever wanted to catch up on your online reading while on the treadmill or puttering about the kitchen, Evernote now offers you the ability to do so without actually, well, reading. The online brain dump has introduced text-to-speech functionality to its Clearly extension for Google Chrome, a plugin that clears out ads and other distractions for a clean reading experience. While the Clearly extension itself is free, the text-to-speech feature is only for Premium accounts, each of which costs $5 a month or $45 a year. Words are highlighted as they're read, and you can pause and skip as you like. The feature launches with support for over twelve languages and is powered by iSpeech, which has worked with BlackBerry apps and connected homes in the past. Just don't accidentally blast TMZ articles during your next conference call, OK?

  • Prizmo is a powerful OS X scanning app

    by 
    Mel Martin
    11.07.2012

    Prizmo 2 is a scanning application with Optical Character Recognition (OCR) and several unique features that will attract those who do moderate scanning. Additional options are available from the Pro-pack via in-app purchase. Prizmo 2 is currently available from the App Store for US$24.99, which is a limited half-price sale.

    Prizmo recognizes most scanners, and also works with digital images and PDF files. To get started, select New from the File menu, then drag and drop your file onto a target. If you have a scanner, you can initiate a scan on that hardware. Press the recognize button and you are on your way.

    Prizmo 2 can recognize business cards and differentiate between text, images and numbers. You can output your capture as JPEG, PNG, TIFF, PDF, RTF or plain text. You can also export to Evernote, Dropbox, Google Drive and WebDAV services. In addition, the system recognizes 40 languages, can translate 23 languages on the fly and read documents aloud via text-to-speech.

    I tested Prizmo 2 with PDF files, screen captures and my Epson XP-400 printer/scanner. The results were highly accurate and the OCR speed was very fast. The app took about 1.5 seconds to recognize the text in a single document. The text-to-speech was easy to understand. As it reads, the software highlights the words it is reading on-screen.

    If you are a heavier-duty OCR user, there is a $24.99 in-app purchase that adds batch document processing, Automator actions and some custom export scripts. The basic version will be fine for most general users.

    The only thing I would improve is the use of the non-intuitive File>Open command. Since I'm not much of a documentation reader, I scanned a file into a JPEG and used File>Open to import it. That's not what you do. Instead, you choose File>New to import files. It was easy once I figured it out.

    Other than that, Prizmo 2 is powerful and reliable. If you need its features, I can recommend it without reservation. You can find a detailed feature list and video demos at the Prizmo website. The company also has an iOS scanning app which I reviewed in 2010. It was also useful and reliable.

  • OLPC delivers big OS update with text-to-speech, DisplayLink and WebKit

    by 
    Jon Fingas
    09.02.2012

    While most of its energy is focused on the XO-4 Touch, the One Laptop Per Child project is swinging into full gear for software, too. The project team has just posted an OS 12.1.0 update that sweetens the Sugar, at least for present-day XO units. As of this latest revamp, text-to-speech is woven into the interface and vocalizes any selectable text -- a big help for students who are more comfortable speaking their language than reading it. USB video output has been given its own lift through support for more ubiquitous DisplayLink adapters. If you're looking for the majority of changes, however, they're under-the-hood tweaks to bring the OLPC architecture up to snuff. Upgrades to GTK3+ and GNOME 3.4 help, but we're primarily noticing a shift from Mozilla's web engine to WebKit for browsing: although the OLPC crew may have been forced to swap code because of Mozilla's policies on third-party apps, it's promising a much faster and more Sugar-tinged web experience as part of the switch. While they're not the same as getting an XO-3 tablet, the upgrades found at the source link are big enough that classrooms (and the occasional individual) will be glad they held on to that early XO model.

  • Samsung announces Drive Link, a car-friendly app with MirrorLink integration

    by 
    Jamie Rigg
    08.28.2012

    Until self-driving cars become mainstream, it's best to keep eyes on roads and hands off phones. With this in mind, Samsung's debuting Drive Link, an app that balances in-car essentials with driver safety, complete with approval from the no-nonsense Japan Automobile Manufacturers Association. It's all about the bare essentials -- navigation, hands-free calling and audiotainment from your phone-based files or TuneIn. Destinations can be pulled from S Calendar appointments or texts without trouble, and the text-to-speech feature means you won't miss a message, email or social media update. The best bit is that via MirrorLink, all these goodies can be fed through compatible dash screens and speaker systems. Drive Link is available now through Sammy's app store for Europeans sporting an international Galaxy S III, and will be coming to other ICS handsets "in the near future."

  • Apple seeks patent for hearing aids that deliver speech at an even keel

    by 
    Jon Fingas
    08.23.2012

    Although they're called hearing aids, they can sometimes be as much of a hindrance as a help. Catch an unfamiliar accent and all the attention might go to just parsing the words rather than moving the conversation forward. Apple is applying for a patent on a technique that would take the guesswork out of listening by smoothing out all the quirks. The proposed idea would convert speech to text and back, using the switch to remove any unusual pronunciation or too-quick talking before it reaches the listener's ear. Not surprisingly for a company that makes phones and tablets, the hearing aid wouldn't always have to do the heavy lifting, either: iOS devices could handle some of the on-the-fly conversion, and pre-recorded speech could receive advance treatment to speed up the process. We don't know if Apple plans to use its learning in any kind of shipping product, although it's undoubtedly been interested in the category before -- and its ambitions of having iPhone-optimized hearing aids could well get a lift from technology that promises real understanding, not just a boost in volume.

  • Arduino-based SocialChatter reads your Twitter feeds so you don't have to (video)

    by 
    Jamie Rigg
    08.16.2012

    If you prefer reading your RSS feeds without the backlight, there's hardware for that, and if you'd prefer not reading your Twitter feeds at all, there's now hardware for that as well. Mix an Arduino Ethernet board, an Emic 2 Text-To-Speech Module and the knowhow to put them together, and you've got SocialChatter -- a neat little build that'll read your feeds aloud. The coding's already been done for you, and it's based on Adafruit's own Internet of Things printer sketch with a little bit of tinkering so nothing's lost in translation. If your eyes need a Twitter break and you've got the skills and kit to make it happen, head over to the source link for a how-to guide. Don't fill the requirements? Then jump past the break to hear SocialChatter's soothing voice without all the effort.

  • iSpeech intros voice recognition platform for connected homes, enables vocal control of TVs and appliances

    by 
    Edgar Alvarez
    07.19.2012

    We've been seeing the growing trend of peculiar services like Cupertino's Siri, Samsung's S Voice and Google Now on mobile devices, but up until now, we have yet to spot something similar in the world of connected homes. Having previous experience in the text-to-speech department, iSpeech is hoping to be able to do just that with the world debut of its voice recognition platform for smart households. With iSpeech Home, the company's aiming to give OEMs and manufacturers a canvas where they can implement voice recognition software into TVs, home entertainment systems, lighting, refrigerators and even washers and dryers -- which would, according to iSpeech, open the doors to natural language commands such as "Watch ESPN" or "Turn off the lights in the living room." As exciting as it all sounds, the company's COO Yaron Oren did tell us there aren't any official partners on board at the moment, but that he does expect to have iSpeech Home-powered products within the next 6-12 months.

  • Perkins Smart Brailler helps the blind learn to type, closes the digital divide

    by 
    Jon Fingas
    07.18.2012

    Most digital Braille devices are built on the assumption that the legally blind already know how to write in the format -- if they don't, they're often forced back to the analog world to learn. PDT and Perkins hope to address that longstanding technology gap with the Perkins Smart Brailler. Going digital lets Perkins build in lessons for newcomers as well as provide immediate audio feedback (visual for writers with borderline vision) and text-to-speech conversion to give even an old hand a boost. Logically, the leap into the modern world also allows transferring documents over USB along with traditional Braille printouts. Smart Braillers will cost a weighty $1,995 each when they first ship in September, but it's hard to put a price tag on mastering communication and fully joining the digital generation.

  • BMW's 3 and 7 Series to be the first with Nuance's Dragon Drive! Messaging aboard

    by 
    Edgar Alvarez
    07.09.2012

    It somehow feels like it was only yesterday that Nuance unveiled its Dragon Drive! creation to the world, hoping in the process to make drivers' lives easier by delivering a fresh eyes / hands-free messaging system inside connected cars. Unfortunately, back then the savvy company didn't announce any partnerships with auto manufacturers -- still, we had a feeling it wouldn't be too long before one of them would want to come along for the voice dictation ride. The good news is that's about to change pretty soon. Per the outfit itself, BMW's decided to bring the Dragon Drive! tech to its 2012 7 Series later this month, with the 3 Series Touring and the eco-friendly 3 Series ActiveHybrid expected to get it "later this year." Notably, Dragon Drive! will offer multi-language support, including English, Spanish, Italian, French and German. There's no word yet on just how much the fee for the service will be, but we do know those who land themselves one of these new Beemers will get a two-month trial to take Dragon Drive! for a quick spin.

  • TUAW and MacTech interview: iSpeech

    by 
    Victor Agreda Jr
    07.09.2012

    iSpeech makes text-to-speech tech available for free to iOS developers (with support for other platforms as well), and it's the tech used in DriveSafe.ly. TechCrunch has a nice writeup of iSpeech here. In this video, Neil Ticktin (Editor-in-Chief, MacTech Magazine) interviews Yaron Oren of iSpeech at WWDC 2012. Yaron was kind enough to tell us about the company's thoughts on the announcements at WWDC, and how they will affect its plans moving forward.

  • EyeRing finger-mounted connected cam captures signs and dollar bills, identifies them with OCR (hands-on)

    by 
    Zach Honig
    04.25.2012

    Ready to swap that diamond for a finger-mounted camera with a built-in trigger and Bluetooth connectivity? If it could help identify otherwise indistinguishable objects, you might just consider it. The MIT Media Lab's EyeRing project was designed with an assistive focus in mind, helping visually disabled persons read signs or identify currency, for example, while also serving to assist children during the tedious process of learning to read. Instead of hunting for a grownup to translate text into speech, a young student could direct EyeRing at words on a page, hit the shutter release, and receive a verbal response from a Bluetooth-connected device, such as a smartphone or tablet. EyeRing could be useful for other individuals as well, serving as an ever-ready imaging device that enables you to capture pictures or documents with ease, transmitting them automatically to a smartphone, then on to a media sharing site or a server.

    We peeked at EyeRing during our visit to the MIT Media Lab this week, and while the device is buggy at best in its current state, we can definitely see how it could fit into the lives of people unable to read posted signs, text on a page or the monetary value of a currency note. We had an opportunity to see several iterations of the device, which has come quite a long way in recent months, as you'll notice in the gallery below. The demo, which like many at the Lab includes a Samsung Epic 4G, transmits images from the ring to the smartphone, where text is highlighted and read aloud using a custom app. Snapping the text "ring," it took a dozen or so attempts before the rig correctly read the word aloud, but considering that we've seen much more accurate OCR implementations, it's reasonable to expect a more advanced version of the software to make its way out once the hardware is a bit more polished -- at this stage, EyeRing is more about the device itself, which had some issues of its own maintaining a link to the phone.

    You can get a feel for how the whole package works in the video after the break, which required quite a few takes before we were able to capture an accurate reading.

  • Kindle Touch update adds Europe-friendly languages, landscape mode

    by 
    Daniel Cooper
    04.12.2012

    April 27th is nearly upon us, heralding the arrival of the Kindle Touch in Europe. Before that happens, Amazon's pushed out a software update packed with language support for the continent, landscape mode and text-to-speech, amongst others. You can manually download version 5.1.0 now or wait for the over-WiFi update in a couple of weeks. Pre-orders for the device are open as we speak, the WiFi-only model costing £109 / €129, the 3G edition costing £169 / €189.

  • Google aids accessibility with ChromeVox reader, better YouTube captions and more

    by 
    Sharif Sakr
    02.29.2012

    Engineers from Google have commandeered a booth at this year's CSUN accessibility conference and they're keen to talk up their latest efforts. For the visually impaired, there's now a beta version of a Chrome screen reader called ChromeVox (demo'd after the break), plus improved shortcuts and screen reader support in Google Docs, Sites and Calendar. Meanwhile, YouTube boasts expanded caption support for the hard of hearing, with automatic captions enabled for 135 million video clips -- a healthy tripling of last year's total. Check the source link for full details or, if you're anywhere near San Diego, go and hassle those engineers the old-fashioned way.

  • Nintendo testing classroom text-to-speech tech with DSis

    by 
    JC Fletcher
    01.30.2012

    Nintendo and Japanese telecom company NTT are working together on a voice recognition project, aimed at making it easier for students with hearing or other disabilities to keep up in classrooms. The project, which NHK reports is undergoing trials in Okinawa and Tottori Prefecture, captures instructor speech, converts it to text, and saves it to the cloud while also sending it to devices -- like the DSi. That way, students can read along, and have an automatic record of lessons. See it in action in the video on NHK's site. The downside to this plan, of course, is that it creates a situation in which a student is expected to be paying very close attention to his or her DS in the back of the classroom -- a situation ripe for abuse of the Retro Game Challenge variety.

  • Ford brings Bluetooth text message readouts to more SYNC vehicles

    by 
    Sharif Sakr
    10.18.2011

    Got a SYNC-tastic Ford from 2011 onwards? Then you'll find that the latest update (G1 V3.2.2) to the dash software will let you listen to your smartphone's incoming emails and SMS messages via the car's audio system, thanks to the inclusion of Bluetooth MAP (Message Access Profile). We've already seen the tech running in BMW's iDrive dash system and in MyFord Touch-equipped cars too, so the news here is just a wider roll-out to a bigger range of vehicles -- but we'll welcome anything that keeps more eyes on the prize. Read the full PR after the break and then enter your VIN at the More Coverage link below to see if you're eligible.

  • Changes to Nuance developer program will result in a flood of voice enabled apps

    by 
    Mel Martin
    09.27.2011

    The company behind the Dragon speech recognition applications for computers and iOS devices has announced a new developer program that will allow software to access Dragon Voice technology at no charge. It could result in a tidal wave of apps that harness the power of the Nuance speech recognition and text-to-speech technologies. Many of our readers have no doubt used Nuance tech in apps like Siri and Dragon Go. I talked with Kenneth Harper, Senior Product Manager for Nuance, who says opening up the technology is a way to help Nuance become an even bigger standard in voice technology, as well as introduce developers to the company. Harper says that the free developer service, called NDEV Silver, will apply to about 90% of the app developers for iOS. Developers will also have free access to Nuance's connected text-to-speech (TTS) capabilities in over 30 languages, bringing natural sounding text-to-speech in the cloud. Further, NDEV Silver members get access to Bluetooth hands-free voice applications. For larger corporate customers, Nuance will offer higher levels of service at what it calls the Gold and Emerald levels, but even these services will cost much less than the previous developer programs Nuance has offered. Harper wouldn't comment on how all this will tie in with rumored voice technology built into iOS 5 and new hardware that Apple is expected to announce soon, but since Apple now owns Siri, and has used Nuance technology in the past, it is likely there will be synergies. Many developers will leap at the chance to add very sophisticated speech features to their apps, and iPhones are likely to get even more useful. The new developer program will also support Android and Windows Phone 7.

  • Leak: future iOS update to introduce Siri-based voice control

    by 
    Sean Buckley
    07.25.2011

    When Apple snatched up Siri back in April, we had to wonder exactly what Cupertino was planning for the voice controlled virtual assistant. The answer, according to a new leak, is unsurprisingly obvious: iOS integration. A screenshot leaked to 9to5Mac flaunts an "Assistant" feature presumably built into a firmware update. To back up the screenshot, the aforesaid site dove into the iOS SDK and uncovered code describing Siri-like use of the iPhone's location, contact list, and song metadata. The code also outlined a "speaker" feature, opening a door for further Nuance integration in Apple products. Sound awesome? Sure it does, but keep it salty: 9to5's source says the assistant feature only just went into testing, and may not be ready in time for Apple's next big handset upgrade. Hit the source link to see the code and conjecture for yourself.

  • OS X Lion introduces new, multilingual, high-quality text-to-speech voices

    by 
    Chris Rawson
    07.24.2011

    First announced in March, then found in developer previews, one of the little-heralded new features of OS X Lion is its inclusion of several high-quality text-to-speech voices in 22 different languages. The last major addition to Apple's built-in OS X voices was Alex, a higher-quality voice included in Mac OS X Leopard back in 2007. While Alex was a breakthrough for text-to-speech Mac voices at the time, the over 50 new voices included in Lion outmatch him in several key ways.

    These new voices, sourced from Nuance, are not only available in several dialects of English but also, in an OS X first, in several other languages. Text-to-speech voices are now available in Arabic, three different Chinese dialects, Czech, Danish, two varieties of Dutch, Finnish, two French dialects, German, Greek, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Norwegian, Polish, two Portuguese dialects, Romanian, Russian, Slovak, two Spanish dialects, Swedish, Thai, and Turkish.

    Like a few other features of OS X Lion, Apple hasn't made these new voices easily discoverable unless you know where to look for them. It's also a bit of a misnomer to say they're "included" with OS X, as they are not included in the standard Lion install and require a separate download. In the Speech pane of System Preferences, clicking on the Text to Speech tab gives you an option for "System Voice" in a pulldown menu. This will likely be set to "Alex" by default. Clicking on "Customize" gives you access to the plethora of new optional voices, and you can play previews of each one before downloading them. (You can also listen to previews of these voices at NextUp.)

    Most of these new voices sound astonishingly natural, especially compared to the old, robotic, pre-Alex voices that were the bread and butter of text-to-speech in OS X's distant past. In particular, the Australian English "Lee" voice (now my default) and Mexican Spanish "Javier" sound incredibly lifelike to my ears.

    Selecting a checkbox next to a voice and clicking "OK" will present an alert asking if you're sure you want to download the voice. You'll find this alert welcome, because these high-quality voice files are huge, generally in the neighborhood of 350 to 500 MB each. If your bandwidth or hard drive space is limited, I wouldn't recommend downloading more than a few of these voices.

    I've generally shied away from utilizing OS X's text-to-speech functions in the past, because even "Alex" sounded jarringly artificial to me. The new voices aren't perfect and don't fill every dialectal niche (Richard Gaywood was dismayed there was no "Welsh English" voice, and I'm having to make do with Australian Lee rather than a full-fledged "Kiwi English" voice). That said, many of the new voices sound natural enough that having my Mac "talk" to me is now a useful feature, even though I don't have any accessibility requirements that make them necessary as they are for some users. In particular, Australian voice "Lee" makes my MacBook Pro sound like a bloke worth taking down to the pub for a pint, and that's a feature definitely worth having.

  • Nuance buys SVOX ahead of iOS 5 release

    by 
    Mike Schramm
    06.16.2011

    There's a whole trail of rumors hinting at an upcoming deal between speech recognition company Nuance and Apple. For quite a while now (ever since Apple picked up personal assistant software maker Siri), the scuttlebuzz has claimed that the folks in Cupertino would make a deal with Nuance for some kind of speech recognition, most likely an iOS-level integration that would allow you to ask your iOS device for whatever you want, and get it quickly and easily. But even if that deal is on, it hasn't slowed Nuance down. The company has acquired another speech recognition firm, SVOX, the creators of high-end speech recognition and text-to-speech services. That's a natural fit for Nuance, of course, and the release says that the new deal "will advance the proliferation of voice in the automotive market, and accelerate the development of new voice capabilities that enable natural, conversational interactions between consumers and their connected cars, mobile phones, and other consumer devices." Sounds exciting to us. We didn't actually get to see either Siri or an updated voice control service show up during the iOS 5 announcement at WWDC, but that doesn't mean it's completely off the table yet. Maybe a deal like this is just what Nuance needs to set up the partnership that Apple's reportedly been seeking for a while.