Disney Research's AI system knows what a car sounds like

Soon, image recognition software may be able to tell you what sound an object makes.
Sean Buckley, @seaniccus
November 16, 2016

Alexandre Meneghini / Reuters

A picture may be worth a thousand words, but sound is just as important as sight to how we experience the world -- that's why a team at Disney Research is working on a computer vision system that can recognize not only what an image shows, but also how it sounds. In an initial study presented at the European Conference on Computer Vision, the group's system successfully paired appropriate audio with images of doors closing, glasses clinking and vehicles driving down the road.

Audio association might be easy for humans, but teaching a computer to do it is challenging. Disney researchers trained the AI to associate images with sounds by feeding it a collection of videos, each showing an object making a specific sound, but background noise, narration or sounds made by other objects could easily confuse the system. When the system was fed samples with most of the uncorrelated sounds filtered out, however, it did a pretty good job of suggesting the right sound for each image. Still, the system isn't perfect: the team reports that it occasionally had trouble telling an image of a car from one of a tram, causing it to sometimes suggest the wrong sound for a particular vehicle.

Audio image recognition probably isn't useful to most of the population, but the team hopes it can be used to create an automatic Foley processing system for video production -- making it easier for editors to add sound effects during the production process. The technology may also be able to help the visually impaired by creating an image personification system, enabling them to 'hear' objects on a computer screen. Still, Disney Research has a lot of work to do before either of those futures becomes a reality.
