Machines can generate sound effects that fool humans

Sorry Foley artists, the robots are coming for your jobs, too.

Can machines come up with plausible sound effects for video? Recently, MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) created a sort of Turing test that fooled folks into thinking that machine-created letters were written by humans. Using the same principle, the researchers created algorithms that act just like Hollywood "Foley artists," adding sound to silent video. In a psychological test, it fooled subjects into believing that the computer-generated banging, scratching and rustling was recorded live.

Researchers used a drumstick (chosen for consistency and because it doesn't obscure the video) to hit various objects, including railings, bushes and metal gratings. The algorithm was fed 978 videos containing 46,620 actions, helping it recognize patterns in the audiovisual signal. Training a model to synthesize plausible impact sounds from silent videos is "a task that requires implicit knowledge of material properties and physical interactions," according to the paper.

The AI uses deep learning to figure out how sounds relate to video, meaning it finds the patterns on its own without intervention from scientists. Then, when it's shown a new, silent video, "the algorithm looks at the properties of each frame of that video, and matches them to the most similar sounds in the database," says lead author Andrew Owens. As shown in the video (above), it can simulate the differences between someone tapping rocks, leaves or a couch cushion.
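The matching step Owens describes -- comparing a frame's properties against a database and picking the closest-sounding clip -- boils down to nearest-neighbor retrieval over feature vectors. Here's a minimal, purely illustrative sketch of that idea; the function, feature dimensions and toy "sound database" are invented for this example and are not from the MIT paper, which uses learned deep features rather than hand-picked numbers:

```python
import math

def match_sound(frame_features, sound_db):
    """Return the index of the stored sound whose feature vector is
    closest (by Euclidean distance) to this frame's features."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(sound_db)), key=lambda i: dist(frame_features, sound_db[i]))

# Toy database: three "sound clips" each summarized by a 2-D feature vector.
db = [
    (0.0, 0.0),  # e.g. a dull thud on a cushion
    (1.0, 0.0),  # e.g. a rustle through leaves
    (0.0, 1.0),  # e.g. a clang on a metal grating
]

print(match_sound((0.9, 0.1), db))  # prints 1: nearest to the "rustle" clip
```

In the real system the feature vectors come from a neural network trained on the 978 videos, but the retrieval intuition is the same: similar-looking impacts map to similar-sounding clips.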

In an online study, subjects were more than twice as likely to pick the AI version over the live one as the "real" sound, particularly for non-solid materials like leaves and dirt. In addition, the algorithm can reveal details about an object from its sound: 67 percent of the time, it could tell whether a material was hard or soft.

The AI isn't perfect -- it can get faked out by a near-hit, and it can't produce sounds that aren't tied to a visible action, like a buzzing computer. However, the researchers believe the work could eventually help robots figure out whether a surface is cement-solid or has some give, like grass. Knowing that, they could predict how to step and avoid a (hilarious) accident. If the team can enlarge its database of sounds, the machines could eventually do a Foley artist's job -- with no need for coconuts.
