If you've ever played a YouTube video for what seems like the thousandth time to listen to your instrument's part of a composition, you'll love MIT's new AI. PixelPlayer, which hails from the institution's Computer Science and Artificial Intelligence Laboratory (CSAIL), can recognize instruments in a video, identify specific ones at the pixel level and isolate the sounds they produce. If several instruments are playing in a video, for instance, PixelPlayer will let you pick the one you want to listen to -- it will play the sounds coming from that instrument the loudest and lower the volume of everything else.
CSAIL trained PixelPlayer using a self-supervised deep learning technique, feeding it over 60 hours of videos to learn from. It's still far from perfect, though: it can only identify the sounds of 20 instruments at the moment and has trouble telling similar ones apart. With further development, it could become an effective audio editing tool, giving engineers a way to improve or restore the quality of old concert footage. It could also be used to train robots to identify various environmental sounds, such as those made by animals, vehicles and appliances.
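To get a feel for how self-supervision can work here without any hand-labeled data, here's a minimal NumPy sketch of a "mix-and-separate" setup (an assumption about the training recipe, not CSAIL's actual code): two solo recordings are artificially mixed, and the original solos then serve as free ground truth for learning to un-mix. All signal names below are illustrative stand-ins.

```python
import numpy as np

# Hypothetical mix-and-separate sketch: two known solo signals are mixed,
# and the solos become free training labels for source separation.
sr = 8000
t = np.arange(sr) / sr
violin = np.sin(2 * np.pi * 440 * t)   # stand-in "violin" tone
tuba = np.sin(2 * np.pi * 110 * t)     # stand-in "tuba" tone
mixture = violin + tuba                # synthetic mixture = free supervision

def spectrogram(x, n_fft=256, hop=128):
    """Magnitude spectrogram via a basic windowed rFFT."""
    frames = [x[i:i + n_fft] * np.hanning(n_fft)
              for i in range(0, len(x) - n_fft, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

S_mix = spectrogram(mixture)
S_violin = spectrogram(violin)
S_tuba = spectrogram(tuba)

# An oracle binary mask keeps, per time-frequency bin, whichever source
# dominates; the real system would *predict* such a mask, conditioned on
# what it sees at each pixel of the video.
mask_violin = (S_violin > S_tuba).astype(float)
est_violin = mask_violin * S_mix

# Training signal: how far the masked mixture is from the true solo.
loss = np.mean((est_violin - S_violin) ** 2)
```

Because the two tones occupy different frequency bins, the oracle mask recovers the "violin" almost exactly; a learned model is trained to drive this same loss down, which is how the network picks up separation skills from unlabeled footage.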