You normally need to listen to a song or read its lyrics to tell if it’s using explicit language, but Deezer thinks AI might be up to the job before long. The streaming music service is researching a machine learning technique that would detect explicit lyrics solely through audio. Instead of merely training an AI to recognize colorful words with a giant set of annotated samples, like you often see with machine learning, Deezer extracts the vocals and looks for instances where it’s likely that a word matches entries in a dictionary of foul expressions. From there, a simple binary classifier decides if a given word is naughty. It’s an “explainable” system that can show why the AI came to a decision, the company said.
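To make the last two steps concrete, here is a minimal sketch of the decision stage, assuming the vocal extraction and word spotting have already happened. It is not Deezer's code: the `Detection` record, the placeholder dictionary, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical placeholder dictionary; the article does not list the real entries.
EXPLICIT_TERMS = {"badword", "worseword"}


@dataclass
class Detection:
    word: str           # dictionary entry the spotter thinks it heard
    start_sec: float    # where in the vocal track the match begins
    probability: float  # spotter's confidence that the word is present


def flag_detections(detections, threshold=0.5):
    """Binary decision per detection: in the dictionary and confident enough."""
    return [
        d for d in detections
        if d.word in EXPLICIT_TERMS and d.probability >= threshold
    ]


def explain(detections, threshold=0.5):
    """Return the verdict together with the evidence that produced it."""
    flagged = flag_detections(detections, threshold)
    return {
        "explicit": bool(flagged),
        "evidence": [(d.word, d.start_sec, d.probability) for d in flagged],
    }


# Made-up spotter output for one song (assumed values, not real data).
spotted = [
    Detection("badword", 12.4, 0.91),
    Detection("badword", 47.0, 0.32),
    Detection("hello", 5.0, 0.99),
]
report = explain(spotted)
print(report)
```

Because the verdict comes with the matched word and its timestamp, a person can jump to that moment in the track and check the call, which is the sense in which a system like this is "explainable."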
The team also hopes to reduce bias and improve accuracy by using equal numbers of explicit and clean songs within each music genre, and by drawing on a variety of genres.
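One way to picture that balancing is a simple downsampling pass, sketched below. This is an assumption about how such a set could be built, not a description of Deezer's pipeline; the `balance_by_genre` helper and the `(track_id, genre, is_explicit)` tuple layout are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_genre(tracks, seed=0):
    """Keep equal numbers of explicit and clean tracks inside every genre.

    `tracks` is an iterable of (track_id, genre, is_explicit) tuples.
    The larger class in each genre is randomly downsampled to the smaller one.
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    groups = defaultdict(lambda: {True: [], False: []})
    for track in tracks:
        _, genre, is_explicit = track
        groups[genre][is_explicit].append(track)

    balanced = []
    for by_label in groups.values():
        n = min(len(by_label[True]), len(by_label[False]))
        for label in (True, False):
            balanced.extend(rng.sample(by_label[label], n))
    return balanced


# Made-up catalogue: rock skews explicit, rap is already even.
catalogue = [
    ("t1", "rock", True), ("t2", "rock", True), ("t3", "rock", True),
    ("t4", "rock", False),
    ("t5", "rap", True), ("t6", "rap", True),
    ("t7", "rap", False), ("t8", "rap", False),
]
balanced = balance_by_genre(catalogue)
print(len(balanced))
```

Balancing inside each genre, rather than only across the whole set, keeps a classifier from learning the shortcut that one genre simply means "explicit."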
Deezer is frank that the approach isn’t ready for practical use yet. While it’s much better at detecting profanity than the classic method, it still falls short of an AI that has direct access to text lyrics, let alone a human. The system as-is could be helpful all the same. Because Deezer’s AI knows where words pop up in a song, it might assist humans deciding whether a song merits an explicit tag. Ideally, the technology will advance enough that the AI can work alone. That could not only lighten the load for people tagging songs, but also reduce the chances that your kids will inadvertently hear an F-bomb in songs you thought were ‘safe’ for impressionable young minds.