MIT finds smaller neural networks that are easier to train
MIT’s "Lottery Ticket" mini-neural networks train only the nodes they need.
Despite all the advancements in artificial intelligence, most AI-based products still rely on "deep neural networks," which are often extremely large and prohibitively expensive to train. Researchers at MIT are hoping to change that. In a paper presented today, the researchers reveal that neural networks contain "subnetworks" that are up to 10 times smaller and could be cheaper and faster to train.
To train most neural networks, engineers feed them massive datasets, a process that can take days on expensive GPUs. The researchers from MIT's Computer Science and Artificial Intelligence Lab (CSAIL) found that within those trained networks are smaller subnetworks that can make equally accurate predictions. CSAIL's so-called "lottery ticket hypothesis" is based on the idea that training most neural networks is something like buying all the tickets in a lottery to guarantee a win. By comparison, training the subnetworks would be like buying just the winning tickets.
The catch is that the researchers haven't figured out how to find those subnetworks without building a full neural network and then pruning out the unnecessary bits. If they can find a way to skip that step and go straight to the subnetworks, this process could save hours of work and make training neural networks accessible to individual programmers -- not just huge companies. But determining how to efficiently find subnetworks and understanding why some are better than others at learning will likely keep researchers busy for years.
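For readers curious about what the train-then-prune step might look like in practice, here is a minimal Python sketch. It is an illustration only, not the MIT team's code. The article does not say how the unnecessary parts are chosen, so the sketch assumes one common approach: dropping the weights with the smallest magnitudes. It also assumes the surviving weights are reset to their starting values before retraining. The function name `magnitude_mask`, the 10 percent keep rate and the random stand-in for trained weights are all hypothetical.

```python
import numpy as np

def magnitude_mask(weights, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude entries of `weights`.

    Assumption: pruning by weight magnitude. The article does not specify
    which criterion the researchers use.
    """
    magnitudes = np.abs(weights).ravel()
    # Number of weights to keep (at least one).
    k = max(1, int(round(keep_fraction * magnitudes.size)))
    # The k-th largest magnitude becomes the cutoff.
    threshold = np.sort(magnitudes)[-k]
    return (np.abs(weights) >= threshold).astype(weights.dtype)

rng = np.random.default_rng(0)

# Weights of one layer at initialization (hypothetical 20x10 layer).
initial = rng.normal(size=(20, 10))

# Stand-in for the same layer after full training. A real run would
# train the network here; random drift is used only to keep this short.
trained = initial + rng.normal(scale=0.5, size=initial.shape)

# Keep roughly the top 10 percent of trained weights by magnitude.
mask = magnitude_mask(trained, keep_fraction=0.1)

# Assumption: the candidate subnetwork starts from the ORIGINAL initial
# values of the surviving weights, with everything else set to zero.
subnetwork_init = initial * mask

print("weights kept:", int(mask.sum()), "of", mask.size)
```

In this toy setup the mask keeps about a tenth of the layer's 200 weights. The smaller network would then be retrained on its own, a step not shown here. Note that the sketch still needs the fully trained weights to build the mask, which is exactly the cost the researchers have not yet found a way to avoid.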