Training deep learning models to recognize images, and the objects within them, takes considerable effort. Each training image often has to be labeled by humans, and with millions of images that process becomes labor-intensive. Scaling up to billions of images is nearly impossible. So Facebook has been working on a way to train deep learning models with limited human supervision. Its researchers have turned to public images that are, in a way, already labeled -- with hashtags.
With this method, Facebook researchers and engineers trained image recognition networks on up to 3.5 billion Instagram images labeled with as many as 17,000 hashtags. A computer vision system trained on one billion images and 1,500 hashtags achieved 85.4 percent image recognition accuracy on the popular ImageNet benchmark. That beat the previous state-of-the-art model, which achieved 83.1 percent, by 2.3 percentage points.
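To make the idea of hashtags-as-labels concrete, here is a minimal Python sketch of how a caption's hashtags could be turned into a training target over a fixed hashtag vocabulary. It is an illustration only, not Facebook's pipeline: the function names, the tiny three-tag vocabulary, and the choice to split the target evenly across the hashtags present are all assumptions made for this example.

```python
import re

# Matches a '#' followed by word characters, capturing the tag text.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(caption):
    """Return the lowercase hashtags in a caption, in order, without duplicates."""
    seen = []
    for tag in HASHTAG_RE.findall(caption.lower()):
        if tag not in seen:
            seen.append(tag)
    return seen


def hashtag_targets(caption, vocabulary):
    """Map a caption to a target vector over a fixed hashtag vocabulary.

    Hashtags outside the vocabulary are ignored. Each remaining hashtag gets
    an equal share of the target mass (an assumption of this sketch).
    """
    index = {tag: i for i, tag in enumerate(vocabulary)}
    present = [t for t in extract_hashtags(caption) if t in index]
    target = [0.0] * len(vocabulary)
    for tag in present:
        target[index[tag]] = 1.0 / len(present)
    return target


# Hypothetical vocabulary and caption, for illustration only.
vocabulary = ["dog", "beach", "sunset"]
print(hashtag_targets("Evening walk #Dog #beach #nofilter", vocabulary))
# prints [0.5, 0.5, 0.0]
```

The out-of-vocabulary tag `#nofilter` is simply dropped, which hints at why such labels are "weak": people tag photos for their own reasons, so the labels are noisy and incomplete compared with human annotation.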
The work shows that weakly supervised training is a viable option going forward, opening up deep learning model training to larger data sets and, possibly, more accurate image recognition and classification. Better image recognition could improve AI-generated audio captions of photos for the visually impaired, but Facebook says there are other useful applications as well. Using hashtags as labels for computer vision could affect how Facebook ranks images in feeds and improve an AI system's understanding of video footage.
"As training data sets get larger, the need for weakly supervised -- and, in the longer term, unsupervised -- learning will become increasingly vital," said Facebook. "Understanding how to offset the disadvantages of noisier, less curated labels is critical to building and using larger-scale training sets."