Facebook opens its advanced AI vision tech to everyone

Hopefully this will speed up its progress.

Senior Editor

Updated Thu, Aug 25, 2016, 12:00 PM

Over the past two years, Facebook's artificial intelligence research team (also known as FAIR) has been hard at work figuring out how to make computer vision as good as human vision. The crew has made a lot of progress so far (Facebook has already incorporated some of that tech for the benefit of its blind users), but there's still room for improvement. In a post published today, Facebook not only details its latest computer-vision findings but also announces that it's open-sourcing them so that everyone can pitch in to develop the tech. And as FAIR tells us, improved computer vision will not only make image recognition easier but could also lead to applications in augmented reality.

There are essentially three sets of code that Facebook is putting on GitHub today. They're called DeepMask, SharpMask and MultiPathNet: DeepMask figures out if there's an object in the image, SharpMask delineates those objects and MultiPathNet attempts to identify what they are. Combined, they make up a visual-recognition system that Facebook says is able to understand images at the pixel level, a surprisingly complex task for machines.
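To make that division of labor concrete, here is a toy sketch in Python. It is not Facebook's code and it uses no neural networks: the function names (`propose_regions`, `refine_mask`, `classify`), the tiny hand-made grid and the shape-based labels are all assumptions made for illustration. It only mirrors the three roles described above: find where objects might be, trace their exact pixels, then name them.

```python
from collections import deque

BACKGROUND = 0

# Hypothetical 6x6 "image": 0 is background, any other value is an object pixel.
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 5, 0, 0, 0],
    [0, 5, 5, 0, 9, 0],
    [0, 0, 0, 0, 9, 0],
    [0, 0, 0, 0, 9, 0],
    [0, 0, 0, 0, 0, 0],
]


def propose_regions(image):
    """Stage 1 (the DeepMask role): report where objects might be.

    Toy stand-in: flood-fill each connected blob of non-background pixels
    and return its bounding box as (top, left, bottom, right).
    """
    rows, cols = len(image), len(image[0])
    seen = set()
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == BACKGROUND or (r, c) in seen:
                continue
            queue = deque([(r, c)])
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] != BACKGROUND
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def refine_mask(image, box):
    """Stage 2 (the SharpMask role): delineate the object pixel by pixel.

    Toy stand-in: keep every non-background pixel inside the box. This
    assumes boxes do not overlap, which holds for IMAGE above.
    """
    top, left, bottom, right = box
    return {(y, x)
            for y in range(top, bottom + 1)
            for x in range(left, right + 1)
            if image[y][x] != BACKGROUND}


def classify(mask):
    """Stage 3 (the MultiPathNet role): say what the object is.

    Toy stand-in: a made-up label based only on the mask's proportions.
    """
    ys = [y for y, _ in mask]
    xs = [x for _, x in mask]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return "tall object" if height > width else "wide or square object"


def segment(image):
    """Run the three stages in order and return (label, mask) pairs."""
    results = []
    for box in propose_regions(image):
        mask = refine_mask(image, box)
        results.append((classify(mask), mask))
    return results


if __name__ == "__main__":
    for label, mask in segment(IMAGE):
        print(label, sorted(mask))
```

The real systems replace each of these hand-written rules with a trained network; the sketch only shows how the output of one stage feeds the next.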

"There's a view that a lot of computer vision has progressed and a lot of things are solved," says Piotr Dollar, a research scientist at Facebook. "The reality is we're just starting to scratch the surface." For example, he says, computer vision can currently tell you if an image has a dog or a person. But a photo is more than just the objects that are in it. Is the person tall or short? Is it a man or a woman? Is the person happy or sad? What is the person doing with the dog? These are questions that machines have a lot of difficulty answering.

In the blog post, he describes a photo of a man next to an old-fashioned camera. He's standing in a grassy field with buildings in the background. But a machine sees none of this; to a machine, it's just a bunch of pixels. It's up to computer-vision technology like the one developed at FAIR to segment each object out. Considering that real-world objects come in so many shapes and sizes as well as the fact that photos are subject to varying backgrounds and lighting conditions, it's easy to see why visual recognition is so complex.

The answer, Dollar writes, lies in deep convolutional neural networks that are "trained rather than designed." The networks essentially learn from millions of annotated examples over time to identify the objects. "The first stage would be to look at different parts of the image that could be interesting," he says. "The second step is to then say, 'OK, that's a sheep,' or 'that's a dog.'

"Our whole goal is to get at all the pixels, to get at all the information in the image," he says. "It's still sort of a first step in the grand scheme of computer vision and having a visual recognition system that's on par with the human visual system. We're starting to move in that direction."

By open-sourcing the project on GitHub, he hopes that the community will start working together to solve any problems with the algorithm. It's a step that Facebook has taken before with other AI projects, like fastText (AI language processing) and Big Sur (the hardware that runs its AI programs). "As a company, we care more about using AI than owning AI," says Larry Zitnick, a research manager at FAIR. "The faster AI moves forward, the better it is for Facebook."

One of the reasons Facebook is so excited about computer vision is that visual content has exploded on the site in the past few years. Photos and videos practically rule News Feed. In a statement, Facebook said that computer vision could be used for anything from searching for images with just a few keywords (think Google Photos) to helping those with vision loss understand what's in a photo.

There are also some interesting augmented reality possibilities. Computer vision could identify how many calories are in a photo of a sandwich, for example, or it could see if a runner has the proper form. Now imagine if this kind of information were accessible on Facebook. It could bring a whole new level of interaction to the photos and videos you already have. Ads could let you arrange furniture in a room or try on virtual clothes. "It's critical to understand not just what's in the image, but where it is," says Zitnick about what it would take for augmented reality applications to take off.

Dollar brought up Pokémon Go as an example. Right now the cartoon monsters are mostly just floating in the middle of the capture scene. "Imagine if the creature can interact with the environment," he says. "If it could hide behind objects, or jump on top of them."

The next step would be to bring this computer-vision research into the realm of video, which is especially challenging because the objects are always moving. FAIR says that some progress has already been made: It's able to figure out certain items in a video, like cats or food. If this identification could happen in real time, then it could theoretically be that much easier to surface the Live videos that are the most relevant to your interests.

Still, with so many possibilities, Zitnick says FAIR's focus right now is on the underlying tech. "The fundamental goal here is to create the technologies that enable these different potential applications," he says. Making the code open-source is a start.


