How AI made Facebook’s Portal your ‘personal cameraman’

The video-calling device eschews motorized cameras for AI.

Former Senior Editor

Updated Fri, Feb 15, 2019, 12:00 PM·6 min read

After releasing its Portal video-calling tool to largely positive reviews (especially from its employees) last November, Facebook is finally cracking open the device and giving the rest of us a glimpse at the Portal's inner workings. Engadget sat down with Facebook's Rafa Camargo, Vice President of Hardware, and Matt Uyttendaele, Engineering Director of Mobile Vision, to discuss the device's development and the artificial intelligence that powers Portal.

When Facebook's AI research group (FAIR) began working on the systems that would eventually become the Portal two years ago, the team asked itself, "How do we create an automated camera that will feel natural, will feel engaging and would actually not get in the way," Camargo explained to Engadget. "The key thing for us, is really invoking that we are connected in the two rooms and making you feel like you're there and just hanging out."

In order to create that effect, the Portal team designed the device's Smart Camera to mimic the movements and judgements of human camera operators. That involved collaborating with "award winning film directors, documentary producers, and camera people," Camargo said. Their feedback helped steer development until "we're essentially to the point where literally the camera disappears because it becomes so natural that you just don't notice the camera. You just see the scene and what's happening."

Accomplishing that is harder than it sounds, mind you. As the FAIR team explains, the Portal was originally slated to use a mechanical camera. However, that design suffered a number of drawbacks, including an increased likelihood of breakdown and the inability to react to events and actions happening off-camera. "Smart Camera," the FAIR team wrote, "which was always a key component of our product, became increasingly central to our planned reinvention of the video-calling experience."

Once they settled on what hardware to employ, the Portal team set about creating the AI that would command it. They started with the Mask R-CNN model that FAIR had released in 2017. The Mask model is a body detection system that "detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance," according to Facebook's research blog. "Mask R-CNN is a really elegant solution to finding objects in an image. It can be applied to lots of things but it works really well for this [application] as well," Uyttendaele told Engadget.

However, the model as it existed in 2017 was not suitable for use with the mobile chipsets that the company was using in the Portal. For one thing, Mask R-CNN only operated at 5fps and was very processor intensive. Camargo said using the existing Mask model would have required additional processing and cooling hardware that would have made the Portal more expensive and less reliable. Mask R-CNN2Go, however, "is very tuned to the compute constrained mobile environment."

In response, Facebook's research teams streamlined the model until it was only a few megabytes in size and dubbed it Mask R-CNN2Go. Despite its reduced footprint, Mask2Go runs 400 times faster than its predecessor while maintaining pose-detection accuracy. The new model also improved "low-light performance by applying data augmentation on low-light examples in the training data set and balanced multiple pose-detection approaches," the FAIR team wrote.
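
To make the low-light augmentation idea concrete, here is a minimal Python sketch. It is not Facebook's actual pipeline. It assumes grayscale images stored as nested lists of floats between 0 and 1, and the function names `darken` and `augment`, along with the gamma, gain and noise values, are hypothetical choices made only for illustration.

```python
import random


def darken(image, gamma=2.2, gain=0.5, noise_std=0.02, rng=None):
    """Simulate a low-light capture of a grayscale image.

    `image` is a list of rows, each a list of floats in [0, 1].
    All parameter values are illustrative assumptions.
    """
    rng = rng or random.Random()
    out = []
    for row in image:
        new_row = []
        for px in row:
            value = gain * (px ** gamma)          # gamma > 1 and gain < 1 both darken
            value += rng.gauss(0.0, noise_std)    # rough stand-in for sensor noise
            new_row.append(min(1.0, max(0.0, value)))
        out.append(new_row)
    return out


def augment(dataset, low_light_fraction=0.3, seed=0):
    """Return the dataset plus darkened copies of a random subset.

    `dataset` is a list of (image, label) pairs; labels are reused unchanged.
    """
    rng = random.Random(seed)
    extra = [
        (darken(image, rng=rng), label)
        for image, label in dataset
        if rng.random() < low_light_fraction
    ]
    return dataset + extra
```

Because darkening changes pixel values but not where the body is, each darkened copy can reuse the original pose labels, which is what makes this kind of augmentation cheap.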

"The original Mask R-CNN paper threw a lot of capacity at identifying humans in all sorts of different situations -- skiing, on a horse, in an outdoor environment," Uyttendaele continued. However, since stability and efficiency were key goals of the Portal's development (not to mention nobody was going to be video conferencing while astride a steed), there wasn't call for training the system on much more than humans doing stuff indoors.

But even in an enclosed environment, there are plenty of things around to confuse a poorly trained AI, which is part of the reason why the Mask2Go model focuses on body, rather than facial, detection.

"We really needed Portal to understand the full body position -- in real time, all the time -- to frame you," Camargo stated. If, for example, you're lying on the couch and are covered by a blanket, the system needs to realize that it might only be seeing your face and that your body position will be horizontal, rather than vertical "because it would frame you differently, would zoom into you differently."

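The framing decision Camargo describes can be illustrated with a short Python sketch. This is an assumption-heavy illustration rather than Portal's algorithm: it takes hypothetical 2D body keypoints in pixel coordinates, pads their bounding box by a made-up margin, and expands the result to a 16:9 crop that stays inside the frame. The function name `frame_subject` and the sample coordinates are invented for the example.

```python
def frame_subject(keypoints, frame_w, frame_h, aspect=16 / 9, margin=0.25):
    """Return a crop (left, top, width, height) around body keypoints.

    `keypoints` is a list of (x, y) pixel positions. The margin and
    aspect ratio are illustrative assumptions, not Portal's values.
    """
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2

    # Pad the body box; the 1-pixel floor avoids a zero-sized box.
    w = max(max(xs) - min(xs), 1.0) * (1 + 2 * margin)
    h = max(max(ys) - min(ys), 1.0) * (1 + 2 * margin)

    # Grow the short side so the crop matches the output aspect ratio.
    if w / h < aspect:
        w = h * aspect
    else:
        h = w / aspect

    # Shrink uniformly if the crop is larger than the frame.
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale

    # Centre on the body, then slide the crop back inside the frame.
    left = min(max(cx - w / 2, 0.0), frame_w - w)
    top = min(max(cy - h / 2, 0.0), frame_h - h)
    return left, top, w, h


# Hypothetical keypoints for a standing person and a person lying down.
standing = [(900, 200), (1000, 200), (950, 900)]
lying = [(600, 700), (1300, 750), (950, 900)]
print(frame_subject(standing, 1920, 1080))
print(frame_subject(lying, 1920, 1080))
```

In this sketch a standing person produces a tall box, so the crop is sized by height and ends up wide. A person lying on a couch produces a wide box, so the crop is sized by width and zooms in closer.
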
As such, the Smart Camera's AI analyzes every frame of the video call. This allows it to effectively track (or, conversely, ignore) various human-like objects in the scene. That is, if you're calling your grandparents on a Portal, the system can actively track your elderly relatives while ignoring the life-size portrait hanging on the wall behind them because it "sees" that they're shifting in their chairs and fidgeting, while the man-sized painting behind them remains unmoved from frame to frame.
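
One simple way to picture that frame-to-frame test is the Python sketch below. It is only an illustration of the idea, not a description of how Portal does it: it assumes each detected figure already has a track of centre positions, one per frame, and it uses a hypothetical motion threshold (`min_motion`) to separate fidgeting people from a motionless painting. A real detector would also have to cope with jitter in its own output, which this sketch ignores.

```python
from math import hypot


def mean_motion(track):
    """Average per-frame displacement of a track's centre, in pixels.

    `track` is a list of (x, y) centre positions, one per frame.
    """
    steps = [
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
    return sum(steps) / len(steps) if steps else 0.0


def live_subjects(tracks, min_motion=0.5):
    """Keep only the tracks whose detections move between frames.

    `min_motion` is an assumed threshold in pixels per frame.
    """
    return {
        name: track
        for name, track in tracks.items()
        if mean_motion(track) >= min_motion
    }


# Hypothetical tracks: a fidgeting person and a portrait on the wall.
tracks = {
    "grandma": [(100, 200), (102, 201), (101, 203), (104, 202)],
    "portrait": [(500, 150)] * 4,
}
print(list(live_subjects(tracks)))
```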

Given the recent spate of hackers infiltrating internet-connected home security cameras and baby monitors, Facebook made sure to bake a degree of privacy and security into the hardware itself. "The whole AI engine that is doing analytics on the camera runs local, so everything stays local," Camargo insisted. "None of that ever leaves your home or wherever you're putting the device. The only media feed that leaves the device is the final result and it's when you're in a call. And it's only going to the people on the other side of the call."

While the Portal and its larger variant, the Portal+, are currently on store shelves, Facebook is not done developing its Mask2Go model and Smart Camera tech (or content that runs on it). The company is working on an AR-based feature called Story Time, for example. "When you truly hang out with people you're not just chatting or talking, you actually do activities together," Camargo explained. "So we see a lot of potential actually, to bring AR as a way to actually help people engage and feel deeper and stay more time engaged together through the connection."

This technology could eventually make the leap to other devices as well. "Something that drives us is making sure that these computer driven algorithms can run across our community's set of devices," Uyttendaele concluded. "We focus a lot on lower end phones and as we do that, we make computer vision more and more performant over time. I think, because of that, we will bring a more optimized 2D pose tracker to Portal in the future."
