"We ended up discovering—through trial and error and a healthy dose of luck—a treasure trove of expertise in the form of a documentary filmmaker, a photojournalist, and a fine arts photographer," said Josh Lovejoy, Senior Interaction Designer at Google. "Together, we began gathering footage from people on the team and trying to answer the question, 'What makes a memorable moment?'"
Some of that learning comes down to principles that you may have learnt as you've struggled to get to grips with a new smartphone camera or point-and-shoot. An understanding of focus, particularly depth of field, and of the rule of thirds is key, but so are some more "common sense" suggestions. Everybody knows to keep fingers out of the shot and not to make quick movements, but machine learning algorithms have no such understanding.
"We needed to train models on what bad looked like," said Lovejoy. "By ruling out the stuff the camera wouldn't need to waste energy processing (because no one would find value in it), the overall baseline quality of captured clips rose significantly."
Google admits that although it trained its AI to appreciate "stability, sharpness, and framing," Clips won't always get it right. The camera can make sure that a shot is well framed and that a family member is in focus, but it won't know that the big shiny ring on someone's finger is what everyone will want to see.
"Success with Clips isn't just about keeps, deletes, clicks, and edits (though those are important)," Lovejoy notes. "It's about authorship, co-learning, and adaptation over time. We really hope users go out and play with it."