Voices that Matter iPhone: How Ben Newhouse created Yelp Monocle, and the future of AR

Yelp's Ben Newhouse (who is actually still a student at Stanford) gave a fascinating talk this weekend at the Seattle Voices that Matter iPhone conference. He talked about Yelp Monocle, the augmented reality (AR) iPhone app that he created, and revealed the surprising (and somewhat scandalous) story behind what's known as the first AR app released for the iPhone. He gave some technical details about how he designed the code to make it all run and speculated a little bit about where augmented reality and computer vision are headed.

It was very interesting stuff. Newhouse seems like an extremely smart, young guy who already knows this burgeoning technology very well. When it comes to augmented reality, it certainly seems like the iPhone is leading the charge as a relatively cheap device that will eventually replace more expensive and cumbersome technologies.

Newhouse first recounted how Monocle was actually created: on an overnight Red Bull bender. He was working as an intern at Yelp when tech pundit Robert Scoble suddenly tweeted that the company was working on an amazing iPhone app, using the compass in the (then-new) iPhone 3GS. The tweet wasn't in any way true (no one at Yelp, Newhouse says, had any idea what Scoble was talking about), but it started Newhouse thinking about how he might be able to combine the compass on the iPhone with Yelp's online database of businesses and locations.

He went home one night, "bought a case of Red Bull, and coded away." He started with an Apple-created iPhone demo app called "GLGravity," which uses the iPhone's accelerometer to display a teapot that is always pointed down on the screen. By 6 a.m. the next morning, Newhouse had the teapot always pointing north, no matter where the iPhone was facing. A few days later, he had graphics running over the camera view and had hooked it up to Yelp's public API. When he overheard someone else at Yelp talking about how cool it would be to get some sort of augmented reality app running, he showed them the prototype that he'd been working on during his off hours. He was immediately given the go-ahead and ushered into Yelp's official iPhone developer program.
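For readers curious what "combining the compass with a database of locations" involves, here is a minimal Python sketch of the geometry. It is not Newhouse's code: the function names, the 60-degree field of view, the 320-pixel screen width, and the linear angle-to-pixel mapping are all assumptions made for illustration.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from true north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def screen_x(heading_deg, target_bearing_deg, fov_deg=60.0, width_px=320):
    """Horizontal pixel position for a label, or None when the target
    is outside the (assumed) camera field of view."""
    # Signed angle from where the phone points to the target, in [-180, 180).
    offset = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2:
        return None
    # Linear approximation: the center of the view maps to the center of the screen.
    return (offset / fov_deg + 0.5) * width_px
```

With the phone's GPS position and compass heading, each business returned by a location search could be passed through `bearing_deg` and then `screen_x` to decide where (or whether) to draw its label over the camera view.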

Two weeks later, Monocle was up and running but still very wonky. Therefore, Yelp decided to include it in their app as an easter egg instead of listing it as an official feature. Coincidentally, says Newhouse, it was Robert Scoble who, again, revealed the easter egg to the world. Hours after that, the topic was trending on Twitter; since then, Monocle has been lauded as one of the first big successful augmented reality applications. It all resulted "from what started out as a midnight hackathon project," said Newhouse. "It's been quite a ride."

Newhouse then went into more detail (in fact, more detail than this blogger could understand) about the actual code behind the app. It mostly runs in OpenGL, which he says isn't the best for a UI, but it worked best for his situation. Monocle, and augmented reality in general, said Newhouse, is tough to manage for performance, since it uses almost every system the iPhone has, from the camera to the accelerometer, the compass, the GPS, and so on. There were a few hitches -- the camera code was actually super sensitive, and so the team had to "average out" the image movement so it wasn't jumping all over the place as the user moved the phone. Newhouse also had to deal with teaching the phone where it was relative to the horizon, a problem he later found out NASA had also dealt with -- terms like "gimbal lock" and "quaternions" confused even veteran developers in attendance.
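The talk didn't spell out how the "averaging out" was done, but one common way to steady a jittery reading is an exponential moving average. The sketch below is an assumption, not Monocle's implementation: it smooths a compass heading, and the class name and the `alpha` parameter are made up for illustration. It averages unit vectors rather than raw degrees so that readings on either side of north (say 350 and 10) blend toward 0 instead of 180.

```python
import math

class HeadingSmoother:
    """Exponential moving average over compass headings (degrees)."""

    def __init__(self, alpha=0.2):
        # alpha near 0 means heavy smoothing; alpha = 1 means no smoothing.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._x = None  # smoothed cosine component
        self._y = None  # smoothed sine component

    def update(self, heading_deg):
        """Feed one raw reading and return the smoothed heading in [0, 360)."""
        r = math.radians(heading_deg)
        cx, cy = math.cos(r), math.sin(r)
        if self._x is None:
            # The first sample initializes the filter.
            self._x, self._y = cx, cy
        else:
            self._x += self.alpha * (cx - self._x)
            self._y += self.alpha * (cy - self._y)
        return math.degrees(math.atan2(self._y, self._x)) % 360.0
```

A larger `alpha` tracks the phone's movement more quickly at the cost of more visible jitter, so the value would need tuning by feel on a real device.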

Finally, Newhouse hinted at what would be possible in the future of augmented reality -- he said that the sensors we use define the reality that we can augment, and that computer vision, or reality as seen through a camera, will be crucial to that reality in the future. The iPhone's camera isn't bad, but the better cameras get, the better the reality that they can augment for us. Newhouse talked about using optical character recognition to read text or other symbols (like an app that could pronounce Chinese characters out loud), or simple motion tracking -- if a camera can follow your body, computers may not even need touchscreens in the future to determine your movements.

He also mentioned something called 3D reconstruction, which is a computer system that can create a 3D model from a camera view -- he showed a picture of a Google Street View-style system, where a car driving down a road used a camera to create a 3D reconstruction of the surrounding space. Newhouse hinted at Yelp working on more things like this, including actually using a camera inside a store to determine where items were -- aisle-to-aisle guidance is one application he winked at.

And he completed his talk by saying that the future is already here. Square has made headlines by creating an iPhone dongle that can be used to accept credit card payments, but Newhouse said he didn't even need that -- he put together a quick app that could read and recognize his credit card with just the iPhone's camera (it occurred to me that a real solution in this fashion would need to authenticate the card in some way, but that's probably possible -- if a vendor trusts you to read out or type in your credit card while ordering pizza, they should trust optical recognition of the card through an iPhone camera). Newhouse also envisioned all kinds of expensive computers, from medical sensors to manufacturing robots, being replaced or perfected with simply a well-written iPhone app.
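Newhouse didn't say how his quick app checked what it read, and real authentication has to happen with the card issuer. Still, a camera-based reader can at least catch misread digits locally, because payment card numbers carry a Luhn check digit. The Python sketch below shows that check; the function name is invented for illustration, and passing it only means the digits are self-consistent, not that the card is genuine.

```python
def luhn_valid(number):
    """Return True if a card number string passes the Luhn checksum.

    Spaces and hyphens are ignored; any other non-digit fails the check.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not cleaned.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, ch in enumerate(reversed(cleaned)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0
```

An optical reader could run this on each candidate number and ask the user to rescan when it fails, before anything is sent to a payment processor.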

It was an extremely interesting talk. We're only scratching the surface of augmented reality so far, and Newhouse's presentation proved that there are plenty of intriguing applications and implementations yet to come.