Google's Amit Singhal tells us about the dreams search engines are made of

Do Googlers dream of electric algorithms? For a little insight into what makes the search engine that became a verb tick, we recently attended a talk by Amit Singhal, one of its chief engineers. Amit is part of the team in charge of tweaking and improving Google's ranking algorithms and has 20 years of experience when it comes to sorting through data, with that time split into even decades spent within the academic sphere and over in Mountain View. What he had to tell us mostly revolved around his aspirations from when he started out back in 1990, but it's the way that Google has acted to meet each of those goals that's the fun and interesting stuff (or as we like to call it around here, the meat). So do put on your reading monocle and join us past the break.


The two major challenges of information retrieval, says Amit, are volume and latency. Specifically, that means how much data is available out there, on the one hand, and how long it takes you to collect it, on the other. A quirky example he provided was to imagine sitting with your grandfather by a campfire. You might not have the full works of Shakespeare at your fingertips, but latency on procuring what information you seek is virtually non-existent (depending on how senile your gramps is). On the other hand, imagine yourself in a legal or engineering library in need of an obscure reference and you'll be encountering the opposite problem, where information is plentiful but so is the time it takes you to acquire it.

Google and the development of the internet have collectively taken the sting out of both those issues, but more can be done, and more is being done. Google's goal is to make all information universally available, and where data isn't just hanging out on the web looking suave, Amit said Google will create it. That's why we have Google Maps, Google Earth and even Google Books. Well, that's the official feelgood version of why we have them -- these Googlers have a beguiling way of making everything seem so rosy and wonderful that even when they're right, we find ourselves suspicious. Anyhow, Amit's dreams aim at these further possibilities, and we've listed them off below, in ascending order of difficulty.



1. Search beyond text. Amit points out that while Google has cracked the challenge of crawling around and finding familiar shaped letters, words, and passages, there is still much more information out there that doesn't come packaged in textual form. Google image and video search options were instituted to account for that, though originally the algorithm was quite "dumb." Image searching, for example, still relied on finding nearby text -- whether in captions, headlines or the like -- and matching that with the user's search query. Today, colors, shapes and other optical data get factored into a search, something Amit described as a (still basic) form of "computer vision algorithms."

2. Search beyond language. This one's kind of obvious, but not all the world speaks or writes in English. Google's now capable of sorting search results in particular languages, with the example we were given culling results for ratatouille recipes that weren't written in French, and of course you then have Google Translate to haphazardly interpret the content of the remaining results.

3. Search that knows me. Making search relevant requires tailoring, which Google has been doing by filtering results at the national level first, and is now seeking to expand on with its relatively new social search feature. Provided you can be bothered to log in and associate your Twitter account with your Google ID, results get fleshed out with commentary from the people you follow. Ergo, when you're looking for advice on a particular issue, your friends can pitch in without you even having to talk to them. Bliss!

4. Search the present moment. Real time search is something Google introduced in December of last year and it's basically a ragtag collection of info from Twitter, Facebook, MySpace, blog RSS feeds, news sites, and crawl results obtained with Caffeine. The outcome, according to Amit, is that Google is dramatically reducing the time between an article being published and you being able to access it. He cited an example of this new up-to-the-minute search identifying an earthquake within 90 seconds of it happening, whereas the relevant US agency took 12 minutes to update its site with the news. Signal-to-noise ratio problems still trouble this nascent innovation, but Amit's not giving up and points out there's a reason why nobody else has a comparable search offering: "Because it's hard. It's crazy hard!"

5. Search that understands meaning. This is where things get even more difficult, because computers still don't have any real understanding of what they're doing. You stick data in at one end, you churn some results out of the other, but the machine is still none the wiser about what it has done. Amit threw up the example of the word "change," which -- depending on the particulars of what you want to find -- may mean that you want to adjust, to modify, to convert, to exchange, to switch, to install, or to replace something, all of which are nuances of the word that Google understands. But what's most impressive is that no data entry monkeys were required to punch all the various connotations of the word into the Googlebrain -- contextualized meaning is being worked out almost entirely on the basis of algorithms. Smells like the first steps toward sentience to us.



6. Searching without searches. Amit finished off by teasing us with his vision of where information retrieval should be headed. He stressed that this is neither on Google's roadmap nor something that's doable today, while also reiterating that it would be done with user privacy and control as the number one concern. His search-less vision, then, revolves around the amalgamation of all those ancillary services that Google has been developing to deliver timely and relevant information without you having to ask for it.

His example illustrates this well. Imagine that you have to buy a new cricket bat because your old one's broken and you happen to have an hour of free time between meetings one day. Your phone knows about your shopping needs because they're in your to-do list and it knows about your meetings because they're in your schedule. All it needs is your location (which, of course, it has) and some local area information, and it'll ping out a message advising you that you can just pop down the road, buy that wooden stick, and be back in time for your 2PM with Marty from the Synergy Department.

If that sounds like your phone babying you and potentially controlling your life, then you're attuned to the exact same frequency that we are. We get Amit's point: our devices have come nowhere near to maxing out the intelligence or utility that we can extract from them. But this just seems like another step toward over-reliance on technology. Most people nowadays can't remember their friends' phone numbers -- in spite of being in constant contact with them -- and correct spelling is growing largely irrelevant when doing a Google search with all those ultra-smart (okay, sometimes they can be real dumb too) suggestions. So it's a nice dream and all, but we kinda like things on the search front as they are: functional, predictable, and not too flashy.