Visual encyclopedia builds itself by scouring the internet

Crowdsourced knowledge bases like Wikipedia encompass a lot of knowledge, but humans can only add to them so quickly. Wouldn't it be better if computers did all the hard work? The University of Washington certainly believes so. Its LEVAN (Learn EVerything about ANything) program is building a visual encyclopedia by automatically searching the Google Books library for descriptive language, and using that to find pictures illustrating the associated concepts. Once LEVAN has seen enough examples, it can associate images with ideas simply by looking at pixel arrangements. Unlike earlier learning systems, such as Carnegie Mellon's NEIL, it's smart enough to tell the difference between two similar objects (such as a Trojan horse and a racing horse) while still lumping them under one broader category.

Right now, the folks at the Wikimedia Foundation have little to worry about. LEVAN has explored only about 175 concepts as of this writing, and it can take as long as 12 hours to add another to the mix. It's open to suggestions from the public, though, and the university has open-sourced its code so that anyone can build on the formula. You won't want to depend on this self-assembling information hub for vital knowledge in the near future, but it should eventually be very useful both for schools teaching basic ideas and for computer vision software that needs a helping hand.