Latest in Science

Image credit:

Computers can now describe images using language you'd understand

239 Shares
Share
Tweet
Share

Sponsored Links

Software can now easily spot objects in images, but it can't always describe those objects well; "short man with horse" not only sounds awkward, it doesn't reveal what's really going on. That's where a computer vision breakthrough from Google and Stanford University might come into play. Their system combines two neural networks, one for image recognition and another for natural language processing, to describe a whole scene using phrases. The program needs to be trained with captioned images, but it produces much more intelligible output than you'd get by picking out individual items. Instead of simply noting that there's a motorcycle and a person in a photo, the software can tell that this person is riding a motorcycle down a dirt road. The software is also roughly twice as accurate at labeling previously unseen objects when compared to earlier algorithms, since it's better at recognizing patterns.

The technology has its problems, especially if you don't have a large pool of training images. It frequently makes mistakes, as you can see in the examples above. However, these are still early days. Larger data sets should help with the detection routine's performance, and there are likely to be refinements to the code itself. When it does get significantly better, it could have a tremendous effect on everything ranging from artificial intelligence to search. A robot could tell you exactly what it sees without requiring that you look at a camera to check its findings; alternately, you could search for images using ordinary sentences and get only the results you want. It might be years before you see Google's technique used in the real world, but it's clear that you won't have to deal with stilted, machine-like descriptions for too much longer.

All products recommended by Engadget are selected by our editorial team, independent of our parent company. Some of our stories include affiliate links. If you buy something through one of these links, we may earn an affiliate commission.
Comment
Comments
Share
239 Shares
Share
Tweet
Share

Popular on Engadget

Clarence Thomas laments ruling that let FCC kill net neutrality

Clarence Thomas laments ruling that let FCC kill net neutrality

View
Google Translate adds languages for the first time in four years

Google Translate adds languages for the first time in four years

View
The creator of the Konami Code has died

The creator of the Konami Code has died

View
Pope Francis wants you to give up being a jerk online for Lent

Pope Francis wants you to give up being a jerk online for Lent

View
Facial recognition startup Clearview AI says its full client list was stolen

Facial recognition startup Clearview AI says its full client list was stolen

View

From around the web

Page 1Page 1ear iconeye iconFill 23text filevr