Imparting vision upon machines has been a massive, multi-decade undertaking by the scientific community. And while the acuity of today's state-of-the-art computer systems can match or exceed a human's high-resolution optical anatomy, training these machines to understand what they're looking at is still a labor-intensive task. But thanks to the work of Dr. Pawan Sinha, Professor of Vision and Computational Neuroscience at MIT, and his Project Prakash (Sanskrit for "Light"), we may have stumbled upon a faster and far more efficient method of machine learning. Also, thousands of congenitally blind children in India have had their vision restored, so there's that.
Each year, around four in 10,000 children worldwide are born with congenital cataract (CC), a rare occurrence by WHO standards, yet the condition accounts for as much as 20 percent of childhood blindness cases. CC is treatable with simple corrective surgery, but in parts of the world where medical facilities are few and far between, this treatment isn't always an option.
For children who do not receive adequate medical care, the consequences can be dire -- 90 percent of them will not receive a proper education, only 50 percent will survive until adulthood and, among those, only 20 percent will find gainful employment. But that's where Sinha and Project Prakash come in.
Sinha founded Project Prakash in 2005 as a dual-mission charity. "It's a project that grows out of a desire to be good scientists, but also to be good Samaritans, to actually engage with real world problems," he told Engadget.
The organization's first mission is to provide life-changing eye surgery to children and young adults with treatable conditions that cause blindness, such as congenital cataract. The second seeks "to understand how a brain that has been deprived of vision for so many years, whether and how it can learn to acquire visual proficiency at that point in life," Sinha explained.
One could fairly assume that removing the cataracts from a child would have the same effect as removing them from an elderly patient -- specifically, their sight being immediately restored -- but that's not the case and we're not entirely sure why.
Your average person enjoys 20/20 vision. That is, they see objects from 20 feet away with the same visual acuity as the rest of the population from the same distance. If you're near- or far-sighted, basic corrective lenses are all that's needed. But the Prakash children, as Sinha refers to them, often suffer from degraded vision, around 20/100, or five times worse than average.
What's more, their vision cannot be corrected using glasses or contacts, because there's nothing physiologically wrong with their eyes or any other part of their optical anatomy. The problem lies within how their brains developed without visual input during infancy. It's this very problem that eventually led to Sinha's computer vision Eureka moment. But first, a word on infants and their incredible nearsightedness.
Your average baby sees slightly worse than Mr. Magoo, somewhere in the neighborhood of 20/800 or 40 times worse than an adult, yet by the time they're a few years old, their vision has corrected itself to 20/20 (or thereabouts).
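The "times worse" figures follow directly from the Snellen fraction: divide the second number by the first. A minimal Python sketch of that arithmetic, where the function name is made up for illustration:

```python
def times_worse_than_normal(test_distance, reference_distance):
    """Return how many times worse a Snellen score is than 20/20.

    A score of 20/100 means the viewer must stand at 20 feet to resolve
    what a typical viewer resolves at 100 feet, so the factor is 100 / 20.
    """
    if test_distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    return reference_distance / test_distance


print(times_worse_than_normal(20, 100))  # Prakash children: 5.0
print(times_worse_than_normal(20, 800))  # newborns: 40.0
```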
"A big part of the reason [that infants have blurred vision] is that the retina of the baby is quite immature," Sinha explained. He also noted that the cones in our eyes, the specialized cells which provide us with high resolution vision, are, at infancy, much larger than they are in adulthood. This reduces the density of those cells in the eye and in turn limits the eye's resolving capabilities. As the child grows, the cones become smaller and more tightly packed, improving their visual acuity. Yet their initially poor vision plays a crucial role in their cognitive development. Sinha theorized the same concept might apply to neophyte computer vision systems.
"For the first several months, they [children] rely strongly on motion to parse the world. So it seems that the programs of visual learning that are being deployed in the children's brains seem to recapitulate some aspects of the developmental programs in the normally developing infants," Sinha said.
Essentially, Sinha hypothesizes that a baby's poor eyesight acts as a set of "training wheels" for their developing minds. By reducing the high definition detail, their formative minds can focus on more fundamental visual and cognitive development without getting mired down in the details. Evidence suggests that there is a very narrow window of opportunity in a child's early development for this to happen.
Unfortunately for the Prakash children, by the time they receive the surgery, that window has long since passed. Sinha points to a case of a Chinese orphan born with congenital cataracts who was adopted by an American family and brought to the US for vision restoration surgery at age six. While the operation went swimmingly, the child's adoptive parents began to notice that the kid had difficulty making friends.
The child's doctor determined that the issue stemmed from the child's difficulty recognizing and remembering faces -- not because of any optical issues, but because that part of the child's brain didn't fully develop during infancy. "The brain requires a normal face related visual input," Sinha said. "And if it's deprived of that input during that critical window, then forevermore it's going to be compromised."
This case got Sinha and his team thinking about whether AI and computer vision (CV) systems might be trained in the same manner as infants. Rather than overwhelming a learning AI with high definition video inputs, could it instead be trained first on blurred imagery before slowly being weaned onto increasingly high-res inputs? Turns out, the answer is a resounding yes.
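To make the "weaning" idea concrete, here is a rough Python sketch of a blur schedule that starts heavy and shrinks to nothing over the course of training. The linear ramp, the function name and the radius values are all assumptions for illustration; the article does not describe how gradual any transition was.

```python
def blur_radius_schedule(start_radius, epochs):
    """Blur radius per epoch, shrinking linearly from start_radius to 0.

    Hypothetical helper: a linear ramp is only one possible choice.
    """
    if epochs < 2:
        raise ValueError("need at least two epochs to ramp down")
    if start_radius < 0:
        raise ValueError("start_radius must be non-negative")
    return [round(start_radius * (1 - e / (epochs - 1))) for e in range(epochs)]


print(blur_radius_schedule(8, 5))  # [8, 6, 4, 2, 0]
```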
Sinha's team trained a deep convolutional neural network (CNN) on a database of faces, which had been intentionally blurred to varying degrees. The team trained four of these networks using different regimens -- blurred-to-high-res, high-res-to-blurred, blurred-to-blurred, and high-res-to-high-res.
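The four regimens can be pictured as two training phases, each fed either blurred or sharp images. The Python sketch below shows only that data-preparation step, with no network or training loop. It makes several assumptions: the two-phase layout is inferred from the regimen names, a simple box (mean) blur stands in for whatever blur the researchers actually applied, and the names REGIMENS, box_blur and phase_inputs are invented for illustration.

```python
# Hypothetical encoding: (blur in phase 1?, blur in phase 2?) per regimen.
REGIMENS = {
    "blurred-to-high-res": (True, False),
    "high-res-to-blurred": (False, True),
    "blurred-to-blurred": (True, True),
    "high-res-to-high-res": (False, False),
}


def box_blur(image, radius):
    """Blur a 2D grayscale image (a list of rows) with a mean filter.

    Pixels outside the image are ignored, so edge pixels are averaged
    over fewer neighbors than interior pixels.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def phase_inputs(image, regimen, radius=1):
    """Return the (phase 1, phase 2) versions of one image for a regimen."""
    return tuple(
        box_blur(image, radius) if use_blur else [row[:] for row in image]
        for use_blur in REGIMENS[regimen]
    )


# A single bright pixel is smeared out in phase 1 and left sharp in phase 2.
image = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
first, second = phase_inputs(image, "blurred-to-high-res")
print(first[1][1])   # 1.0
print(second[1][1])  # 9
```

In a real experiment each phase would feed a whole face dataset to the network; a single tiny image is used here just to show what each regimen does to its inputs.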
The team discovered that when the networks were exposed to blurred images, their receptive fields (RFs) expanded. Oddly though, "starting with high-resolution images and then later introducing blurred ones leads to a significant increase in RF sizes, but the converse (blurred followed by high-resolution images) does not cause the network to reduce the sizes of the already established large RFs." So as far as RF size is concerned, it doesn't matter when the CV system sees the blurred images, only that it does. This, in turn, could lead to a treatment for the Prakash children: artificially blurring their newly restored sight until their brains develop the necessary mechanisms to understand what they're seeing.
"To our pleasant surprise, we found that networks that trained with poor quality images actually performed better than networks that began with higher resolution images," Sinha said. "Having poor vision at the outset might force the brain to look at the overall structure of images, rather than being fixated on bits and pieces."
What's more, the storage space and bandwidth saved by utilizing low-resolution images could help speed up training times while reducing the size of already-unwieldy training databases. Sinha's research was recently published in the journal PNAS, though there is no word yet on when this CV training technique might make its way into practical applications. Though I doubt that's of much concern to the thousands of people who can now see for the first time in their lives.