Earlier this year, in March, something very significant
happened in the history of artificial intelligence. A computer program,
AlphaGo, developed by Google DeepMind, defeated the South Korean professional
Go player, Lee Sedol, on the human player’s home turf. The collective
human ego went into shock. It was unthinkable that a machine could beat the
best of human minds in this extremely complex strategy game! The world had
hardly recovered from the defeat handed to Garry Kasparov, the formidable world
chess champion, by a mere computer, the IBM Deep Blue, in 1997.
Well… we should have seen this coming. The processing power
of computers has gone through astronomical advances, thanks to the relentless
pursuit of Moore’s law, named after Intel co-founder Gordon Moore. Basically, the
transistor count on a processor chip has been doubling roughly every two years over
the past four decades, about a millionfold increase! In parallel, algorithms for machine learning and
artificial intelligence have also gone through revolutionary leaps, with the
invention and refinement of the convolutional neural network approach. The
combination of the advances in these two fields is now enabling previously
unthinkable computer vision and machine intelligence capabilities.
Appropriately, SID invited Professor Jitendra Malik of UC
Berkeley, a pioneer in the field of computer vision, to present the Luncheon
Keynote this year. He started his presentation by showing the picture below.
Can a computer program be developed that understands the “semantics” of this scene? That is, facts such as: 1) the lady on the left is walking away with three bags, 2) the woman on the right is sitting on the bench next to a bag, playing the accordion, and 3) the guy in the middle is looking at the woman. Pushing it further, Professor Malik asked whether it would then be possible to predict if the guy intends to put some money in the woman’s tip bag!
Most would probably say these are impossible tasks for a computer to accomplish. However, Professor Malik walked the audience through the advances in computer vision over the past few decades, and demonstrated how the advent of multi-layer neural-network-based algorithms has resulted in unprecedented accuracy in semantic visual understanding that would make such tasks possible.
On to “deep” understanding of the visual world from a collection of pixels on an image, with the help of “deep” learning algorithms running on powerful modern computers! The future is intelligent, and the future is already here… --Achin Bhowmik