Saturday, 18 March 2017

A Confluence of AI technologies

Last Friday I met an old friend who was keen to investigate how AI technologies could be used in some ideas he had. (He knew that I had been involved in AI research some years before.) I explained that I had seen an interesting blog post [1] by Greg Wayne and Alexander Graves, researchers at DeepMind [2] (now owned by Google), on a form of memory-augmented neural network called a differentiable neural computer.

And yesterday, as I walked through the computer lab at Cambridge, I noticed an advert for a lecture by Dr Demis Hassabis, co-founder and CEO of DeepMind: "Towards General Artificial Intelligence".

When I sat down and checked my news feeds I found an article in the Financial Times entitled "DeepMind’s social agenda plays to its AI strengths" (17th March 2017) [3]. It explains that DeepMind’s researchers have in common a clearly defined, if lofty, mission: to crack human intelligence and recreate it artificially.

Marvin Minsky (Wikipedia 2008)
All this took me back to a meeting I had at the Massachusetts Institute of Technology in 2003 with Marvin Minsky. [4]

Wikipedia says of Marvin: Marvin Lee Minsky (August 9, 1927 – January 24, 2016) won the Turing Award (the greatest distinction in computer science) in 1969, the Japan Prize in 1990, the IJCAI Award for Research Excellence for 1991, and the Benjamin Franklin Medal from the Franklin Institute for 2001. In 2006, he was inducted as a Fellow of the Computer History Museum "for co-founding the field of artificial intelligence, creating early neural networks and robots, and developing theories of human and machine cognition." In 2011, Minsky was inducted into IEEE Intelligent Systems' AI's Hall of Fame for the "significant contributions to the field of AI and intelligent systems". In 2014, Minsky won the Dan David Prize for "Artificial Intelligence, the Digital Mind".

In preparation for the meeting I had “crammed” on AI, reading all the latest research papers and of course Marvin’s book “The Society of Mind”. [5]

We spent the morning discussing his work and the challenges faced by AI research in the following decade. 

But why do I mention this meeting, which took place 14 years ago? 

Because at that meeting Marvin mentioned three areas of AI-related technology that needed to advance significantly for AI systems to make the next major leap forward. I remember these because they could be easily summarised as follows:

  • Computer vision
  • Common sense
  • Consolidated learning – (learning and remembering)

Computer Vision

The enormous recent focus on self-driving vehicles by companies such as Apple, Audi, BMW, Daimler and Google has driven research into computer vision systems coupled to AI systems. The need to identify objects and then react very quickly has led to very sophisticated computer vision systems. [6]

Computer vision and robotics groups, such as the one at the University of Cambridge [7], have produced systems able to recognise objects at the pixel level in a complex moving video scene. Their latest paper (2017), "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation" [8], describes a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation.

(A short video shows a complex moving scene together with the computer's analysis of it.)
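To make "semantic pixel-wise segmentation" a little more concrete, here is a minimal sketch of only the final step of such a system: given a score for every class at every pixel, each pixel is assigned the class with the highest score. The class names, the tiny 2x2 "image" and the scores are all invented for illustration; in SegNet itself the scores come from a trained encoder-decoder network, which is not reproduced here.

```python
# Hypothetical label set; a real road-scene model uses many more classes.
CLASSES = ["road", "car", "pedestrian"]


def segment(scores):
    """Turn per-pixel class scores into a label map.

    scores[row][col] is a list holding one score per class.
    The result has the same height and width, with one class index per pixel.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Invented scores for a 2x2 image (rows x columns x classes).
scores = [
    [[0.90, 0.05, 0.05], [0.20, 0.70, 0.10]],
    [[0.80, 0.10, 0.10], [0.10, 0.20, 0.70]],
]

labels = segment(scores)
names = [[CLASSES[index] for index in row] for row in labels]

for row in names:
    print(row)
```

Each pixel ends up with exactly one label, which is what allows the video overlay to colour roads, cars and pedestrians separately.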

The push to provide safe and efficient systems for vehicles has led to significant advances in these computer vision systems.

Common Sense

Marvin Minsky understood that for AI systems to interact with human beings in a way that people would find "natural", the AI systems would need a knowledge of the world similar to that learned by people in the early years of their lives. This is very well explained in Marvin's book, “The Society of Mind”.

What happens when a child reads a story that begins like this?
Mary was invited to Jack's party.
She wondered if he would like a kite.

The child, because of the experience gained in their life, will probably suppose the following:
The "party" is a birthday party.
Jack and Mary are children.
"She" is Mary.
"He" is Jack.
She is considering giving Jack a kite.

How can we provide an AI system with this life experience (or common sense)?

In an attempt to solve this problem, a group at the MIT Media Lab, inspired by Marvin Minsky, created a project called "Open Mind Common Sense" (OMCS) [9]. This was an artificial intelligence project whose goal was to build and utilise a large commonsense knowledge base from the contributions of many thousands of people across the Web.

The project collected more than a million English facts from over 15,000 contributors and brought together knowledge bases in other languages. The resulting natural language corpus was used to build a semantic network called ConceptNet [10]. If you are unfamiliar with this project I suggest that you open the page and try some quite diverse inputs to the search engine. For example, try entering first “Mammoth” and then “London”, then consider how you might describe these two yourself.
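As a rough illustration of what a semantic network of commonsense facts looks like, the sketch below stores a handful of (concept, relation, concept) triples and searches for a chain of facts linking "kite" to "party", the kind of link the child in the story makes without thinking. The facts are hand-written for this example and are not taken from ConceptNet; the relation names (IsA, RelatedTo, AtLocation, Desires) follow ConceptNet's style, and the helper functions are hypothetical.

```python
from collections import deque

# Hand-written example facts (NOT real ConceptNet data).
FACTS = [
    ("kite", "IsA", "toy"),
    ("toy", "RelatedTo", "gift"),
    ("child", "Desires", "toy"),
    ("gift", "AtLocation", "birthday party"),
    ("birthday party", "IsA", "party"),
]


def neighbours(concept):
    """Yield (fact, other_concept) for every fact that mentions the concept."""
    for fact in FACTS:
        start, _relation, end = fact
        if start == concept:
            yield fact, end
        elif end == concept:
            yield fact, start


def find_path(source, target):
    """Breadth-first search for the shortest chain of facts joining two concepts."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        concept, path = queue.popleft()
        if concept == target:
            return path
        for fact, other in neighbours(concept):
            if other not in seen:
                seen.add(other)
                queue.append((other, path + [fact]))
    return None  # the two concepts are not connected by any known facts


path = find_path("kite", "party")
for fact in path:
    print(" ".join(fact))
```

With only five facts the search is trivial, but the same idea applied to a network of over a million contributed facts is what lets a program guess that a kite might be a present for Jack's party.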

Consolidated learning – (learning and remembering)

Finally we come back to the research mentioned at the beginning of this article. The DeepMind team have produced a form of memory-augmented neural network called a differentiable neural computer (DNC), and they show that it can learn to use its memory to answer questions about complex, structured data, including artificially generated stories [11], family trees, and even a map of the London Underground. 

In the blog post, Greg Wayne and Alexander Graves say, "Neural networks excel at pattern recognition and quick, reactive decision-making, but we are only just beginning to build neural networks that can think slowly – that is, deliberate or reason using knowledge."

A video of one of their demonstrations, from their blog, is below:

The team explain: "In a family tree, we showed that it could answer questions that require complex deductions. For example, even though we only described parent, child, and sibling relationships to the network, we could ask it questions like “Who is Freya’s maternal great uncle?”"
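To see why this question needs a multi-step deduction, here is a small hand-coded sketch of the reasoning involved: go from Freya to her mother, from the mother to her parents, and from each of those grandparents to their brothers. This is ordinary rule-based code, not a differentiable neural computer; the point of the DeepMind work is that the network learns to carry out such steps from examples rather than being given the rules. Apart from Freya, every name and relationship below is invented, and great uncles by marriage are ignored for simplicity.

```python
# Hypothetical family: child -> (mother, father). Only "Freya" comes from the post.
PARENTS = {
    "Freya": ("Ingrid", "Tom"),
    "Ingrid": ("Helga", "Olaf"),
    "Helga": ("Astrid", "Bjorn"),
    "Erik": ("Astrid", "Bjorn"),
    "Olaf": ("Sigrid", "Leif"),
    "Sven": ("Sigrid", "Leif"),
}
MALE = {"Tom", "Olaf", "Bjorn", "Erik", "Leif", "Sven"}


def parents(person):
    """Return (mother, father), or an empty tuple if they are not recorded."""
    return PARENTS.get(person, ())


def siblings(person):
    """People who share at least one recorded parent with the given person."""
    mine = set(parents(person))
    return {
        other
        for other, theirs in PARENTS.items()
        if other != person and mine and mine & set(theirs)
    }


def maternal_great_uncles(person):
    """Brothers of the grandparents on the mother's side."""
    own_parents = parents(person)
    if not own_parents:
        return set()
    mother = own_parents[0]
    result = set()
    for grandparent in parents(mother):
        result |= {s for s in siblings(grandparent) if s in MALE}
    return result


print(sorted(maternal_great_uncles("Freya")))
```

Three separate lookups have to be chained, and the intermediate results (the mother, then the grandparents) have to be remembered along the way, which is the role the external memory plays in the DNC.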

A point in time?

These three developments, in computer vision, common sense and consolidated learning, did not all happen at the same time, but their existence together brings us to a point in time (a confluence of AI technologies). Is this the time when a major leap forward in Artificial Intelligence can take place?

I really don't know, but it certainly feels like a special time, one in which things can happen.


  1. Blog post by Greg Wayne and Alexander Graves, researchers at DeepMind
  2. DeepMind
  3. Article in the Financial Times entitled "DeepMind’s social agenda plays to its AI strengths" (17th March 2017), by Madhumita Murgia, European Technology Correspondent
  4. Marvin Minsky (Wikipedia)
  5. Marvin Minsky’s book “The Society of Mind”
  6. Computer vision systems (Wikipedia)
  7. Computer Vision and Robotics Group at the University of Cambridge
  8. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation", Vijay Badrinarayanan, Alex Kendall & Roberto Cipolla (2017)
  9. "Open Mind Common Sense" (Wikipedia)
  10. ConceptNet
  11. Artificially generated stories




