As we saw in the last post, after the advent of CNNs (Convolutional Neural Networks) we stepped into the second AI Winter, but in 2006 this would change…
- 2006: Deep Learning Arises
In 2006, Geoffrey Hinton, Simon Osindero and Yee-Whye Teh published the paper “A fast learning algorithm for deep belief nets”, which tackled the vanishing/exploding gradient problem. They showed that neural networks with many layers could be trained well if, instead of initialising the neuron weights randomly, the weights are initialised in a clever way. The idea was to pre-train each layer with a Restricted Boltzmann Machine (RBM). An RBM is a network similar to a neural net. It has units similar to perceptrons, but instead of computing an output from its inputs and weights, each unit in the network computes the probability of taking the value 1 (being on) or 0 (being off) given the values of the units it is connected to and the weights. Unlike general Boltzmann Machines, RBMs have no connections between hidden units (nor between visible units).
Fig. 1. RBM Initialisation
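To make that unit-level computation concrete, here is a minimal NumPy sketch (not the authors’ code; the function name `hidden_on_probability` and the layer sizes are purely illustrative) of how an RBM hidden unit turns the states of the visible units it is connected to, the weights, and a bias into a probability of being on.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_on_probability(visible, weights, hidden_bias):
    """P(h_j = 1 | v): probability that each hidden unit is 'on',
    given the visible units it is connected to and the weights."""
    return sigmoid(visible @ weights + hidden_bias)

rng = np.random.default_rng(0)
v = rng.integers(0, 2, size=6).astype(float)   # a binary visible vector (6 units)
W = rng.normal(scale=0.1, size=(6, 3))         # visible-to-hidden weights (illustrative sizes)
b = np.zeros(3)                                # hidden biases

p_h = hidden_on_probability(v, W, b)           # probabilities of the 3 hidden units being on
h = (rng.random(3) < p_h).astype(float)        # sample binary hidden states from them
print(p_h, h)
```

In the greedy layer-wise scheme, the hidden activities obtained this way serve as the “data” for training the next RBM, and the weights learned layer by layer provide the clever initialisation of the deep network mentioned above.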
- 2010: Deep Learning and GPUs
With CPUs starting to hit a ceiling in terms of speed growth, computing power was increasing mainly through weakly parallel computation across several CPUs. To learn the millions of weights typical in deep models, the limitations of weak CPU parallelism had to be left behind and replaced with the massively parallel computing power of GPUs. In 2005, GPU production grew and GPUs became cheaper.
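As a rough, modern illustration (assuming PyTorch and an available CUDA device, neither of which existed in this form back then), the sketch below shows how learning those weights boils down to large matrix products over a batch of inputs, exactly the kind of work a GPU executes in a massively parallel way; all sizes are arbitrary, MNIST-like placeholders.

```python
import torch

# Fall back to the CPU if no GPU is present, so the sketch stays runnable.
device = "cuda" if torch.cuda.is_available() else "cpu"

batch = torch.randn(256, 784, device=device)                        # a batch of 256 MNIST-sized inputs
weights = torch.randn(784, 1000, device=device, requires_grad=True) # one layer's ~784k weights

activations = batch @ weights        # forward pass: one large matrix product
loss = activations.pow(2).mean()     # placeholder loss, just to drive the backward pass
loss.backward()                      # back-propagation: more large matrix products
print(device, weights.grad.shape)    # gradients for all those weights computed at once
```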
With the computation boost gained through GPUs, a GPU implementation of back-propagation in a big neural net set a new record that same year (2010) on the famous MNIST handwritten digit recognition benchmark.
- 2012: Victory in ImageNet competition
In 2012, using CNNs with efficient GPU implementations, a deep learning system won a visual object recognition competition (ImageNet) for the first time. Its error rate was 15.3%, while the second-best entry had 26.2%. This system was the first and only CNN entry in that competition; today, all the entries in the competition are CNNs. This was the turning point: deep learning had arisen and was here to stay.
References
- Andrey Kurenkov, “A brief history of neural nets and deep learning.”
- Scholarpedia, “Deep Learning.”
- Nature, “Computer Science: The Learning Machine.”
- LeCun, Y., Bengio, Y., Hinton, G.: Deep Learning. Nature 521, 436–444 (2015)
- Nilsson, N. J.: The Quest for Artificial Intelligence. Cambridge University Press, New York (2010)
- KDnuggets, exclusive interview
- Schmidhuber, J.: Deep Learning in Neural Networks: An Overview. Cornell University (2014)
- Scholarpedia, “Boltzmann Machine.”