In 1959, Hubel and Wiesel found that cells in the animal visual cortex respond to light within small regions of the visual field, known as receptive fields. Inspired by this discovery, Kunihiko Fukushima proposed the neocognitron in 1980, which can be regarded as the predecessor of the CNN. Since 2006, many methods have been developed to overcome the difficulties encountered in training deep CNNs. Most notably, Krizhevsky et al. proposed a classic CNN architecture and showed significant improvements over previous methods on the image classification task.
Operations in CNNs
The convolution operation computes w·x + b at every spatial location of the input volume. Applying the same operation at many locations helps the network learn a particular shape even if its location in the image changes. We take a filter, or kernel (an n×n matrix), and slide it over the input image to obtain the convolved feature. A bias is added to this convolved feature and a suitable activation function is applied before the result is passed on to the next layer.
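As a minimal sketch of the sliding-filter computation described above, the following pure-Python function applies an n×n kernel to a small image with no padding and a stride of 1. The function name `conv2d`, the toy image, and the edge-detecting kernel are illustrative assumptions, not part of any particular library. Like most CNN implementations, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d(image, kernel, bias=0.0):
    """Slide an n x n kernel over a 2D image (no padding, stride 1)."""
    n = len(kernel)
    out_h = len(image) - n + 1
    out_w = len(image[0]) - n + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # w.x + b over the n x n patch whose top-left corner is (i, j)
            total = bias
            for di in range(n):
                for dj in range(n):
                    total += kernel[di][dj] * image[i + di][j + dj]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 image: dark on the left, bright on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The filter gives a strong response only in the column where the edge lies, and it would give the same response wherever the edge appeared, since the same weights are used at every position.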
The first layer usually extracts basic features such as horizontal or diagonal edges. Its output is passed on to the next layer, which detects more complex features such as corners or combinations of edges. As we move deeper into the network, it can identify even more complex features such as objects, faces, etc.
Pooling is an important concept in CNNs. It lowers the computational burden by downsampling the feature maps, which reduces the number of connections between convolutional layers. In this section, we introduce some recent pooling methods used in CNNs. There are different types of pooling:
4. Mixed pooling, and so on
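To make the downsampling concrete, here is a small sketch of two widely used pooling variants, max pooling and average pooling, over non-overlapping 2×2 windows. These two are chosen only for illustration; the function name `pool2d`, its `mode` parameter, and the sample feature map are assumptions made for this example.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping size x size pooling (stride equals the window size)."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Collect the values inside the current window
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            if mode == "max":
                row.append(max(window))
            elif mode == "avg":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'avg'")
        pooled.append(row)
    return pooled


fm = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [0, 2, 4, 4],
    [1, 1, 8, 0],
]

max_pooled = pool2d(fm, mode="max")
avg_pooled = pool2d(fm, mode="avg")
print(max_pooled)  # [[6, 2], [2, 8]]
print(avg_pooled)  # [[3.75, 1.25], [1.0, 4.0]]
```

In both cases a 4×4 feature map shrinks to 2×2, so the next layer has a quarter as many inputs to process.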
A proper activation function can significantly improve the performance of a CNN on a given task. For CNNs, ReLU is the preferred activation function because its derivative is simple and it is faster to compute than other activation functions such as tanh and sigmoid. ReLU is typically applied right after a convolution operation. Other activation functions include sigmoid, softmax, Leaky ReLU, etc.
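The following sketch shows why ReLU is cheap: it needs only a comparison, and its derivative is either 0 or 1, whereas sigmoid requires an exponential. The function names and the Leaky ReLU slope of 0.01 are illustrative assumptions; the slope is a common default, not a fixed part of the definition.

```python
import math


def relu(x):
    """max(0, x): passes positive values through, zeroes out the rest."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def leaky_relu(x, slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs."""
    return x if x > 0 else slope * x


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


inputs = [-2.0, 0.0, 3.0]
relu_out = [relu(x) for x in inputs]
relu_grads = [relu_grad(x) for x in inputs]
print(relu_out)      # [0.0, 0.0, 3.0]
print(relu_grads)    # [0.0, 0.0, 1.0]
print(sigmoid(0.0))  # 0.5
```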
The weights (w) of every convolution operation are updated using backpropagation. Backpropagation computes the gradient of the loss with respect to each weight, and these gradients are then used to move w toward a state where the error rate (or any other loss metric) is very low.
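The update rule can be illustrated on the smallest possible case: a single unit computing w·x + b, trained with plain gradient descent on a mean squared error loss. This is a simplified sketch, not full backpropagation through a CNN, where the same gradients would be obtained layer by layer with the chain rule. The toy data, the learning rate, and the number of steps are assumptions chosen for the example.

```python
# Toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def mse(w, b):
    """Mean squared error of the predictions w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w, b = 0.0, 0.0
learning_rate = 0.05  # assumed step size for this toy problem
initial_loss = mse(w, b)

for step in range(2000):
    # Gradients of the loss with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Move each parameter a small step against its gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(w, b)
print(round(w, 3), round(b, 3))  # close to 2.0 and 1.0
print(final_loss < initial_loss)
```

Each step lowers the loss a little, and after enough steps w and b settle near the values that generated the data.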
These are the basic working parts of any convolutional neural network. Other topics to study include the different optimization techniques, the choice of loss function, and so on.