How does a CNN work?
A Convolutional Neural Network can be divided into four broad parts:
Convolutional Layer
Pooling Layer
Dense Layer
Output Layer
Convolutional Layer
As we discussed in the previous part, the computer does not understand images shown in different orientations. Let's get into this in more detail.
Recognition starts from the fact that an image is made up of pixels, and each pixel has a numerical value.
Ordinary photos use three colour channels (red, green and blue), and each channel of a pixel holds a value from 0 to 255. For simplicity, let's take a simple 8x8 image. This is how it will look.
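Since a picture is just a grid of numbers, a small sketch can make this concrete. The Python snippet below builds a hypothetical 8x8 grayscale image of an X, with one value per pixel rather than three RGB values to keep it readable. The `make_x_image` helper and the choice of 255 for the strokes and 0 for the background are assumptions for illustration, not necessarily the exact image used in this post.

```python
# Hypothetical 8x8 grayscale image of an "X".
# Assumption: 255 = bright stroke, 0 = dark background.
SIZE = 8


def make_x_image(size=SIZE):
    """Build a size x size grid with both diagonals set to 255."""
    image = []
    for row in range(size):
        image.append([
            255 if col == row or col == size - 1 - row else 0
            for col in range(size)
        ])
    return image


image = make_x_image()

# Print the grid so the X shape is visible in the numbers themselves.
for row in image:
    print(" ".join(f"{value:3d}" for value in row))
```

A colour photo would store three such values per pixel, one for each of the red, green and blue channels.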
Now, in order to recognise these images, we simply break them down into little features. For example, when we look at a dog, our brain quickly recognises features such as ears, eyes and nose to identify that it is a dog.
As you can see, to recognise the X, we divide it into three features. These blocks are called filters. So whenever we have a complex image, we apply filters to recognise its features.
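Mathematical operations are performed between the original image and each filter: the filter is slid across the image, and at every position the overlapping values are multiplied element by element and combined into a single number. The result is a smaller 7x7 image.
This is done with all three filters, which gives us three different 7x7 images. In the first one, the values along the diagonal are 1, indicating that the filter found a diagonal there. Similarly, the other filters pick out the middle part and the opposite diagonal.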
This is the Convolutional Layer.
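To make the sliding-filter idea concrete, here is a minimal sketch in plain Python. Several things in it are assumptions rather than details taken from this post: the pixel values (1 for the strokes of the X, -1 for the background), the `make_x_image` and `apply_filter` helper names, the use of the mean of the products as the match score, and the 2x2 filter size. A 2x2 filter is chosen because, without padding, an n x n image and a k x k filter give an output of size n - k + 1, so 8 - 2 + 1 = 7; a 3x3 filter on the same image would give 6x6 instead.

```python
def make_x_image(size=8):
    """Hypothetical input: an X drawn with 1 (stroke) on a -1 background."""
    return [
        [1 if col == row or col == size - 1 - row else -1 for col in range(size)]
        for row in range(size)
    ]


def apply_filter(image, kernel):
    """Slide a square kernel over a square image (no padding, stride 1).

    Each output value is the mean of the element-wise products between the
    kernel and the image patch under it, so a perfect match scores 1.0.
    """
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for top in range(out_size):
        out_row = []
        for left in range(out_size):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * kernel[i][j]
            out_row.append(total / (k * k))
        output.append(out_row)
    return output


# Assumed 2x2 filter that responds to a top-left to bottom-right diagonal.
diagonal_filter = [[1, -1],
                   [-1, 1]]

image = make_x_image()
feature_map = apply_filter(image, diagonal_filter)

print(len(feature_map), "x", len(feature_map[0]))  # 7 x 7
for row in feature_map:
    print(" ".join(f"{value:5.2f}" for value in row))
```

In this sketch the top-left corner of the output scores 1.0 because the patch there matches the filter exactly, while the top-right corner scores -1.0 because it contains the opposite diagonal. Strictly speaking, the kernel is not flipped here, which makes this a cross-correlation; that is the operation deep learning libraries usually refer to as convolution.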
To summarise, the convolutional layer uses filters to recognise features, which in turn helps to recognise the image.
Next we will learn about pooling. If you want to go deeper, I highly recommend watching the video by Brandon Rohrer.