In the modern computing era, object detection is widely used in image processing. Object detection is the task of detecting instances of semantic objects of a certain category (such as humans, buildings, or cars) in digital images and videos, as shown in the image below.
There are many ways to perform object detection. Some of the methods that can be used are:
- Single Shot MultiBox Detector (SSD)
- Faster R-CNN
- Histogram of Oriented Gradients (HOG)
- Region-based Convolutional Neural Network (R-CNN)
- YOLO (You Only Look Once)
WHAT IS SINGLE SHOT DETECTION (SSD)?
In deep learning, Single Shot Detection (SSD) is an approach used to detect objects in an image or in a given input video source. An SSD model has two main components: the backbone model and the SSD head.
The backbone model is the network that takes the image as input and extracts the feature map on which the rest of the network is built. The SSD head is a set of additional convolutional layers added on top of the backbone; its outputs are interpreted as the bounding boxes and classes of objects at each spatial location of the final layer’s activations.
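To make the shape of the head's output more concrete, here is a minimal sketch of the arithmetic involved. It assumes that each spatial location predicts a fixed number of candidate boxes and that every box gets 4 coordinate offsets plus one score per class (with the "no object" class counted as one of the classes). The function names and the example numbers are hypothetical and chosen only for illustration.

```python
def head_output_shape(grid_h, grid_w, num_classes, boxes_per_cell):
    """Shape of the head's output tensor for one feature map.

    Assumption: every box is described by 4 offsets plus one score
    per class, and num_classes already includes the background class.
    """
    values_per_box = num_classes + 4
    return (grid_h, grid_w, boxes_per_cell * values_per_box)


def total_boxes(grid_h, grid_w, boxes_per_cell):
    """Number of candidate boxes predicted on one feature map."""
    return grid_h * grid_w * boxes_per_cell


if __name__ == "__main__":
    # Hypothetical example: a 3x3 feature map, 3 classes, 2 boxes per cell.
    print(head_output_shape(3, 3, num_classes=3, boxes_per_cell=2))
    print(total_boxes(3, 3, boxes_per_cell=2))
```

The point of the sketch is only that the head's output grows with the feature map size, the number of boxes per location, and the number of classes; a real model usually combines several feature maps of different sizes.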
Instead of using a traditional sliding-window algorithm, SSD divides the image into a grid, and each grid cell is responsible for detecting objects in its region of the image. If no object is detected in a cell, the output for that cell is empty or, to be more precise, it is assigned the class “0”, indicating that no object was found.
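The grid idea can be illustrated with a small sketch. It is a simplification: it assumes an object belongs to the cell that contains its center point, whereas actual SSD training matches ground-truth boxes to predefined boxes by overlap. The helper names (`cell_for_point`, `label_grid`) and the sample objects are hypothetical.

```python
def cell_for_point(cx, cy, image_w, image_h, grid_size):
    """Return (row, col) of the grid cell containing the point (cx, cy)."""
    # Clamp so that a point on the right or bottom edge stays in the grid.
    col = min(int(cx / image_w * grid_size), grid_size - 1)
    row = min(int(cy / image_h * grid_size), grid_size - 1)
    return row, col


def label_grid(objects, image_w, image_h, grid_size):
    """Build a grid of class ids, where 0 means "no object found".

    objects: list of (class_id, center_x, center_y) in pixels,
    with class ids starting at 1 (assumption for this sketch).
    """
    grid = [[0] * grid_size for _ in range(grid_size)]
    for class_id, cx, cy in objects:
        row, col = cell_for_point(cx, cy, image_w, image_h, grid_size)
        grid[row][col] = class_id
    return grid


if __name__ == "__main__":
    # Two hypothetical objects in a 300x300 image split into a 3x3 grid.
    objects = [(1, 50, 50), (2, 250, 150)]
    for row in label_grid(objects, 300, 300, grid_size=3):
        print(row)
```

Every cell that contains no object center keeps the value 0. Note that this sketch can store only one object per cell, which is exactly the limitation that motivates the next question.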
How to detect multiple objects of the same class in a single image?
This is where anchor boxes come into play. Each grid cell is assigned multiple anchor boxes (also called prior boxes): simple rectangles with a predefined, fixed size and shape within the cell. Because each anchor box can be matched to a different object, multiple objects (such as people, cars, or animals) can be detected in a single image.
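As a rough illustration of anchor boxes, the sketch below generates a few boxes of fixed scale and different aspect ratios centered on one grid cell, then picks the one that overlaps a ground-truth box the most using intersection over union (IoU). The scale, the aspect ratios, and the function names are assumptions made for this example, not values taken from a particular SSD implementation.

```python
import math


def anchors_for_cell(row, col, grid_size, scale, aspect_ratios):
    """Anchor boxes (xmin, ymin, xmax, ymax) centered on one grid cell.

    Coordinates are normalized to [0, 1]. Each aspect ratio (width / height)
    yields one box with the same area, scale * scale.
    """
    cx = (col + 0.5) / grid_size
    cy = (row + 0.5) / grid_size
    boxes = []
    for ratio in aspect_ratios:
        w = scale * math.sqrt(ratio)
        h = scale / math.sqrt(ratio)
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_anchor(gt_box, anchors):
    """Index and IoU of the anchor that overlaps gt_box the most."""
    scores = [iou(gt_box, anchor) for anchor in anchors]
    best = max(range(len(anchors)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    # Center cell of a 3x3 grid with square, wide, and tall anchors.
    anchors = anchors_for_cell(1, 1, grid_size=3, scale=0.4,
                               aspect_ratios=[1.0, 2.0, 0.5])
    tall_object = (0.4, 0.2, 0.6, 0.8)  # hypothetical ground-truth box
    print(best_anchor(tall_object, anchors))
```

Because the anchors in one cell have different shapes, a tall object (such as a person) and a wide object (such as a car) centered in the same cell can each be matched to a different anchor, so both can be detected.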