AI SANGAM | Deep Learning | Machine Learning | Emerging Power in Artificial Intelligence

Some people confuse image classification with object detection, but there is a key difference between them. Image classification assigns a whole image to a specific category, whereas object detection identifies the location of objects in an image and can also be used to count them. In 2001, the first widely used object detection framework was proposed by Paul Viola and Michael Jones. It was developed for face detection and works provided the faces are not tilted, that is, they are seen in a frontal, upright view. Research continued, and in 2005 another approach was developed, called Histograms of Oriented Gradients (HOG), which transforms the image into regions of gradients. Each pixel is compared with its surrounding pixels to find the direction in which the image gets darker. In general, the gradient represents the direction in which the change is greatest. Both models were quite impressive, but it was in 2012 that deep learning greatly accelerated progress. Convolutional neural networks (CNNs) made object detection easier, although a plain CNN is better suited to classification.
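To make the gradient idea behind HOG concrete, here is a minimal sketch in Python with NumPy. It computes the gradient direction at every pixel of one small grayscale cell and collects the directions into a histogram weighted by gradient strength. The function name `orientation_histogram`, the 9 bins, and the 8x8 toy cell are assumptions chosen for illustration. A full HOG descriptor also interpolates votes between bins and normalizes over blocks of cells, which this sketch leaves out.

```python
import numpy as np

def orientation_histogram(cell, n_bins=9):
    """Histogram of gradient orientations for one grayscale cell (simplified HOG step)."""
    cell = np.asarray(cell, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then along columns (x).
    gy, gx = np.gradient(cell)
    # Gradient strength at each pixel.
    magnitude = np.hypot(gx, gy)
    # Unsigned gradient direction, folded into the range [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Each pixel votes for its direction bin, weighted by its gradient strength.
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    return hist

# Toy cell (assumed data): brightness increases from left to right,
# so every gradient points along the x axis (0 degrees).
cell = np.tile(np.arange(8, dtype=float), (8, 1))
hist = orientation_histogram(cell)
print(hist)
```

For this toy cell all of the votes fall into the first bin, because the only change in brightness is horizontal.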

Some limitations of the plain CNN were overcome by R-CNN, in which candidate regions (also called regions of interest, region proposals, or bounding boxes) are generated before the image is fed to the model. Features are then extracted for each bounding box using a pretrained network such as AlexNet, and a threshold on the resulting score decides whether an object is present in that box. More recently, YOLO (You Only Look Once) has become very popular. It divides the image into a grid, for example 13x13, and each cell predicts 5 bounding boxes, so a single pass of the network produces a large number of overlapping boxes. Low-confidence boxes are discarded and boxes that overlap on the same object are reduced to a single box, so in the end you are left with a few bounding boxes that contain objects, each labelled with the class of its object. Please scroll down to see what AI Sangam offers you.
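The filtering step described above, where many overlapping boxes are reduced to a few, is commonly done with a confidence threshold followed by non-maximum suppression. The sketch below illustrates that idea in plain Python. It is not YOLO's actual implementation: the function names `iou` and `filter_boxes`, both 0.5 thresholds, and the sample boxes and scores are assumptions made for illustration. Only the 13x13 grid and the 5 boxes per cell come from the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero when boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_boxes(boxes, scores, score_threshold=0.5, iou_threshold=0.5):
    """Return indices of boxes kept after thresholding and non-maximum suppression."""
    # Step 1: drop boxes whose confidence is below the threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    # Step 2: visit the remaining boxes from most to least confident.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap too much with a box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept

# Number of candidate boxes for the grid described in the text.
GRID_SIZE = 13
BOXES_PER_CELL = 5
total_boxes = GRID_SIZE * GRID_SIZE * BOXES_PER_CELL

# Hypothetical detections: two boxes on the same object, one on another object,
# and one low-confidence box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.75, 0.2]
kept = filter_boxes(boxes, scores)
print(total_boxes, kept)
```

With these sample values the second box is suppressed because it overlaps heavily with the more confident first box, and the fourth box is dropped for low confidence, leaving the first and third boxes.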

Object detection