The project performs real-time object detection using a webcam and classifies hand gestures into four classes: Livelong, ThumbsUp, ThumbsDown, and Peace.
The webcam captures images at fixed time intervals to build a training dataset for each class, and the images are then annotated with the LabelImg tool. The TensorFlow Object Detection API is installed to work both locally and on Google Colab. A pre-trained model (ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8) is fine-tuned to detect and label the hand gestures. The performance metrics of the model are: Average Precision of 78% and Average Recall of 79%, with 10 training images per class.
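The TensorFlow Object Detection API maps each class name to a numeric id through a label map file. A minimal sketch of generating such a file for the four gesture classes (the file name `label_map.pbtxt` and the 1-based ids are conventional assumptions, not taken from the project):

```python
# Sketch: generate a label_map.pbtxt for the four gesture classes.
# The Object Detection API expects 1-based class ids.
labels = ["Livelong", "ThumbsUp", "ThumbsDown", "Peace"]

def build_label_map(labels):
    """Return the text contents of a label_map.pbtxt for the given classes."""
    entries = []
    for idx, name in enumerate(labels, start=1):
        entries.append("item {\n  name: '%s'\n  id: %d\n}" % (name, idx))
    return "\n".join(entries)

# Write the label map next to the training data (assumed path).
with open("label_map.pbtxt", "w") as f:
    f.write(build_label_map(labels))
```

The same label map path is then referenced from the model's pipeline configuration so that training and evaluation use consistent class ids.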
Output seen on Google Colab when the model is tested on a sample image -