Hand Cricket
Table of Contents
- Overview
- Installation
- Game rules
- Project Details
- Future Scope
Overview
This is a computer-vision-based implementation of the popular childhood game 'Hand Cricket / Odd or Even' in Python. Behind the game is a CNN model trained to identify the hand signs for the numbers 0, 1, 2, 3, 4, 5 & 6. For those who have never played this game, the rules are explained below.
The Game in action
Demo video : hand-cricket.mov
Installation
- You need Python (3.6) & git (to clone this repo)
- `git clone git@github.com:abhinavnayak11/Hand-Cricket.git .` : Clone this repo
- `cd path/to/Hand-Cricket` : cd into the project folder
- `conda env create -f environment.yml` : Create a virtual env with all the dependencies
- `conda activate comp-vision` : Activate the virtual env
- `python src/hand-cricket.py` : Run the script
Game rules
Hand signs
- You can play the numbers 0, 1, 2, 3, 4, 5 & 6. The hand signs for each number are shown here.
Toss
- You can choose either odd or even (say you choose odd).
- Both players play a number (say the players play 3 & 6). Add those numbers (3 + 6 = 9).
- Check whether the sum is odd or even (9 is odd).
- If the result is the same as what you chose, you have won the toss; otherwise you have lost. (9 is odd, you chose odd, hence you win.) A minimal sketch of this logic is shown below.
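The toss boils down to a parity check on the sum. A minimal sketch in plain Python (the function and variable names here are mine, not the project's):

```python
import random

def play_toss(user_choice, user_number):
    """Return True if the user wins the toss.

    user_choice is 'odd' or 'even'; user_number is in 0-6.
    Hypothetical helper, not the game's actual code.
    """
    computer_number = random.randint(0, 6)   # computer plays 0-6
    total = user_number + computer_number
    parity = 'odd' if total % 2 else 'even'
    return parity == user_choice
```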
The Game
- The player who wins the toss is the batsman; the other player is the bowler. (In the next version of the game, the toss winner will be allowed to choose between batting and bowling.)
- Scoring Runs:
- Both players play a number.
- The batsman's number is added to his score only when the numbers are different.
- 0 has a special power : if the batsman plays 0 and the bowler plays any number other than 0, the bowler's number is added to the batsman's score (see the code sketch after this list).
- Getting out:
- The batsman gets out when both players play the same number, even when both numbers are 0.
- Winning/Losing:
- After both players have finished their innings, the player who scores more runs wins the game.
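The scoring and dismissal rules above reduce to a single comparison per ball. A hedged sketch (a hypothetical helper, not how hand-cricket.py is actually structured):

```python
def resolve_ball(batsman, bowler):
    """Return (runs_scored, is_out) for one ball, per the rules above."""
    if batsman == bowler:    # same number -> out, even when both play 0
        return 0, True
    if batsman == 0:         # special power of 0: batsman scores the bowler's number
        return bowler, False
    return batsman, False    # otherwise the batsman scores his own number
```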
Game code : hand-cricket.py
Project Details
- Data Collection :
- After failing to find a suitable dataset, I created my own dataset using my phone camera.
- The dataset contains a total of 1848 images. To ensure generality (i.e. to prevent overfitting to one type of hand in one type of environment), images were taken of 4 people, in 6 different lighting conditions, and against 3 different backgrounds.
- Sample images (post-augmentation) are shown below.
- The data is uploaded at : github | kaggle. Data collection code : collect-data.py (a rough capture-loop sketch follows this list).
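collect-data.py holds the actual collection code; purely as an illustration, a capture loop along these lines could save one folder of images per label (the save path, burst size, and webcam source here are assumptions):

```python
import os
import cv2

SAVE_DIR = 'data/raw'       # hypothetical path; the repo's layout may differ
FRAMES_PER_LABEL = 20       # arbitrary burst size for illustration

cap = cv2.VideoCapture(0)   # default webcam (the author used a phone camera)
for label in range(7):      # hand signs 0-6
    os.makedirs(os.path.join(SAVE_DIR, str(label)), exist_ok=True)
    input(f'Show the hand sign for {label} and press Enter...')
    for i in range(FRAMES_PER_LABEL):
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(SAVE_DIR, str(label), f'{label}_{i}.jpg'), frame)
cap.release()
```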
- Data preprocessing :
- A PyTorch dataset was created to handle the preprocessing of the image dataset (code : dataset.py).
- Images were augmented before training. The following augmentations were used : random rotation, random horizontal flip and normalization. All images were resized to 128x128 (a sketch of this pipeline follows the list).
- The images were split into a training set and a validation set. The training set was used to train the model, while the validation set was used to validate model performance.
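A sketch of such an augmentation pipeline with torchvision transforms (the rotation range and normalization statistics are assumptions; dataset.py has the real values):

```python
import torchvision.transforms as T

train_transform = T.Compose([
    T.Resize((128, 128)),                       # resize to 128x128
    T.RandomRotation(degrees=15),               # assumed rotation range
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],     # ImageNet statistics, the usual
                std=[0.229, 0.224, 0.225]),     # choice for ImageNet-pretrained models
])
```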
- Model training :
- Different pretrained models (resnet18, densenet121, etc., all pre-trained on the ImageNet dataset) from the torchvision library were fine-tuned on this dataset. All layers except the last 2 were frozen during training; the pre-trained layers extract useful features, while the last 2 layers are fine-tuned to my dataset (a sketch of this setup follows the list).
- The learning rate was chosen by trial and error and was different for each model.
- Of all the models trained, densenet121 performed the best, with a validation accuracy of 0.994.
- Training the model : train.py, engine.py, training-notebook
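For illustration, the transfer-learning setup described above might look roughly like this (exactly which layers are unfrozen and the learning rate are assumptions; train.py has the actual setup):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained densenet121 and freeze the feature extractor.
model = models.densenet121(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# Replace the classifier head for the 7 classes (0-6); new layers are
# trainable by default.
model.classifier = nn.Linear(model.classifier.in_features, 7)

# Optimize only the trainable parameters; the learning rate was tuned per
# model by trial and error (1e-3 here is a placeholder).
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```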
Future Scope
- Although this was built as a fun application, the dataset can also be used in applications like sign language recognition.