Transformers
🚗
Arabic licence plate recognition - Solution to the Kaggle competition Machathon 3.0.
- Ranked in the top 6️⃣ at the final evaluation phase.
- Check our solution now on Colab!
- Check the solution presentation
Preprocessing Pipeline
Approach
Step1: Preprocessing enhancements on the image.
- Most images had bad illumination and noise
- Morphological operations to Maximize Contrast.
- Gaussian Blur to remove Noise.
- Thresholding on both Value and Saturation channels.
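As a rough illustration of the Value/Saturation thresholding listed above, the sketch below marks pixels that are bright and weakly saturated, which is one plausible way to pick out a white plate. It uses only the standard library's `colorsys`; the function name `white_plate_mask` and both threshold values are assumptions made for illustration, not values taken from the notebook.

```python
import colorsys

def white_plate_mask(rgb_image, min_value=0.6, max_saturation=0.25):
    """Binary mask (1 = likely white plate) for rows of (R, G, B) pixels in 0..255.

    The thresholds are hypothetical and would need tuning on real images.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in [0, 1] and returns (hue, saturation, value)
            _, saturation, value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_white = value >= min_value and saturation <= max_saturation
            mask_row.append(1 if is_white else 0)
        mask.append(mask_row)
    return mask

# Tiny 2x2 example: near-white, dark grey, saturated red, light grey
image = [
    [(250, 250, 245), (30, 30, 30)],
    [(200, 40, 40), (180, 185, 180)],
]
mask = white_plate_mask(image)
```

Only the near-white and light grey pixels pass both conditions; the dark pixel fails the Value test and the red pixel fails the Saturation test.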
Step2: Extracting the white plate using contours.
- Get contours and sort them by area.
- Polygon approximation for noisy contours.
- Convex hull for concave polygons.
- 4-point transformation for difficult camera angles.
At this point, the numbers are in one contour and the letters in another.
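The area sorting and convex hull steps can be sketched in plain Python as follows. This is a minimal illustration, assuming contours are lists of `(x, y)` points; the helper names `polygon_area` and `convex_hull` are hypothetical. Libraries such as OpenCV offer equivalents (`cv2.contourArea`, `cv2.convexHull`), though which calls the notebook uses is not stated here.

```python
def polygon_area(points):
    """Absolute area of a polygon via the shoelace formula."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half
    return lower[:-1] + upper[:-1]

contours = [
    [(0, 0), (2, 0), (2, 1), (0, 1)],            # small blob, area 2
    [(0, 0), (10, 0), (5, 2), (10, 5), (0, 5)],  # plate outline with a concave dent
]
largest = max(contours, key=polygon_area)  # keep the biggest contour
hull = convex_hull(largest)                # fill in the dent
```

Here the hull removes the dented vertex `(5, 2)`, leaving the four corners that a 4-point perspective transformation would need.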
Step3: Separating characters from white plate using sliding windows.
Contours cannot be used to isolate the symbols inside the white plate, because a single Arabic letter may consist of multiple disconnected parts; for example, ت may produce 2 or 3 contours.
Solution
- Tuned 2 sliding windows, one for letters' white plate, the other for numbers.
- Variable window width
- Window height is the white plate height, since Arabic characters may consist of multiple parts
- Selecting which window
- Must have no black pixels on the sides
- Must have a specific range of black pixels inside
- For each group of windows the one with max black pixels is selected
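The selection rules above can be sketched as follows, assuming a binarized plate stored as rows of 0/1 values where 1 is a black pixel. All names, the widths, and the black-pixel range are hypothetical. How windows are grouped is not spelled out above; this sketch assumes that two windows belong to the same group when they share inked columns.

```python
def column_ink(binary):
    """Number of black pixels (value 1) in each column of the binary plate."""
    return [sum(column) for column in zip(*binary)]

def candidate_windows(ink, widths, min_black, max_black):
    """Full-height windows with clean side columns and a plausible black-pixel count."""
    candidates = []
    for width in widths:
        for start in range(len(ink) - width + 1):
            end = start + width  # exclusive
            if ink[start] or ink[end - 1]:
                continue  # black pixels touch a side of the window
            black = sum(ink[start:end])
            if min_black <= black <= max_black:
                candidates.append((start, end, black))
    return candidates

def select_windows(candidates, ink):
    """Visit windows from most to fewest black pixels; skip any that shares ink with a pick."""
    selected = []
    for start, end, black in sorted(candidates, key=lambda w: -w[2]):
        shares_ink = any(
            sum(ink[max(start, s):min(end, e)]) > 0 for s, e, _ in selected
        )
        if not shares_ink:
            selected.append((start, end, black))
    return sorted(selected)

# Two small glyphs separated by blank columns
plate = [
    [0, 1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0, 0],
]
ink = column_ink(plate)
windows = select_windows(candidate_windows(ink, (4, 5), 3, 6), ink)
```

Because every window spans the full plate height, dots and strokes that belong to one letter stay together, which is the reason for not relying on contours at this stage.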
Step4: Character Recognition.
- Training 2 models, since some Arabic letters and numbers look alike, e.g. (أ, 1) and (ه, 5)
- one for classifying only Arabic letters.
- one for classifying Arabic numbers.
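The benefit of two classifiers is that a crop's region decides which label set it can receive. The sketch below illustrates the routing only; the lookup tables are hypothetical stand-ins for the two trained models, and the shape names are invented for the example.

```python
# Hypothetical stand-ins for the two trained models: each one can only
# answer with labels from its own class set.
LETTER_LABELS = {"vertical_stroke": "أ", "loop": "ه"}
NUMBER_LABELS = {"vertical_stroke": "1", "loop": "5"}

def classify_letter(crop):
    return LETTER_LABELS[crop]

def classify_number(crop):
    return NUMBER_LABELS[crop]

def read_plate(letter_crops, number_crops):
    """Send each crop to the classifier that matches the region it came from."""
    letters = [classify_letter(crop) for crop in letter_crops]
    numbers = [classify_number(crop) for crop in number_crops]
    return letters, numbers

letters, numbers = read_plate(["vertical_stroke", "loop"], ["loop", "vertical_stroke"])
```

The same shape yields a letter on the letters side and a digit on the numbers side, so look-alike pairs never compete inside a single model.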
Project Organization
Scripts applied on images
./Macathon/code/
├── extract_bbx_xml.ipynb : Takes a directory of images and their bounding-box (bbx) data stored in XML files, and crops the bbxs from the images.
| The XML file contains the licence label (name), xmin, ymin, xmax, ymax of the bbxs in an image.
├── extract_bbx_txt.ipynb : Takes a directory of images and their bbx data stored in txt files, and crops the bbxs from the images.
| The txt file corresponding to one image may contain multiple bbxs, one per row of xmin,ymin,xmax,ymax.
└── crop_right_noise.ipynb : Crops an image by some percentage and replaces it with the cropped image.
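For orientation, the bounding-box extraction described above might look roughly like this. The XML layout is an assumption (a Pascal VOC-style `object`/`name`/`bndbox` structure); the text above only says the file holds a label and xmin, ymin, xmax, ymax. The function names are hypothetical, and images are modelled as plain lists of rows to keep the sketch self-contained.

```python
import xml.etree.ElementTree as ET

def read_boxes_xml(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) tuples from a VOC-style annotation (assumed layout)."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = [int(float(bndbox.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes

def read_boxes_txt(txt_text):
    """One box per row, written as xmin,ymin,xmax,ymax."""
    return [
        tuple(int(value) for value in line.split(","))
        for line in txt_text.splitlines()
        if line.strip()
    ]

def crop(image_rows, box):
    """Cut the box out of an image stored as a list of pixel rows."""
    xmin, ymin, xmax, ymax = box
    return [row[xmin:xmax] for row in image_rows[ymin:ymax]]

sample_xml = """<annotation>
  <object>
    <name>ب</name>
    <bndbox><xmin>1</xmin><ymin>0</ymin><xmax>3</xmax><ymax>2</ymax></bndbox>
  </object>
</annotation>"""

image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
xml_boxes = read_boxes_xml(sample_xml)
txt_boxes = read_boxes_txt("1,0,3,2\n0,1,2,3\n")
patch = crop(image, xml_boxes[0][1:])
```

With real image files, the same slicing applies to an array loaded by an imaging library, indexed as rows first and columns second.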
Model versions
./Macathon/code/
└── model.ipynb : The preprocessing and modeling stage. Contains:
- Preprocessing functions
- Training both classifiers
- Prediction and generating the output csv file
Data Folder
./Macathon/data/
├── challenging_images.rar : Contains most challenging images collected from the train data.
├── cropped_letters.zip : 28 subfolders corresponding to the 28 letters of the Arabic alphabet.
| Each subfolder holds images for the letter it's named after, cropped from the train data distribution.
├── cropped_numbers.zip : 10 subfolders for the 10 numbers.
| Each subfolder holds images for the number it's named after, cropped from the train data distribution.
├── machathon-3.zip : The data provided with the Kaggle competition.
└── testLetters.zip : 200 images labeled from the test data distribution.
Each image has a corresponding xml file holding the bbxs locations in it.
Contributors
This masterpiece was designed and implemented by
Hossam Saeed
Mostafa Wael
Nada Elmasry
Noran Hany