Classical OCR DCNN reproduction based on PaddlePaddle framework.

Last update: Nov 12, 2021

Related tags

Deep Learning Paddle-SVHN

Overview

Paddle-SVHN

Classical OCR DCNN reproduction based on PaddlePaddle framework.

This project reproduces Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks based on the paddlepaddle framework and participates in the Baidu paper reproduction competition. The AIStudio link is provided as follow:

link

Results_Compared

SVHN Dataset

Methods	Model Download	Batch Size	Learning Rate	Patience	Decay Step	Decay Rate	Training Speed (FPS)	Accuracy
Pytorch_SVHN	torch_model	512	0.16	100	625	0.9	~1700	95.65%
PaddlePaddle_SVHN	paddle_model	1024	0.01	100	625	0.9	~1700	95.65%

Introduction

The main idea of this exercise is to study the evolvement of the state of the art and main work along topic of visual attention model. There are two datasets that are studied: augmented MNIST and SVHN. The former dataset focused on canonical problem — handwritten digits recognition, but with cluttering and translation, the latter focus on real world problem — street view house number (SVHN) transcription. In this exercise, the following papers are studied in the way of developing a good intuition to choose a proper model to tackle each of the above challenges.

For more detail, please refer to this blog

Recommended environment

Python 3.6+
paddlepaddle-gpu 2.0.2
nccl 2.0+
editdistance
visdom
h5py
protobuf
lmdb

Install

Install env

Install paddle following the official tutorial.

pip install visdom
pip install h5py
pip install protobuf
pip install lmdb

Dataset

Download SVHN Dataset format 1

Extract to data folder, now your folder structure should be like below:

SVHNClassifier
    - data
        - extra
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - test
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - train
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat

Usage

(Optional) Take a glance at original images with bounding boxes
```
Open `draw_bbox.ipynb` in Jupyter
```

Convert to LMDB format

$ python convert_to_lmdb.py --data_dir ./data

(Optional) Test for reading LMDBs

Open `read_lmdb_sample.ipynb` in Jupyter

Train

$ python train.py --data_dir ./data --logdir ./logs

Retrain if you need

$ python train.py --data_dir ./data --logdir ./logs_retrain --restore_checkpoint ./logs/model-100.pth

Evaluate

$ python eval.py --data_dir ./data ./logs/model-100.pth

Visualize

$ python -m visdom.server
$ python visualize.py --logdir ./logs

Infer

$ python infer.py --checkpoint=./logs/model-100.pth ./images/test1.png

Clean

$ rm -rf ./logs
or
$ rm -rf ./logs_retrain

You might also like...

End-to-end image segmentation kit based on PaddlePaddle.

Classical OCR DCNN reproduction based on PaddlePaddle framework.

Related tags

Overview

Paddle-SVHN

Results_Compared

Introduction

Recommended environment

Install

Install env

Dataset

Usage

You might also like...

End-to-end image segmentation kit based on PaddlePaddle.

Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle

buildseg is a building extraction plugin of QGIS based on PaddlePaddle.

buildseg is a building extraction plugin of QGIS based on PaddlePaddle.

rastrainer is a QGIS plugin to training remote sensing semantic segmentation model based on PaddlePaddle.

Multi-Modal Machine Learning toolkit based on PaddlePaddle.

Awesome Remote Sensing Toolkit based on PaddlePaddle.

A PaddlePaddle version image model zoo.

Plaything for Autistic Children (demo for PaddlePaddle/Wechaty/Mixlab project)

Owner

YOLOX-Paddle - A reproduction of YOLOX by PaddlePaddle

一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.

text_recognition_toolbox: The reimplementation of a series of classical scene text recognition papers with Pytorch in a uniform way.

A pytorch reproduction of { Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation }.

Reproduction process of AlexNet

Pytorch Implementations of large number classical backbone CNNs, data enhancement, torch loss, attention, visualization and some common algorithms.

Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

Object detection and instance segmentation toolkit based on PaddlePaddle.

Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle.

Remote sensing change detection tool based on PaddlePaddle