Use a Convolutional Recurrent Neural Network (CRNN) to recognize handwritten line-text images without pre-segmentation into words or characters. The network is trained with the CTC loss function.

sushant097

Last update: Jan 7, 2023

Related tags

Computer Vision deep-neural-networks deep-learning tensorflow cnn python3 handwritten-text-recognition ctc-loss recurrent-neural-network blstm iam-dataset crnn-tensorflow

Overview

Handwritten Line Text Recognition using Deep Learning with Tensorflow

Description

Use a Convolutional Recurrent Neural Network (CRNN) to recognize handwritten line-text images without pre-segmentation into words or characters. The network is trained with the CTC loss function. For more detail, read this Medium post.

Why Deep Learning?

Deep learning extracts features on its own with deep neural networks and performs the classification itself. Compared to traditional algorithms, its performance increases with the amount of data.

Basic Intuition on How it Works.

First, a Convolutional Recurrent Neural Network extracts the important features from the handwritten line-text image.

The output before the CNN's fully connected (FC) layer (512x100x8) is passed to the BLSTM, which handles sequence dependencies and time-sequence operations.

Then the CTC loss (Alex Graves) is used to train the RNN. It eliminates the alignment problem in handwriting, since every writer aligns characters differently. We only provide what is written in the image (the ground-truth text) and the BLSTM output; the loss is then calculated simply as -log(probability of "gtText"), and the aim is to minimize this negative log-likelihood.

CTC considers all possible paths (alignments) that produce the given labels; the loss for an (X, Y) pair is the negative log of the summed probability of those paths.
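To make the path sum concrete, here is a minimal brute-force sketch of the standard CTC objective. It is not the repository's code: the names collapse and ctc_loss_bruteforce, the "-" blank symbol, and the toy probabilities are all assumptions made for illustration. Real implementations compute the same quantity with dynamic programming instead of enumerating every path.

```python
import itertools
import math

BLANK = "-"  # assumed symbol for the CTC blank label


def collapse(path):
    """Apply the CTC collapsing rule: merge repeats, then drop blanks."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != BLANK:
            out.append(symbol)
        prev = symbol
    return "".join(out)


def ctc_loss_bruteforce(probs, alphabet, target):
    """Return -log of the total probability of all paths collapsing to target.

    probs[t][k] is the probability of alphabet[k] at time step t.
    Enumeration is exponential in len(probs), so this is for toy inputs only.
    """
    total = 0.0
    for indices in itertools.product(range(len(alphabet)), repeat=len(probs)):
        path = [alphabet[k] for k in indices]
        if collapse(path) == target:
            p = 1.0
            for t, k in enumerate(indices):
                p *= probs[t][k]
            total += p
    return -math.log(total)


# Toy example: two time steps, labels "a", "b" and the blank.
alphabet = ["a", "b", BLANK]
probs = [[0.6, 0.1, 0.3],
         [0.5, 0.2, 0.3]]
# The paths "aa", "a-" and "-a" collapse to "a": 0.30 + 0.18 + 0.15 = 0.63.
loss = ctc_loss_bruteforce(probs, alphabet, "a")
print(loss)
```

Because the loss sums over every valid alignment, no per-character position labels are needed during training.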

Finally, CTC decoding is used to decode the output during prediction.
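One common decoding strategy is best-path (greedy) decoding: take the most likely label at each time step, merge repeats, and drop blanks. The sketch below assumes this strategy purely for illustration; the decoder the project actually uses is not shown here, and greedy_ctc_decode, the alphabet, and the probabilities are hypothetical.

```python
def greedy_ctc_decode(probs, alphabet, blank_index):
    """Best-path CTC decoding for one sequence.

    probs[t][k] is the probability of alphabet[k] at time step t.
    """
    # Most likely label index at each time step.
    best = [max(range(len(row)), key=row.__getitem__) for row in probs]

    decoded = []
    prev = None
    for k in best:
        # Keep a label only when it differs from the previous step
        # and is not the blank.
        if k != prev and k != blank_index:
            decoded.append(alphabet[k])
        prev = k
    return "".join(decoded)


alphabet = ["c", "a", "t", "-"]  # "-" is the assumed blank label
probs = [
    [0.7, 0.1, 0.1, 0.1],  # c
    [0.6, 0.1, 0.1, 0.2],  # c (repeat, merged)
    [0.1, 0.1, 0.1, 0.7],  # blank
    [0.1, 0.7, 0.1, 0.1],  # a
    [0.1, 0.1, 0.6, 0.2],  # t
    [0.1, 0.1, 0.7, 0.1],  # t (repeat, merged)
]
print(greedy_ctc_decode(probs, alphabet, blank_index=3))  # prints "cat"
```

Greedy decoding is fast but ignores the probability mass spread over alternative paths, which is why beam search or word beam search can improve accuracy.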

Detail Project Workflow

The project consists of three steps:
1. Multi-scale feature extraction --> Convolutional Neural Network (7 layers)
2. Sequence labeling (BLSTM-CTC) --> Recurrent Neural Network (2 layers of LSTM) with CTC
3. Transcription --> decoding the output of the RNN (CTC decode)

Requirements

Tensorflow 1.8.0
Flask
Numpy
OpenCV 3
Spell checker: autocorrect >= 0.3.0 (pip install autocorrect)

Dataset Used

IAM dataset: download from here.
Only the lines images and lines.txt (ASCII) are needed.
Place the downloaded files inside the data directory.

The trained model is available for download from this link. It achieves CER = 8.32% and was trained on the IAM dataset plus some additional created data.
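For readers unfamiliar with the metric, character error rate (CER) is usually computed as the edit distance between prediction and ground truth, divided by the number of ground-truth characters. The sketch below assumes that usual definition; the repository's exact computation is not shown here, and edit_distance and character_error_rate are illustrative names.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(pairs):
    """CER in percent over (ground_truth, prediction) pairs."""
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_chars = sum(len(ref) for ref, _ in pairs)
    return 100.0 * total_edits / total_chars


# Toy data, not taken from the IAM dataset.
pairs = [("abcd", "abcf"), ("hello", "hello")]
print(character_error_rate(pairs))  # 1 edit over 9 characters
```

Passing lists of words instead of strings gives a word error rate in the same way. Since insertions count as errors, the rate can exceed 100% when the prediction is much longer than the ground truth.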

To train the model from scratch

$ python main.py --train

To validate the model

$ python main.py --validate

To run prediction

$ python main.py

Run on the web with Flask

$ python upload.py
Validation character error rate of saved model: 8.654728%
Python: 3.6.4 
Tensorflow: 1.8.0
Init with stored values from ../model/snapshot-24
Without Correction clothed leaf by leaf with the dioappoistmest
With Correction clothed leaf by leaf with the dioappoistmest

Prediction output on IAM Test Data

Prediction output on Self Test Data

See the project Devnagari Handwritten Word Recognition with Deep Learning for more insights.

Further Improvement

Use MDLSTM to recognize a whole paragraph at once; see Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention.
Line segmentation can be added for full-paragraph text recognition. For line segmentation you can use the A* path-planning algorithm or a CNN model to separate a paragraph into lines.
Better image preprocessing, such as reducing background noise, to handle real-time images more accurately.
A better decoding approach to improve accuracy. Some CTC decoders can be found here.

Feel free to improve this project with a pull request.

This is part of my last-semester project in Computer Engineering at Tribhuvan University, July 2019.

Comments

throwing errors
Hi Sushant,

Great work by you!! kudos sir.

I am facing the following issues while running this model.

In the DataLoader.py file, when reading the data from the ground-truth text file:

# GT text are columns starting at 10
gtText_list = lineSplit[9].split('|')
gtText = self.truncateLabel(' '.join(gtText_list), maxTextLen)

This throws an index-out-of-range error, which I corrected with gtText_list = lineSplit[8].split('|')

Also, in the main.py file:

totalEpoch = loader.trainSamples//Model.batchSize # loader.numTrainSamplesPerEpoch
while True:
    epoch += 1
    print('Epoch:', epoch, '/', totalEpoch)

is also throwing an error. I commented out the totalEpoch line and passed epoch to the print statement:

#totalEpoch = loader.trainSamples//Model.batchSize # loader.numTrainSamplesPerEpoch
while True:
    epoch += 1
    print('Epoch:', epoch, '/', epoch)

Also, autocorrect in spellchecker.py is shown as deprecated. After changing it to pyspellchecker v.4.0, I am able to run the model, but training from scratch shows a very high validation CER of around 43. Let me know if the spellchecker change and the other changes could lead to this. Also let me know if some other approach has to be taken for training this model on the IAM line-based dataset.
opened by rpro91 10
Unreasonably low results, when using IAM dataset

I'm also getting incredibly poor scores, character error rate of ~45%, word error rate of ~650% Using the IAM lines dataset, all according to instructions. Not using word beam search decoding or anything similar, just python main.py --train Exact same results using either CPU or GPU (the only difference of course being that the GPU way is several times faster). The CTC loss scores and the error rate stop improving after the 7th or so epoch. Tried increasing the file input size, but got considerably worse results. Attached an image of an example word error rate while training.

My specs are: 16 GB RAM, Ryzen 3700X CPU, NVIDIA RTX2070 SUPER GPU. Using Docker and TensorFlow 1.13.

Originally posted by @mcmalzahar in https://github.com/sushant097/Handwritten-Line-Text-Recognition-using-Deep-Learning-with-Tensorflow/issues/4#issuecomment-623444144

opened by mcmalzahar 8
Running error

Traceback (most recent call last):
  File "main.py", line 193, in <module>
    main()
  File "main.py", line 164, in main
    Model.imgSize, Model.maxTextLen, load_aug=True)
  File "C:\Users\hp\Desktop\Handwritten-Line-Text-Recognition-using-Deep-Learning-with-Tensorflow-master\src\DataLoader.py", line 79, in __init__
    gtText_list = lineSplit[9].split('|')
IndexError: list index out of range

opened by anush97 7
accuracy.txt and charList.txt not found in Model Directory

Hi @sushant097

Thank you for your repository, it's very useful. I am not able to run python upload.py because it cannot find accuracy.txt and charList.txt.

Thanks in advance :)

opened by codebugged 6
No Model in Model Folder

Hey,

I was looking at your code and noticed there is no saved model in the model folder. Is there something I am missing? Also when I am trying to train the model myself, I run into the error -

Traceback (most recent call last):
  File "main.py", line 193, in <module>
    main()
  File "main.py", line 164, in main
    Model.imgSize, Model.maxTextLen, load_aug=True)
  File "/Users/anagh/Downloads/Handwritten-Line-Text-Recognition-using-Deep-Learning-with-Tensorflow-master/src/DataLoader.py", line 79, in __init__
    gtText_list = lineSplit[9].split('|')
IndexError: list index out of range

Please note - I have downloaded the dataset lines.tgz and also have put lines.txt in data folder.

opened by anaghrao-99 5
AttributeError: module 'tensorflow' has no attribute 'placeholder'

Hi, I installed TensorFlow 2.x because version 1.8.0 is not available in their repository. When I run this example I get the following error: AttributeError: module 'tensorflow' has no attribute 'placeholder'

Can you please update the code to support the latest TensorFlow version?

Thank you
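A common workaround for this error, sketched here as an assumption and not as a fix confirmed by the repository, is to run TF 1.x graph-style code through TensorFlow's tf.compat.v1 module with v2 behavior disabled. The helper build_input_placeholder and the sizes 32 and 128 are hypothetical. Code that depends on tf.contrib needs further changes, since that module was removed in TensorFlow 2.

```python
try:
    # The TF 1.x API surface that ships with TensorFlow 2.x.
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()  # restore graph mode so placeholders work
except ImportError:
    tf = None  # TensorFlow is not installed in this environment


def build_input_placeholder(height, width):
    """Create a TF 1.x style input placeholder (hypothetical helper)."""
    if tf is None:
        raise RuntimeError("TensorFlow is not installed")
    return tf.placeholder(tf.float32, shape=(None, height, width))


if tf is not None:
    # Illustrative sizes only; they are not taken from the project.
    inputs = build_input_placeholder(32, 128)
    print(inputs.shape)
```

Replacing "import tensorflow as tf" with the compat import is the smallest change; a full port to the TF 2 API is more work but avoids relying on the compatibility layer.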

opened by gayathrisubbu 4
Output wrong

I get this for the test image. is this supposed to happen?

W/O correction Clothed leat by leaf with the disappoisthet

W correction Clothes left by leaf with the disappoisthet

Can you please check this out? Is it supposed to happen?

opened by ACE07-Sev 3
Keras implementation and model structure

Hello,

I am trying to implement this with Keras. In your code (Model.py) you have specified the layer filters, but I am not sure about the RNN and CTC layers.

please have a look at this

opened by shekarneo 3
its about implementation

Could you please explain how to implement it on Ubuntu 18.04? Can you guide me through the steps needed to implement this project? Eagerly waiting for your answer.

opened by santhubaby 3
AttributeError: module 'tensorflow' has no attribute 'placeholder'

I ran

tf_upgrade_v2 --intree Handwritten-Line-Text-Recognition-using-Deep-Learning-with-Tensorflow/ --outtree Handwritten-Line-Text-Recognition/ --reportfile changesreport.txt

but I'm still getting the error. I installed tf-nightly, but the result is the same. Can you please guide me?

2021-11-21 04:19:05.588993: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-11-21 04:19:05.602350: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
TensorFlow 2.0 Upgrade Script

Converted 0 files
Detected 0 issues that require attention

Make sure to read the detailed log 'changesreport.txt'

This is what I get.

opened by ACE07-Sev 2
Loss too high

Hello, Sir. Thanks for your post. I'm a newbie to this domain and I read your post on Medium. I cloned the repo into Google Colab, downloaded your data from Kaggle, and uploaded it to the data folder as described in the README. I also redefined the image path in Data_Loader.py at line 75: fileName = filePath + 'self_lines' + fileNameSplit[0] + '.png'. After training finished, the result was really bad: WER 400% and accuracy 0.00000%. Could you help me figure out which part I have done wrong, Sir? Thank you in advance.

opened by AICampB4 2
Training process
Hello, Sushant!

For the past few days I have been trying to reproduce the results of the repository. For that I followed the guide described in README.md but the outcome was different.

Steps:

Clone the repo in a new directory

Download IAM database from official site

Copy lines.txt file and lines directory to the data directory (13 353 records).

In the file DataLoader.py, change the line gtText_list = lineSplit[9].split('|') to gtText_list = lineSplit[8].split('|'). This is required because the element at index 8 (not 9) contains the ground-truth labels. For example: a01-000u-00 ok 154 19 408 746 1661 89 A|MOVE|to|stop|Mr.|Gaitskell|from

Run the following command from src_tensorflow2 directory: python main.py --train
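The index question in the steps above can be checked with a small sketch that parses the sample record quoted there. The function parse_iam_line is a hypothetical helper, not the repository's DataLoader code, and skipping lines that start with "#" assumes the comment header found at the top of lines.txt.

```python
def parse_iam_line(line):
    """Split one lines.txt record into (line_id, status, text).

    Returns None for blank lines and "#" comment lines.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # The first 8 space-separated fields are metadata; maxsplit=8 keeps
    # everything after them together as the transcription (index 8).
    fields = line.split(" ", 8)
    if len(fields) < 9:
        raise ValueError("unexpected record format: %r" % line)
    line_id, status = fields[0], fields[1]
    text = fields[8].replace("|", " ")  # words are separated by '|'
    return line_id, status, text


sample = "a01-000u-00 ok 154 19 408 746 1661 89 A|MOVE|to|stop|Mr.|Gaitskell|from"
print(parse_iam_line(sample))
# → ('a01-000u-00', 'ok', 'A MOVE to stop Mr. Gaitskell from')
```

Counting from zero, the transcription is the ninth field, so index 8 is the last valid index for this record and index 9 raises IndexError.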

Environment:

Python: 3.7.9 Tensorflow: 2.7.0

Expected behaviour:

CER is expected to descend slowly approximately to the value specified in README.md: 8.32%.

Actual behaviour:

First try: CER after epoch 1: 28.1%. CER after epoch 2: 21.0%. But from the 3rd to at least the 12th epoch, CER stays between 45% and 52% and does not go down.

Second try. After the 8th epoch: Train loss: 62.25793147463152. Val loss: 64.84262824781013. Character error rate: 45.535652%.

After the 21st epoch: Train loss: 56.68565004330704. Val loss: 66.37841461644028. Character error rate: 44.809107%.

Could you describe the correct way to train the model?

Update 2022-06-09: It seems that the problem is reproduced only in the src_tensorflow2 directory. The code in the src_tensorflow1 directory (using TF 1.15.5) gives a CER of 19% after the third epoch, with the loss still going down.

Update 2022-06-10: The code in the src_tensorflow1 directory (using TF 1.15.5) doesn't give stable results either. I ran the training from scratch 3 more times, and each time the CER stopped decreasing after some epoch.
opened by ivankrylatskoe 3
About training from scratch

Hello sir, I want to learn how to build handwritten text recognition using deep learning. Can you kindly suggest me which course to take to fully understand how to proceed with the code?

Thank you....

opened by monika153 0

Owner

sushant097

Machine Learning Engineer | Computer Vision Developer. Working in research and development of machine learning and computer vision.

GitHub

Detect handwritten words in a text-line (classic image processing method).

Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro

190 Jan 3, 2023

IMGUR5K handwriting set. It is a handwritten in-the-wild dataset, which contains challenging real world handwritten samples from different writers.The dataset is shared as a set of image urls with annotations. This code downloads the images and verifies the hash to the image to avoid data contamination.

IMGUR5K Handwriting Dataset To run the code for downloading the urls and generate corresponding annotations : Usage: python download_imgur5k.py --data

213 Dec 26, 2022

This is used to convert a string to an Image with Handwritten Characters.

Text-to-Handwriting-using-python This is used to convert a string to an Image with Handwritten Characters. text_to_handwriting(string: str, save_to: s

3 Aug 15, 2022

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

Overview This collection demonstrates how to construct and train a deep, bidirectional stacked LSTM using CNN features as input with CTC loss to perfo

489 Dec 21, 2022

This can be used to convert text in a file to handwritten text.

TextToHandwriting This can be used to convert text to handwriting. Clone this project or download the code. Run TextToImage.py give the filename of th

2 Feb 6, 2022

Handwriting Recognition System based on a deep Convolutional Recurrent Neural Network architecture

Handwriting Recognition System This repository is the Tensorflow implementation of the Handwriting Recognition System described in Handwriting Recogni

346 Jan 7, 2023

Convolutional Recurrent Neural Networks(CRNN) for Scene Text Recognition

CRNN_Tensorflow This is a TensorFlow implementation of a Deep Neural Network for scene text recognition. It is mainly based on the paper "An End-to-En

1000 Dec 27, 2022

Pre-Recognize Library - library with algorithms for improving OCR quality.

PRLib - Pre-Recognition Library. The main aim of the library - prepare image for recogntion. Image processing can really help to improve recognition q

80 Dec 30, 2022

Converts an image into funny, smaller amongus characters

SussyImage Converts an image into funny, smaller amongus characters Demo Mona Lisa | Lona Misa (Made up of AmongUs characters) API I've also added an

14 Aug 18, 2022

CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

356 Dec 8, 2022

CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

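Introduction: End-to-end detection and recognition of variable-length Chinese text, implemented with Tensorflow and Keras. Text detection: CTPN. Text recognition: DenseNet + CTC. Environment setup: sh setup.sh. Note: when running on CPU, first comment out the "for gpu" section and uncomment the "for cpu" section. Demo: Put test images into test_images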

2.6k Dec 29, 2022

Handwritten Number Recognition using CNN and Character Segmentation

Handwritten-Number-Recognition-With-Image-Segmentation Info About this repository This Repository is aimed at reading handwritten images of numbers an

17 Aug 25, 2022

Handwritten Text Recognition (HTR) using TensorFlow 2.x

Handwritten Text Recognition (HTR) system implemented using TensorFlow 2.x and trained on the Bentham/IAM/Rimes/Saint Gall/Washington offline HTR data

160 Dec 21, 2022

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Handwritten Text Recognition with TensorFlow Update 2021: more robust model, faster dataloader, word beam search decoder also available for Windows Up

1.5k Jan 7, 2023

OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

562 Jan 3, 2023

Apply different text recognition services to images of handwritten documents.

Handprint The Handwritten Page Recognition Test is a command-line program that invokes HTR (handwritten text recognition) services on images of docume

117 Jan 2, 2023

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.

Deskew by Marek Mauder https://galfar.vevb.net/deskew https://github.com/galfar/deskew v1.30 2019-06-07 Overview Deskew is a command line tool for des

127 Dec 3, 2022

This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Gated Recurrent Convolution Neural Network for OCR This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: htt

90 Dec 22, 2022

Give a solution to recognize MaoYan font.

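MaoYan font recognition. This GitHub repo aims to help XJTLU students recognize MaoYan's distorted fonts. It has been packaged and uploaded to PyPI and can be installed directly with pip. Why the MaoYan fonts cannot be recognized, and the approach to solving it, are described on 采茶. Usage: import MaoYanFontRecognize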

4 Jun 30, 2022
