Text recognition (optical character recognition) with deep learning methods.

Overview

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis

| paper | training and evaluation data | failure cases and cleansed label | pretrained model | Baidu ver(passwd:rryk) |

Official PyTorch implementation of our four-stage STR framework, which most existing STR models fit into.
This framework makes it possible to measure module-wise contributions to performance in terms of accuracy, speed, and memory demand, under one consistent set of training and evaluation datasets.
Such analyses remove the obstacles in current comparisons that make it hard to understand the performance gains of existing modules.

Honors

Based on this framework, we took 1st place in ICDAR2013 focused scene text and ICDAR2019 ArT, and 3rd place in ICDAR2017 COCO-Text and ICDAR2019 ReCTS (task1).
The difference between our paper and ICDAR challenge is summarized here.

Updates

Aug 3, 2020: added guideline to use Baidu warpctc which reproduces CTC results of our paper.
Dec 27, 2019: added FLOPS in our paper, and minor updates such as log_dataset.txt and ICDAR2019-NormalizedED.
Oct 22, 2019: added confidence score, and tidied the output format of training logs.
Jul 31, 2019: The paper was accepted at the International Conference on Computer Vision (ICCV), Seoul 2019, as an oral talk.
Jul 25, 2019: added code for 16-bit floating-point calculation, see @YacobBY's pull request
Jul 16, 2019: added ST_spe.zip dataset, which contains the word images with special characters from the SynthText (ST) dataset, see this issue
Jun 24, 2019: added gt.txt of failure cases that contains the path and label of each image, see image_release_190624.zip
May 17, 2019: uploaded resources to Baidu Netdisk as well, added Run demo. (also check @sharavsambuu's colab demo)
May 9, 2019: PyTorch version updated from 1.0.1 to 1.1.0, use torch.nn.CTCLoss instead of torch-baidu-ctc, and various minor updates.

Getting Started

Dependency

  • This work was tested with PyTorch 1.3.1, CUDA 10.1, python 3.6 and Ubuntu 16.04.
    You may need pip3 install torch==1.3.1.
    In the paper, experiments were performed with PyTorch 0.4.1, CUDA 9.0.
  • requirements : lmdb, pillow, torchvision, nltk, natsort
pip3 install lmdb pillow torchvision nltk natsort
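
Before training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and the import names (notably `PIL` for pillow) are assumptions about how each package is imported.

```python
import importlib.util

# Import names assumed for the requirements listed above
# (pillow is imported as PIL).
REQUIRED = ["torch", "lmdb", "PIL", "torchvision", "nltk", "natsort"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that a package can be found; it does not check versions such as PyTorch 1.3.1.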

Download lmdb dataset for training and evaluation from here

data_lmdb_release.zip contains the following.
training datasets : MJSynth (MJ)[1] and SynthText (ST)[2]
validation datasets : the union of the training sets IC13[3], IC15[4], IIIT[5], and SVT[6].
evaluation datasets : benchmark evaluation datasets, consisting of IIIT[5], SVT[6], IC03[7], IC13[3], IC15[4], SVTP[8], and CUTE[9].

Run demo with pretrained model

  1. Download pretrained model from here
  2. Add image files to test into demo_image/
  3. Run demo.py (add --sensitive option if you use case-sensitive model)
CUDA_VISIBLE_DEVICES=0 python3 demo.py \
--Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
--image_folder demo_image/ \
--saved_model TPS-ResNet-BiLSTM-Attn.pth

prediction results

| TRBA (TPS-ResNet-BiLSTM-Attn) | TRBA (case-sensitive version) |
| available | Available |
| shakeshack | SHARESHACK |
| london | Londen |
| greenstead | Greenstead |
| toast | TOAST |
| merry | MERRY |
| underground | underground |
| ronaldo | RONALDO |
| bally | BALLY |
| university | UNIVERSITY |
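
To see where the two models disagree beyond letter case, the predictions listed above can be compared after lowercasing. This is only an illustrative sketch: the pairs are copied from the prediction results above, and the helper name `agree_ignoring_case` is hypothetical.

```python
# (case-insensitive TRBA output, case-sensitive TRBA output),
# copied from the prediction results above.
PREDICTIONS = [
    ("available", "Available"),
    ("shakeshack", "SHARESHACK"),
    ("london", "Londen"),
    ("greenstead", "Greenstead"),
    ("toast", "TOAST"),
    ("merry", "MERRY"),
    ("underground", "underground"),
    ("ronaldo", "RONALDO"),
    ("bally", "BALLY"),
    ("university", "UNIVERSITY"),
]


def agree_ignoring_case(pairs):
    """Return one boolean per pair: True when both strings match after lowercasing."""
    return [first.lower() == second.lower() for first, second in pairs]


agreement = agree_ignoring_case(PREDICTIONS)
print(sum(agreement), "of", len(agreement), "pairs match when case is ignored")
for (first, second), same in zip(PREDICTIONS, agreement):
    if not same:
        print("differs:", first, "vs", second)
```

Eight of the ten pairs match once case is ignored; the two that differ are the shakeshack and london rows.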

Training and evaluation

  1. Train CRNN[10] model
CUDA_VISIBLE_DEVICES=0 python3 train.py \
--train_data data_lmdb_release/training --valid_data data_lmdb_release/validation \
--select_data MJ-ST --batch_ratio 0.5-0.5 \
--Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC
  2. Test CRNN[10] model. If you want to evaluate IC15-2077, check data filtering part.
CUDA_VISIBLE_DEVICES=0 python3 test.py \
--eval_data data_lmdb_release/evaluation --benchmark_all_eval \
--Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC \
--saved_model saved_models/None-VGG-BiLSTM-CTC-Seed1111/best_accuracy.pth
  3. Also try training and testing our best-accuracy model, TRBA (TPS-ResNet-BiLSTM-Attn). (download pretrained model)
CUDA_VISIBLE_DEVICES=0 python3 train.py \
--train_data data_lmdb_release/training --valid_data data_lmdb_release/validation \
--select_data MJ-ST --batch_ratio 0.5-0.5 \
--Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn
CUDA_VISIBLE_DEVICES=0 python3 test.py \
--eval_data data_lmdb_release/evaluation --benchmark_all_eval \
--Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
--saved_model saved_models/TPS-ResNet-BiLSTM-Attn-Seed1111/best_accuracy.pth
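
The --saved_model paths in the commands above appear to follow the pattern {Transformation}-{FeatureExtraction}-{SequenceModeling}-{Prediction}-Seed{seed}. The sketch below builds such a name and validates the stage choices. It is an illustration only: the helper names are hypothetical, and both the naming pattern and the seed value 1111 are inferred from those paths rather than taken from the training code.

```python
# Allowed values for each stage, copied from the Arguments section of this README.
STAGE_CHOICES = {
    "Transformation": ("None", "TPS"),
    "FeatureExtraction": ("VGG", "RCNN", "ResNet"),
    "SequenceModeling": ("None", "BiLSTM"),
    "Prediction": ("CTC", "Attn"),
}


def experiment_name(transformation, feature_extraction, sequence_modeling,
                    prediction, seed=1111):
    """Build a name such as 'TPS-ResNet-BiLSTM-Attn-Seed1111' (hypothetical helper)."""
    chosen = {
        "Transformation": transformation,
        "FeatureExtraction": feature_extraction,
        "SequenceModeling": sequence_modeling,
        "Prediction": prediction,
    }
    for stage, value in chosen.items():
        if value not in STAGE_CHOICES[stage]:
            raise ValueError(
                f"{stage} must be one of {STAGE_CHOICES[stage]}, got {value!r}"
            )
    return "-".join(chosen.values()) + f"-Seed{seed}"


def checkpoint_path(name, root="saved_models"):
    """Path of the best-accuracy checkpoint, following the paths shown above."""
    return f"{root}/{name}/best_accuracy.pth"


if __name__ == "__main__":
    print(checkpoint_path(experiment_name("None", "VGG", "BiLSTM", "CTC")))
    print(checkpoint_path(experiment_name("TPS", "ResNet", "BiLSTM", "Attn")))
```

With the choices listed in this README there are 2 x 3 x 2 x 2 = 24 possible module combinations.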

Arguments

  • --train_data: folder path to training lmdb dataset.
  • --valid_data: folder path to validation lmdb dataset.
  • --eval_data: folder path to evaluation (with test.py) lmdb dataset.
  • --select_data: select training data. Default is MJ-ST, which means MJ and ST are used as training data.
  • --batch_ratio: assign the ratio of each selected dataset in the batch. Default is 0.5-0.5, which means 50% of the batch is filled with MJ and the other 50% is filled with ST.
  • --data_filtering_off: skip data filtering when creating LmdbDataset.
  • --Transformation: select Transformation module [None | TPS].
  • --FeatureExtraction: select FeatureExtraction module [VGG | RCNN | ResNet].
  • --SequenceModeling: select SequenceModeling module [None | BiLSTM].
  • --Prediction: select Prediction module [CTC | Attn].
  • --saved_model: path to the saved model to evaluate.
  • --benchmark_all_eval: evaluate with 10 evaluation dataset versions, the same as Table 1 in our paper.
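
As a rough illustration of how --select_data and --batch_ratio interact, the sketch below splits a batch across the selected datasets. It is not the repository's implementation: the function name `split_batch` is hypothetical, the batch size of 192 is just an example value, and the rounding rule (nearest integer, at least 1 sample per dataset) is an assumption.

```python
def split_batch(batch_size, select_data, batch_ratio):
    """Return {dataset_name: samples_per_batch} for '-'-separated option strings."""
    names = select_data.split("-")
    ratios = [float(r) for r in batch_ratio.split("-")]
    if len(names) != len(ratios):
        raise ValueError("select_data and batch_ratio need the same number of entries")
    # Assumption: each share is rounded to the nearest integer and is at least 1.
    return {
        name: max(round(batch_size * ratio), 1)
        for name, ratio in zip(names, ratios)
    }


if __name__ == "__main__":
    # Default options: half of the batch from MJ, half from ST.
    print(split_batch(192, "MJ-ST", "0.5-0.5"))
    # Three datasets, e.g. when special-character data (ST_spe) is added.
    print(split_batch(192, "MJ-ST-ST_spe", "0.4-0.4-0.2"))
```

Because each share is rounded independently, the per-dataset counts are not guaranteed to add up to the requested batch size for every ratio, so it is worth checking the total when using uneven ratios.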

Download failure cases and cleansed label from here

image_release.zip contains failure case images and benchmark evaluation images with cleansed label.

Training on your own dataset or non-Latin language datasets

  1. Create your own lmdb dataset.
pip3 install fire
python3 create_lmdb_dataset.py --inputPath data/ --gtFile data/gt.txt --outputPath result/

The data folder should be structured as below.

data
├── gt.txt
└── test
    ├── word_1.png
    ├── word_2.png
    ├── word_3.png
    └── ...

Each line of gt.txt should have the form {imagepath}\t{label}\n (the separator is a tab character).
For example

test/word_1.png Tiredness
test/word_2.png kills
test/word_3.png A
...
  2. Modify --select_data, --batch_ratio, and opt.character, see this issue.
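
Since gt.txt has to be tab-separated, a line that uses spaces instead of a tab cannot be split into a path and a label. The sketch below writes a gt.txt in the {imagepath}\t{label}\n format and reports lines that do not have exactly two tab-separated fields. It is a minimal illustration: the helper names `write_gt_file` and `find_bad_lines` are hypothetical and are not part of create_lmdb_dataset.py.

```python
import os
import tempfile


def write_gt_file(path, samples):
    """Write (image_path, label) pairs, one tab-separated pair per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for image_path, label in samples:
            if any(ch in image_path + label for ch in "\t\n"):
                raise ValueError(f"tab or newline inside entry: {image_path!r}, {label!r}")
            f.write(f"{image_path}\t{label}\n")


def find_bad_lines(path):
    """Return (line_number, text) for lines without exactly two non-empty fields."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            text = line.rstrip("\n")
            fields = text.split("\t")
            if len(fields) != 2 or not fields[0] or not fields[1]:
                bad.append((number, text))
    return bad


if __name__ == "__main__":
    samples = [
        ("test/word_1.png", "Tiredness"),
        ("test/word_2.png", "kills"),
        ("test/word_3.png", "A"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        gt_path = os.path.join(tmp, "gt.txt")
        write_gt_file(gt_path, samples)
        print(find_bad_lines(gt_path))  # well-formed file
        # Append a line that uses spaces instead of a tab.
        with open(gt_path, "a", encoding="utf-8") as f:
            f.write("test/word_4.png spaces instead of a tab\n")
        print(find_bad_lines(gt_path))  # reports line 4
```

Running a check like this before create_lmdb_dataset.py is one way to catch formatting problems early.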

Acknowledgements

This implementation is based on these repositories: crnn.pytorch, ocr_attention.

Reference

[1] M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman. Synthetic data and artificial neural networks for natural scene text recognition. In Workshop on Deep Learning, NIPS, 2014.
[2] A. Gupta, A. Vedaldi, and A. Zisserman. Synthetic data for text localisation in natural images. In CVPR, 2016.
[3] D. Karatzas, F. Shafait, S. Uchida, M. Iwamura, L. G. i Bigorda, S. R. Mestre, J. Mas, D. F. Mota, J. A. Almazan, and L. P. De Las Heras. ICDAR 2013 robust reading competition. In ICDAR, pages 1484–1493, 2013.
[4] D. Karatzas, L. Gomez-Bigorda, A. Nicolaou, S. Ghosh, A. Bagdanov, M. Iwamura, J. Matas, L. Neumann, V. R. Chandrasekhar, S. Lu, et al. ICDAR 2015 competition on robust reading. In ICDAR, pages 1156–1160, 2015.
[5] A. Mishra, K. Alahari, and C. Jawahar. Scene text recognition using higher order language priors. In BMVC, 2012.
[6] K. Wang, B. Babenko, and S. Belongie. End-to-end scene text recognition. In ICCV, pages 1457–1464, 2011.
[7] S. M. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong, and R. Young. ICDAR 2003 robust reading competitions. In ICDAR, pages 682–687, 2003.
[8] T. Q. Phan, P. Shivakumara, S. Tian, and C. L. Tan. Recognizing text with perspective distortion in natural scenes. In ICCV, pages 569–576, 2013.
[9] A. Risnumawan, P. Shivakumara, C. S. Chan, and C. L. Tan. A robust arbitrary text detection system for natural scene images. In ESWA, volume 41, pages 8027–8048, 2014.
[10] B. Shi, X. Bai, and C. Yao. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. In TPAMI, volume 39, pages 2298–2304, 2017.


Citation

Please consider citing this work in your publications if it helps your research.

@inproceedings{baek2019STRcomparisons,
  title={What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis},
  author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
  booktitle = {International Conference on Computer Vision (ICCV)},
  year={2019},
  pubstate={published},
  tppubtype={inproceedings}
}

Contact

Feel free to contact us if you have any questions:
for code/paper Jeonghun Baek [email protected]; for collaboration [email protected] (our team leader).

License

Copyright (c) 2019-present NAVER Corp.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Issues
  • To train on my own dataset

    Hi. I created an lmdb dataset from my own data by running create_lmdb_dataset.py. Then I ran the train command on it and got the following output:

    CUDA_VISIBLE_DEVICES=0 python3 train.py --train_data result/train --valid_data result/test --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn

    dataset_root: result/train
    opt.select_data: ['MJ', 'ST']
    opt.batch_ratio: ['0.5', '0.5']

    dataset_root: result/train dataset: MJ
    Traceback (most recent call last):
      File "train.py", line 283, in <module>
        train(opt)
      File "train.py", line 26, in train
        train_dataset = Batch_Balanced_Dataset(opt)
      File "/home/mor-ai/Work/deep-text-recognition-benchmark/dataset.py", line 37, in __init__
        _dataset = hierarchical_dataset(root=opt.train_data, opt=opt, select_data=[selected_d])
      File "/home/mor-ai/Work/deep-text-recognition-benchmark/dataset.py", line 106, in hierarchical_dataset
        concatenated_dataset = ConcatDataset(dataset_list)
      File "/home/mor-ai/.local/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 187, in __init__
        assert len(datasets) > 0, 'datasets should not be an empty iterable'
    AssertionError: datasets should not be an empty iterable

    Can you help me resolve this?

    opened by xxxpsyduck 22
  • Recognition error using TPS-ResNet, VGG-BiLSTM-Attn

    The sample images:

    d_autohomecar__wKgHPltYXyWAZHTbAANpvWtj5Hs964_0 和兰豪华感,而风上的 wrong
    d_autohomecar__wKgHPltYXyWAZHTbAANpvWtj5Hs964_1 内饰设局觉觉温馨范儿 wrong
    d_autohomecar__wKgHPlstHCiANf7JAAGvh7L-4DU249_1 后扭力梁非独立悬架 correct
    d_autohomecar__wKgHPlt2pX6AQNrVAANoQlFZZXQ045_0 变之水波落务变得更出 wrong

    I trained the model with 32x256 input and set batch_max_length=64 (for both training and testing). Something seems wrong: when a sample contains many characters, the result is wrong.

    The training dataset is normal.

    Thanks

    opened by AnddyWang 16
  • Accuracy difference between local retraining model and pretrained one

    First, thanks for your great work :) ! You've done a good job!

    Here's my question, I've retrained the model with the option as: "--select_data MJ-ST --batch_ratio 0.5-0.5 --Transformation None --FeatureExtraction VGG --SequenceModeling BiLSTM --Prediction CTC" , corresponding to the original version of CRNN. The rest parameters are set as default and the model is trained on MJ and ST datasets.

    However, when testing with my locally retrained best_accuracy model, the accuracy is as follows:

    IC13_857: 88.45% (91.1% in paper)
    IC13_1015: 87.68% (89.2% in paper)
    IC15_1811: 66.37% (69.4% in paper)
    IC15_2077: 64.07% (64.2% in paper)

    It seems like there is still something inappropriate in my retraining process. Should I reset the learning rate or expand my training iteration? Do you guys have any idea about improving the performance to align with the public results illustrated in the paper?

    And I've attempted to train only on MJ dataset, whose model seems to have a higher accuracy in IC13_857. When I extend the training on both MJ and ST, is it necessary to add up the iteration number, so that I can get a better accuracy?

    Looking forward to your reply ^_^

    opened by 1LOVESJohnny 11
  • How to train with custom data - AssertionError: datasets should not be an empty iterable

    I use this command to train: !python3 '/content/deep-text-recognition-benchmark/create_lmdb_dataset.py' --inputPath '/content/deep-text-recognition-benchmark/train' --gtFile '/content/gt.txt' --outputPath '/content/deep-text-recognition-benchmark/result' I want to use it to detect license plates.

    inputPath contains the input images, with names like 'CJFY10.jpg'. outputPath is an empty folder. gtFile is a txt file with this format:

    C:/Users/X/Desktop/deep-text-recognition-benchmark-master/train/DVHS56.png DVHS56
    C:/Users/X/Desktop/deep-text-recognition-benchmark-master/train/DYVS72.png DYVS72
    C:/Users/X/Desktop/deep-text-recognition-benchmark-master/val/HDYP18.png HDYP18
    C:/Users/X/Desktop/deep-text-recognition-benchmark-master/val/HKHT72.png HKHT72
    C:/Users/X/Desktop/deep-text-recognition-benchmark-master/val/HPXC69.png HPXC69
    

    The error when I run the command is:

    Traceback (most recent call last):
      File "/content/deep-text-recognition-benchmark/create_lmdb_dataset.py", line 87, in <module>
        fire.Fire(createDataset)
      File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 138, in Fire
        component_trace = _Fire(component, args, parsed_flag_args, context, name)
      File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 468, in _Fire
        target=component.__name__)
      File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 672, in _CallAndUpdateTrace
        component = fn(*varargs, **kwargs)
      File "/content/deep-text-recognition-benchmark/create_lmdb_dataset.py", line 47, in createDataset
        imagePath, label = datalist[i].strip('\n').split('\t')
    ValueError: not enough values to unpack (expected 2, got 1)
    

    What am I doing wrong? Any help is welcome

    opened by pendex900x 8
  • validation wrong

    When the code reaches valid_loss, current_accuracy, current_norm_ED, preds, confidence_score, labels, infer_time, length_of_data = validation(model, criterion, valid_loader, converter, opt), it stops there without raising any error, and the program cannot continue.

    opened by cqray1990 8
  • Can't train model with own lmdb dataset

    I have a problem training a model with my own lmdb dataset. I used create_lmdb_dataset.py with 1000 Vietnamese samples to create the database. When I train the model, I get:

    dataset_root: data/training
    opt.select_data: ['ST']
    opt.batch_ratio: ['0.5']

    dataset_root: data/training dataset: ST
    sub-directory: /ST num samples: 3
    num total samples of ST: 3 x 1.0 (total_data_usage_ratio) = 3
    num samples of ST per batch: 192 x 0.5 (batch_ratio) = 96

    Total_batch_size: 96 = 96

    Can you please tell me how to train on my own database? Thank you

    opened by thangtran480 8
  • Difference in performance between online demo website and the offline code

    Hi, I have been trying to run the code on these two images:

    new_img_0

    new_img_2

    I get correct results on the demo website https://demo.ocr.clova.ai/. This is the result I get through the offline code: First Image: he11505973 Second Image: hijo86

    opened by AyushP123 7
  • miss match in size

    RuntimeError: Error(s) in loading state_dict for DataParallel:
        size mismatch for module.Prediction.attention_cell.rnn.weight_ih: copying a param with shape torch.Size([1024, 352]) from checkpoint, the shape in current model is torch.Size([1024, 294]).
        size mismatch for module.Prediction.generator.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([38]).
        size mismatch for module.Prediction.generator.weight: copying a param with shape torch.Size([96, 256]) from checkpoint, the shape in current model is torch.Size([38, 256]).

    opened by GSATHYANARAYANA 7
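
    The shapes in this error are consistent with two different character sets, which the arithmetic below tries to reproduce. It is a sketch under assumptions that the text above does not confirm: that the Attn prediction head has one output per character plus two special tokens, that the CTC head has one output per character plus one blank token, that the attention cell input is the 256-dimensional hidden state concatenated with a one-hot character, and that the two character sets are the 36 lowercase alphanumerics and the 94 printable non-whitespace ASCII characters. All function names are hypothetical.

```python
import string

# Hidden size read off the error message: generator.weight has shape [num_class, 256].
HIDDEN_SIZE = 256


def attn_num_class(character):
    """Assumption: attention decoding adds two special tokens (start and end)."""
    return len(character) + 2


def ctc_num_class(character):
    """Assumption: CTC decoding adds one blank token."""
    return len(character) + 1


def attention_cell_input_size(num_class, hidden_size=HIDDEN_SIZE):
    """Assumption: the cell input is a context vector plus a one-hot character."""
    return hidden_size + num_class


# Assumed character sets: 36 lowercase alphanumerics, and the 94 printable
# ASCII characters left after dropping whitespace from string.printable.
CHARSETS = {
    "case-insensitive": string.digits + string.ascii_lowercase,
    "case-sensitive": string.printable[:-6],
}

for label, chars in CHARSETS.items():
    n = attn_num_class(chars)
    print(f"{label}: {len(chars)} characters, {n} classes, "
          f"weight_ih columns = {attention_cell_input_size(n)}")
```

    Under these assumptions the case-insensitive set gives 38 classes and 294 columns, and the case-sensitive set gives 96 classes and 352 columns, matching the numbers in the error. That would suggest a case-sensitive checkpoint loaded into a case-insensitive configuration, which is the situation the --sensitive option in the demo instructions addresses.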
  • Size mismatch while loading pretrained model for fine-tuning with additional characters

    Hi,

    I have followed all the necessary steps for FT as given in the other threads, but the size mismatch error keeps recurring because the number of characters in the pre-trained model and in the custom dataset is not the same. Can somebody please guide me on how to load the pre-trained model with a modified prediction layer such that it can be used for fine-tuning?

    Thanks in advance!

    opened by PrithaGanguly 6
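
    One common way to approach this kind of mismatch is to load only the checkpoint entries whose shapes match the new model and leave the rest freshly initialized. The sketch below shows that filtering step. It is an illustration, not something this repository is stated to provide: the helper name `split_by_shape` is hypothetical, the parameter names in the demo are made up, and the stand-in objects only mimic the `.shape` attribute of tensors so the example runs without PyTorch.

```python
from types import SimpleNamespace


def split_by_shape(checkpoint_state, model_state):
    """Partition checkpoint entries by whether their shape matches the model's.

    Both arguments map parameter names to objects with a .shape attribute.
    Returns (compatible_entries, skipped_names).
    """
    compatible, skipped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(value.shape) == tuple(target.shape):
            compatible[name] = value
        else:
            skipped.append(name)
    return compatible, skipped


if __name__ == "__main__":
    # Stand-ins for tensors; the parameter names are illustrative only.
    checkpoint = {
        "FeatureExtraction.conv.weight": SimpleNamespace(shape=(64, 1, 3, 3)),
        "Prediction.generator.weight": SimpleNamespace(shape=(38, 256)),
        "Prediction.generator.bias": SimpleNamespace(shape=(38,)),
    }
    model = {
        "FeatureExtraction.conv.weight": SimpleNamespace(shape=(64, 1, 3, 3)),
        "Prediction.generator.weight": SimpleNamespace(shape=(40, 256)),
        "Prediction.generator.bias": SimpleNamespace(shape=(40,)),
    }
    compatible, skipped = split_by_shape(checkpoint, model)
    print("loaded:", sorted(compatible))
    print("skipped:", skipped)
```

    With PyTorch, the two inputs would be the loaded checkpoint and `model.state_dict()`, and the compatible entries could then be passed to `model.load_state_dict(compatible, strict=False)`. The skipped layers (here the prediction head) start from their fresh initialization and need to be trained on the new character set.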
  • path does not exist error while creating custom dataset

    opened by omersert 6
  • Simple image causing troubles

    I added a simple test image to the demo folder and it gets wrongly recognized. The image: test_dbg The output: 128456782012

    The command I used is the same as in the README. Thinking that the problem could be related to the string length, I tried retraining the TPS-ResNet-BiLSTM-Attn model using an imgW of 200 pixels, but the problem seems to be very similar.

    Any idea on why this happens? It seems to me that this image is much simpler compared to the other demo images.

    opened by marcoromelli 6
  • How do we make the files required for running our custom model?

    The EasyOCR README states that the following three files are required to use a custom model: custom_model.pth, custom_model.yaml, and custom_model.py.

    How are we supposed to create the custom_model.yaml and custom_model.py files?

    Are we supposed to copy functions / classes from modules.py? If so, TPS and Attention are not present in that file, so where can we find these?

    Thank you

    opened by sameearif88 0
  • Special character fine tune with pre trained model

    A shape mismatch happens when I try to add 2 extra characters while fine-tuning TPS-ResNet-BiLSTM-Attn.pth. Is there any solution?

    RuntimeError: Error(s) in loading state_dict for DataParallel:
        size mismatch for module.Prediction.attention_cell.rnn.weight_ih: copying a param with shape torch.Size([1024, 294]) from checkpoint, the shape in current model is torch.Size([1024, 296]).
        size mismatch for module.Prediction.generator.weight: copying a param with shape torch.Size([38, 256]) from checkpoint, the shape in current model is torch.Size([40, 256]).
        size mismatch for module.Prediction.generator.bias: copying a param with shape torch.Size([38]) from checkpoint, the shape in current model is torch.Size([40]).

    opened by aafaqin 2
  • training dataset problem

    Hi,

    I would like to make the model work well on both printed and handwritten documents, and I am confused about how to train it. If I want the model to recognize both printed characters and handwriting, do I need to train on both printed and handwritten characters at the same time, or should I train on them separately? This is my first time training an OCR model, so please explain in detail if possible.

    Thank you in advance!

    opened by hannah4kr 5
  • Ratio of training data with SJ MJ and T_spe

    With MJ and ST the batch ratio is 0.5 0.5 respectively. Can someone guide me, what would be the best approach when adding the special character data into the training set as well? Currently, I've set a ratio of 0.4,0.4,0.2 (MJ,ST,ST_spe)

    opened by SaeedArisha 0
  • Fine tuning problem! urgent

    I trained from scratch using the lmdb dataset from the repo, then created my own dataset using trdg to add non-Latin characters and retrained the model. With the original dataset I got pretty good predictions, but after training with the dataset I created, the predictions are very bad. Please help! Or should I train the model from scratch using non-Latin characters?

    Another question: after creating the dataset .mdb files, do I also have to replace the datasets in the validation folder?

    This is confusing. We create the training dataset with python3 create_lmdb_dataset.py --inputPath data/ --gtFile data/gt.txt --outputPath result/

    But what about the validation datasets?

    opened by gitdeepheolp 2
Owner
Clova AI Research
Open source repository of Clova AI Research, NAVER & LINE