Visual Attention based OCR

Overview

Attention-OCR

Authors: Qi Guo and Yuntian Deng

Visual Attention based OCR. The model first runs a sliding CNN over the image (images are resized to height 32 while preserving the aspect ratio). Then an LSTM is stacked on top of the CNN. Finally, an attention model is used as a decoder to produce the final outputs.

example image 0
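For intuition, here is a minimal sketch of that pipeline written with tf.keras. This is not the repository's original TensorFlow 0.12 / Keras 1 code; the layer sizes, pooling scheme, and reshaping step are illustrative assumptions.

# Minimal sketch of the CNN -> LSTM encoder described above (assumptions noted).
import tensorflow as tf
from tensorflow.keras import layers

def build_encoder(height=32, channels=1, num_hidden=256):
    # Images are resized to height 32; the width stays variable (None).
    image = layers.Input(shape=(height, None, channels))
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(image)
    x = layers.MaxPool2D((2, 2))(x)                      # height 16
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)
    x = layers.MaxPool2D((2, 2))(x)                      # height 8
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
    x = layers.MaxPool2D((8, 1))(x)                      # collapse height to 1
    # Each remaining column acts as one "sliding CNN" feature vector.
    seq = layers.Lambda(lambda t: tf.squeeze(t, axis=1))(x)
    # LSTM stacked on top of the CNN features; an attention decoder (not shown)
    # would attend over `encoded` and emit one character per step until EOS.
    encoded = layers.LSTM(num_hidden, return_sequences=True)(seq)
    return tf.keras.Model(image, encoded)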

Prerequisites

Most of our code is written on top of TensorFlow, but we also use Keras for the convolution part of our model. In addition, we use the Python package distance to calculate edit distance for evaluation. (This is optional: if distance is not installed, we fall back to exact match.)

TensorFlow: Installation Instructions (tested on 0.12.1)

Distance (Optional):

wget http://www.cs.cmu.edu/~yuntiand/Distance-0.1.3.tar.gz
tar zxf Distance-0.1.3.tar.gz
cd distance; sudo python setup.py install
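The distance package is only used for scoring; for instance, its levenshtein function returns the edit distance between a prediction and the ground truth:

# Quick check that the optional distance package works as expected.
import distance

print(distance.levenshtein("kitten", "sitting"))  # 3 edits
print(distance.levenshtein("abc", "abc"))         # 0 edits, i.e. an exact match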

Usage:

Note: We assume that the working directory is Attention-OCR.

Train

Data Preparation

We need a file (specified by the parameter data-path) containing the paths of the images and the corresponding characters, e.g.:

path/to/image1 abc
path/to/image2 def

We also need to specify a data-base-dir parameter so that the images are read from the path data-base-dir/path/to/image. If data-path contains absolute paths of images, then data-base-dir needs to be set to /.
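As an illustration, a small hypothetical helper like the one below can generate such an annotation file; the convention of encoding the label in the file name (e.g. abc_001.jpg) is an assumption made only for this example.

# Hypothetical helper: write a data-path annotation file in the expected
# "image_path<space>characters" format. The label-from-filename convention
# (e.g. "abc_001.jpg" -> label "abc") is an assumption for this example.
import os

def write_annotation(image_dir, out_path):
    with open(out_path, "w") as f:
        for name in sorted(os.listdir(image_dir)):
            if not name.lower().endswith((".jpg", ".png")):
                continue
            label = name.split("_")[0]               # label encoded in the file name
            f.write("{} {}\n".format(name, label))   # path relative to data-base-dir

# write_annotation("sample", "sample/sample.txt")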

A Toy Example

For a toy example, we have prepared a training dataset in the specified format, which is a subset of Synth 90k:

wget http://www.cs.cmu.edu/~yuntiand/sample.tgz
tar zxf sample.tgz
python src/launcher.py --phase=train --data-path=sample/sample.txt --data-base-dir=sample --log-path=log.txt --no-load-model

After a while, you will see something like the following output in log.txt:

...
2016-06-08 20:47:22,335 root  INFO     Created model with fresh parameters.
2016-06-08 20:47:52,852 root  INFO     current_step: 0
2016-06-08 20:48:01,253 root  INFO     step_time: 8.400597, step perplexity: 38.998714
2016-06-08 20:48:01,385 root  INFO     current_step: 1
2016-06-08 20:48:07,166 root  INFO     step_time: 5.781749, step perplexity: 38.998445
2016-06-08 20:48:07,337 root  INFO     current_step: 2
2016-06-08 20:48:12,322 root  INFO     step_time: 4.984972, step perplexity: 39.006730
2016-06-08 20:48:12,347 root  INFO     current_step: 3
2016-06-08 20:48:16,821 root  INFO     step_time: 4.473902, step perplexity: 39.000267
2016-06-08 20:48:16,859 root  INFO     current_step: 4
2016-06-08 20:48:21,452 root  INFO     step_time: 4.593249, step perplexity: 39.009864
2016-06-08 20:48:21,530 root  INFO     current_step: 5
2016-06-08 20:48:25,878 root  INFO     step_time: 4.348195, step perplexity: 38.987707
2016-06-08 20:48:26,016 root  INFO     current_step: 6
2016-06-08 20:48:30,851 root  INFO     step_time: 4.835423, step perplexity: 39.022887

Note that it takes quite a long time to reach convergence, since we are training the CNN and attention model simultaneously.
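The step perplexity reported in the log appears to be the exponential of the per-step cross-entropy loss (later logs that also print step_loss are consistent with this), which would explain why the untrained model starts around 39, the size of the target vocabulary:

import math

# step_loss 0.080441 appears alongside step perplexity 1.083765 in a later log:
print(math.exp(0.080441))   # ~1.0838
# A uniform guess over the 39-symbol target vocabulary gives perplexity 39:
print(math.log(39))         # ~3.66, matching the initial loss/perplexity values above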

Test and visualize attention results

The test data format is the same as the training data format. We have also prepared a test dataset in the same format, which includes ICDAR03, ICDAR13, IIIT5k and SVT.

wget http://www.cs.cmu.edu/~yuntiand/evaluation_data.tgz
tar zxf evaluation_data.tgz

We also provide a trained model on Synth 90K:

wget http://www.cs.cmu.edu/~yuntiand/model.tgz
tar zxf model.tgz
python src/launcher.py --phase=test --visualize --data-path=evaluation_data/svt/test.txt --data-base-dir=evaluation_data/svt --log-path=log.txt --load-model --model-dir=model --output-dir=results

After a while, you will see something like the following output in log.txt:

2016-06-08 22:36:31,638 root  INFO     Reading model parameters from model/translate.ckpt-47200
2016-06-08 22:36:40,529 root  INFO     Compare word based on edit distance.
2016-06-08 22:36:41,652 root  INFO     step_time: 1.119277, step perplexity: 1.056626
2016-06-08 22:36:41,660 root  INFO     1.000000 out of 1 correct
2016-06-08 22:36:42,358 root  INFO     step_time: 0.696687, step perplexity: 2.003350
2016-06-08 22:36:42,363 root  INFO     1.666667 out of 2 correct
2016-06-08 22:36:42,831 root  INFO     step_time: 0.466550, step perplexity: 1.501963
2016-06-08 22:36:42,835 root  INFO     2.466667 out of 3 correct
2016-06-08 22:36:43,402 root  INFO     step_time: 0.562091, step perplexity: 1.269991
2016-06-08 22:36:43,418 root  INFO     3.366667 out of 4 correct
2016-06-08 22:36:43,897 root  INFO     step_time: 0.477545, step perplexity: 1.072437
2016-06-08 22:36:43,905 root  INFO     4.366667 out of 5 correct
2016-06-08 22:36:44,107 root  INFO     step_time: 0.195361, step perplexity: 2.071796
2016-06-08 22:36:44,127 root  INFO     5.144444 out of 6 correct
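The fractional counts (e.g. 1.666667 out of 2 correct) come from the edit-distance comparison announced at the top of the log: each word contributes a partial score rather than a strict 0/1. A plausible per-word score, shown here as an assumption rather than the repository's exact formula, is one minus the normalized edit distance:

# A plausible per-word score under edit-distance comparison (assumption,
# not taken from the repository): 1 - normalized_edit_distance, floored at 0.
import distance

def word_score(prediction, ground_truth):
    if not ground_truth:
        return float(prediction == ground_truth)
    d = distance.levenshtein(prediction, ground_truth)
    return max(0.0, 1.0 - float(d) / len(ground_truth))

print(word_score("house", "house"))   # 1.0
print(word_score("hause", "house"))   # 0.8 (one substitution in five characters)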

Example output images in results/correct (the output directory is set via the parameter output-dir; the default is results). Look closely to see the attention maps clearly.

Format: Image index (predicted/ground truth) Image file

Image 0 (j/j): example image 0

Image 1 (u/u): example image 1

Image 2 (n/n): example image 2

Image 3 (g/g): example image 3

Image 4 (l/l): example image 4

Image 5 (e/e): example image 5

Parameters:

  • Control

    • phase: Determine whether to train or test.
    • visualize: Valid if phase is set to test. Output the attention maps on the original image.
    • load-model: Load model from model-dir or not.
  • Input and output

    • data-base-dir: The base directory of the image paths in data-path. If the image paths in data-path are absolute, set this to /.
    • data-path: The path containing data file names and labels. Format per line: image_path characters.
    • model-dir: The directory for saving and loading model parameters (structure is not stored).
    • log-path: The path to put log.
    • output-dir: The path to put visualization results if visualize is set to True.
    • steps-per-checkpoint: How many steps between checkpoints (print perplexity, save the model).
  • Optimization

    • num-epoch: The number of whole data passes.
    • batch-size: Batch size. Only valid if phase is set to train.
    • initial-learning-rate: Initial learning rate. Note that we use AdaDelta, so the initial value does not matter much.
  • Network

    • target-embedding-size: Embedding dimension for each target.
    • attn-use-lstm: Whether or not to use an LSTM attention decoder cell.
    • attn-num-hidden: Number of hidden units in attention decoder cell.
    • attn-num-layers: Number of layers in the attention decoder cell. (The number of encoder hidden units will be attn-num-hidden*attn-num-layers.)
    • target-vocab-size: Target vocabulary size. Default is 39 = 26 + 10 + 3 (0: PADDING, 1: GO, 2: EOS, >2: 0-9, a-z); see the sketch below.
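A sketch of the default 39-symbol target vocabulary; the exact ordering inside the ">2: 0-9, a-z" range is an assumption.

# Sketch of the default 39-symbol target vocabulary described above
# (the ordering of the character range is an assumption).
import string

SPECIAL = ["PADDING", "GO", "EOS"]                            # ids 0, 1, 2
CHARSET = list(string.digits) + list(string.ascii_lowercase)  # 10 + 26 symbols

id_to_char = {i: s for i, s in enumerate(SPECIAL + CHARSET)}
char_to_id = {c: i for i, c in enumerate(SPECIAL + CHARSET)}

assert len(id_to_char) == 39          # matches the default target-vocab-size
print(char_to_id["0"], char_to_id["a"])  # 3, 13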

References

Convert a formula to its LaTeX source

What You Get Is What You See: A Visual Markup Decompiler

Torch attention OCR

Comments
  • Testing the model

    Hi, I have just run the new version of your code (TF and Keras) on my dataset. The perplexity is ~1.3, which is very nice. But when I test the model I get ~1000. After looking at the visualization, it looks like the model always predicts the same sequence of letters, no matter what the input is (both on the train set and on a validation set the model did not see).

    This is the command I use for testing:

    python src/launcher.py --phase=test --data-path=/home/ubuntu/Data/anno_v2/labels_vals.txt --data-base-dir=/home/ubuntu/Data/anno_v2 --log-path=logtest.txt --load-model --visualize
    

    On the other hand, when I continue training, everything is OK. The only thing that varies between these two runs is the decoder input: during testing it is the predicted words from previous iterations, while during training it is the ground truth. Do you know what goes wrong? Or have I overfit to the data, so that the first letter now determines the whole sequence, no matter what the input is (my data has 15k examples)?

    opened by melgor 23
  • What's your training parameters?

    Hi guys,

    My trained model can't reach the accuracy of your released model.tar. My training parameters are:

    python src/launcher.py \
        --phase=train \
        --data-path=${data_path} \
        --data-base-dir=${data_base_dir} \
        --log-path=log_train.txt \
        --load-model \
        --model-dir=$model_dir \
        --num-epoch=800 \
        --target-embedding-size=10 
    

    When training reaches translate.ckpt-48000, I use that checkpoint to evaluate SVT. But the accuracy is very poor, 6.1% (39.365199 out of 647 correct), compared with your result of 68% (437.192016 out of 647 correct).

    I don't know why. What are your training parameters?

    waiting for your answers

    opened by striversist 21
  • Why the accuracy rate is so low from your model?

    Hi guys,

    I've downloaded the evaluation_data and the model you provided, and verified the test result with the command python src/launcher.py --phase=test --visualize --data-path=evaluation_data/svt/test.txt --data-base-dir=evaluation_data/svt --log-path=log.txt --load-model --model-dir=model --output-dir=results --target-embedding-size=10
    (BTW, you should add the --target-embedding-size=10 option, or else it fails with "Assign requires shapes of both tensors to match. lhs shape= [39,20] rhs shape= [39,10]".)

    But the resulting log.txt shows "188.780794 out of 647 correct", which means I only get 29% accuracy. Why is it so bad? Or did I do something wrong?

    opened by striversist 14
  • AttributeError: 'InputLayer' object has no attribute 'set_input'

    python -V Python 2.7.10

    python src/launcher.py --phase=train --data-path=sample/sample.txt --data-base-dir=sample --log-path=log.txt --no-load-model
    Using TensorFlow backend.
    2017-01-09 11:52:13,224 root INFO loading data
    2017-01-09 11:52:13,225 root INFO data_path: sample/sample.txt
    2017-01-09 11:52:13,225 root INFO phase: train
    2017-01-09 11:52:13,225 root INFO batch_size: 64
    2017-01-09 11:52:13,225 root INFO num_epoch: 1000
    2017-01-09 11:52:13,225 root INFO steps_per_checkpoint 500
    2017-01-09 11:52:13,225 root INFO target_vocab_size: 39
    2017-01-09 11:52:13,225 root INFO model_dir: train
    2017-01-09 11:52:13,225 root INFO target_embedding_size: 10
    2017-01-09 11:52:13,226 root INFO attn_num_hidden: 128
    2017-01-09 11:52:13,226 root INFO attn_num_layers: 2
    2017-01-09 11:52:13,226 root INFO buckets
    2017-01-09 11:52:13,226 root INFO [(16, 11), (27, 17), (35, 19), (64, 22), (80, 32)]
    Traceback (most recent call last):
      File "src/launcher.py", line 142, in <module>
        main(sys.argv[1:], exp_config.ExpConfig)
      File "src/launcher.py", line 138, in main
        old_model_version = parameters.old_model_version)
      File "/Users/logiz/www/test/Attention-OCR/src/model/model.py", line 119, in __init__
        cnn_model = CNN(self.img_data)
      File "/Users/logiz/www/test/Attention-OCR/src/model/cnn.py", line 30, in __init__
        self._build_network(input_tensor)
      File "/Users/logiz/www/test/Attention-OCR/src/model/cnn.py", line 43, in _build_network
        input_layer.set_input(input_tensor=input_tensor)
    AttributeError: 'InputLayer' object has no attribute 'set_input'

    how to fix,thank you.

    opened by webluoye 6
  • Training seems broken from tf 11 and keras 1.1.1

    I was checking the latest version, which should support tensorflow 11 and keras 1.1.1, which is what I have on my system.

    I ran the sample training on the full synth90 corpus (without the GRU option):

    python src/launcher.py \
    	--phase=train \
    	--data-path=90kDICT32px/new_annotation_train.txt \
    	--data-base-dir=90kDICT32px \
    	--log-path=log_sy.txt \
    	--attn-num-hidden 256 \
    	--batch-size 32 \
    	--model-dir=model_x \
    	--initial-learning-rate=1.0 \
    	--num-epoch=20000 \
    	--gpu-id=0 \
    --target-embedding-size=10
    
    

    It seems to converge better than the old version. After 70k iterations I get pretty nice perplexity:

    2016-12-08 13:12:07,393 root  INFO     current_step: 78998
    2016-12-08 13:12:07,589 root  INFO     step_time: 0.196019, step_loss: 0.080441, step perplexity: 1.083765
    2016-12-08 13:12:08,620 root  INFO     current_step: 78999
    2016-12-08 13:12:08,884 root  INFO     step_time: 0.264444, step_loss: 0.071078, step perplexity: 1.073665
    2016-12-08 13:12:08,885 root  INFO     global step 79000 step-time 0.26 loss 0.094580  perplexity 1.10
    2016-12-08 13:12:08,885 root  INFO     Saving model, current_step: 79000
    
    

    However, when testing on SVT, I get very bad performance:

    2016-12-08 13:35:25,947 root  INFO     step_time: 0.049570, loss: 3.750340, step perplexity: 42.535551
    2016-12-08 13:35:25,950 root  INFO     62.324373 out of 647 correct
    

    Interestingly, the model provided by the authors works great with the --old-model option. So it is not a decoder bug but rather some problem in training that does not show up in the training log but does affect the final performance.

    I wonder if someone has tried to train/test the new version using tensorflow 11 and keras 1.1.1? Thanks

    opened by gorinars 5
  • train fail

    Hello, I tried to train on your data with your code. I use TensorFlow as the backend and ran into a problem. How can I solve it? I look forward to your answer, thanks!

    File "~/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 496, in __iter__
     raise TypeError("'Tensor' object is not iterable.")
    TypeError: 'Tensor' object is not iterable.
    
    opened by Yang507 5
  • Accuracy on SVT data set

    I was trying to evaluate the performance of your tool on SVT test data

    As suggested by README, I downloaded the model from http://www.cs.cmu.edu/~yuntiand/model.tgz and the data from http://www.cs.cmu.edu/~yuntiand/evaluation_data.tgz

    After running

    python src/launcher.py --phase=test --visualize --data-path=evaluation_data/svt/test.txt --data-base-dir=evaluation_data/svt --log-path=log.txt --load-model --model-dir=model --output-dir=results

    I get

    INFO     3.166667 out of 6 correct
    
    ...
    
    INFO     437.192016 out of 647 correct
    

    I was expecting to get about 80% accuracy, but the results seem to be worse.

    Do you know what the expected accuracy of this tool is with the provided model, and how it could be improved to achieve state-of-the-art results?

    Thanks

    opened by gorinars 5
  • error using my own trained model to test.

    Hi, I've trained a model following the toy example and used it to test on the IIIT5k data. But an error occurs at "../Attention-OCR/src/model/model.py", line 182, in __init__, self.saver_all = tf.train.Saver(tf.all_variables()); an exception is thrown: tensorflow.python.framework.errors.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape=[1024] rhs shape=[2048]

    When using the author's model trained on Synth 90K to test IIIT5k, everything is OK and I finally get "2034.98 out of 3000 correct".

    Could anybody explain this? Thanks a lot.

    opened by flymark2010 4
  • error when I tried to run the toy example

    I downloaded the sample data into the Attention-OCR folder and decompressed it there. When I executed the command line, I encountered the error TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'. Below is the traceback:

      File "src/launcher.py", line 129, in <module>
        main(sys.argv[1:], exp_config.ExpConfig)
      File "src/launcher.py", line 125, in main
        session = sess)
      File "/data/clzhai/Attention-OCR-master/src/model/model.py", line 118, in __init__
        cnn_model = CNN(self.img_data)
      File "/data/clzhai/Attention-OCR-master/src/model/cnn.py", line 31, in __init__
        self._build_network(input_tensor)
      File "/data/clzhai/Attention-OCR-master/src/model/cnn.py", line 51, in _build_network
        border_mode='same'))
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/models.py", line 308, in add
        output_tensor = layer(self.outputs[0])
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/engine/topology.py", line 487, in __call__
        self.build(input_shapes[0])
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/layers/convolutional.py", line 410, in build
        self.W = self.init(self.W_shape, name='{}_W'.format(self.name))
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/initializations.py", line 57, in glorot_uniform
        fan_in, fan_out = get_fans(shape, dim_ordering=dim_ordering)
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/keras/initializations.py", line 15, in get_fans
        receptive_field_size = np.prod(shape[2:])
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2492, in prod
        out=out, keepdims=keepdims)
      File "/data/clzhai/anaconda2/envs/tensorflow/lib/python2.7/site-packages/numpy/core/_methods.py", line 35, in _prod
        return umr_prod(a, axis, dtype, out, keepdims)
    TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'

    opened by prismzhai 4
  • Run error: TypeError("'Tensor' object is not iterable.")

    hi guys,

    When I run "python src/launcher.py --phase=train --data-path=sample/sample.txt --data-base-dir=sample --log-path=log.txt --no-load-model", the following errors came out:

    Traceback (most recent call last):
      File "src/launcher.py", line 129, in <module>
        main(sys.argv[1:], exp_config.ExpConfig)
      File "src/launcher.py", line 125, in main
        session = sess)
      File "/home/aaron/projects/Attention-OCR/src/model/model.py", line 164, in __init__
        for old_value, new_value in cnn_model.model.updates:
      File "/home/aaron/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 495, in __iter__
        raise TypeError("'Tensor' object is not iterable.")
    TypeError: 'Tensor' object is not iterable.

    opened by striversist 4
  • Question about training procedure

    I trained the model on my own dataset, which contains 10k plate images. When I set the batch_size to 256, the step perplexity reaches 1.001; however, the step perplexity of the trained model increases to 10 if the batch_size is set to 2. If I fix the batch_size at 256, does the model really converge? I suspect that a large batch_size is not appropriate.

    opened by chenmulin 3
  • training and testing as README introduced, but got a bad accuracy on provided evaluation dataset. What may cause that?

    2021-07-29 10:45:49,016 root INFO step_time: 0.138798, loss: 3.648251, step perplexity: 38.407426
    2021-07-29 10:45:49,174 root INFO 0.000000 out of 614 correct
    2021-07-29 10:45:49,315 root INFO step_time: 0.133353, loss: 3.665064, step perplexity: 39.058630
    2021-07-29 10:45:49,397 root INFO 0.000000 out of 615 correct
    2021-07-29 10:45:49,551 root INFO step_time: 0.148111, loss: 3.655409, step perplexity: 38.683334
    2021-07-29 10:45:49,598 root INFO 0.000000 out of 616 correct

    opened by SharonJin422 1
  • generating attention mask based on the SavedModel output

    Hi,

    I am trying to visualise attention masks based on the output of the SavedModel. What is the meaning of the real_len variable in the visualise_attention function, and why is the attention mask convolved with [0.199547, 0.200226, 0.200454, 0.200226, 0.199547]?

    opened by abucka 0
  • printed mathematical expression recognition

    Hi, can this model be used for recognition of printed mathematical expressions, like

    im2latex-100k? https://paperswithcode.com/dataset/im2latex-100k

    opened by StephenKyung 0
  • Testing trained images with bad accuracy.

    I tried to train on my own dataset. The loss, perplexity, and precision looked good. But when I run the model to test on the same dataset I trained on, the results are bad. Any help? Thanks. Besides, when I test the author's pre-trained model, I get: 2019-12-11 05:06:25,839 root INFO 0.000000 out of 647 correct

    opened by DLH06 0
  • How to test a model that I choose

    After training, many files were created in the 'train' folder, as shown in the attached picture. How do I select and test the model I want?

    Which parameter specifies the path to the model that I want to test?

    python src/launcher.py --phase=test --visualize --data-path=evaluation_data/svt/test.txt --data-base-dir=evaluation_data/svt --log-path=log.txt --load-model --model-dir=model --output-dir=results

    opened by 5thwin 2