🖺 OCR using tensorflow with attention

Last update: Nov 11, 2022

Overview

tensorflow-ocr

🖺 OCR using tensorflow with attention, batteries included

Installation

git clone --recursive http://github.com/pannous/tensorflow-ocr
# sudo apt install python3-pip
cd tensorflow-ocr
pip install -r requirements.txt

Evaluation

You can detect the text under your mouse pointer with mouse_prediction.py.

It takes 10 seconds to load the network and start up; after that it should return multiple results per second.

text_recognizer.py

To apply our approach to real-world images, we forked the EAST bounding-box text detector.

Customized training

To get started with a minimal example similar to the famous MNIST, try ./train_letters.py. It automatically generates letters in all the different fonts installed on your computer, in all different shapes, and trains on them.
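To give an idea of what "all different font types from your computer" can mean in practice, here is a minimal sketch that only collects font files from a few common system locations. The directory list, the function name `find_fonts` and the accepted extensions are assumptions for illustration; train_letters.py may discover and render fonts differently.

```python
import os

# Assumed font locations (Linux, macOS, Windows); the project may look elsewhere.
DEFAULT_FONT_DIRS = [
    "/usr/share/fonts",
    os.path.expanduser("~/.fonts"),
    "/Library/Fonts",
    "C:\\Windows\\Fonts",
]


def find_fonts(font_dirs=DEFAULT_FONT_DIRS, extensions=(".ttf", ".otf")):
    """Return a sorted list of font file paths found under font_dirs."""
    found = []
    for root_dir in font_dirs:
        # os.walk yields nothing for a missing directory, so no existence check is needed.
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                if name.lower().endswith(extensions):
                    found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling `find_fonts()` returns the candidate files; each one would then be rendered into letter images for training.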

For the full model used in the demo, run ./train.py.

Comments

beginning to do research on OCR using tensorflow
Dear sir,

I have just started learning about neural networks, and I find TensorFlow easy to use.

I am learning TensorFlow by running the examples from this tutorial, https://github.com/nlintz/TensorFlow-Tutorials, and by reading the book at deeplearningbook.org.

My goal is to use a neural network to recognize individual letters and digits (a-z, A-Z, 0-9), which means 62 classes.

I can segment each letter from the picture with QT.

Now I am confused about how to do OCR with TensorFlow. There are a lot of projects about "Handwriting recognition" but only a few about "OCR for printed documents". I am a newbie, so I really don't know how or where to begin.

Would you mind helping me?

1/ Can I use the same MNIST classifiers (http://yann.lecun.com/exdb/mnist/) to recognize letters of the English alphabet? MNIST classifiers recognize digits (0-9), so I don't know whether they could handle 62 classes.

2/ Do you have any documents or papers about "OCR for printed documents"? I really lack knowledge here.

3/ I found your project tensorflow-OCR. Would you mind if I learn from your code?

4/ I got errors when I ran ./train_ocr_layer.py:

(tensorflow)khoa@khoa:~/tensorflow/tensorflow-ocr$ ./train_ocr_layer.py
ls: cannot access /tmp/tensorboard_logs/: No such file or directory
Traceback (most recent call last):
  File "./train_ocr_layer.py", line 4, in <module>
    import layer
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/__init__.py", line 1, in <module>
    from net import *
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/net.py", line 8, in <module>
    set_tensorboard_run(auto_increment=True)
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 18, in set_tensorboard_run
    run_nr = get_last_tensorboard_run_nr()
  File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 8, in get_last_tensorboard_run_nr
    logs=subprocess.check_output(["ls", tensorboard_logs]).split("\n")
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['ls', '/tmp/tensorboard_logs/']' returned non-zero exit status 2
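The traceback shows `ls` failing because /tmp/tensorboard_logs/ does not exist yet, so creating that directory by hand (`mkdir -p /tmp/tensorboard_logs/`) should get past this error. A more robust lookup could create the directory on first use, roughly as sketched below. This is Python 3 and not the project's code: the function name `last_run_number` and the `run<N>` directory naming are assumptions.

```python
import os
import re


def last_run_number(log_dir="/tmp/tensorboard_logs/"):
    """Return the highest run number found in log_dir, or 0 if there is none."""
    # Create the directory on first use instead of failing when it is missing.
    os.makedirs(log_dir, exist_ok=True)
    numbers = []
    for entry in os.listdir(log_dir):
        # Assumption: run directories are named "run1", "run2", ...
        match = re.fullmatch(r"run(\d+)", entry)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers, default=0)
```

Using `os.listdir` instead of shelling out to `ls` also avoids depending on an external command.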

Thank you and regards, Khoa
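On question 1: an MNIST-style classifier is not tied to ten classes. What changes is the label encoding and the width of the output layer (62 instead of 10). The sketch below shows only the label side, using nothing beyond the standard library; the names `CLASSES`, `one_hot` and `decode` are made up for this example.

```python
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 classes
CLASSES = string.ascii_lowercase + string.ascii_uppercase + string.digits
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CLASSES)}


def one_hot(ch):
    """Encode one character as a 62-element target vector."""
    vec = [0.0] * len(CLASSES)
    vec[CHAR_TO_INDEX[ch]] = 1.0
    return vec


def decode(scores):
    """Return the character whose score is highest."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best]
```

A network trained against these targets needs an output layer with `len(CLASSES)` units; the rest of an MNIST example can stay largely the same, although the input images must of course be letters rather than digits.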
opened by Khoa-NT 2
ImportError: No module named layer

layer.py is missing.

localhost:tensorflow-ocr didi$ ./train_ocr_layer.py
Traceback (most recent call last):
  File "./train_ocr_layer.py", line 3, in <module>
    import layer
ImportError: No module named layer
localhost:tensorflow-ocr didi$ pip install layer
Collecting layer
  Could not find a version that satisfies the requirement layer (from versions: )
No matching distribution found for layer
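The first traceback on this page shows layer/__init__.py living inside the tensorflow-ocr checkout, so `layer` appears to be a directory of the repository rather than a PyPI package. Assuming it is a git submodule (the installation command uses `git clone --recursive`), a clone made without `--recursive` would leave that directory empty, and `git submodule update --init --recursive` would be the usual way to fetch it. The helper below is a hypothetical check, not part of the project:

```python
import os


def missing_packages(repo_root, names=("layer",)):
    """Return the names whose package directory lacks an __init__.py."""
    missing = []
    for name in names:
        init_file = os.path.join(repo_root, name, "__init__.py")
        if not os.path.isfile(init_file):
            missing.append(name)
    return missing
```

Running `missing_packages(".")` from the repository root returns `["layer"]` when the directory has not been populated.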

opened by vitamin 1
where is training data?

@pannous, when I train tensorflow-ocr, I can't find the training data that the system takes as input. And when I run ./train_ocr_layer, it throws an error: Exception: BAD FONT: /usr/share/texlive/texmf-dist/fonts/truetype/public/dejavu/DejaVuSansMono-Oblique.ttf.

opened by shendaizi 1
[ImgBot] Optimize images

Beep boop. Your images are optimized!

Your image file size has been reduced by 44% 🎉

Details

| File | Before | After | Percent reduction |
|:--|:--|:--|:--|
| /test_image.png | 25.10kb | 14.16kb | 43.58% |

opened by imgbot[bot] 0
Does this project have any document to refer to?

I can't find any document or tutorial, and have no clue on how to start with it. Do you guys have any material or wiki page to explain this project?

Thanks very much!

opened by xray1111 0
Why are accuracy and test accuracy so low?

I installed tensorflow 0.11 and PIL 1.17 under Ubuntu 16.04, but when I run python train_ocr_layer.py, the accuracy and test accuracy are as shown below. I don't know the reason; could you help me, @pannous?

opened by deajosha 0
OSError: Unable to open file (file signature not found)

When I try to run mouse_prediction.py I get the following error message:

\AppData\Local\Programs\Python\Python37\lib\site-packages\h5py\_hl\files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
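"File signature not found" means h5py did not see the HDF5 magic bytes in the file it was asked to open. One common cause is a weights file that is truncated or is really a small text placeholder instead of the binary. Which file mouse_prediction.py opens is not shown in the report, so the sketch below takes the path as an argument; the function names are assumptions for this example.

```python
import os

# Every HDF5 file carries this 8-byte signature at the start of its superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def describe(path):
    """Return a one-line summary that is handy when reporting the problem."""
    size = os.path.getsize(path)
    if looks_like_hdf5(path):
        return "%s: HDF5 signature found (%d bytes)" % (path, size)
    return "%s: no HDF5 signature at offset 0 (%d bytes)" % (path, size)
```

The check only looks at offset 0. The HDF5 format also allows the signature at offsets 512, 1024 and so on when a user block is present, which this sketch ignores. A file of only a few hundred bytes usually points to a failed or placeholder download.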

opened by VollRahm 1

transfer denseConv ckpt to pb file error

What I did: I ran python train_ocr_layer.py. After I got the ckpt file, I tried to convert it into a pb file with this code:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def freeze_graph(input_checkpoint, output_graph):
    '''
    :param input_checkpoint: checkpoint path (replaced below by the latest checkpoint in model_folder)
    :param output_graph: path where the PB model is saved
    :return:
    '''
    model_folder = './checkpoints/'
    checkpoint = tf.train.get_checkpoint_state(model_folder)  # check that the ckpt files in the folder are usable
    input_checkpoint = checkpoint.model_checkpoint_path  # get the ckpt file path

    # Name of the output node; it must exist in the original model
    output_node_names = "group_deps"
    saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=True)
    graph = tf.get_default_graph()  # get the default graph
    input_graph_def = graph.as_graph_def()  # serialized GraphDef of the current graph

    with tf.Session() as sess:
        saver.restore(sess, input_checkpoint)  # restore the graph and load the variable values
        output_graph_def = graph_util.convert_variables_to_constants(  # freeze the model: turn variables into constants
            sess=sess,
            input_graph_def=input_graph_def,  # same as sess.graph_def
            output_node_names=output_node_names.split(","))  # separate multiple output nodes with commas

        with tf.gfile.GFile(output_graph, "wb") as f:  # save the model
            f.write(output_graph_def.SerializeToString())  # write the serialized graph
        print("%d ops in the final graph." % len(output_graph_def.node))  # number of op nodes in the final graph

        for op in graph.get_operations():
            print(op.name, op.values())

if __name__ == '__main__':

    input_checkpoint = './checkpoints/denseConv0.ckpt'
    output_graph = './checkpoints/frozen_graph.pb'
    freeze_graph(input_checkpoint, output_graph)

After I got the pb file, I couldn't open it with TensorBoard or convert it to ONNX. I got this error message:

Traceback (most recent call last):
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 418, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/cenhong/.local/bin/tfpb_tensorboard", line 11, in <module>
    load_entry_point('doml', 'console_scripts', 'tfpb_tensorboard')()
  File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 33, in main
    tfpb_tensorboard(args.input_path, args.log_path, 6006 if args.port is None else args.port)
  File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 18, in tfpb_tensorboard
    g_in = tf.import_graph_def(graph_def)
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
    raise ValueError(str(e))
ValueError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.

Does anyone know how to solve this? Thanks.
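A workaround that circulates for this float_ref error is to rewrite the reference-typed batch-norm update ops in the GraphDef before freezing, so the frozen constants no longer feed ops that expect a variable reference. The sketch below is that community workaround in hedged form and has not been verified against this model. It relies only on the NodeDef fields `op`, `input` and `attr`; the function name `strip_ref_ops` is made up.

```python
def strip_ref_ops(graph_def):
    """Rewrite reference-typed update ops in place and return graph_def."""
    for node in graph_def.node:
        if node.op == "RefSwitch":
            node.op = "Switch"
            for i in range(len(node.input)):
                # Read the moving statistics as values, not as variable references.
                if "moving_" in node.input[i]:
                    node.input[i] = node.input[i] + "/read"
        elif node.op in ("AssignSub", "AssignAdd"):
            # Replace the in-place update with its plain arithmetic counterpart.
            node.op = "Sub" if node.op == "AssignSub" else "Add"
            if "use_locking" in node.attr:
                del node.attr["use_locking"]
    return graph_def
```

In the script above it would be applied to `input_graph_def` before the call to `convert_variables_to_constants`. It may also help to freeze from the model's prediction tensor rather than "group_deps", since a grouped op can pull training-only nodes such as AssignMovingAvg into the frozen graph.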

opened by bolyor 0

train_letters.py not running

Traceback (most recent call last):
  File "train_letters.py", line 61, in <module>
    net.train(data=data, dropout=.6, display_step=10, test_step=1000) # run resume
  File "/usr/local/lib/python3.5/dist-packages/layer/net.py", line 438, in train
    tf.add_check_numerics_ops()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/numerics.py", line 90, in add_check_numerics_ops
    raise ValueError("tf.add_check_numerics_ops() is not compatible "
ValueError: tf.add_check_numerics_ops() is not compatible with TensorFlow control flow operations such as tf.cond() or tf.while_loop().

opened by RaziAhm 4

Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

5 Dec 6, 2021

Awesome multilingual OCR toolkits based on PaddlePaddle （practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices）

English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

27.5k Jan 8, 2023

CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

356 Dec 8, 2022

CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

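Introduction: end-to-end detection and recognition of variable-length Chinese text, implemented with Tensorflow and Keras. Text detection: CTPN. Text recognition: DenseNet + CTC. Environment setup: sh setup.sh. Note: on a CPU-only machine, comment out the "for gpu" section and uncomment the "for cpu" section before running. Demo: put test images into test_images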

2.6k Dec 29, 2022

Visual Attention based OCR

Attention-OCR Authors: Qi Guo and Yuntian Deng Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to hei

1.1k Jan 2, 2023

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

Overview This collection demonstrates how to construct and train a deep, bidirectional stacked LSTM using CNN features as input with CTC loss to perfo

489 Dec 21, 2022

A Screen Translator/OCR Translator made by using Python and Tesseract, the user interface are made using Tkinter. All code written in python.

About An OCR translator tool. Made by me by utilizing Tesseract, compiled to .exe using pyinstaller. I made this program to learn more about python. I

41 Dec 30, 2022

python ocr using tesseract/ with EAST opencv detector

pytextractor python ocr using tesseract/ with EAST opencv text detector Uses the EAST opencv detector defined here with pytesseract to extract text(de

38 Dec 5, 2022

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

gosseract OCR Golang OCR package, by using Tesseract C++ library. OCR Server Do you just want OCR server, or see the working example of this package?

1.9k Dec 28, 2022

A bot that extract text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? A AWS key configured locally, see here. NodeJS. I

4 Aug 6, 2021

Convert PDF/Image to TXT using EasyOcr - the best OCR engine available!

PDFImage2TXT - DOWNLOAD INSTALLER HERE What can you do with it? Convert scanned PDFs to TXT. Convert scanned Documents to TXT. No coding required!! In

2 Feb 22, 2022

A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.

NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu

125 Dec 30, 2022

This Python script converts a PDF to an image, then uses tesseract as the OCR engine to convert the image to text

Script_Convertir_PDF_IMG_TXT This Python script converts a PDF into an image, then uses tesseract as the OCR engine to convert the image to text. p

1 Jan 27, 2022

A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

Attention-based OCR Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the tra

933 Dec 29, 2022

This is a c++ project deploying a deep scene text reading pipeline with tensorflow. It reads text from natural scene images. It uses frozen tensorflow graphs. The detector detect scene text locations. The recognizer reads word from each detected bounding box.

DeepSceneTextReader This is a c++ project deploying a deep scene text reading pipeline. It reads text from natural scene images. Prerequsites The proj

49 Sep 10, 2022

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

16.7k Jan 3, 2023
