🖺 OCR using tensorflow with attention

Overview

tensorflow-ocr

🖺 OCR using tensorflow with attention, batteries included

Installation

git clone --recursive http://github.com/pannous/tensorflow-ocr
# sudo apt install python3-pip
cd tensorflow-ocr
pip install -r requirements.txt

Evaluation

You can detect the text under your mouse pointer with mouse_prediction.py.

It takes about 10 seconds to load the network and start up; after that it should return multiple results per second.
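The README does not say how mouse_prediction.py picks the screen region it reads. As a rough sketch only, one way to choose a region around the pointer is to centre a fixed-size box on it and shift the box so it stays on screen. The function name `crop_box` and the default box size are assumptions for illustration, not taken from the project.

```python
def crop_box(pointer_x, pointer_y, screen_w, screen_h, box_w=200, box_h=40):
    """Return (left, top, right, bottom) of a box centred on the pointer.

    The box is shifted where necessary so it never leaves the screen.
    box_w and box_h are illustrative defaults, not values from the project.
    """
    box_w = min(box_w, screen_w)
    box_h = min(box_h, screen_h)
    left = min(max(pointer_x - box_w // 2, 0), screen_w - box_w)
    top = min(max(pointer_y - box_h // 2, 0), screen_h - box_h)
    return left, top, left + box_w, top + box_h
```

The resulting tuple has the (left, top, right, bottom) layout that most image-cropping APIs expect, so it could be passed on to a screenshot or crop call.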

text_recognizer.py

To combine our approach with real-world images, we forked the EAST bounding-box text detector.

Customized training

To get started with a minimal example similar to the famous MNIST, try ./train_letters.py. It automatically generates letters in all the fonts installed on your computer, in all kinds of shapes, and trains on them.
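How train_letters.py locates fonts is not described here. The sketch below shows one plausible first step, collecting font files from the system using only the standard library. The directory list and the helper name `find_font_files` are assumptions for illustration and may not match what the project does.

```python
import os

FONT_EXTENSIONS = (".ttf", ".otf")

# Assumed typical font locations; adjust for your own system.
DEFAULT_FONT_DIRS = [
    "/usr/share/fonts",
    os.path.expanduser("~/.fonts"),
    "/Library/Fonts",
    "C:\\Windows\\Fonts",
]


def find_font_files(directories=DEFAULT_FONT_DIRS):
    """Recursively collect font file paths under the given directories.

    Directories that do not exist are silently skipped by os.walk.
    """
    found = []
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                if name.lower().endswith(FONT_EXTENSIONS):
                    found.append(os.path.join(root, name))
    return sorted(found)
```

Each returned path could then be handed to a font loader to render individual letters as training images.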

For the full model used in the demo, run ./train.py.

Comments
  • beginning to do research on OCR using tensorflow

    Dear sir,

    I just started learning about neural networks and I found that Tensorflow is easy to use.

    I am learning tensorflow by running the examples from this tutorial https://github.com/nlintz/TensorFlow-Tutorials and reading this book: deeplearningbook.org.

    My goal is to use a neural network to recognize individual letters and digits (e.g. a, b ... x, y, z, A ... Z, 0 ... 9). That means 62 classes.
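An MNIST-style classifier is not tied to ten classes: the output layer simply needs one unit per class. The sketch below shows how the 62 labels mentioned above could be mapped to class indices and one-hot targets. The ordering (lowercase, uppercase, digits) and the helper names are assumptions for illustration.

```python
import string

# 26 lowercase + 26 uppercase + 10 digits = 62 classes (ordering is arbitrary)
CLASSES = string.ascii_lowercase + string.ascii_uppercase + string.digits
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CLASSES)}


def one_hot(ch):
    """Return a 62-element target vector with a single 1.0 for `ch`."""
    vec = [0.0] * len(CLASSES)
    vec[CHAR_TO_INDEX[ch]] = 1.0
    return vec


def decode(scores):
    """Return the character whose score is highest."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best]
```

With this mapping, the only change relative to a digit classifier is the width of the final layer and of the label vectors (62 instead of 10).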

    I can segment each letter from the picture using QT.

    Now I am confused about how to do OCR with tensorflow, because there are a lot of projects about "handwriting recognition" but only a few about "OCR for printed documents". I am a newbie, so I really don't know how or where to begin.

    Would you mind helping me?

    1/ Can I use the same MNIST classifiers (http://yann.lecun.com/exdb/mnist/) to recognize the letters of the English alphabet? MNIST classifiers recognize digits (0 - 9), so I don't know whether they can handle 62 classes.

    2/ Do you have any documents or papers about "OCR for printed documents"? I really lack knowledge here.

    3/ I found your project tensorflow-OCR. Would you mind if I learn from your code?

    4/ I got errors when I ran ./train_ocr_layer.py:

    (tensorflow)khoa@khoa:~/tensorflow/tensorflow-ocr$ ./train_ocr_layer.py 
    ls: cannot access /tmp/tensorboard_logs/: No such file or directory
    Traceback (most recent call last):
      File "./train_ocr_layer.py", line 4, in <module>
        import layer
      File "/home/khoa/tensorflow/tensorflow-ocr/layer/__init__.py", line 1, in <module>
        from net import *
      File "/home/khoa/tensorflow/tensorflow-ocr/layer/net.py", line 8, in <module>
        set_tensorboard_run(auto_increment=True)
      File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 18, in set_tensorboard_run
        run_nr = get_last_tensorboard_run_nr()
      File "/home/khoa/tensorflow/tensorflow-ocr/layer/tensorboard_util.py", line 8, in get_last_tensorboard_run_nr
        logs=subprocess.check_output(["ls", tensorboard_logs]).split("\n")
      File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
        raise CalledProcessError(retcode, cmd, output=output)
    subprocess.CalledProcessError: Command '['ls', '/tmp/tensorboard_logs/']' returned non-zero exit status 2
    

    Thank you and regards, Khoa

    opened by Khoa-NT 2
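The traceback in this issue comes from running `ls` on /tmp/tensorboard_logs/ before that directory exists, so creating the directory first should avoid the crash. As a sketch only, the lookup could also be written without a subprocess. The function name `get_last_run_nr` and the assumption that run directories are named like `run7` are hypothetical, since the project's naming scheme is not shown here. The code needs Python 3, whereas the traceback above is from Python 2.7.

```python
import os
import re


def get_last_run_nr(log_dir="/tmp/tensorboard_logs/"):
    """Return the highest run number found in log_dir, or 0 if there is none.

    Assumes (hypothetically) that run directories are named like "run7".
    """
    # Create the directory instead of failing when it is missing.
    os.makedirs(log_dir, exist_ok=True)
    numbers = []
    for entry in os.listdir(log_dir):
        match = re.fullmatch(r"run(\d+)", entry)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers, default=0)
```

On a fresh machine this returns 0 and leaves an empty log directory behind, so the next run can start at 1.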
  • ImportError: No module named layer

    layer.py is missing.


    localhost:tensorflow-ocr didi$ ./train_ocr_layer.py
    Traceback (most recent call last):
      File "./train_ocr_layer.py", line 3, in <module>
        import layer
    ImportError: No module named layer
    localhost:tensorflow-ocr didi$ pip install layer
    Collecting layer
      Could not find a version that satisfies the requirement layer (from versions: )
    No matching distribution found for layer

    opened by vitamin 1
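The install instructions use `git clone --recursive`, and the traceback in the first issue shows `layer/__init__.py` inside the checkout, so `layer` is presumably pulled in as a git submodule and not from PyPI. That is an inference; the source does not state it. Under that assumption, a small check like the following could tell the user what to do. The helper names are hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


def check_layer():
    """Return a short hint about the state of the `layer` module."""
    if module_available("layer"):
        return "layer found"
    # Assumption: layer is a git submodule of the repository.
    return ("layer not found; if the repository was cloned without "
            "--recursive, try: git submodule update --init --recursive")
```

Calling `print(check_layer())` from the repository root before training would surface the missing submodule earlier than the ImportError does.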
  • where is training data?

    @pannous, when I train tensorflow-ocr, I can't find the training data that is fed to the system. And when I run ./train_ocr_layer, it throws an error: Exception: BAD FONT: /usr/share/texlive/texmf-dist/fonts/truetype/public/dejavu/DejaVuSansMono-Oblique.ttf.

    opened by shendaizi 1
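According to the README, train_letters.py generates its letters from the fonts installed on the machine; whether train_ocr_layer does the same is not stated, though the BAD FONT error suggests it also reads system fonts. A single unreadable font then aborts the whole run. As a sketch of a more tolerant approach, fonts that fail to load could be skipped and reported. The function `load_usable_fonts` and its `loader` parameter are hypothetical and not part of the project.

```python
def load_usable_fonts(paths, loader):
    """Try to load every font; return (loaded, skipped) instead of raising.

    `loader` is any callable that takes a path and returns a font object,
    for example a thin wrapper around Pillow's ImageFont.truetype.
    """
    loaded, skipped = {}, []
    for path in paths:
        try:
            loaded[path] = loader(path)
        except Exception as error:  # a broken font should not stop training
            skipped.append((path, str(error)))
    return loaded, skipped
```

With Pillow installed, `loader` could be `lambda p: ImageFont.truetype(p, 24)`; the skipped list then shows which font files to remove or ignore.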
  • [ImgBot] Optimize images

    Beep boop. Your images are optimized!

    Your image file size has been reduced by 44% 🎉

    Details

    | File | Before | After | Percent reduction |
    |:--|:--|:--|:--|
    | /test_image.png | 25.10kb | 14.16kb | 43.58% |


    opened by imgbot[bot] 0
  • Does this project have any document to refer to?

    I can't find any document or tutorial, and have no clue on how to start with it. Do you guys have any material or wiki page to explain this project?

    Thanks very much!

    opened by xray1111 0
  • Why are accuracy and test accuracy so low?

    I installed tensorflow 0.11 and PIL 1.17 under Ubuntu 16.04, but when I run python train_ocr_layer.py, the accuracy and test accuracy are as shown in the attached screenshot (qq 20161129150742). I don't know the reason. Could you help me, @pannous?

    opened by deajosha 0
  • OSError: Unable to open file (file signature not found)

    When I try to run mouse_prediction.py I get the following error message:

    \AppData\Local\Programs\Python\Python37\lib\site-packages\h5py\_hl\files.py", line 173, in make_fid
        fid = h5f.open(name, flags, fapl=fapl)
      File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py\h5f.pyx", line 88, in h5py.h5f.open
    OSError: Unable to open file (file signature not found)

    opened by VollRahm 1
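h5py reports "file signature not found" when the file it is asked to open does not contain the HDF5 signature, which usually means the weights file is not a valid HDF5 file. Possible causes such as an interrupted download or a Git LFS pointer file checked out in place of the real weights are guesses for this particular report. A quick way to tell them apart is to look at the first bytes of the file. The helper name `diagnose_weights_file` is hypothetical.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"       # first 8 bytes of an HDF5 file
GIT_LFS_PREFIX = b"version https://git-lfs"  # start of a Git LFS pointer file


def diagnose_weights_file(path):
    """Classify a file by its leading bytes.

    Only offset 0 is checked; HDF5 files with a user block keep their
    signature at a later offset and would be reported as "unknown".
    """
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if head.startswith(GIT_LFS_PREFIX):
        return "git-lfs pointer"
    if not head:
        return "empty"
    return "unknown"
```

Anything other than "hdf5" suggests re-downloading the weights file before running mouse_prediction.py again.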
  • transfer denseConv ckpt to pb file error

    What I did: python train_ocr_layer.py. After I got the ckpt file, I tried to convert it into a pb file with this code:

    import tensorflow as tf
    from tensorflow.python.framework import graph_util
    
    def freeze_graph(input_checkpoint, output_graph):
        '''
        :param input_checkpoint:
        :param output_graph: path where the PB model is saved
        :return:
        '''
        model_folder = './checkpoints/'
        checkpoint = tf.train.get_checkpoint_state(model_folder)  # check that the ckpt files in the directory are usable
        input_checkpoint = checkpoint.model_checkpoint_path  # get the ckpt file path (overrides the argument)
    
        # Name of the output node; it must exist in the original model
        output_node_names = "group_deps"
        saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=True)
        graph = tf.get_default_graph()  # get the default graph
        input_graph_def = graph.as_graph_def()  # serialized representation of the current graph
    
        with tf.Session() as sess:
            saver.restore(sess, input_checkpoint)  # restore the graph and load the variable values
            output_graph_def = graph_util.convert_variables_to_constants(  # freeze the variables into constants
                sess=sess,
                input_graph_def=input_graph_def,  # same as sess.graph_def
                output_node_names=output_node_names.split(","))  # separate multiple output nodes with commas
    
            with tf.gfile.GFile(output_graph, "wb") as f:  # save the model
                f.write(output_graph_def.SerializeToString())  # write the serialized graph
            print("%d ops in the final graph." % len(output_graph_def.node))  # number of op nodes in the final graph
    
            for op in graph.get_operations():
                print(op.name, op.values())
    
    if __name__ == '__main__':
    
        input_checkpoint = './checkpoints/denseConv0.ckpt'
        output_graph = './checkpoints/frozen_graph.pb'
        freeze_graph(input_checkpoint, output_graph)
    

    After I got the pb file, I just can't open it with tensorboard or convert it to ONNX. I got this error message:

    Traceback (most recent call last):
      File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 418, in import_graph_def
        graph._c_graph, serialized, options)  # pylint: disable=protected-access
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/cenhong/.local/bin/tfpb_tensorboard", line 11, in <module>
        load_entry_point('doml', 'console_scripts', 'tfpb_tensorboard')()
      File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 33, in main
        tfpb_tensorboard(args.input_path, args.log_path, 6006 if args.port is None else args.port)
      File "/home/cenhong/do-ml/scripts/tfpb_tensorboard.py", line 18, in tfpb_tensorboard
        g_in = tf.import_graph_def(graph_def)
      File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
        return func(*args, **kwargs)
      File "/home/cenhong/.local/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
        raise ValueError(str(e))
    ValueError: Input 0 of node import/model/batchnorm/AssignMovingAvg was passed float from import/model/batchnorm//moving_mean:0 incompatible with expected float_ref.
    

    Does anyone know how to solve this? Thanks.

    opened by bolyor 0
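The `float_ref` error above names a batch-norm `AssignMovingAvg` node, which is a training-time update op that still expects a variable reference after the variables have been frozen into constants. A workaround commonly reported for this error in TensorFlow 1.x is to rewrite such ops in the GraphDef before freezing. The sketch below follows that pattern. It has not been verified against this model, and the helper name `patch_ref_ops` is hypothetical. It is written against plain attributes (`op`, `input`, `attr`) so it can be tried without TensorFlow installed.

```python
def patch_ref_ops(nodes):
    """Rewrite reference-typed ops in place so the frozen graph can be imported.

    `nodes` is any iterable of objects with `op`, `input` and `attr`
    fields, e.g. `graph_def.node` from a TensorFlow 1.x GraphDef.
    """
    for node in nodes:
        if node.op == "RefSwitch":
            node.op = "Switch"
            for i in range(len(node.input)):
                if "moving_" in node.input[i]:
                    # read the moving average value instead of its variable
                    node.input[i] = node.input[i] + "/read"
        elif node.op in ("AssignSub", "AssignAdd"):
            # AssignSub -> Sub, AssignAdd -> Add
            node.op = node.op[len("Assign"):]
            if "use_locking" in node.attr:
                del node.attr["use_locking"]
    return nodes
```

In the script above this would be applied as `patch_ref_ops(input_graph_def.node)` before calling `convert_variables_to_constants`. A cleaner option, if the model's prediction node is known, is probably to pass that node as the output and leave out "group_deps", since "group_deps" appears to group training ops.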
  • train_letters.py not running

    Traceback (most recent call last):
      File "train_letters.py", line 61, in <module>
        net.train(data=data, dropout=.6, display_step=10, test_step=1000) # run resume
      File "/usr/local/lib/python3.5/dist-packages/layer/net.py", line 438, in train
        tf.add_check_numerics_ops()
      File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/numerics.py", line 90, in add_check_numerics_ops
        raise ValueError("tf.add_check_numerics_ops() is not compatible "
    ValueError: tf.add_check_numerics_ops() is not compatible with TensorFlow control flow operations such as tf.cond() or tf.while_loop().

    opened by RaziAhm 4
Owner
Magic and A.I. 🔮
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 6, 2021
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

null 27.5k Jan 8, 2023
CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

Watson Yang 356 Dec 8, 2022
CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

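Introduction: End-to-end detection and recognition of variable-length Chinese text, implemented with Tensorflow and Keras. Text detection: CTPN. Text recognition: DenseNet + CTC. Environment setup: sh setup.sh. Note: in a CPU environment, comment out the "for gpu" section and uncomment the "for cpu" section before running. Demo: put the test images into test_images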

Yang Chenguang 2.6k Dec 29, 2022
Visual Attention based OCR

Attention-OCR Authors: Qi Guo and Yuntian Deng Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to hei

Yuntian Deng 1.1k Jan 2, 2023
Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

Overview This collection demonstrates how to construct and train a deep, bidirectional stacked LSTM using CNN features as input with CTC loss to perfo

Jerod Weinman 489 Dec 21, 2022
A Screen Translator/OCR Translator made using Python and Tesseract; the user interface is made with Tkinter. All code is written in Python.

About An OCR translator tool. Made by me by utilizing Tesseract, compiled to .exe using pyinstaller. I made this program to learn more about python. I

Fauzan F A 41 Dec 30, 2022
python ocr using tesseract/ with EAST opencv detector

pytextractor python ocr using tesseract/ with EAST opencv text detector Uses the EAST opencv detector defined here with pytesseract to extract text(de

Danny Crasto 38 Dec 5, 2022
Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

gosseract OCR Golang OCR package, by using Tesseract C++ library. OCR Server Do you just want OCR server, or see the working example of this package?

Hiromu OCHIAI 1.9k Dec 28, 2022
A bot that extract text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? A AWS key configured locally, see here. NodeJS. I

Weverton Marques 4 Aug 6, 2021
Convert PDF/Image to TXT using EasyOcr - the best OCR engine available!

PDFImage2TXT - DOWNLOAD INSTALLER HERE What can you do with it? Convert scanned PDFs to TXT. Convert scanned Documents to TXT. No coding required!! In

Hans Alemão 2 Feb 22, 2022
A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.

NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu

francis 125 Dec 30, 2022
This Python script converts a PDF to an image, then uses tesseract as the OCR engine to convert the image to text

Script_Convertir_PDF_IMG_TXT This Python script converts a PDF to an image, then uses tesseract as the OCR engine to convert the image to text. p

alebogado 1 Jan 27, 2022
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

Attention-based OCR Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the tra

Ed Medvedev 933 Dec 29, 2022
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

Jaided AI 16.7k Jan 3, 2023
A Python wrapper for the tesseract-ocr API

tesserocr A simple, Pillow-friendly, wrapper around the tesseract-ocr API for Optical Character Recognition (OCR). tesserocr integrates directly with

Fayez 1.7k Dec 31, 2022
FastOCR is a desktop application for OCR API.

FastOCR FastOCR is a desktop application for OCR API. Installation Arch Linux fastocr-git @ AUR Build from AUR or install with your favorite AUR helpe

Bruce Zhang 58 Jan 7, 2023
OCR-D-compliant page segmentation

ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e

OCR-D 59 Sep 10, 2022