A pure PyTorch OCR project covering text detection and recognition

Overview

ocr.pytorch

A pure PyTorch implementation of an OCR pipeline.
Text detection is based on CTPN and text recognition is based on CRNN.
More detection and recognition methods will be supported.

Prerequisite

  • python-3.5+
  • pytorch-0.4.1+
  • torchvision-0.2.1
  • opencv-3.4.0.14
  • numpy-1.14.3

All of these can be installed through pip except pytorch and torchvision. Both depend on your CUDA version, so follow the installation instructions on PyTorch's official site.
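
A quick way to confirm that an environment meets the minimum versions listed above is sketched below. The helper names are hypothetical and not part of this repository, and the comparison only looks at the leading numeric part of each version string. The OpenCV minimum is written as 3.4.0 because the pip package version (3.4.0.14) carries an extra build component that the imported cv2 module may not report.

```python
import importlib
import re

# Minimum versions taken from the prerequisite list above.
# The OpenCV pip package is imported under the module name "cv2".
MINIMUMS = {
    "torch": "0.4.1",
    "torchvision": "0.2.1",
    "cv2": "3.4.0",
    "numpy": "1.14.3",
}


def version_tuple(version):
    """Turn '1.14.3' or '2.1.0+cu118' into a tuple of ints such as (1, 14, 3)."""
    numeric = re.match(r"\d+(\.\d+)*", version)
    if numeric is None:
        return ()
    return tuple(int(part) for part in numeric.group(0).split("."))


def check_environment(minimums=MINIMUMS):
    """Return {module name: status string} for every module in `minimums`."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "0")
        ok = version_tuple(found) >= version_tuple(minimum)
        status = "ok" if ok else "older than " + minimum
        report[name] = "{} ({})".format(found, status)
    return report
```

Calling `check_environment()` and printing the returned dictionary gives one status line per package.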

Detection

Detection is based on CTPN; some code is borrowed from pytorch_ctpn. Sample detection results: detect1, detect2

Recognition

Recognition is based on CRNN; some code is borrowed from crnn.pytorch.
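
CRNN models are commonly trained with a CTC loss, and their per-timestep predictions are then decoded by collapsing repeated labels and dropping the blank label. The sketch below illustrates that greedy decoding step in plain Python. It assumes that index 0 is the blank and that alphabet character i is stored at index i + 1; neither assumption is confirmed by this README, and the function name is hypothetical.

```python
def ctc_greedy_decode(timestep_indices, alphabet, blank=0):
    """Collapse repeated indices, drop blanks, and map the rest to characters.

    timestep_indices: best class index at each timestep (argmax of the output).
    alphabet: string of characters; character i is assumed to sit at index i + 1.
    """
    chars = []
    previous = blank
    for index in timestep_indices:
        # Emit a character only when the label changes and is not the blank.
        if index != blank and index != previous:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)
```

A blank between two identical indices keeps both characters, which is how doubled letters survive the collapse.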

Test

Download the pretrained models from Baidu Netdisk (extract code: u2ff) or Google Drive and put the files into checkpoints. Then run

python3 demo.py

The images in ./test_images are run through text detection and recognition, and the results are stored in ./test_result.

If you want to test a single image, run

python3 test_one.py [filename]

Train

Training code is in the train_code directory.
Train CTPN
Train CRNN

Licence

MIT License

Comments
  • Training CRNN & extracting the CTPN detection

    @courao Thank you for your hard work.

    • Will you release the CRNN training code and documentation?
    • Can you add an option to extract the text lines detected by CTPN?
    opened by ghost 26
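
Extracting detected lines usually comes down to cropping the image at each detected box. The following is a minimal sketch, assuming the detector returns axis-aligned boxes as (x_min, y_min, x_max, y_max) in pixel coordinates and that the image is a NumPy array of shape height × width × channels. The actual box format used by this repository is not described here, and `crop_text_lines` is a hypothetical helper.

```python
import numpy as np


def crop_text_lines(image, boxes):
    """Return one cropped array per box; boxes are (x_min, y_min, x_max, y_max)."""
    height, width = image.shape[:2]
    crops = []
    for x_min, y_min, x_max, y_max in boxes:
        # Clamp to the image bounds so slightly out-of-range boxes do not fail.
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        # Skip degenerate boxes that have no area after clamping.
        if x1 > x0 and y1 > y0:
            crops.append(image[y0:y1, x0:x1].copy())
    return crops
```

Each crop can then be saved or passed to the recognizer on its own.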
  • loss=0.44 on the SROIE dataset

    @courao I have been training for 1 day on the SROIE dataset and the loss is still 0.44. It works well on other datasets, but not on SROIE. Am I doing something wrong? dataset download link

    X00016469620 X00016469622

    opened by ghost 10
  • Error after modifying crnn_recognizer.py

    Hello, I ran into the same problem as the previous posters. I changed line 100 of crnn_recognizer.py to def __init__(self, model_path='/root/zjut/ocr.pytorch/checkpoints/CRNN.pth'). Running 'python demo.py' then fails with the following output:

    Traceback (most recent call last):
      File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/ptvsd_launcher.py", line 43, in <module>
        main(ptvsdArgs)
      File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/main.py", line 432, in main
        run()
      File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/main.py", line 316, in run_file
        runpy.run_path(target, run_name='__main__')
      File "/root/anaconda3/lib/python3.6/runpy.py", line 263, in run_path
        pkg_name=pkg_name, script_name=fname)
      File "/root/anaconda3/lib/python3.6/runpy.py", line 96, in _run_module_code
        mod_name, mod_spec, pkg_name, script_name)
      File "/root/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/root/zjut/ocr.pytorch/demo.py", line 10, in <module>
        from ocr import ocr
      File "/root/zjut/ocr.pytorch/ocr.py", line 6, in <module>
        recognizer = PytorchOcr()
      File "/root/zjut/ocr.pytorch/recognize/crnn_recognizer.py", line 111, in __init__
        self.model.load_state_dict({k.replace('module.', ''): v for k, v in torch.load(model_path).items()})
      File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for CRNN:
      Missing key(s) in state_dict: "conv1.weight", "conv1.bias", "conv2.weight", "conv2.bias", "conv3_1.weight", "conv3_1.bias", "bn3.weight", "bn3.bias", "bn3.running_mean", "bn3.running_var", "conv3_2.weight", "conv3_2.bias", "conv4_1.weight", "conv4_1.bias", "bn4.weight", "bn4.bias", "bn4.running_mean", "bn4.running_var", "conv4_2.weight", "conv4_2.bias", "conv5.weight", "conv5.bias", "bn5.weight", "bn5.bias", "bn5.running_mean", "bn5.running_var".
      Unexpected key(s) in state_dict: "cnn.conv0.weight", "cnn.conv0.bias", "cnn.conv1.weight", "cnn.conv1.bias", "cnn.conv2.weight", "cnn.conv2.bias", "cnn.batchnorm2.weight", "cnn.batchnorm2.bias", "cnn.batchnorm2.running_mean", "cnn.batchnorm2.running_var", "cnn.batchnorm2.num_batches_tracked", "cnn.conv3.weight", "cnn.conv3.bias", "cnn.conv4.weight", "cnn.conv4.bias", "cnn.batchnorm4.weight", "cnn.batchnorm4.bias", "cnn.batchnorm4.running_mean", "cnn.batchnorm4.running_var", "cnn.batchnorm4.num_batches_tracked", "cnn.conv5.weight", "cnn.conv5.bias", "cnn.conv6.weight", "cnn.conv6.bias", "cnn.batchnorm6.weight", "cnn.batchnorm6.bias", "cnn.batchnorm6.running_mean", "cnn.batchnorm6.running_var", "cnn.batchnorm6.num_batches_tracked".
      size mismatch for rnn.1.embedding.weight: copying a param with shape torch.Size([5997, 512]) from checkpoint, the shape in current model is torch.Size([5835, 512]).
      size mismatch for rnn.1.embedding.bias: copying a param with shape torch.Size([5997]) from checkpoint, the shape in current model is torch.Size([5835]).

    The CRNN.pth file here is the one provided on your Baidu Netdisk.

    opened by wangguanhua 7
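
The error in the report above names three kinds of disagreement between the checkpoint and the model: missing keys, unexpected keys, and size mismatches. That pattern suggests the checkpoint was saved from a different CRNN definition and a different alphabet size than the one being instantiated, although the report alone does not confirm this. The sketch below shows one way to list those differences before calling load_state_dict. It works on plain dictionaries that map parameter names to shapes, and `compare_state_dict_keys` is a hypothetical helper, not part of this repository.

```python
def compare_state_dict_keys(model_shapes, checkpoint_shapes, prefix="module."):
    """Compare two {parameter name: shape} mappings.

    Returns three sorted lists: keys missing from the checkpoint, keys the
    model does not expect, and shared keys whose shapes differ.
    """
    cleaned = {}
    for key, shape in checkpoint_shapes.items():
        # Checkpoints saved from a DataParallel wrapper carry a "module." prefix.
        if key.startswith(prefix):
            key = key[len(prefix):]
        cleaned[key] = tuple(shape)
    model = {key: tuple(shape) for key, shape in model_shapes.items()}

    missing = sorted(set(model) - set(cleaned))
    unexpected = sorted(set(cleaned) - set(model))
    shared = set(model) & set(cleaned)
    mismatched = sorted(key for key in shared if model[key] != cleaned[key])
    return missing, unexpected, mismatched
```

With PyTorch, the two mappings can be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}` and in the same way from the dictionary returned by `torch.load(path, map_location="cpu")`.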
  • Error when running demo.py

    C:\Users\Administrator\AppData\Local\Programs\Python\Python37\python.exe C:/Users/Administrator/Desktop/pytorch/ocr.pytorch-master/demo.py
    Traceback (most recent call last):
    ./test_result\test_images\t1.txt
      File "C:/Users/Administrator/Desktop/pytorch/ocr.pytorch-master/demo.py", line 29, in <module>
        txt_f = open(txt_file, 'w')
    FileNotFoundError: [Errno 2] No such file or directory: './test_result\test_images\t1.txt'

    Process finished with exit code 1

    I noticed that running demo.py first clears the contents of the test_result folder and then raises this error. Could the author please take a look?

    opened by gi19901212 4
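
The failing path in this report, './test_result\test_images\t1.txt', contains a test_images component under test_result. One plausible reading is that the output name was derived from the full input path on Windows, so the target subdirectory never existed. The sketch below shows a platform-independent way to build the result path from the image's base name and to create the output directory first. It is an illustration under that assumption, not the repository's actual fix, and `result_path_for` is a hypothetical helper.

```python
import os


def result_path_for(image_path, result_dir):
    """Return '<result_dir>/<image stem>.txt', creating result_dir if needed."""
    # Treat backslashes as separators so Windows-style paths are handled
    # the same way on every platform (assumes file names contain none).
    name = os.path.basename(image_path.replace("\\", "/"))
    stem, _extension = os.path.splitext(name)
    os.makedirs(result_dir, exist_ok=True)
    return os.path.join(result_dir, stem + ".txt")
```

Opening the returned path for writing then no longer depends on a test_images folder existing inside the result directory.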
  • Upload pretrained models to other host

    Can you please upload the pretrained models to a site other than pan.baidu? Most non-Chinese users can't download from there. Maybe https://www.mediafire.com/, https://www.4shared.com/, https://mega.nz/, Google Drive, or https://zippyshare.com/. Thank you very much.

    opened by Johndirr 3
  • I made a pytorch-lightning implementation of your CTPN

    Hi! I've made a pytorch-lightning implementation of ctpn, mainly by using your code. Pytorch-lightning has many nice features, such as training with tpus/multiple gpus by changing one line of code, 16-bit precision, works on cpu (nice for testing), automatic learning rate finder... Would you be open to a pull request? Link to fork here! I'm in the process of converting your CRNN to pytorch-lightning as well.

    Here's the simplified training loop:

    import pytorch_lightning as pl

    # ICDARDataModule, CTPN_Model, the callbacks, config and max_epochs
    # are defined in the fork and are not shown here.
    datamodule = ICDARDataModule(
        config.icdar17_mlt_img_dir,
        config.icdar17_mlt_gt_dir,
        batch_size=1,
        num_workers=config.num_workers,
        shuffle=True,
    )

    len_train_dataset = len(datamodule.train_data)

    model = CTPN_Model()

    trainer = pl.Trainer(
        gpus=1,  # number of GPUs; use 0 to train on CPU
        max_epochs=max_epochs,
        log_every_n_steps=1,
        callbacks=[
            LoadCheckpoint(config.pretrained_weights),
            InitializeWeights(),
            LossAndCheckpointCallback(config, len_train_dataset),
        ],
    )

    trainer.fit(model, datamodule)
    
    opened by mathemusician 2
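  • Question about a piece of code

    Line 43 of ocr.pytorch/detect/ctpn_predict.py reads image = image.astype(np.float32) - config.IMAGE_MEAN. I don't understand the purpose of this step (is it mean subtraction?) or how the value of config.IMAGE_MEAN was chosen. I'm new to computer vision, so this is a beginner question. Thanks in advance.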

    opened by gi19901212 2
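
The line quoted in this question converts the image to float32 and subtracts a per-channel value, which NumPy broadcasts across every pixel. The sketch below shows that operation in isolation. The mean values are placeholders chosen for illustration and are not the repository's config.IMAGE_MEAN; such constants are typically the average channel intensities of the dataset the backbone was trained on, but the README does not say how this one was derived.

```python
import numpy as np

# Placeholder per-channel means for illustration only;
# NOT the values of config.IMAGE_MEAN in this repository.
EXAMPLE_MEAN = [100.0, 110.0, 120.0]


def subtract_mean(image, mean=EXAMPLE_MEAN):
    """Convert an H x W x 3 image to float32 and subtract one mean per channel."""
    mean = np.asarray(mean, dtype=np.float32)
    # Casting first avoids uint8 wrap-around when a pixel is below the mean.
    return image.astype(np.float32) - mean
```

After this step the pixel values are roughly centred on zero, which is the form many pretrained convolutional backbones expect as input.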
  • need help for training with custom images

    I have an Excel sheet of 10k crops with two columns:

    Column 1: Image_path (<path/abc.png>)

    Column 2: Ground_Truth ()

    data.csv looks like:

    path | gt

    C:/Users/1234/crop/ABC 07 07 2020_page1.png | 8 05 75 824.46Cr
    C:/Users/1234/crop/PQW 07 10 2020_page1.png | Time 11 42 23
    C:/Users/1234/crop/XRE 08 10 2020_page1.png | Account No. 200000592
    C:/Users/1234/crop/JKL 07 10 2020_page1.png | 1 00 00 00 000.00

    I need to use this as training input. Should I feed it into dataset.py first and then start training?

    Or could you help me with the steps?

    opened by anidiatm41 0
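
A first step for data in the "path | gt" layout shown in this question is to turn each row into an (image path, label) pair. The sketch below does only that. The input format expected by the repository's training code is not described in this README, so the pairs would still need to be adapted to it, and `parse_label_lines` is a hypothetical helper.

```python
def parse_label_lines(lines, separator="|"):
    """Parse 'path | label' rows into a list of (path, label) tuples."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line or separator not in line:
            continue  # skip blank or malformed rows
        # Split once so a separator inside the label is left untouched.
        path, label = (field.strip() for field in line.split(separator, 1))
        if path.lower() == "path" and label.lower() == "gt":
            continue  # skip the header row
        samples.append((path, label))
    return samples
```

Passing an open file object as `lines` works as well, since iterating over a text file yields one row at a time.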
  • Is there any option for language?

    I am trying to apply this model to my own images, which are in English, but I get many Chinese characters because I am using the pretrained weights from Baidu. Is there an option to predict only English, or are other trained weights available?

    opened by AymenSe 0
  • Can CRNN be used for the Khmer language?

    I tried to train the CRNN model, but the result I keep getting is:

    Not Covering Char: ១ - 6113
    Not Covering Char: ១ - 6113
    Not Covering Char: ៩ - 6121
    Not Covering Char: ៧ - 6119
    Not Covering Char: ១ - 6113
    Not Covering Char: ទ - 6033
    Not Covering Char: ស - 6047
    Not Covering Char: រ - 6042
    error
    Train loss: 0.000000

    Start val ~/image0-1.jpg ~/image0-1.jpg pred :—眯恂 target:គោគ្គនាមនិងនាមៈ កូល វន្ធសហា 0.0 ocr_acc: 0.000000

    How can I train the CRNN model successfully? Thank you for your help.

    opened by CHAMYJ 1
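
The numbers in the log from this report match the Unicode code points of the characters printed next to them (for example, ១ is U+17E1, which is 6113 in decimal). That suggests the training labels contain characters that are absent from the alphabet the model was configured with. The sketch below lists such characters so the alphabet can be extended before training. How this repository defines its alphabet is not shown in the README, and `uncovered_characters` is a hypothetical helper.

```python
def uncovered_characters(labels, alphabet):
    """Return {character: code point} for label characters not in the alphabet."""
    known = set(alphabet)
    missing = {}
    for label in labels:
        for char in label:
            if char not in known:
                missing[char] = ord(char)
    return missing
```

An empty result means every label character is covered; otherwise the returned characters are the ones to add to the alphabet, which also changes the size of the model's output layer.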
Owner
coura
CV&ML