A pure pytorch implemented ocr project including text detection and recognition

coura

Last update: Dec 30, 2022

Related tags

Computer Vision ocr text-recognition text-detection crnn ctpn ocr-pytorch

Overview

ocr.pytorch

A pure pytorch implemented ocr project.
Text detection is based on CTPN and text recognition is based on CRNN.
More detection and recognition methods will be supported!

Prerequisite

python-3.5+
pytorch-0.4.1+
torchvision-0.2.1
opencv-3.4.0.14
numpy-1.14.3

They can all be installed through pip except pytorch and torchvision. Both of these depend on your CUDA version, so follow the installation instructions on pytorch's official site.
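A quick way to see whether an environment matches the list above is to compare installed package versions against it. The sketch below is only an illustration: it treats every listed version as a minimum (the list marks only python and pytorch with "+"), and it assumes the pip distribution names torch, torchvision, opencv-python and numpy. The helper names are hypothetical and not part of the repository.

```python
import re
from importlib import metadata

# Assumption: every version in the prerequisite list is read as a minimum.
MINIMUMS = {
    "torch": "0.4.1",
    "torchvision": "0.2.1",
    "opencv-python": "3.4.0.14",
    "numpy": "1.14.3",
}


def parse_version(text):
    """Return the leading dotted numeric part of a version as a tuple of ints."""
    # Local suffixes such as "+cu117" are ignored.
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def meets_minimum(installed, minimum):
    """True when the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "1.14" and "1.14.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Map each package to True/False, or None when it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for package, ok in check_environment().items():
        print(package, "not installed" if ok is None else ("ok" if ok else "too old"))
```

importlib.metadata needs Python 3.8 or newer, which is above the python-3.5+ floor listed here, so on older interpreters the versions have to be checked by hand.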

Detection

Detection is based on CTPN; some code is borrowed from pytorch_ctpn.

Recognition

Recognition is based on CRNN; some code is borrowed from crnn.pytorch.

Test

Download the pretrained models from Baidu Netdisk (extract code: u2ff) or Google Drive and put the files into the checkpoints directory. Then run

python3 demo.py

Text detection and recognition are run on the image files in ./test_images, and the results are stored in ./test_result.

If you want to test a single image, run

python3 test_one.py [filename]

Train

The training code is in the train_code directory.
Train CTPN
Train CRNN

Licence

MIT License

Comments

Training CRNN & extracting the CTPN detection

@courao Thank you for your hard work.

Will you release the CRNN training code and documentation?

Can you add an option to extract the text lines detected by CTPN?

opened by ghost 26

loss=0.44 on the SROIE dataset

@courao I have been training for 1 day on the SROIE dataset and the loss is still 0.44! It works well on other datasets, but not on SROIE. Am I doing something wrong? dataset download link

opened by ghost 10

Error after modifying crnn_recognizer.py

Hello, I ran into the same problem as the earlier poster. I changed line 100 of crnn_recognizer.py to def __init__(self, model_path='/root/zjut/ocr.pytorch/checkpoints/CRNN.pth'). Running 'python demo.py' then fails with the following output:

Traceback (most recent call last):
  File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/main.py", line 432, in main
    run()
  File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/main.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/root/anaconda3/lib/python3.6/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/root/anaconda3/lib/python3.6/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/root/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/root/zjut/ocr.pytorch/demo.py", line 10, in <module>
    from ocr import ocr
  File "/root/zjut/ocr.pytorch/ocr.py", line 6, in <module>
    recognizer = PytorchOcr()
  File "/root/zjut/ocr.pytorch/recognize/crnn_recognizer.py", line 111, in __init__
    self.model.load_state_dict({k.replace('module.', ''): v for k, v in torch.load(model_path).items()})
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for CRNN:
Missing key(s) in state_dict: "conv1.weight", "conv1.bias", "conv2.weight", "conv2.bias", "conv3_1.weight", "conv3_1.bias", "bn3.weight", "bn3.bias", "bn3.running_mean", "bn3.running_var", "conv3_2.weight", "conv3_2.bias", "conv4_1.weight", "conv4_1.bias", "bn4.weight", "bn4.bias", "bn4.running_mean", "bn4.running_var", "conv4_2.weight", "conv4_2.bias", "conv5.weight", "conv5.bias", "bn5.weight", "bn5.bias", "bn5.running_mean", "bn5.running_var".
Unexpected key(s) in state_dict: "cnn.conv0.weight", "cnn.conv0.bias", "cnn.conv1.weight", "cnn.conv1.bias", "cnn.conv2.weight", "cnn.conv2.bias", "cnn.batchnorm2.weight", "cnn.batchnorm2.bias", "cnn.batchnorm2.running_mean", "cnn.batchnorm2.running_var", "cnn.batchnorm2.num_batches_tracked", "cnn.conv3.weight", "cnn.conv3.bias", "cnn.conv4.weight", "cnn.conv4.bias", "cnn.batchnorm4.weight", "cnn.batchnorm4.bias", "cnn.batchnorm4.running_mean", "cnn.batchnorm4.running_var", "cnn.batchnorm4.num_batches_tracked", "cnn.conv5.weight", "cnn.conv5.bias", "cnn.conv6.weight", "cnn.conv6.bias", "cnn.batchnorm6.weight", "cnn.batchnorm6.bias", "cnn.batchnorm6.running_mean", "cnn.batchnorm6.running_var", "cnn.batchnorm6.num_batches_tracked".
size mismatch for rnn.1.embedding.weight: copying a param with shape torch.Size([5997, 512]) from checkpoint, the shape in current model is torch.Size([5835, 512]).
size mismatch for rnn.1.embedding.bias: copying a param with shape torch.Size([5997]) from checkpoint, the shape in current model is torch.Size([5835]).

The CRNN.pth file is the one you provided on Baidu Netdisk.
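The error above reports three kinds of disagreement between the checkpoint and the model: parameter names the model expects but the file lacks, names the file has but the model does not, and tensors whose shapes differ (5997 versus 5835 output classes). That pattern suggests the checkpoint was saved from a different network definition and alphabet than the code loading it. A minimal sketch for listing such differences before calling load_state_dict follows. The function name diff_state_dicts is hypothetical, and FakeTensor is only a stand-in so the example runs without PyTorch; with PyTorch the two arguments would be model.state_dict() and the dictionary returned by torch.load(model_path).

```python
from collections import namedtuple

# Stand-in for a tensor: anything with a .shape attribute works here.
FakeTensor = namedtuple("FakeTensor", "shape")


def diff_state_dicts(model_state, ckpt_state):
    """Return (missing, unexpected, mismatched) between a model and a checkpoint."""
    # nn.DataParallel prefixes parameter names with "module."; strip it once.
    ckpt = {}
    for key, value in ckpt_state.items():
        if key.startswith("module."):
            key = key[len("module."):]
        ckpt[key] = value

    missing = sorted(set(model_state) - set(ckpt))      # model wants, file lacks
    unexpected = sorted(set(ckpt) - set(model_state))   # file has, model lacks
    mismatched = {}
    for key in sorted(set(ckpt) & set(model_state)):
        ckpt_shape = tuple(ckpt[key].shape)
        model_shape = tuple(model_state[key].shape)
        if ckpt_shape != model_shape:
            mismatched[key] = (ckpt_shape, model_shape)
    return missing, unexpected, mismatched


# Tiny illustration using names and shapes taken from the error message.
model_state = {
    "conv1.weight": FakeTensor((64, 1, 3, 3)),
    "rnn.1.embedding.weight": FakeTensor((5835, 512)),
}
ckpt_state = {
    "module.cnn.conv0.weight": FakeTensor((64, 1, 3, 3)),
    "module.rnn.1.embedding.weight": FakeTensor((5997, 512)),
}
missing, unexpected, mismatched = diff_state_dicts(model_state, ckpt_state)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

If the lists are non-empty, renaming keys will not help on its own; the model class and the alphabet length have to match the ones used when the checkpoint was saved.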

opened by wangguanhua 7

Error when running demo.py

C:\Users\Administrator\AppData\Local\Programs\Python\Python37\python.exe C:/Users/Administrator/Desktop/pytorch/ocr.pytorch-master/demo.py
Traceback (most recent call last):
./test_result\test_images\t1.txt
  File "C:/Users/Administrator/Desktop/pytorch/ocr.pytorch-master/demo.py", line 29, in <module>
    txt_f = open(txt_file, 'w')
FileNotFoundError: [Errno 2] No such file or directory: './test_result\test_images\t1.txt'

Process finished with exit code 1

I noticed that demo.py clears the contents of the test_result folder when it runs, and then this error is raised. Could the author please take a look?
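The path in the error, './test_result\test_images\t1.txt', contains a test_images subfolder that does not exist inside test_result. One plausible reading is that on Windows the output name is built from the whole image path instead of the bare file name. The sketch below shows one way to build the output path that works with either path separator. It assumes demo.py derives the text file name from the image path; result_txt_path is a hypothetical helper, not a function from the repository.

```python
import os


def result_txt_path(image_path, result_dir):
    """Return result_dir/<image stem>.txt, creating result_dir if needed."""
    # Treat both "/" and "\" as separators so Windows-style paths also work.
    name = image_path.replace("\\", "/").rsplit("/", 1)[-1]
    stem = os.path.splitext(name)[0]
    # The output folder may have been emptied or removed by an earlier step.
    os.makedirs(result_dir, exist_ok=True)
    return os.path.join(result_dir, stem + ".txt")


if __name__ == "__main__":
    print(result_txt_path("./test_images\\t1.png", "./test_result"))
```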

opened by gi19901212 4

Upload pretrained models to other host

Can you please upload the pretrained models to a site other than pan.baidu? Most non-Chinese users can't download from there. Maybe https://www.mediafire.com/, https://www.4shared.com/, https://mega.nz/, Google Drive or https://zippyshare.com/. Thank you very much.

opened by Johndirr 3

I made a pytorch-lightning implementation of your CTPN

Hi! I've made a pytorch-lightning implementation of ctpn, mainly by using your code. Pytorch-lightning has many nice features, such as training with tpus/multiple gpus by changing one line of code, 16-bit precision, works on cpu (nice for testing), automatic learning rate finder... Would you be open to a pull request? Link to fork here! I'm in the process of converting your CRNN to pytorch-lightning as well.

Here's the simplified training loop:

import pytorch_lightning as pl

# ICDARDataModule, CTPN_Model, the three callbacks, config and max_epochs
# come from the fork and are not defined in this snippet.
datamodule = ICDARDataModule(
    config.icdar17_mlt_img_dir,
    config.icdar17_mlt_gt_dir,
    batch_size=1,
    num_workers=config.num_workers,
    shuffle=True,
)

# The dataset length is passed to the loss/checkpoint callback below.
len_train_dataset = len(datamodule.train_data)

model = CTPN_Model()

trainer = pl.Trainer(
    gpus=1,  # number of gpus, 0 if you want to use cpu
    max_epochs=max_epochs,
    log_every_n_steps=1,
    callbacks=[
        LoadCheckpoint(config.pretrained_weights),
        InitializeWeights(),
        LossAndCheckpointCallback(config, len_train_dataset),
    ],
)

trainer.fit(model, datamodule)

opened by mathemusician 2

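Question about a piece of code

In ocr.pytorch/detect/ctpn_predict.py, line 43 reads image = image.astype(np.float32) - config.IMAGE_MEAN. I don't understand the purpose of this step (is it mean subtraction?) or how the value of config.IMAGE_MEAN was chosen. I'm new to computer vision, so this is a basic question. Thanks in advance.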
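The quoted line has the form of per-channel mean subtraction: the image is cast to float and a fixed mean is subtracted from each color channel so the network sees roughly zero-centered input. The sketch below illustrates the operation with NumPy. The three numbers are the widely used ImageNet channel means for VGG-style backbones; that config.IMAGE_MEAN holds exactly these values, in this channel order, is an assumption and should be checked against the repository's config file.

```python
import numpy as np

# Assumption: ImageNet per-channel means (R, G, B) as commonly used with VGG.
# The repository's config.IMAGE_MEAN may differ in values or channel order.
IMAGE_MEAN = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def subtract_mean(image):
    """Zero-center an H x W x 3 uint8 image by subtracting the channel means."""
    # Cast first: uint8 arithmetic cannot represent negative results.
    return image.astype(np.float32) - IMAGE_MEAN


# A 2 x 2 gray image with every pixel equal to 128.
image = np.full((2, 2, 3), 128, dtype=np.uint8)
centered = subtract_mean(image)
print(centered[0, 0])
```

The mean has shape (3,), so NumPy broadcasting applies it to the last axis of every pixel. The same mean has to be used at training and at prediction time, which is why it is kept in a shared config.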

opened by gi19901212 2

Need help for training with custom images

I have an Excel sheet of 10k crops containing two columns:

Column 1: Image_path (<path/abc.png>)

Column 2: Ground_Truth ()

data.csv looks like:

path | gt
C:/Users/1234/crop/ABC 07 07 2020_page1.png | 8 05 75 824.46Cr
C:/Users/1234/crop/PQW 07 10 2020_page1.png | Time 11 42 23
C:/Users/1234/crop/XRE 08 10 2020_page1.png | Account No. 200000592
C:/Users/1234/crop/JKL 07 10 2020_page1.png | 1 00 00 00 000.00

Now I need to use this input for training. Should I feed it to dataset.py first and then start training?

Or could you help me with the steps?
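Whatever the training code expects, a first step is to read the file into (image path, ground truth) pairs. The sketch below parses the "path | gt" layout shown in the question. It does not cover how the pairs are then passed to the repository's dataset.py, which the page does not describe, and read_pairs is a hypothetical helper name.

```python
import io


def read_pairs(handle):
    """Parse 'path | gt' lines into a list of (path, ground_truth) tuples."""
    pairs = []
    for number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first "|" only, so the text itself may contain "|".
        path, separator, text = line.partition("|")
        if not separator:
            raise ValueError("line %d: missing '|' separator" % number)
        pairs.append((path.strip(), text.strip()))
    # Drop the header row if present.
    if pairs and pairs[0] == ("path", "gt"):
        pairs = pairs[1:]
    return pairs


SAMPLE = """path | gt
C:/Users/1234/crop/ABC 07 07 2020_page1.png | 8 05 75 824.46Cr
C:/Users/1234/crop/PQW 07 10 2020_page1.png | Time 11 42 23
"""

pairs = read_pairs(io.StringIO(SAMPLE))
for image_path, ground_truth in pairs:
    print(image_path, "->", ground_truth)
```

For a real file, the same function can be called on open("data.csv", encoding="utf-8").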

opened by anidiatm41 0

Is there any option for language?

I am trying to apply this model to my own images, which are in English, but I get many Chinese characters, presumably because I am using the pretrained weights from Baidu. Is there an option to predict only English, or are there other trained weights?
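Nothing on this page indicates that the project has a language switch. One generic workaround with a CTC-based recognizer such as CRNN is to restrict decoding to an allowed character set, so that at each time step only the blank and the allowed characters can win. The sketch below shows greedy CTC decoding with such a restriction in plain Python. It assumes class 0 is the blank and class i maps to alphabet[i - 1]; both conventions are assumptions here, and ctc_greedy_decode is a hypothetical helper. A model trained mostly on Chinese text may still read English poorly with this restriction.

```python
def ctc_greedy_decode(scores, alphabet, allowed=None, blank=0):
    """Greedy CTC decoding, optionally limited to the characters in `allowed`.

    scores: one list of class scores per time step.
    Assumed layout: index `blank` is the CTC blank, index i is alphabet[i - 1].
    """
    allowed_ids = {blank}
    for index, char in enumerate(alphabet, start=1):
        if allowed is None or char in allowed:
            allowed_ids.add(index)
    candidates = sorted(allowed_ids)

    decoded = []
    previous = blank
    for step in scores:
        # Best class among the permitted ones only.
        best = max(candidates, key=lambda i: step[i])
        # CTC rule: drop blanks and collapse immediate repeats.
        if best != blank and best != previous:
            decoded.append(alphabet[best - 1])
        previous = best
    return "".join(decoded)


alphabet = "a中b"
scores = [
    [0.1, 0.2, 0.6, 0.1],  # unrestricted winner: 中
    [0.7, 0.1, 0.1, 0.1],  # blank
    [0.1, 0.1, 0.2, 0.6],  # b
]
print(ctc_greedy_decode(scores, alphabet))
print(ctc_greedy_decode(scores, alphabet, allowed="ab"))
```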

opened by AymenSe 0

Can CRNN be used for the Khmer language?

I tried to train the CRNN model, but the output I keep getting is:

Not Covering Char: ១ - 6113
Not Covering Char: ១ - 6113
Not Covering Char: ៩ - 6121
Not Covering Char: ៧ - 6119
Not Covering Char: ១ - 6113
Not Covering Char: ទ - 6033
Not Covering Char: ស - 6047
Not Covering Char: រ - 6042
error
Train loss: 0.000000

Start val
~/image0-1.jpg
~/image0-1.jpg
pred :—眯恂 target:គោគ្គនាមនិងនាមៈ កូល វន្ធសហា
0.0
ocr_acc: 0.000000

How can I train the CRNN model successfully? Thank you for your answer.
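In the log above, the number after each character equals its Unicode code point (for example ord("១") is 6113), so the "Not Covering Char" lines most likely mean that these Khmer characters are absent from the alphabet the training code uses. That reading is an inference from the log, not something confirmed by the repository. Under that assumption, a first check is to compare the characters in the training labels with the alphabet, as sketched below. Both helper names are hypothetical.

```python
def uncovered_chars(labels, alphabet):
    """Return the sorted characters that occur in labels but not in alphabet."""
    known = set(alphabet)
    return sorted({char for label in labels for char in label if char not in known})


def build_alphabet(labels):
    """Build an alphabet string holding every character seen in the labels."""
    return "".join(sorted({char for label in labels for char in label}))


labels = ["១៩", "ab"]
for char in uncovered_chars(labels, alphabet="ab"):
    # Same layout as the log lines: character, then its code point.
    print("not covered:", char, "-", ord(char))
print("alphabet from labels:", build_alphabet(labels))
```

If characters are reported, the alphabet and the size of the network's output layer have to be rebuilt for the new character set; a checkpoint trained for a different alphabet cannot produce characters it never had a class for.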

opened by CHAMYJ 1

Owner

coura

CV&ML


