An OCR project implemented purely in PyTorch, covering text detection and recognition

Overview

ocr.pytorch

An OCR project implemented purely in PyTorch.
Text detection is based on CTPN and text recognition is based on CRNN.
More detection and recognition methods will be supported.

Prerequisite

  • python-3.5+
  • pytorch-0.4.1+
  • torchvision-0.2.1
  • opencv-3.4.0.14
  • numpy-1.14.3

All of these can be installed with pip except PyTorch and torchvision. Both depend on your CUDA version, so follow the installation instructions on PyTorch's official site.

Detection

Detection is based on CTPN; some of the code is borrowed from pytorch_ctpn. Sample detection results: detect1, detect2.

Recognition

Recognition is based on CRNN; some of the code is borrowed from crnn.pytorch.

Test

Download the pretrained models from Baidu Netdisk (extract code: u2ff) or Google Drive and put the files into the checkpoints directory. Then run

python3 demo.py

Text detection and recognition are run on the image files in ./test_images, and the results are stored in ./test_result.

To test a single image, run

python3 test_one.py [filename]

Train

The training code is in the train_code directory.
Train CTPN
Train CRNN

License

MIT License

Issues
  • Training CRNN & extracting the CTPN detection

    @courao Thank you for your hard work.

    • Will you release the CRNN training code and documentation?
    • Can you add an option to extract the text lines detected by CTPN?
    opened by ghost 26
  • CRNN

    Could you tell us the format of the CRNN training data?

    opened by 897486562 11
  • loss=0.44 on the SROIE dataset

    @courao I have been training on the SROIE dataset for one day and the loss is still 0.44. The model works well on other datasets, but not on SROIE. Am I doing something wrong? dataset download link

    X00016469620 X00016469622

    opened by ghost 10
  • Error after modifying crnn_recognizer.py

    Hello, I have run into the same problem as the earlier poster. I changed line 100 of crnn_recognizer.py to def __init__(self, model_path='/root/zjut/ocr.pytorch/checkpoints/CRNN.pth'). Running 'python demo.py' then fails with the following output:

    Traceback (most recent call last):
      File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/ptvsd_launcher.py", line 43, in <module>
        main(ptvsdArgs)
      File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/main.py", line 432, in main
        run()
      File "/root/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/main.py", line 316, in run_file
        runpy.run_path(target, run_name='__main__')
      File "/root/anaconda3/lib/python3.6/runpy.py", line 263, in run_path
        pkg_name=pkg_name, script_name=fname)
      File "/root/anaconda3/lib/python3.6/runpy.py", line 96, in _run_module_code
        mod_name, mod_spec, pkg_name, script_name)
      File "/root/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/root/zjut/ocr.pytorch/demo.py", line 10, in <module>
        from ocr import ocr
      File "/root/zjut/ocr.pytorch/ocr.py", line 6, in <module>
        recognizer = PytorchOcr()
      File "/root/zjut/ocr.pytorch/recognize/crnn_recognizer.py", line 111, in __init__
        self.model.load_state_dict({k.replace('module.', ''): v for k, v in torch.load(model_path).items()})
      File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 845, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for CRNN:
      Missing key(s) in state_dict: "conv1.weight", "conv1.bias", "conv2.weight", "conv2.bias", "conv3_1.weight", "conv3_1.bias", "bn3.weight", "bn3.bias", "bn3.running_mean", "bn3.running_var", "conv3_2.weight", "conv3_2.bias", "conv4_1.weight", "conv4_1.bias", "bn4.weight", "bn4.bias", "bn4.running_mean", "bn4.running_var", "conv4_2.weight", "conv4_2.bias", "conv5.weight", "conv5.bias", "bn5.weight", "bn5.bias", "bn5.running_mean", "bn5.running_var".
      Unexpected key(s) in state_dict: "cnn.conv0.weight", "cnn.conv0.bias", "cnn.conv1.weight", "cnn.conv1.bias", "cnn.conv2.weight", "cnn.conv2.bias", "cnn.batchnorm2.weight", "cnn.batchnorm2.bias", "cnn.batchnorm2.running_mean", "cnn.batchnorm2.running_var", "cnn.batchnorm2.num_batches_tracked", "cnn.conv3.weight", "cnn.conv3.bias", "cnn.conv4.weight", "cnn.conv4.bias", "cnn.batchnorm4.weight", "cnn.batchnorm4.bias", "cnn.batchnorm4.running_mean", "cnn.batchnorm4.running_var", "cnn.batchnorm4.num_batches_tracked", "cnn.conv5.weight", "cnn.conv5.bias", "cnn.conv6.weight", "cnn.conv6.bias", "cnn.batchnorm6.weight", "cnn.batchnorm6.bias", "cnn.batchnorm6.running_mean", "cnn.batchnorm6.running_var", "cnn.batchnorm6.num_batches_tracked".
      size mismatch for rnn.1.embedding.weight: copying a param with shape torch.Size([5997, 512]) from checkpoint, the shape in current model is torch.Size([5835, 512]).
      size mismatch for rnn.1.embedding.bias: copying a param with shape torch.Size([5997]) from checkpoint, the shape in current model is torch.Size([5835]).

    The CRNN.pth file is the one you provided on Baidu Netdisk.
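
An error of this kind usually means the checkpoint was saved from a different model definition than the one being instantiated. The sketch below is one way to compare the two before calling load_state_dict. The helper name compare_state_dicts is hypothetical and not part of this project. It works on plain name-to-shape mappings so that it runs without PyTorch; with PyTorch installed, such a mapping could be built with {k: tuple(v.shape) for k, v in model.state_dict().items()}. The convolution shapes in the example are made up for illustration, and only the 5997 and 5835 sizes come from the error message.

```python
def compare_state_dicts(model_shapes, ckpt_shapes, strip_prefix="module."):
    """Compare parameter names and shapes of a model and a checkpoint.

    Both arguments map parameter name -> shape tuple.
    Hypothetical helper for illustration, not part of ocr.pytorch.
    """
    # Drop the prefix that nn.DataParallel adds to every checkpoint key.
    ckpt = {
        (name[len(strip_prefix):] if name.startswith(strip_prefix) else name): tuple(shape)
        for name, shape in ckpt_shapes.items()
    }
    model = {name: tuple(shape) for name, shape in model_shapes.items()}

    missing = sorted(set(model) - set(ckpt))      # in the model, not in the checkpoint
    unexpected = sorted(set(ckpt) - set(model))   # in the checkpoint, not in the model
    mismatched = {
        name: (ckpt[name], model[name])           # (checkpoint shape, model shape)
        for name in sorted(set(model) & set(ckpt))
        if ckpt[name] != model[name]
    }
    return missing, unexpected, mismatched


# Illustrative shapes; only the 5997/5835 sizes are taken from the report above.
model_shapes = {
    "conv1.weight": (64, 1, 3, 3),
    "rnn.1.embedding.weight": (5835, 512),
}
ckpt_shapes = {
    "module.cnn.conv0.weight": (64, 1, 3, 3),
    "module.rnn.1.embedding.weight": (5997, 512),
}
missing, unexpected, mismatched = compare_state_dicts(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

If the layer names differ, the model class probably does not match the checkpoint. A different size in the final layer suggests, though it does not prove, that the checkpoint was trained with a different alphabet size than the one the current code is configured for.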

    opened by wangguanhua 7
  • Error when running the demo

    111

    opened by vcbeaut 5
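  • Could you provide the training code?

    Hello, I tried one of the models you provided and it works quite well. Thank you very much for sharing the code. I am looking forward to the training code and dataset for character recognition.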

    opened by yugengde 4
  • Spaces are not recognized

    image

    opened by Crescentz 4
  • Error when running demo.py

    C:\Users\Administrator\AppData\Local\Programs\Python\Python37\python.exe C:/Users/Administrator/Desktop/pytorch/ocr.pytorch-master/demo.py
    Traceback (most recent call last):
    ./test_result\test_images\t1.txt
      File "C:/Users/Administrator/Desktop/pytorch/ocr.pytorch-master/demo.py", line 29, in <module>
        txt_f = open(txt_file, 'w')
    FileNotFoundError: [Errno 2] No such file or directory: './test_result\test_images\t1.txt'

    Process finished with exit code 1

    I noticed that demo.py clears the contents of the test_result folder when it runs and then reports this error. Could the author please take a look?
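
The path in the error, './test_result\test_images\t1.txt', suggests that the output file name still contains the image's folder, so open() targets a subdirectory that does not exist. The following is a minimal sketch of one way to avoid that, assuming the results should go directly into the result directory. The helper result_txt_path is hypothetical and is not the code in demo.py.

```python
import ntpath
import os
import tempfile


def result_txt_path(image_path, result_dir):
    """Return the .txt path for an image's results inside result_dir.

    Hypothetical helper for illustration, not part of ocr.pytorch.
    """
    # ntpath.basename splits on both '/' and '\\', so Windows-style paths
    # are reduced to the bare file name on any platform.
    stem = os.path.splitext(ntpath.basename(image_path))[0]
    # Create the output directory if it is missing; no error if it exists.
    os.makedirs(result_dir, exist_ok=True)
    return os.path.join(result_dir, stem + ".txt")


# Usage with a throwaway directory standing in for ./test_result.
with tempfile.TemporaryDirectory() as tmp:
    txt_file = result_txt_path("test_images\\t1.jpg", os.path.join(tmp, "test_result"))
    with open(txt_file, "w") as txt_f:
        txt_f.write("recognized text goes here\n")
    created = os.path.isfile(txt_file)
```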

    opened by gi19901212 4
  • Upload pretrained models to other host

    Could you please upload the pretrained models to a site other than pan.baidu? Most non-Chinese users can't download from there. Options include https://www.mediafire.com/, https://www.4shared.com/, https://mega.nz/, Google Drive, or https://zippyshare.com/. Thank you very much.

    opened by Johndirr 3
  • need help for training with custom images

    I have an excel of 10k crops containing two columns:

    Column 1: Image_path (<path/abc.png>)

    Column 2: Ground_Truth ()

    data.csv looks like:

    path | gt
    C:/Users/1234/crop/ABC 07 07 2020_page1.png | 8 05 75 824.46Cr
    C:/Users/1234/crop/PQW 07 10 2020_page1.png | Time 11 42 23
    C:/Users/1234/crop/XRE 08 10 2020_page1.png | Account No. 200000592
    C:/Users/1234/crop/JKL 07 10 2020_page1.png | 1 00 00 00 000.00

    Now I need to use this as input for training. Should I feed it into dataset.py first and then start training?

    Alternatively, could you help me with the steps?
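
As a starting point, the spreadsheet can be read into (image path, ground truth) pairs with the standard csv module. This is only a sketch: the column names path and gt come from the example above, the comma delimiter is an assumption, and the format that the scripts in train_code actually expect is not described here, so the pairs would still need converting to that format.

```python
import csv
import io


def load_pairs(csv_file, delimiter=","):
    """Read (image_path, ground_truth) pairs from a CSV with 'path' and 'gt' columns.

    Hypothetical helper for illustration, not part of ocr.pytorch.
    """
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    pairs = []
    for row in reader:
        path = (row.get("path") or "").strip()
        text = (row.get("gt") or "").strip()
        if path and text:  # skip rows with an empty path or an empty label
            pairs.append((path, text))
    return pairs


# An in-memory file stands in for data.csv; pass open("data.csv", newline="") for a real file.
sample = io.StringIO(
    "path,gt\n"
    "C:/Users/1234/crop/ABC 07 07 2020_page1.png,8 05 75 824.46Cr\n"
    "C:/Users/1234/crop/PQW 07 10 2020_page1.png,Time 11 42 23\n"
)
pairs = load_pairs(sample)

# Every distinct character in the labels; a recognizer's alphabet has to cover all of them.
label_chars = sorted({ch for _, text in pairs for ch in text})
print(len(pairs), "pairs,", len(label_chars), "distinct characters")
```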

    opened by anidiatm41 0
  • Is there any option for language?

    I am trying to apply this model to my own images, which are in English, but I get many Chinese characters because I am using the pretrained weights from Baidu. Is there an option to predict only English, or are there other trained weights?
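
One possible workaround, which is not a feature of this project, is to restrict decoding to an allowed character set. The sketch below shows greedy CTC decoding over per-timestep scores where only the blank and the allowed characters may win. It assumes class 0 is the CTC blank and class i maps to alphabet[i - 1]. That layout is an assumption and would need checking against the actual checkpoint. The function name and the toy scores are illustrative.

```python
def greedy_ctc_decode(scores, alphabet, allowed):
    """Greedy CTC decoding restricted to an allowed set of characters.

    scores:   list of per-timestep score lists, one score per class.
    alphabet: alphabet[i - 1] is the character for class i (assumption: class 0 is blank).
    allowed:  set of characters that may appear in the output.
    """
    # Classes that may be selected: the blank plus every allowed character.
    keep = [0] + [i + 1 for i, ch in enumerate(alphabet) if ch in allowed]
    out, prev = [], 0
    for step in scores:
        best = max(keep, key=lambda i: step[i])
        # Standard CTC collapse: skip blanks and immediate repeats.
        if best != 0 and best != prev:
            out.append(alphabet[best - 1])
        prev = best
    return "".join(out)


alphabet = ["a", "b", "中"]
scores = [
    [0.1, 0.6, 0.1, 0.2],  # 'a'
    [0.1, 0.5, 0.1, 0.3],  # 'a' again, collapsed
    [0.7, 0.1, 0.1, 0.1],  # blank
    [0.1, 0.2, 0.3, 0.4],  # '中' wins unless it is masked out, then 'b'
]
restricted = greedy_ctc_decode(scores, alphabet, allowed=set("ab"))
unrestricted = greedy_ctc_decode(scores, alphabet, allowed=set(alphabet))
print(restricted, unrestricted)
```

Masking only changes which class is picked at each step. It does not make a model trained mostly on Chinese text more accurate on English, so weights trained on English data would likely still give better results.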

    opened by AymenSe 0
  • What is going on with CRNN? Aborted (core dumped)

    free(): invalid next size (normal) Aborted (core dumped)

    opened by zhuofalin 0
  • Can CRNN be used for the Khmer language?

    I tried to train the CRNN model, but the result is always:

    Not Covering Char: ១ - 6113
    Not Covering Char: ១ - 6113
    Not Covering Char: ៩ - 6121
    Not Covering Char: ៧ - 6119
    Not Covering Char: ១ - 6113
    Not Covering Char: ទ - 6033
    Not Covering Char: ស - 6047
    Not Covering Char: រ - 6042
    error
    Train loss: 0.000000

    Start val
    ~/image0-1.jpg ~/image0-1.jpg
    pred :—眯恂
    target:គោគ្គនាមនិងនាមៈ កូល វន្ធសហា
    0.0
    ocr_acc: 0.000000

    How can I train the CRNN model successfully? Thank you for your help.
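
The "Not Covering Char" lines suggest that the labels contain characters missing from the alphabet used for training. The number after each character matches its Unicode code point, for example 6113 for ១. The sketch below is one way to list such characters and extend an alphabet with them. The helper name uncovered_chars and the starting alphabet are hypothetical, and how this project stores its alphabet (another issue mentions an alphabet.pkl file) is not shown here.

```python
def uncovered_chars(labels, alphabet):
    """Return characters used in labels but missing from alphabet, in first-seen order.

    Hypothetical helper for illustration, not part of ocr.pytorch.
    """
    known = set(alphabet)
    missing = []
    for text in labels:
        for ch in text:
            if ch not in known:
                known.add(ch)
                missing.append(ch)
    return missing


# Illustrative starting alphabet and labels, using characters from the log above.
alphabet = list("0123456789")
labels = ["១៩៧១", "ទស"]

missing = uncovered_chars(labels, alphabet)
extended = alphabet + missing

for ch in missing:
    print(ch, ord(ch))  # code points, comparable to the numbers in the log
# A CTC recognizer typically needs one output class per character plus one blank.
print("classes needed:", len(extended) + 1)
```

After extending the alphabet, the size of the model's output layer has to match the new number of classes, so a checkpoint trained with the old alphabet cannot be loaded unchanged into the final layer.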

    opened by CHAMYJ 1
  • Several different CRNN models in crnn.py

    Could someone explain the differences between the several CRNN models in crnn.py? Is there a model that does not restrict the input image size?

    opened by HouYanGG 1
  • I made a pytorch-lightning implementation of your CTPN

    Hi! I've made a pytorch-lightning implementation of ctpn, mainly by using your code. Pytorch-lightning has many nice features, such as training with tpus/multiple gpus by changing one line of code, 16-bit precision, works on cpu (nice for testing), automatic learning rate finder... Would you be open to a pull request? Link to fork here! I'm in the process of converting your CRNN to pytorch-lightning as well.

    Here's the simplified training loop:

    import pytorch_lightning as pl

    # ICDARDataModule, CTPN_Model, config, max_epochs, and the callbacks used
    # below are assumed to be defined in the fork; they are not shown here.
    datamodule = ICDARDataModule(
        config.icdar17_mlt_img_dir,
        config.icdar17_mlt_gt_dir,
        batch_size=1,
        num_workers=config.num_workers,
        shuffle=True,
    )

    len_train_dataset = len(datamodule.train_data)

    model = CTPN_Model()

    trainer = pl.Trainer(
        gpus=1,  # number of GPUs; use 0 to train on the CPU
        max_epochs=max_epochs,
        log_every_n_steps=1,
        callbacks=[
            LoadCheckpoint(config.pretrained_weights),
            InitializeWeights(),
            LossAndCheckpointCallback(config, len_train_dataset),
        ],
    )

    trainer.fit(model, datamodule)
    
    opened by mathemusician 2
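  • Empty predictions after retraining on my own dataset

    Hello. I prepared my own dataset following the required format. The training code runs, but the prediction is always an empty string. I have already updated the alphabet.pkl file. Could you tell me what might be wrong?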

    opened by BobM-DS 0
  • CRNN training loss stays above 140 and accuracy is 0

    For the CRNN part of this framework, could someone explain how train.py, train_python_ctc.py, keys.py, and recognizer.py are used? When I train on Chinese text, the loss is extremely large.

    opened by HouYanGG 1
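  • CRNN reports that str is not usable; I don't know how to fix it, please advise

    image

    I can't figure out this problem, and a Baidu search turned up nothing. I'm not sure whether it is an issue with my environment. Could someone take a look?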

    opened by HouYanGG 5
Owner
coura
CV&ML
coura
Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

Christian Bartz 487 Dec 28, 2021
Motion detector, Full body detection, Upper body detection, Cat face detection, Smile detection, Face detection (haar cascade), Silverware detection, Face detection (lbp), and Sending email notifications

Security camera running OpenCV for object and motion detection. The camera will send email with image of any objects it detects. It also runs a server that provides web interface with live stream video.

Peace 10 Jun 30, 2021
A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).

OCR Resources This repository contains a collection of resources (including the papers and datasets) of OCR (Optical Character Recognition). Contents

Zuming Huang 347 Dec 21, 2021
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Alan Tang 348 Dec 28, 2021
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

null 18.6k Jan 12, 2022
Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. This Neural Network (NN) model recognizes the text contained in the images of segmented words.

Handwritten-Text-Recognition Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. T

null 17 Dec 22, 2021
It is an image OCR tool using the Tesseract-OCR engine with the pytesseract package, and it has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

Khant Htet Aung 3 Oct 15, 2021
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 6, 2021
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

Jaided AI 13.5k Jan 11, 2022
A novel region proposal network for more general object detection ( including scene text detection ).

DeRPN: Taking a further step toward more general object detection DeRPN is a novel region proposal network which concentrates on improving the adaptiv

Deep Learning and Vision Computing Lab, SCUT 149 Dec 27, 2021
CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

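Introduction: end-to-end detection and recognition of variable-length Chinese text, implemented with Tensorflow and Keras. Text detection: CTPN. Text recognition: DenseNet + CTC. Environment setup: sh setup.sh. Note: in a CPU environment, comment out the "for gpu" section and uncomment the "for cpu" section before running. Demo: put test images into test_images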

Yang Chenguang 2.5k Jan 7, 2022
CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

Watson Yang 349 Dec 29, 2021
Pure Javascript OCR for more than 100 Languages 📖🎉🖥

Version 2 is now available and under development in the master branch, read a story about v2: Why I refactor tesseract.js v2? Check the support/1.x br

Project Naptha 25.4k Jan 13, 2022
Some Boring Research About Products Recognition, Duplicate Img Detection, Img Stitch, OCR

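Products Recognition Introduction: product recognition, focused on identifying product information in shelf images from complex retail store scenes. Main components: duplicate image detection [progress 4/10], image stitching [progress 0/10], object detection [progress 0/10], product recognition [progress 1/10], OCR [progress 1/10].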

zhenjieWang 18 Oct 26, 2021
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 482 Jan 18, 2022
Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Handwritten Text Recognition with TensorFlow Update 2021: more robust model, faster dataloader, word beam search decoder also available for Windows Up

Harald Scheidl 1.3k Jan 13, 2022
OCR system for Arabic language that converts images of typed text to machine-encoded text.

Arabic OCR OCR system for Arabic language that converts images of typed text to machine-encoded text. The system currently supports only letters (29 l

Hussein Youssef 86 Jan 13, 2022
A curated list of resources for text detection/recognition (optical character recognition ) with deep learning methods.

awesome-deep-text-detection-recognition A curated list of awesome deep learning based papers on text detection and recognition. Text Detection Papers

null 2.3k Jan 17, 2022