A collection of OCR (Optical Character Recognition) resources, including papers and datasets.

Zuming Huang

Last update: Jan 3, 2023

OCR Resources

This repository collects resources for OCR (Optical Character Recognition), including papers and datasets.

Papers by Year

Papers by Topics

Papers by Conferences and Journals

Datasets

References

HCIILAB Scene-Text-Detection. https://github.com/HCIILAB/Scene-Text-Detection
HCIILAB Scene-Text-Recognition. https://github.com/HCIILAB/Scene-Text-Recognition
HCIILAB Scene-Text-End2end. https://github.com/HCIILAB/Scene-Text-End2end
A general list of resources to image text localization and recognition. https://github.com/whitelok/image-text-localization-recognition
A curated list of resources dedicated to scene text localization and recognition. https://github.com/chongyangtao/Awesome-Scene-Text-Recognition
A curated list of resources for text detection/recognition (optical character recognition) with deep learning methods. https://github.com/hwalsuklee/awesome-deep-text-detection-recognition
Tracking the latest progress in Scene Text Detection and Recognition: Must-read papers well organized. https://github.com/Jyouhou/SceneTextPapers
Links to awesome OCR projects. https://github.com/kba/awesome-ocr
A curated list of promising OCR resources. https://github.com/wanghaisheng/awesome-ocr

