A curated list of resources for text detection/recognition (optical character recognition) with deep learning methods.

Overview

awesome-deep-text-detection-recognition

A curated list of awesome deep learning based papers on text detection and recognition.

Text Detection

  • Papers are sorted by published date.
  • IC is short for ICDAR.
  • Score is the F1-score for the localization task.
    • (L) marks a score taken from the leaderboard.
    • (L) is provided when the leaderboard score differs from the score reported in the paper.
  • *CODE means official code and CODE(M) means that a trained model is provided.
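As a rough illustration of what a localization F1-score measures, here is a minimal sketch. It assumes axis-aligned boxes, an IoU threshold of 0.5, and greedy one-to-one matching. These are simplifying assumptions for illustration only: the function names are hypothetical, and the ICDAR benchmarks define their own evaluation protocols, which the scores in the table follow.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_f1(preds: List[Box], gts: List[Box], thresh: float = 0.5) -> float:
    """F1 = 2PR / (P + R) after greedy one-to-one matching (an assumption)."""
    unmatched = list(gts)
    tp = 0
    for p in preds:
        # Pick the still-unmatched ground-truth box that overlaps most.
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= thresh:
            unmatched.remove(best)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With three predictions of which two match the two ground-truth boxes, precision is 2/3, recall is 1, and the F1-score is 0.8.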
Conf. Date Title IC13 IC15 Resources
'14-ECCV 14/10/07 Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees
'15-CVPR 15/06/01 Symmetry-based text line detection in natural scenes 0.8043 PRJ
CODE
'16-TIP 15/10/12 Text-Attentional Convolutional Neural Networks for Scene Text Detection 0.8165
'15-ICCV 15/12/13 Text Flow: A Unified Text Detection System in Natural Scene Images 0.8025
'16-arXiv 16/03/31 Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network 0.86
'16-CVPR 16/04/14 Multi-Oriented Text Detection with Fully Convolutional Networks 0.83 0.54 *TORCH(M)
'16-CVPR 16/04/22 Synthetic Data for Text Localisation in Natural Images 0.847
(L)0.8359
CODE
DB
'16-arXiv 16/06/29 Scene Text Detection Via Holistic, Multi-Channel Prediction 0.8433 0.6477
'16-ECCV 16/09/12 Detecting Text in Natural Image with Connectionist Text Proposal Network 0.8215 0.6085 *CAFFE(M)
CAFFE
TF(M)
TF
DEMO
BLOG(CH)
'17-AAAI 16/11/21 TextBoxes: A fast text detector with a single deep neural network 0.85
(L)0.8767
*CAFFE(M)
TF
BLOG(KR)
'18-TM 17/03/03 Arbitrary-Oriented Scene Text Detection via Rotation Proposals 0.9125 0.8020 *CAFFE
'17-CVPR 17/03/04 Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection 0.7064
'17-CVPR 17/03/19 Detecting Oriented Text in Natural Images by Linking Segments 0.853 0.75
(L)0.7636
*TF(M)
TF(M)
SLIDE
VIDEO
'17-arXiv 17/03/24 Deep Direct Regression for Multi-Oriented Scene Text Detection 0.86 0.81
'17-arXiv 17/04/03 Cascaded Segmentation-Detection Networks for Word-Level Text Spotting 0.86 0.71
'17-CVPR 17/04/11 EAST: An Efficient and Accurate Scene Text Detector 0.8072
(L)0.8038
TF(M)
TF
PYTORCH(M)
PYTORCH
DEMO
KERAS(M)
VIDEO
'17-ICIP 17/05/15 WordFence: Text Detection in Natural Images with Border Awareness 0.86
'17-arXiv 17/06/30 R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection 0.8773 0.8254 TF(M)
CAFFE(M)
'17-CVPR 17/07/21 Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting In The Wild 0.85 0.63
'17-arXiv 17/08/17 Deep Scene Text Detection with Connected Component Proposals 0.919
'17-ICCV 17/08/22 WordSup: Exploiting Word Annotations for Character based Text Detection 0.9064 0.7816
'17-ICCV 17/09/01 Single Shot Text Detector with Regional Attention 0.8704 0.7691 *CAFFE(M)
PYTORCH
VIDEO
'17-arXiv 17/09/11 Fused Text Segmentation Networks for Multi-oriented Scene Text Detection 0.8414
'17-ICCV 17/10/13 WeText: Scene Text Detection under Weak Supervision 0.869
(L)0.8313
'17-ICCV 17/10/22 Self-organized Text Detection with Minimal Post-processing via Border Learning 0.84 *KERAS(M)
'17-ICDAR 17/11/11 Deep Residual Text Detection Network for Scene Text 0.9117
(L)0.8925
'18-AAAI 17/11/12 Feature Enhancement Network: A Refined Scene Text Detector 0.9161
'17-arXiv 17/11/30 ArbiText: Arbitrary-Oriented Text Detection in Unconstrained Scene 0.759
'18-AAAI 18/01/04 PixelLink: Detecting Scene Text via Instance Segmentation 0.881 0.8519 *TF(M) TF
'18-CVPR 18/01/05 FOTS: Fast Oriented Text Spotting with a Unified Network 0.925 0.8984 PYTORCH
PYTORCH
VIDEO
'18-TIP 18/01/09 TextBoxes++: A Single-Shot Oriented Scene Text Detector 0.88 0.829
(L)0.8475
*CAFFE(M)
'18-CVPR 18/02/27 Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation 0.88 0.843 *PYTORCH(M)
'18-CVPR 18/03/09 An end-to-end TextSpotter with Explicit Alignment and Attention 0.9 0.87 *CAFFE(M)
'18-CVPR 18/03/14 Rotation-Sensitive Regression for Oriented Scene Text Detection 0.89 0.838 *CAFFE(M)
'18-arXiv 18/04/08 Detecting Multi-Oriented Text with Corner-based Region Proposals 0.876 0.845 *CAFFE(M)
'18-arXiv 18/04/24 An Anchor-Free Region Proposal Network for Faster R-CNN based Text Detection Approaches 0.92 0.86
'18-IJCAI 18/05/03 IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection 0.9047
'18-arXiv 18/06/07 Shape Robust Text Detection with Progressive Scale Expansion Network 0.8721 PRJ
'18-ECCV 18/07/04 TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes 0.826 PYTORCH
'18-ECCV 18/07/06 Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes 0.917 0.86
'18-ECCV 18/07/10 Accurate Scene Text Detection through Border Semantics Awareness and Bootstrapping 0.892
'19-AAAI 18/11/21 Scene Text Detection with Supervised Pyramid Context Network 0.921 0.872
'19-TIP 18/12/04 TextField: Learning A Deep Direction Field for Irregular Scene Text Detection 0.824 *CAFFE(M)
'19-CVPR 19/03/21 Towards Robust Curve Text Detection with Conditional Spatial Expansion
'19-CVPR 19/03/28 Shape Robust Text Detection with Progressive Scale Expansion Network 0.857 TF(M)
'19-CVPR 19/04/03 Character Region Awareness for Text Detection 0.952 0.869 *PYTORCH(M)
VIDEO
PYTORCH
TF(M)
KERAS
BLOG_CH
BLOG_KR
BLOG_KR
BLOG_KR
'19-CVPR 19/04/13 Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes 0.877
'19-CVPR 19/06/16 Learning Shape-Aware Embedding for Scene Text Detection 0.877
'19-CVPR 19/06/16 Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation 0.917 0.876
'19-ICCV 19/08/16 Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network 0.829
'19-ICCV 19/09/02 Geometry Normalization Networks for Accurate Scene Text Detection 0.8852
'20-AAAI 19/11/20 Real-time Scene Text Detection with Differentiable Binarization 0.847

Text Recognition

  • Papers are sorted by published date.
  • IC is short for ICDAR.
  • Score is word-accuracy for recognition task.
    • For results on the IC03, IC13, and IC15 datasets, papers evaluated on different numbers of samples,
      but we do not distinguish between them.
  • *CODE means official code and CODE(M) means that trained model is provided.
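For readers new to the metric, word accuracy can be sketched as the fraction of cropped word images whose predicted transcription exactly equals the ground truth. The sketch below is illustrative only: the function name is hypothetical, and case-insensitive comparison by default is an assumption, since papers differ in how they normalize case and symbols.

```python
from typing import Sequence


def word_accuracy(
    preds: Sequence[str],
    gts: Sequence[str],
    case_sensitive: bool = False,  # assumption: many papers ignore case
) -> float:
    """Fraction of words whose whole transcription matches the ground truth."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not gts:
        return 0.0

    def norm(s: str) -> str:
        return s if case_sensitive else s.lower()

    # A word counts as correct only if every character matches.
    correct = sum(norm(p) == norm(g) for p, g in zip(preds, gts))
    return correct / len(gts)
```

For example, predicting "Hello", "w0rld", "text" against "hello", "world", "text" gives two correct words out of three under case-insensitive comparison.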
Conf. Date Title SVT IIIT5k IC03 IC13 Resources
'15-ICLR 14/12/18 Deep structured output learning for unconstrained text recognition 0.717 0.896 0.818 TF
SLIDE
VIDEO
'16-IJCV 15/05/07 Reading text in the wild with convolutional neural networks 0.807 0.933 0.908 KERAS
'16-AAAI 15/06/14 Reading Scene Text in Deep Convolutional Sequences
'17-TPAMI 15/07/21 An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition 0.808 0.782 0.894 0.867 TORCH(M)
TF
TF
TF
TF
PYTORCH
PYTORCH(M)
BLOG(KR)
'16-CVPR 16/03/09 Recursive Recurrent Nets with Attention Modeling for OCR in the Wild 0.807 0.784 0.887 0.9
'16-CVPR 16/03/12 Robust scene text recognition with automatic rectification 0.819 0.819 0.901 0.886 PYTORCH
PYTORCH
'16-CVPR 16/06/27 CNN-N-Gram for Handwriting Word Recognition 0.8362 VIDEO
'16-BMVC 16/09/19 STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition 0.836 0.833 0.899 0.891
'17-arXiv 17/07/27 STN-OCR: A single Neural Network for Text Detection and Text Recognition 0.798 0.86 0.903 *MXNET(M)
PRJ
BLOG
'17-IJCAI 17/08/19 Learning to Read Irregular Text with Attention Mechanisms
'17-arXiv 17/09/06 Scene Text Recognition with Sliding Convolutional Character Models 0.765 0.816 0.845 0.852
'17-ICCV 17/09/07 Focusing Attention: Towards Accurate Text Recognition in Natural Images 0.859 0.874 0.942 0.933
'18-CVPR 17/11/12 AON: Towards Arbitrarily-Oriented Text Recognition 0.828 0.87 0.915 TF
'17-NIPS 17/12/04 Gated Recurrent Convolution Neural Network for OCR 0.815 0.808 0.978 *TORCH(M)
'18-AAAI 18/01/04 Char-Net: A Character-Aware Neural Network for Distorted Scene Text Recognition 0.844 0.836 0.915 0.908
'18-AAAI 18/01/04 SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network 0.87 0.931 0.929
'18-CVPR 18/05/09 Edit Probability for Scene Text Recognition 0.875 0.883 0.946 0.944
'18-TPAMI 18/06/25 ASTER: An Attentional Scene Text Recognizer with Flexible Rectification 0.936 0.934 0.945 0.918 *TF(M)
PYTORCH
'18-ECCV 18/09/08 Synthetically Supervised Feature Learning for Scene Text Recognition 0.871 0.894 0.947 0.94
'19-AAAI 18/09/18 Scene Text Recognition from Two-Dimensional Perspective 0.821 0.92 0.914
'19-AAAI 18/11/02 Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition 0.845 0.915 0.91 *TORCH(M)
'19-CVPR 18/12/14 ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification 0.902 0.933 0.913 PRJ
'19-PR 19/01/10 MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition 0.883 0.912 0.950 0.924 *PYTORCH(M)
'19-ICCV 19/04/03 What is wrong with scene text recognition model comparisons? dataset and model analysis 0.875 0.949 0.936 *PYTORCH(M)
BLOG_KR
'19-CVPR 19/04/18 Aggregation Cross-Entropy for Sequence Recognition 0.826 0.823 0.921 0.897 *PYTORCH
'19-CVPR 19/06/16 Sequence-to-Sequence Domain Adaptation Network for Robust Text Image Recognition 0.845 0.838 0.921 0.918
'19-ICCV 19/08/06 Symmetry-constrained Rectification Network for Scene Text Recognition 0.889 0.944 0.95 0.939
'20-AAAI 19/12/21 Decoupled Attention Network for Text Recognition 0.892 0.943 0.95 0.939 *PYTORCH(M)
'20-AAAI 19/12/28 TextScanner: Reading Characters in Order for Robust Scene Text Recognition 0.895 0.926 0.925
'20-AAAI 20/02/04 GTC: Guided Training of CTC 0.929 0.955 0.952 0.943

End-to-End Text Recognition

  • Papers are sorted by published date.
  • IC is short for ICDAR.
  • Score is F1-score for generic task.
  • *CODE means official code and CODE(M) means that trained model is provided.
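In the end-to-end setting, a prediction is typically counted as correct only when both its location and its transcription are right. The sketch below illustrates that idea under stated assumptions: axis-aligned boxes, an IoU threshold of 0.5, case-insensitive text comparison, and first-fit matching. All names are hypothetical, and the official ICDAR protocols (including their lexicon settings) differ in detail.

```python
from typing import List, Tuple

# Assumed formats for this sketch.
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Word = Tuple[Box, str]                   # (box, transcription)


def _iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def end_to_end_f1(preds: List[Word], gts: List[Word], thresh: float = 0.5) -> float:
    """F1 where a true positive needs a box match AND a text match."""
    remaining = list(gts)
    tp = 0
    for box, text in preds:
        for gt in remaining:
            gt_box, gt_text = gt
            if _iou(box, gt_box) >= thresh and text.lower() == gt_text.lower():
                remaining.remove(gt)  # each ground-truth word is used once
                tp += 1
                break
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A perfectly localized box with a wrong transcription contributes nothing here, which is why end-to-end scores are generally lower than detection scores for the same method.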
Conf. Date Title IC03 IC13 IC15 Resources
'12-ICPR 12/11/11 End-to-end text recognition with convolutional neural networks 0.67 *CODE
'14-ECCV 14/09/06 Deep Features for Text Spotting 0.75 PRJ
MATLAB
'15-IJCV 15/05/07 Reading Text in the Wild with Convolutional Neural Networks 0.70 0.77 KERAS
'15-TPAMI 15/10/30 Real-time Lexicon-free Scene Text Localization and Recognition 0.542 0.156
'16-arXiv 16/04/10 TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild 0.6843 0.4718
(L)0.533
*CAFFE(M)
'17-AAAI 16/11/21 TextBoxes: A fast text detector with a single deep neural network 0.84 TF
*CAFFE(M)
BLOG_KR
'17-ICCV 17/07/13 Towards End-to-end Text Spotting with Convolution Recurrent Neural Network 0.8459 VIDEO
'17-ICCV 17/10/22 Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework 0.77 0.47 VIDEO
*CAFFE(M)
'18-CVPR 18/01/05 FOTS: Fast Oriented Text Spotting with a Unified Network 0.8477 0.6533 VIDEO
TF(M)
'18-TIP 18/01/09 TextBoxes++: A Single-Shot Oriented Scene Text Detector 0.8465 0.519 *CAFFE(M)
'18-CVPR 18/03/09 An end-to-end TextSpotter with Explicit Alignment and Attention 0.86 0.63 *CAFFE(M)
'18-TPAMI 18/06/25 ASTER: An Attentional Scene Text Recognizer with Flexible Rectification 0.64 *TF(M)
'18-ECCV 18/07/06 Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes 0.865 0.624
'19-ICCV 19/08/24 Towards Unconstrained End-to-End Text Spotting 0.6994 BLOG_KR
'19-ICCV 19/10/17 Convolutional Character Networks 0.7108 *PYTORCH(M)
'19-ICCV 19/10/27 TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting 0.6537
'20-AAAI 19/11/21 All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting 0.841 0.641
'20-AAAI 20/02/12 Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting 0.858 0.651

Others

  • Papers are sorted by published date.
  • *CODE means official code and CODE(M) means that trained model is provided.
Conf. Date Title Description Resources
'14-NIPS 14/06/09 Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition Dataset PRJ
'17-ECCV 17/02/13 End-to-End Interpretation of the French Street Name Signs Dataset Dataset (FSNS) *TF(M)
'17-arXiv 17/04/11 Attention-based Extraction of Structured Information from Street View Imagery FSNS *TF(M)
TF
TF
LUA
BLOG_KR
'17-CVPR 17/07/21 Unambiguous Text Localization and Retrieval for Cluttered Scenes Text Retrieval
'17-AAAI 17/10/22 Detection and Recognition of Text Embedded in Online Images via Neural Context Models Dataset PRJ
'18-CVPR 17/11/17 Separating Style and Content for Generalized Style Transfer Font Style
'17-arXiv 17/12/06 Detecting Curve Text in the Wild: New Dataset and New Solution Dataset (CTW 1500) PRJ
'18-AAAI 17/12/14 SEE: Towards Semi-Supervised End-to-End Scene Text Recognition FSNS PRJ
*CHAINER(M)
'17-CVPR 18/06/07 Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks Document Layout PRJ
'18-CVPR 18/06/19 DocUNet: Document Image Unwarping via A Stacked U-Net Document Dewarping PRJ
'18-CVPR 18/06/19 Document Enhancement using Visibility Detection Document Enhancement PRJ
'18-IJCAI 18/06/22 Multi-Task Handwritten Document Layout Analysis Document Layout
'18-ECCV 18/07/09 Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes Dataset PRJ
'19-AAAI 18/12/03 EnsNet: Ensconce Text in the Wild Text Removal DB
'19-CVPR 18/12/14 Spatial Fusion GAN for Image Synthesis Dataset DB
'19-AAAI 19/01/27 Hierarchical Encoder with Auxiliary Supervision for Table-to-text Generation: Learning Better Representation for Tables TableToText
'19-AAAI 19/01/27 A Radical-aware Attention-based Model for Chinese Text Classification Chinese Character Classification
'19-CVPR 19/02/25 Handwriting Recognition in Low-resource Scripts using Adversarial Learning Handwriting Recognition TF
'19-CVPR 19/03/27 Tightness-aware Evaluation Protocol for Scene Text Detection Evaluation CODE
'19-ICCV 19/05/31 Scene Text Visual Question Answering Dataset ICDAR_DB
'19-CVPR 19/06/16 DynTypo: Example-based Dynamic Text Effects Transfer Text Effects PRJ
VIDEO
'19-CVPR 19/06/16 Typography with Decor: Intelligent Text Style Transfer Text Effects *PYTORCH(M)
'19-CVPR 19/06/16 An Alternative Deep Feature Approach to Line Level Keyword Spotting Keyword Spotting
'19-ICCV 19/07/23 GA-DAN: Geometry-Aware Domain Adaptation Network for Scene Text Detection and Recognition Domain Adaptation
'19-ICCV 19/09/17 Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning Dataset ICDAR_DB
'19-ICCV 19/10/02 Large-scale Tag-based Font Retrieval with Generative Feature Learning Font Retrieval
'19-ICCV 19/10/27 TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts Place Recognition DB
'19-ICCV 19/10/27 DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks Document Dewarping *PYTORCH(M)

Other lists

Tutorial Materials

Acknowledgment

  • This work is done by the OCR team at Clova AI, powered by NAVER-LINE. NAVER-LINE is a leading Asian internet company that develops Clova, a cloud-based AI assistant platform.
  • This repository is scheduled to be updated regularly in accordance with schedules of major AI conferences.