Total-Text Dataset. It consists of 1,555 images with three different text orientations: Horizontal, Multi-Oriented, and Curved, making it one of a kind.

Overview

Total-Text-Dataset (Official site)

Updated on April 29, 2020 (Detection leaderboard updated; E2E methods highlighted. Thank you shine-lcy.)

Updated on March 19, 2020 (Query on the new groundtruth of test set)

Updated on Sept. 08, 2019 (New training groundtruth of Total-Text is now available)

Updated on Sept. 07, 2019 (Updated Guided Annotation toolbox for scene text image annotation)

Updated on Sept. 07, 2019 (Updated baseline according to our IJDAR paper)

Updated on August 01, 2019 (Extended version with new baseline + annotation tool is accepted at IJDAR)

Updated on May 30, 2019 (Important announcement on Total-Text vs. ArT dataset)

Updated on April 02, 2019 (Updated table ranking with default vs. our proposed DetEval)

Updated on March 31, 2019 (Faster version DetEval.py, support Python3. Thank you princewang1994.)

Updated on March 14, 2019 (Updated table ranking with evaluation protocol info.)

Updated on November 26, 2018 (Table ranking is included for reference.)

Updated on August 24, 2018 (Newly added Guided Annotation toolbox folder.)

Updated on May 15, 2018 (Added groundtruth in '.txt' format.)

Updated on May 14, 2018 (Added feature - 'Do not care' candidates filtering is now available in the latest python scripts.)

Updated on April 03, 2018 (Added pixel level groundtruth)

Updated on November 04, 2017 (Added text level groundtruth)

Released on October 27, 2017

News

  • We received some questions regarding the new groundtruth for the test set of Total-Text. To clarify: we do not release a new version of the test-set groundtruth because

     1) there is no need to standardise the number of groundtruth vertices for testing purposes; it was introduced only to facilitate training, and
     2) a new version of the groundtruth would invalidate the previous benchmarks.
    

    Do contact us if you think there is a valid reason to require new groundtruth for the test set, and we shall discuss it.
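The training-time vertex standardisation mentioned in 1) can be sketched as evenly resampling each annotation polygon along its perimeter; `resample_polygon` below is an illustrative helper, not part of the official scripts, and the vertex count of 14 is only an example:

```python
import numpy as np

def resample_polygon(points, n=14):
    """Resample a closed polygon to n vertices spaced evenly along its perimeter."""
    pts = np.asarray(points, dtype=float)
    ring = np.vstack([pts, pts[:1]])                     # close the ring
    seg = np.linalg.norm(np.diff(ring, axis=0), axis=1)  # edge lengths
    dist = np.concatenate([[0.0], np.cumsum(seg)])       # cumulative perimeter
    targets = np.linspace(0.0, dist[-1], n, endpoint=False)
    x = np.interp(targets, dist, ring[:, 0])
    y = np.interp(targets, dist, ring[:, 1])
    return np.stack([x, y], axis=1)

# A 4-vertex square resampled to 8 evenly spaced vertices:
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(resample_polygon(square, n=8))
```

Because every polygon ends up with the same number of vertices, batched regression targets become fixed-size, which is why this helps training but is unnecessary for evaluation.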

  • TOTAL-TEXT is a word-level English curved-text dataset. If you are interested in a text-line-based dataset with both English and Chinese instances, we highly recommend SCUT-CTW1500. In addition, the Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT), which extends Total-Text and SCUT-CTW1500, was held at ICDAR2019 to stimulate more innovative ideas on the arbitrary-shaped text reading task. Congratulations to all winners and challengers. The technical report of ArT can be found at this https URL.

Important Announcement

Total-Text and SCUT-CTW1500 are now part of the training set of the largest curved-text dataset, ArT (Arbitrary-Shaped Text dataset). To retain the validity of future benchmarking on Total-Text, anyone who intends to leverage the extra training data from ArT should remove the Total-Text test-set images (the corresponding IDs are provided HERE) from the ArT dataset. We count on the trust of the research community to perform this removal and preserve the fairness of the benchmark.
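The removal described above is a simple set difference over image IDs; a minimal sketch assuming plain string IDs (the sample IDs below are hypothetical, and the actual ID list in the repository may use a different format):

```python
def remove_totaltext_test(art_ids, totaltext_test_ids):
    """Return the ArT training pool with Total-Text test images removed."""
    excluded = set(totaltext_test_ids)
    return [img_id for img_id in art_ids if img_id not in excluded]

# Hypothetical example: one Total-Text test image hiding in the ArT pool.
art_pool = ['art_0001', 'art_0002', 'tt_img10', 'art_0003']
print(remove_totaltext_test(art_pool, ['tt_img10']))
# ['art_0001', 'art_0002', 'art_0003']
```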

Table Ranking

  • The results from recent papers on Total-Text dataset are listed below where P=Precision, R=Recall & F=F-score.
  • If your result is missing or incorrect, please do not hesitate to contact us.
  • The baseline scores are based on our proposed [Poly-FRCNN-3] in this folder.
  • *Pascal VOC IoU metric; **Polygon Regression

Detection Leaderboard

(P/R/F "paper": as reported in the paper; "default": DetEval with tp=0.4, tr=0.8; "new": our proposed DetEval with tp=0.6, tr=0.7.)

| Method | P (paper) | R (paper) | F (paper) | P (default) | R (default) | F (default) | P (new) | R (new) | F (new) | Published at |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|:---|
| Our Baseline [paper] | 78.0 | 68.0 | 73.0 | - | - | - | 78.0 | 68.0 | 73.0 | IJDAR2020 |
| CRAFTS [paper] | 89.5 | 85.4 | 87.4 | - | - | - | - | - | - | ECCV2020 |
| #ASTS_Weakly-ResNet101 (E2E) [paper] | - | - | 87.3 | - | - | - | - | - | - | TIP2020 |
| TextFuseNet [paper] | 89.0 | 85.3 | 87.1 | - | - | - | - | - | - | IJCAI2020 |
| #Boundary (E2E) [paper] | 88.9 | 85.0 | 87.0 | - | - | - | - | - | - | AAAI2020 |
| PolyPRNet [paper] | 88.1 | 85.3 | 86.7 | - | - | - | - | - | - | ACCV2020 |
| #Qin et al. (E2E) [paper] | 87.8 | 85.0 | 86.4 | - | - | - | - | - | - | ICCV2019 |
| 100%Poly [paper] | 88.2 | 83.3 | 85.6 | - | - | - | - | - | - | arXiv:2012 |
| ContourNet [paper] | 86.9 | 83.9 | 85.4 | - | - | - | - | - | - | CVPR2020 |
| #Text Perceptron (E2E) [paper] | 88.8 | 81.8 | 85.2 | - | - | - | - | - | - | AAAI2020 |
| PAN-640 [paper] | 89.3 | 81.0 | 85.0 | - | - | - | - | - | - | ICCV2019 |
| DB-ResNet50 (800) [paper] | 87.1 | 82.5 | 84.7 | - | - | - | - | - | - | AAAI2020 |
| TextCohesion [paper] | 88.1 | 81.4 | 84.6 | - | - | - | - | - | - | arXiv:1904 |
| Feng et al. [paper] | 87.3 | 81.1 | 84.1 | - | - | - | - | - | - | IJCV2020 |
| ReLaText [paper] | 84.8 | 83.1 | 84.0 | - | - | - | - | - | - | arXiv:2003 |
| CRAFT [paper] | 87.6 | 79.9 | 83.6 | - | - | - | - | - | - | CVPR2019 |
| LOMO MS [paper] | 87.6 | 79.3 | 83.3 | - | - | - | - | - | - | CVPR2019 |
| SPCNet [paper] | 83.0 | 82.8 | 82.9 | - | - | - | - | - | - | AAAI2019 |
| #ABCNet (E2E) [paper] | 85.4 | 80.1 | 82.7 | - | - | - | - | - | - | CVPR2020 |
| ICG [paper] | 82.1 | 80.9 | 81.5 | - | - | - | - | - | - | PR2019 |
| FTSN [paper] | *84.7 | *78.0 | *81.3 | - | - | - | - | - | - | ICPR2018 |
| PSENet-1s [paper] | 84.02 | 77.96 | 80.87 | - | - | - | - | - | - | CVPR2019 |
| 1TextField [paper] | 81.2 | 79.9 | 80.6 | 76.1 | 75.1 | 75.6 | 83.0 | 82.0 | 82.5 | TIP2019 |
| #TextDragon (E2E) [paper] | 85.6 | 75.7 | 80.3 | - | - | - | - | - | - | ICCV2019 |
| CSE [paper] | 81.4 (**80.9) | 79.7 (**80.3) | 80.2 (**80.6) | - | - | - | - | - | - | CVPR2019 |
| MSR [paper] | 85.2 | 73.0 | 78.6 | 82.7 | 68.3 | 74.9 | 81.4 | 72.5 | 76.7 | arXiv:1901 |
| ATTR [paper] | 80.9 | 76.2 | 78.5 | - | - | - | - | - | - | CVPR2019 |
| TextSnake [paper] | 82.7 | 74.5 | 78.4 | - | - | - | - | - | - | ECCV2018 |
| 1CTD [paper] | 74.0 | 71.0 | 73.0 | 60.7 | 58.8 | 59.8 | 76.5 | 73.8 | 75.2 | PR2019 |
| #TextNet (E2E) [paper] | 68.2 | 59.5 | 63.5 | - | - | - | - | - | - | ACCV2018 |
| #,2Mask TextSpotter (E2E) [paper] | 69.0 | 55.0 | 61.3 | 68.9 | 62.5 | 65.5 | 82.5 | 75.2 | 78.6 | ECCV2018 |
| CENet [paper] | 59.9 | 54.4 | 57.0 | - | - | - | - | - | - | ACCV2018 |
| #Textboxes (E2E) [paper] | 62.1 | 45.5 | 52.5 | - | - | - | - | - | - | AAAI2017 |
| EAST [paper] | 50.0 | 36.2 | 42.0 | - | - | - | - | - | - | CVPR2017 |
| SegLink [paper] | 30.3 | 23.8 | 26.7 | - | - | - | - | - | - | CVPR2017 |

Note:

# Framework that does end-to-end training (i.e. detection + recognition).

1For TextField and CTD, the results are from the improved versions of the original papers, which explains the better performance.

2For Mask-TextSpotter, the relatively poor performance reported in their paper was due to a bug in the input reading module (which was fixed recently). The authors were informed about this issue.

End-to-end Recognition Leaderboard
(None refers to recognition without any lexicon; the Full lexicon contains all the words in the test set.)

| Method | Backbone | None (%) | Full (%) | FPS | Published at |
|:---|:---|---:|---:|---:|:---|
| CRAFTS [paper] | ResNet50-FPN | 78.7 | - | - | ECCV2020 |
| MANGO [paper] | ResNet50-FPN | 72.9 | 83.6 | 4.3 | AAAI2021 |
| Text Perceptron [paper] | ResNet50-FPN | 69.7 | 78.3 | - | AAAI2020 |
| ABCNet-MS [paper] | ResNet50-FPN | 69.5 | 78.4 | 6.9 | CVPR2020 |
| CharNet H-88 MS [paper] | ResNet50-Hourglass57 | 69.2 | - | 1.2 | ICCV2019 |
| Qin et al. [paper] | ResNet50-MSF | 67.8 | - | - | ICCV2019 |
| ASTS_Weakly [paper] | ResNet101-FPN | 65.3 | 84.2 | 2.5 | TIP2020 |
| Boundary [paper] | ResNet50-FPN | 65.0 | 76.1 | - | AAAI2020 |
| ABCNet [paper] | ResNet50-FPN | 64.2 | 75.7 | 17.9 | CVPR2020 |
| CAPNet [paper] | ResNet50-FPN | 62.7 | - | - | ICASSP2020 |
| Feng et al. [paper] | VGG | 55.8 | 79.2 | - | IJCV2020 |
| TextNet [paper] | ResNet50-SAM | 54.0 | - | 2.7 | ACCV2018 |
| Mask TextSpotter [paper] | ResNet50-FPN | 52.9 | 71.8 | 4.8 | ECCV2018 |
| TextDragon [paper] | VGG16 | 48.8 | 74.8 | - | ICCV2019 |
| Textboxes [paper] | ResNet50-FPN | 36.3 | 48.9 | 1.4 | AAAI2017 |
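The None/Full distinction above comes down to whether a raw recognition output is snapped to the closest lexicon word before scoring. A minimal sketch, using Python's difflib similarity as a stand-in for the edit-distance matching such protocols typically use (the sample words below are hypothetical):

```python
import difflib

def lexicon_correct(prediction, lexicon):
    """'Full'-lexicon sketch: return the lexicon word most similar to the raw
    recognition output. The 'None' protocol scores the raw output as-is."""
    return max(
        lexicon,
        key=lambda w: difflib.SequenceMatcher(None, prediction.lower(), w.lower()).ratio(),
    )

# A single-character recognition error ('0' for 'O') is recovered by the lexicon:
words = ['PETROSAINS', 'DISCOVERY', 'CENTRE']
print(lexicon_correct('PETR0SAINS', words))  # 'PETROSAINS'
```

This is why Full scores are consistently higher than None scores: the lexicon absorbs small character-level mistakes.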

Description

To facilitate new text detection research, we introduce the Total-Text dataset (IJDAR) (ICDAR-17 paper) (presentation slides), which is more comprehensive than existing text datasets. Total-Text consists of 1,555 images with three different text orientations: Horizontal, Multi-Oriented, and Curved, making it one of a kind.

Citation

If you find this dataset useful for your research, please cite

@article{CK2019,
  author    = {Chee Kheng Ch’ng and
               Chee Seng Chan and
               Chenglin Liu},
  title     = {Total-Text: Towards Orientation Robustness in Scene Text Detection},
  journal   = {International Journal on Document Analysis and Recognition (IJDAR)},
  volume    = {23},
  pages     = {31-52},
  year      = {2020},
  doi       = {10.1007/s10032-019-00334-z},
}

Feedback

Suggestions and opinions on this dataset (both positive and negative) are greatly welcomed. Please contact the authors by sending an email to chngcheekheng at gmail.com or cs.chan at um.edu.my.

License and Copyright

The project is open source under BSD-3 license (see the LICENSE file).

For commercial usage, please contact Dr. Chee Seng Chan at cs.chan at um.edu.my

©2017-2020 Center of Image and Signal Processing, Faculty of Computer Science and Information Technology, University of Malaya.

Comments
  • bug fix in function one_to_one

    I think I found a bug in the function one_to_one. Suppose there is one prediction and two groundtruths; the sigma table is [1,1]^T and the tau table is [1,0]^T. This is a many-to-one case, but the original code treats it as one-to-one.
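For concreteness, the reported case can be written out numerically; this is an illustrative sketch of the tables the issue describes, not the repository's one_to_one() implementation:

```python
import numpy as np

# One detection (single column) against two groundtruth regions (rows):
# sigma is the overlap normalised by gt area, tau by detection area.
sigma = np.array([[1.0], [1.0]])
tau = np.array([[1.0], [0.0]])

tr, tp = 0.8, 0.4  # DetEval default thresholds
qualified = int(np.sum(sigma[:, 0] >= tr))
print(qualified)
# 2 -> the detection covers BOTH groundtruths, so the match is many-to-one;
# a check that only inspects the single cell sigma[0, 0] / tau[0, 0] would
# wrongly accept it as one-to-one, which is the bug being reported.
```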

    opened by techkang 5
  • about DetEval.py evaluation speed boosting.

    Hi, I found that the evaluation in this code runs extremely slowly; the most time-consuming operations are area/area_of_intersection/iou. These functions are based on mask counting, which depends heavily on the image size (some large images can become a bottleneck). I have replaced the mask-counting operations with polygon coordinate computation (using shapely, a Python geometry library), which greatly speeds up the evaluation process.
    Can I make a PR? Hoping for your reply, thanks.
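The speed-up described here amounts to swapping per-pixel mask counting for exact polygon geometry; a minimal sketch using shapely (illustrative, not the actual PR code):

```python
from shapely.geometry import Polygon

def area(points):
    """Exact polygon area from its coordinates (replaces pixel counting)."""
    return Polygon(points).area

def area_of_intersection(points_a, points_b):
    """Exact intersection area; cost depends on vertex count, not image size."""
    return Polygon(points_a).intersection(Polygon(points_b)).area

# Two 10x10 squares overlapping in a 5x5 region:
a = [(0, 0), (10, 0), (10, 10), (0, 10)]
b = [(5, 5), (15, 5), (15, 15), (5, 15)]
print(area(a), area_of_intersection(a, b))  # 100.0 25.0
```

Because the geometric version never rasterises the polygons, its runtime is independent of image resolution, which explains the roughly 10x speed-up reported below.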

    opened by princewang1994 4
  • Confused about the evaluation parameters

    Hi. According to the standard DetEval evaluation protocol, "tr = 0.8, tp = 0.4" (which is also your default setting in MATLAB-code-Eval.m). But you recommend "tr = 0.7 and tp = 0.6" in your Evaluation_Protocol/README.md file:

    We recommend tr = 0.7 and tp = 0.6 threshold for a fairer evaluation with polygon ground-truth and detection format.
    

    I am confused about how to set tr and tp when I want to compare my results with other methods (listed in the Table Ranking).

    Detection (based on DetEval evaluation protocol, unless stated)

    | Method | Precision (%) | Recall (%) | F-measure (%) | Published at |
    |:------:|:-------------:|:----------:|:-------------:|:------------:|
    | MSR [paper] | 85.2 | 73.0 | 78.6 | arXiv:1901.02596 |
    | FTSN [paper] | 84.7 | 78.0 | 81.3 | ICPR2018 |
    | TextSnake [paper] | 82.7 | 74.5 | 78.4 | ECCV2018 |
    | TextField [paper] | 81.2 | 79.9 | 80.6 | TIP2019 |
    | CTD [paper] | 74.0 | 71.0 | 73.0 | PR2019 |
    | Mask TextSpotter [paper] | 69.0 | 55.0 | 61.3 | ECCV2018 |
    | TextNet [paper] | 68.2 | 59.5 | 63.5 | ACCV2018 |
    | Textboxes [paper] | 62.1 | 45.5 | 52.5 | AAAI2017 |
    | EAST [paper] | 50.0 | 36.2 | 42.0 | CVPR2017 |
    | Baseline [paper] | 33.0 | 40.0 | 36.0 | ICDAR2017 |
    | SegLink [paper] | 30.3 | 23.8 | 26.7 | CVPR2017 |

    opened by lillyPJ 4
  • rules for annotation

    I have some questions about the annotation rules. For curved text, why do you assign different numbers of coordinates? What is the maximum number?

    opened by ran337287 4
  • confusing about the precision calculation in many_to_many method Deteval.py

    Hi, I'm reading the Deteval.py script and I'm confused about the precision calculation in the many_to_many() method (line 203).

    (screenshot of the code omitted)

    When calculating recall, you consider num_qualified_sigma_candidates, but you do not consider num_qualified_tau_candidates when calculating precision. Moreover, given the following condition (line 199), I think this method should really be called many_to_one instead of many_to_many.

    (screenshot of the code omitted)

    In summary, if you don't consider num_qualified_tau_candidates when calculating precision in many_to_many, and you only check whether np.sum(local_tau_table[qualified_sigma_candidates, det_id]) >= tp, then this method should really be called many_to_one, and you probably need a separate many_to_many method.

    opened by fdengmark 3
  • Misunderstanding the scripts

    Hi, could you explain some of the scripts?

    What is the difference between the T3 annotation tool scripts, Detection_Recognition_Annotation_script.m, and the Python scripts in Evaluation_Protocol?

    One script is used for generating a file with the ground truth of the text in an image (T3?), one is used for drawing polygon bounding boxes (Detection_Recognition_Annotation_script.m?), and the main script where the magic happens is Deteval.py or Pascal_VOC.py?

    opened by Didier0 2
  • Issue with gt_reading_mod with newest groundtruth datasets

    With the newest ground truth labels, should gt_reading_mod now look like this? I couldn't get the evaluation script to run with the existing version.

    def gt_reading_mod(gt_dir, gt_id):
        gt_id = gt_id.split('.')[0]
        gt = io.loadmat('%s/gt_%s.mat' % (gt_dir, gt_id))
        gt = gt['gt']
        return gt
    
    opened by mandytoh1212 2
  • faster implement evaluation, add python3 support

    Hi @ckchng, I added a faster version of DetEval.py that uses polygon computation based on shapely. I tested it with the predictions uploaded in this comment (converted from .mat to .txt format as requested; see no_expand_txt.zip).

    The speed-up is significant:

    • slow version: about 10 min for 300 test images
    • faster version: within 1 min

    Precision: I have uploaded the results for no_expand_txt.zip; see result.txt and ori_result.txt, which have almost the same precision, recall and F1 measure.

    By the way, Python 3 is supported in my version with a few lines of code added. Hope you like it!

    opened by princewang1994 2
  • Missing orientation label for some annotations

    eg: img958.jpg [array(['x:'], dtype='<U2') array([[125, 162, 191, 191, 157, 119]], dtype=int16) array(['y:'], dtype='<U2') array([[627, 631, 626, 637, 639, 636]], dtype=int16) array(['DIENST'], dtype='<U6') array([], dtype='<U1')]
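The quoted row can be unpacked as follows. This sketch infers the layout from the example above (['x:', xs, 'y:', ys, transcription, orientation]) and is not an official parser; in particular, the '#' fallback for an empty transcription is an assumption:

```python
import numpy as np

def parse_gt_row(row):
    """Unpack one groundtruth row as loaded from the .mat files.
    Layout assumed: ['x:', xs, 'y:', ys, transcription, orientation];
    the orientation cell can be empty, as in the img958.jpg example."""
    xs = np.asarray(row[1]).flatten().tolist()
    ys = np.asarray(row[3]).flatten().tolist()
    transcription = str(row[4][0]) if len(row[4]) else '#'   # '#' fallback is an assumption
    orientation = str(row[5][0]) if len(row[5]) else ''      # missing label -> empty string
    return list(zip(xs, ys)), transcription, orientation

# Reconstruction of the img958.jpg row quoted above:
row = [np.array(['x:']), np.array([[125, 162, 191, 191, 157, 119]], dtype=np.int16),
       np.array(['y:']), np.array([[627, 631, 626, 637, 639, 636]], dtype=np.int16),
       np.array(['DIENST']), np.array([], dtype='<U1')]
points, text, orient = parse_gt_row(row)
print(points[0], text, repr(orient))  # (125, 627) DIENST ''
```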

    opened by billqxg 2
  • Don't know how to evaluate by Python

    I see there is an example folder in Evaluation_Protocol, but in Python_scripts I don't know where you read them. Could you give us a simple example of how to evaluate with Python?
    Thank you very much.

    opened by xieenze 2
  • Where is the statement for the annotation format.

    Hi, thank you so much for providing such a novel dataset.

    I want to do some experiments on it, but I can't find any explanation of the annotation format or the submission format for evaluation (what each element of the gt stands for).

    Could you please help me find that?

    opened by zxDeepDiver 2
  • ModuleNotFoundError: No module named 'object_detection'

    I installed Python 3.7 and tensorflow 1.13.2.

    When running ./train.sh, the following error appears.

    @ckchng @princewang1994 @cs-chan @techkang Kindly enlighten. Thanks

    (screenshot omitted)

    opened by HassanBinHaroon 0