Tightness-aware Evaluation Protocol for Scene Text Detection

Yuliang Liu

Last update: Nov 18, 2022

Overview

TIoU-metric

Released on 27/03/2019. This repository is built on the ICDAR 2015 evaluation code.

If you propose a better metric and require further evaluation, we can provide all the detection results used in this paper. To request them, send an email to [email protected] and copy [email protected].
[Python 3 version] by PkuDavidGuan.

State-of-the-art Results on Total-Text and CTW1500 (TIoU)

We sincerely thank the authors of recent and previous state-of-the-art methods for providing their results so that the TIoU metric could be evaluated on curved text benchmarks. The results are listed below:

Total-Text

Methods on Total-Text	TIoU-Recall (%)	TIoU-Precision (%)	TIoU-Hmean (%)	Publication
LSN+CC [paper]	48.4	59.8	53.5	arXiv 1903
Polygon-FRCNN-3 [paper]	47.9	61.9	54.0	IJDAR 2019
CTD+TLOC [paper][code]	50.8	62.0	55.8	arXiv 1712
ATRR [paper]	53.7	63.5	58.2	CVPR 2019
PSENet [paper][code]	53.3	66.9	59.3	CVPR 2019
CRAFT [paper]	54.1	65.5	59.3	CVPR 2019
TextField [paper]	58.0	63.0	60.4	TIP 2019
Mask TextSpotter [paper]	54.5	68.0	60.5	ECCV 2018
SPCNet [paper][code]	61.8	69.4	65.4	AAAI 2019

CTW1500

Methods on CTW1500	TIoU-Recall (%)	TIoU-Precision (%)	TIoU-Hmean (%)	Publication
CTD+TLOC [paper][code]	42.5	53.9	47.5	arXiv 1712
ATRR [paper]	54.9	61.6	58.0	CVPR 2019
LSN+CC [paper]	55.9	64.8	60.0	arXiv 1903
PSENet [paper][code]	54.9	67.6	60.6	CVPR 2019
CRAFT [paper]	56.4	66.3	61.0	CVPR 2019
MSR [paper]	56.3	67.3	61.3	arXiv 1901
TextField [paper]	57.2	66.2	61.4	TIP 2019
TextMountain [paper]	60.7	68.1	64.2	arXiv 1811
PAN Mask R-CNN [paper]	61.0	70.0	65.2	WACV 2019
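
The Hmean column in both tables is consistent with the harmonic mean of the recall and precision columns. The short check below recomputes it for a few rows copied from the tables; the function name `hmean` is ours and is not taken from the repository.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# Rows copied from the tables above: (recall, precision, reported Hmean).
rows = {
    "LSN+CC (Total-Text)": (48.4, 59.8, 53.5),
    "SPCNet (Total-Text)": (61.8, 69.4, 65.4),
    "PAN Mask R-CNN (CTW1500)": (61.0, 70.0, 65.2),
}

for name, (r, p, reported) in rows.items():
    print(f"{name}: computed {hmean(r, p):.1f}, reported {reported}")
```

Because the table values are rounded to one decimal place, a recomputed Hmean may differ from the reported one in the last digit.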

Description

Evaluation protocols play a key role in the development of text detection methods. An evaluation method must be fair, objective, and reasonable. However, existing metrics exhibit some obvious drawbacks:

*Unreasonable cases obtained using recent evaluation metrics. (a), (b), (c), and (d) all have the same IoU of 0.66 against the GT. Red: GT. Blue: detection.

As shown in (a), previous metrics consider that the GT has been entirely recalled.
As shown in (b), (c), and (d), previous metrics assign 100% precision to a detection even if it contains background noise.
Previous metrics consider detections (a), (b), (c), and (d) to be equally perfect detections.
Previous metrics rely heavily on an IoU threshold. A high IoU threshold may discard some satisfactory bounding boxes, while a low IoU threshold may accept several inexact bounding boxes.
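
The drawbacks listed above can be reproduced with two axis-aligned boxes. In the sketch below, the box coordinates are made up for illustration (they are not the boxes from the figure): one detection cuts off a third of the GT, the other covers the GT plus background, and both get the same plain IoU of about 0.66.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(b: Box) -> float:
    """Area of an axis-aligned box; zero if the box is empty."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two axis-aligned boxes."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return area((x0, y0, x1, y1))


def iou(a: Box, b: Box) -> float:
    """Plain intersection-over-union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


gt = (0.0, 0.0, 3.0, 1.0)     # hypothetical ground-truth box
cut = (0.0, 0.0, 2.0, 1.0)    # misses the last third of the GT
loose = (0.0, 0.0, 4.5, 1.0)  # covers the GT plus background

print(f"IoU(cut)   = {iou(gt, cut):.3f}")
print(f"IoU(loose) = {iou(gt, loose):.3f}")
```

A threshold on IoU alone cannot tell these two detections apart, even though one loses text and the other includes background.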

To address these issues in previous evaluation metrics, we propose an improved evaluation protocol called the Tightness-aware Intersection-over-Union (TIoU) metric, which can quantify:

Completeness of ground truth
Compactness of detection
Tightness of matching degree
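
To make the three quantities concrete, here is a minimal sketch for a single GT box and a single detection. It is a simplification for illustration, not the formulation from the paper: it assumes axis-aligned boxes, ignores other GT regions, and simply scales IoU by the covered fraction of the GT (completeness) or by the text fraction of the detection (compactness). All names are hypothetical; the paper's definitions of Ct and Ot remain authoritative.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def overlap(a: Box, b: Box) -> float:
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def tightness_scores(gt: Box, det: Box) -> Dict[str, float]:
    """Simplified, illustrative scores (an assumption, not the paper's equations)."""
    inter = overlap(gt, det)
    union = area(gt) + area(det) - inter
    iou = inter / union if union > 0 else 0.0
    # Completeness: fraction of the GT that the detection covers.
    completeness = inter / area(gt) if area(gt) > 0 else 0.0
    # Compactness: fraction of the detection that lies on the GT.
    compactness = inter / area(det) if area(det) > 0 else 0.0
    return {
        "iou": iou,
        "completeness": completeness,
        "compactness": compactness,
        "penalized_recall": iou * completeness,
        "penalized_precision": iou * compactness,
    }


gt = (0.0, 0.0, 3.0, 1.0)
for name, det in {"cut": (0.0, 0.0, 2.0, 1.0), "loose": (0.0, 0.0, 4.5, 1.0)}.items():
    scores = tightness_scores(gt, det)
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Under this simplification the two detections share the same IoU but are penalized on different sides: the cut detection loses recall, the loose detection loses precision.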

We hope this work draws attention to text detection evaluation metrics and serves as a modest spur to more valuable contributions. More details can be found in our paper.

Clone the TIoU repository

Clone the TIoU-metric repository

git clone https://github.com/Yuliang-Liu/TIoU-metric.git --recursive

Getting Started

Install required module

pip install Polygon2

Then run

python script.py -g=gt.zip -s=pixellinkch4.zip

After that you can see the evaluation results.

You can simply replace pixellinkch4.zip with your own detection results; make sure your detection files follow the same format as ICDAR 2015.
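
As a rough sketch of how a results archive might be produced, the helper below writes one text file per image into a ZIP. It assumes the usual ICDAR 2015 Challenge 4 convention of files named `res_img_<id>.txt`, each line holding eight comma-separated integers `x1,y1,x2,y2,x3,y3,x4,y4`. This convention is an assumption here, so compare against the contents of pixellinkch4.zip before relying on it. `pack_detections` is a hypothetical helper and is not part of the repository.

```python
import io
import zipfile
from typing import Dict, List, Sequence


def pack_detections(dets: Dict[int, List[Sequence[int]]], out) -> None:
    """Write detections to a ZIP, one text file per image.

    dets maps an image id to a list of quadrilaterals, each given as
    eight numbers (x1, y1, ..., x4, y4). `out` is a path or a binary
    file-like object.
    """
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for image_id, quads in sorted(dets.items()):
            lines = []
            for quad in quads:
                if len(quad) != 8:
                    raise ValueError(
                        f"image {image_id}: expected 8 values, got {len(quad)}"
                    )
                lines.append(",".join(str(int(v)) for v in quad))
            # Assumed naming convention: res_img_<id>.txt
            zf.writestr(f"res_img_{image_id}.txt", "\n".join(lines))


# Demo with an in-memory buffer; pass a path such as "my_results.zip" instead
# to create a file that can be given to script.py via -s.
buffer = io.BytesIO()
pack_detections({1: [[10, 10, 50, 10, 50, 30, 10, 30]]}, buffer)
with zipfile.ZipFile(buffer) as zf:
    print(zf.namelist())
```

File names that do not match what the evaluation script expects are a likely cause of "ZIP entry not valid" errors, so checking the names inside the archive first is a cheap sanity test.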

Joint Word&Text-Line Evaluation

To test your detection with our joint Word&Text-Line solution, simply

cd Word_Text-Line

Then run the code

python script.py -g=gt.zip -gl=gt_textline.zip -s=pixellinkch4.zip

Support Curved Text Evaluation

Curved text requires polygonal input with a variable number of points. To evaluate your results on the recent curved text benchmarks Total-Text or SCUT-CTW1500, refer to curved-tiou/readme.md.
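
The sketch below shows one way to read a polygon with a variable number of points and to sanity-check it before evaluation. The line format `x1,y1,x2,y2,...,xn,yn` is an assumption made for illustration; the format actually expected by the curved-text scripts is described in curved-tiou/readme.md. Both function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_polygon(line: str, min_points: int = 3) -> List[Point]:
    """Parse 'x1,y1,...,xn,yn' (assumed format) into a list of (x, y) points."""
    values = [float(v) for v in line.strip().split(",") if v.strip()]
    if len(values) % 2 != 0:
        raise ValueError("odd number of coordinates")
    points = list(zip(values[0::2], values[1::2]))
    if len(points) < min_points:
        raise ValueError(f"need at least {min_points} points, got {len(points)}")
    return points


def polygon_area(points: List[Point]) -> float:
    """Unsigned polygon area via the shoelace formula."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1] - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


quad = parse_polygon("0,0,4,0,4,2,0,2")          # 4 points
hexa = parse_polygon("0,0,2,0,4,0,4,2,2,2,0,2")  # 6 points, same outline
print(len(quad), polygon_area(quad))
print(len(hexa), polygon_area(hexa))
```

The same parser handles both point counts, and a zero or unexpectedly small area is a quick hint that the points are degenerate or out of order.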

Example Results

Qualitative results:

*Qualitative visualization of TIoU metric. Blue: Detection. Bold red: Target GT region. Light red: Other GT regions. Rec.: Recognition results by CRNN [24]. NED: Normalized edit distance. Previous metrics evaluate all detection results and target GTs as 100% precision and recall, respectively, while in TIoU metric, all matching pairs are penalized by different degrees. Ct is defined in Eq. 10. Ot is defined in Eq. 13. Please refer to our paper for all the references.

ICDAR 2013 results:

*Comparison of evaluation methods on ICDAR 2013 for general detection frameworks and previous state-of-the-art methods. det: DetEval. i: IoU. e1: End-to-end recognition results by using CRNN [24]. e2: End-to-end recognition results by using RARE [25]. t: TIoU.

Line chart:

*(a) X-axis represents the detection methods listed in the Table above, and Y-axis represents the values of the F-measures.

ICDAR 2015 results:

*Comparison of metrics on the ICDAR 2015 challenge 4. Word&Text-Line Annotations use our new solution to address OM and MO issues. i: IoU. s: SIoU. t: TIoU.

Citation

If you find our metric useful for your research, please cite

@article{liu2019tightness,
  title={Tightness-aware Evaluation Protocol for Scene Text Detection},
  author={Liu, Yuliang and Jin, Lianwen and Xie, Zecheng and Luo, Canjie and Zhang, Shuaitao and Xie, Lele},
  journal={CVPR},
  year={2019}
}

References

If you are interested in developing better scene text detection metrics, the references recommended here might be useful.

[1] Wolf, Christian, and Jean-Michel Jolion. "Object count/area graphs for the evaluation of object detection and segmentation algorithms." International Journal of Document Analysis and Recognition (IJDAR) 8.4 (2006): 280-296.

[2] Calarasanu, Stefania, Jonathan Fabrizio, and Severine Dubuisson. "What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions." Image and Vision Computing 46 (2016): 1-17.

[3] Dangla, Aliona, et al. "A first step toward a fair comparison of evaluation protocols for text detection algorithms." 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 2018.

[4] Shi, Baoguang, et al. "ICDAR2017 competition on reading chinese text in the wild (RCTW-17)." 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Vol. 1. IEEE, 2017.

Feedback

Suggestions and opinions on this metric (both positive and negative) are very welcome. Please contact the authors by sending an email to [email protected] or [email protected].

Comments

ZIP entry not valid: on total text

Got a problem when using curved-tiou: python script.py -g=total-text-gt.zip -s=total-text_baseline.zip

Error! ZIP entry not valid: img1.txt

Please indicate how to solve this. Thanks

opened by Redaimao 5

How to evaluate on icdar13?

Thanks for your great work! I'm confused about how to evaluate on ICDAR 2013. I just replaced the gt.zip for ICDAR 2015 with the one for ICDAR 2013 and set LTRB=True. Is that right? Should I change script.py or rrc_evaluation_funcs.py?

opened by Ocelot7777 3

Does the start point affect the evaluation result?

I order the coordinates clockwise, but the start point may vary in both my detection results and the ground truth labels. Will this affect the evaluation result?

opened by HuiHuangEmi 2

Are four-point polygons supported in curved-tiou?

I used gt.zip and pixellinkch4.zip in curved-tiou, but I got an error. How can I fix it?

max@max:~/my_project/tiou-metric/curved-tiou$ python script.py -g=gt.zip -s=pixellinkch4.zip
Error! 'NoneType' object has no attribute '__getitem__'

opened by leaderkent 1

Line in sample not valid. Sample: 1269 Line

When I run python script.py -g=... -s=... , I get two errors, as follows:

Error: ('polygon has intersection sides',

Line in sample not valid. Sample: 1269 Line:

I don't know how to solve it.

opened by Shualite 0

