Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

weijiawu

Last update: Nov 9, 2022

Overview

SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training

Introduction

This is a PyTorch implementation of "SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training".

The paper proposes a novel text detection system, SelfText Beyond Polygon (SBP), with Bounding Box Supervision (BBS) and Dynamic Self Training (DST), which trains a polygon-based text detector with only a limited set of upright bounding box annotations. As shown in the figure, SBP achieves performance comparable to strong supervision while greatly reducing data annotation costs.

For more details, please refer to our arXiv paper.

Environments

python 3
torch = 1.1.0
torchvision
Pillow
numpy
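
As a quick sanity check before training, the sketch below reports which of the packages listed above can be imported. It is only an illustration: the function name `check_environment` is hypothetical and not part of this repository, it does not verify the torch version, and it relies on the fact that Pillow is imported under the module name `PIL`.

```python
import importlib.util
import sys

# Import names for the packages listed above; Pillow is imported as "PIL".
REQUIRED_MODULES = ["torch", "torchvision", "PIL", "numpy"]


def check_environment(modules=REQUIRED_MODULES):
    """Return a dict mapping each requirement to True if it is available."""
    report = {"python3": sys.version_info.major == 3}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```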

ToDo List

Dataset

Supported:

model zoo

Supported text detection:

Bounding Box Supervision (BBS)

Train

The training strategy has three steps: (1) train SASN on synthetic data; (2) use SASN to generate pseudo labels on real data from the bounding box annotations; (3) train the detectors (EAST and PSENet) on the pseudo labels.
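
To make step (2) more concrete, here is a minimal sketch of how a per-box prediction could be turned into a full-image pseudo label. It assumes that a segmentation network has already produced a text-probability map for the crop inside one upright bounding box. The function `paste_box_mask`, its arguments, and the 0.5 threshold are hypothetical and are not taken from this repository or the paper.

```python
from typing import List, Tuple

# Upright box as (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


def paste_box_mask(
    height: int,
    width: int,
    box: Box,
    crop_prob: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not from the paper
) -> List[List[int]]:
    """Build a full-image binary pseudo label from one box-level probability map.

    crop_prob[dy][dx] is the predicted text probability of the pixel at
    (x_min + dx, y_min + dy). Pixels outside the box stay background (0).
    """
    x_min, y_min, _, _ = box
    label = [[0] * width for _ in range(height)]
    for dy, row in enumerate(crop_prob):
        for dx, prob in enumerate(row):
            y, x = y_min + dy, x_min + dx
            inside_image = 0 <= y < height and 0 <= x < width
            if inside_image and prob >= threshold:
                label[y][x] = 1
    return label


# Toy example: a 3x2 crop placed at (1, 1) in a 5x4 image.
toy_label = paste_box_mask(
    height=4,
    width=5,
    box=(1, 1, 4, 3),
    crop_prob=[[0.9, 0.2, 0.8], [0.1, 0.7, 0.6]],
)
for row in toy_label:
    print(row)
```

In a real pipeline the binary map would typically be converted to polygon annotations before being passed to EAST or PSENet; that conversion is omitted here.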

Training SASN with SynthText or Curved SynthText

(TBD)

Generating pseudo labels on real data with SASN

(TBD)

Training EAST or PSENet with the pseudo labels

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Dynamic Self Training

Train

(TBD)

Eval

For example (batchsize=2):

(TBD)

Visualization

Experiments

Bounding Box Supervision

The performance of EAST on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	ICDAR15	-	65.8	63.8	64.8
EAST	ICDAR15	-	76.9	77.1	77.0
EAST_pseudo(SynthText)	ICDAR15	-	77.8	78.2	78.0
EAST_box	ICDAR15	SynthText	70.8	72.0	71.4
EAST	ICDAR15	SynthText	82.0	82.4	82.2
EAST_pseudo(SynthText)	ICDAR15	SynthText	81.3	82.2	81.8
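
For reference, the f-score column can be related to the other two columns as follows, assuming it is the standard F1 measure (the harmonic mean of precision and recall). The README does not state the formula, so this is an assumption, and the helper `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# EAST_box on ICDAR15 without pretraining (first row of the table above).
print(round(f_score(65.8, 63.8), 1))  # 64.8
```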

The performance of EAST on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
EAST_box	MSRA-TD500	-	40.49	31.05	35.15
EAST	MSRA-TD500	-	71.76	69.05	70.38
EAST_pseudo(SynthText)	MSRA-TD500	-	71.27	67.54	69.36
EAST_box	MSRA-TD500	SynthText	48.34	42.37	45.16
EAST	MSRA-TD500	SynthText	77.91	76.45	77.17
EAST_pseudo(SynthText)	MSRA-TD500	SynthText	77.42	73.85	75.59

The performance of PSENet on ICDAR15

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	ICDAR15	-	70.17	69.09	69.63
PSENet	ICDAR15	-	81.6	79.5	80.5
PSENet_pseudo(SynthText)	ICDAR15	-	82.9	77.6	80.2
PSENet_box	ICDAR15	SynthText	72.65	74.29	73.46
PSENet	ICDAR15	SynthText	86.42	83.54	84.96
PSENet_pseudo(SynthText)	ICDAR15	SynthText	86.77	83.34	85.02

The performance of PSENet on MSRA-TD500

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	MSRA-TD500	-	47.17	36.90	41.41
PSENet	MSRA-TD500	-	80.86	77.72	79.13
PSENet_pseudo(SynthText)	MSRA-TD500	-	80.32	77.26	78.86
PSENet_box	MSRA-TD500	SynthText	47.45	39.49	43.11
PSENet	MSRA-TD500	SynthText	84.11	84.97	84.54
PSENet_pseudo(SynthText)	MSRA-TD500	SynthText	84.03	84.03	84.03

The performance of PSENet on Total Text

Method	Dataset	Pretrain	precision	recall	f-score
PSENet_box	Total Text	-	46.5	43.6	45.0
PSENet	Total Text	-	80.4	76.5	78.4
PSENet_pseudo(SynthText)	Total Text	-	80.33	73.54	76.78
PSENet_pseudo(Curved SynthText)	Total Text	-	81.68	74.61	78.0
PSENet_box	Total Text	SynthText	51.94	47.45	49.59
PSENet	Total Text	SynthText	83.4	78.1	80.7
PSENet_pseudo(SynthText)	Total Text	SynthText	81.57	75.54	78.44
PSENet_pseudo(Curved SynthText)	Total Text	SynthText	82.51	77.57	80.0

The visualization of bounding-box annotation and the pseudo labels generated by BBS on Total-Text

Links

https://github.com/SakuraRiven/EAST

https://github.com/WenmuZhou/PSENet.pytorch

License

For academic use, this project is licensed under the Apache License - see the LICENSE file for details. For commercial use, please contact the authors.

Citations

Please consider citing our paper in your publications if the project helps your research.

Email: [email protected]
