PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description

This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector.

  • Only the RBOX part is implemented.
  • Dice loss is used instead of the class-balanced cross-entropy loss. Some code is adapted from argman/EAST and songdejia/EAST.
  • The provided pre-trained model achieves an 82.79 F-score on ICDAR 2015 Challenge 4 using only the 1,000 training images; see here for the detailed results.

Model         Loss   Recall   Precision   F-score
Original      CE     72.75    80.46       76.41
Re-Implement  Dice   81.27    84.36       82.79

Prerequisites

Only tested on

  • Anaconda3
  • Python 3.7.1
  • PyTorch 1.0.1
  • Shapely 1.6.4
  • opencv-python 4.0.0.21
  • lanms 1.0.2

When running the scripts, if a required module is not installed you will see a notification with installation instructions. If lanms fails to install, update gcc and binutils. Under a conda environment, the update is:

conda install -c omgarcia gcc-6
conda install -c conda-forge binutils
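
A quick way to compare the installed versions with the ones listed above is the snippet below; it is only a convenience sketch and not part of the repository.

import cv2
import shapely
import torch

print('PyTorch :', torch.__version__)    # tested with 1.0.1
print('OpenCV  :', cv2.__version__)      # tested with 4.0.0.21
print('Shapely :', shapely.__version__)  # tested with 1.6.4
try:
    import lanms                         # tested with 1.0.2
    print('lanms   : importable')
except ImportError as err:
    print('lanms   : missing ->', err)   # see the gcc/binutils note above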

The original lanms code has a bug in normalize_poly: the reference vertices are not held fixed while looping over p's vertex orderings to compute the minimum distance. We fixed this bug in LANMS so that anyone can compile a correct lanms; however, this repo still uses the original lanms.
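
For illustration only, here is a minimal Python sketch of what normalize_poly is meant to compute (the actual lanms code is C++): the reference quad stays fixed while the four cyclic orderings of the other quad are tried.

import numpy as np

def best_cyclic_order(ref, poly):
    # Align poly's vertex ordering to a fixed reference quad.
    # ref, poly: (4, 2) arrays of quad vertices. The reference must not
    # change inside the loop -- that is exactly the bug described above.
    ref = np.asarray(ref, dtype=np.float64)
    poly = np.asarray(poly, dtype=np.float64)
    best_shift, best_dist = 0, float('inf')
    for shift in range(4):                      # four cyclic orderings of poly
        candidate = np.roll(poly, shift, axis=0)
        dist = np.sum((candidate - ref) ** 2)   # total squared vertex distance
        if dist < best_dist:
            best_shift, best_dist = shift, dist
    return np.roll(poly, best_shift, axis=0)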

Installation

1. Clone the repo

git clone https://github.com/SakuraRiven/EAST.git
cd EAST

2. Data & Pre-Trained Model

  • Download the training and test data from ICDAR 2015 Challenge 4 and split it into four folders: train_img, train_gt, test_img, test_gt.

  • Download the pre-trained VGG16 weights from PyTorch (VGG16) and our trained EAST model (EAST). Make a new folder pths and put the downloaded .pth files into it:

mkdir pths
mv east_vgg16.pth vgg16_bn-6c64b313.pth pths/

Here is an example:

.
├── EAST
│   ├── evaluate
│   └── pths
└── ICDAR_2015
    ├── test_gt
    ├── test_img
    ├── train_gt
    └── train_img
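
A quick sanity check of this layout can look like the sketch below; the relative paths are assumptions taken from the example tree and may need adjusting.

from pathlib import Path

# Verify the example layout above before training; adjust root if your
# ICDAR_2015 folder lives somewhere else (this path is an assumption).
root = Path('..') / 'ICDAR_2015'
for sub in ('train_img', 'train_gt', 'test_img', 'test_gt'):
    folder = root / sub
    print(folder, 'ok' if folder.is_dir() else 'MISSING')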

Train

Modify the parameters in train.py and run:

CUDA_VISIBLE_DEVICES=0,1 python train.py
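
For reference, the tracebacks quoted in the comments below show that train.py ends with a call like train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval). A hedged sketch of that parameter block, with placeholder values rather than the repository's actual defaults, looks like this:

from train import train  # assumes train.py guards its own run with if __name__ == '__main__'

# Placeholder values only -- the real defaults live at the bottom of train.py.
train_img_path = '../ICDAR_2015/train_img'
train_gt_path  = '../ICDAR_2015/train_gt'
pths_path      = './pths'
batch_size     = 24     # reduce this if you hit CUDA out-of-memory errors
lr             = 1e-3
num_workers    = 4
epoch_iter     = 600
save_interval  = 5      # epochs between checkpoints

train(train_img_path, train_gt_path, pths_path, batch_size,
      lr, num_workers, epoch_iter, save_interval)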

Detect

Modify the parameters in detect.py and run:

CUDA_VISIBLE_DEVICES=0 python detect.py
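
If you prefer to call the model from your own script instead of editing detect.py, a minimal inference sketch follows. The class name EAST for the network in model.py, the image path, and the normalization values are assumptions; the (score map, geometry map) output pair matches the pred_score, pred_geo = model(img) call quoted in the comments below.

import torch
from PIL import Image
from torchvision import transforms

from model import EAST  # assumed class name for the network defined in model.py

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = EAST().to(device)
model.load_state_dict(torch.load('./pths/east_vgg16.pth', map_location=device))
model.eval()

# demo.jpg is a placeholder; side lengths should be multiples of 32 because
# of the VGG16 downsampling / unpooling in the feature-merging branch.
img = Image.open('./demo.jpg').convert('RGB')
to_tensor = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)),  # assumed; check detect.py
])

with torch.no_grad():
    score, geo = model(to_tensor(img).unsqueeze(0).to(device))
print(score.shape, geo.shape)  # per-pixel text score and RBOX geometry maps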

Evaluate

  • The evaluation scripts are from ICDAR Offline evaluation and have been modified to run successfully with Python 3.7.1.
  • Replace evaluate/gt.zip if you test on another dataset (a packaging sketch follows this list).
  • Modify the parameters in eval.py and run:
CUDA_VISIBLE_DEVICES=0 python eval.py
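
If you do evaluate on another dataset, gt.zip can be rebuilt with something like the following; the gt_img_*.txt naming follows the ICDAR 2015 convention, and the paths are assumptions based on the layout shown earlier.

import zipfile
from pathlib import Path

# Rebuild evaluate/gt.zip from your own ground-truth files (paths assumed).
gt_dir = Path('../ICDAR_2015/test_gt')
with zipfile.ZipFile('evaluate/gt.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    for txt in sorted(gt_dir.glob('gt_img_*.txt')):
        zf.write(txt, arcname=txt.name)  # store the files flat inside the zip
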
Comments
  • trian

    Traceback (most recent call last):
      File "train.py", line 66, in <module>
        train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)
      File "train.py", line 35, in train
        for i, (img, gt_score, gt_geo, ignored_map) in enumerate(train_loader):
      File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 637, in __next__
        return self._process_next_batch(batch)
      File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
        raise batch.exc_type(batch.exc_msg)
    TypeError: function takes exactly 5 arguments (1 given)

    How can I fix it?

    opened by wangyuxin87 9
  • About backbone

    Hi, I'd like to ask: have you tried using ResNet50 or ResNeXt as the backbone? They should work better than VGG16, right? Also, does your re-implementation of EAST use any tricks? Its performance is even a bit higher than @argman's https://github.com/argman/EAST. Thanks a lot!

    opened by zhengjiawen 6
  • Regarding find_min_rect_angle

    Hi dear @SakuraRiven, many thanks for your practical re-implementation. It's much clearer than the other versions.

    I'd like to know why you do find_min_rect_angle. Are you finding a best-matched AABB (axis-aligned bounding box) by rotating each bounding box, like the image below?

    [image]

    THANKS in advance!

    opened by doem97 3
  • Speed && Accuracy

    Hello, thanks for sharing! I have two questions and need your help. When I trained AdvancedEAST with Keras it ran quickly and the results were accurate, but training with your project is slow and the results are not accurate: about 3,500 training images on two 1080Ti GPUs took 24 hours, while AdvancedEAST took about 1 hour.

    opened by www516717402 3
  • ValueError: invalid literal for int() with base 10: '555'

    C:\Users\Adnan\AppData\Local\Programs\Python\Python35\python.exe C:/Users/Adnan/EAST/train.py
    C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\optim\lr_scheduler.py:82: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
      "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
    Traceback (most recent call last):
      File "C:/Users/Adnan/EAST/train.py", line 66, in <module>
        train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)
      File "C:/Users/Adnan/EAST/train.py", line 35, in train
        for i, (img, gt_score, gt_geo, ignored_map) in enumerate(train_loader):
      File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __next__
        return self._process_data(data)
      File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
        data.reraise()
      File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\_utils.py", line 369, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\Adnan\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\Adnan\EAST\dataset.py", line 385, in __getitem__
        vertices, labels = extract_vertices(lines)
      File "C:\Users\Adnan\EAST\dataset.py", line 365, in extract_vertices
        vertices.append(list(map(int, line.rstrip('\n').lstrip('\ufeff').split(',')[:8])))
    ValueError: invalid literal for int() with base 10: '555'

    opened by Adnan-annan 2
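
    The traceback above comes from extract_vertices trying to convert the first eight comma-separated fields of a ground-truth line to int. A hedged helper for locating the offending line (for example, a field with stray whitespace, a BOM, or a non-ASCII digit), mirroring the parsing shown in the traceback:

    def find_bad_gt_lines(path):
        # Report lines whose first 8 comma-separated fields cannot be parsed
        # as integers, following the split/strip logic quoted in the traceback.
        bad = []
        with open(path, encoding='utf-8') as f:
            for n, line in enumerate(f, 1):
                fields = line.rstrip('\n').lstrip('\ufeff').split(',')[:8]
                try:
                    [int(v) for v in fields]
                except ValueError:
                    bad.append((n, repr(line)))
        return bad

    for lineno, raw in find_bad_gt_lines('../ICDAR_2015/train_gt/gt_img_1.txt'):
        print(lineno, raw)  # the path above is just an example file
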
  • Loss differences wrt paper

    Loss differences wrt paper

    I'm not sure whether it's a bug or just a difference in variable naming. In loss.py, you define the width and height of the union box as:

    w_union = torch.min(d3_gt, d3_pred) + torch.min(d4_gt, d4_pred)
    h_union = torch.min(d1_gt, d1_pred) + torch.min(d2_gt, d2_pred)
    

    while in the original paper it is:

    w_i = min(d2_gt, d2_pred) + min(d4_gt, d4_pred)
    h_i = min(d1_gt, d1_pred) + min(d3_gt, d3_pred)

    Could you please clarify?

    opened by lukszamarcin 2
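
    Judging from the pairing in the snippet above (d3 with d4 for the width, d1 with d2 for the height), the geometry channels appear to be stored as opposite-side pairs rather than in the paper's (top, right, bottom, left) order, so the two formulations should be equivalent up to naming. A sketch of the IoU term under that assumption (angle channel omitted):

    import torch

    def iou_geo_loss(gt_geo, pred_geo, eps=1e-8):
        # gt_geo / pred_geo: (N, 4, H, W) distance maps, assumed channel order
        # (top, bottom, left, right) so that (d1 + d2) is height and
        # (d3 + d4) is width; the angle channel is omitted here.
        d1_gt, d2_gt, d3_gt, d4_gt = torch.split(gt_geo, 1, dim=1)
        d1_p, d2_p, d3_p, d4_p = torch.split(pred_geo, 1, dim=1)
        area_gt = (d1_gt + d2_gt) * (d3_gt + d4_gt)
        area_p = (d1_p + d2_p) * (d3_p + d4_p)
        w_i = torch.min(d3_gt, d3_p) + torch.min(d4_gt, d4_p)  # intersection width
        h_i = torch.min(d1_gt, d1_p) + torch.min(d2_gt, d2_p)  # intersection height
        area_i = w_i * h_i
        iou = area_i / (area_gt + area_p - area_i + eps)
        return -torch.log(iou + eps)
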
  • Cannot re-implement your work

    I trained the network from vgg16_bn-6c64b313.pth and evaluated it; with save/model_epoch_600.pth I got Calculated!{"precision": 0.7974987974987975, "recall": 0.79826673086182, "hmean": 0.7978825794032723, "AP": 0}.

    BTW, evaluating east_vgg16.pth gives your original score: Calculated!{"precision": 0.8435782108945528, "recall": 0.8127106403466539, "hmean": 0.8278567925453654, "AP": 0}, so I used the right code and model.

    What's wrong here? Should I use an earlier epoch?

    Yours, Neo

    opened by CuriousCat-7 2
  • About the network's output of the four edge distances

    Hello, thank you for your work. I'd like to ask about the network's output:
    loc = self.sigmoid2(self.conv2(x)) * self.scope
    1. The value of self.scope here is the input image size, 512, right?
    2. If self.scope is the input image size, then the values of self.sigmoid2(self.conv2(x)) should be very small, e.g. between 0 and 0.1 or even smaller; does that noticeably affect training the regression?

    Looking forward to your reply, thanks!

    opened by Banyueqin 2
  • Cannot repeat your result

    Hi SakuraRiven, I cloned your repository and tried to run eval.py; it outputs precision 0.014, recall 0.8574 and hmean 0.028. That seems very strange to me; do you have any comment?

    opened by phucnsp 2
  • Something wrong in dataset.py crop_img()

    Thanks for your clean code, but I suspect something may be wrong in the crop_img() function. After the while loop, if cnt == 1000 (flag == True), there can still be vertices lying outside the cropped image; shouldn't those be removed from new_vertices? Another question: if after the while loop cnt < 1000 (flag == False), that only means no label==1 vertices cross the crop boundary, because you only verify vertices whose label is 1. So if you train with the ignored regions, those vertices may not match the cropped image. Waiting for your comment.

    opened by YanShuang17 2
  • Running python train.py hits the following problem, hoping for a reply

    Traceback (most recent call last):
      File "train.py", line 66, in <module>
        train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)
      File "train.py", line 38, in train
        pred_score, pred_geo = model(img)
      File "C:\Users\Tony\anaconda3\envs\EAST\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\Tony\PycharmProjects\EAST\model.py", line 168, in forward
        return self.output(self.merge(self.extractor(x)))
      File "C:\Users\Tony\anaconda3\envs\EAST\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\Tony\PycharmProjects\EAST\model.py", line 73, in forward
        x = m(x)
      File "C:\Users\Tony\anaconda3\envs\EAST\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\Tony\anaconda3\envs\EAST\lib\site-packages\torch\nn\modules\batchnorm.py", line 76, in forward
        exponential_average_factor, self.eps)
      File "C:\Users\Tony\anaconda3\envs\EAST\lib\site-packages\torch\nn\functional.py", line 1623, in batch_norm
        training, momentum, eps, torch.backends.cudnn.enabled
    RuntimeError: CUDA out of memory. Tried to allocate 1.50 GiB (GPU 0; 6.00 GiB total capacity; 4.64 GiB already allocated; 0 bytes free; 516.50 KiB cached)

    opened by tony8888lrz 1
  • RGBA support for PNG images

    How can I get RGBA support for PNG images?

    If the dataset contains an alpha channel, it cannot be used right now, so I'm converting to JPG.

    However, my dataset has a transparent background, so could you make it work? :))

    opened by ffanccybear 0
  • The sample not present in GT

    Error!ting 2304 image
    The sample res_5350037-2005-0001-0808_scale(0.5) not present in GT
    
    eval time is 705.596533536911
    

    I cannot understand this error. The file exists in gt.zip, and the corresponding res file is in the submission, but this error still occurs.

    opened by whansk50 0
  • ZeroDivisionError: float division by zero

    Traceback (most recent call last):
      File "train.py", line 66, in <module>
        train(train_img_path, train_gt_path, pths_path, batch_size, lr, num_workers, epoch_iter, save_interval)	
      File "train.py", line 35, in train
        for i, (img, gt_score, gt_geo, ignored_map) in enumerate(train_loader):
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
        data = self._next_data()
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
        return self._process_data(data)
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
        data.reraise()
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
        raise exception
    ZeroDivisionError: Caught ZeroDivisionError in DataLoader worker process 4.
    Original Traceback (most recent call last):
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
        data = fetcher.fetch(index)
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/***/EAST/dataset.py", line 389, in __getitem__
        img, vertices = crop_img(img, vertices, labels, self.length)
      File "/home/***/EAST/dataset.py", line 221, in crop_img
        flag = is_cross_text([start_w, start_h], length, new_vertices[labels==1,:])
      File "/home/***/EAST/dataset.py", line 181, in is_cross_text
        if 0.01 <= inter / p2.area <= 0.99:
    ZeroDivisionError: float division by zero
    

    Why does this error occur? I have checked my custom dataset again and again, but I can see no difference from the sample ICDAR dataset...

    +) I caught the exception as a workaround, but I don't think that's a valid fix since it's only a temporary measure.

    opened by whansk50 0
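
    The division that fails is inter / p2.area in is_cross_text, so a zero-area (degenerate) annotation polygon is the likely trigger. A hedged workaround sketch, not the repository's official fix, is to guard the ratio:

    from shapely.geometry import Polygon

    def overlap_ratio(crop_poly, text_poly):
        # Fraction of the text polygon covered by the crop window.
        # Degenerate annotations (repeated or collinear points) have zero
        # area, which is what raises the ZeroDivisionError above.
        if text_poly.area == 0:
            return 0.0
        inter = crop_poly.intersection(text_poly).area
        return inter / text_poly.area

    # Example: a zero-area "polygon" no longer crashes the ratio test.
    crop = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
    degenerate = Polygon([(1, 1), (2, 2), (3, 3), (1, 1)])
    print(overlap_ratio(crop, degenerate))  # 0.0
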
  • Question: Saving checkpoints

    @SakuraRiven For how many epochs did you train the pretrained model? And how did you decide which checkpoint to use? Based on the average loss per epoch?

    opened by mineshmathew 0