AdvancedEAST is an algorithm for scene image text detection. It is based primarily on EAST, with significant improvements that make long-text predictions more accurate. https://github.com/huoyijie/raspberrypi-car

Overview

AdvancedEAST

AdvancedEAST is an algorithm for scene image text detection. It is based primarily on EAST: An Efficient and Accurate Scene Text Detector, with significant improvements that make long-text predictions more accurate. If this project is helpful to you, you are welcome to star it. If you have any problems, please contact me.

advantages

  • written in Keras, easy to read and run
  • based on EAST, an advanced text detection algorithm
  • easy to train the model
  • significantly improved: long-text predictions are more accurate (see the 'demo results' section below, and pay attention to the activation image, in which a text region starts with yellow grids and ends with green grids)

In my experiments, AdvancedEAST obtained much better prediction accuracy than EAST, especially on long text. EAST calculates the final vertex coordinates as a weighted mean of the vertex coordinates predicted by all pixels, and it is very difficult for a pixel to predict the 2 vertices on the far side of the quadrangle. See the EAST limitations quoted from the original paper below.

East limitations

project files

  • config file: cfg.py, control parameters
  • pre-process data: preprocess.py, resize images
  • label data: label.py, produce label info
  • define network: network.py
  • define loss function: losses.py
  • execute training: advanced_east.py and data_generator.py
  • predict: predict.py and nms.py

For a description of the post-processing step, see Post-processing (with schematic diagrams)

network arch

  • AdvancedEast

AdvancedEast network arch

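Network output: the output layer consists of a 1-channel score map (whether the pixel is inside a text box), a 2-channel vertex code (whether the pixel is a boundary pixel of the text box, and whether it belongs to the head or the tail), and a 4-channel geo (the coordinates of the 2 vertices that a boundary pixel can predict). All pixels together form the shape of the text box, and only the boundary pixels are used to regress the vertex coordinates. Boundary pixels are defined as all pixels inside the yellow and green boxes. The two vertices at the ends of the head or tail short edge are predicted by a weighted average of the predictions of all the corresponding boundary pixels. The head and tail boundary pixels each predict 2 vertices, giving 4 vertex coordinates in total.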
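As a rough illustration of the output layout described above, the following NumPy sketch splits a 7-channel prediction into its score, vertex-code and geo parts, then averages the head (or tail) boundary pixels into two vertices. It is a minimal sketch and not the project's nms.py: the function names, the 0.5 threshold, the pixel_size of 4, the pixel-centre convention, and the use of score times boundary confidence as the averaging weight are all assumptions made for illustration. The geo channels are treated here as (dx, dy) offsets from the pixel to each of its two vertices.

```python
import numpy as np


def split_outputs(pred):
    """Split an (H, W, 7) prediction into its three parts."""
    score = pred[..., 0]        # 1 channel: is the pixel inside a text box?
    is_boundary = pred[..., 1]  # vertex code, channel 1: boundary pixel?
    is_tail = pred[..., 2]      # vertex code, channel 2: 0 = head, 1 = tail (assumed)
    # 4 geo channels viewed as two (dx, dy) offsets per pixel.
    geo = pred[..., 3:7].reshape(pred.shape[0], pred.shape[1], 2, 2)
    return score, is_boundary, is_tail, geo


def side_vertices(pred, tail, pixel_size=4, threshold=0.5):
    """Weighted average of the two vertices predicted by head or tail pixels.

    Returns a (2, 2) array of (x, y) vertices, or None if no pixel qualifies.
    """
    score, is_boundary, is_tail, geo = split_outputs(pred)
    side = (is_tail >= threshold) if tail else (is_tail < threshold)
    mask = (score >= threshold) & (is_boundary >= threshold) & side
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return None
    # Pixel centres in image coordinates (assumed convention).
    centers = np.stack([(xs + 0.5) * pixel_size, (ys + 0.5) * pixel_size], axis=1)
    # Each selected pixel votes for two vertices: centre + offset.
    votes = centers[:, None, :] + geo[ys, xs]          # shape (N, 2, 2)
    weights = score[ys, xs] * is_boundary[ys, xs]      # assumed weighting
    return np.average(votes, axis=0, weights=weights)
```

Calling side_vertices once with tail=False and once with tail=True gives the 2 + 2 vertices of a single quadrangle. Separating several text boxes in one image is outside the scope of this sketch.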

Brief introduction to the principle (with schematic diagrams)

  • East

East network arch

setup

  • python 3.6.3+
  • tensorflow-gpu 1.5.0+(or tensorflow 1.5.0+)
  • keras 2.1.4+
  • numpy 1.14.1+
  • tqdm 4.19.7+
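
A quick way to confirm that an environment meets the minimums listed above is sketched below. The helper names are hypothetical, and the version parsing is deliberately simple (it keeps only the leading digits of the first three components), so treat it as a convenience check and not a full version comparison.

```python
import importlib
import re
import sys

# Minimum versions copied from the setup list above.
MINIMUMS = {
    "tensorflow": "1.5.0",
    "keras": "2.1.4",
    "numpy": "1.14.1",
    "tqdm": "4.19.7",
}


def parse_version(version):
    """Turn '1.14.1' or '2.0.0rc1' into a comparable tuple such as (1, 14, 1)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '1.5' to (1, 5, 0)
        parts.append(0)
    return tuple(parts)


def check_environment(minimums=MINIMUMS):
    """Return {name: (installed_version_or_None, meets_minimum)}."""
    report = {"python": (sys.version.split()[0], sys.version_info >= (3, 6, 3))}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", "0")
        report[name] = (installed, parse_version(installed) >= parse_version(minimum))
    return report
```

Running print(check_environment()) in the target environment lists each package with its installed version and whether it satisfies the minimum.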

training

  • download the Tianchi ICPR dataset: https://pan.baidu.com/s/1NSyc-cHKV3IwDo6qojIrKA (password: ye9y)

  • prepare training data: make the data root dir (icpr), then copy the images and the txt files into it. For details of the data format, refer to 'ICPR MTWI 2018 Challenge 2: Text Detection of Web Images', Link

  • modify config params in cfg.py, see default values.

  • python preprocess.py, resize images to 256*256, 384*384, 512*512, 640*640 or 736*736; training at each size in turn can speed up the training process.

  • python label.py

  • python advanced_east.py, the training entry point

  • python predict.py -p demo/001.png, to predict

  • download the pre-trained model (for testing): https://pan.baidu.com/s/1KO7tR_MW767ggmbTjIJpuQ (password: kpm2)
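
The resize step in the list above also affects the annotations: when an image is stretched to a square training size, its quadrangle vertices have to be scaled by the same ratios. The sketch below illustrates that bookkeeping. The function names are hypothetical, and the assumption that labels are stored at 1/4 of the image resolution with 7 channels is made for illustration only; it is not a statement about how preprocess.py or label.py are written.

```python
import numpy as np

# Square training sizes listed in the steps above.
TRAIN_SIZES = (256, 384, 512, 640, 736)


def scale_quad(xy_list, src_size, dst_size):
    """Rescale quadrangle vertices for an image resized to dst_size x dst_size.

    xy_list:  (4, 2) sequence of (x, y) vertices in the original image.
    src_size: (width, height) of the original image.
    """
    src_w, src_h = src_size
    ratio = np.array([dst_size / src_w, dst_size / src_h])
    return np.asarray(xy_list, dtype=float) * ratio


def label_shape(img_size, pixel_size=4, channels=7):
    """Shape of the label map for a square image (pixel_size is assumed)."""
    return (img_size // pixel_size, img_size // pixel_size, channels)
```

For example, a box covering a whole 1280 by 720 image maps to the corners of the 736 by 736 training image, and label_shape(256) gives (64, 64, 7) under the stated assumptions.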

demo results

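001 original image, 001 activation map, 001 prediction

004 original image, 004 activation map, 004 prediction

005 original image, 005 activation map, 005 prediction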

  • compared with EAST based on VGG16

As you can see, although the text area prediction is very accurate, the vertex coordinates are not accurate enough.

001 activation map, 001 prediction

License

The code is released under the MIT License.

references

Brief introduction to the principle (with schematic diagrams)

For a description of the post-processing step, see Post-processing (with schematic diagrams)

A Simple RaspberryPi Car Project

Comments
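  • What do the last four geo values in the label mean?

    Hi huoyijie, there is one part of the labeling step that I don't quite understand: the last four values. Do they represent the distance from each point to the points of the longest edge?

    gt[i, j, 3:5] = xy_list[vs[long_edge][ith][0]] - [px, py]
    gt[i, j, 5:] = xy_list[vs[long_edge][ith][1]] - [px, py]

    Here xy_list[] records the 4 points, stored in clockwise order.

    Do the last four values, gt[i, j, 3:7], store the distance from the point [px, py] to these two long-edge points? Why is this distance computed?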

    opened by BigPandaCPU 11
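  • Detection results have no head and no tail

    @huoyijie Hello, and thank you for sharing your code. I have run into some problems that I cannot figure out and would like to ask about. I need to detect text in natural scenes, so I changed the training data to the ICDAR 2017 dataset plus a dataset I annotated myself, and then tested on some images. Many images show boxes with a head but no tail, a tail but no head, or neither, as well as adjacent lines that cross each other and cannot be separated. I cannot find the cause or a way to improve this. Could you give me some advice? I am in the third year of my master's degree and rather pressed for time. Thank you.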

    31 jpg_act 31 jpg_predict 41 jpg_act 41 jpg_predict

    8 jpg_act 8 jpg_predict

    opened by snowwindy 11
  • Weights loading issues.

    Hi! I am trying to train on my own dataset with your pre-trained weights. First I set load_weights=True in cfg.py, and then the following error occurs in advanced_east.py:

    Traceback (most recent call last):
      File "D:/Workspace/Ad_new/advanced_east.py", line 17, in main
        east_network.load_weights(cfg.saved_model_weights_file_path)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1391, in load_weights
        saving.load_weights_from_hdf5_group(f, self.layers)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 732, in load_weights_from_hdf5_group
        ' layers.')
    ValueError: You are trying to load a weight file containing 30 layers into a model with 1 layers.

    Following a Stack Overflow answer, I rewrote advanced_east.py:

    import os
    from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint
    from tensorflow.python.keras.optimizers import Adam
    from tensorflow.python.keras.models import load_model
    from tensorflow.python.keras.models import model_from_json
    
    import cfg
    from network import East
    from losses import quad_loss
    from data_generator import gen
    
    east = East()
    east_network = east.east_network()
    
    if cfg.load_weights and os.path.exists(cfg.saved_model_weights_file_path):
      east_network.load_weights(cfg.saved_model_weights_file_path, by_name=True)
      json_string = east_network.to_json()
      east_network = model_from_json(json_string)
    
    east_network.summary()
    east_network.compile(loss=quad_loss, optimizer=Adam(lr=cfg.lr,
                                                        # clipvalue=cfg.clipvalue,
                                                        decay=cfg.decay))
    
    east_network.fit_generator(generator=gen(),
                               steps_per_epoch=cfg.steps_per_epoch,
                               epochs=cfg.epoch_num,
                               validation_data=gen(is_val=True),
                               validation_steps=cfg.validation_steps,
                               verbose=1,
                               # use_multiprocessing=True,
                               initial_epoch=cfg.initial_epoch,
                               callbacks=[
                                   EarlyStopping(patience=cfg.patience, verbose=1),
                                   ModelCheckpoint(filepath=cfg.model_weights_path,
                                                   save_best_only=False,
                                                   save_weights_only=True,
                                                   verbose=1)])
    east_network.save(cfg.saved_model_file_path)
    east_network.save_weights(cfg.saved_model_weights_file_path)
    
    

    After that, training started successfully. But when I tried to predict images with my trained weights east_model_weights_3T736.h5 in the saved_model folder, an error showed up again:

    Traceback (most recent call last):
      File "D:/Workspace/Ad_new/test_img.py", line 163, in <module>
        main()
      File "D:/Workspace/Ad_new/test_img.py", line 137, in main
        east_detect.load_weights("saved_model/east_model_weights_3T736.h5")
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1391, in load_weights
        saving.load_weights_from_hdf5_group(f, self.layers)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 732, in load_weights_from_hdf5_group
        ' layers.')
    ValueError: You are trying to load a weight file containing 1 layers into a model with 30 layers.

    I tried the same solution, with the following code:

      east = East()
      east_detect = east.east_network()
      east_detect.load_weights("saved_model/east_model_weights_3T736.h5", by_name=True)
      json_string = east_detect.to_json()
      east_detect = model_from_json(json_string)
    

    This produces a new error:

      File "D:/Workspace/Ad_new/test_img.py", line 163, in <module>
        main()
      File "D:/Workspace/Ad_new/test_img.py", line 130, in main
        east_detect.load_weights("saved_model/east_model_weights_3T736.h5", by_name=True)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1389, in load_weights
        saving.load_weights_from_hdf5_group_by_name(f, self.layers)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 810, in load_weights_from_hdf5_group_by_name
        K.batch_set_value(weight_value_tuples)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\backend.py", line 2711, in batch_set_value
        assign_op = x.assign(assign_placeholder)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 945, in assign
        self._shape.assert_is_compatible_with(value_tensor.shape)
      File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 847, in assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))
    ValueError: Shapes (128,) and (1024,) are incompatible

    Am I loading or saving the weights incorrectly? Is anyone else facing the same problem?

    opened by EllenSong77 6
  • weight file error

    When I use the file 'east_model_weights_2T736.h5', an error comes up: ValueError: You are trying to load a weight file containing 58 layers into a model with 30 layers. It seems the weight file does not match the model.

    opened by gs-ren 4
  • The issue of predict

    I want to test the pre-trained model. Environment: Ubuntu 16.04, Python 2.7, tensorflow==1.5.0.

    I put my test image in the demo folder and ran: python predict.py -p demo/1.jpg

    But I got an error: image

    Please help me.

    opened by simplify23 3
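  • Question about label generation for the side regions

    Head and tail pixels are labeled according to the first poly they belong to, but the map is initialized to 0. Head-side pixels get ith=0, tail-side pixels get ith=1, and no-side pixels get ith=-1. Labeled this way, head-side and non-side pixels both have the value 0. Visualizing this part shows only the tail and no head, and the head loss will be computed incorrectly during training. What does the author think about this?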

    opened by saicoco 3
  • Inaccurate detection for images wider than 1000 pixels

    The code resizes the image to at most 736 pixels wide before running detection. For images wider than 1000 pixels, this makes the text too small to detect and leads to inaccurate results.

    If I change the resize step, I get a resource exhausted error. Do you have any idea how to solve this?

    opened by am05mhz 3
  • Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

    Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

    8192/58889256 [..............................] - ETA: 38:30
    24576/58889256 [..............................] - ETA: 47:25
    40960/58889256 [..............................] - ETA: 39:50
    57344/58889256 [..............................] - ETA: 39:34
    Traceback (most recent call last):

    I do not know how to solve this.

    opened by magicxiaobai 2
  • Thanks, solved

    File "/home/louj1/pywork/AEAST/data_generator.py", line 37, in gen
      y[i] = np.load(gt_file)
    ValueError: could not broadcast input array from shape (21,4,2) into shape (64,64,7)

    opened by mialrr 2
  • Training error: could not broadcast input array from shape

    Hi, I am trying to run the training and am getting the following error:

     File "D:\AdvancedEAST\data_generator.py", line 33, in gen
        img
    
    ValueError: could not broadcast input array from shape (720,1280,3) into shape (736,736,3)
    

    Did you ever get such an error? Any suggestions? My data includes just a few images for merely initiating the training loop.

    The tree structure of the files used for training is displayed below:

    untitled

    opened by wajahat57 2
  • Where is the training entry point?

    Respected moderator, may I venture to ask: I could not find the training entry point in AdvancedEAST, and I expect many people like me are looking for it. Thank you.

    opened by mialrr 1
  • Model pruning problem

    Using the model provided by the author, I tried to prune it with the TensorFlow model optimization package, but got the following error: prune_low_magnitude can only prune an object of the following types: tf.keras.models.Sequential, tf.keras functional model, tf.keras.layers.Layer, list of tf.keras.layers.Layer. You passed an object of type: Model.

    opened by zhangshabao 0
  • > Are your detection results good? Why are mine so poor? Could we discuss this by email?

    > Are your detection results good? Why are my detection results so poor? Could we discuss this by email? [email protected]

    My detection results are good; it depends on your scenario. [email protected]

    Originally posted by @yahuuu in https://github.com/huoyijie/AdvancedEAST/issues/114#issuecomment-659986151

    My detection results are also poor, and I am using the dataset he provided. I would like to ask for advice. [email protected]

    opened by Cui7 0