Scene text detection and recognition based on Extremal Region(ER)

HSIEH, YI CHIA

Last update: Dec 6, 2022

Related tags

Computer Vision classifier opencv machine-learning algorithm ocr computer-vision svm detection image-processing text-recognition spelling-checker adaboost chaincode cascade-classifier mser lbp canny non-maximum-suppression scene-text-recognition scene-text-detection

Overview

Scene text recognition

A real-time scene text recognition algorithm. Our system is able to recognize text against unconstrained backgrounds.
This algorithm is based on several papers and is implemented in C/C++.

Environment and dependencies

OpenCV 3.1 or above
CMake 3.10 or above
Visual Studio 2017 Community or above (Windows-only)

How to build?

Windows

Install OpenCV; put the opencv directory into C:\tools
- You can install it manually from its GitHub repo, or
- You can install it via Chocolatey: choco install opencv, or
- If you already have OpenCV, edit CMakeLists.txt and change WIN_OPENCV_CONFIG_PATH to where you have it

Use CMake to generate the project files

cd Scene-text-recognition
mkdir build-win
cd build-win
cmake .. -G "Visual Studio 15 2017 Win64"

Use CMake to build the project
```
cmake --build . --config Release
```
Find the binaries in the root directory
```
cd ..
dir | findstr scene
```
To execute the scene_text_recognition.exe binary, use its wrapper script; for example:
```
.\scene_text_recognition.bat -i res\ICDAR2015_test\img_6.jpg
```

Linux

Install OpenCV; refer to OpenCV Installation in Linux

Use CMake to generate the project files

cd Scene-text-recognition
mkdir build-linux
cd build-linux
cmake ..

Use CMake to build the project
```
cmake --build .
```
Find the binaries in the root directory
```
cd ..
ls | grep scene
```

To execute the binaries, run them as-is; for example:

./scene_text_recognition -i res/ICDAR2015_test/img_6.jpg

Usage

The executable file scene_text_recognition must ultimately exist in the project root directory (i.e., next to classifier/, dictionary/ etc.)

./scene_text_recognition -v:            take default webcam as input  
./scene_text_recognition -v [video]:    take a video as input  
./scene_text_recognition -i [image]:    take an image as input  
./scene_text_recognition -i [path]:     take a folder of images as input  
./scene_text_recognition -l [image]:    demonstrate "Linear Time MSER" Algorithm  
./scene_text_recognition -t detection:  train text detection classifier  
./scene_text_recognition -t ocr:        train text recognition(OCR) classifier

Train your own classifier

Text detection

Put your text data in res/pos and your non-text data in res/neg
Name the files numerically, e.g. 1.jpg, 2.jpg, 3.jpg, and so on.
Make sure the training folder exists
Run ./scene_text_recognition -t detection

mkdir training
./scene_text_recognition -t detection

The text detection classifier will be written to the training folder

Text recognition(OCR)

Put your training data in res/ocr_training_data/
Arrange the data as [Font Name]/[Font Type]/[Category]/[Character.jpg], for instance Time_New_Roman/Bold/lower/a.jpg. You can refer to res/ocr_training_data.zip
Make sure the training folder exists, and put svm-train in the root folder (svm-train is built by the build system and should be found in build/)
Run ./scene_text_recognition -t ocr

mkdir training
mv svm-train scene-text-recognition/
./scene_text_recognition -t ocr

The text recognition (OCR) classifier will be written to the training folder
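
As a small illustration of the directory layout described above, the helper below builds the expected path of one training sample. It is only a sketch: the function name ocr_sample_path and its parameters are hypothetical and not part of the project, and the only category name confirmed by the example above is "lower".

```cpp
#include <string>

// Hypothetical helper (not part of the project): builds the path of one OCR
// training sample following the layout
//   res/ocr_training_data/[Font Name]/[Font Type]/[Category]/[Character.jpg]
std::string ocr_sample_path(const std::string& font_name,
                            const std::string& font_type,
                            const std::string& category,
                            char character) {
    return "res/ocr_training_data/" + font_name + "/" + font_type + "/" +
           category + "/" + character + ".jpg";
}
```

Calling ocr_sample_path("Time_New_Roman", "Bold", "lower", 'a') gives res/ocr_training_data/Time_New_Roman/Bold/lower/a.jpg, which matches the example in the steps above.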

How it works

The algorithm is based on a region detector called Extremal Region (ER), which is essentially a superset of the well-known MSER region detector. We use ERs to find text candidates, extracting them with the linear-time MSER algorithm. The pitfall of ER is repeated detection, so we remove most of the repeated ERs with non-maximum suppression: we estimate the overlap between ERs based on the component tree and calculate the stability of every ER, and among each group of overlapping ERs only the one with maximum stability is kept. After that we apply a two-stage Real-AdaBoost classifier to filter out non-text regions. We choose Mean-LBP as the feature because it is faster to compute than other features. The surviving ERs are then grouped together to lift the result from character level to word level, which is more intuitive for humans. The next step is to apply OCR to the detected text. The chain code of each ER is used as the feature, and the classifier is trained with SVM. We also introduce several post-processing steps, such as optimal-path selection and spelling check, to improve the recognition result.
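
The non-maximum suppression step can be illustrated with a minimal sketch. This is not the project's implementation, and it rests on several assumptions: the component tree is reduced to a single child-to-parent chain, overlap is measured as the area ratio child/parent, and stability is taken as the inverse of an MSER-style area variation. The names Region, variation, suppress, min_overlap and delta are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified ER record: only the pixel count is kept.
// Areas are assumed to be positive.
struct Region {
    int area;
};

// Area variation of chain[i] relative to the region `delta` levels up the
// chain. A lower variation means a more stable region. Regions that have no
// ancestor that far up receive a large sentinel value.
double variation(const std::vector<Region>& chain, std::size_t i, std::size_t delta) {
    if (i + delta >= chain.size()) return 1e9;
    return static_cast<double>(chain[i + delta].area - chain[i].area) / chain[i].area;
}

// chain[0] is the innermost region and chain[i + 1] is the parent of chain[i].
// Consecutive regions whose area ratio child/parent is at least `min_overlap`
// form one group; the index of the most stable member of each group is kept.
std::vector<std::size_t> suppress(const std::vector<Region>& chain,
                                  double min_overlap, std::size_t delta) {
    std::vector<std::size_t> kept;
    std::size_t i = 0;
    while (i < chain.size()) {
        std::size_t best = i;
        std::size_t j = i;
        // Extend the group while the next parent still overlaps enough.
        while (j + 1 < chain.size() &&
               static_cast<double>(chain[j].area) / chain[j + 1].area >= min_overlap) {
            ++j;
            if (variation(chain, j, delta) < variation(chain, best, delta)) best = j;
        }
        kept.push_back(best);
        i = j + 1;  // the next group starts after the last member of this one
    }
    return kept;
}
```

For a chain with areas 100, 104, 106, 107, 300, 310, 900, calling suppress(chain, 0.7, 1) forms three groups (the first four regions, the next two, and the last one) and keeps one region per group. A real component tree branches, so a full version would have to walk every root-to-leaf path or propagate groups through the tree.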

Notes

For text classification, the training data contains 12,000 positive samples, mostly extracted from the ICDAR 2003 and ICDAR 2015 datasets. The negative samples are extracted from random images with a bootstrap process. For OCR classification, the training data consists purely of synthetic letters in 28 different fonts.

The system detects text in real time (30 FPS) and recognizes text in nearly real time (8 to 15 FPS, depending on the amount of text) for a 640x480 image on an Intel Core i7 desktop computer. The algorithm's end-to-end text detection accuracy on the ICDAR 2015 dataset is roughly 70% with fine-tuning, and its end-to-end recognition accuracy is about 30%.

Result

Detection result on ICDAR 2015

Recognition result on random image

Linear Time MSER Demo

The green pixels are the so-called boundary pixels, which are pushed into stacks. Each stack stands for one gray level, and each pixel is pushed onto the stack that matches its gray level.
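
The per-gray-level stacks described above can be sketched as a small data structure. This is an illustration only, assuming 8-bit images (256 gray levels); the names Pixel, BoundaryStacks, push and pop_lowest are hypothetical, and the linear scan for the lowest non-empty level is a simplification chosen for readability rather than speed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel coordinate type.
struct Pixel {
    int x;
    int y;
};

// One stack of boundary pixels per gray level, assuming 8-bit images.
class BoundaryStacks {
public:
    // Push a boundary pixel onto the stack that matches its gray level.
    void push(Pixel p, std::uint8_t gray) { stacks_[gray].push_back(p); }

    // Pop the most recently pushed pixel from the lowest non-empty gray
    // level. Returns false when every stack is empty.
    bool pop_lowest(Pixel& p, int& gray) {
        for (std::size_t g = 0; g < stacks_.size(); ++g) {
            if (!stacks_[g].empty()) {
                p = stacks_[g].back();
                stacks_[g].pop_back();
                gray = static_cast<int>(g);
                return true;
            }
        }
        return false;
    }

private:
    std::array<std::vector<Pixel>, 256> stacks_;
};
```

Pixels with the same gray level come back in last-in, first-out order, and a darker pixel is always returned before a brighter one regardless of when it was pushed.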

References

D. Nister and H. Stewenius, “Linear time maximally stable extremal regions,” European Conference on Computer Vision, pages 183–196, 2008.
L. Neumann and J. Matas, “A method for text localization and recognition in real-world images,” Asian Conference on Computer Vision, pages 770–783, 2010.
L. Neumann and J. Matas, “Real-time scene text localization and recognition,” Computer Vision and Pattern Recognition, pages 3538–3545, 2012.
L. Neumann and J. Matas, “On combining multiple segmentations in scene text recognition,” International Conference on Document Analysis and Recognition, pages 523–527, 2013.
H. Cho, M. Sung and B. Jun, “Canny Text Detector: Fast and robust scene text localization algorithm,” Computer Vision and Pattern Recognition, pages 3566–3573, 2016.
B. Epshtein, E. Ofek, and Y. Wexler, “Detecting text in natural scenes with stroke width transform,” Computer Vision and Pattern Recognition, pages 2963–2970, 2010.
P. Viola and M. J. Jones, “Rapid object detection using a boosted cascade of simple features,” Computer Vision and Pattern Recognition, pages 511–518, 2001.

Comments

Now i am getting this error after running the .exe file

.\canny_text.exe -icdar Error: the input file is not opened!! Error: the input file is not opened!! Error: the Transition Probability Table file is not opened!!

Btw i also tried to open up an image but same error
question

opened by arsalan993 5
I get this error when executing it via VisualStudio2015

The exe file can be compiled and executed, but it did not show anything in the window; after a few seconds I got this: Microsoft C++ Exception: std::length_error.

opened by poetgeorge 4
I am using Visual Studio 2015 Enterprise and getting an error (Error: "node" is ambiguous)?

Debugger takes me to ER.h struct Node { Node(ER v, const int i) : vertex(v), index(i){}; ER vertex; int index; vector adj_list; vector edge_prob; };

typedef vector < Node > Graph;

In Type def vector < Node > Graph , Node is ambiguous

opened by arsalan993 4
What does it mean to be “strong” and “weak”?

I know it uses the machine classifier to reject non-text. But I don't understand what “strong” and “weak” mean. What are the "strong" and "weak" classifications based on? I want to learn about it and improve the classification.

My English is not good, so this may be hard to read. I hope to get your reply.

(Translated from Chinese) What criteria decide whether a classifier is strong or weak? I would like to understand this in depth and improve on it.
question

opened by Sincky 2
feature/update-cmake
This PR adds minor enhancements to the build chain:

takes care of variable casts and thus removes MSVC and GCC warnings

adds modern CMake syntax to CMakeFiles.txt

on Windows: creates a wrapper script, scene_text_recognition.bat, which adds the OpenCV directory into the %PATH%, rather than having the developer add it manually

copies binaries into the root directory

updates the Readme file

Tested on Windows 10 and Linux Ubuntu 18.04.
opened by flriancu 1
Documentation on Training

@HsiehYiChia Thank you for your hard work

After reading your replies on issue #6, it is still unclear how to train our own model. Can you simplify the training process for us?

opened by ghost 1
error while making the scene text recognition

os system: ubuntu 16.04 cmake version 3.5.1 opencv : 3.4

g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

make -j8 [ 9%] Building CXX object CMakeFiles/svm-train.dir/src/svm-train.cpp.o [ 18%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/adaboost.cpp.o [ 27%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o [ 36%] Building CXX object CMakeFiles/svm-train.dir/src/svm.cpp.o [ 54%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/main.cpp.o [ 63%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/SpellingCorrector.cpp.o [ 72%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/svm.cpp.o [ 54%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/OCR.cpp.o [ 81%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o [ 90%] Linking CXX executable svm-train [ 90%] Built target svm-train /home/giuser/Scene-text-recognition/src/utils.cpp: In function ‘void get_lbp_data()’: /home/giuser/Scene-text-recognition/src/utils.cpp:1416:24: warning: ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings] char data_filename = "training/detection_training_data.txt"; ^ [100%] Linking CXX executable scene_text_recognition CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function ERFilter::er_tree_extract(cv::Mat)': ER.cpp:(.text+0x165d): undefined reference tocv::error(int, cv::String const&, char const, char const*, int)' CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function cv::String::String(char const*)': ER.cpp:(.text._ZN2cv6StringC2EPKc[_ZN2cv6StringC5EPKc]+0x54): undefined reference tocv::String::allocate(unsigned long)' CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function cv::String::~String()': ER.cpp:(.text._ZN2cv6StringD2Ev[_ZN2cv6StringD5Ev]+0x14): undefined reference tocv::String::deallocate()' CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function cv::String::operator=(cv::String const&)': ER.cpp:(.text._ZN2cv6StringaSERKS0_[_ZN2cv6StringaSERKS0_]+0x28): undefined reference tocv::String::deallocate()' 
CMakeFiles/scene_text_recognition.dir/src/OCR.cpp.o: In function OCR::extract_feature(cv::Mat&, svm_node*)': OCR.cpp:(.text+0xce1): undefined reference tocv::findContours(cv::_InputOutputArray const&, cv::OutputArray const&, int, int, cv::Point)' OCR.cpp:(.text+0x10b8): undefined reference to cv::normalize(cv::_InputArray const&, cv::_InputOutputArray const&, double, double, int, int, cv::_InputArray const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In functionimage_mode(ERFilter*, char*)': utils.cpp:(.text+0x44e): undefined reference to cv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In functionvideo_mode(ERFilter*, char*)': utils.cpp:(.text+0x977): undefined reference to cv::VideoCapture::VideoCapture(cv::String const&)' utils.cpp:(.text+0xa99): undefined reference tocv::VideoWriter::fourcc(char, char, char, char)' utils.cpp:(.text+0xaed): undefined reference to cv::VideoWriter::open(cv::String const&, int, double, cv::Size_<int>, bool)' utils.cpp:(.text+0xb32): undefined reference tocv::VideoWriter::fourcc(char, char, char, char)' utils.cpp:(.text+0xb86): undefined reference to cv::VideoWriter::open(cv::String const&, int, double, cv::Size_<int>, bool)' utils.cpp:(.text+0x1800): undefined reference tocv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function load_challenge2_test_file(cv::Mat&, int)': utils.cpp:(.text+0x23f8): undefined reference tocv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function load_challenge2_training_file(cv::Mat&, int)': utils.cpp:(.text+0x274b): undefined reference tocv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function show_result(cv::Mat&, cv::Mat&, std::vector<Text, std::allocator<Text> >&, std::vector<double, std::allocator<double> >, std::vector<ER*, std::allocator<ER*> >, 
std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >)': utils.cpp:(.text+0x3526): undefined reference tocv::getTextSize(cv::String const&, int, double, int, int*)' utils.cpp:(.text+0x372c): undefined reference to cv::putText(cv::_InputOutputArray const&, cv::String const&, cv::Point_<int>, int, double, cv::Scalar_<double>, int, int, bool)' utils.cpp:(.text+0x3b2b): undefined reference tocv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x3ba5): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x3c1f): undefined reference tocv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x3c99): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x3d13): undefined reference tocv::imshow(cv::String const&, cv::_InputArray const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o:utils.cpp:(.text+0x3d77): more undefined references to cv::imshow(cv::String const&, cv::_InputArray const&)' follow CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In functiondraw_FPS(cv::Mat&, double)': utils.cpp:(.text+0x41a1): undefined reference to cv::putText(cv::_InputOutputArray const&, cv::String const&, cv::Point_<int>, int, double, cv::Scalar_<double>, int, int, bool)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In functiondraw_linear_time_MSER(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)': utils.cpp:(.text+0x425d): undefined reference to cv::imread(cv::String const&, int)' utils.cpp:(.text+0x42c5): undefined reference 
tocv::VideoWriter::fourcc(char, char, char, char)' utils.cpp:(.text+0x4319): undefined reference to cv::VideoWriter::open(cv::String const&, int, double, cv::Size_<int>, bool)' utils.cpp:(.text+0x4ca3): undefined reference tocv::imshow(cv::String const&, cv::_InputArray const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function draw_multiple_channel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)': utils.cpp:(.text+0x5707): undefined reference tocv::imread(cv::String const&, int)' utils.cpp:(.text+0x5aaf): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x5b23): undefined reference tocv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x5b97): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x5c21): undefined reference tocv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)' utils.cpp:(.text+0x5cba): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator<int> > const&)' utils.cpp:(.text+0x5d53): undefined reference tocv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function output_MSER_time(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)': utils.cpp:(.text+0x60c9): undefined reference tocv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function output_optimal_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)': utils.cpp:(.text+0x6763): undefined reference tocv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function load_gt(int)': utils.cpp:(.text+0x6cf3): undefined reference tocv::imread(cv::String const&, int)' 
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function calc_recall_rate()': utils.cpp:(.text+0x7b2e): undefined reference tocv::MSER::create(int, int, int, double, double, int, double, double, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function bootstrap()': utils.cpp:(.text+0xd10f): undefined reference tocv::imread(cv::String const&, int)' utils.cpp:(.text+0xd34d): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator<int> > const&)' utils.cpp:(.text+0xd4c2): undefined reference tocv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function get_lbp_data()': utils.cpp:(.text+0xd9df): undefined reference tocv::imread(cv::String const&, int)' utils.cpp:(.text+0xdc31): undefined reference to cv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In functionget_ocr_data()': utils.cpp:(.text+0xee47): undefined reference to cv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In functioncv::String::String(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&)': utils.cpp:(.text._ZN2cv6StringC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE[_ZN2cv6StringC5ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE]+0x5d): undefined reference to `cv::String::allocate(unsigned long)' collect2: error: ld returned 1 exit status CMakeFiles/scene_text_recognition.dir/build.make:268: recipe for target 'scene_text_recognition' failed make[2]: *** [scene_text_recognition] Error 1 CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/scene_text_recognition.dir/all' failed make[1]: *** [CMakeFiles/scene_text_recognition.dir/all] Error 2 Makefile:83: recipe for target 'all' failed make: *** [all] Error 2

can any help me out Thanks

opened by vigneshgig 0
Poor Results

Dear HsiehYiChia,

I ran the model with the ".\scene_text_recognition.bat -i res\ICDAR2015_test\img_6.jpg" command and noticed that the result was not the same as the one mentioned. For example, I was expecting to see result1.jpg, but the output was the result. Can you tell me how to fix it? Thank you in advance.

opened by Theologis 0
The positive and negative samples cannot be uncompressed after downloading

Hi! Thank you for your project! Maybe the pos and neg is too large to unzip

I want to use text detection function without OCR,but I don't know how to prepare my own neg and pos. Could you tell me how to prepare training data ? Could you update the training data again ?

opened by ssifei 3
Error executing with a video argument

Hello,

I've successfully compiled the solution and I've been able to test it with some images. However, when passing a video file as argument, it seems that it only process one frame, the first one. I am using as it is written in the docs: "scene_text_recognition.exe -v videofile.name". Am I doing something wrong?

Thanks! Ana

opened by anavc94 2

