TextBoxes++: A Single-Shot Oriented Scene Text Detector

Introduction

This is an application for scene text detection (TextBoxes++) and recognition (CRNN).

TextBoxes++ is a unified framework for oriented scene text detection with a single network. It is an extension of TextBoxes. CRNN is an open-source text recognizer. The code of TextBoxes++ is based on SSD and TextBoxes, and the code of CRNN is modified from the original CRNN implementation.

For more details, please refer to our arXiv paper.

Citing the related works

Please cite the related works in your publications if they help your research:

@article{Liao2018Text,
  title = {{TextBoxes++}: A Single-Shot Oriented Scene Text Detector},
  author = {Minghui Liao and Baoguang Shi and Xiang Bai},
  journal = {{IEEE} Transactions on Image Processing},
  doi  = {10.1109/TIP.2018.2825107},
  url = {https://doi.org/10.1109/TIP.2018.2825107},
  volume = {27},
  number = {8},
  pages = {3676--3690},
  year = {2018}
}

@inproceedings{LiaoSBWL17,
  author    = {Minghui Liao and
               Baoguang Shi and
               Xiang Bai and
               Xinggang Wang and
               Wenyu Liu},
  title     = {TextBoxes: {A} Fast Text Detector with a Single Deep Neural Network},
  booktitle = {AAAI},
  year      = {2017}
}

@article{ShiBY17,
  author    = {Baoguang Shi and
               Xiang Bai and
               Cong Yao},
  title     = {An End-to-End Trainable Neural Network for Image-Based Sequence Recognition
               and Its Application to Scene Text Recognition},
  journal   = {{IEEE} TPAMI},
  volume    = {39},
  number    = {11},
  pages     = {2298--2304},
  year      = {2017}
}

Contents

  1. Requirements
  2. Installation
  3. Docker
  4. Models
  5. Demo
  6. Train

Requirements

NOTE: There is partial support for a Docker image. See docker/README.md. (Thanks to @mdbenito for the PR.)

Torch7 for CRNN;
g++-5; CUDA 8.0; cuDNN v5.1 (cuDNN 6 and cuDNN 7 may fail); OpenCV 3.0

Please refer to Caffe Installation for the other dependencies.

Installation

  1. Compile TextBoxes++ (this is a modified version of Caffe, so you do not need to install the official Caffe)
# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure to add $CAFFE_ROOT/python to your PYTHONPATH (see the import check after these steps).
make py
  2. Compile CRNN (please refer to CRNN if you have trouble with the compilation.)
cd crnn/src/
sh build_cpp.sh
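
Once both steps are done, a quick way to confirm that the pycaffe build from step 1 is importable is the snippet below. This is only a sanity check; it assumes $CAFFE_ROOT/python has been added to PYTHONPATH as noted above.

# Sanity check: the modified pycaffe build should be importable from this repository.
import caffe
print(caffe.__file__)   # should point into this repository's python/ directory
caffe.set_mode_cpu()    # switch to caffe.set_mode_gpu() once CUDA/cuDNN are working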

Docker

(Thanks to @idotobi for the PR.)

Build Docker Image

docker build -t tbpp_crnn:gpu .

This can take more than an hour, so go get a coffee ;)

Once this is done you can start a container via nvidia-docker.

nvidia-docker run -it --rm tbpp_crnn:gpu bash

To check if the GPU is available inside the docker container you can run nvidia-smi.

It's recommended to mount the ./models and ./crnn/model/ directories to include the downloaded models.

nvidia-docker run -it \
                  --rm \
                  -v ${PWD}/models:/opt/caffe/models \
                  -v ${PWD}/crnn/model:/opt/caffe/crnn/model \
                  tbpp_crnn:gpu bash

For convenience, this command is executed when running ./run.bash.

Models

  1. pre-trained model on SynthText (used for training): Dropbox; BaiduYun

  2. model trained on ICDAR 2015 Incidental Text (used for testing): Dropbox; BaiduYun

    Please place the above models in "./models/"

    If your data differs significantly from ICDAR 2015 Incidental Text, you'd better train on your own data, starting from the model pre-trained on SynthText.

  3. CRNN model: Dropbox; BaiduYun

    Please place the crnn model in "./crnn/model/"

Demo

Download the ICDAR 2015 model and place it in "./models/"

python examples/text/demo.py

The detection results and recognition results are in "./demo_images"
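
If you want to script the detector directly instead of going through demo.py, the usual pycaffe/SSD-style flow looks roughly like the sketch below. This is only an assumption-laden sketch: the file and image paths are hypothetical, and the exact output blob name and its column layout should be checked against examples/text/demo.py.

import numpy as np
import caffe

# Hypothetical paths -- adjust to your checkout (see examples/text/demo.py for the real defaults).
deploy = 'models/deploy.prototxt'
weights = 'models/model_icdar15.caffemodel'

caffe.set_mode_gpu()
net = caffe.Net(deploy, weights, caffe.TEST)

# Standard SSD-style preprocessing: CHW layout, BGR channels, VGG mean subtraction.
img = caffe.io.load_image('demo_images/demo.jpg')   # hypothetical image path; HxWx3 RGB in [0, 1]
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([104.0, 117.0, 123.0]))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2, 1, 0))
net.blobs['data'].data[...] = transformer.preprocess('data', img)

output = net.forward()
print(output.keys())   # the detection blob (e.g. 'detection_out') holds label, score and quad/box coordinates per row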

Train

Create lmdb data

  1. Convert the ground truth into XML form: see example.xml

  2. Create train/test lists (train.txt / test.txt) in "./data/text/" with the following form (a helper sketch for generating these lists appears after this list):

     path_to_example1.jpg path_to_example1.xml
     path_to_example2.jpg path_to_example2.xml
    
  3. Run "./data/text/creat_data.sh"
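
For step 2, something like the following can generate the list files. This is only a sketch: it assumes images and their XML annotations sit side by side with matching basenames, and the dataset directory is hypothetical.

import os

# Hypothetical dataset directory holding paired .jpg/.xml files -- adjust to your layout.
data_dir = '/path/to/my_text_dataset'
out_list = './data/text/train.txt'

with open(out_list, 'w') as f:
    for name in sorted(os.listdir(data_dir)):
        if not name.lower().endswith('.jpg'):
            continue
        img_path = os.path.join(data_dir, name)
        xml_path = os.path.splitext(img_path)[0] + '.xml'
        if os.path.exists(xml_path):
            # One "image_path annotation_path" pair per line, as expected by creat_data.sh.
            f.write('%s %s\n' % (img_path, xml_path))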

Start training

1. Modify the lmdb path in modelConfig.py
2. Run "python examples/text/train.py"
Comments
  • Result not reproducible for ICDAR2015

    Result not reproducible for ICDAR2015

    Hi, I'm trying to reproduce the results reported in the paper, but they do not seem to be reproducible.

    I'm using the pre-trained model downloaded from your link, which is for ICDAR2015 (model_icdar15.caffemodel).

    I've tested in two ways. These are my parameters; the rest is as-is, except for the loops added for batch inference.

    For the evaluation, I'm using the official evaluation code downloaded from the ICDAR challenge.

    Below is the link to the test set.

    Config 1

    demo.py

    'input_height' : 1024,
    'input_width' : 1024,
    'overlap_threshold' : 0.2,
    

    deploy.prototxt

    dim: 1024
    dim: 1024
    

    Result 1

    Calculated!{"recall": 0.7448242657679345, "precision": 0.815068493150685, "hmean": 0.7783647798742138, "AP": 0}
    

    Config 2

    Same as Config 1 but, demo.py

    'overlap_threshold' : 0.5,
    

    Result 2

    Calculated!{"recall": 0.8078960038517092, "precision": 0.7198627198627199, "hmean": 0.7613430127041741, "AP": 0}
    

    Neither of the results seems to match the performance mentioned in the paper.

    Please help if there are other parameters to be tuned.

    opened by bado-lee 13
  • strange demo.py detection result

    strange demo.py detection result

    I used demo.py to run only the detection on the demo image; however, I got some weird results:

    681,232,604,233,1,1,1,1,0.9996038
    658,94,615,93,1,1,1,1,0.9966356
    662,160,620,160,1,1,1,1,0.996449
    659,139,616,138,1,1,1,1,0.99529016
    658,117,606,116,1,1,1,1,0.8200721

    I am running on model_icdar15.caffemodel. Does anyone have a clue?

    opened by ziyanghong 10
  • Caffe error when building

    Caffe error when building

    I'm trying to build the custom Caffe in TextBoxes++, but I'm getting an error. Please help!

    Rash-MacBook-Pro:TextBoxes++ rashmendis$ make -j8
    CXX src/caffe/common.cpp
    CXX src/caffe/data_transformer.cpp
    CXX src/caffe/layers/ctc_decoder_layer.cpp
    CXX src/caffe/layers/tanh_layer.cpp
    CXX src/caffe/layers/smooth_L1_loss_layer.cpp
    CXX src/caffe/layers/bnll_layer.cpp
    CXX src/caffe/layers/relu_layer.cpp
    CXX src/caffe/layers/spp_layer.cpp
    CXX src/caffe/layers/argmax_layer.cpp
    CXX src/caffe/layers/sigmoid_layer.cpp
    CXX src/caffe/layers/crop_layer.cpp
    CXX src/caffe/layers/cudnn_pooling_layer.cpp
    CXX src/caffe/layers/prior_box_layer.cpp
    CXX src/caffe/layers/multinomial_logistic_loss_layer.cpp
    CXX src/caffe/layers/cudnn_lcn_layer.cpp
    CXX src/caffe/layers/exp_layer.cpp
    CXX src/caffe/layers/cudnn_conv_layer.cpp
    CXX src/caffe/layers/log_layer.cpp
    CXX src/caffe/layers/deconv_layer.cpp
    CXX src/caffe/layers/euclidean_loss_layer.cpp
    CXX src/caffe/layers/inner_product_layer.cpp
    CXX src/caffe/layers/reduction_layer.cpp
    CXX src/caffe/layers/mvn_layer.cpp
    CXX src/caffe/layers/batch_reindex_layer.cpp
    CXX src/caffe/layers/recurrent_layer.cpp
    CXX src/caffe/layers/input_layer.cpp
    CXX src/caffe/layers/image_data_layer.cpp
    CXX src/caffe/layers/conv_layer.cpp
    CXX src/caffe/layers/roi_pooling_layer.cpp
    CXX src/caffe/layers/normalize_layer.cpp
    CXX src/caffe/layers/detection_evaluate_layer.cpp
    CXX src/caffe/layers/lrn_layer.cpp
    CXX src/caffe/layers/prelu_layer.cpp
    CXX src/caffe/layers/permute_layer.cpp
    CXX src/caffe/layers/base_conv_layer.cpp
    CXX src/caffe/layers/cudnn_tanh_layer.cpp
    CXX src/caffe/layers/absval_layer.cpp
    CXX src/caffe/layers/batch_norm_layer.cpp
    CXX src/caffe/layers/cudnn_softmax_layer.cpp
    CXX src/caffe/layers/eltwise_layer.cpp
    CXX src/caffe/layers/threshold_layer.cpp
    CXX src/caffe/layers/cudnn_relu_layer.cpp
    CXX src/caffe/layers/cudnn_lrn_layer.cpp
    CXX src/caffe/layers/hdf5_output_layer.cpp
    CXX src/caffe/layers/accuracy_layer.cpp
    CXX src/caffe/layers/window_data_layer.cpp
    CXX src/caffe/layers/bias_layer.cpp
    CXX src/caffe/layers/silence_layer.cpp
    CXX src/caffe/layers/rnn_layer.cpp
    CXX src/caffe/layers/scale_layer.cpp
    CXX src/caffe/layers/base_data_layer.cpp
    CXX src/caffe/layers/neuron_layer.cpp
    CXX src/caffe/layers/filter_layer.cpp
    CXX src/caffe/layers/parameter_layer.cpp
    CXX src/caffe/layers/data_layer.cpp
    CXX src/caffe/layers/memory_data_layer.cpp
    CXX src/caffe/layers/softmax_loss_layer.cpp
    CXX src/caffe/layers/elu_layer.cpp
    CXX src/caffe/layers/sigmoid_cross_entropy_loss_layer.cpp
    CXX src/caffe/layers/lstm_unit_layer.cpp
    CXX src/caffe/layers/embed_layer.cpp
    CXX src/caffe/layers/reverse_layer.cpp
    CXX src/caffe/layers/multibox_loss_layer.cpp
    CXX src/caffe/layers/im2col_layer.cpp
    CXX src/caffe/layers/softmax_layer.cpp
    CXX src/caffe/layers/loss_layer.cpp
    CXX src/caffe/layers/contrastive_loss_layer.cpp
    CXX src/caffe/layers/dummy_data_layer.cpp
    CXX src/caffe/layers/hinge_loss_layer.cpp
    CXX src/caffe/layers/dropout_layer.cpp
    CXX src/caffe/layers/concat_layer.cpp
    CXX src/caffe/layers/flatten_layer.cpp
    CXX src/caffe/layers/split_layer.cpp
    CXX src/caffe/layers/infogain_loss_layer.cpp
    CXX src/caffe/layers/cudnn_sigmoid_layer.cpp
    CXX src/caffe/layers/detection_output_layer.cpp
    CXX src/caffe/layers/hdf5_data_layer.cpp
    CXX src/caffe/layers/video_data_layer.cpp
    CXX src/caffe/layers/pooling_layer.cpp
    CXX src/caffe/layers/power_layer.cpp
    CXX src/caffe/layers/reshape_layer.cpp
    CXX src/caffe/layers/ctc_loss_layer.cpp
    CXX src/caffe/layers/tile_layer.cpp
    CXX src/caffe/layers/annotated_data_layer.cpp
    CXX src/caffe/layers/lstm_layer.cpp
    CXX src/caffe/layers/slice_layer.cpp
    CXX src/caffe/parallel.cpp
    CXX src/caffe/util/im2col.cpp
    CXX src/caffe/util/db_lmdb.cpp
    CXX src/caffe/util/im_transforms.cpp
    CXX src/caffe/util/bbox_util.cpp
    CXX src/caffe/util/db.cpp
    CXX src/caffe/util/upgrade_proto.cpp
    CXX src/caffe/util/benchmark.cpp
    CXX src/caffe/util/cudnn.cpp
    CXX src/caffe/util/math_functions.cpp
    CXX src/caffe/util/sampler.cpp
    src/caffe/util/im_transforms.cpp:28:13: warning: unused variable 'prob_eps' [-Wunused-const-variable]
    const float prob_eps = 0.01;
                ^
    1 warning generated.
    CXX src/caffe/util/hdf5.cpp
    CXX src/caffe/util/signal_handler.cpp
    CXX src/caffe/util/blocking_queue.cpp
    CXX src/caffe/util/io.cpp
    CXX src/caffe/util/db_leveldb.cpp
    CXX src/caffe/util/insert_splits.cpp
    CXX src/caffe/layer_factory.cpp
    CXX src/caffe/blob.cpp
    CXX src/caffe/layer.cpp
    CXX src/caffe/solvers/nesterov_solver.cpp
    src/caffe/util/bbox_util.cpp:723:30: error: no member named 'format' in namespace 'boost'
      std::string s = str(boost::format("POLYGON((%1% %2%, %3% %4%, %5% %6%, %7% %8%, %9% %10%))") % x1 % y1 % x2 % y2 % x3 % y3 % x4 % y4 % x1 % y1);
                          ~~~~~~~^
    src/caffe/util/bbox_util.cpp:743:30: error: no member named 'format' in namespace 'boost'
      std::string s = str(boost::format("POLYGON((%1% %2%, %3% %4%, %5% %6%, %7% %8%, %9% %10%))") % x1 % y1 % x2 % y2 % x3 % y3 % x4 % y4 % x1 % y1);
                          ~~~~~~~^
    src/caffe/util/bbox_util.cpp:836:3: error: no matching function for call to 'RboxToPolygon'
      RboxToPolygon(rbox1, polygon1);
      ^~~~~~~~~~~~~
    src/caffe/util/bbox_util.cpp:850:16: note: in instantiation of function template specialization 'caffe::RboxOverlap<float>' requested here
    template float RboxOverlap(const float* rbox1, const float* rbox2);
                   ^
    src/caffe/util/bbox_util.cpp:708:6: note: candidate function not viable: no known conversion from 'const float *' to 'const caffe::NormalizedRBox' for 1st argument
    void RboxToPolygon(const NormalizedRBox rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:728:6: note: candidate template ignored: substitution failure [with Dtype = float]
    void RboxToPolygon(const Dtype* rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:837:3: error: no matching function for call to 'RboxToPolygon'
      RboxToPolygon(rbox2, polygon2);
      ^~~~~~~~~~~~~
    src/caffe/util/bbox_util.cpp:708:6: note: candidate function not viable: no known conversion from 'const float *' to 'const caffe::NormalizedRBox' for 1st argument
    void RboxToPolygon(const NormalizedRBox rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:728:6: note: candidate template ignored: substitution failure [with Dtype = float]
    void RboxToPolygon(const Dtype* rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:836:3: error: no matching function for call to 'RboxToPolygon'
      RboxToPolygon(rbox1, polygon1);
      ^~~~~~~~~~~~~
    src/caffe/util/bbox_util.cpp:851:17: note: in instantiation of function template specialization 'caffe::RboxOverlap<double>' requested here
    template double RboxOverlap(const double* rbox1, const double* rbox2);
                    ^
    src/caffe/util/bbox_util.cpp:708:6: note: candidate function not viable: no known conversion from 'const double *' to 'const caffe::NormalizedRBox' for 1st argument
    void RboxToPolygon(const NormalizedRBox rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:728:6: note: candidate template ignored: substitution failure [with Dtype = double]
    void RboxToPolygon(const Dtype* rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:837:3: error: no matching function for call to 'RboxToPolygon'
      RboxToPolygon(rbox2, polygon2);
      ^~~~~~~~~~~~~
    src/caffe/util/bbox_util.cpp:708:6: note: candidate function not viable: no known conversion from 'const double *' to 'const caffe::NormalizedRBox' for 1st argument
    void RboxToPolygon(const NormalizedRBox rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:728:6: note: candidate template ignored: substitution failure [with Dtype = double]
    void RboxToPolygon(const Dtype* rbox, Polygon& polygon){
         ^
    src/caffe/util/bbox_util.cpp:3627:25: error: no matching function for call to 'RboxOverlap'
            float overlap = RboxOverlap(rboxes + idx * 9, rboxes + kept_idx * 9);
                            ^~~~~~~~~~~
    src/caffe/util/bbox_util.cpp:3644:6: note: in instantiation of function template specialization 'caffe::ApplyNMSFastRBox<float>' requested here
    void ApplyNMSFastRBox(const float* rboxes, const float* scores, const int num,
         ^
    src/caffe/util/bbox_util.cpp:750:7: note: candidate function not viable: no known conversion from 'const float *' to 'const caffe::NormalizedRBox' for 1st argument
    float RboxOverlap(const NormalizedRBox& rbox1, const NormalizedRBox& rbox2) {
          ^
    src/caffe/util/bbox_util.cpp:827:7: note: candidate template ignored: substitution failure [with Dtype = float]
    Dtype RboxOverlap(const Dtype* rbox1, const Dtype* rbox2) {
          ^
    src/caffe/util/bbox_util.cpp:3627:25: error: no matching function for call to 'RboxOverlap'
            float overlap = RboxOverlap(rboxes + idx * 9, rboxes + kept_idx * 9);
                            ^~~~~~~~~~~
    src/caffe/util/bbox_util.cpp:3648:6: note: in instantiation of function template specialization 'caffe::ApplyNMSFastRBox<double>' requested here
    void ApplyNMSFastRBox(const double* rboxes, const double* scores, const int num,
         ^
    src/caffe/util/bbox_util.cpp:750:7: note: candidate function not viable: no known conversion from 'const double *' to 'const caffe::NormalizedRBox' for 1st argument
    float RboxOverlap(const NormalizedRBox& rbox1, const NormalizedRBox& rbox2) {
          ^
    src/caffe/util/bbox_util.cpp:827:7: note: candidate template ignored: substitution failure [with Dtype = double]
    Dtype RboxOverlap(const Dtype* rbox1, const Dtype* rbox2) {
          ^
    8 errors generated.
    make: *** [.build_release/src/caffe/util/bbox_util.o] Error 1
    make: *** Waiting for unfinished jobs....
    src/caffe/util/blocking_queue.cpp:50:7: warning: unused typedef 'INVALID_REQUESTED_LOG_SEVERITY' [-Wunused-local-typedef]
          LOG_EVERY_N(INFO, 1000)<< log_on_wait;
          ^
    /usr/local/include/glog/logging.h:943:30: note: expanded from macro 'LOG_EVERY_N'
                                 INVALID_REQUESTED_LOG_SEVERITY);           \
                                 ^
    1 warning generated.
    Rash-MacBook-Pro:TextBoxes++ rashmendis$ 
    
    
    opened by rashmendis 8
  • the loss declined so slowly!

    the loss declined so slowly!

    I trained the model on RCTW-17 and some document pictures, but the loss declined very slowly. I have spent 12 hours, and the model has only been trained for 3000 iterations. The loss went from 4.71 to 3.47 in 12 hours. TOO SLOW! Is there anything wrong?

    Below is the printed log.

    I0320 18:24:14.414151 32854 blocking_queue.cpp:50] Data layer prefetch queue empty I0320 18:46:59.492482 32854 solver.cpp:243] Iteration 100, loss = 4.71355 I0320 18:46:59.492624 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.7507 (* 1 = 4.7507 loss) I0320 18:47:18.494647 32854 sgd_solver.cpp:138] Iteration 100, lr = 0.0001 I0320 19:09:15.981941 32854 solver.cpp:243] Iteration 200, loss = 4.25057 I0320 19:09:15.982163 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.12665 (* 1 = 4.12665 loss) I0320 19:09:35.695390 32854 sgd_solver.cpp:138] Iteration 200, lr = 0.0001 I0320 19:28:36.370605 32854 solver.cpp:243] Iteration 300, loss = 4.06544 I0320 19:28:36.370846 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.81313 (* 1 = 3.81313 loss) I0320 19:28:40.893465 32854 sgd_solver.cpp:138] Iteration 300, lr = 0.0001 I0320 19:45:51.642319 32854 solver.cpp:243] Iteration 400, loss = 4.00686 I0320 19:45:51.657042 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.86898 (* 1 = 3.86898 loss) I0320 19:45:52.827033 32854 sgd_solver.cpp:138] Iteration 400, lr = 0.0001 I0320 20:09:07.978211 32854 solver.cpp:243] Iteration 500, loss = 4.04271 I0320 20:09:08.014988 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.87323 (* 1 = 3.87323 loss) I0320 20:09:08.015076 32854 sgd_solver.cpp:138] Iteration 500, lr = 0.0001 I0320 20:40:22.733320 32854 solver.cpp:243] Iteration 600, loss = 3.8724 I0320 20:40:22.733603 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.91798 (* 1 = 3.91798 loss) I0320 20:40:22.733654 32854 sgd_solver.cpp:138] Iteration 600, lr = 0.0001 I0320 21:14:05.252872 32854 solver.cpp:243] Iteration 700, loss = 3.91969 I0320 21:14:05.254623 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.73075 (* 1 = 3.73075 loss) I0320 21:14:12.680538 32854 sgd_solver.cpp:138] Iteration 700, lr = 0.0001 I0320 21:49:44.001958 32854 solver.cpp:243] Iteration 800, loss = 3.86093 I0320 21:49:44.002271 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.65883 (* 1 = 4.65883 loss) I0320 21:49:49.958609 32854 sgd_solver.cpp:138] Iteration 800, lr = 0.0001 I0320 21:53:01.317153 33124 blocking_queue.cpp:50] Data layer prefetch queue empty I0320 22:23:29.791314 32854 solver.cpp:243] Iteration 900, loss = 3.8187 I0320 22:23:29.791862 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.85123 (* 1 = 3.85123 loss) I0320 22:23:47.945513 32854 sgd_solver.cpp:138] Iteration 900, lr = 0.0001 I0320 22:55:44.925760 32854 solver.cpp:243] Iteration 1000, loss = 3.7425 I0320 22:55:44.926020 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.01863 (* 1 = 4.01863 loss) I0320 22:55:50.410193 32854 sgd_solver.cpp:138] Iteration 1000, lr = 0.0001 I0320 23:28:20.539289 32854 solver.cpp:243] Iteration 1100, loss = 3.8035 I0320 23:28:20.548061 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.7838 (* 1 = 3.7838 loss) I0320 23:28:25.631153 32854 sgd_solver.cpp:138] Iteration 1100, lr = 0.0001 I0320 23:56:49.150030 32854 solver.cpp:243] Iteration 1200, loss = 3.72246 I0320 23:56:49.150259 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.75223 (* 1 = 3.75223 loss) I0320 23:57:11.602012 32854 sgd_solver.cpp:138] Iteration 1200, lr = 0.0001 I0321 00:26:45.537853 32854 solver.cpp:243] Iteration 1300, loss = 3.72136 I0321 00:26:45.538089 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.38236 (* 1 = 3.38236 loss) I0321 00:26:45.538132 32854 sgd_solver.cpp:138] Iteration 1300, lr = 0.0001 I0321 00:55:09.132469 32854 solver.cpp:243] Iteration 1400, loss = 
3.6804 I0321 00:55:09.132679 32854 solver.cpp:259] Train net output #0: mbox_loss = 2.87209 (* 1 = 2.87209 loss) I0321 00:55:16.543397 32854 sgd_solver.cpp:138] Iteration 1400, lr = 0.0001 I0321 01:31:36.330186 32854 solver.cpp:243] Iteration 1500, loss = 3.64987 I0321 01:31:36.337105 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.11438 (* 1 = 4.11438 loss) I0321 01:32:01.195981 32854 sgd_solver.cpp:138] Iteration 1500, lr = 0.0001 I0321 02:03:18.576989 32854 solver.cpp:243] Iteration 1600, loss = 3.61781 I0321 02:03:18.622884 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.40459 (* 1 = 3.40459 loss) I0321 02:03:18.634691 32854 sgd_solver.cpp:138] Iteration 1600, lr = 0.0001 I0321 02:21:28.850323 32854 blocking_queue.cpp:50] Data layer prefetch queue empty I0321 02:31:13.534430 32854 solver.cpp:243] Iteration 1700, loss = 3.64082 I0321 02:31:13.539944 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.20621 (* 1 = 3.20621 loss) I0321 02:31:35.534457 32854 sgd_solver.cpp:138] Iteration 1700, lr = 0.0001 I0321 03:02:24.651046 32854 solver.cpp:243] Iteration 1800, loss = 3.59635 I0321 03:02:24.651306 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.84726 (* 1 = 3.84726 loss) I0321 03:02:35.240459 32854 sgd_solver.cpp:138] Iteration 1800, lr = 0.0001 I0321 03:28:55.246892 32854 solver.cpp:243] Iteration 1900, loss = 3.64389 I0321 03:28:55.247161 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.57005 (* 1 = 4.57005 loss) I0321 03:29:12.756068 32854 sgd_solver.cpp:138] Iteration 1900, lr = 0.0001 I0321 03:56:30.419265 32854 solver.cpp:243] Iteration 2000, loss = 3.64181 I0321 03:56:30.419488 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.22497 (* 1 = 3.22497 loss) I0321 03:56:30.419528 32854 sgd_solver.cpp:138] Iteration 2000, lr = 0.0001 I0321 04:22:28.350476 32854 solver.cpp:243] Iteration 2100, loss = 3.6115 I0321 04:22:28.356010 32854 solver.cpp:259] Train net output #0: mbox_loss = 4.16131 (* 1 = 4.16131 loss) I0321 04:22:28.356235 32854 sgd_solver.cpp:138] Iteration 2100, lr = 0.0001 I0321 04:48:44.036162 32854 solver.cpp:243] Iteration 2200, loss = 3.56413 I0321 04:48:44.036393 32854 solver.cpp:259] Train net output #0: mbox_loss = 2.78308 (* 1 = 2.78308 loss) I0321 04:48:44.036433 32854 sgd_solver.cpp:138] Iteration 2200, lr = 0.0001 I0321 05:14:25.236363 32854 solver.cpp:243] Iteration 2300, loss = 3.6338 I0321 05:14:25.236584 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.5336 (* 1 = 3.5336 loss) I0321 05:14:28.135422 32854 sgd_solver.cpp:138] Iteration 2300, lr = 0.0001 I0321 05:39:43.394588 32854 solver.cpp:243] Iteration 2400, loss = 3.59064 I0321 05:39:43.394793 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.53245 (* 1 = 3.53245 loss) I0321 05:39:45.348371 32854 sgd_solver.cpp:138] Iteration 2400, lr = 0.0001 I0321 06:02:20.365394 33124 blocking_queue.cpp:50] Data layer prefetch queue empty I0321 06:08:23.288305 32854 solver.cpp:243] Iteration 2500, loss = 3.58889 I0321 06:08:23.289111 32854 solver.cpp:259] Train net output #0: mbox_loss = 2.61665 (* 1 = 2.61665 loss) I0321 06:08:28.295343 32854 sgd_solver.cpp:138] Iteration 2500, lr = 0.0001 I0321 06:43:53.267843 32854 solver.cpp:243] Iteration 2600, loss = 3.47278 I0321 06:43:53.268203 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.52838 (* 1 = 3.52838 loss) I0321 06:44:07.155161 32854 sgd_solver.cpp:138] Iteration 2600, lr = 0.0001 I0321 07:21:03.146634 32854 solver.cpp:243] Iteration 2700, loss = 3.50657 I0321 07:21:03.146906 32854 solver.cpp:259] 
Train net output #0: mbox_loss = 3.22891 (* 1 = 3.22891 loss) I0321 07:21:30.132973 32854 sgd_solver.cpp:138] Iteration 2700, lr = 0.0001 I0321 07:52:57.703928 32854 solver.cpp:243] Iteration 2800, loss = 3.47005 I0321 07:52:57.704133 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.63058 (* 1 = 3.63058 loss) I0321 07:52:57.704190 32854 sgd_solver.cpp:138] Iteration 2800, lr = 0.0001 I0321 08:19:26.086001 32854 solver.cpp:243] Iteration 2900, loss = 3.49502 I0321 08:19:26.086308 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.85055 (* 1 = 3.85055 loss) I0321 08:19:26.086344 32854 sgd_solver.cpp:138] Iteration 2900, lr = 0.0001 I0321 08:48:30.159132 32854 solver.cpp:243] Iteration 3000, loss = 3.47565 I0321 08:48:30.159410 32854 solver.cpp:259] Train net output #0: mbox_loss = 3.78331 (* 1 = 3.78331 loss) I0321 08:48:30.159449 32854 sgd_solver.cpp:138] Iteration 3000, lr = 0.0001

    opened by justttry 7
  • Performs badly on dense large-angle text

    Performs badly on dense large-angle text

    I find that textboxes_plusplus has a fundamental weakness because of the matching strategy between anchors and ground truth, so it performs badly when large-angle texts are densely laid out. EAST is good in this situation.

    opened by Yuanhang8605 6
  • the model does not converge

    the model does not converge

    I trained this model on ICDAR2015, but the loss is still about 2 and the model does not converge.

    Anything wrong?

    Below is the printed log:

    I0322 10:26:27.195602 34365 solver.cpp:243] Iteration 10000, loss = 2.25461 I0322 10:26:27.195647 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.25461 (* 1 = 2.25461 loss) I0322 10:26:27.195674 34365 sgd_solver.cpp:138] Iteration 10000, lr = 5e-05 I0322 10:35:39.117563 34365 solver.cpp:243] Iteration 10100, loss = 2.57042 I0322 10:35:39.120899 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.69033 (* 1 = 2.69033 loss) I0322 10:35:39.120955 34365 sgd_solver.cpp:138] Iteration 10100, lr = 5e-05 I0322 10:45:15.218400 34365 solver.cpp:243] Iteration 10200, loss = 2.48241 I0322 10:45:15.226205 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.52062 (* 1 = 2.52062 loss) I0322 10:45:15.226236 34365 sgd_solver.cpp:138] Iteration 10200, lr = 5e-05 I0322 10:54:54.284951 34365 solver.cpp:243] Iteration 10300, loss = 2.47637 I0322 10:54:54.287142 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.45467 (* 1 = 2.45467 loss) I0322 10:54:54.287153 34365 sgd_solver.cpp:138] Iteration 10300, lr = 5e-05 I0322 11:04:10.167562 34365 solver.cpp:243] Iteration 10400, loss = 2.44925 I0322 11:04:10.226392 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.71287 (* 1 = 1.71287 loss) I0322 11:04:10.226441 34365 sgd_solver.cpp:138] Iteration 10400, lr = 5e-05 I0322 11:13:35.929177 34365 solver.cpp:243] Iteration 10500, loss = 2.3995 I0322 11:13:35.930341 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.18376 (* 1 = 2.18376 loss) I0322 11:13:35.930351 34365 sgd_solver.cpp:138] Iteration 10500, lr = 5e-05 I0322 11:22:45.384930 34365 solver.cpp:243] Iteration 10600, loss = 2.3046 I0322 11:22:45.385200 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.29681 (* 1 = 2.29681 loss) I0322 11:22:45.385210 34365 sgd_solver.cpp:138] Iteration 10600, lr = 5e-05 I0322 11:30:44.026199 34365 solver.cpp:243] Iteration 10700, loss = 2.28321 I0322 11:30:44.026521 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.3693 (* 1 = 2.3693 loss) I0322 11:30:44.026530 34365 sgd_solver.cpp:138] Iteration 10700, lr = 5e-05 I0322 11:38:26.737128 34365 solver.cpp:243] Iteration 10800, loss = 2.26945 I0322 11:38:26.754946 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.36171 (* 1 = 2.36171 loss) I0322 11:38:26.754990 34365 sgd_solver.cpp:138] Iteration 10800, lr = 5e-05 I0322 11:46:07.016868 34365 solver.cpp:243] Iteration 10900, loss = 2.26229 I0322 11:46:07.018682 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.51168 (* 1 = 2.51168 loss) I0322 11:46:07.018692 34365 sgd_solver.cpp:138] Iteration 10900, lr = 5e-05 I0322 11:47:52.859237 34365 blocking_queue.cpp:50] Data layer prefetch queue empty I0322 11:54:49.258191 34365 solver.cpp:243] Iteration 11000, loss = 2.22473 I0322 11:54:49.259897 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.4543 (* 1 = 2.4543 loss) I0322 11:54:49.259907 34365 sgd_solver.cpp:138] Iteration 11000, lr = 5e-05 I0322 12:03:11.798053 34365 solver.cpp:243] Iteration 11100, loss = 2.20064 I0322 12:03:11.799800 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.77588 (* 1 = 1.77588 loss) I0322 12:03:11.799813 34365 sgd_solver.cpp:138] Iteration 11100, lr = 5e-05 I0322 12:11:35.804178 34365 solver.cpp:243] Iteration 11200, loss = 2.20924 I0322 12:11:35.805752 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.09673 (* 1 = 2.09673 loss) I0322 12:11:35.805763 34365 sgd_solver.cpp:138] Iteration 11200, lr = 5e-05 I0322 12:20:18.096740 34365 solver.cpp:243] Iteration 11300, loss = 2.21733 I0322 12:20:18.096981 34365 solver.cpp:259] 
Train net output #0: mbox_loss = 2.88026 (* 1 = 2.88026 loss) I0322 12:20:18.096990 34365 sgd_solver.cpp:138] Iteration 11300, lr = 5e-05 I0322 12:28:30.917421 34365 solver.cpp:243] Iteration 11400, loss = 2.17975 I0322 12:28:30.930032 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.96478 (* 1 = 1.96478 loss) I0322 12:28:30.930105 34365 sgd_solver.cpp:138] Iteration 11400, lr = 5e-05 I0322 12:37:02.980835 34365 solver.cpp:243] Iteration 11500, loss = 2.12864 I0322 12:37:02.981086 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.68564 (* 1 = 2.68564 loss) I0322 12:37:02.981096 34365 sgd_solver.cpp:138] Iteration 11500, lr = 5e-05 I0322 12:45:51.920763 34365 solver.cpp:243] Iteration 11600, loss = 2.17097 I0322 12:45:51.932497 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.92408 (* 1 = 1.92408 loss) I0322 12:45:51.932554 34365 sgd_solver.cpp:138] Iteration 11600, lr = 5e-05 I0322 12:54:13.096737 34365 solver.cpp:243] Iteration 11700, loss = 2.16561 I0322 12:54:13.096983 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.31518 (* 1 = 2.31518 loss) I0322 12:54:13.096993 34365 sgd_solver.cpp:138] Iteration 11700, lr = 5e-05 I0322 13:02:49.026772 34365 solver.cpp:243] Iteration 11800, loss = 2.16566 I0322 13:02:49.027006 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.26004 (* 1 = 2.26004 loss) I0322 13:02:49.027016 34365 sgd_solver.cpp:138] Iteration 11800, lr = 5e-05 I0322 13:11:57.636726 34365 solver.cpp:243] Iteration 11900, loss = 2.17491 I0322 13:11:57.636976 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.8108 (* 1 = 1.8108 loss) I0322 13:11:57.636991 34365 sgd_solver.cpp:138] Iteration 11900, lr = 5e-05 I0322 13:16:56.752300 34365 blocking_queue.cpp:50] Data layer prefetch queue empty I0322 13:21:02.483294 34365 solver.cpp:243] Iteration 12000, loss = 2.14385 I0322 13:21:02.483568 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.73836 (* 1 = 2.73836 loss) I0322 13:21:02.483580 34365 sgd_solver.cpp:138] Iteration 12000, lr = 5e-05 I0322 13:29:42.016816 34365 solver.cpp:243] Iteration 12100, loss = 2.1218 I0322 13:29:42.018276 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.38685 (* 1 = 2.38685 loss) I0322 13:29:42.018286 34365 sgd_solver.cpp:138] Iteration 12100, lr = 5e-05 I0322 13:38:40.395782 34365 solver.cpp:243] Iteration 12200, loss = 2.12648 I0322 13:38:40.396193 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.80988 (* 1 = 1.80988 loss) I0322 13:38:40.396208 34365 sgd_solver.cpp:138] Iteration 12200, lr = 5e-05 I0322 13:47:24.710381 34365 solver.cpp:243] Iteration 12300, loss = 2.10221 I0322 13:47:24.711777 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.23219 (* 1 = 2.23219 loss) I0322 13:47:24.711788 34365 sgd_solver.cpp:138] Iteration 12300, lr = 5e-05 I0322 13:56:28.187919 34365 solver.cpp:243] Iteration 12400, loss = 2.13928 I0322 13:56:28.190034 34365 solver.cpp:259] Train net output #0: mbox_loss = 2.42755 (* 1 = 2.42755 loss) I0322 13:56:28.190047 34365 sgd_solver.cpp:138] Iteration 12400, lr = 5e-05 I0322 14:05:10.886636 34365 solver.cpp:243] Iteration 12500, loss = 2.04594 I0322 14:05:10.886910 34365 solver.cpp:259] Train net output #0: mbox_loss = 1.95077 (* 1 = 1.95077 loss) I0322 14:05:10.886921 34365 sgd_solver.cpp:138] Iteration 12500, lr = 5e-05

    opened by justttry 6
  • model parameters

    model parameters

    I am trying to understand the values in models/deploy.prototxt. Per your paper, for the first two phases of training, you train at 384x384 input resolution, and then increase to 768x768 during the last phase of training. It is my understanding that the step values for the prior boxes are essentially 384/(width of the input feature map). By my calculations, the feature map inputs for the priorbox layers at input resolution 384x384 are (48x48, 24x24, 12x12, 6x6, 4x4, 2x2). Accordingly, I would think the "step" sizes should be (8, 16, 32, 64, 96, 192), but in deploy.prototxt they are (8, 16, 32, 64, 100, 300). Where did 100 and 300 come from?

    Also, why is min_size 30 for both of the first two priorbox layers? I would think min_size for the second priorbox layer would be 60, following the pattern of the rest...

    One final question: what parameters did you use for the Adam optimizer, in terms of beta1, beta2, and epsilon, when training on SynthText?

    Thank you so much for your help!

    opened by mglezer 6
  • Training on icdar2015-TRW

    Training on icdar2015-TRW

    Nice work! Did you try to train the model on "ICDAR 2015 Competition on Text Reading in the Wild"(icdar2015-TRW) dataset? Is it good to use this method for long sentences? Would you kindly share the trained model on icdar2015-TRW? Thanks!

    opened by chasonlee 6
  • Prototxt for pre-trained model

    Prototxt for pre-trained model

    Hi @MhLiao, thank you for writing this code.

    I am converting this code into a Keras tutorial.

    First I need to convert the pre-trained model weights model_pre_train_syn.caffemodel into HDF5 format in order to use them in Keras. To convert the .caffemodel into .hdf5 format, I need a .prototxt file, which I cannot find; there is one deploy.prototxt, which is used on top of this pre-trained model, if I am not wrong.

    Is there any way to convert or use these caffemodel weights in Keras and then create a layer on top of that?

    opened by shankarj67 5
  • unable to compile crnn on Ubuntu 16.04

    unable to compile crnn on Ubuntu 16.04

    torch/install/include/thpp/Storage.h:22:43: fatal error: thpp/if/gen-cpp2/Tensor_types.h: No such file or directory compilation terminated. CMakeFiles/crnn.dir/build.make:86: recipe for target 'CMakeFiles/crnn.dir/ctc.cpp.o' failed make[2]: *** [CMakeFiles/crnn.dir/ctc.cpp.o] Error 1 CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/crnn.dir/all' failed make[1]: *** [CMakeFiles/crnn.dir/all] Error 2 Makefile:83: recipe for target 'all' failed make: *** [all] Error 2 cp: cannot stat '*.so': No such file or directory

    opened by saxenarohit 5
  • comparing ICDAR2013 results

    comparing ICDAR2013 results

    Hi. Since the results from TextBoxes++ are quadrilaterals, how did you compare against ICDAR2013? This is how I did it:

        x1 = result[0] # x is column    LEFT
        y1 = result[1] # y is row       TOP
        x2 = result[2] #                RIGHT
        y2 = result[3] #                 TOP
        x3 = result[4] #                RIGHT
        y3 = result[5] #                BOTTOM
        x4 = result[6] #                LEFT
        y4 = result[7] #                BOTTOM
        left = int(.5*(x1+x4))
        rt = int(.5*(x2+x3))
        top = int(.5*(y2+y1))
        bot = int(.5*(y3+y4))
    

    Am I correct?
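
    For reference, a common way to reduce such a quadrilateral to an axis-aligned rectangle (not necessarily what the authors used for the ICDAR2013 comparison) is simply to take the min/max of the corner coordinates:

        # Axis-aligned bounding box of a quadrilateral [x1, y1, x2, y2, x3, y3, x4, y4].
        def quad_to_rect(result):
            xs = result[0:8:2]
            ys = result[1:8:2]
            return min(xs), min(ys), max(xs), max(ys)   # left, top, right, bottom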

    opened by SHaiHosh 5
  • Docker for GPU failed

    Docker for GPU failed

    I am trying to build the GPU Docker image as advised; however, I encounter the following error:

    W: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/Packages  gnutls_handshake() failed: Handshake failed
    
    W: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1404/x86_64/Packages  gnutls_handshake() failed: Handshake failed
    
    E: Some index files failed to download. They have been ignored, or old ones used instead.
    The command '/bin/sh -c apt-get update && apt-get install -y --no-install-recommends         build-essential         cmake         git         wget         libatlas-base-dev         libboost-all-dev         libgflags-dev         libgoogle-glog-dev         libhdf5-serial-dev         libleveldb-dev         liblmdb-dev         libopencv-dev         libprotobuf-dev         libsnappy-dev         libgeos-dev         protobuf-compiler         python-dev         python-numpy         python-pip         python-scipy         python-opencv &&     rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
    
    

    I'd appreciate any hints.

    opened by naarkhoo 2
  • opencv load caffe model

    opencv load caffe model

    Hi @MhLiao,

    I tried loading the model using OpenCV and the following error showed up:

    Traceback (most recent call last): File "D:/work/video/TextBoxes++/TextBoxes++/run.py", line 31, in density = net.forward() cv2.error: OpenCV(4.0.0) C:\projects\opencv-python\opencv\modules\dnn\src\layers\prior_box_layer.cpp:242: error: (-215:Assertion failed) !params.has("step") in function 'cv::dnn::PriorBoxLayerImpl::PriorBoxLayerImpl'

    What should I do to fix this problem? Does anyone have a solution? I would appreciate your help!

    opened by amandazw 0
  • WARNING: Logging before InitGoogleLogging() is written to STDERR

    WARNING: Logging before InitGoogleLogging() is written to STDERR

    WARNING: Logging before InitGoogleLogging() is written to STDERR W0805 10:26:38.720525 3101 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface W0805 10:26:38.720546 3101 _caffe.cpp:140] Use this instead (with the named "weights" parameter): W0805 10:26:38.720549 3101 _caffe.cpp:142] Net('/home/xuy/桌面/code/python/caffe/Bag_gender_hair_classification/DenseNet_deploy_161.prototxt', 1, weights='/home/xuy/桌面/code/python/caffe/Bag_gender_hair_classification/model/DenseNet_161.caffemodel') [libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 54:14: Message type "caffe.PoolingParameter" has no field named "ceil_mode". F0805 10:26:38.740175 3101 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/xuy/桌面/code/python/caffe/Bag_gender_hair_classification/DenseNet_deploy_161.prototxt *** Check failure stack trace: ***

    opened by jiangzz1628 2
  • Detection_eval of shared pretrained model is very low

    Detection_eval of shared pretrained model is very low

    I evaluated the shared pretrained model 'model_icdar15.caffemodel' on the ICDAR15 test set with 500 images. The detection_eval is observed to be:

    Resolution: 768x768, IoU threshold: 0.5, detection_eval = 0.463
    Resolution: 1024x1024, IoU threshold: 0.5, detection_eval = 0.469

    However, in the paper the precision is reported to be 0.872 for 1024x1024, while I get 0.469. The metric used is the Pascal VOC-based 11-point method for mAP, right?

    opened by manogna-s 1
  • Training failed for resolutions 768x768, 512x512, with error == cudaSuccess (2 vs. 0) on TitanX GPU

    Training failed for resolutions 768x768, 512x512, with error == cudaSuccess (2 vs. 0) on TitanX GPU

    The second stage of training with resolution 768x768 is failing, throwing the following error:

    F0903 14:31:26.106397 92421 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory *** Check failure stack trace: *** @ 0x7fd08aa5c5cd google::LogMessage::Fail() @ 0x7fd08aa5e433 google::LogMessage::SendToLog() @ 0x7fd08aa5c15b google::LogMessage::Flush() @ 0x7fd08aa5ee1e google::LogMessageFatal::~LogMessageFatal() @ 0x7fd08b2290e0 caffe::SyncedMemory::to_gpu() @ 0x7fd08b2280a9 caffe::SyncedMemory::mutable_gpu_data() @ 0x7fd08b390282 caffe::Blob<>::mutable_gpu_data() @ 0x7fd08b363928 caffe::BaseConvolutionLayer<>::forward_gpu_gemm() @ 0x7fd08b3eb296 caffe::ConvolutionLayer<>::Forward_gpu() @ 0x7fd08b1f15f2 caffe::Net<>::ForwardFromTo() @ 0x7fd08b1f1717 caffe::Net<>::Forward() @ 0x7fd08b3a6eca caffe::Solver<>::Solve() @ 0x7fd08b226604 caffe::P2PSync<>::Run() @ 0x40ada0 train() @ 0x407590 main @ 0x7fd0899cc830 __libc_start_main @ 0x407db9 _start @ (nil) (unknown) Aborted (core dumped)

    Has anyone come across this error and found a fix for it?

    opened by manogna-s 0
Owner
Minghui Liao
Minghui Liao, a Ph.D. student at Huazhong University of Science and Technology.
Implement 'Single Shot Text Detector with Regional Attention, ICCV 2017 Spotlight'

SSTDNet Implement 'Single Shot Text Detector with Regional Attention, ICCV 2017 Spotlight' using pytorch. This code is work for general object detecti

HotaekHan 84 Jan 5, 2022
Single Shot Text Detector with Regional Attention

Single Shot Text Detector with Regional Attention Introduction SSTD is initially described in our ICCV 2017 spotlight paper. A third-party implementat

Pan He 215 Dec 7, 2022
Textboxes : Image Text Detection Model : python package (tensorflow)

shinTB Abstract A python package for use Textboxes : Image Text Detection Model implemented by tensorflow, cv2 Textboxes Paper Review in Korean (My Bl

Jayne Shin (신재인) 91 Dec 15, 2022
An Implementation of the algorithm in paper IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

InceptText-Tensorflow An Implementation of the algorithm in paper IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Orien

GeorgeJoe 115 Dec 12, 2022
TextBoxes re-implement using tensorflow

TextBoxes-TensorFlow TextBoxes re-implementation using tensorflow. This project is greatly inspired by slim project And many functions are modified ba

Gu Xiaodong 44 Dec 29, 2022
Textboxes implementation with Tensorflow (python)

tb_tensorflow A python implementation of TextBoxes Dependencies TensorFlow r1.0 OpenCV2 Code from Chaoyue Wang 03/09/2017 Update: 1.Debugging optimize

Jayne Shin (신재인) 20 May 31, 2019
Implementation of EAST scene text detector in Keras

EAST: An Efficient and Accurate Scene Text Detector This is a Keras implementation of EAST based on a Tensorflow implementation made by argman. The or

Jan Zdenek 208 Nov 15, 2022
This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

EAST: An Efficient and Accurate Scene Text Detector Description: This version will be updated soon, please pay attention to this work. The motivation

Dejia Song 544 Dec 20, 2022
PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector. Only RBOX part is implemented. Using dice loss

null 365 Dec 20, 2022
RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection

RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection For more details, please refer to our paper. Citing Please cite the related works

Minghui Liao 102 Jun 29, 2022
Source code of RRPN ---- Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Paper source Arbitrary-Oriented Scene Text Detection via Rotation Proposals https://arxiv.org/abs/1703.01086 News We update RRPN in pytorch 1.0! View

null 428 Nov 22, 2022
This project modify tensorflow object detection api code to predict oriented bounding boxes. It can be used for scene text detection.

This is an oriented object detector based on tensorflow object detection API. Most of the code is not changed except for those related to the need of

Dafang He 30 Oct 22, 2022
Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

This is the official implementation of "Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation". For more details, please

Pengyuan Lyu 309 Dec 6, 2022
Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.

Total-Text-Dataset (Official site) Updated on April 29, 2020 (Detection leaderboard is updated - highlighted E2E methods. Thank you shine-lcy.) Update

Chee Seng Chan 671 Dec 27, 2022
huoyijie 1.2k Dec 29, 2022
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Alan Tang 354 Dec 12, 2022
Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

Christian Bartz 496 Jan 5, 2023
A tensorflow implementation of EAST text detector

EAST: An Efficient and Accurate Scene Text Detector Introduction This is a tensorflow re-implementation of EAST: An Efficient and Accurate Scene Text

null 2.9k Jan 2, 2023