Convolutional Recurrent Neural Networks(CRNN) for Scene Text Recognition

MaybeShewill-CV

Last update: Dec 27, 2022

Related tags

Computer Vision tensorflow ocr-recognition ctc-loss sequence-recongnition chinese-ocr crnn-tensorflow

Overview

CRNN_Tensorflow

This is a TensorFlow implementation of a Deep Neural Network for scene text recognition. It is mainly based on the paper "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition". You can refer to the paper for architecture details. Thanks to the author Baoguang Shi.

The model consists of a CNN stage extracting features which are fed to an RNN stage (Bi-LSTM) and a CTC loss.
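To make the role of the CTC stage concrete, here is a minimal sketch of greedy CTC decoding in plain Python. It is not the repository's code: the function name, the toy alphabet and the choice of the last class as the blank are assumptions made for illustration. It shows how per-frame class scores from the RNN stage are turned into text by collapsing repeated predictions and dropping blanks.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank_index):
    """Collapse per-frame predictions into a string, CTC style.

    frame_scores: one list of class scores per time step (hypothetical input).
    alphabet:     characters indexed by class id, blank excluded.
    blank_index:  class id reserved for the CTC blank (assumed here).
    """
    # Pick the highest-scoring class for every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]

    decoded = []
    previous = None
    for index in best_path:
        # Keep a class only if it differs from the previous frame
        # and is not the blank symbol.
        if index != previous and index != blank_index:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Toy example: classes 0 -> "a", 1 -> "b", 2 -> blank.
frames = [
    [0.90, 0.05, 0.05],  # a
    [0.80, 0.10, 0.10],  # a (repeat, collapsed)
    [0.10, 0.10, 0.80],  # blank (separates the two a's)
    [0.70, 0.20, 0.10],  # a
    [0.10, 0.80, 0.10],  # b
    [0.20, 0.70, 0.10],  # b (repeat, collapsed)
]
decoded_text = ctc_greedy_decode(frames, "ab", blank_index=2)
```

The blank between the two runs of "a" is what lets the decoder output a doubled character; without it the repeats would collapse into one.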

Installation

This software has been developed on Ubuntu 16.04 (x64) using Python 3.5 and TensorFlow 1.12. Because it uses some recent TensorFlow features, it is incompatible with older versions.

The following methods are provided to install dependencies:

Conda

You can create a conda environment with the required dependencies using:

conda env create -f crnntf-env.yml

Pip

Required packages may be installed with

pip3 install -r requirements.txt

Testing the pre-trained model

Evaluate the model on the synth90k dataset

In this repo you will find a model pre-trained on the Synth90k dataset. The pretrained CRNN model weights for the Synth90k dataset can be found here.

Once the tfrecords files of the Synth90k dataset have been generated successfully, you can evaluate the model with the following script:

python tools/evaluate_shadownet.py --dataset_dir PATH/TO/YOUR/DATASET_DIR \
--weights_path PATH/TO/YOUR/MODEL_WEIGHTS_PATH \
--char_dict_path PATH/TO/CHAR_DICT_PATH \
--ord_map_dict_path PATH/TO/ORD_MAP_PATH \
--process_all 1 --visualize 1

If you set visualize to true, the expected output during the evaluation process is:

After the whole evaluation process is done you should see something like this:

The model's main evaluation metrics are as follows:

Test Dataset Size: 891927 synth90k test images

Per char Precision: 0.974325 without average weighted on each class

Full sequence Precision: 0.932981 without average weighted on each class

For Per char Precision:

single_label_accuracy = correct_predicted_char_nums_of_single_sample / single_label_char_nums

avg_label_accuracy = sum(single_label_accuracy) / label_nums

For Full sequence Precision:

single_label_accuracy = 1 if the prediction result is exactly the same as label else 0

avg_label_accuracy = sum(single_label_accuracy) / label_nums
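The two definitions above can be sketched in a few lines of Python. This is an illustration, not the evaluation script itself: the function names are hypothetical, and comparing characters position by position is an assumption about how correct_predicted_char_nums_of_single_sample is counted.

```python
def per_char_accuracy(prediction, label):
    """single_label_accuracy for the per-char metric."""
    if not label:
        # Degenerate case: an empty label is only matched by an empty prediction.
        return 1.0 if not prediction else 0.0
    # Assumption: characters are compared position by position.
    correct = sum(1 for p, g in zip(prediction, label) if p == g)
    return correct / len(label)


def evaluate(predictions, labels):
    """Return (per-char precision, full-sequence precision), unweighted means."""
    label_nums = len(labels)
    per_char = [per_char_accuracy(p, g) for p, g in zip(predictions, labels)]
    full_seq = [1.0 if p == g else 0.0 for p, g in zip(predictions, labels)]
    return sum(per_char) / label_nums, sum(full_seq) / label_nums


# Toy example: one exact match and one prediction with a missing character.
char_precision, sequence_precision = evaluate(["hello", "wrld"], ["hello", "world"])
```

With position-wise matching, a single dropped character early in the word misaligns everything after it, so the per-char score can fall sharply even for near misses.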

Part of the confusion matrix of every single char looks like this:

Test the model on a single image

If you want to test a single image you can do it with

python tools/test_shadownet.py --image_path PATH/TO/IMAGE \
--weights_path PATH/TO/MODEL_WEIGHTS \
--char_dict_path PATH/TO/CHAR_DICT_PATH \
--ord_map_dict_path PATH/TO/ORD_MAP_PATH

Test example images

Example test_01.jpg

Example test_02.jpg

Example test_03.jpg

Training your own model

Data preparation

Download the whole Synth90k dataset here and extract all the files into a root directory, which should contain several txt files and several folders filled with pictures. Then convert the whole dataset into TensorFlow records as follows:

python tools/write_tfrecords.py \
--dataset_dir PATH/TO/SYNTH90K_DATASET_ROOT_DIR \
--save_dir PATH/TO/TFRECORDS_DIR

During conversion, all source images are scaled to (32, 100).

Training

For all the available training parameters, check global_configuration/config.py, then train your model with

python tools/train_shadownet.py --dataset_dir PATH/TO/YOUR/TFRECORDS \
--char_dict_path PATH/TO/CHAR_DICT_PATH \
--ord_map_dict_path PATH/TO/ORD_MAP_PATH

If you wish, you can add more metrics to the training progress messages with --decode_outputs 1, but this will slow training down. You can also continue the training process from a snapshot with

python tools/train_shadownet.py --dataset_dir PATH/TO/YOUR/TFRECORDS \
--weights_path PATH/TO/YOUR/PRETRAINED_MODEL_WEIGHTS \
--char_dict_path PATH/TO/CHAR_DICT_PATH --ord_map_dict_path PATH/TO/ORD_MAP_PATH

If you have multiple GPUs in your local machine, you may use multi-GPU training to fit a larger batch size. This is supported as follows:

python tools/train_shadownet.py --dataset_dir PATH/TO/YOUR/TFRECORDS \
--char_dict_path PATH/TO/CHAR_DICT_PATH --ord_map_dict_path PATH/TO/ORD_MAP_PATH \
--multi_gpus 1

The sequence distance is computed as the distance between two sparse tensors, so the lower the distance value, the better the model performs. The training accuracy is computed as the character-wise precision between the prediction and the ground truth, so the higher the value, the better the model performs.
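A distance between a predicted and a ground-truth character sequence is commonly an edit distance normalized by the label length. The sketch below assumes that interpretation; it is a plain-Python Levenshtein implementation for illustration, not the training script's own computation, and the function names are hypothetical.

```python
def edit_distance(hypothesis, truth):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    # previous[j] holds the distance between the processed prefix of
    # hypothesis and the first j items of truth.
    previous = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        current = [i]
        for j, t in enumerate(truth, start=1):
            substitution = previous[j - 1] + (h != t)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def normalized_distance(hypothesis, truth):
    """Edit distance divided by the label length (assumed normalization)."""
    return edit_distance(hypothesis, truth) / max(len(truth), 1)


# A perfect prediction gives 0.0; one missing character out of five gives 0.2.
perfect = normalized_distance("world", "world")
one_missing = normalized_distance("wrld", "world")
```

Lower is better for this value, which is the opposite direction from the character-wise accuracy.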

Tensorflow Serving

Thanks to Eldon for contributing the TensorFlow serving function :)

TensorFlow model server is a very powerful tool for serving DL models in an industrial environment. Here is a script that converts the checkpoint model files into a TensorFlow saved model, which can be used with TensorFlow model server to serve the CRNN model. If the script does not run normally, check whether the checkpoint file path in the bash script is correct.

bash tfserve/export_crnn_saved_model.sh

To start the TensorFlow model server, use the following script:

bash tfserve/run_tfserve_crnn_gpu.sh

There are two ways to test the Python client of the CRNN model. First, you may test the server via an HTTP/REST request by running

python tfserve/crnn_python_client_via_request.py ./data/test_images/test_01.jpg

Second, you may test the server via gRPC by running

python tfserve/crnn_python_client_via_grpc.py
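For the HTTP/REST route, the request body is a JSON document carrying the image as base64. The sketch below only builds that body with the standard library; it does not send anything. The input key "input" and the overall layout are taken from the curl example quoted in the comments further down this page, so treat them as assumptions that may not match your exported signature. The helper name is hypothetical.

```python
import base64
import json


def build_predict_payload(image_bytes, input_name="input"):
    """Build a JSON body for a TensorFlow Serving REST predict call.

    image_bytes: raw encoded image file contents (for example a JPEG).
    input_name:  name of the model input; "input" is an assumption.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Binary values are wrapped in a {"b64": ...} object.
    return json.dumps({"inputs": {input_name: {"b64": encoded}}})


# Placeholder bytes stand in for the contents of an image file.
payload = build_predict_payload(b"abc")
```

The resulting string can be posted to the model's predict endpoint with any HTTP client.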

Experiment

The original experiment ran for 2000000 epochs, with a batch size of 32, an initial learning rate of 0.01 and an exponential decay of 0.1 every 500000 epochs. During training the train loss dropped as follows
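The schedule described above can be written out as a small function. This is a sketch under stated assumptions: the function name is hypothetical, and whether the decay is applied in discrete steps (staircase) or continuously is not stated in this README, so both variants are shown.

```python
def decayed_learning_rate(step, initial_lr=0.01, decay_rate=0.1,
                          decay_steps=500000, staircase=True):
    """Exponential decay: initial_lr * decay_rate ** (step / decay_steps)."""
    if staircase:
        # Assumed variant: the rate drops only at multiples of decay_steps.
        exponent = step // decay_steps
    else:
        # Continuous variant: the rate shrinks a little at every step.
        exponent = step / decay_steps
    return initial_lr * decay_rate ** exponent


# Learning rate at the start, after the first decay, and at the last step.
lr_start = decayed_learning_rate(0)
lr_after_first_decay = decayed_learning_rate(500000)
lr_final = decayed_learning_rate(1999999)
```

With the staircase variant and the numbers quoted above, the rate falls by three factors of ten over the whole run.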

The val loss dropped as follows

2019.3.27 Updates

I have uploaded a newly trained CRNN model on a Chinese dataset, which can be found here. I do not know who owns the dataset, but I am grateful for their great work. If you know the owner, you are welcome to let me know. The pretrained weights can be found here.

Before you start training, you may need to reorganize the dataset's label information according to the Synth90k dataset's format if you want to use the same data feed pipeline mentioned above. I have also reimplemented a more efficient tfrecords writer, which accelerates the generation of tfrecords files. You may refer to the code for details. Some information about training is listed below:

image size: (280, 32)

classes nums: 5824 without blank

sequence length: 70

training sample counts: 2733004

validation sample counts: 364401

testing sample counts: 546601

batch size: 32

training iter nums: 200000

init lr: 0.01
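The figures above are consistent with one output time step per 4 pixels of image width (280 / 4 = 70), and CTC adds one blank class on top of the 5824 characters. The sketch below spells out that arithmetic; the stride of 4 is an assumption inferred from these numbers, not a value confirmed by this README, and the function names are hypothetical.

```python
def sequence_length(image_width, width_stride=4):
    """Number of time steps produced for an image of the given width.

    width_stride is the assumed horizontal downsampling of the CNN stage.
    """
    return image_width // width_stride


def ctc_output_classes(num_characters):
    """Size of the output layer: every character plus one CTC blank."""
    return num_characters + 1


chinese_seq_len = sequence_length(280)
chinese_classes = ctc_output_classes(5824)
```

If you change the input width, the sequence length in the configuration would need to change with it under the same assumption.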

Test example images

Example test_01.jpg

Example test_02.jpg

Example test_03.jpg

training tboard file

The val loss dropped as follows

2019.4.10 Updates

Added a small demo that recognizes a Chinese PDF using the Chinese CRNN model weights. To try it, run the following commands:

cd CRNN_ROOT_REPO
python tools/recongnize_chinese_pdf.py -c ./data/char_dict/char_dict_cn.json \
-o ./data/char_dict/ord_map_cn.json --weights_path model/crnn_chinese/shadownet.ckpt \
--image_path data/test_images/test_pdf.png --save_path pdf_recognize_result.txt

You should see results like the following:

The left image is the recognition result displayed on the console and the right image is the original PDF image.

The left image is the recognition result written to a local file and the right image is the original PDF image.

TODO

Add new model weights trained on the whole synth90k dataset
Add multiple gpu training scripts
Add new pretrained model on chinese dataset
Add an online toy demo
Add tensorflow service script

Acknowledgement

Please cite my repo CRNN_Tensorflow if you use it.

Contact

Scan the following QR code to discuss :)

Comments

{"error":"Serving signature name: \"serving_default\" not found in signature def"}

@MaybeShewill-CV thank you for your code.

I get this error: { "error": "Serving signature name: \"serving_default\" not found in signature def" } How can I solve it?

----- script -----------

curl -X POST   http://localhost:9001/v1/models/crnn:predict   -H 'cache-control: no-cache'   -H 'content-type: application/json'   -d '{
  "inputs":
    {
    "input": { "b64": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAwICQoJBwwKCQoNDAwOER0TERAQESMZGxUdKiUsKyklKCguNEI4LjE/MigoOk46P0RHSktKLTdRV1FIVkJJSkf/2wBDAQwNDREPESITEyJHMCgwR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0dHR0f/wAARCAAfAHQDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwDZs9FF1awSfa0jmud/kxFCQ23rlh0qKHRrqeyjuYnhbzd2yLfh2wcHAPXpV+0hN9p2nLBdrbm283zpA4Biycg4yDzVzSDCLDS3YO1yiztAgICucnIJ/lQBz0GmXtxbfaILd5I84yuCfy61AYJhCJjDIIj0cqdp/Gtki3XwvbG4FxuLyNGYsYDdBuzVzTpWRNJgZj5DQzNKh+6w56jvQBytFdHJcSifRLaTY6tEgYOit8rMB3HoBVeX7Pc6tqEL20KLDDKsQiXaMqSQxx34oAxaK3bzSbWBGdgyrFZ5dkbIM27b37Z/lTDo9r9iGJphdfY/tZ4Gzb6euaAMWjNak+i+Xbeat2hZFjM6MpXy9+Mc856/pUV5pE9r5I863macr5axvljnocHHHvQBnk0tTmwu9zKtvIzLIYiFG75gCSOOvANRtFJGDvjdcMVO4EYI6j60AMpaKKACiiigBKVWKkMpII5BFJRQBZh1C9gAWK7mRR0UOcflUiavfpafZVnxDtKbdi9D1GcZqlRQBen1WaWa0laKEPa42FVxkDBAPPTjtjqaZaagYdUN7LGJN5csgOM7gc9c+tU2pKANa51g3Gjm1YMJnmLuwHyspJbH/fRqee/tPskk0VwxmktltlgKH92BjPzdD0P51hUUAdFrOo211ZXJsniUvMivxhpEC8Hn0Oeg9KbsS48W20MEqvFD5YRlOQQig9voawKMGgDrradZPsd1DbNHFJJPdSqGL4KqVzn3J6VHpSRzWNnG8jCbzftsjOeuH29/auainmgz5M0kZIwdrEcenFPS7uI87ZW5jMRyc/Ie3PagDX1KWQWdpAksUjzR+Y8XlbnLSEtwSDjr65qa6j017i1hlgSKF5cxzxptVosY2sc53bsA56e1ZX9q3JlhkcQs8TKysYlB46DIA4qeLWnVwr20X2fyni8qMleHOWIPJzmgBuoR2Ec6xtBLbSquJYkYsFbJ7t14weOOaKq6hcm9u2m8vYCAoXduwAMDJ7/WigD/2Q==" }
  }
}'
{ "error": "Serving signature name: \"serving_default\" not found in signature def" }

opened by kspook 60

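Relationship between the txt files and the images

Each line of annotation.txt consists of two parts, image_path and label_index, separated by a space. For example: ./3000/7/169_deliberations_20271.jpg 20271

The target image can be found from image_path, and the corresponding label can be found in lexicon.txt from label_index. In the example above, the label is on line 20271 of lexicon.txt and its content is: deliberation

In the same way, the contents of annotation_train.txt have the same meaning.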
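A minimal Python sketch of reading one annotation line and resolving its label is shown here. The helper names are hypothetical, and treating label_index as a zero-based offset into the lines of lexicon.txt is an assumption; adjust it if your lexicon is counted from one.

```python
def parse_annotation_line(line):
    """Split 'image_path label_index' on the last space."""
    image_path, label_index = line.strip().rsplit(" ", 1)
    return image_path, int(label_index)


def lookup_label(lexicon_lines, label_index):
    """Return the word stored at label_index in the lexicon.

    Assumption: label_index is a zero-based line offset.
    """
    return lexicon_lines[label_index].strip()


# The annotation line from the example above.
image_path, label_index = parse_annotation_line(
    "./3000/7/169_deliberations_20271.jpg 20271\n"
)

# A three-word stand-in for lexicon.txt, only to show the lookup.
toy_lexicon = ["alpha\n", "beta\n", "gamma\n"]
toy_label = lookup_label(toy_lexicon, 1)
```

Splitting on the last space keeps the parser working even if an image path itself contains spaces.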

opened by wandaoyi 29
Problem testing the model on a single image?

If you want to test a single image you can do it with python tools/test_shadownet.py --image_path PATH/TO/IMAGE --weights_path PATH/TO/MODEL_WEIGHTS --char_dict_path PATH/TO/CHAR_DICT_PATH --ord_map_dict_path PATH/TO/ORD_MAP_PATH

For "--weights_path PATH/TO/MODEL_WEIGHTS", can "model/crnn_syn90k_saved_model/1" be used directly as the weights? If it can, I encountered some errors when doing so.

opened by Timthony 27
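Why is the model I trained larger than the bundled model?

The model I trained is 68152 KB, while the model in the CRNN_Tensorflow-master\model\crnn_syn90k_saved_model\variables\ folder is 34085 KB. When I put my trained model into the variables folder the test does run, but it recognizes almost nothing and the accuracy is below 5%, even though I trained on Synth 90k for 2 million (200W) rounds. Training was done on a GTX 1070. Could someone please explain?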

opened by JXMss 26

NotFoundError

I tried to run the demo on one image, but a NotFoundError occurred. What should I do?

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key shadow/batch_normalization/beta not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
	 [[Node: save/RestoreV2/_3 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_7_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

opened by LJXLJXLJX 22

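A question about generating tfrecords

Hello, I followed the method in the README to generate tfrecords for testing and got this error: OutOfRangeError (see above for traceback): RandomShuffleQueue '_1_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 32, current size 0). I then generated the tfrecords with write_text_features.py. When I open the result, each image is stored as a list whose elements are pixel values from 0 to 255. The tfrecords bundled in your data directory, however, look garbled when opened, so they are presumably binary files. How can I generate tfrecords files like yours? I have not changed your code in any way. Also, your tfrecords file contains the names of 8568 images, but testing with test_shadoenet.py outputs results for only 32 images. Why is that? I look forward to your guidance, thank you!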

opened by ZhouFangru 21

$tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence [[{{node train_IteratorGetNext}} = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device=$

tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence [[{{node train_IteratorGetNext}} = IteratorGetNext[output_shapes=[[32,32,100,3], , [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

@MaybeShewill-CV, it's not nvidia-smi at #295


(crnntf) kspook@MLNC6:/usr/local/cuda/samples/bin/x86_64/linux/release$ nvidia-smi
Thu Jul  4 11:00:00 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130                Driver Version: 384.130                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000D691:00:00.0 Off |                    0 |
| N/A   47C    P0    57W / 149W |      0MiB / 11439MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
(crnntf) kspook@MLNC6:/usr/local/cuda/samples/bin/x86_64/linux/release$

cuda9.0 installed successfully.


(crnntf) kspook@MLNC6:/usr/local/cuda/samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla K80"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    3.7
  Total amount of global memory:                 11440 MBytes (11995578368 bytes)
  (13) Multiprocessors, (192) CUDA Cores/MP:     2496 CUDA Cores
  GPU Max Clock rate:                            824 MHz (0.82 GHz)
  Memory Clock rate:                             2505 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   54929 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1
Result = PASS

cudnn 7 installed successfully


(crnntf) kspook@MLNC6:~/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7601 , CUDNN_VERSION from cudnn.h : 7601 (7.6.1)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 13  Capabilities 3.7, SmClock 823.5 Mhz, MemSize (Mb) 11439, MemClock 2505.0 Mhz, Ecc=1, boardGroupID=0
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.132224 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.133056 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.157568 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.244576 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.406240 time requiring 203008 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.139008 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.141408 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.176672 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.252768 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.431808 time requiring 203008 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

The error still occurred.


(crnntf) kspook@MLNC6:~/CRNN_Tensorflow$ python tools/train_shadownet.py --dataset_dir ./data/ --char_dict_path ./data/char_dict/char_dict.json --ord_map_dict_path ./data/char_dict/ord_map.json 
I0704 11:05:45.903860 17823 train_shadownet.py:569] Use single gpu to train the model
2019-07-04 11:05:49.530389: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-04 11:05:54.737760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: d691:00:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2019-07-04 11:05:54.737815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-07-04 11:05:55.023881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-04 11:05:55.023945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-07-04 11:05:55.023963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-07-04 11:05:55.024223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10295 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: d691:00:00.0, compute capability: 3.7)
I0704 11:05:55.310481 17823 train_shadownet.py:268] Training from scratch
Traceback (most recent call last):
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
	 [[{{node train_IteratorGetNext}} = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train_shadownet.py", line 575, in <module>
    need_decode=args.decode_outputs
  File "tools/train_shadownet.py", line 321, in train_shadownet
    [optimizer, train_ctc_loss, merge_summary_op])
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
	 [[node train_IteratorGetNext (defined at /data/home/kspook/CRNN_Tensorflow/data_provider/tf_io_pipline_fast_tools.py:406)  = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

Caused by op 'train_IteratorGetNext', defined at:
  File "tools/train_shadownet.py", line 575, in <module>
    need_decode=args.decode_outputs
  File "tools/train_shadownet.py", line 153, in train_shadownet
    batch_size=CFG.TRAIN.BATCH_SIZE
  File "/data/home/kspook/CRNN_Tensorflow/data_provider/shadownet_data_feed_pipline.py", line 289, in inputs
    num_threads=CFG.TRAIN.CPU_MULTI_PROCESS_NUMS
  File "/data/home/kspook/CRNN_Tensorflow/data_provider/tf_io_pipline_fast_tools.py", line 406, in inputs
    return iterator.get_next(name='{:s}_IteratorGetNext'.format(self._dataset_flag))
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 421, in get_next
    name=name)), self._output_types,
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2069, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): End of sequence
	 [[node train_IteratorGetNext (defined at /data/home/kspook/CRNN_Tensorflow/data_provider/tf_io_pipline_fast_tools.py:406)  = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]

opened by kspook 20

how to adjust the SEQ_LENGTH ?

If I want the English dataset model to predict sequences with a maximum length of 70, is the only change SEQ_LENGTH = 70 in the config file, with all images having a width of 70*4? Are any other changes required?

Should height always be 32 or is it ok to make it 64?

opened by SreenijaK 19
How to speed up inference with batch images input ?

Hello :+1:

How can I speed up inference with batched image input? When I use test_shadownet.py for inference, it can only take one image at a time. When the image size changes, I have to create a new graph and restore the saved model, which costs a lot of time. Could you please explain how to use batch inference?

opened by liuchangf 19
train loss some wrong

INFO: 10-09 19:55:21: train_shadownet.py:139 * 140230364735232 Training from scratch
INFO: 10-09 19:55:25: train_shadownet.py:174 * 140230364735232 Epoch: 1 cost= 197.760651
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 2 cost= 172.005493
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 3 cost= 142.324768
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 4 cost= 101.206573
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 5 cost= 75.101547
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 6 cost= 62.528458
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 7 cost= 43.640678
INFO: 10-09 19:55:26: train_shadownet.py:174 * 140230364735232 Epoch: 8 cost= 36.773048
INFO: 10-09 19:55:27: train_shadownet.py:174 * 140230364735232 Epoch: 9 cost= 33.266392
INFO: 10-09 19:55:27: train_shadownet.py:174 * 140230364735232 Epoch: 10 cost= 33.096649
INFO: 10-09 19:55:27: train_shadownet.py:174 * 140230364735232 Epoch: 11 cost= 30.440775
INFO: 10-09 19:55:27: train_shadownet.py:174 * 140230364735232 Epoch: 12 cost= 29.101267
INFO: 10-09 19:55:27: train_shadownet.py:174 * 140230364735232 Epoch: 13 cost= 31.529339
INFO: 10-09 19:55:27: train_shadownet.py:174 * 140230364735232 Epoch: 14 cost= 27.663685
INFO: 10-09 19:55:28: train_shadownet.py:174 * 140230364735232 Epoch: 15 cost= 27.336868
INFO: 10-09 19:55:28: train_shadownet.py:174 * 140230364735232 Epoch: 16 cost= 26.076229
INFO: 10-09 19:55:28: train_shadownet.py:174 * 140230364735232 Epoch: 17 cost= 29.231804
INFO: 10-09 19:55:28: train_shadownet.py:174 * 140230364735232 Epoch: 18 cost= 28.733067
INFO: 10-09 19:55:28: train_shadownet.py:174 * 140230364735232 Epoch: 19 cost= 31.574839
INFO: 10-09 19:55:28: train_shadownet.py:174 * 140230364735232 Epoch: 20 cost= 25.050362
INFO: 10-09 19:55:29: train_shadownet.py:174 * 140230364735232 Epoch: 21 cost= 26.840332
INFO: 10-09 19:55:29: train_shadownet.py:174 * 140230364735232 Epoch: 22 cost= 26.538689
INFO: 10-09 19:55:29: train_shadownet.py:174 * 140230364735232 Epoch: 23 cost= 31.059734
INFO: 10-09 19:55:29: train_shadownet.py:174 * 140230364735232 Epoch: 24 cost= 28.754152
INFO: 10-09 19:55:29: train_shadownet.py:174 * 140230364735232 Epoch: 25 cost= 27.408020
INFO: 10-09 19:55:29: train_shadownet.py:174 * 140230364735232 Epoch: 26 cost= 27.259018
INFO: 10-09 19:55:29: train_shadownet.py:157 * 140230364735232 Cost didn't improve beyond 0.001000 for 7 epochs, stopping early.
Done

Could you help me solve it?

Thank you very much.

opened by 10183308 18
Is the ckpt wrong?

INFO: 10-11 02:52:46: tf_logging.py:115 * 140680814909184 Restoring parameters from ./model/shadownet/shadownet_2017-10-17-11-47-46.ckpt-199999 2018-10-11 02:52:46.563415: W tensorflow/core/framework/op_kernel.cc:1275] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key shadow/batch_normalization/beta not found in checkpoint

opened by Crazlion 17

Owner

MaybeShewill-CV

Engineer from Baidu
