Implementation of our paper 'PixelLink: Detecting Scene Text via Instance Segmentation' in AAAI2018

Overview

Code for the AAAI18 paper PixelLink: Detecting Scene Text via Instance Segmentation, by Dan Deng, Haifeng Liu, Xuelong Li, and Deng Cai.

Contributions to this repo are welcome, e.g., some other backbone networks (including the model definition and pretrained models).

PLEASE CHECK EXISTING ISSUES BEFORE OPENING YOUR OWN. IF THE SAME OR A SIMILAR ISSUE HAS BEEN POSTED BEFORE, JUST REFER TO IT AND DO NOT OPEN A NEW ONE.

Installation

Clone the repo

git clone --recursive git@github.com:ZJULearning/pixel_link.git

Denote the root directory path of pixel_link by ${pixel_link_root}.

Add the path of ${pixel_link_root}/pylib/src to your PYTHONPATH:

export PYTHONPATH=${pixel_link_root}/pylib/src:$PYTHONPATH
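To confirm that the path is set before running the scripts, a small check along the following lines may help. It is only a sketch: the helper name pylib_on_path is made up for illustration and is not part of the repo, and reading ${pixel_link_root} from an environment variable of the same name is an assumption.

```python
import os


def pylib_on_path(pixel_link_root, pythonpath):
    """Return True if <pixel_link_root>/pylib/src is listed in a PYTHONPATH-style string."""
    wanted = os.path.normpath(os.path.join(pixel_link_root, "pylib", "src"))
    entries = [os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p]
    return wanted in entries


if __name__ == "__main__":
    # Assumption: the repo root is exported as an environment variable.
    root = os.environ.get("pixel_link_root", ".")
    print(pylib_on_path(root, os.environ.get("PYTHONPATH", "")))
```

If this prints False, an import of util from the scripts is likely to fail with "No module named util".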

Prerequisites

Tested only on Ubuntu 14.04 and 16.04 with:

  • Python 2.7
  • Tensorflow-gpu >= 1.1
  • opencv2
  • setproctitle
  • matplotlib

Anaconda is recommended for an easier installation:

  1. Install Anaconda
  2. Create and activate the required virtual environment by:
conda env create --file pixel_link_env.txt
source activate pixel_link

Testing

Download the pretrained model

Unzip the downloaded model. It contains 4 files:

  • config.py
  • model.ckpt-xxx.data-00000-of-00001
  • model.ckpt-xxx.index
  • model.ckpt-xxx.meta

Denote their parent directory as ${model_path}.
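The test scripts take the checkpoint prefix (model.ckpt-xxx), not one of the individual files. The sketch below checks that ${model_path} contains the four files listed above and returns that prefix. The function name find_checkpoint_prefix is hypothetical, and it assumes that xxx is a numeric step count.

```python
import glob
import os


def find_checkpoint_prefix(model_path):
    """Return the checkpoint prefix (e.g. <model_path>/model.ckpt-38055) or raise IOError."""
    if not os.path.isfile(os.path.join(model_path, "config.py")):
        raise IOError("config.py not found in %s" % model_path)
    index_files = glob.glob(os.path.join(model_path, "model.ckpt-*.index"))
    if not index_files:
        raise IOError("no model.ckpt-*.index file in %s" % model_path)
    prefixes = [path[:-len(".index")] for path in index_files]
    # Assumption: the part after the last '-' is the training step; pick the highest one.
    prefix = max(prefixes, key=lambda p: int(p.rsplit("-", 1)[1]))
    for suffix in (".data-00000-of-00001", ".meta"):
        if not os.path.isfile(prefix + suffix):
            raise IOError("missing file: %s" % (prefix + suffix))
    return prefix
```

The returned value is what would be passed as the second argument of the test scripts.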

Test on ICDAR2015

The reported results on ICDAR2015 are:

Model                 Recall   Precision   F-mean
PixelLink+VGG16 2s    82.0     85.5        83.7
PixelLink+VGG16 4s    81.7     82.9        82.3
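Assuming F-mean denotes the harmonic mean of precision and recall (the usual convention for ICDAR2015 detection results), the last column can be recomputed from the other two. The function name f_mean is chosen here for illustration only.

```python
def f_mean(recall, precision):
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


if __name__ == "__main__":
    for name, r, p in [("PixelLink+VGG16 2s", 82.0, 85.5),
                       ("PixelLink+VGG16 4s", 81.7, 82.9)]:
        print("%s: %.1f" % (name, f_mean(r, p)))
```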

Assuming you have downloaded the ICDAR2015 dataset, run the following commands to test the model on it:

cd ${pixel_link_root}
./scripts/test.sh ${GPU_ID} ${model_path}/model.ckpt-xxx ${path_to_icdar2015}/ch4_test_images

For example:

./scripts/test.sh 3 ~/temp/conv3_3/model.ckpt-38055 ~/dataset/ICDAR2015/Challenge4/ch4_test_images

The program will create a zip file of detection results, which can be submitted to the ICDAR2015 server directly. The detection results can be visualized via scripts/vis.sh.

Here are some samples: ./samples/img_333_pred.jpg ./samples/img_249_pred.jpg

Test on any images

Put the images to be tested in a single directory, denoted ${image_dir}. Then:

cd ${pixel_link_root}
./scripts/test_any.sh ${GPU_ID} ${model_path}/model.ckpt-xxx ${image_dir}

For example:

./scripts/test_any.sh 3 ~/temp/conv3_3/model.ckpt-38055 ~/dataset/ICDAR2015/Challenge4/ch4_training_images

The program will visualize the detection results directly on the images. If the detection results are not satisfactory, try to:

  1. Adjust the inference parameters like eval_image_width, eval_image_height, pixel_conf_threshold, link_conf_threshold.
  2. Or train your own model.
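To see what the two confidence thresholds control, here is a minimal pure-Python sketch of the grouping step described in the paper: pixels scoring above pixel_conf_threshold are kept, and two neighbouring kept pixels are merged into one instance when a link between them scores above link_conf_threshold. This is not the code in this repo; the function name, the neighbour order, and the default values of 0.5 are assumptions made for illustration.

```python
# Offsets (dy, dx) of the 8 neighbours; this order is an assumption of the sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def group_pixels(pixel_scores, link_scores,
                 pixel_conf_threshold=0.5, link_conf_threshold=0.5):
    """pixel_scores[y][x] and link_scores[y][x][k] are floats in [0, 1].

    Returns an H x W grid of instance labels, where 0 means background.
    """
    h, w = len(pixel_scores), len(pixel_scores[0])
    # Union-find forest over the positive pixels.
    parent = {}
    for y in range(h):
        for x in range(w):
            if pixel_scores[y][x] > pixel_conf_threshold:
                parent[(y, x)] = (y, x)

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    # Merge two positive pixels when the link from one to the other is positive.
    for (y, x) in list(parent):
        for k, (dy, dx) in enumerate(NEIGHBOURS):
            q = (y + dy, x + dx)
            if q in parent and link_scores[y][x][k] > link_conf_threshold:
                root_a, root_b = find((y, x)), find(q)
                if root_a != root_b:
                    parent[root_a] = root_b

    labels = [[0] * w for _ in range(h)]
    ids = {}
    for p in list(parent):
        root = find(p)
        labels[p[0]][p[1]] = ids.setdefault(root, len(ids) + 1)
    return labels
```

In this sketch, raising pixel_conf_threshold removes low-confidence pixels, while raising link_conf_threshold splits regions that were merged through weak links.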

Training

Converting the dataset to tfrecords files

Scripts for converting the ICDAR2015 and SynthText datasets are provided in the datasets directory. It is not hard to write a conversion script for your own dataset.

Train your own model

  • Modify scripts/train.sh to configure your dataset name and dataset path like:
DATASET=icdar2015
DATASET_DIR=$HOME/dataset/pixel_link/icdar2015
  • Start training
./scripts/train.sh ${GPU_IDs} ${IMG_PER_GPU}

For example, ./scripts/train.sh 0,1,2 8.
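The two arguments appear to determine the total batch size as the number of listed GPUs times ${IMG_PER_GPU}; this reading is based on a copy of train.sh quoted in one of the issues, so treat it as an assumption. A small sketch of that arithmetic, with a hypothetical function name:

```python
def effective_batch_size(gpu_ids, img_per_gpu):
    """gpu_ids is a comma-separated string such as "0,1,2"."""
    gpus = [g for g in gpu_ids.split(",") if g.strip()]
    return len(gpus) * int(img_per_gpu)


if __name__ == "__main__":
    print(effective_batch_size("0,1,2", 8))
```

Under that assumption, the example above would train with 24 images per step.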

The existing training strategy in scripts/train.sh is configured for icdar2015; modify it if necessary. Many training and model options are available in config.py; try them yourself if you are interested.

Acknowledgement

Comments
  • I have few questions.

    I have few questions noted below.

    1. Have you tried training with the SynthText dataset? If yes, did the benchmark results improve?
    2. Your model uses aligned rectangles rather than 8-coordinate bounding boxes. Would the performance improve if we used tight bounding boxes?

    Thank you!

    opened by GodOfSmallThings 38
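  • Why does the program run on the CPU?

    When I run the program on my own data, why does it run only on the CPU? It uses 550% CPU while the GPU uses only 150M. I hope someone who has hit this pitfall before can give me a hint. Thanks!

    Below are the settings in my /scripts/train.sh: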

    set -x
    set -e
    export CUDA_VISIBLE_DEVICES=0
    IMG_PER_GPU=32

    TRAIN_DIR=$/pixel_link_info

    OLD_IFS="$IFS"
    IFS=","
    gpus=($CUDA_VISIBLE_DEVICES)
    IFS="$OLD_IFS"
    NUM_GPUS=${#gpus[@]}

    BATCH_SIZE=`expr $NUM_GPUS \* $IMG_PER_GPU`

    DATASET=thaiid
    DATASET_DIR=$/tmp

    CUDA_VISIBLE_DEVICES=0 python train_pixel_link.py \
        --train_dir=${TRAIN_DIR} \
        --num_gpus=${NUM_GPUS} \
        --learning_rate=1e-3 \
        --gpu_memory_fraction=-1 \
        --train_image_width=512 \
        --train_image_height=512 \
        --batch_size=${BATCH_SIZE} \
        --dataset_dir=${DATASET_DIR} \
        --dataset_name=${DATASET} \
        --dataset_split_name=train \
        --max_number_of_steps=100 \
        --checkpoint_path=${CKPT_PATH} \
        --using_moving_average=1 2>&1 | tee -a ${TRAIN_DIR}/log.log

    opened by Pro-xiaowen 11
  • Can't reach the results on the paper

    Hi, @DelightRun. I trained on my own dataset and the test results are not good. After 20,000 training steps the loss stops decreasing (it stays at 0.6). Can you give some suggestions for getting better results? Which parameters need to be changed?

    Thank you. @comzyh

    opened by ccnankai 8
  • A question about training icdar2015

    Thank you for your sharing :)

    I tried to train the model on icdar2015 using your network. I created the dataset 'icdar2015_train.tfrecord' with the script 'icdar2015_to_tfrecords.py' and then used './scripts/train.sh' to train the model. However, warnings like these appeared:

    2018-05-03 08:20:52.938117: W tensorflow/core/kernels/draw_bounding_box_op.cc:122] Bounding box (-117,-490,-52,-323) is completely outside the image and will not be drawn.
    2018-05-03 08:20:52.938501: W tensorflow/core/kernels/draw_bounding_box_op.cc:122] Bounding box (-60,-439,-33,-372) is completely outside the image and will not be drawn.
    2018-05-03 08:20:52.938528: W tensorflow/core/kernels/draw_bounding_box_op.cc:122] Bounding box (-141,214,-81,484) is completely outside the image and will not be drawn.

    Why did this happen? What shall I do? Respectfully waiting for a reply~

    opened by lizzyYL 7
  • About the bboxes.

    In the file mtwi_to_tfrecords.py, line 123, why is it bboxes.append([xmin, ymin, xmax, ymax])? I think it should be bboxes.append([ymin, xmin, ymax, xmax]), because in tf.image.draw_bounding_boxes the coordinates of each bounding box in boxes are encoded as [y_min, x_min, y_max, x_max]. Likewise, in ssd_vgg_preprocessing.py, tf.image.sample_distorted_bounding_box returns the distorted bboxes encoded as [y_min, x_min, y_max, x_max].

    opened by jhl13 5
  • test error!!

    (tf3_w) ??@thinkstation:~/w/pixel_link$ ./scripts/test.sh 0 ~/w/pixel_link/pixel_link_vgg_2s/conv2_2/model.ckpt-73018.data-00000-of-00001 ~/w/pixel_link/datasets/data
    ++ set -e
    ++ export CUDA_VISIBLE_DEVICES=0
    ++ CUDA_VISIBLE_DEVICES=0
    ++ python test_pixel_link.py --checkpoint_path=/home/??/w/pixel_link/pixel_link_vgg_2s/conv2_2/model.ckpt-73018.data-00000-of-00001 --dataset_dir=/home/??/w/pixel_link/datasets/data --gpu_memory_fraction=-1
    INFO:tensorflow:loading config.py from /home/??/w/pixel_link/pixel_link_vgg_2s/conv2_2/config.py
    test_pixel_link_on_icdar2015
    Traceback (most recent call last):
      File "test_pixel_link.py", line 176, in <module>
        tf.app.run()
      File "/home/??/anaconda2/envs/tf3_w/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "test_pixel_link.py", line 171, in main
        config_initialization()
      File "test_pixel_link.py", line 79, in config_initialization
        util.proc.set_proc_name('test_pixel_link_on'+ '_' + FLAGS.dataset_name)
      File "/home/??/w/pixel_link/pylib/src/util/proc.py", line 21, in set_proc_name
        setproctitle.setproctitle(name)
    UnboundLocalError: local variable 'setproctitle' referenced before assignment
    

    Hello, I ran into the error above during testing and do not know how to solve it. Could you help me with it?

    opened by tsing-cv 5
  • Problem I met when I try to train with ICDAR2017 dataset.

    When I try to train the model on the ICDAR2017 dataset, I get the problem below. How can I fix this? I used the ICDAR2015 conversion scripts to convert the dataset; is that wrong? Thanks! :)

    INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, All bounding box coordinates must be in [0.0, 1.0]: -0.00061273575 [[Node: ssd_preprocessing_train/distorted_bounding_box_crop/SampleDistortedBoundingBox = SampleDistortedBoundingBox[T=DT_INT32, area_range=[0.1, 1], aspect_ratio_range=[0.5, 2], max_attempts=200, min_object_covered=0.1, seed=0, seed2=0, use_image_if_no_bounding_boxes=true, _device="/job:localhost/replica:0/task:0/cpu:0"](ssd_preprocessing_train/distorted_bounding_box_crop/Shape_1, ssd_preprocessing_train/distorted_bounding_box_crop/ExpandDims)]]

    Caused by op u'ssd_preprocessing_train/distorted_bounding_box_crop/SampleDistortedBoundingBox', defined at:
      File "train_pixel_link.py", line 293, in <module>
        tf.app.run()
      File "/data/app/smallhuang/anaconda2/envs/pixel_link/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "train_pixel_link.py", line 287, in main
        batch_queue = create_dataset_batch_queue(dataset)
      File "train_pixel_link.py", line 136, in create_dataset_batch_queue
        is_training = True)
      File "/data/app/smallhuang/pixel_link/preprocessing/ssd_vgg_preprocessing.py", line 480, in preprocess_image
        data_format=data_format)
      File "/data/app/smallhuang/pixel_link/preprocessing/ssd_vgg_preprocessing.py", line 380, in preprocess_for_train
        area_range = AREA_RANGE)
      File "/data/app/smallhuang/pixel_link/preprocessing/ssd_vgg_preprocessing.py", line 246, in distorted_bounding_box_crop
        use_image_if_no_bounding_boxes=True)
      File "/data/app/smallhuang/anaconda2/envs/pixel_link/lib/python2.7/site-packages/tensorflow/python/ops/gen_image_ops.py", line 989, in sample_distorted_bounding_box
        name=name)
      File "/data/app/smallhuang/anaconda2/envs/pixel_link/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
        op_def=op_def)
      File "/data/app/smallhuang/anaconda2/envs/pixel_link/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/data/app/smallhuang/anaconda2/envs/pixel_link/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
        self._traceback = _extract_stack()

    opened by small-wong 5
  • no module named util

    Thanks to the author for sharing. I am still a beginner. As in the earlier question, when running test.sh I get:

    Traceback (most recent call last):
      File "test_pixel_link.py", line 8, in <module>
        from datasets import dataset_factory
      File "/home/zht/study/PixelLink/pixel_link-master/datasets/dataset_factory.py", line 2, in <module>
        from datasets import dataset_utils
      File "/home/zht/study/PixelLink/pixel_link-master/datasets/dataset_utils.py", line 21, in <module>
        import util
    ImportError: No module named 'util'

    I also could not find where the util file is. Any advice would be appreciated, thanks.

    opened by zwz7712 5
  • i have a question about the learning rate

    When I used the script to train on the ICDAR2015 dataset, NaN appeared after roughly 20k rounds. Why is the learning rate set to 1e-3 first and then fixed to 1e-2?

    opened by xgbj 4
  • After many steps the loss becomes NaN. (loss = Nan)

    Hello, I have been training PixelLink on my own data. I have tried various settings (e.g. different batch sizes, different rotation probabilities, minimum side length, etc.), but I always end up with a NaN loss. Any ideas about the cause of this issue? When I train on ICDAR 2015, the training goes fine. Any clue to solve this issue is appreciated. I am using only 1 GPU (6GB) to train and I have adjusted the related parameters. To avoid running out of GPU memory (OOM), the maximum batch size I use is 6.

    Thank you for your help.

    opened by famunir 2
  • Loss around 0.4 ~0.5, and predict empty box

    After training for about 20k iterations, my loss is still around 0.4 to 0.5. When I use the model to predict, I get empty boxes.

    Does anyone know how to fine-tune the model? Please help me with this problem, thank you.

    opened by jisheng047 2
  • Problem when restoring the trained model : FailedPreconditionError :Attempting to use uninitialized value count_warning

    I have successfully trained a model and want to convert the last checkpoint (.meta and .index outputs) to a frozen model (.pb file).

    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess:
        sess.run(tf.global_variables_initializer())
        # Restore the graph
        saver = tf.train.import_meta_graph(meta_path)
    
        # Load weights
        saver.restore(sess,tf.train.latest_checkpoint('/home/master/small_dt'))
        output_node_names = [n.name for n in tf.get_default_graph().as_graph_def().node]
        frozen_graph_def = tf.graph_util.convert_variables_to_constants(
            sess,
            sess.graph_def,
            output_node_names)
    
        # Save the frozen graph
        with open('output_graph.pb', 'wb') as f:
            f.write(frozen_graph_def.SerializeToString())
            graph_io.write_graph(frozen, './', 'inference_graph.pb', as_text=False)
    
    

    Every time I run my code, there seems to be a problem with the model when trying to restore it. The variable "count_warning", which seems to be an output node, is not initialized.

    How can I fix this issue?

    opened by mohamad-hoseini 0
  • InternalError (see above for traceback): Blas SGEMM launch failed : m=61440, n=2, k=256

    I want to run test_pixel_link_on_any_image.py. I use Windows 10, Anaconda3, CUDA 9, and TensorFlow 1.5.0.

    python test_pixel_link_on_any_image.py --checkpoint_path C:\Users\Administrator\Desktop\LOG\pixel_link\pixel_link_vgg_4s\conv3_3 --dataset_dir C:\Users\Administrator\Desktop\LOG\pixel_link\data --eval_image_width=1280 --eval_image_height=768 --pixel_conf_threshold=0.5 --link_conf_threshold=0.5 --gpu_memory_fraction=-1

    I get this error:

    WARNING:tensorflow:From test_pixel_link_on_any_image.py:76: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version. Instructions for updating: Please switch to tf.train.get_or_create_global_step 2020-04-01 16:32:53.590041: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2 2020-04-01 16:32:54.080675: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1105] Found device 0 with properties: name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7335 pciBusID: 0000:01:00.0 totalMemory: 6.00GiB freeMemory: 4.96GiB 2020-04-01 16:32:54.085798: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1) INFO:tensorflow:Restoring parameters from C:/Users/Administrator/Desktop/LOG/pixel_link/pixel_link_vgg_4s/conv3_3/model.ckpt-38055 2020-04-01 16:34:52.911629: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\stream_executor\cuda\cuda_blas.cc:444] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED 2020-04-01 16:34:52.943963: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\stream_executor\stream.cc:1901] attempting to perform BLAS operation using StreamExecutor without BLAS support Traceback (most recent call last): File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_call return fn(*args) File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1329, in _run_fn status, run_metadata) File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in exit 
c_api.TF_GetCode(self.status.status)) tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=61440, n=2, k=256 [[Node: evaluation_768x1280/pixel_cls/score_from_conv3_3/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](evaluation_768x1280/conv3/conv3_3/Relu, pixel_cls/score_from_conv3_3/weights/read/_193)]] [[Node: evaluation_768x1280/Reshape/_195 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_209_evaluation_768x1280/Reshape", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "test_pixel_link_on_any_image.py", line 157, in tf.app.run() File "C:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 124, in run _sys.exit(main(argv)) File "test_pixel_link_on_any_image.py", line 153, in main test() File "test_pixel_link_on_any_image.py", line 120, in test feed_dict={image: image_data}) File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 895, in run run_metadata_ptr) File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1128, in _run feed_dict_tensor, options, run_metadata) File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1344, in _do_run options, run_metadata) File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1363, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : m=61440, n=2, k=256 [[Node: evaluation_768x1280/pixel_cls/score_from_conv3_3/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](evaluation_768x1280/conv3/conv3_3/Relu, pixel_cls/score_from_conv3_3/weights/read/_193)]] [[Node: evaluation_768x1280/Reshape/_195 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_209_evaluation_768x1280/Reshape", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

    Caused by op 'evaluation_768x1280/pixel_cls/score_from_conv3_3/Conv2D', defined at: File "test_pixel_link_on_any_image.py", line 157, in tf.app.run() File "C:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 124, in run _sys.exit(main(argv)) File "test_pixel_link_on_any_image.py", line 153, in main test() File "test_pixel_link_on_any_image.py", line 88, in test net = pixel_link_symbol.PixelLinkNet(b_image, is_training=False) File "C:\Users\Administrator\Desktop\LOG\pixel_link\nets\pixel_link_symbol.py", line 18, in init self._fuse_feat_layers() File "C:\Users\Administrator\Desktop\LOG\pixel_link\nets\pixel_link_symbol.py", line 150, in _fuse_feat_layers config.num_classes, scope = 'pixel_cls') File "C:\Users\Administrator\Desktop\LOG\pixel_link\nets\pixel_link_symbol.py", line 137, in _fuse_by_cascade_conv1x1_upsample_sum num_classes, current_layer_name) File "C:\Users\Administrator\Desktop\LOG\pixel_link\nets\pixel_link_symbol.py", line 58, in _score_layer normalizer_fn=None) File "C:\Anaconda3\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 182, in func_with_args return func(*args, **current_args) File "C:\Anaconda3\lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py", line 1057, in convolution outputs = layer.apply(inputs) File "C:\Anaconda3\lib\site-packages\tensorflow\python\layers\base.py", line 762, in apply return self.call(inputs, *args, **kwargs) File "C:\Anaconda3\lib\site-packages\tensorflow\python\layers\base.py", line 652, in call outputs = self.call(inputs, *args, **kwargs) File "C:\Anaconda3\lib\site-packages\tensorflow\python\layers\convolutional.py", line 167, in call outputs = self._convolution_op(inputs, self.kernel) File "C:\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 838, in call return self.conv_op(inp, filter) File "C:\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 502, in call return self.call(inp, filter) File 
"C:\Anaconda3\lib\site-packages\tensorflow\python\ops\nn_ops.py", line 190, in call name=self.name) File "C:\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_nn_ops.py", line 725, in conv2d data_format=data_format, dilations=dilations, name=name) File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3160, in create_op op_def=op_def) File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1625, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

    InternalError (see above for traceback): Blas SGEMM launch failed : m=61440, n=2, k=256 [[Node: evaluation_768x1280/pixel_cls/score_from_conv3_3/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](evaluation_768x1280/conv3/conv3_3/Relu, pixel_cls/score_from_conv3_3/weights/read/_193)]] [[Node: evaluation_768x1280/Reshape/_195 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_209_evaluation_768x1280/Reshape", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

    opened by DucLong06 0
  • Invalid Syntax on util

    Hi,

    I get an invalid syntax error in the util submodule:

    print util.sit(image_data)
                 ^
    SyntaxError: invalid syntax
    

    What am I doing wrong?

    Environment: Docker 19.03 Python: 3.6.8 Tensorflow:1.13

    opened by ManuelRaddatz 1
  • ./scripts/train.sh: Permission denied. What is going on? Please help

    ./scripts/train.sh: line 25: /home/yan/disk/All_python_Text/pixel_link-master/dataset/ICDAR2015/Challenge4/train_icdar2015/icdar2015_train.tfrecord: Permission denied

    opened by jun214384468 1
  • TypeError: Can not convert a list into a Tensor or Operation.

    When I run "./scripts/train.sh 0,1 8" :

    Traceback (most recent call last):
      File "train_pixel_link.py", line 293, in <module>
        tf.app.run()
      File "/ai/installs/anaconda3/envs/py27_tf/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
        _sys.exit(main(argv))
      File "train_pixel_link.py", line 288, in main
        train_op = create_clones(batch_queue)
      File "train_pixel_link.py", line 252, in create_clones
        train_op = control_flow_ops.with_dependencies(train_ops, pixel_link_loss, name='train_op')
      File "/ai/installs/anaconda3/envs/py27_tf/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3368, in with_dependencies
        with ops.control_dependencies(dependencies):
      File "/ai/installs/anaconda3/envs/py27_tf/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 5004, in control_dependencies
        return get_default_graph().control_dependencies(control_inputs)
      File "/ai/installs/anaconda3/envs/py27_tf/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 4543, in control_dependencies
        c = self.as_graph_element(c)
      File "/ai/installs/anaconda3/envs/py27_tf/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3490, in as_graph_element
        return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
      File "/ai/installs/anaconda3/envs/py27_tf/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3579, in _as_graph_element_locked
        types_str))
    TypeError: Can not convert a list into a Tensor or Operation.

    opencv_version:2.4.13.4 python:2.7.15 tensorflow:1.12 cuda:9.0

    After I changed TensorFlow 1.12 to TensorFlow 1.1, the problem still remains.

    opened by Lanme 1
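Several of the reports above involve bounding boxes that fall outside the image ("completely outside the image", "All bounding box coordinates must be in [0.0, 1.0]"). When writing a conversion script for your own dataset, one possible precaution is to normalize and clip each box before storing it. The sketch below is not taken from this repo: the function name is hypothetical, and the [xmin, ymin, xmax, ymax] order simply follows the converter line quoted in the bbox issue above.

```python
def normalize_and_clip(box, image_width, image_height):
    """Convert a pixel box [xmin, ymin, xmax, ymax] to normalized coordinates in [0, 1]."""
    def clip01(value):
        return min(max(value, 0.0), 1.0)

    xmin, ymin, xmax, ymax = box
    return [clip01(xmin / float(image_width)),
            clip01(ymin / float(image_height)),
            clip01(xmax / float(image_width)),
            clip01(ymax / float(image_height))]


if __name__ == "__main__":
    # A box that sticks out of a 100 x 100 image on the left and at the bottom.
    print(normalize_and_clip([-10, 5, 50, 120], 100, 100))
```

Boxes that collapse to zero width or height after clipping lie entirely outside the image and are probably best dropped from the annotation.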