An implementation of the SegLink algorithm from the paper Detecting Oriented Text in Natural Images by Linking Segments

Overview

Tip: a more recent scene text detection algorithm, PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link

Contents:

  1. Introduction
  2. Installation & requirements
  3. Datasets
  4. Problems
  5. Models
  6. Test Your own images
  7. Training and evaluation
  8. Some Comments

Introduction

This is a re-implementation of the SegLink text detection algorithm described in the paper Detecting Oriented Text in Natural Images by Linking Segments, by Baoguang Shi, Xiang Bai, and Serge Belongie.

Installation & requirements

  1. tensorflow-gpu 1.1.0

  2. cv2 (OpenCV). I'm using 2.4.9.1, but other versions below 3 should be OK too. If not, try switching to the same version as mine.

  3. Download the project pylib and add its src folder to your PYTHONPATH.

If any other requirement is unmet, just install it by following the error message.
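
To double-check which cv2 version you actually have, a one-liner (nothing repo-specific):

    import cv2
    print(cv2.__version__)  # expect something like '2.4.9.1'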

Datasets

  1. SynthText

  2. ICDAR2015

Convert them into the tfrecords format using the scripts in the datasets directory if you want to train your own model.
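
For orientation, the conversion boils down to serializing each image and its ground-truth boxes into a tf.train.Example. Below is a minimal sketch using TF 1.x APIs; the feature keys and the samples list are illustrative assumptions here, and the actual scripts in datasets define their own:

    import tensorflow as tf

    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    def _float_feature(values):
        return tf.train.Feature(float_list=tf.train.FloatList(value=values))

    # samples: (image_path, flattened oriented-box coordinates) pairs,
    # e.g. parsed from the ICDAR ground-truth txt files; empty here for brevity.
    samples = []

    with tf.python_io.TFRecordWriter('icdar2015_train.tfrecord') as writer:
        for image_path, bboxes in samples:
            with tf.gfile.GFile(image_path, 'rb') as f:
                image_data = f.read()
            example = tf.train.Example(features=tf.train.Features(feature={
                'image/encoded': _bytes_feature(image_data),
                'image/object/bbox': _float_feature(bboxes),
            }))
            writer.write(example.SerializeToString())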

Problems

The convergence of my SegLink implementation is quite slow compared with what is described in the paper. For example, the authors of the SegLink paper state that a good result can be obtained by training on SynthText for fewer than 100k iterations and on IC15-train for fewer than 10k iterations. With my implementation, however, I have to train on SynthText for about 200k iterations, plus more than 100k iterations on IC15-train, to get a competitive result.

Several reasons may contribute to the slow convergence of my model:

  1. Batch size. I don't have four 12G Titans for training, as described in the paper. Instead, I trained my model on two 8G GeForce GTX 1080s or two Titans.
  2. Learning rate. The paper uses 10^-3 and then 10^-4, but I adopted a fixed learning rate of 10^-4 (see the schedule sketch after this list).
  3. Different initialization model. I used the pretrained VGG model from SSD-caffe on COCO, because I thought it would be better than VGG trained on ImageNet. However, that assumption does not seem to hold.
  4. Maybe some other differences exist; I am not sure.
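
As a reference for point 2, here is a minimal sketch of what the paper's two-stage schedule could look like in TF 1.x, as opposed to the fixed 10^-4 used in this repo. The switch point of 60k steps and the momentum value are assumptions for illustration, not taken from the paper or from this code:

    import tensorflow as tf

    global_step = tf.Variable(0, trainable=False, name='global_step')
    # 10^-3 first, then 10^-4; the 60k boundary is an assumed value.
    learning_rate = tf.train.piecewise_constant(
        global_step, boundaries=[60000], values=[1e-3, 1e-4])
    # momentum 0.9 is a typical choice for SSD-style detectors (an assumption here)
    optimizer = tf.train.MomentumOptimizer(learning_rate, momentum=0.9)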

Models

Two models trained on SynthText and IC15 train can be downloaded.

  1. seglink-384. Trained using an image size of 384x384, the same image size as in the paper. Its Hmean is comparable to the result reported in the paper.

The hust_orientedText entry refers to the result reported in the paper.

  2. seglink-512. Trained using an image size of 512x512; about one point better than 384x384.

They have been trained:

  • on SynthText for about 200k iterations, and on IC15-train for 100k~200k iterations

  • learning_rate = 10^-4

  • two GPUs

  • 384: GTX 1080, batch_size = 24; 512: Titan, batch_size = 20

Both models perform best at seg_conf_threshold=0.8 and link_conf_threshold=0.5; this is another difference from the paper, which uses 0.9 and 0.7 respectively. (Both thresholds are command-line flags of test_seglink.py, set in scripts/test.sh.)

Test Your own images

Use the script test_seglink.py; a shortcut is provided in scripts/test.sh.

Go to the seglink root directory and execute the command:


./scripts/test.sh GPU_ID CKPT_PATH DATASET_DIR

For example:


./scripts/test.sh 0 ~/models/seglink/model.ckpt-217867  ~/dataset/ICDAR2015/Challenge4/ch4_training_images

I have only tested my models on IC15-test, but any other images can be used for testing: just put your images into a directory and pass that directory to the command as DATASET_DIR.

A bunch of txt files and a zip file are created after the test. If you are using IC15-test for testing, you can upload this zip file to the ICDAR evaluation server directly.
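
If you want to post-process those txt files yourself, they should follow the usual ICDAR2015 submission format, i.e., one box per line as eight comma-separated corner coordinates. A minimal reader under that assumption (not part of this repo):

    def read_boxes(txt_path):
        """Read one detection file; each line: x1,y1,x2,y2,x3,y3,x4,y4."""
        boxes = []
        with open(txt_path) as f:
            for line in f:
                values = [v for v in line.strip().split(',') if v]
                if len(values) >= 8:
                    boxes.append([int(float(v)) for v in values[:8]])
        return boxes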

The text files are placed in a subdirectory of the checkpoint directory; they contain the bounding boxes of the detection results and can be visualized using the script visualize_detection_result.py.

The command looks like:


python visualize_detection_result.py \
    --image=where your images are put \
    --det=the directory of the text files output by test_seglink.py \
    --output=the output directory of detection results drawn on images

For example:


python visualize_detection_result.py \
    --image=~/dataset/ICDAR2015/Challenge4/ch4_training_images/ \
    --det=~/models/seglink/seglink_icdar2015_without_ignored/eval/icdar2015_train/model.ckpt-72885/seg_link_conf_th_0.900000_0.700000/txt \
    --output=~/temp/no-use/seglink_result_512_train

Training and evaluation

Training requires data preparation, i.e., converting the datasets into tfrecords; the conversion scripts are in the datasets directory. The scripts train_seglink.py and eval_seglink.py are the training and evaluation scripts respectively. In particular, I have implemented an offline evaluation function that calculates Recall/Precision/Hmean the same way as the ICDAR test server does, and it can be used for cross-validation and grid search. The resulting scores may differ slightly from those of the test server, but not by much. Sorry for the incomplete documentation here; read and modify these scripts if you want to train your own model.
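
For reference, the Hmean reported by this offline evaluation is the standard ICDAR F-measure, i.e., the harmonic mean of Recall and Precision. A tiny helper showing the relation (not the repo's evaluation code):

    def hmean(recall, precision):
        """ICDAR F-measure: harmonic mean of recall and precision."""
        if recall + precision == 0:
            return 0.0
        return 2.0 * recall * precision / (recall + precision)

    # e.g., with the 512x512 numbers quoted in an issue below:
    # hmean(0.61840743, 0.78477693) -> ~0.6917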

Some Comments

Thanks should be given to the authors of the SegLink paper: Baoguang Shi, Xiang Bai, and Serge Belongie.

EAST is another text detection paper, accepted by CVPR 2017, and its reported result is better than that of SegLink. However, when both use the same VGG16 backbone, their performance is quite similar.

If you have any problems, contact me through GitHub issues.

Some Notes On Implementation Detail

How the groundtruth is calculated, in Chinese: http://fromwiz.com/share/s/34GeEW1RFx7x2iIM0z1ZXVvc2yLl5t2fTkEg2ZVhJR2n50xg

Comments
  • How to train on my own datasets?

    Hi dengdan, thank you for your hard work. I am trying to train a SegLink model on my own dataset, and I have run into the following situation:

    1. I only have one GPU, a TITAN XP. I have rewritten your training scripts, but I get some warnings.
    2. Which pretrained model should I prepare for training? I got ImageNet VGG16 checkpoints from the SSD-tensorflow project; which part of the code should I rewrite to train from this pretrained model?

    Thank you again, your work is awesome.

    opened by BowieHsu 16
  • INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Retval[0] does not have value

    INFO:tensorflow:global step 109662: loss = 5.3843 (0.160 sec/step)
    INFO:tensorflow:global step 109663: loss = 4.5832 (0.256 sec/step)
    INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Retval[0] does not have value
    INFO:tensorflow:global step 109664: loss = 8.8361 (0.098 sec/step)
    INFO:tensorflow:Finished training! Saving model to disk.
    Traceback (most recent call last):
      File "./train_seglink.py", line 275, in <module>
        tf.app.run()
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "./train_seglink.py", line 271, in main
        train(train_op)
      File "./train_seglink.py", line 260, in train
        session_config = sess_config
      ... (slim/learning.py, supervisor.py, and coordinator.py frames elided) ...
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Retval[0] does not have value

    My TF version is 1.2.1; I also tried running on 1.1.0 and the same error occurs.

    opened by zhangshuaitao 14
  • test algorithm on own images

    I am proceeding exactly as you describe. However, once I run the code I receive

    DataLossError (see above for traceback): Unable to open table file /home.net/vs17dow/Desktop/A_M-arbeit/G_Code/E_seglink/models/model.ckpt-136750.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator? [[Node: save/RestoreV2_24 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_24/tensor_names, save/RestoreV2_24/shape_and_slices)]] [[Node: save/RestoreV2_56/_73 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_260_save/RestoreV2_56", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

    ./scripts/test.sh 0 ~/Desktop/A_M-arbeit/G_Code/E_seglink/models/seglink/model.ckpt-136750.data-00000-of-00001 ~/Desktop/A_M-arbeit/G_Code/E_seglink/datasets/mydata/

    I do not see an error in my command. Any advice on how to proceed?

    Best

    Valentin

    opened by idefix92 10
  • About transform_cv_rect

    When a rectangle is rotated to the horizontal direction, should the width side be horizontal? Should the length of the width side be larger than the length of the height side?

    The input parameter rects is generated by the minAreaRect function of OpenCV.

    
    import numpy as np

    def transform_cv_rect(rects):
        only_one = False
        if len(np.shape(rects)) == 1:
            rects = np.expand_dims(rects, axis = 0)
            only_one = True
        assert np.shape(rects)[1] == 5, 'The shape of rects must be (N, 5), but meet %s'%(str(np.shape(rects)))
        rects = np.asarray(rects, dtype = np.float32).copy()
        num_rects = np.shape(rects)[0]
        for idx in xrange(num_rects):
            cx, cy, w, h, theta = rects[idx, ...]
            #assert theta < 0 and theta >= -90, "invalid theta: %f"%(theta)
            if abs(theta) > 45 or (abs(theta) == 45 and w < h):
                w, h = [h, w]
                theta = 90 + theta
            rects[idx, ...] = [cx, cy, w, h, theta]
        if only_one:
            return rects[0, ...]
        return rects
    
    

    After using the above function, it seems it still cannot guarantee that the width is larger than the height.

    I'm so confused.

    opened by abc8350712 8
  • UnboundLocalError: local variable 'setproctitle' referenced before assignment

    Hi, I have the dependencies installed for seglink already.

    However, when I tried to run the program, I encountered this problem:

    (screenshot of the error omitted)

    Could anyone advise how I can solve this?

    Thanks!

    opened by JohnWnx 3
  • About bboxes_filter_overlap

    I read the code and something confuses me. In the data augmentation process, the following function appears:

    bboxes_filter_overlap(labels, bboxes,xs, ys, threshold, scope=None, assign_negative = False)

    Can the values in the bboxes parameter be negative?

    I am looking forward to your help!

    opened by abc8350712 3
  • Trained model

    Could you please share a Drive or Dropbox link with a quicker server? Unfortunately, I cannot download any of the models. Once I start downloading, it tells me 20 h... for 300 MB? As a consequence, Chrome stops the download.

    BR

    Valentin

    opened by idefix92 3
  • exceptions.AttributeError: 'module' object has no attribute 'cv'  When run test_seglink.py

    I ran test_seglink.py, and the script hits the error below when it reaches the line "image_bboxes = sess.run([bboxes_pred], feed_dict = {image:image_data, image_shape:image_data.shape})". I use tensorflow-gpu (1.2.0) and Python 2.7.

    Traceback (most recent call last):
      File "", line 1, in <module>
        runfile('/home/user/MZH/seglink-master/test_seglink.py', args='--dataset_dir=datasets/ICDAR-Test-Images --checkpoint_path=/home/user/MZH/seglink-master/seglink-384/model.ckpt-136750', wdir='/home/user/MZH/seglink-master')
      ... (Spyder runfile frames elided) ...
      File "/home/user/MZH/seglink-master/test_seglink.py", line 164, in <module>
        tf.app.run()
      File "/home/user/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "/home/user/MZH/seglink-master/test_seglink.py", line 160, in main
        eval()
      File "/home/user/MZH/seglink-master/test_seglink.py", line 147, in eval
        image_bboxes = sess.run([bboxes_pred], feed_dict = {image:image_data, image_shape:image_data.shape})
      File "/home/user/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
        run_metadata_ptr)
      File "/home/user/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
        feed_dict_string, options, run_metadata)
      File "/home/user/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
        target_list, options, run_metadata)
      File "/home/user/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
        raise type(e)(node_def, op, message)
    UnknownError: exceptions.AttributeError: 'module' object has no attribute 'cv'
      [[Node: test/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](test/strided_slice_4, test/strided_slice_5, test/strided_slice_2, test/strided_slice_3, test/PyFunc/input_4, test/PyFunc/input_5)]]

    Caused by op u'test/PyFunc', defined at:
      ... (IPython kernel and Spyder frames elided) ...
      File "/home/user/MZH/seglink-master/test_seglink.py", line 98, in eval
        link_conf_threshold = config.link_conf_threshold)
      File "tf_extended/seglink.py", line 680, in tf_seglink_to_bbox
        tf.float32)
      ... (tensorflow op-construction frames elided) ...

    UnknownError (see above for traceback): exceptions.AttributeError: 'module' object has no attribute 'cv' [[Node: test/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](test/strided_slice_4, test/strided_slice_5, test/strided_slice_2, test/strided_slice_3, test/PyFunc/input_4, test/PyFunc/input_5)]]

    opened by 100200cs 3
  • When I add focal loss, the time of evaluation is very slow

    Sorry to bother you. I changed the loss to focal loss, and it trains normally. However, when I try to evaluate it, evaluation is very slow and I cannot understand why. This is the code:

    def focal_loss(self, onehot_labels, cls_preds,
                   alpha=0.25, gamma=2.0, name=None, scope=None):
        with tf.name_scope(scope, 'focal_loss', [cls_preds, onehot_labels]) as sc:
            logits = tf.convert_to_tensor(cls_preds)
            onehot_labels = tf.convert_to_tensor(onehot_labels)
            precise_logits = tf.cast(logits, tf.float32) if (
                logits.dtype == tf.float16) else logits
            onehot_labels = tf.cast(onehot_labels, precise_logits.dtype)
            predictions = tf.nn.sigmoid(precise_logits)
            predictions_pt = tf.where(tf.equal(onehot_labels, 1), predictions, 1. - predictions)
            # add a small value to avoid log(0)
            epsilon = 1e-8
            alpha_t = tf.scalar_mul(alpha, tf.ones_like(onehot_labels, dtype=tf.float32))
            alpha_t = tf.where(tf.equal(onehot_labels, 1.0), alpha_t, 1 - alpha_t)
            losses = tf.reduce_mean(-alpha_t * tf.pow(1. - predictions_pt, gamma) * tf.log(predictions_pt + epsilon),
                                    name=name)
            return losses
    
    def build_loss(self, seg_labels, seg_offsets, link_labels, do_summary=True):
        batch_size = config.batch_size_per_gpu
    
        # note that for label values in both seg_labels and link_labels:
        #    -1 stands for negative
        #     1 stands for positive
        #     0 stands for ignored
        def get_pos_and_neg_masks(labels):
            if config.train_with_ignored:
                pos_mask = labels >= 0
                neg_mask = tf.logical_not(pos_mask)
            else:
                pos_mask = tf.equal(labels, 1)
                neg_mask = tf.equal(labels, -1)
    
            return pos_mask, neg_mask
    
        def OHNM_single_image(scores, n_pos, neg_mask):
            """Online Hard Negative Mining.
                scores: the scores of being predicted as negative cls
                n_pos: the number of positive samples
                neg_mask: mask of negative samples
                Return:
                    the mask of selected negative samples.
                    if n_pos == 0, no negative samples will be selected.
            """
    
            def has_pos():
                n_neg = n_pos * config.max_neg_pos_ratio
                max_neg_entries = tf.reduce_sum(tf.cast(neg_mask, tf.int32))
                n_neg = tf.minimum(n_neg, max_neg_entries)
                n_neg = tf.cast(n_neg, tf.int32)
                neg_conf = tf.boolean_mask(scores, neg_mask)
                vals, _ = tf.nn.top_k(-neg_conf, k=n_neg)
                threshold = vals[-1]  # a negtive value
                selected_neg_mask = tf.logical_and(neg_mask, scores <= -threshold)
                return tf.cast(selected_neg_mask, tf.float32)
    
            def no_pos():
                return tf.zeros_like(neg_mask, tf.float32)
    
            return tf.cond(n_pos > 0, has_pos, no_pos)
    
        def OHNM_batch(neg_conf, pos_mask, neg_mask):
            selected_neg_mask = []
            for image_idx in xrange(batch_size):
                image_neg_conf = neg_conf[image_idx, :]
                image_neg_mask = neg_mask[image_idx, :]
                image_pos_mask = pos_mask[image_idx, :]
                n_pos = tf.reduce_sum(tf.cast(image_pos_mask, tf.int32))
                selected_neg_mask.append(OHNM_single_image(image_neg_conf, n_pos, image_neg_mask))
    
            selected_neg_mask = tf.stack(selected_neg_mask)
            selected_mask = tf.cast(pos_mask, tf.float32) + selected_neg_mask
            return selected_mask
    
        # OHNM on segments
        seg_neg_scores = self.seg_scores[:, :, 0]
        seg_pos_mask, seg_neg_mask = get_pos_and_neg_masks(seg_labels)
        seg_selected_mask = OHNM_batch(seg_neg_scores, seg_pos_mask, seg_neg_mask)
        n_seg_pos = tf.reduce_sum(tf.cast(seg_pos_mask, tf.float32))
    
        with tf.name_scope('seg_cls_loss'):
            def has_pos():
                #seg_cls_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                #    logits=self.seg_score_logits,
                #    labels=tf.cast(seg_pos_mask, dtype=tf.int32))
                #return tf.reduce_sum(seg_cls_loss * seg_selected_mask) / n_seg_pos
                seg_cls_loss = self.focal_loss(tf.one_hot(seg_labels, 2), self.seg_score_logits)
                return seg_cls_loss
            def no_pos():
                return tf.constant(.0);
    
            seg_cls_loss = tf.cond(n_seg_pos > 0, has_pos, no_pos)
            tf.add_to_collection(tf.GraphKeys.LOSSES, seg_cls_loss)
    
        def smooth_l1_loss(pred, target, weights):
            diff = pred - target
            abs_diff = tf.abs(diff)
            abs_diff_lt_1 = tf.less(abs_diff, 1)
            if len(target.shape) != len(weights.shape):
                loss = tf.reduce_sum(tf.where(abs_diff_lt_1, 0.5 * tf.square(abs_diff), abs_diff - 0.5), axis=2)
                return tf.reduce_sum(loss * tf.cast(weights, tf.float32))
            else:
                loss = tf.where(abs_diff_lt_1, 0.5 * tf.square(abs_diff), abs_diff - 0.5)
                return tf.reduce_sum(loss * tf.cast(weights, tf.float32))
    
        with tf.name_scope('seg_loc_loss'):
            def has_pos():
                seg_loc_loss = smooth_l1_loss(self.seg_offsets, seg_offsets,
                                              seg_pos_mask) * config.seg_loc_loss_weight / n_seg_pos
                names = ['loc_cx_loss', 'loc_cy_loss', 'loc_w_loss', 'loc_h_loss', 'loc_theta_loss']
                sub_loc_losses = []
                from tensorflow.python.ops import control_flow_ops
                for idx, name in enumerate(names):
                    name_loss = smooth_l1_loss(self.seg_offsets[:, :, idx], seg_offsets[:, :, idx],
                                               seg_pos_mask) * config.seg_loc_loss_weight / n_seg_pos
                    name_loss = tf.identity(name_loss, name=name)
                    if do_summary:
                        tf.summary.scalar(name, name_loss)
                    sub_loc_losses.append(name_loss)
                seg_loc_loss = control_flow_ops.with_dependencies(sub_loc_losses, seg_loc_loss)
                return seg_loc_loss
    
            def no_pos():
                return tf.constant(.0);
    
            seg_loc_loss = tf.cond(n_seg_pos > 0, has_pos, no_pos)
            tf.add_to_collection(tf.GraphKeys.LOSSES, seg_loc_loss)
    
        link_neg_scores = self.link_scores[:, :, 0]
        link_pos_mask, link_neg_mask = get_pos_and_neg_masks(link_labels)
        link_selected_mask = OHNM_batch(link_neg_scores, link_pos_mask, link_neg_mask)
        n_link_pos = tf.reduce_sum(tf.cast(link_pos_mask, dtype=tf.float32))
        with tf.name_scope('link_cls_loss'):
            def has_pos():
                #link_cls_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                #    logits=self.link_score_logits,
                #    labels=tf.cast(link_pos_mask, tf.int32))
                #return tf.reduce_sum(link_cls_loss * link_selected_mask) / n_link_pos
                link_cls_loss = self.focal_loss(tf.one_hot(link_labels, 2), self.link_score_logits)
                return link_cls_loss
            def no_pos():
                return tf.constant(.0);
    
            link_cls_loss = tf.cond(n_link_pos > 0, has_pos, no_pos) * config.link_cls_loss_weight
            tf.add_to_collection(tf.GraphKeys.LOSSES, link_cls_loss)
    
        if do_summary:
            tf.summary.scalar('seg_cls_loss', seg_cls_loss)
            tf.summary.scalar('seg_loc_loss', seg_loc_loss)
            tf.summary.scalar('link_cls_loss', link_cls_loss)
    

    Thanks in advance.

    opened by abc8350712 2
  • The f-measure of evaluation

    I used the command

    
    python eval_seglink.py --checkpoint_path=./seglink/model.ckpt-136750  --dataset_name=icdar2015 --dataset_split_name=test --dataset_dir=./tf_records
    

    to evaluate the model provided by you (the seglink-384 model).

    I changed 'seg_conf_threshold' and 'link_conf_threshold' to 0.8 and 0.5 respectively.

    When I set the test image size to 384x384, the result is Recall, Precision, Fmean = [0.48117587][0.72381693][0.57806695].

    When I set the test image size to 512x512, the result is Recall, Precision, Fmean = [0.61840743][0.78477693][0.69172925].

    It doesn't match the result you provided. Is there anything I missed?

    opened by abc8350712 2
  • Errors while changing the basenet

    When I try to change the VGG net to ResNet, it doesn't work.

    I mainly changed the vgg.py file like this:

    def basenet(inputs):
        logit, endpoints =resnet_50(inputs)
        endpoints['conv4_3'] = endpoints['vgg/resnet_50/block2/unit_2']
        endpoints['fc7'] = endpoints['vgg/resnet_50/block3/unit_4']
        return endpoints['fc7'], endpoints
    
    # try to keep the outputs the same as the original net
    

    However, it doesn't work; it fails with:

    Traceback (most recent call last):
      File "/home/moon/seglink-master/train_seglink.py", line 276, in <module>
        tf.app.run()
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
        _sys.exit(main(_sys.argv[:1] + flags_passthrough))
      File "/home/moon/seglink-master/train_seglink.py", line 271, in main
        train_op = create_clones(batch_queue)
      File "/home/moon/seglink-master/train_seglink.py", line 220, in create_clones
        averaged_gradients = sum_gradients(gradients)
      File "/home/moon/seglink-master/train_seglink.py", line 164, in sum_gradients
        grad = tf.add_n(grads, name = v.op.name + '_summed_gradients')
      File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1918, in add_n
        raise ValueError("inputs must be a list of at least one Tensor with the ")
    ValueError: inputs must be a list of at least one Tensor with the same dtype and shape

    My coarse ResNet implementation is as follows:

    import tensorflow as tf
    import collections
    
    slim = tf.contrib.slim
    
    Block = collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])
    
    def subsample(inputs, factor, scope=None):
        if factor == 1:
            return inputs
        else:
            return slim.max_pool2d(inputs, [1, 1], stride= factor, scope=scope)
    
    def bottleneck(inputs,
                   depth,
                   depth_bottleneck,
                   stride,
                   outputs_collections='collections',
                   scope=None):
    
        with tf.variable_scope(scope) as sc:
            depth_in = slim.utils.last_dimension(inputs.get_shape(), min_rank=4)
            preact = slim.batch_norm(inputs, activation_fn=tf.nn.relu, scope='preact')
            if depth==depth_in:
                shortcut = subsample(inputs, stride, 'shortcut')
            else:
    
                shortcut = slim.conv2d(preact, depth, [1, 1],
                                       stride=stride, normalizer_fn=None,
                                       activation_fn=None, scope='shortcut')
            residual = slim.conv2d(preact, depth_bottleneck, [1, 1], stride=1, scope='conv1')
    
            residual = slim.conv2d(residual, depth_bottleneck, [3, 3], stride=stride, padding='SAME', scope='conv2')
    
            residual = slim.conv2d(residual, depth, [1, 1], stride=1, scope='conv3')
    
            output = shortcut+residual
    
            return slim.utils.collect_named_outputs(outputs_collections, sc.name, output)
    
    
    
    def resnet_50(input):
        blocks = [
            Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
            Block(
                'block2', bottleneck, [(512, 128, 1)] * 3 + [(512, 128, 2)]),
            Block(
                'block3', bottleneck, [(1024, 256, 1)] * 5 + [(1024, 256, 2)]),
            Block(
                'block4', bottleneck, [(2048, 512, 1)] * 3)]
        net = input
        net = slim.conv2d(net, 64, 7, stride=2, scope='conv1', padding='SAME')
        net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1')
        with tf.variable_scope('resnet_50'):
            for i, block in enumerate(blocks):
                with tf.variable_scope(block.scope):
                    args = block.args
                    for j, arg in enumerate(args):
                        depth, depth_bottleneck, stride = arg
                        net = bottleneck(net, depth, depth_bottleneck, stride, scope='unit_'+str(j))
        endpoints = slim.utils.convert_collection_to_dict('collections')
        return net, endpoints
    

    Can you help me figure it out? Is there any example for changing basenet? Thank you! @dengdan

    opened by abc8350712 2
  • Different output between .pb and .ckpt file

    Hello! Thanks for this great work on text detection. I have recently been working on a text detection task and applying this model; the performance on my own data after training is wonderful. However, when I converted my training checkpoint (ckpt) to a pb model, the inference results of the pb changed a lot. I cannot find out where the problem is; could you help me locate it?

    Here is my conversion code from ckpt to pb:

    
    import tensorflow as tf
    from tensorflow.python.framework import graph_util

    def freezeGraph(input_checkpoint, output_nodes_names, output_graph):
        '''
        :param input_checkpoint: path to the .ckpt checkpoint to freeze
        :param output_graph: path of the output .pb file
        :return:
        '''
        # checkpoint = tf.train.get_checkpoint_state(model_folder)
        # input_checkpoint = checkpoint.model_checkpoint_path

        output_node_names = output_nodes_names
        saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=True)
        graph = tf.get_default_graph()
        input_graph_def = graph.as_graph_def()

        with tf.Session() as sess:
            saver.restore(sess, input_checkpoint)
            # bake the variables into constants so the graph is self-contained
            output_graph_def = graph_util.convert_variables_to_constants(
                sess=sess,
                input_graph_def=input_graph_def,
                output_node_names=output_node_names)

            with tf.gfile.GFile(output_graph, "wb") as f:
                f.write(output_graph_def.SerializeToString())
            print("%d ops in the final graph." % len(output_graph_def.node))


    if __name__ == '__main__':
        nodes = getOutNodes('./seglink.txt')  # getOutNodes: the user's helper that reads node names
        freezeGraph('./ckpt/model.ckpt-5882', nodes, './pb/seglink.pb')
    

    The file seglink.txt records the output nodes' names (attached: seglink.txt).

    I wonder if it is a problem with the conversion.

    Many thanks if you can offer me some help with this problem!

    opened by YIYANGCAI 0
  • The links for pretrained models are expired.

    Hello, thank you for the effort. The links you provided for the pre-trained models don't work; they may have expired. Could you please provide another link from which I can download these pre-trained models? Thank you!

    opened by famunir 0
  • Loss stays in [3, 4]; what can I do?

    Thanks for your work~ I used your 384 pre-trained model to fine-tune on my own dataset of 70,000 images, but my loss always stays around 3. Can you tell me what I can do? (screenshot omitted)

    opened by unshaven 1
  • About the test<(=╥﹏╥=)>

    Hey... can you help me with this bug? ~( >﹏<) It took me all day to debug, but nothing changed:

    ++ set -e
    ++ export CUDA_VISIBLE_DEVICES=0
    ++ CHECKPOINT_PATH=/home/zhengdl/.local_config/seglink-master/models/model.ckpt-217867
    ++ DATASET_DIR=/home/zhengdl/.local_config/seglink-master/test_imgs/
    ++ python test_seglink.py --checkpoint_path=/home/zhengdl/.local_config/seglink-master/models/model.ckpt-217867 --gpu_memory_fraction=-1 --seg_conf_threshold=0.8 --link_conf_threshold=0.5 --dataset_dir=/home/zhengdl/.local_config/seglink-master/test_imgs/
    ... (a logging TypeError raised from test_seglink.py line 113 and CPU-feature-guard warnings elided) ...
    2019-05-10 05:14:02.874247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: name: Tesla K80 ...
    INFO:tensorflow:Restoring parameters from /home/zhengdl/.local_config/seglink-master/models/model.ckpt-217867
    Traceback (most recent call last):
      File "/home/zhengdl/anaconda3/envs/qq2/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 82, in __call__
        ret = func(*args)
      File "/home/zhengdl/.local_config/seglink-master/tf_extended/seglink.py", line 712, in seglink_to_bbox
        bboxes = bboxes_to_xys(bboxes, image_shape)
      File "/home/zhengdl/.local_config/seglink-master/tf_extended/seglink.py", line 808, in bboxes_to_xys
        points = cv2.cv.BoxPoints(bbox)
    AttributeError: 'module' object has no attribute 'cv'
    2019-05-10 05:14:12.480069: W tensorflow/core/framework/op_kernel.cc:1152] Internal: Failed to run py callback pyfunc_0: see error log.
    ... (a long dump of resnet_v1_101 variable names and further session frames elided) ...
    tensorflow.python.framework.errors_impl.InternalError: Failed to run py callback pyfunc_0: see error log.
      [[Node: test/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_FLOAT], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](test/strided_slice_4/_189, test/strided_slice_5/_191, test/strided_slice_2/_193, test/strided_slice_3/_195, test/PyFunc/input_4, test/PyFunc/input_5)]]

    Caused by op u'test/PyFunc', defined at:
      File "test_seglink.py", line 155, in <module>
        tf.app.run()
      ... (frames elided) ...
      File "test_seglink.py", line 94, in eval
        link_conf_threshold = config.link_conf_threshold)
      File "/home/zhengdl/.local_config/seglink-master/tf_extended/seglink.py", line 680, in tf_seglink_to_bbox
        tf.float32)
      ... (py_func op-construction frames elided) ...

    InternalError (see above for traceback): Failed to run py callback pyfunc_0: see error log.

    opened by 1157942086 0