PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

Overview

This is an unofficial implementation of CDistNet.

All modules described in the paper are implemented except TPS in the visual branch. Refer to ASTER for an implementation of TPS.

Requirements

Python 3.6.8
lmdb==0.98
torch==1.5.1
torchvision==0.6.1
Pillow==6.1.0
opencv-python==4.2.0.32
numpy==1.17.1

Data preparation

A tool is provided to convert a raw dataset into an LMDB dataset. For details, refer to tools/create_lmdb_dataset.py.

You can also download the LMDB datasets from OCR_Dataset.
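
As a rough illustration of what a raw-to-LMDB conversion involves, here is a minimal sketch. It is not the code in tools/create_lmdb_dataset.py. The key layout (image-%09d, label-%09d, num-samples, 1-based) is an assumption borrowed from a convention common in scene text recognition projects, and the helper names build_lmdb_cache and write_cache are hypothetical. Check the actual tool before relying on any of this.

```python
def build_lmdb_cache(samples):
    """Map a list of (image_bytes, label) pairs to LMDB key/value byte pairs.

    ASSUMPTION: keys follow the image-%09d / label-%09d / num-samples layout
    with 1-based indices. This repo's tool may use a different layout.
    """
    cache = {}
    for index, (image_bytes, label) in enumerate(samples, start=1):
        cache[b"image-%09d" % index] = image_bytes
        cache[b"label-%09d" % index] = label.encode("utf-8")
    cache[b"num-samples"] = str(len(samples)).encode("utf-8")
    return cache


def write_cache(lmdb_path, cache, map_size=1 << 30):
    """Write the key/value pairs into an LMDB environment at lmdb_path."""
    import lmdb  # imported here so build_lmdb_cache works without lmdb installed

    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()


# Two fake samples: raw encoded image bytes plus the ground-truth text.
example_cache = build_lmdb_cache([(b"<jpeg bytes>", "hello"), (b"<png bytes>", "world")])
```

Calling write_cache("some_output_dir", example_cache) would then persist the entries. The map_size value is an arbitrary upper bound chosen for this sketch.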

Train

First, modify the following arguments in configs/cdistnet.yml.

  • TrainReader: path to the training LMDB dataset.
  • EvalReader: path to the evaluation LMDB dataset.
  • Global: general arguments such as image_shape and dict_file.
  • VisualModule: arguments for the visual branch described in the original paper.
  • PositionalEmbedding: arguments for the positional branch.
  • SemanticEmbedding: arguments for the semantic branch.
  • MDCDP: arguments for the MDCDP module.

Then start training:

python train.py -c configs/cdistnet.yml
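
Before launching a long training run, it can help to confirm that the config contains every section listed above. The sketch below assumes the YAML file has already been parsed into a Python dict and that the section names appear as top-level keys. Both points are assumptions about the config layout, and missing_sections is a hypothetical helper, not part of this repository.

```python
# Section names taken from the list above. Treating them as top-level
# keys of the parsed config is an ASSUMPTION about the file layout.
REQUIRED_SECTIONS = (
    "TrainReader",
    "EvalReader",
    "Global",
    "VisualModule",
    "PositionalEmbedding",
    "SemanticEmbedding",
    "MDCDP",
)


def missing_sections(config):
    """Return the required section names that are absent from config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# A deliberately incomplete config used only to show the check.
partial_config = {"TrainReader": {}, "EvalReader": {}, "Global": {}}
absent = missing_sections(partial_config)
```

An empty list means all listed sections are present. It says nothing about whether the values inside each section are valid.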

Demo

Modify the following arguments in configs/cdistnet.yml.

  • pretrain_weights: path to the model file.
  • infer_img: path to the input image.
  • is_train: set to False.

python predict.py -c configs/cdistnet.yml

TODO

  • Pretrained models
  • Test code
  • Comparison with the original paper on benchmarks (CUTE, IC13, IC15, IIIT5K, SVT, SVTP)
Comments
  • Possible to train only language modelling?

    I have a relatively small dataset in a different format (license plates), and the model often gets the license plate format wrong.

    I was wondering whether there is a way to train the model on text strings alone, without feeding any images, in order to enforce the format.

    Please let me know if it is possible to train the language/semantic model independently by feeding only text strings, without corresponding images.

    opened by siddagra 0
  • Faced PicklingError

    2022-02-02 16:31:18,549 [INFO ]  dataset_root:    root\is\here
             dataset: \
    sub-directory:  /.       num samples: 3736
    num total samples of total dataset is 3736
    
    
    2022-02-02 16:31:18,559 [INFO ]  num total samples of \: 3736 x 1.0 (total_data_usage_ratio) = 3736
    num samples of \ per batch: 32 x 1.0 (batch_ratio) = 32
    
    Traceback (most recent call last):
      File "train.py", line 71, in <module>
        main()
      File "train.py", line 48, in main
        train_loader = build_data_loader(flags, mode='train')
      File "root\is\here\CdistNet2\program.py", line 84, in build_data_loader
        dataloader = create_module(dataloader_infor)(flags)
      File "root\is\here\CdistNet2\dataset.py", line 114, in __init__
        self.dataloader_iter_list.append(iter(_data_loader))
      File "I:\Anaconda\envs\CDistNet2\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
        return _MultiProcessingDataLoaderIter(self)
      File "I:\Anaconda\envs\CDistNet2\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
        w.start()
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\process.py", line 105, in start
        self._popen = self._Popen(self)
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\context.py", line 223, in _Popen
        return _default_context.get_context().Process._Popen(process_obj)
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\context.py", line 322, in _Popen
        return Popen(process_obj)
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
        reduction.dump(process_obj, to_child)
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    _pickle.PicklingError: Can't pickle <class 'flags.FLAGS'>: attribute lookup FLAGS on flags failed
    
    (CDistNet2) I:\Google_Drive\Hyundai\Learnin\CdistNet2>Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\spawn.py", line 105, in spawn_main
        exitcode = _main(fd)
      File "I:\Anaconda\envs\CDistNet2\lib\multiprocessing\spawn.py", line 115, in _main
        self = reduction.pickle.load(from_parent)
    EOFError: Ran out of input
    

    I don't know exactly what the problem is, but it looks like a namedtuple issue.
    Is there any way to get around this problem?

    opened by qkfxhqrkrrl 0
  • Not predicting any text

    Thank you for your work. Training accuracy is good, but at prediction time the model outputs no text; every result is empty. Could you please let me know what the problem is?

    opened by bharatsubedi 0
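
Regarding the PicklingError reported above: the traceback shows the DataLoader starting worker processes with the Windows spawn method, which pickles the objects handed to each worker. One possible cause, unconfirmed because the source does not show how FLAGS is defined, is a namedtuple whose type name matches no module-level variable. The sketch below reproduces that failure mode with the standard library only. The names _Settings, batch_size and num_workers are hypothetical.

```python
import pickle
from collections import namedtuple

# Hypothetical reproduction: the type name "FLAGS" does not match the
# variable the class is bound to, so pickle cannot find the class by name.
_Settings = namedtuple("FLAGS", ["batch_size", "num_workers"])
flags = _Settings(batch_size=32, num_workers=4)

try:
    pickle.dumps(flags)
    pickling_failed = False
except pickle.PicklingError:
    pickling_failed = True

# Possible workaround: pass a plain dict, which pickles without any
# class lookup, and rebuild whatever structure is needed in the worker.
plain_flags = dict(flags._asdict())
restored = pickle.loads(pickle.dumps(plain_flags))
```

If this is the cause, two other options may be worth trying: bind the namedtuple to a module-level variable with the same name as its type name, or set num_workers=0 on the DataLoader so that no worker processes are spawned.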