Code for "Primitive Representation Learning for Scene Text Recognition" (CVPR 2021)

Ruijie Yan

Last update: Jan 2, 2023


Overview

Primitive Representation Learning Network (PREN)

This repository contains the code for our paper, accepted at CVPR 2021:

Primitive Representation Learning for Scene Text Recognition

Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao

For now we only provide code for PREN.

Requirements

Python 3.7.9, PyTorch 1.4.0, and torchvision 0.5.0
Other libraries can be installed with

pip install -r requirements.txt

Recognition with pretrained model

We provide code for using our pretrained model to recognize text images.

The pretrained model can be downloaded from Baidu Netdisk: download_link (key: 2txt)
After downloading the pretrained model (pren.pth), put it in the "models" folder.
To recognize the three samples in the "samples" folder, run

python recog.py

The expected output is

[Info] Load model from ./models/pren.pth
samples/001.jpg: ronaldo
samples/002.png: leaves
samples/003.jpg: salmon
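
If the predictions are needed programmatically, the printed lines can be parsed back into a mapping. The following is a minimal sketch, assuming every result line has the form "<path>: <text>" as in the output above and that log lines start with "["; parse_recog_output is a hypothetical helper, not part of this repository.

```python
def parse_recog_output(text):
    """Map image path -> recognized string from recog.py-style output.

    Assumption: result lines look like "<path>: <text>" and log lines
    (such as "[Info] ...") start with "[". Illustrative helper only.
    """
    results = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and log messages
        path, sep, prediction = line.partition(": ")
        if sep:  # keep only lines that contain the "path: text" separator
            results[path] = prediction
    return results


output = """[Info] Load model from ./models/pren.pth
samples/001.jpg: ronaldo
samples/002.png: leaves
samples/003.jpg: salmon
"""

results = parse_recog_output(output)
for path, prediction in results.items():
    print(path, "->", prediction)
```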

Training

Two simple steps to train your own model:

Modify training configurations in Configs/trainConf.py
Run python train.py

To run the training code, set image_dir and train_list to point to your own training data.

image_dir is the root directory of the training data.

train_list is the path of a text file containing image paths (relative to image_dir) and the corresponding labels.

For example, image_dir could be './samples', and train_list could be a text file with the following content:

001.jpg RONALDO
002.png LEAVES
003.jpg SALMON
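
The list format above can be read with a few lines of Python. This is a minimal sketch, not the repository's own data loader: read_train_list is a hypothetical helper, and it assumes each line holds a relative image path, whitespace, and then the label, so image paths must not contain spaces.

```python
import os
import tempfile


def read_train_list(image_dir, train_list):
    """Return (image_path, label) pairs from a train_list file.

    Assumption: each non-empty line is "<relative_path> <label>", split at
    the first run of whitespace. Illustrative helper, not PREN code.
    """
    samples = []
    with open(train_list, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(
                    f"line {line_no}: expected '<path> <label>', got {line!r}"
                )
            rel_path, label = parts
            samples.append((os.path.join(image_dir, rel_path), label))
    return samples


# Write the example list to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    list_path = os.path.join(tmp, "train_list.txt")
    with open(list_path, "w", encoding="utf-8") as f:
        f.write("001.jpg RONALDO\n002.png LEAVES\n003.jpg SALMON\n")
    samples = read_train_list("./samples", list_path)

for path, label in samples:
    print(path, label)
```

Running a check like this before training can catch malformed lines early, since a line without a label raises an error that names the line number.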

Evaluation

As with training, modify Configs/testConf.py and run python test.py to evaluate a model.
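
Scene text recognition results are commonly reported as word accuracy: the fraction of images whose predicted string matches the label exactly. The sketch below assumes case-insensitive matching by default. Whether test.py normalizes case or filters characters in the same way is not stated here, so word_accuracy is a hypothetical helper for illustration and not the repository's metric.

```python
def word_accuracy(predictions, labels, case_sensitive=False):
    """Fraction of predictions that match their label exactly.

    Assumption: matching is case-insensitive unless case_sensitive=True.
    Illustrative helper only; not taken from the PREN code.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = 0
    for pred, label in zip(predictions, labels):
        if not case_sensitive:
            pred, label = pred.lower(), label.lower()
        if pred == label:
            correct += 1
    return correct / len(labels)


# Lowercase predictions compared against uppercase labels.
preds = ["ronaldo", "leaves", "salmon"]
labels = ["RONALDO", "LEAVES", "SALMON"]
print(word_accuracy(preds, labels))                       # 1.0
print(word_accuracy(preds, labels, case_sensitive=True))  # 0.0
```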

Acknowledgement

The EfficientNet code is adapted from EfficientNet-PyTorch; we modified it to output multi-scale feature maps.

Citation

If you find this project helpful for your research, please cite our paper

@inproceedings{yan2021primitive,
  author    = {Yan, Ruijie and
               Peng, Liangrui and
               Xiao, Shanyu and
               Yao, Gang},
  title     = {Primitive Representation Learning for Scene Text Recognition},
  booktitle = {CVPR},
  year      = {2021}
}

Comments

Pretrained model ?

Hello, many thanks for your excellent work! I can't download the weights from the Baidu server. Could you upload them to Google Drive or send them by email ([email protected])? Thanks so much.

opened by ThorPham 6
evaluation for special character

Hi, thank you for your nice work! I would like to ask about the evaluation. Some datasets include special characters (out of vocabulary), which the model can't predict. If the model predicts such an unknown character as unk, did you count it as correct in the reported performance, or did you ignore all unk characters?

thank you!

opened by vanche 2
Will the code of Pren2D be available?

Hello, many thanks for your excellent work! We are considering citing your paper ;D

Before that, we need to reproduce the result of Pren2D on our private dataset. So will the code of Pren2D be available?

opened by JingyeChen 1
Failed to reproduce the results in the paper when training from scratch

Hello, we have a problem with reproducing the results in the paper.

With the official code and the default training parameters, we are not able to reach the reported scores except on IC03 and IC13.

Method | Train Opt | Epoch | IC03 | IC13 | IC15 | IIIT5k | SVT | SVTP | CUTE
-- | -- | -- | -- | -- | -- | -- | -- | -- | --
PREN (Paper) | - | - | 94.90 | 94.70 | 79.20 | 92.10 | 92.00 | 83.90 | 81.30
PREN (w/ Official code) | default | 3 | 95.23 | 94.52 | 76.97 | 84.33 | 87.33 | 79.23 | 71.18

We used all data in ST and MJ in LMDB format. We haven't changed any code except to import images and labels. By any chance, did you use preprocessing that does not exist in the current code when creating the image file?

It is also very strange that the score on the CUTE dataset is about 10 points lower than the reported one. Can you guide us in detail on how to reproduce it?

opened by becxer 5
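About the visualization part of the paper

Thank you for sharing; the paper has inspired me a great deal. Regarding Section 4.3, "Visualization and analysis", in the experiments: may I ask how the heat maps in this part were drawn? I have tried many times recently without producing a meaningful figure; most of my results carry no spatial meaning. How can I draw the attention map visualizations shown in your paper? What method is needed, and what is the specific procedure or code? Thank you very much for your work. I look forward to your reply!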

opened by zdz1997 0


