A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)


https://arxiv.org/abs/2203.09388

Jianqi Ma, Zhetong Liang, Lei Zhang
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China & OPPO Research

Recovering TextZoom samples

TATT visualization

Environment:

Python, PyTorch, CUDA and NumPy

Other Python packages that may be needed: pyyaml, cv2 (OpenCV), Pillow and imgaug
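A quick way to see which of the packages listed above are still missing is sketched below. It is only an illustration: the import names are assumptions (pyyaml is usually imported as `yaml` and Pillow as `PIL`), and the helper `missing_modules` is hypothetical, not part of this repository.

```python
import importlib.util

# Import names are assumptions: pyyaml imports as "yaml", Pillow as "PIL".
REQUIRED = ["torch", "numpy"]
OPTIONAL = ["yaml", "cv2", "PIL", "imgaug"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required group suggests the core environment is in place; the check does not verify versions or CUDA availability.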

Main idea

The pipeline

TP Interpreter

Configure your training

Download the pretrained recognizer from:

Aster: https://github.com/ayumiymk/aster.pytorch  
MORAN: https://github.com/Canjie-Luo/MORAN_v2  
CRNN: https://github.com/meijieru/crnn.pytorch

Unzip the code and change into '$TATT_ROOT$/', then place the pretrained recognizer weights in '$TATT_ROOT$/'.

Download the TextZoom dataset:

https://github.com/JasonBoy1/TextZoom

Train the corresponding model (e.g. TPGSR-TSRN):

chmod a+x train_TATT.sh
./train_TATT.sh

Run the test-prefixed shell script to test the corresponding model.

Add '--go_test' to the command in the shell file.
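How a switch such as '--go_test' could select test mode is sketched below. This is not the repository's actual entry point: only the flag name comes from the text above, while the parser, the `store_true` behaviour and the `select_mode` helper are assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the flag name '--go_test' comes from the README.
    parser = argparse.ArgumentParser(description="train/test switch sketch")
    parser.add_argument(
        "--go_test",
        action="store_true",
        help="run evaluation instead of training (assumed meaning)",
    )
    return parser


def select_mode(argv):
    """Return 'test' when --go_test is present, otherwise 'train'."""
    args = build_parser().parse_args(argv)
    return "test" if args.go_test else "train"


if __name__ == "__main__":
    print(select_mode([]))
    print(select_mode(["--go_test"]))
```

With a boolean flag of this kind, the same shell file can serve both purposes, and the flag only needs to be appended to the existing command line.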

Cite this paper:

@article{ma2021text,
title={A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution},
author={Ma, Jianqi and Liang, Zhetong and Zhang, Lei},
journal={},
year={2022}
}
Comments
  • I met a problem in training: "No such file or directory: 'ckpt/TATT/model_best_acc_0.pth'". How can I solve it?

    Thanks for your work. I ran into a problem in training. How can I solve it? First, I got this error:

    No such file or directory: 'ckpt/TATT/

    Then I created a directory named 'TATT' in 'ckpt', but I got another error:

    No such file or directory: 'ckpt/TATT/model_best_acc_0.pth'

    I'm working on my graduation project and need to reproduce your code, so this is important to me. Thanks for your work; I look forward to your reply!

    opened by Meow-2 1
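For the missing 'ckpt/TATT/' directory reported above, one defensive pattern is to create the parent directory before anything is written and to check that a checkpoint file exists before trying to load it. The sketch below shows that pattern with the standard library only. It does not diagnose why the repository's code looks for this file, and the helper names `ensure_parent_dir` and `checkpoint_exists` are hypothetical.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the directory that will contain `path` if it does not exist yet."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return parent


def checkpoint_exists(path):
    """Return True only if `path` points at an existing regular file."""
    return os.path.isfile(path)


if __name__ == "__main__":
    # A temporary directory stands in for the project root in this demo.
    with tempfile.TemporaryDirectory() as root:
        ckpt = os.path.join(root, "ckpt", "TATT", "model_best_acc_0.pth")
        ensure_parent_dir(ckpt)
        print("directory created:", os.path.isdir(os.path.dirname(ckpt)))
        print("checkpoint present:", checkpoint_exists(ckpt))
```

Creating the directory removes the first error; the second one persists until a checkpoint has actually been saved or copied to that path.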
  • images_sr = model(images_hr)

    At line 1734 in interfaces/super_resolution.py,

    images_sr = model(images_hr)

    It seems that HR images are fed into the model. Is this the intended behavior?

    opened by chnoguchi 1
  • Pretrained model

    The paper was accepted to CVPR 2022 a long time ago, and there is still no pretrained model for TextZoom. In closed issue #1 the authors said they would release a pretrained model in a later version, but they still have not uploaded a checkpoint for measuring their performance.

    opened by chhkang 1
  • Location of the code for your text prior architecture

    Could you point out where your text prior architecture is located in your code? I really want to know how your architecture uses the output of CRNN as the text prior. It is hard to find.

    Another question: does your code not use the text prior in testing?

    opened by HellbroWDR 0
  • If the word length is greater than 4, the letter "e" seems to be inserted as the third character.

    At lines 1918-1921 in dataset/dataset.py,

    if len(word) > 4:
      word = [ch for ch in word]
      word[2] = "e"
      word = "".join(word)
    

    the letter "e" seems to be inserted into gt labels. What is the intention behind this process?

    opened by chnoguchi 0
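The effect of the snippet quoted in the issue above can be reproduced in isolation. The sketch below wraps the same four lines in a hypothetical function, `overwrite_third_char`, and shows that the assignment `word[2] = "e"` overwrites the character at index 2 rather than inserting a new one, so the label length stays the same. It says nothing about why the dataset code does this.

```python
def overwrite_third_char(word):
    """Mirror the quoted snippet: for words longer than 4, set index 2 to 'e'."""
    if len(word) > 4:
        chars = list(word)
        chars[2] = "e"  # assignment replaces the character; length is unchanged
        word = "".join(chars)
    return word


if __name__ == "__main__":
    for sample in ["text", "hello", "world"]:
        print(sample, "->", overwrite_third_char(sample))
```

Words of four characters or fewer pass through untouched, which matches the `len(word) > 4` condition in the quoted code.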
  • Where is RPE used?

    I've read the supplement describing the details of the recurrent positional encoding (RPE).

    However, I cannot find the code in which RPE is implemented and used.

    Would the authors kindly point out where the implementation of RPE is in the released codebase?

    opened by killawhale2 1
  • IndexError: too many indices for tensor of dimension 4

    Thanks for sharing! When I test with vis = true in the code, this error comes up: "too many indices for tensor of dimension 4". I checked the size of the input; it is (16, 1, 32, 100). What should I do next?

    opened by foxbeing7 2
Owner
MA Jianqi, shiki
The code for the CVPR 2021 paper Neural Deformation Graphs, a novel approach for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

Neural Deformation Graphs Project Page | Paper | Video Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction Aljaž Božič, Pablo P

Aljaz Bozic 134 Dec 16, 2022
PyTorch code for our paper "Attention in Attention Network for Image Super-Resolution"

Under construction... Attention in Attention Network for Image Super-Resolution (A2N) This repository is a PyTorch implementation of the paper "Atten

Haoyu Chen 71 Dec 30, 2022
MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution (CVPR2021)

MASA-SR Official PyTorch implementation of our CVPR2021 paper MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Re

DV Lab 126 Dec 20, 2022
PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network"

HAN PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network" This repository is for HAN introduced in the

五维空间 140 Nov 23, 2022
Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Pop-Out Motion Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022) Jihyun Lee*, Minhyuk Sung*, Hyunjin Kim, Tae-Ky

Jihyun Lee 88 Nov 22, 2022
PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).

Image Super-Resolution with Non-Local Sparse Attention This repository is for NLSN introduced in the following paper "Image Super-Resolution with Non-

null 143 Dec 28, 2022
PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

Yulun Zhang 1.2k Dec 26, 2022
【Arxiv】Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution

SANet Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution Dependencies numpy==1.18.5 scikit_image==0.16.2 torchvision==0.8.1 to

null 36 Jan 5, 2023
Unofficial pytorch implementation of the paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution"

DFSA Unofficial pytorch implementation of the ICCV 2021 paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution" (p

null 2 Nov 15, 2021
Image Super-Resolution Using Very Deep Residual Channel Attention Networks

Image Super-Resolution Using Very Deep Residual Channel Attention Networks

kongdebug 14 Oct 14, 2022
Code for C2-Matching (CVPR2021). Paper: Robust Reference-based Super-Resolution via C2-Matching.

C2-Matching (CVPR2021) This repository contains the implementation of the following paper: Robust Reference-based Super-Resolution via C2-Matching Yum

Yuming Jiang 151 Dec 26, 2022
pytorch implementation for Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network arXiv:1609.04802

PyTorch SRResNet Implementation of Paper: "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network"(https://arxiv.org/abs

Jiu XU 436 Jan 9, 2023
Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021) This repository is the official PyTorc

Jingyun Liang 139 Dec 29, 2022
Official implementation of the paper 'Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution'

DASR Paper Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution Jie Liang, Hui Zeng, and Lei Zhang. In arxiv preprint. Abs

null 81 Dec 28, 2022
DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates

DeepMetaHandles (CVPR2021 Oral) [paper] [animations] DeepMetaHandles is a shape deformation technique. It learns a set of meta-handles for each given

Liu Minghua 73 Dec 15, 2022
Joint Learning of 3D Shape Retrieval and Deformation, CVPR 2021

Joint Learning of 3D Shape Retrieval and Deformation Joint Learning of 3D Shape Retrieval and Deformation Mikaela Angelina Uy, Vladimir G. Kim, Minhyu

Mikaela Uy 38 Oct 18, 2022
Pytorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors

Make-A-Scene - PyTorch Pytorch implementation (unofficial) of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors (https://arxiv.org/

Casual GAN Papers 259 Dec 28, 2022
Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

Tengfei Wang 110 Dec 20, 2022