[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

Related tags

Deep Learning AHIQ
Overview

Attention Helps CNN See Better: Hybrid Image Quality Assessment Network

[CVPRW 2022] Code for Hybrid Image Quality Assessment Network

[paper] [code]

This is the official repository for NTIRE2022 Perceptual Image Quality Assessment Challenge Track 1 Full-Reference competition. We won first place in the competition and the codes have been released now.

Abstract: Image quality assessment (IQA) algorithm aims to quantify the human perception of image quality. Unfortunately, there is a performance drop when assessing the distortion images generated by generative adversarial network (GAN) with seemingly realistic texture. In this work, we conjecture that this maladaptation lies in the backbone of IQA models, where patch-level prediction methods use independent image patches as input to calculate their scores separately, but lack spatial relationship modeling among image patches. Therefore, we propose an Attention-based Hybrid Image Quality Assessment Network (AHIQ) to deal with the challenge and get better performance on the GAN-based IQA task. Firstly, we adopt a two-branch architecture, including a vision transformer (ViT) branch and a convolutional neural network (CNN) branch for feature extraction. The hybrid architecture combines interaction information among image patches captured by ViT and local texture details from CNN. To make the features from shallow CNN more focused on the visually salient region, a deformable convolution is applied with the help of semantic information from the ViT branch. Finally, we use a patch-wise score prediction module to obtain the final score. The experiments show that our model outperforms the state-of-the-art methods on four standard IQA datasets and AHIQ ranked first on the Full Reference (FR) track of the NTIRE 2022 Perceptual Image Quality Assessment Challenge.

Overview

Getting Started

Prerequisites

  • Linux
  • NVIDIA GPU + CUDA CuDNN
  • Python 3.7

Dependencies

We recommend running this repository using Anaconda. All dependencies for defining the environment are provided in requirements.txt.

Pretrained Models

You may manually download the pretrained models from Google Drive and put them into checkpoints/ahiq_pipal/, or simply use

sh download.sh

Instruction

use sh train.sh or sh test.sh to train or test the model. You can also change the options in the options/ as you like.

Acknowledgment

The codes borrow heavily from IQT implemented by anse3832 and we really appreciate it.

Citation

If you find our work or code helpful for your research, please consider to cite:

@article{lao2022attentions,
  title   = {Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network},
  author  = {Lao, Shanshan and Gong, Yuan and Shi, Shuwei and Yang, Sidi and Wu, Tianhe and Wang, Jiahao and Xia, Weihao and Yang, Yujiu},
  journal = {arXiv preprint arXiv:2204.10485},
  year    = {2022}
}
Comments
  • about optimizer

    about optimizer

    In paper section 4.2, the author says to use the AdamW optimizer. But actually in the code train.py, the author use the Adam. Does anyone know if the two optimizers will cause a performance difference? Thanks.

    opened by YaoShunyu19 0
  • How to interpret score values?

    How to interpret score values?

    Hi. Thanks for this interesting work.

    I want to use this full reference metric to evaluate the quality of some images for my research.

    I managed to run the code and I tried a few images from the PIPAL validation set (i.e the image of the parrot A0019.bmp and the image with a house A0020.bmp and their corresponding distorted versions A0019_10_04.bmp and A0020_10_04.bmp). I used this model: AHIQ_vit_p8_epoch33.pth.

    When I compare A0019_10_04.bmp with A0019.bmp I get as score 0.466813. For A0020_10_04.bmp with A0020.bmp I get 0.589722. But then when I compare the images with themselves (A0019.bmp with A0019.bmp and A0020.bmp with A0020.bmp) I get the scores 0.691261 and 0.704291, respectively.

    I am wondering that is the score range min and max value for this metric? It is 0 and 1? How do you interpret the score values? Lower means more different? Higher means more similar? For examples, in the case of other FR metrics like SSIM, when I compare two identical images I get a score of 1, so then I can say that the images are identical. But with your metric, how can I tell if 2 images are the same (or different) just by looking at the score values?

    Thank you for your time. I'm looking forward to your answer. Have a great day!

    opened by Ellyuca 1
  • question about CNN Features

    question about CNN Features

    The method uses the features of stage2 of Resnet50 as the result of feature extraction. It's very strange here, why choose stage2, or why not choose one output per stage for concat?

    opened by TimZhang001 0
  • Issue on the Training

    Issue on the Training

    I am trying to run your model, and have already setup the environments. I found this problem regarding vit_base_patch8_224

    do you have any ideas to solve it? Thank you! really appreciate it!

    Screenshot from 2022-08-17 00-36-51

    opened by sonata13 2
  • Ask regarding the PIPAL_NTIRE_Valid_MOS.txt

    Ask regarding the PIPAL_NTIRE_Valid_MOS.txt

    Hi, could you provide us regarding the PIPAL.txt and PIPAL_NTIRE_Valid_MOS.txt? I could not find any related files about that

    Thank you, I am looking forward for your reply.

    opened by sonata13 4
  • Ask regarding the dataset

    Ask regarding the dataset

    Hi, thanks for your hard work and congratulations for your 1st place in the CPVR competition, I am curious on the implementation of your model. and currently, I have tried to deploy your model. Here, I have question:

    1. Where do you get the PIPAL dataset? is it from https://drive.google.com/drive/folders/1G4fLeDcq6uQQmYdkjYUHhzyel4Pz81p- ? is that true? If that so, from that link I saw only training data without validation and test.

    or maybe could you share the dataset?

    Thank you! really appreciate it.

    opened by sonata13 1
Owner
IIGROUP
The Intelligent Interaction Group at Tsinghua University
IIGROUP
Source code for paper "Deep Superpixel-based Network for Blind Image Quality Assessment"

DSN-IQA Source code for paper "Deep Superpixel-based Network for Blind Image Quality Assessment" Requirements Python >=3.8.0 Pytorch >=1.7.1 Usage wit

null 7 Oct 13, 2022
"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (CVPRW 2022) Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Z

Yuanhao Cai 271 Dec 9, 2022
Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Hybrid-Supervised Object Detection System Object detection system trained by hybrid-supervision/weakly semi-supervision (HSOD/WSSOD): This project is

null 4 Mar 21, 2022
Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

RecycleD Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN

Yunan Zhu 23 Nov 5, 2022
Lightweight Face Image Quality Assessment

LightQNet This is a demo code of training and testing [LightQNet] using Tensorflow. Uncertainty Losses: IDQ loss PCNet loss Uncertainty Networks: Mobi

Kaen 5 Nov 18, 2022
No-reference Image Quality Assessment(NIQA) Algorithms (BRISQUE, NIQE, PIQE, RankIQA, MetaIQA)

No-Reference Image Quality Assessment Algorithms No-reference Image Quality Assessment(NIQA) is a task of evaluating an image without a reference imag

Dae-Young Song 21 Nov 12, 2022
Code for Dual Contrastive Learning for Unsupervised Image-to-Image Translation, NTIRE, CVPRW 2021.

arXiv Dual Contrastive Learning Adversarial Generative Networks (DCLGAN) We provide our PyTorch implementation of DCLGAN, which is a simple yet powerf

null 117 Nov 26, 2022
Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019). A PyTorch implementation.

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set —— PyTorch implementation This is an unofficial offici

Sicheng Xu 807 Dec 5, 2022
MagFace: A Universal Representation for Face Recognition and Quality Assessment

MagFace MagFace: A Universal Representation for Face Recognition and Quality Assessment in IEEE Conference on Computer Vision and Pattern Recognition

Qiang Meng 515 Dec 1, 2022
[ICCV 2021] Group-aware Contrastive Regression for Action Quality Assessment

CoRe Created by Xumin Yu*, Yongming Rao*, Wenliang Zhao, Jiwen Lu, Jie Zhou This is the PyTorch implementation for ICCV paper Group-aware Contrastive

Xumin Yu 30 Nov 4, 2022
Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

Implementation of temporal pooling methods studied in [ICIP'20] A Comparative Evaluation Of Temporal Pooling Methods For Blind Video Quality Assessment

Zhengzhong Tu 5 Sep 16, 2022
MRQy is a quality assurance and checking tool for quantitative assessment of magnetic resonance imaging (MRI) data.

Front-end View Backend View Table of Contents Description Prerequisites Running Basic Information Measurements User Interface Feedback and usage Descr

Center for Computational Imaging and Personalized Diagnostics 58 Dec 2, 2022
PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution' (CVPRW 2017)

About PyTorch 1.2.0 Now the master branch supports PyTorch 1.2.0 by default. Due to the serious version problem (especially torch.utils.data.dataloade

Sanghyun Son 2.1k Nov 29, 2022
Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction

Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction Requirements The code has been tested running under Python 3.7.4, with the foll

zshicode 77 Dec 7, 2022
It's a implement of this paper:Relation extraction via Multi-Level attention CNNs

Relation Classification via Multi-Level Attention CNNs It's a implement of this paper:Relation Classification via Multi-Level Attention CNNs. Training

Aybss 2 Nov 4, 2022
Pytorch Implementations of large number classical backbone CNNs, data enhancement, torch loss, attention, visualization and some common algorithms.

Torch-template-for-deep-learning Pytorch implementations of some **classical backbone CNNs, data enhancement, torch loss, attention, visualization and

Li Shengyan 260 Nov 23, 2022
This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021.

inverse_attention This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021. Le

Firas Laakom 5 Jul 8, 2022
Code for NAACL 2021 full paper "Efficient Attentions for Long Document Summarization"

LongDocSum Code for NAACL 2021 paper "Efficient Attentions for Long Document Summarization" This repository contains data and models needed to reprodu

null 55 Sep 16, 2022