[CVPRW 2022] Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network

IIGROUP

Last update: Dec 11, 2022

Overview

Attention Helps CNN See Better: Hybrid Image Quality Assessment Network

[CVPRW 2022] Code for Hybrid Image Quality Assessment Network

[paper] [code]

This is the official repository for the NTIRE 2022 Perceptual Image Quality Assessment Challenge, Track 1 (Full-Reference). We won first place in the competition, and the code is now released.

Abstract: Image quality assessment (IQA) algorithms aim to quantify the human perception of image quality. Unfortunately, performance drops when assessing distorted images generated by generative adversarial networks (GANs) with seemingly realistic textures. In this work, we conjecture that this maladaptation lies in the backbone of IQA models, where patch-level prediction methods use independent image patches as input to calculate their scores separately but lack spatial relationship modeling among image patches. Therefore, we propose an Attention-based Hybrid Image Quality Assessment Network (AHIQ) to deal with the challenge and get better performance on the GAN-based IQA task. First, we adopt a two-branch architecture, including a vision transformer (ViT) branch and a convolutional neural network (CNN) branch for feature extraction. The hybrid architecture combines interaction information among image patches captured by ViT and local texture details from CNN. To make the features from the shallow CNN more focused on the visually salient region, a deformable convolution is applied with the help of semantic information from the ViT branch. Finally, we use a patch-wise score prediction module to obtain the final score. The experiments show that our model outperforms the state-of-the-art methods on four standard IQA datasets, and AHIQ ranked first on the Full-Reference (FR) track of the NTIRE 2022 Perceptual Image Quality Assessment Challenge.
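To make the last step of the abstract more concrete, here is a minimal sketch of one common way to pool per-patch predictions into a single image score: a weighted average in which each patch carries its own importance weight. This is an illustration under assumptions, not the repository's implementation. The function name `patchwise_score` and the plain-list inputs are hypothetical, and the actual module in this repository may pool differently.

```python
from typing import Sequence


def patchwise_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Pool per-patch quality scores into one image-level score.

    Hypothetical helper: assumes each patch has a predicted score and a
    non-negative importance weight, and returns their weighted average.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(s * w for s, w in zip(scores, weights))
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Two patches: the second is weighted three times as heavily as the first.
    print(patchwise_score([0.2, 0.8], [1.0, 3.0]))
```

With this kind of pooling, patches that the model considers salient pull the final score toward their own prediction, while low-weight patches contribute little.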

Overview

Getting Started

Prerequisites

Linux
NVIDIA GPU + CUDA CuDNN
Python 3.7

Dependencies

We recommend running this repository using Anaconda. All dependencies for defining the environment are provided in requirements.txt.

Pretrained Models

You may manually download the pretrained models from Google Drive and put them into checkpoints/ahiq_pipal/, or simply run:

sh download.sh

Instruction

Use sh train.sh or sh test.sh to train or test the model. You can also change the options in options/ as needed.

Acknowledgment

The code borrows heavily from IQT, implemented by anse3832, and we really appreciate it.

Citation

If you find our work or code helpful for your research, please consider citing:

@article{lao2022attentions,
  title   = {Attentions Help CNNs See Better: Attention-based Hybrid Image Quality Assessment Network},
  author  = {Lao, Shanshan and Gong, Yuan and Shi, Shuwei and Yang, Sidi and Wu, Tianhe and Wang, Jiahao and Xia, Weihao and Yang, Yujiu},
  journal = {arXiv preprint arXiv:2204.10485},
  year    = {2022}
}

Comments

about optimizer

In Section 4.2 of the paper, the authors say they use the AdamW optimizer, but the code in train.py actually uses Adam. Does anyone know whether the two optimizers will cause a performance difference? Thanks.
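For readers weighing this question, the sketch below shows where the two update rules differ. It is a scalar toy example, not taken from train.py: the loss (p squared), the hyperparameters, and the names `adam_step` and `run` are all assumptions made for illustration. Adam with weight decay adds an L2 term to the gradient before the adaptive scaling, while AdamW shrinks the parameter directly (decoupled weight decay). With weight decay set to zero the two rules coincide, so any difference depends on the weight decay actually configured. The sketch says nothing about how large the effect is for AHIQ.

```python
import math


def adam_step(p, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
              eps=1e-8, wd=0.0, decoupled=False):
    """One Adam (decoupled=False) or AdamW (decoupled=True) update on a scalar."""
    if decoupled:
        # AdamW: decay the parameter directly, outside the adaptive scaling.
        p = p * (1.0 - lr * wd)
    else:
        # Adam with L2 regularization: fold the decay into the gradient.
        grad = grad + wd * p
    m = b1 * m + (1.0 - b1) * grad          # first-moment estimate
    v = b2 * v + (1.0 - b2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - b1 ** t)             # bias correction
    v_hat = v / (1.0 - b2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v


def run(decoupled, wd, steps=100):
    """Minimize the toy loss p**2 starting from p = 1.0 (assumed setup)."""
    p, m, v = 1.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * p
        p, m, v = adam_step(p, grad, m, v, t, wd=wd, decoupled=decoupled)
    return p


if __name__ == "__main__":
    print("wd=0.0  Adam:", run(False, 0.0), " AdamW:", run(True, 0.0))
    print("wd=0.1  Adam:", run(False, 0.1), " AdamW:", run(True, 0.1))
```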

opened by YaoShunyu19 0

How to interpret score values?

Hi. Thanks for this interesting work.

I want to use this full reference metric to evaluate the quality of some images for my research.

I managed to run the code and tried a few images from the PIPAL validation set (i.e., the image of the parrot, A0019.bmp, and the image with a house, A0020.bmp, and their corresponding distorted versions A0019_10_04.bmp and A0020_10_04.bmp). I used this model: AHIQ_vit_p8_epoch33.pth.

When I compare A0019_10_04.bmp with A0019.bmp, I get a score of 0.466813. For A0020_10_04.bmp with A0020.bmp, I get 0.589722. But when I compare the images with themselves (A0019.bmp with A0019.bmp and A0020.bmp with A0020.bmp), I get the scores 0.691261 and 0.704291, respectively.

I am wondering what the minimum and maximum values of the score range are for this metric. Are they 0 and 1? How do you interpret the score values? Does lower mean more different and higher mean more similar? For example, with other FR metrics like SSIM, when I compare two identical images I get a score of 1, so I can say that the images are identical. But with your metric, how can I tell whether two images are the same (or different) just by looking at the score values?
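The SSIM behaviour mentioned above follows directly from its formula, as the minimal sketch below shows. It computes SSIM over a single global window on flat lists of pixel values, which is a simplification (standard SSIM averages over local windows), and the function name `global_ssim` and the sample pixel values are assumptions for illustration. It is unrelated to how AHIQ produces its score and is included only to show why identical inputs give exactly 1 for SSIM.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized flat lists of pixel values."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    # Standard stabilizing constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2.0 * mean_x * mean_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    reference = [0.0, 50.0, 100.0, 150.0]
    distorted = [10.0, 40.0, 120.0, 130.0]
    print("identical:", global_ssim(reference, reference))
    print("distorted:", global_ssim(distorted, reference))
```

When both inputs are the same, the means, variances, and covariance match, so the numerator and denominator are equal and the ratio is 1. A learned metric has no such built-in identity, which is why its output for identical images need not be 1.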

Thank you for your time. I'm looking forward to your answer. Have a great day!

opened by Ellyuca 1

question about CNN Features

The method uses the stage-2 features of ResNet50 as the result of feature extraction. This seems strange: why choose stage 2, rather than taking one output per stage and concatenating them?

opened by TimZhang001 0

Issue on the Training

I am trying to run your model and have already set up the environment. I ran into this problem regarding vit_base_patch8_224.

Do you have any ideas on how to solve it? Thank you, I really appreciate it!

opened by sonata13 2

Ask regarding the PIPAL_NTIRE_Valid_MOS.txt

Hi, could you provide PIPAL.txt and PIPAL_NTIRE_Valid_MOS.txt? I could not find any files related to them.

Thank you, I am looking forward to your reply.

opened by sonata13 5

Ask regarding the dataset

Hi, thanks for your hard work, and congratulations on your 1st place in the CVPR competition. I am curious about the implementation of your model and have tried to deploy it. I have a question:

Where did you get the PIPAL dataset? Is it from https://drive.google.com/drive/folders/1G4fLeDcq6uQQmYdkjYUHhzyel4Pz81p- ? If so, that link appears to contain only the training data, without the validation and test sets.

Or could you share the dataset?

Thank you, I really appreciate it.

opened by sonata13 1
