A PyTorch Implementation of Neural IMage Assessment

Overview

NIMA: Neural IMage Assessment

Python 3.6+ MIT License

This is a PyTorch implementation of the paper NIMA: Neural IMage Assessment (accepted at IEEE Transactions on Image Processing) by Hossein Talebi and Peyman Milanfar. You can learn more from this post on the Google Research Blog.

Implementation Details

  • The model was trained on the AVA (Aesthetic Visual Analysis) dataset, which contains 255,500+ images. You can get it from here. Note: the dataset may contain some corrupted images. Remove them before you start training, or use the provided CSVs, which already exclude them.

  • The dataset is split into 229,981 images for training, 12,691 images for validation and 12,818 images for testing.

  • An ImageNet-pretrained VGG-16 is used as the base network. It should be easy to plug in the other two options (MobileNet and Inception-v2).

  • The learning rate setting differs from the original paper, because I could not get the model to converge with the original parameters. I also did little hyperparameter tuning, so you could probably get better results. All other settings are mirrored directly from the paper.

Requirements

The code is written using PyTorch 1.8.1 with CUDA 11.1. To recreate the environment I used and install the dependencies with conda, run

conda env create -f env.yml

Usage

To start training on the AVA dataset, first download the dataset from the link above and decompress it, which should create a directory named images/. Then download the curated annotation CSVs below, which already split the dataset (you can of course create your own split). Then run

python main.py --img_path /path/to/images/ --train --train_csv_file /path/to/train_labels.csv --val_csv_file /path/to/val_labels.csv --conv_base_lr 5e-4 --dense_lr 5e-3 --decay --ckpt_path /path/to/ckpts --epochs 100 --early_stopping_patience 10

For inference, do

python -W ignore test.py --model /path/to/your_model --test_csv /path/to/test_labels.csv --test_images /path/to/images --predictions /path/to/save/predictions

See predictions/ for dumped predictions as an example.

Training Statistics

Training is done with early stopping. Here I set early_stopping_patience=10.
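As a rough illustration of how a patience setting behaves, the sketch below stops once the validation loss has failed to improve for `patience` consecutive epochs. The `EarlyStopping` class and the loss values are hypothetical and are not taken from main.py, which may track improvement differently.

```python
class EarlyStopping:
    """Signal a stop once the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0  # consecutive epochs without improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses: improvement stops after the second epoch.
stopper = EarlyStopping(patience=3)
losses = [0.080, 0.075, 0.076, 0.077, 0.078, 0.079]
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at)
```

With a patience of 3, the loop above stops three epochs after the best loss was seen. A real training loop would typically also save a checkpoint whenever the best loss improves.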

Pretrained Model

The pretrained model reaches ~0.069 EMD on validation. It has not fully converged yet (training was constrained by resources). To continue training, download the pretrained weights and add --warm_start --warm_start_epoch 34 to your args.

Google Drive
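For reference, the EMD figure above can be read against the loss defined in the NIMA paper: the normalized distance between the cumulative distributions of the ground-truth and predicted ratings, with r = 2 during training. The sketch below is a framework-free reading of that definition, not the code in model.py. The function name `emd` and the example distributions are assumptions made for illustration.

```python
from itertools import accumulate


def emd(p_true, p_pred, r=2):
    """Normalized Earth Mover's Distance between two distributions over ordered bins."""
    if len(p_true) != len(p_pred):
        raise ValueError("distributions must have the same number of bins")
    # Compare cumulative distributions (CDFs), not the per-bin probabilities.
    cdf_true = list(accumulate(p_true))
    cdf_pred = list(accumulate(p_pred))
    n = len(p_true)
    total = sum(abs(a - b) ** r for a, b in zip(cdf_true, cdf_pred))
    return (total / n) ** (1 / r)


# Hypothetical example: ground truth puts all votes on score 5, prediction on score 6.
p_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
p_pred = [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
print(emd(p_true, p_pred))
```

Because the comparison is made on CDFs, a prediction that is off by one score bin is penalized less than one that is off by several bins, which a per-bin comparison would not capture.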

Annotation CSV Files

Train Validation Test

Example Results

  • Here are some good predictions from the test set. Each image title starts with the ground-truth rating, followed by the predicted mean and std in parentheses.

  • Here are some failure cases. The model seems to fail mostly on images with very low or very high aesthetic ratings.

  • The predicted aesthetic ratings from training on the AVA dataset are sensitive to contrast adjustments, preferring images with higher contrast. Below, the top row is the reference image with contrast c=1.0, while the bottom images are enhanced with contrast [0.25, 0.75, 1.25, 1.75]. Contrast adjustment is done using ImageEnhance.Contrast from PIL (in this case pillow-simd).
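The mean and std shown in the image titles can be derived from a predicted 10-bin distribution roughly as follows. This is a minimal sketch, assuming bin k (0-based) corresponds to AVA score k + 1. `mean_and_std` and the example distribution are hypothetical, not code from this repo.

```python
import math


def mean_and_std(probs):
    """Return (mean, std) of a rating distribution over scores 1..len(probs)."""
    # AVA votes run from 1 to 10, so bin k (0-based) corresponds to score k + 1.
    scores = range(1, len(probs) + 1)
    mean = sum(s * p for s, p in zip(scores, probs))
    variance = sum(p * (s - mean) ** 2 for s, p in zip(scores, probs))
    return mean, math.sqrt(variance)


# Hypothetical distribution: all mass split evenly between scores 5 and 6.
probs = [0, 0, 0, 0, 0.5, 0.5, 0, 0, 0, 0]
mean, std = mean_and_std(probs)
print(f"{mean:.2f} ({std:.2f})")  # prints "5.50 (0.50)"
```

Note that the second sum is a variance, so the square root is needed to report a standard deviation, and the scores must start at 1 rather than 0.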

License

MIT

Comments
  • The EMD loss still seems to be wrong

    https://github.com/kentsyx/Neural-IMage-Assessment/blob/9c50b3e384a88a8afdc00333d01656be5526bfed/model.py#L36

    The EMD loss still seems to be wrong. In my opinion, the sum operation should be inside torch.abs.

    Originally posted by @luqiang360 in https://github.com/kentsyx/Neural-IMage-Assessment/issues/4#issuecomment-456781249

    opened by luqiang360 9
  • Problematic Implementation of EMD Loss

    The loss should be the L2 distance between the CDFs of the two distributions, not between their PDFs.

    There are also some typos in the naming, such as emb and emd.

    opened by VoVAllen 6
  • Bump pillow from 7.1.2 to 8.1.1

    Bumps pillow from 7.1.2 to 8.1.1.

    Release notes

    Sourced from pillow's releases.

    8.1.1

    https://pillow.readthedocs.io/en/stable/releasenotes/8.1.1.html

    8.1.0

    https://pillow.readthedocs.io/en/stable/releasenotes/8.1.0.html

    Changelog

    Sourced from pillow's changelog.

    8.1.1 (2021-03-01)

    • Use more specific regex chars to prevent ReDoS. CVE-2021-25292 [hugovk]

    • Fix OOB Read in TiffDecode.c, and check the tile validity before reading. CVE-2021-25291 [wiredfool]

    • Fix negative size read in TiffDecode.c. CVE-2021-25290 [wiredfool]

    • Fix OOB read in SgiRleDecode.c. CVE-2021-25293 [wiredfool]

    • Incorrect error code checking in TiffDecode.c. CVE-2021-25289 [wiredfool]

    • PyModule_AddObject fix for Python 3.10 #5194 [radarhere]

    8.1.0 (2021-01-02)

    • Fix TIFF OOB Write error. CVE-2020-35654 #5175 [wiredfool]

    • Fix for Read Overflow in PCX Decoding. CVE-2020-35653 #5174 [wiredfool, radarhere]

    • Fix for SGI Decode buffer overrun. CVE-2020-35655 #5173 [wiredfool, radarhere]

    • Fix OOB Read when saving GIF of xsize=1 #5149 [wiredfool]

    • Makefile updates #5159 [wiredfool, radarhere]

    • Add support for PySide6 #5161 [hugovk]

    • Use disposal settings from previous frame in APNG #5126 [radarhere]

    • Added exception explaining that repr_png saves to PNG #5139 [radarhere]

    • Use previous disposal method in GIF load_end #5125 [radarhere]

    ... (truncated)

    Commits
    • 741d874 8.1.1 version bump
    • 179cd1c Added 8.1.1 release notes to index
    • 7d29665 Update CHANGES.rst [ci skip]
    • d25036f Credits
    • 973a4c3 Release notes for 8.1.1
    • 521dab9 Use more specific regex chars to prevent ReDoS
    • 8b8076b Fix for CVE-2021-25291
    • e25be1e Fix negative size read in TiffDecode.c
    • f891baa Fix OOB read in SgiRleDecode.c
    • cbfdde7 Incorrect error code checking in TiffDecode.c
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • I met this error when run python3 main.py

    Hi, @kentsyx @George3d6

    I met this error when run python3 main.py

    /home/tezro/.local/lib/python3.7/site-packages/torchvision/transforms/transforms.py:187: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
      warnings.warn("The use of the transforms.Scale transform is deprecated, " +
    Trainable params: 14.97 million
    /home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
      annotations = self.annotations.iloc[idx, 1:].as_matrix()
    /home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
      annotations = self.annotations.iloc[idx, 1:].as_matrix()
    /home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
      annotations = self.annotations.iloc[idx, 1:].as_matrix()
    /home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
      annotations = self.annotations.iloc[idx, 1:].as_matrix()
    Traceback (most recent call last):
      File "main.py", line 241, in <module>
        main(config)
      File "main.py", line 96, in main
        for i, data in enumerate(train_loader):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 819, in __next__
        return self._process_data(data)
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 846, in _process_data
        data.reraise()
      File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 385, in reraise
        raise self.exc_type(msg)
    RuntimeError: Caught RuntimeError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
        return self.collate_fn(data)
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
        return {key: default_collate([d[key] for d in batch]) for key in elem}
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
        return {key: default_collate([d[key] for d in batch]) for key in elem}
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
        return torch.stack(batch, 0, out=out)
    RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689

    My System: Ubuntu 19.04, Pytorch 1.4, Torchvision 0.4.2, TitanXP.

    Thanks in advance. Best from @bemoregt.

    opened by bemoregt 1
  • Issue with the test.py

    At line [66](https://github.com/kentsyx/Neural-IMage-Assessment/blob/f0028cd27de5cdb20a21c2b896999b3505bcb4f6/test.py#L66), it should be l+1 rather than l, because AVA votes start from 1 not 0.

    opened by chingjunehao 1
  • Pre-trained model giving vague results

    I am trying to implement this for a single image and not getting any mean value below 5.0. The good quality images also at times return low values.

    I am sharing the main.py file, please check if anything is wrong with the code.

    import argparse
    import os
    
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt
    
    import torch
    from torch import no_grad
    import torch.autograd as autograd
    import torch.optim as optim
    
    import torchvision.transforms as transforms
    import torchvision.datasets as dsets
    import torchvision.models as models
    
    import torch.nn.functional as F
    
    from model import *
    
    import cv2
    file_name = 'bad'
    filename = '/home/shayan/Projects/NIMA/images/'+file_name+'.jpg'
    
    image = cv2.imread(filename)
    image = cv2.resize(image,(224,224))
    
    img_arr = image.transpose(2, 0, 1) # C x H x W
    img_arr = np.expand_dims(img_arr,axis = 0)
    print(img_arr.shape)
    
    img_tensor = torch.from_numpy(img_arr)
    img_tensor = img_tensor.type('torch.FloatTensor')
    print(img_tensor.shape,img_tensor.size)
    
    cuda = torch.cuda.is_available()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    if cuda:
        print("Device: GPU")
    else:
        print("Device: CPU")
        
    base_model = models.vgg16(pretrained=True)
    model = NIMA(base_model)
    
    model.load_state_dict(torch.load("/home/shayan/Projects/NIMA/epoch-12.pkl", map_location=lambda storage, loc: storage))
    print("Successfully loaded model")
    
    with torch.no_grad():
    
        model.eval()
    
    output = model(img_tensor)
    output = output.view(10, 1)
    
    predicted_mean, predicted_std = 0.0, 0.0
    for i, elem in enumerate(output, 1):
        predicted_mean += i * elem
    for j, elem in enumerate(output, 1):
        predicted_std += elem * (j - predicted_mean) ** 2
    print("________________")
    print(u"({}) \u00B1{}".format(round(float(predicted_mean),2), round(float(predicted_std), 2)))  
    opened by shayan09 1
  • Doubt regarding computing standard deviation

    In lines 195-196 of main.py you compute the standard deviation of the score. The for loop iterates over the variable j, while the variable used inside the loop is i. I just wanted to check whether this is correct. Thank you in advance.

    opened by asprasan 1
  • xrange() was removed in Python 3

    Flake8 testing of https://github.com/kentsyx/Neural-IMage-Assessment on Python 3.6.4

    $ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

    ./model.py:40:14: F821 undefined name 'xrange'
        for i in xrange(1, length + 1):
                 ^
    ./model.py:57:14: F821 undefined name 'xrange'
        for i in xrange(mini_batch_size):
                 ^
    2     F821 undefined name 'xrange'
    2
    
    opened by cclauss 1
  • It requires me to install so much packages

    Hi, thanks for your code, I really appreciate it. When I run the code from the Mac terminal, it keeps asking for packages that are not installed. I keep installing them but am still unable to run the training, and now I am stuck on a missing tensorboardX module.

    Are there basic steps I should follow to get all the required packages installed so that I can run the code?

    Sincerely

    Abdullah

    opened by AbdullahJirjees 0
  • About the test.py

    Thank you for your excellent work. When I run test.py with the CSV file, the following line fails: "gt = test_df[test_df[0] == img].to_numpy()[:, 1:].reshape(10, 1)". It raises ValueError: cannot reshape array of size 0 into shape (10,1). Sorry to bother you with this issue. I look forward to your help.

    Maybe the format of my CSV differs from the one used in the paper. How can I generate the 11-column CSV file specified in the paper?

    opened by hello-trouble 0
  • The link to the pre-trained model is not working

    I would like to use your pretrained model, but the Google Drive link is not working. Could you please provide the pretrained model again? Thank you very much.

    opened by Volodymyr233 4
Owner
yunxiaos