# SeMantic and linguistic UndeRstanding Fusion (SMURF)
Automatic caption evaluation metric described in the paper "SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis" (ACL 2021).
- arXiv: https://arxiv.org/abs/2106.01444
- ACL Anthology: https://aclanthology.org/2021.acl-long.175/
## Overview
SMURF is an automatic caption evaluation metric that combines a novel semantic evaluation algorithm (SPARCS) with novel fluency evaluation algorithms (SPURTS and MIMA) for both caption-level and system-level analysis. These evaluations were developed to be generalizable and, as a result, demonstrate a high correlation with human judgment across many relevant datasets. See the paper for more details.
## Requirements
You can run `requirements/install.sh` to quickly install all of the requirements in an Anaconda environment (example commands are shown after this list). The requirements are:
- python 3
- torch>=1.0.0
- numpy
- nltk>=3.5.0
- pandas>=1.0.1
- matplotlib
- transformers>=3.0.0
- shapely
- scikit-learn
- sentencepiece
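As referenced above, the quickest setup is the provided script. Alternatively, the same dependencies can be installed manually with pip (note that `sklearn` is distributed on PyPI as `scikit-learn`):

```bash
# One-step setup in an Anaconda environment (script included in this repo)
bash requirements/install.sh

# Or install the listed dependencies manually with pip
pip install "torch>=1.0.0" numpy "nltk>=3.5.0" "pandas>=1.0.1" matplotlib \
    "transformers>=3.0.0" shapely scikit-learn sentencepiece
```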
## Usage
`./smurf_example.py` provides working examples of the following functions:
### Caption-Level Scoring
Returns a dictionary with scores for the semantic similarity between reference captions and a candidate caption (SPARCS), the style/diction quality of the candidate text (SPURTS), the grammar outlier penalty of the candidate text (MIMA), and the fusion of these scores (SMURF). Input sentences should be preprocessed before being fed into the `smurf_eval_captions` object, as shown in the example. Evaluations with SPARCS require a list of reference sentences, while evaluations with SPURTS and MIMA do not use reference sentences.
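A minimal sketch of caption-level scoring is shown below. The `smurf_eval_captions` name comes from this README, but the import path, constructor signature, and method names are assumptions; see `smurf_example.py` for the authoritative usage, including the required preprocessing step.

```python
# Sketch only: the import path, constructor signature, and method names are
# assumptions based on this README; consult smurf_example.py for actual usage.
from smurf.eval import smurf_eval_captions  # assumed module path

# Sentences should already be preprocessed (see smurf_example.py).
references = ["a man is riding a bicycle down the street"]
candidate = ["a person rides a bike on a road"]

# SPARCS compares the candidate against the references; SPURTS and MIMA
# score the candidate text alone and ignore the references.
scorer = smurf_eval_captions(references, candidate, fuse=True)  # assumed signature
scores = scorer.evaluate()  # assumed to return a dict of caption-level scores
print(scores)  # e.g. {'SPARCS': ..., 'SPURTS': ..., 'MIMA': ..., 'SMURF': ...}
```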
### System-Level Analysis
After reading in and standardizing caption-level scores, this function generates a plot that gives an overall evaluation of captioner performance, along with relevant system-level scores (intersection with the reference captioner and total grammar outlier penalties) for each captioner. An example of such a plot is shown below.
The number of captioners you are comparing should be specified when instantiating a `smurf_system_analysis` object. In order for the plot to be generated correctly, the captions fed into caption-level scoring for each candidate captioner (C1, C2, ...) should be organized in the following format, with the C1 captioner as the ground truth (see the sketch after the format spec):
[C1 image 1 output, C2 image 1 output,..., C1 image 2 output, C2 image 2 output,...].
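Below is a hedged sketch of driving the system-level analysis for three captioners (C1 as ground truth). The `smurf_system_analysis` name comes from this README; the constructor argument, the loader helper, and the method name are hypothetical, so refer to `smurf_example.py` for the real interface.

```python
# Sketch only: the constructor argument, loader, and method names below are
# assumptions; smurf_system_analysis itself is named in this README.
from smurf.eval import smurf_system_analysis  # assumed module path

def read_standardized_scores(path):
    # Hypothetical loader: returns standardized caption-level scores
    # interleaved per image, as described above:
    # [C1 img1, C2 img1, C3 img1, C1 img2, C2 img2, C3 img2, ...]
    ...

caption_scores = read_standardized_scores("caption_scores.csv")

analyzer = smurf_system_analysis(num_captioners=3)  # assumed keyword argument
analyzer.evaluate(caption_scores)  # assumed entry point; produces the plot
```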
## Author/Maintainer
Joshua Feinglass (https://scholar.google.com/citations?user=V2h3z7oAAAAJ&hl=en)
If you find this repo useful, please cite:
@inproceedings{feinglass2021smurf,
title={SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis},
author={Joshua Feinglass and Yezhou Yang},
booktitle={Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)},
year={2021},
url={https://aclanthology.org/2021.acl-long.175/}
}