UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

Overview

UNION

Automatic Evaluation Metric described in the paper UNION: An UNreferenced MetrIc for Evaluating Open-eNded Story Generation (EMNLP 2020). Please refer to the Paper List for more information about Open-eNded Language Generation (ONLG) tasks. Hopefully the paper list will help you learn more about this field.

Contents

Prerequisites

The code is written with the TensorFlow library. To use the program, the following prerequisites need to be installed.

  • Python 3.7.0
  • tensorflow-gpu 1.14.0
  • numpy 1.18.1
  • regex 2020.2.20
  • nltk 3.4.5
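
If you prefer installing these packages directly (the pinned regex version in the bundled requirements.txt no longer resolves, as reported in the comments below), the following command mirrors the versions listed above; the CUDA/cuDNN setup required by tensorflow-gpu 1.14.0 is left to your environment:

pip install tensorflow-gpu==1.14.0 numpy==1.18.1 regex==2020.2.20 nltk==3.4.5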

Computing Infrastructure

We train UNION on the following platform:

  • OS: Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-98-generic x86_64)
  • GPU: NVIDIA TITAN Xp

Quick Start

1. Constructing Negative Samples

Execute the following command:

cd ./Data
python3 ./get_vocab.py your_mode
python3 ./gen_train_data.py your_mode
  • your_mode is roc for the ROCStories corpus or wp for the WritingPrompts dataset. The vocabulary summary, together with the corresponding frequencies and POS tags, will then be found under ROCStories/ini_data/entity_vocab.txt or WritingPrompts/ini_data/entity_vocab.txt (a quick way to inspect this file is sketched after this list).
  • Negative samples and human-written stories will be constructed based on the original training set. The constructed training data can be found under ROCStories/train_data or WritingPrompts/train_data.
  • Note: currently only 10 samples of the full original data and training data are provided. The full data can be downloaded from THUcloud or GoogleDrive.
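
As a quick sanity check, the generated vocabulary file can be inspected with a few lines of Python. This is a minimal sketch that simply prints the first entries; each line is expected to hold an entity, its tagged POS, and its mention frequency (see the data instruction section below):

# Minimal sketch: print the first entries of the generated entity vocabulary.
# Each line should hold an entity, its POS tag, and its mention frequency.
with open("./ROCStories/ini_data/entity_vocab.txt", encoding="utf-8") as f:
    for i, line in enumerate(f):
        print(line.rstrip("\n"))
        if i == 9:  # show only the first 10 entries
            break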

2. Training of UNION

Execute the following command:

python3 ./run_union.py --data_dir your_data_dir \
    --output_dir ./model/union \
    --task_name train \
    --init_checkpoint ./model/uncased_L-12_H-768_A-12/bert_model.ckpt
  • your_data_dir is ./Data/ROCStories or ./Data/WritingPrompts.
  • The initial checkpoint of BERT can be downloaded from bert. We use the uncased base version of BERT (about 110M parameters). We train the model for at most 40000 steps; the training process takes about 1~2 days.
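
Before launching training, it can help to confirm that the checkpoint directory is laid out as expected. The sketch below assumes the standard Google BERT release layout for uncased_L-12_H-768_A-12 (config, WordPiece vocabulary, and checkpoint shards); verify the file names against your actual download:

import os

# Hypothetical local path matching the --init_checkpoint argument above.
ckpt_dir = "./model/uncased_L-12_H-768_A-12"
expected = [
    "bert_config.json",                     # model configuration
    "vocab.txt",                            # WordPiece vocabulary
    "bert_model.ckpt.index",                # checkpoint index
    "bert_model.ckpt.meta",                 # checkpoint graph metadata
    "bert_model.ckpt.data-00000-of-00001",  # checkpoint weights
]
for name in expected:
    path = os.path.join(ckpt_dir, name)
    print(("found   " if os.path.exists(path) else "MISSING ") + path)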

3. Prediction with UNION

Execute the following command:

python3 ./run_union.py --data_dir your_data_dir \
    --output_dir ./model/output \
    --task_name pred \
    --init_checkpoint your_model_name
  • your_data_dir is ./Data/ROCStories or ./Data/WritingPrompts. If you want to evaluate your custom texts, you only need to change your file format into ours (a conversion sketch follows this list).

  • your_model_name is ./model/union_roc/union_roc or ./model/union_wp/union_wp. The fine-tuned checkpoints can be downloaded from the following links:

Dataset          Fine-tuned Model
ROCStories       THUcloud; GoogleDrive
WritingPrompts   THUcloud; GoogleDrive
  • The UNION scores of the stories under your_data_dir/ant_data can be found under the output_dir ./model/output.
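
For custom texts, the sketch below illustrates one way to produce an ant_data.txt-style file. It assumes the "Story ID ||| Story ||| Seven Annotated Scores" layout described in the data instruction section, and writes placeholder zeros for the scores, since human annotations are only needed for the correlation step; the file name and stories here are illustrative only, so check the layout against the provided ant_data files before running prediction:

# Minimal sketch: write custom stories in the ant_data.txt-style layout.
# The "|||"-separated format follows the data instruction below; the zero
# scores are placeholders, since prediction itself does not use annotations.
stories = [
    "my first custom story ...",
    "my second custom story ...",
]
with open("custom_ant_data.txt", "w", encoding="utf-8") as f:
    for idx, story in enumerate(stories):
        placeholder_scores = " ".join(["0"] * 7)
        f.write(f"{idx} ||| {story} ||| {placeholder_scores}\n")
# Place the resulting file as ant_data.txt under your_data_dir/ant_data.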

4. Correlation Calculation

Execute the following command:

python3 ./correlation.py your_mode

Then the correlation between the human judgments under your_data_dir/ant_data and the metric scores under your_data_dir/metric_output will be output. The figures under ./figure plot the metric scores against the human judgments for the ROCStories corpus.
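
Conceptually, the correlation can also be reproduced with scipy. The sketch below is not the repository's correlation.py; it assumes one metric score per line in the metric_output files and takes the mean of the seven annotated scores as the human judgment for each story, and the values in ppl.txt must be negated first to obtain minus Perplexity (see the data instruction section):

import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr

# Assumed paths; adjust to your setup.
with open("./Data/ROCStories/ant_data/ant_data.txt", encoding="utf-8") as f:
    human = [np.mean([float(s) for s in line.split("|||")[2].split()])
             for line in f if line.strip()]
with open("./Data/ROCStories/metric_output/union.txt", encoding="utf-8") as f:
    metric = [float(line) for line in f if line.strip()]
# For ppl.txt, negate the scores first: metric = [-m for m in metric]

for name, corr in [("Pearson", pearsonr), ("Spearman", spearmanr), ("Kendall", kendalltau)]:
    stat, p = corr(human, metric)
    print(f"{name}: {stat:.4f} (p={p:.3g})")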

Data Instruction for files under ./Data

├── Data
   └── `negation.txt`             # manually constructed negation word vocabulary.
   └── `conceptnet_antonym.txt`   # triples with antonym relations extracted from ConceptNet.
   └── `conceptnet_entity.csv`    # entities acquired from ConceptNet.
   └── `ROCStories`
       ├── `ant_data`        # sampled stories and corresponding human annotation.
               └── `ant_data.txt`        # includes only the binary annotation: reasonable (1) or unreasonable (0)
               └── `ant_data_all.txt`    # includes the annotation for specific error types: reasonable (0), repeated plots (1), bad coherence (2), conflicting logic (3), chaotic scenes (4), and others (5).
               └── `reference.txt`       # human-written stories with the same leading context as the annotated stories.
              └── `reference_ipt.txt`
              └── `reference_opt.txt`
       ├── `ini_data`        # original dataset for training/validation/testing.
              └── `train.txt`
              └── `dev.txt`
              └── `test.txt`
              └── `entity_vocab.txt`    # generated by `get_vocab.py`, consisting of all the entities and the corresponding tagged POS followed by the mention frequency in the dataset.
       ├── `train_data`      # negative samples and corresponding human-written stories for training, which are constructed by `gen_train_data.py`.
              └── `train_human.txt`
              └── `train_negative.txt`
              └── `dev_human.txt`
              └── `dev_negative.txt`
              └── `test_human.txt`
              └── `test_negative.txt`
       ├── `metric_output`   # the scores of different metrics, which can be used to replicate the correlation in Table 5 of the paper. 
              └── `bleu.txt`
              └── `bleurt.txt`
              └── `ppl.txt`             # the sign of the result of Perplexity needs to be changed to get the result for *minus* Perplexity.
              └── `union.txt`
              └── `union_recon.txt`     # the ablated model without the reconstruction task
              └── ...
   └── `WritingPrompts`
       ├── ...
 
  • The annotated data files ant_data.txt and ant_data_all.txt are formatted as Story ID ||| Story ||| Seven Annotated Scores.
  • ant_data_all.txt is only available for the ROCStories corpus; for the WritingPrompts dataset, ant_data_all.txt is the same as ant_data.txt.
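
If you want to group the annotations by error type, a minimal parsing sketch is below. It assumes the seven scores are whitespace-separated and, for ant_data_all.txt on ROCStories, take the values 0-5 listed above; check a few lines of the file to confirm before relying on it:

from collections import Counter

LABELS = {0: "reasonable", 1: "repeated plots", 2: "bad coherence",
          3: "conflicting logic", 4: "chaotic scenes", 5: "others"}

counts = Counter()
with open("./Data/ROCStories/ant_data/ant_data_all.txt", encoding="utf-8") as f:
    for line in f:
        if not line.strip():
            continue
        story_id, story, scores = [part.strip() for part in line.split("|||")]
        counts.update(int(s) for s in scores.split())

for label in sorted(counts):
    print(f"{LABELS.get(label, label)}: {counts[label]}")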

Citation

Please kindly cite our paper if the paper and the code are helpful.

@misc{guan2020union,
    title={UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation},
    author={Jian Guan and Minlie Huang},
    year={2020},
    eprint={2009.07602},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Comments
  • `pip install -r requirements.txt` causes an error

    Thank you for sharing your great work on GitHub! I'm working on NLP, especially on story analysis and generation, so I'm very interested in your proposed metric "UNION".

    I'd like to use the metric, but have trouble installing prerequisites. If you know how to avoid the error, could you please tell me how to solve it?

    First, I tried pip install -r requirements.txt in a Python 3.8 environment, but it failed to install TensorFlow v1. Then I downgraded Python to 3.7 (as you indicated in "Prerequisites" in "README.md"), and TensorFlow v1 could be installed. However, I now come across another error saying that the required version of regex could not be found.

    (If you allow me to add information, "tensorflow-gpu==1.14.0" seems duplicated in requirements.txt. Is it something you intended?)

    <username>@<servername> $ conda activate py37_union
    (py37_union) <username>@<servername> $ pwd
    /********/workspace/Clone/UNION
    (py37_union) <username>@<servername> $ pip install -r requirements.txt
    Collecting tensorflow-gpu==1.14.0
      Downloading tensorflow_gpu-1.14.0-cp37-cp37m-manylinux1_x86_64.whl (377.1 MB)
         |████████████████████████████████| 377.1 MB 5.2 kB/s
    Collecting numpy==1.18.1
      Downloading numpy-1.18.1-cp37-cp37m-manylinux1_x86_64.whl (20.1 MB)
         |████████████████████████████████| 20.1 MB 38.1 MB/s
    ERROR: Could not find a version that satisfies the requirement regex==2.5.76 (from -r requirements.txt (line 4)) (from versions: 2013-02-16, 2013-02-23, 2013-03-11, 2013-05-21, 2013-06-05, 2013-06-26, 2013-08-04, 2013-10-04, 2013-10-12, 2013-10-21, 2013-10-22, 2013-10-23, 2013-10-24, 2013-10-25, 2013-10-26, 2013-11-29, 2013-12-31, 0.1.20100217, 0.1.20100226, 0.1.20100305, 0.1.20100323, 0.1.20100331, 0.1.20100706, 0.1.20100706.1, 0.1.20100709, 0.1.20100709.1, 0.1.20100719, 0.1.20100725, 0.1.20100814, 0.1.20100816, 0.1.20100824, 0.1.20100912, 0.1.20100913, 0.1.20100918, 0.1.20101009, 0.1.20101029, 0.1.20101030b0, 0.1.20101030, 0.1.20101101, 0.1.20101102a0, 0.1.20101102, 0.1.20101106, 0.1.20101113, 0.1.20101120, 0.1.20101121, 0.1.20101123, 0.1.20101130, 0.1.20101207, 0.1.20101210, 0.1.20101224, 0.1.20101228a0, 0.1.20101228, 0.1.20101229, 0.1.20101230, 0.1.20101231, 0.1.20110104, 0.1.20110106, 0.1.20110124, 0.1.20110313, 0.1.20110314, 0.1.20110315, 0.1.20110429, 0.1.20110502, 0.1.20110504, 0.1.20110510, 0.1.20110514, 0.1.20110524, 0.1.20110608a0, 0.1.20110608, 0.1.20110609, 0.1.20110610, 0.1.20110616, 0.1.20110623a0, 0.1.20110623, 0.1.20110627, 0.1.20110702, 0.1.20110717, 0.1.20110917a0, 0.1.20110917, 0.1.20110922a0, 0.1.20110922, 0.1.20110927, 0.1.20110929, 0.1.20111004, 0.1.20111005, 0.1.20111006, 0.1.20111014, 0.1.20111103, 0.1.20111223, 0.1.20120103, 0.1.20120105, 0.1.20120112, 0.1.20120114, 0.1.20120115, 0.1.20120119, 0.1.20120122, 0.1.20120123, 0.1.20120126, 0.1.20120128, 0.1.20120129, 0.1.20120208, 0.1.20120209, 0.1.20120301, 0.1.20120303, 0.1.20120316, 0.1.20120317, 0.1.20120323, 0.1.20120416, 0.1.20120502, 0.1.20120503, 0.1.20120504, 0.1.20120506, 0.1.20120611, 0.1.20120613, 0.1.20120705, 0.1.20120708, 0.1.20120709, 0.1.20120710, 0.1.20120803, 0.1.20120825, 0.1.20120904, 0.1.20121008, 0.1.20121017, 0.1.20121031, 0.1.20121105, 0.1.20121113, 0.1.20121120, 0.1.20121216, 0.1.20130120, 0.1.20130124, 0.1.20130125, 2014.1.10, 2014.1.20, 2014.1.30, 2014.2.16, 2014.2.19, 2014.4.10, 2014.5.17, 2014.5.23, 2014.6.28, 2014.8.15, 2014.8.28, 2014.9.18, 2014.9.22, 2014.10.1, 2014.10.2, 2014.10.7, 2014.10.9, 2014.10.23, 2014.10.24, 2014.11.3, 2014.11.13, 2014.11.14, 2014.12.15, 2014.12.24, 2015.3.18, 2015.5.7, 2015.5.10, 2015.5.28, 2015.6.2, 2015.6.4, 2015.6.9, 2015.6.10, 2015.6.14, 2015.6.15, 2015.6.19, 2015.6.21, 2015.6.24, 2015.7.12, 2015.7.19, 2015.9.14, 2015.9.15, 2015.9.23, 2015.9.28, 2015.10.1, 2015.10.5, 2015.10.22, 2015.10.29, 2015.11.5b0, 2015.11.7, 2015.11.8, 2015.11.9, 2015.11.12, 2015.11.14, 2015.11.22, 2016.1.10, 2016.2.23, 2016.2.24, 2016.2.25, 2016.3.2, 2016.3.24, 2016.3.26, 2016.3.31, 2016.4.1, 2016.4.2, 2016.4.3, 2016.4.8, 2016.4.15, 2016.4.25, 2016.5.13, 2016.5.14, 2016.5.15, 2016.5.23, 2016.6.2, 2016.6.5, 2016.6.14, 2016.6.19, 2016.6.24, 2016.7.14, 2016.7.21, 2016.8.27, 2016.9.22, 2016.10.22, 2016.11.18, 2016.11.21, 2016.12.27, 2017.1.12, 2017.1.14, 2017.1.17, 2017.2.8, 2017.4.5, 2017.4.23, 2017.4.29, 2017.5.26, 2017.6.7, 2017.6.20, 2017.6.23, 2017.7.11, 2017.7.26, 2017.7.28, 2017.9.23, 2017.11.8, 2017.11.9, 2017.12.5, 2017.12.9, 2017.12.12, 2018.1.10, 2018.2.3, 2018.2.8, 2018.2.21, 2018.6.6, 2018.6.9, 2018.6.20, 2018.6.21, 2018.7.11, 2018.8.17, 2018.8.29, 2018.11.2, 2018.11.3, 2018.11.6, 2018.11.7, 2018.11.22, 2019.1.23, 2019.1.24, 2019.2.3, 2019.2.5, 2019.2.6, 2019.2.7, 2019.2.18, 2019.2.19, 2019.2.20, 2019.2.21, 2019.3.8, 2019.3.9, 2019.3.12, 2019.4.9, 2019.4.10, 2019.4.12, 2019.4.14, 2019.5.25, 2019.6.2, 2019.6.5, 2019.6.8, 2019.8.19, 2019.11.1, 2019.12.9, 2019.12.17, 
2019.12.18, 2019.12.19, 2019.12.20, 2020.1.7, 2020.1.8, 2020.2.18, 2020.2.20, 2020.4.4, 2020.5.7, 2020.5.13, 2020.5.14, 2020.6.7, 2020.6.8, 2020.7.14, 2020.9.27, 2020.10.11)
    ERROR: No matching distribution found for regex==2.5.76 (from -r requirements.txt (line 4))
    

    Thank you in advance!

    opened by forest1988 4
  • Duplicate data for WritingPrompts

    Hi @JianGuanTHU Thanks again for sharing the work.

    The human-annotated data (ant_data_all.txt) for the WritingPrompts domain does not include the six label categories, but contains exactly the same labels as ant_data.txt.

    Would the actual data (ant_data_all.txt) be re-uploaded in the future?

    opened by inimah 0
  • Computing Perplexity

    Hi, @JianGuanTHU Thanks for making the data publicly available.

    Could you please elaborate on how the current work computes the "Perplexity" metric? Is it sentence-level perplexity or the perplexity of predicting a token?

    The paper mentions in a footnote: "We take the minus of perplexity for all the following..."

    But I do not think the metric outputs in ~/Data/../metric_output/ppl.txt reasonably fit the text inputs. What is "minus of perplexity" in this context?

    For example, the score on sample ID 151 from ant_data_all (I am using the Hugging Face evaluate perplexity metric):

    Prediction text: ["we were looking for something fun to do on a female night . Female wife and i were so excited . we went to the mall . we had a great time . we had a great time ."]

    results: {'perplexities': [55.47270202636719], 'mean_perplexity': 55.47270202636719}

    Meanwhile, in ppl.txt the score is 2.5693.

    opened by inimah 0
  • Some question occurred during training

    I tried to train the UNION model on another dataset, but ran into a TensorFlow issue: "Cannot serialize protocol buffer of type tensorflow.GraphDef as the serialized size (3029951657 bytes) would be larger than the limit (2147483647 bytes)".

    opened by a835194891 0
  • Not an unreferenced metric (as claimed in the paper).

    Hi, a short question about running the repo with my own texts: do the texts that I want to evaluate have to be in the "ant_data" format?

    That "ant_data" comes with annotations, so if we have to annotate our texts the metric it's not anymore an Unreferenced metric as claimed in the paper. I just ask this because when running the repo the number of outputs is 400. The same that the number of texts in the ant_data.txt.

    Just from looking at the repo, I thought that the texts we want to analyze should go in the ini_data folder, split into train, dev, and test. But apparently that is not the case.

    Could you then please confirm the correct way to run the metric on our own texts? :) Thanks so much for your help. I really appreciate and love your work. Have a nice day,

    Victor.

    opened by VictorOtin 0
  • Args for reproducing the fine-tuned model

    Hello,

    In README.md, you wrote that training runs for at most 40000 steps. However, when I tried to reproduce the fine-tuned model you kindly provide, the training seems to run for far more steps at the default setting (100.0 epochs is the default for run_union.py).

    The initial checkpoint of BERT can be downloaded from bert. We use the uncased base version of BERT (about 110M parameters). We train the model for at most 40000 steps; the training process takes about 1~2 days.

    After the 100-epoch training, I got model.ckpt-1414000.

    If you don't mind, could you please tell me the appropriate args to be used for reproducing your fine-tuned UNION models used in the paper? Is it enough to change the training epochs?

    I'm sorry if there are already details somewhere.

    Thank you in advance.

    opened by forest1988 0
  • Can't reconstruct text

    When I use run_union.py to run the predict-tokens task:

    Input texts are: 1 An alien race encounters the most terrifying predator imaginable . A lone , unarmed human . The last time they 'd come ... they 'd arrived on the planet , expecting the worst . But when they arrived , they 'd been silent for a few days , and just as quickly they were gone . I mean , no one knows where they came from , or who they came from . However , in the end , they 'd never come . They 'd been at the top of the world , watching us from miles away . They were all on the planet , but we could see them , and we could see them , that 's what we wanted . We went over to the surface to find the planet we could live in , and they appeared , in the center of the planet . They were humanoid , with a large mass of black , but they looked like us . They were humanoid and seemed to be humanoid , except for at their appearance . They wore strange masks and wore strange helmets . They were humanoid , with strange attire , and wore strange garments .

    2 It 's surprising that the most important person in the world has so little security . '' Said the assassin to his target . I am here , there is no better security . '' Was the casual reply . `` I 'm not a security threat , I 'm a tool , I am a tool . '' I had always wanted to be a tool , and I had always wanted to be someone else . I was the only one who was n't a tool . I was the only one who could tell me what was going on . I was the only one who could prevent the death of those who I knew would kill me . I was the only one who could stop the murder of the most important people in the world to avoid a nuclear war . I was the only one who could stop the deaths of the world 's most important people .

    3 Write your heart onto your sleeve , Reddit . You 've got ta understand , I 'm a bit surprised . The stress is off , and I 'm a little worried about how much I 'm getting . I 'm a bit nervous , and I 'm not like you , I just want to tell you how much I love you . I 'm just so upset , I 'm really , really good at it , but here 's my feelings , and I love you so much . I 've been to many different schools , to be specific . I know you 're not real , and I know you 're not real . I know you 're not real , but you do n't know me . So I 'm going to let you know . I know that feeling , that ... that I 'm feeling right now . It 's kind of like that , you know ? It 's like that 's the first thing I 've ever felt .

    but the predict token are: 1 . [SEP] . . . . . . . . [SEP] [SEP] . [SEP] . [SEP] [SEP] [SEP] . . . [SEP] . . [SEP] . . . . . . [SEP] . . . . [SEP] . [SEP] . . . [SEP] . . [SEP] . . [SEP] [SEP] . . [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] . [SEP] . . . [SEP] [SEP] . [SEP] . [SEP] . [SEP] . . [SEP] [SEP] . [SEP] . . . [SEP] [SEP] [SEP] . [SEP] . [SEP] [SEP] . . . . . [SEP] [SEP] . [SEP] . . . . [SEP] . . . [SEP] . . . [SEP] . . . . . . . . [SEP] . . [SEP] . . [SEP] . [SEP] . [SEP] [SEP] . . [SEP] . . . [SEP] [SEP] [SEP] . . . [SEP] . . [SEP] . . [SEP] [SEP] . . [SEP] . [SEP] . . [SEP] . . . [SEP] . [SEP] . [SEP] [SEP] [SEP] . . [SEP] . . . . . . . . . [SEP] [SEP] . [SEP] . [SEP] . [SEP] . [SEP] . . [SEP] . [SEP] [SEP] . .

    2 . [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] . [SEP] . [SEP] . . [SEP] . . [SEP] . . [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] . . . . [SEP] . . . . . . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] . [SEP] . . [SEP] . [SEP] . . . . [SEP] [SEP] [SEP] . . . [SEP] . . [SEP] [SEP] . [SEP] . [SEP] [SEP] . . [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . . . [SEP] [SEP] [SEP] [SEP] [SEP] . . . . . . . [SEP] . [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] . . [SEP] . . . . . . [SEP] . . [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] . . [SEP] [SEP] [SEP] [SEP] . [SEP] . . . [SEP] . [SEP] [SEP] . [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] . . [SEP] . [SEP] [SEP] [SEP] [SEP] [SEP] . . . . [SEP] . [SEP] . [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP]

    3 [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] [SEP] . . [SEP] [SEP] . [SEP] . [SEP] [SEP] [SEP] [SEP] . [SEP] . . [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] . . . [SEP] [SEP] . [SEP] . . . [SEP] [SEP] [SEP] [SEP] [SEP] . . [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] . [SEP] . . . . . [SEP] [SEP] [SEP] . [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] [SEP] . [SEP] . [SEP] [SEP] . [SEP] . . . [SEP] . [SEP] [SEP] [SEP] . [SEP] [SEP] . [SEP] . . [SEP] . . [SEP] . [SEP] . . [SEP] [SEP] . [SEP] . . . [SEP] . . [SEP] . . . . [SEP] [SEP] . . [SEP] . [SEP] . . [SEP] . [SEP] [SEP] . . . . . . [SEP] [SEP] . . . . . [SEP] [SEP] [SEP] . [SEP] . [SEP] . [SEP] . . . [SEP] . . [SEP] [SEP] [SEP] [SEP] . . [SEP]

    and these predicted tokens are wrong.

    opened by Kalafinaian 1
Owner
Conversational AI groups from Tsinghua University