Overview

Improving Generation and Evaluation of Visual Stories via Semantic Consistency

PyTorch code for the NAACL 2021 paper "Improving Generation and Evaluation of Visual Stories via Semantic Consistency". Link to arXiv paper: https://arxiv.org/abs/2105.10026

Requirements:

This code has been tested with torch==1.7.1 and torchvision==0.8.2.
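
For reference, these pinned versions can be installed with pip (assuming a Python 3 environment with a CUDA build compatible with your GPU):
pip install torch==1.7.1 torchvision==0.8.2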

Prepare Repository:

Download the PororoSV dataset and associated files from here and save them to ./data. Download the GloVe embeddings (glove.840B.300d) from here. The default location of the embeddings is ./data/ (see ./dcsgan/miscc/config.py).
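
For example, the embeddings can be fetched from the standard Stanford NLP mirror (URL assumed here; adjust the target directory if you changed the default location):
wget http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip glove.840B.300d.zip -d ./data/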

Training DuCo-StoryGAN:

To train DuCo-StoryGAN, first train the video captioning (MART) model on the PororoSV dataset:
python train_mart.py --data_dir
The model used in our paper was trained with the default parameters.
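
As an illustration only, assuming the dataset was saved to ./data/ as described above (the actual path is whatever you chose), a typical invocation might look like:
python train_mart.py --data_dir ./data/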

Next, train the generative model:
python train_gan.py --cfg ./cfg/pororo_s1_duco.yml --data_dir
If training DuCo-StoryGAN on a new dataset, make sure to train the video captioning model (see below) before training the GAN: the vocabulary file prepared for the video captioning model is reused to generate common input_ids for both models. Update the location of the video captioning checkpoint in the config file.
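
For illustration, a hypothetical full invocation with a placeholder data path might be:
python train_gan.py --cfg ./cfg/pororo_s1_duco.yml --data_dir ./data/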

Unless otherwise specified, the output root directory for all model checkpoints defaults to ./out/.

Training Evaluation Models:

  • Video Captioning Model
    The video captioning model trained for DuCo-StoryGAN (see above) is used for evaluation.
    python train_mart.py --data_dir

  • Hierarchical Deep Attentional Multimodal Similarity Model (H-DAMSM)
    python train_damsm.py --cfg ./cfg/pororo_damsm.yml --data_dir

  • Character Classifier
    python train_classifier.py --data_dir --model_name inception --save_path ./models/inception --batch_size 8 --learning_rate 1e-05

Inference from DuCo-StoryGAN:

Use the following command to run inference with the trained DuCo-StoryGAN weights:
python train_gan.py --cfg ./cfg/pororo_s1_duco_eval.yml --data_dir --checkpoint --infer_dir
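
For illustration, a hypothetical invocation with placeholder paths (substitute your own data directory, checkpoint file, and output directory) might be:
python train_gan.py --cfg ./cfg/pororo_s1_duco_eval.yml --data_dir ./data/ --checkpoint ./out/duco/model.pth --infer_dir ./out/duco/inference/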

Download our pretrained checkpoint from here.

Evaluation:

Download the pretrained models for evaluation:
Character Classifier, Video Captioning

Use the following command to evaluate classification accuracy of generated images:
python eval_scripts/eval_classifier.py --image_path --data_dir --model_path --model_name inception --mode
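
For example, with placeholder paths and a hypothetical --mode value (check eval_scripts/eval_classifier.py for the accepted options):
python eval_scripts/eval_classifier.py --image_path ./out/duco/inference/ --data_dir ./data/ --model_path ./models/inception --model_name inception --mode eval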

Use the following command to evaluate the BLEU score of generated images:
python eval_scripts/translate.py --batch_size 50 --pred_dir --data_dir --checkpoint_file --eval_mode
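
Similarly, with placeholder paths and a hypothetical --eval_mode value (check eval_scripts/translate.py for the accepted options):
python eval_scripts/translate.py --batch_size 50 --pred_dir ./out/duco/inference/ --data_dir ./data/ --checkpoint_file ./models/mart/best.ckpt --eval_mode val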

Acknowledgements

The code in this repository has been adapted from the MART, StoryGAN and MirrorGAN codebases.

Comments
  • Permission for using contents from your paper

    Hello,

    I am trying to publish a survey paper, and I would like to use a figure from your paper. If you allow it, I will use it with proper citation to your work.

    opened by uu95 0
  • Evaluation metrics

    Hello, I hope your research goes well. 😀

    I am trying to evaluate my model with the metrics that you proposed.

    I have read your paper, but I am asking you to double-check (my results seem a bit odd and off the scale, that's why 😢).

    1. I presume that the "character F1" score is the "micro avg" F1 score reported by your eval_classifier.py code? Am I correct?
    2. Also, does "frame accuracy" correspond to the "eval Image Exact Match Acc" output of eval_classifier.py?
    3. Are the BLEU-2 and BLEU-3 scores scaled by 100? I tested your translate.py code on my generated images and got scores of about 0.04; are the BLEU scores you reported multiplied by 100?
    4. Lastly, the R-precision evaluation method is unclear to me. Do I need to train H-DAMSM myself? If so, when is the right time to stop training and benchmark my model?
    5. For a fair comparison, would it be possible to provide your pretrained H-DAMSM weights?

    I am currently stuck on the R-precision evaluation using H-DAMSM, so I was thinking of using the recent CLIP R-precision instead, but I am leaving this issue open because of the fair-comparison concern.

    opened by KyonP 0
  • About R-precision evaluation

    Hi!

    I've been trying to reproduce results from your code, especially H-DAMSM. I've trained DAMSM using the code on GitHub and run the evaluation, but I only get 2.38–2.4. Would it be possible for you to upload the pretrained weights for eval_damsm? Thanks.

    opened by carpedkm 1
  • Questions about your paper implementation!

    Hi, thank you for your nice work! I am currently reproducing your paper.

    I think you're using labels (crnn_code) and GloVe embeddings (zmc_code, m_image) to create the images in your implementation, but I can't find an explanation of this part in your paper.

    Could you give me more explanation of this part?

    opened by statjuns 1
  • best performing pretrained weight

    To the author,

    First of all, on behalf of our lab member who published the original Pororo dataset (K.M. Kim), we very much appreciate that this dataset is still alive and being researched.

    We are trying to reproduce your results, but we are struggling to replicate your best-performing setting.

    Is there any way to receive your best-performing checkpoint file or detailed settings?

    Of course, only if you don't mind. I hope it is possible.

    Your work inspired us to build a generative model for story visualization on a realistic drama dataset.

    Best wishes,

    opened by KyonP 3
Owner: Adyasha Maharana