Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020

Overview

Paper: Show, Edit and Tell: A Framework for Editing Image Captions (arXiv)

This repository contains the source code for Show, Edit and Tell: A Framework for Editing Image Captions, to appear at CVPR 2020.

Requirements

  • Python 3.6 or 3.7
  • PyTorch 1.2

For evaluation, you also need the coco-caption evaluation code (see the eval folder for setup instructions).

Argument parsing is currently not supported; we will add support for it soon.
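
A minimal environment setup consistent with these requirements might look like the following (the extra packages such as h5py and numpy are assumptions based on the preprocessing steps below, not a pinned list shipped with this repository):

conda create -n show-edit-tell python=3.6
conda activate show-edit-tell
pip install torch==1.2.0 h5py numpy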

Pretrained Models

You can download the pretrained models from here. Place them in the eval folder.

Download and Prepare Features

In this work, we use 36 fixed bottom-up features per image. If you wish to use the adaptive features (10 to 100 per image), please refer to the adaptive_features folder in this repository and follow the instructions there.

First, download the fixed features from here and unzip the file. Place the unzipped folder in the bottom-up_features folder.

Next, run:

python bottom-up_features/tsv.py

This command will create the following files:

  • An HDF5 file containing the bottom-up image features for the train and val splits, 36 per image, stored as an (I, 36, 2048) tensor where I is the number of images in the split.
  • PKL files that map training and validation image IDs to their row indices in the HDF5 dataset created above (see the sketch below for how to read these back).
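
As a sanity check, you can read the features back with h5py and pickle. A minimal sketch, assuming the HDF5 file for the training split is named train36.hdf5 and its dataset key is image_features (verify both against what tsv.py actually writes):

import pickle
import h5py

# Assumed output names -- check tsv.py for the exact file names and dataset key.
with open('bottom-up_features/train36_imgid2idx.pkl', 'rb') as f:
    imgid2idx = pickle.load(f)               # {COCO image id: row index in the HDF5 tensor}

with h5py.File('bottom-up_features/train36.hdf5', 'r') as h:
    features = h['image_features']           # shape (I, 36, 2048)
    image_id = next(iter(imgid2idx))
    feats = features[imgid2idx[image_id]]    # (36, 2048) bottom-up features for one image
    print(image_id, feats.shape)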

Download/Prepare Caption Data

You can either download all the related caption data files from here or create them yourself. The folder contains the following files (a short loading sketch follows the list):

  • WORDMAP_coco: maps the words to indices
  • CAPUTIL: stores information about the existing captions in a dictionary organized as follows: {"COCO_image_name": {"caption": "existing caption to be edited", "encoded_previous_caption": an encoded list of the words, "previous_caption_length": a list containing the length of the caption, "image_ids": the COCO image id}}
  • CAPTIONS: the encoded ground-truth captions (a list of number_images x 5 lists; for example, the Karpathy split has 113,287 training images, so the training split has 566,435 lists)
  • CAPLENS: the lengths of the ground-truth captions (a list of number_images x 5 values)
  • NAMES: the COCO image names, in the same order as CAPTIONS
  • GENOME_DETS: the splits and image ids for loading the images in accordance with the features file created above
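
For orientation, here is a rough sketch of how these files fit together: decode one encoded ground-truth caption back into words using the word map. The file names, extensions, and storage format (JSON is assumed here) are hypothetical; check preprocess_caps.py for the exact conventions it uses.

import json

# Hypothetical file names and JSON format -- preprocess_caps.py defines the real conventions.
with open('caption data/WORDMAP_coco.json') as f:
    word_map = json.load(f)                      # {word: index}
rev_word_map = {idx: word for word, idx in word_map.items()}

with open('caption data/CAPTIONS_train.json') as f:
    captions = json.load(f)                      # number_images x 5 encoded captions
with open('caption data/CAPLENS_train.json') as f:
    caplens = json.load(f)

# Decode the first ground-truth caption back into words.
enc, length = captions[0], caplens[0]
print(' '.join(rev_word_map[idx] for idx in enc[:length]))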

If you'd like to create the caption data yourself, download Karpathy's training, validation, and test splits. This zip file also contains the captions. Place the file in the caption data folder. You should also have the PKL files created in the 'Download and Prepare Features' section above: train36_imgid2idx.pkl and val36_imgid2idx.pkl.

Next, run:

python preprocess_caps.py

This will dump all the files to the folder caption data.

Next, download the existing captions to be edited and organize them in a list of dictionaries, each in the following format: {"image_id": COCO_image_id, "caption": "caption to be edited", "file_name": "split\\COCO_image_name"}. For example: {"image_id": 522418, "caption": "a woman cutting a cake with a knife", "file_name": "val2014\\COCO_val2014_000000522418.jpg"}. In our work, we use the captions produced by AoANet (a small example of writing this file is sketched below).
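
A minimal sketch of producing that list from a captioner's output and saving it; the input tuples and the output file name here are illustrative only:

import json

# Illustrative input; in practice these come from the captions generated by AoANet.
generated = [
    (522418, "a woman cutting a cake with a knife", "val2014\\COCO_val2014_000000522418.jpg"),
]

existing_caps = [
    {"image_id": img_id, "caption": cap, "file_name": fname}
    for img_id, cap, fname in generated
]

# Hypothetical output path; point preprocess_existing_caps.py at whatever file it expects.
with open('caption data/aoanet_caps.json', 'w') as f:
    json.dump(existing_caps, f)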

Next, run:

python preprocess_existing_caps.py

This will dump all the existing caption files to the folder caption data.

Prepare/Download Sequence-Level Training Data

Download the RL-data for sequence-level training used for computing metric scores from here.

Alternatively, you may prepare the data yourself:

Run the following command:

python preprocess_rl.py

This will dump two files in the data folder used for computing metric scores.

Training and Validation

XE training stage:

For training DCNet, run:

python dcnet.py

For optimizing DCNet with MSE, run:

python dcnet_with_mse.py

For training editnet, run:

python editnet.py

CIDEr-D Optimization stage:

For training DCNet, run:

python dcnet_rl.py

For training editnet, run:

python editnet_rl.py
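
For context, the CIDEr-D optimization stage follows the standard self-critical sequence training recipe (the References below credit the self-critical codebase): sample a caption, decode a greedy baseline, score both with CIDEr-D, and weight the sampled log-probabilities by the reward difference. A minimal sketch of that loss; the function and tensor names are illustrative, not the repository's actual API:

import torch

def scst_loss(sample_logprobs, sample_reward, greedy_reward, mask):
    """REINFORCE with a greedy-decoding baseline (self-critical training).

    sample_logprobs: (batch, seq_len) log-probabilities of the sampled caption tokens
    sample_reward, greedy_reward: (batch,) CIDEr-D scores of the sampled / greedy captions
    mask: (batch, seq_len) 1 for real tokens, 0 for padding
    """
    advantage = (sample_reward - greedy_reward).unsqueeze(1)   # (batch, 1) baseline-subtracted reward
    loss = -advantage * sample_logprobs * mask                 # push up tokens that beat the greedy baseline
    return loss.sum() / mask.sum()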

Evaluation

Refer to the eval folder for instructions. All the generated captions and scores from our model can be found in the outputs folder.

                      BLEU-1   BLEU-4   CIDEr   SPICE
Cross-Entropy Loss      77.9     38.0   1.200    21.2
CIDEr Optimization      80.6     39.2   1.289    22.6

Citation

@InProceedings{Sammani_2020_CVPR,
author = {Sammani, Fawaz and Melas-Kyriazi, Luke},
title = {Show, Edit and Tell: A Framework for Editing Image Captions},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

References

Our code is mainly based on the self-critical and Show, Attend and Tell implementations. We thank both authors.

Comments
  • Is your coco-caption API guaranteed?

    Hello, thanks for your talented work on Show, Edit and Tell; it really helps me a lot. I'm choosing a COCO caption evaluation tool for my project, and you chose https://github.com/mtanti/coco-caption as your eval tool. But I saw your issue on https://github.com/salaniz/pycocoevalcap. I want to know why you did not choose salaniz/pycocoevalcap; is there anything wrong with it? Thanks a lot.

    opened by Doragd 5
  • About CIDEr Optimization

    Hello! Thanks for your work. Your code is quite clear and easy to understand, so I'm doing some experiments based on it.

    However, I ran into some problems while training with CIDEr optimization. When I use the self-critical strategy to train my pre-trained model, the CIDEr score drops by about 5 points after the first epoch. It then takes quite a few epochs for the model to reach the same score it achieved with the XE loss; only after these epochs does the model begin to outperform the pre-trained one.

    I checked my code and found that I didn't save the optimizer's state dict while training with the XE loss, so when I start training with the self-critical strategy, I just initialize a new optimizer with a learning rate of about 2e-5 or 5e-5. Is this the reason for the problems described above?

    opened by LONGRYUU 4
  • aoa_caps

    Excuse me, where can I download the existing captions to be edited? And how do I organize them in a list of dictionaries? Thank you!

    opened by WUHE-art 3
  • About the pretrained models

    Thank you for the open-source code. Can I use the pretrained models provided at https://drive.google.com/drive/folders/1kPoRVsUuj57Scon-SbUJXNl555ee6sjo to generate captions for new data?

    opened by wanboyang 3
  • A strange error: OSError: [Errno 12] Cannot allocate memory

    Calculating Evaluation Metric Scores......

    loading annotations into memory... creating index... index created! Loading and preparing results... DONE (t=0.07s) creating index... index created! tokenization...

    Traceback (most recent call last):
      File "editnet.py", line 727, in evaluate
        cocoEval.evaluate()
      File "coco-caption/pycocoevalcap/eval.py", line 36, in evaluate
        gts = tokenizer.tokenize(gts)
      File "coco-caption/pycocoevalcap/tokenizer/ptbtokenizer.py", line 54, in tokenize
        stdout=subprocess.PIPE)
      File "subprocess.py", line 1295, in _execute_child
        restore_signals, start_new_session, preexec_fn)
    OSError: [Errno 12] Cannot allocate memory

    The error occurs when running editnet.py, most likely because of this line in the PTBTokenizer class: p_tokenizer = subprocess.Popen(cmd, cwd=path_to_jar_dirname, stdout=subprocess.PIPE). I don't know how to solve it. Can you help me with this? When I run dcnet.py, it works normally!

    opened by czhxiaohuihui 2
  • Some problems about the Meshed Transformer

    Sorry to bother you. I want to test the performance of the meshed decoder using the harvardnlp transformer code, following the snippet you mentioned in https://github.com/aimagelab/meshed-memory-transformer/issues/4, but the memory I get has shape [batch_size, num_boxes, d_model], which does not contain num_layers. For your Transformer model, are there other important things needed to make it work? Thanks a lot for your help!

    opened by HN123-123 2
  • I'm struggling to run tsv.py

    Following the instructions, I downloaded the 'trainval_36' file, unzipped it, and placed it in 'bottom-up_features'. I ran "python bottom-up_features/tsv.py" and it raises an error: no such file or directory: '../data/train2014'. Is there anything else I should do before running the script, besides placing the 'trainval_36' folder in the 'bottom-up_features' folder?

    (I'm using Google Colab.)

    opened by KwonDaYeong 1