Machine-in-the-Loop Rewriting for Creative Image Captioning

Vishakh P

Last update: Jul 24, 2022

Related tags

Deep Learning mil-creative-captioning

Overview

Machine-in-the-Loop Rewriting for Creative Image Captioning

Data

Annotated sources of data used in the paper:

Data Source	URL
Mohammed et al.	Link
Gordon et al.	Link
Bostan et al.	Link
Niculae et al.	Link
Steen et al.	Link

TODO: Individual data cleaning scripts

Model Training

Follow the README in the model_training directory to train a Fairseq BART model. To obtain our trained model, please contact the authors.

Interface

Code for the UI we used in our interactive experiments. The UI runs as a server and needs a backend GPU for model inference during interaction. Each interaction is saved with a unique ID, which we use to match interactions to our crowdworkers for experimental analysis.
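As a rough illustration of saving each interaction under a unique ID, here is a minimal sketch using only the Python standard library. It is not the repository's actual logging code: the function names (save_interaction, load_interaction), the record fields, and the one-JSON-file-per-interaction layout are all assumptions made for this example.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def save_interaction(log_dir, user_text, model_suggestion, accepted):
    """Write one interaction to its own JSON file and return its unique ID.

    All field names here are hypothetical; the real interface may log
    different information.
    """
    interaction_id = uuid.uuid4().hex  # random 32-character hex identifier
    record = {
        "interaction_id": interaction_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_text": user_text,
        "model_suggestion": model_suggestion,
        "accepted": accepted,
    }
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{interaction_id}.json"
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8")
    return interaction_id


def load_interaction(log_dir, interaction_id):
    """Read back a saved interaction by its ID."""
    path = Path(log_dir) / f"{interaction_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

With a layout like this, the ID returned by save_interaction could be shown to the crowdworker at the end of a session and later used with load_interaction to pair the logged record with that worker's submission.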

TODO: Data Processing Scripts to filter results

You might also like...

An Image Captioning codebase

This is a codebase for image captioning research. It supports: Self critical training from Self-critical Sequence Trainin

1.1k Oct 18, 2021

A transformer-based method for Healthcare Image Captioning in Vietnamese

vieCap4H Challenge 2021: A transformer-based method for Healthcare Image Captioning in Vietnamese. This GitHub repo contains our solution for vieCap4H

4 May 5, 2022

Image Captioning using CNN ,LSTM and Attention

This is a deep learning model that tries to summarize an image as text. Installation Install this

1 Dec 16, 2021

Image Captioning on google cloud platform based on iot

Image-Captioning-on-google-cloud-platform-based-on-iot - Image Captioning on google cloud platform based on iot

1 Jan 20, 2022

Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

PyMAF This repository contains the code for the following paper: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop Hongwe

450 Dec 28, 2022

Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"

Storium GPT-2 Models This is the official repository for the GPT-2 models described in the EMNLP 2020 paper [STORIUM: A Dataset and Evaluation Platfor

27 Dec 20, 2022

A list of papers about point cloud based place recognition, also known as loop closure detection in SLAM (processing)

17 May 16, 2021

FluxTraining.jl gives you an endlessly extensible training loop for deep learning

A flexible neural net training library inspired by fast.ai

86 Dec 31, 2022

FLVIS: Feedback Loop Based Visual Inertial SLAM

FLVIS Feedback Loop Based Visual Inertial SLAM 1-Video EuRoC DataSet MH_05 Handheld Test in Lab FlVIS on UAV Platform 2-Relevent Publication: Under Re

182 Dec 4, 2022

Comments

Is there a download link for the trained model?

I find this rewriting model impressive. Could someone share a link to the trained model? I tried emailing the author but unfortunately did not receive a reply.

Thanks in advance!

opened by lizekui 1


RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP

[Paper] [Хабр] [Model Card] [Colab] [Kaggle] RuDOLPH ☃️ One Hyper-Modal Tr

230 Dec 31, 2022

Video-Captioning - A machine Learning project to generate captions for video frames indicating the relationship between the objects in the video

1 Jan 23, 2022

Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)

Diverse Image Captioning with Context-Object Split Latent Spaces This repository is the PyTorch implementation of the paper: Diverse Image Captioning

34 Nov 21, 2022

Semi-Autoregressive Transformer for Image Captioning

Semi-Autoregressive Transformer for Image Captioning Requirements Python 3.6 Pytorch 1.6 Prepare data Please use git clone --recurse-submodules to clo

23 Dec 9, 2022

improvement of CLIP features over the traditional resnet features on the visual question answering, image captioning, navigation and visual entailment tasks.

CLIP-ViL In our paper "How Much Can CLIP Benefit Vision-and-Language Tasks?", we show the improvement of CLIP features over the traditional resnet fea

310 Dec 28, 2022

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.

An unreferenced image captioning metric (ACL-21)

Image Captioning using CNN and Transformers

Optimized code based on M2 for faster image captioning training