PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

Zechen Bai

Last update: Jul 8, 2022

Overview

Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

[Code] [Data] [Project Page]

Official PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation, published at ICCV 2021.

Have you ever looked at a painting and wondered what the story behind it is? This work presents a framework to bring art closer to people by generating comprehensive descriptions of fine-art paintings. Generating informative descriptions for artworks, however, is extremely challenging, as it requires 1) describing multiple aspects of the image, such as its style, content, or composition, and 2) providing background and contextual knowledge about the artist, their influences, or the historical period. To address these challenges, we introduce a multi-topic and knowledgeable art description framework, which modulates the generated sentences according to three artistic topics and, additionally, enhances each description with external knowledge. The framework is validated through an exhaustive analysis, both quantitative and qualitative, as well as a comparative human evaluation, demonstrating outstanding results in terms of both topic diversity and information veracity.

Setup

Requirements

The code is tested under Python 3.6 with the following packages:

torch==1.1.0
torchvision==0.2.2
numpy==1.16.2
visdom==0.1.8.9
transformers==2.1.1
nltk==3.2.3
stanfordcorenlp==3.9.1.1
scipy==1.3.1
pandas==0.25.1
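Because these pins are old, it can help to confirm that the active environment actually matches them before training. The following is a minimal sketch, not part of the repository: the helper names `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical, and the pin list is copied from the requirements above.

```python
# Pinned versions copied from the Requirements list above.
PINNED = """\
torch==1.1.0
torchvision==0.2.2
numpy==1.16.2
visdom==0.1.8.9
transformers==2.1.1
nltk==3.2.3
stanfordcorenlp==3.9.1.1
scipy==1.3.1
pandas==0.25.1
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # standard library from Python 3.8
    except ImportError:
        # Fallback for older interpreters such as Python 3.6 (needs setuptools).
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(parse_pins(PINNED)).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the script prints one line per package that is missing or installed at a different version. No output means the environment matches the pins.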

Prepare Data

1. Download the dataset from this repository.

2. Put the annotation folder into the MaskedSentenceGeneration directory.
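A quick check that the data landed in the right place can save a failed run later. The sketch below assumes the folder is literally named `annotation` and sits directly inside `MaskedSentenceGeneration`. Both the folder name and the helper `annotation_dir_ready` are assumptions for illustration, so adjust them if your download is laid out differently.

```python
from pathlib import Path


def annotation_dir_ready(repo_root, folder_name="annotation"):
    """Return True if <repo_root>/MaskedSentenceGeneration/<folder_name>
    exists and contains at least one entry."""
    # folder_name defaults to "annotation", which is an assumed name.
    target = Path(repo_root) / "MaskedSentenceGeneration" / folder_name
    return target.is_dir() and any(target.iterdir())


if __name__ == "__main__":
    # Run from the repository root.
    if annotation_dir_ready("."):
        print("annotation folder found")
    else:
        print("annotation folder missing or empty; see the Prepare Data steps")
```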

Masked Sentence Generation

cd MaskedSentenceGeneration
python prepare_dataset.py
bash train.sh
bash test_one.sh   # or: bash test_all.sh
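The steps above can also be chained from a small driver script that stops as soon as one step fails. This is only a sketch under stated assumptions: the `run_steps` helper and its `dry_run` flag are hypothetical and not part of the repository, and the commands are simply the ones listed above, run from the repository root.

```python
import subprocess

# Commands copied from the steps above; test_all.sh can replace test_one.sh.
STEPS = [
    ["python", "prepare_dataset.py"],
    ["bash", "train.sh"],
    ["bash", "test_one.sh"],
]


def run_steps(steps, workdir="MaskedSentenceGeneration", dry_run=False):
    """Run each command inside workdir, stopping at the first failure.

    Returns the list of commands that were run (or would be run in a dry run).
    """
    done = []
    for cmd in steps:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(cmd, cwd=workdir, check=True)
        done.append(cmd)
    return done


if __name__ == "__main__":
    # Dry run by default: print the plan without executing anything.
    for cmd in run_steps(STEPS, dry_run=True):
        print(" ".join(cmd))
```

Calling `run_steps(STEPS)` without `dry_run` executes the commands for real. Training may take a long time, so the dry run is a cheap way to confirm the order first.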

Knowledge Retrieval

Please see the instructions here.

Knowledge Filling

cd KnowledgeFilling
python create_dataset_drqa_src.py
bash train.sh
bash test.sh

Citation

If you find the data in this repository useful, please cite our paper:

@InProceedings{bai2021explain,
   author    = {Zechen Bai and Yuta Nakashima and Noa Garcia},
   title     = {Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation},
   booktitle = {International Conference on Computer Vision},
   year      = {2021},
}

Comments

Help With Setting Up RCNN Module

Hello,

I am trying to set up the KnowledgeRetrever module. I have been able to deploy the DrQA and context-art-classification submodules, but I keep getting C errors relating to RCNN when trying to run build-visual-concept.py, and I am not entirely clear where I am going wrong. There are many sets of instructions relating to setting up RCNN, and I am not sure which ones are relevant to getting art-description running.

I would greatly appreciate working through this with someone who can guide me or provide some additional instructions on getting the RCNN part set up.

Thank you!

opened by SafaTinaztepe

Some questions about test_one.sh

Hello @JosephPai

When I use MaskedSentenceGeneration/test_one.sh, I get an error. The error message is "IndexError: tensors used as indices must be long, byte or bool tensors". How can I solve it?

opened by zml110120

Please provide Pretrained Models

Hi @JosephPai

How are you? Hope you are doing well!

I am Tarun Makkar. I am working on a project related to this, and training is taking a very long time. I have a request: could you please provide pretrained models for this?

I would be very thankful to you.

Thanks and best,

Tarun Makkar

opened by makkarss929
