Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction"

Antoine Bosselut

Last update: Jan 1, 2023

Overview

To run a generation experiment (either conceptnet or atomic), follow these instructions:

First Steps

First, clone the repo:

git clone https://github.com/atcbosselut/comet-commonsense.git

Then run the setup scripts to acquire the pretrained model files from OpenAI, as well as the ATOMIC and ConceptNet datasets:

bash scripts/setup/get_atomic_data.sh
bash scripts/setup/get_conceptnet_data.sh
bash scripts/setup/get_model_files.sh

Then install dependencies (assuming you already have Python 3.6 and PyTorch >= 1.0):

conda install tensorflow
pip install ftfy==5.1
conda install -c conda-forge spacy
python -m spacy download en
pip install tensorboardX
pip install tqdm
pip install pandas
pip install ipython
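
Before moving on, it can help to confirm that the packages above import in the active environment. The following is a minimal sketch and not part of the repository; the module names are the import names assumed for the packages listed above (for example, torch for PyTorch and IPython for ipython).

```python
import importlib.util

# Import names assumed for the packages installed above.
REQUIRED = ["torch", "tensorflow", "ftfy", "spacy", "tensorboardX", "tqdm", "pandas", "IPython"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the modules can be located; it does not check that their versions are compatible with the code.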

Making the Data Loaders

Run the following scripts to pre-initialize a data loader for ATOMIC or ConceptNet:

python scripts/data/make_atomic_data_loader.py
python scripts/data/make_conceptnet_data_loader.py

For the ATOMIC KG, if you'd like to make a data loader for only a subset of the relation types, comment out the unwanted relations in lines 17-25 of scripts/data/make_atomic_data_loader.py.

For ConceptNet, if you'd like to map the relations to natural language analogues, set opt.data.rel = "language" in line 26 of scripts/data/make_conceptnet_data_loader.py. If you want to initialize unpretrained relation tokens instead, set opt.data.rel = "relation".
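
To picture the difference between the two settings, here is a small illustrative sketch. It is not the repository's code: the phrases in the mapping and the angle-bracket token format are assumptions made for the example.

```python
# Hypothetical phrases; the repository defines its own mapping.
LANGUAGE_FORM = {"AtLocation": "at location", "IsA": "is a", "UsedFor": "used for"}


def render_relation(relation, mode):
    """Render a relation as words ("language") or as one special token ("relation")."""
    if mode == "language":
        return LANGUAGE_FORM[relation]
    if mode == "relation":
        return "<{}>".format(relation)  # assumed token format
    raise ValueError("unknown mode: {}".format(mode))


print(render_relation("AtLocation", "language"))
print(render_relation("AtLocation", "relation"))
```

With "language", the relation is spelled out in words the pretrained model has already seen; with "relation", each relation becomes a new token whose embedding has to be learned from scratch.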

Setting the ATOMIC configuration files

Open config/atomic/changes.json and set which categories you want to train, as well as any other details you find important. Check src/data/config.py for a description of different options. Variables you may want to change: batch_size, learning_rate, categories. See config/default.json and config/atomic/default.json for default settings of some of these variables.

Setting the ConceptNet configuration files

Open config/conceptnet/changes.json and set any changes to the default configuration that you may want to vary in this experiment. Check src/data/config.py for a description of different options. Variables you may want to change: batch_size, learning_rate, etc. See config/default.json and config/conceptnet/default.json for default settings of some of these variables.
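
The relationship between the default files and changes.json can be pictured as an overlay: values in the changes file replace the matching defaults. The sketch below illustrates that idea with a recursive dictionary merge. It is an assumption about the general pattern, not the repository's loading code (see src/data/config.py for that), and the key layout and values are placeholders.

```python
import copy


def overlay(defaults, changes):
    """Return a copy of `defaults` with `changes` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder values for illustration only; not the repository's defaults.
defaults = {"train": {"batch_size": 64, "learning_rate": 1e-5}, "data": {"categories": []}}
changes = {"train": {"batch_size": 32}}

config = overlay(defaults, changes)
print(config["train"])
```

Settings that the changes file does not mention keep their default values, so the changes file only needs to list what differs in a given experiment.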

Running the ATOMIC experiment

Training

For whichever experiment # you set in config/atomic/changes.json (e.g., 0, 1, 2, etc.), run:

python src/main.py --experiment_type atomic --experiment_num #

Evaluation

Once you've trained a model, run the evaluation script:

python scripts/evaluate/evaluate_atomic_generation_model.py --split $DATASET_SPLIT --model_name /path/to/model/file

Generation

Once you've trained a model, run the generation script for the type of decoding you'd like to do:

python scripts/generate/generate_atomic_beam_search.py --beam 10 --split $DATASET_SPLIT --model_name /path/to/model/file
python scripts/generate/generate_atomic_greedy.py --split $DATASET_SPLIT --model_name /path/to/model/file
python scripts/generate/generate_atomic_topk.py --k 10 --split $DATASET_SPLIT --model_name /path/to/model/file
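
The three scripts correspond to beam search, greedy decoding, and top-k sampling. As a rough illustration of how these strategies differ at a single decoding step, here is a toy sketch. It does not use the repository's sampler classes, and the next-token probabilities are made up.

```python
import math
import random

# Made-up next-token probabilities for illustration only.
probs = {"to": 0.5, "be": 0.3, "happy": 0.15, "sad": 0.05}


def greedy(dist):
    """Pick the single most probable token."""
    return max(dist, key=dist.get)


def top_k_sample(dist, k, rng):
    """Sample from the k most probable tokens, weighted by their probabilities."""
    top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens, weights = zip(*top)
    return rng.choices(tokens, weights=weights, k=1)[0]


def beam_step(beams, dist, beam_size):
    """Extend every (tokens, log_prob) beam by each token and keep the best `beam_size`.

    A real model would produce a different distribution for each beam;
    this toy reuses one distribution for simplicity.
    """
    candidates = [
        (tokens + [tok], score + math.log(p))
        for tokens, score in beams
        for tok, p in dist.items()
    ]
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]


rng = random.Random(0)
print(greedy(probs))
print(top_k_sample(probs, k=2, rng=rng))
print(beam_step([([], 0.0)], probs, beam_size=2))
```

Greedy decoding is deterministic and keeps one hypothesis, top-k sampling is random within the k best tokens, and beam search keeps several hypotheses alive and ranks them by accumulated log-probability.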

Running the ConceptNet experiment

Training

For whichever experiment # you set in config/conceptnet/changes.json (e.g., 0, 1, 2, etc.), run:

python src/main.py --experiment_type conceptnet --experiment_num #

Development and test set tuples are automatically evaluated and generated with greedy decoding during training.

Generation

If you want to generate with a larger beam size, run the generation script:

python scripts/generate/generate_conceptnet_beam_search.py --beam 10 --split $DATASET_SPLIT --model_name /path/to/model/file

Classifying Generated Tuples

To evaluate the correctness of your generated tuples with the classifier from Li et al., 2016, first download the pretrained model:

wget https://ttic.uchicago.edu/~kgimpel/comsense_resources/ckbc-demo.tar.gz
tar -xvzf ckbc-demo.tar.gz

Then run the following script on the generations file, which should be in .pickle format:

bash scripts/classify/classify.sh /path/to/generations_file/without/pickle/extension
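
Note that the script takes the path without its .pickle extension. A small sketch of deriving that argument from a full file name follows; the helper and the file name are hypothetical.

```python
import os


def classifier_argument(generations_path):
    """Drop a trailing .pickle extension, leaving other paths unchanged."""
    root, ext = os.path.splitext(generations_path)
    return root if ext == ".pickle" else generations_path


# Hypothetical file name for illustration.
print(classifier_argument("generations/my_run.pickle"))
```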

If you use this classification script, you'll also need Python 2.7 installed.

Playing Around in Interactive Mode

First, download the pretrained models from the following link:

https://drive.google.com/open?id=1FccEsYPUHnjzmX-Y5vjCBeyRt1pLo8FB

Then untar the file:

tar -xvzf pretrained_models.tar.gz

Then run the following script to interactively generate arbitrary ATOMIC event effects:

python scripts/interactive/atomic_single_example.py --model_file pretrained_models/atomic_pretrained_model.pickle

Or run the following script to interactively generate arbitrary ConceptNet tuples:

python scripts/interactive/conceptnet_single_example.py --model_file pretrained_models/conceptnet_pretrained_model.pickle

Bug Fixes

Beam Search

In BeamSampler in sampler.py, there was a bug that made the scoring function for each beam candidate slightly different from normalized loglikelihood. Only sequences decoded with beam search are affected by this. It's been fixed in the repository, and the fix seems to have little discernible impact on the quality of the generated sequences. If you'd like to replicate the exact paper results, however, you'll need to use the buggy beam search from before by setting paper_results = True in line 251 of sampler.py.
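
For reference, a length-normalized log-likelihood score can be sketched as follows. This assumes that "normalized loglikelihood" means the sum of token log-probabilities divided by the number of tokens; it is not copied from BeamSampler, and the probabilities are made up.

```python
import math


def normalized_log_likelihood(token_probs):
    """Average log-probability per token of a decoded sequence."""
    if not token_probs:
        raise ValueError("sequence must contain at least one token")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


# Made-up token probabilities for two beam candidates.
short = [0.5, 0.5]
long = [0.6, 0.6, 0.6, 0.6]

# Without normalization the shorter sequence has the higher total log-probability;
# with normalization the longer one scores higher because each token is more likely.
print(sum(math.log(p) for p in short), sum(math.log(p) for p in long))
print(normalized_log_likelihood(short), normalized_log_likelihood(long))
```

Dividing by length keeps beam search from favoring short sequences simply because they accumulate fewer negative log-probability terms.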

References

Please cite this repository using the following reference:

@inproceedings{Bosselut2019COMETCT,
  title={COMET: Commonsense Transformers for Automatic Knowledge Graph Construction},
  author={Antoine Bosselut and Hannah Rashkin and Maarten Sap and Chaitanya Malaviya and Asli Çelikyilmaz and Yejin Choi},
  booktitle={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2019}
}

Comments

Is the model a transformer encoder-decoder or just decoder?

Hi,

Thank you for the very nice and interesting work. I have a question regarding the model. In the paper it's mentioned that you used the same architecture as GPT, which is a transformer decoder. However, you also talk about an input encoder and how you configure the input to the encoder. I was a bit confused about whether you have both an encoder and a decoder, or whether it is a language model, in which case it would be just a decoder. And if it is a decoder, how do you encode your [X_s, X_r]? Could you please clarify this? Many thanks.

opened by ehsan-soe 8

Difficulty reproducing ConceptNet scores

I'm having some difficulty reproducing the ConceptNet accuracy scores. Could you please point me in the right direction? These are the steps I took:

Download pretrained COMET model from https://drive.google.com/open?id=1FccEsYPUHnjzmX-Y5vjCBeyRt1pLo8FB

Run generations with greedy decoding on conceptnet test set

Use Bilinear AVG model by Li et al. (2016) to score generated tuples and threshold by 0.5 to get accuracy

However, because I achieved an accuracy of 80.04%, which is much lower than the 95.25% reported in your Table, I'm sure I must be doing something wrong! The (short) code I used to get the scores is here: https://colab.research.google.com/drive/10oaX-_1qS75xrgbm67MDkRy_hOHEJA0s

opened by chiayewken 6

Can this model give a knowledge graph according to the input sentence?

Can this model give a knowledge graph according to the input sentence? Such as "One absolutely cardinal reason is the fact that universities offer an opportunity to intensify the knowledge in a particular field of interest." Maybe this is a long sentence, I want to ask if our model has such a function?

opened by zysNLP 4
Environment install problems

In the README.md you need to change the tensorflow and spacy install methods. pip installs a different version of tensorflow than the one needed, so it works better to install it with conda. Also, spacy 3.0, which is now the default install, doesn't work with your data loader. Proposed changes:

pip install tensorflow --> conda install tensorflow
conda install -c conda-forge spacy --> conda install -c conda-forge spacy=2.3.5

opened by sawyermade 2

Why not exclude "none" answers in the ATOMIC dataset?

Hello, Sir.

I have something to ask you about this code.

I wonder why you do not exclude the 'none' target answers.

Your code seems to include data samples like "PersonX loses personX's sight xNeed none" as training or test data.

I think learning to generate 'none' does not seem meaningful.

And I found the same thing in COMET-ATOMIC 2020.

Is there any specific reason?

Thanks.

opened by yongho94 2

Feeding an input event longer than 18 words

Just want to make sure: are we meant to change the original setting values of 17 and 35 for ATOMIC to bigger values in order to feed in a sentence longer than 18 words?

Or can we feed in a long sentence?

opened by terryyz 2

Question about a runtime error

Here are the details of the error:

  File "comet-commonsense/src/interactive/functions.py", line 124, in get_atomic_sequence
    input_event, category, data_loader, text_encoder)
  File "comet-commonsense/src/interactive/functions.py", line 158, in set_atomic_inputs
    XMB[:, :len(prefix)] = torch.LongTensor(prefix)
RuntimeError: The expanded size of the tensor (18) must match the existing size (93) at non-singleton dimension 1. Target sizes: [1, 18]. Tensor sizes: [93]

Do you know how to deal with it? @madaan

opened by shaoniana1997 1

How to generate xIntent output file for a given input file in atomic_single_example.py

Hello!

Thank you for sharing your work! I want to generate xIntent for some sequences. Is there a way that I can give an input file and have the output be stored in another file?

Thanks for your time!

Frank

opened by knarfamlap 1

Question about Making the Data Loaders

Dear author: I followed the instructions in the README file, and when I tried to make the data loader for ConceptNet I came across this error:

~/comet-commonsense$ python scripts/data/make_conceptnet_data_loader.py
100%|███████████████████████████████████████████| 100000/100000 [04:06<00:00, 405.46it/s]
100%|███████████████████████████████████████████████████| 2400/2400 [00:05<00:00, 408.77it/s]
100%|████████████████████████████████████| 2400/2400 [00:05<00:00, 426.74it/s]
28 16 16 dev
Traceback (most recent call last):
  File "scripts/data/make_conceptnet_data_loader.py", line 66, in <module>
    data_loader.make_tensors(text_encoder, special, test=False)
  File "/home/caihanqing/comet-commonsense/src/data/conceptnet.py", line 196, in make_tensors
    self.data[split]['total']) if not j[3]]))
RuntimeError: index out of range: Tried to access index 2306 out of table with 2305 rows. at /opt/conda/conda-bld/pytorch_1570910687230/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418

Can you give me some useful advice for my further training? @madaan @atcbosselut

opened by shaoniana1997 1

MASK tokens

Hello,

I have a question about MASK tokens in the input examples (Figure 3, paper). Which role do they play in training and prediction? Unfortunately, I could not find the explanation in the paper. What is the role of mask tokens between subject tokens and relation tokens (ConceptNet) when only object tokens need to be predicted? Thank you very much in advance.

opened by JohannaOm 1

Different inference result

Hi, I've tested a number of cases using both the code and parameters from this repo and the demo at https://mosaickg.apps.allenai.org/comet_atomic, and I'm confused because there are obvious differences in the results. What exactly is the difference between them? Thanks

opened by xwwwwww 1

Can I use a longer input?

Dear authors,

I found you set max_event and max_affect to 17 and 35 respectively. However, my input event is much longer than 17. Can I change these values according to my needs?

After scanning your code, I plan to change opt['data']["maxe1"] in main_atomic.py and self.max_event in atomic.py. Will that work?

opened by RenzeLou 4

Testing COMET on new ConceptNet-like words

Congratulations on your work! What part of the code should I use in order to give as input a text file with some new words (begin nodes) and receive relations with generated new words/concepts (end nodes)? Thank you.

opened by alkiskatsalis 0

AttributeError: 'OpenAIGPTTokenizerFast' object has no attribute 'added_tokens_encoder'

When I run python -m comet2.train, I get the error: AttributeError: 'OpenAIGPTTokenizerFast' object has no attribute 'added_tokens_encoder'. How can I fix it? Maybe the issue comes from the transformers version?

opened by RyanYip-Kat 1

Same loss value is logged for different loss types

Hi,

Seems like token level averaged NLL loss will be logged for all loss types:

https://github.com/atcbosselut/comet-commonsense/blob/070aad114600b36296ef8420325e3d4cef0be470/src/train/train.py#L118

If you can confirm that this is indeed an issue, I can submit a pull request.

Thanks

opened by madaan 0

Owner

Antoine Bosselut

I am an assistant professor at EPFL working on learning algorithms for NLP and knowledge graphs. Previously @snap-stanford @stanfordnlp @allenai @uwnlp
