This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models"

Last update: Jan 2, 2023

Overview

GreaseLM: Graph REASoning Enhanced Language Models

Usage

1. Dependencies

Python == 3.8
PyTorch == 1.8.0
transformers == 3.4.0
torch-geometric == 1.7.0

Run the following commands to create a conda environment (assuming CUDA 10.1):

conda create -y -n greaselm python=3.8
conda activate greaselm
pip install numpy==1.18.3 tqdm
pip install torch==1.8.0+cu101 torchvision -f https://download.pytorch.org/whl/torch_stable.html
pip install transformers==3.4.0 nltk spacy
pip install wandb
conda install -y -c conda-forge tensorboardx
conda install -y -c conda-forge tensorboard

# for torch-geometric
pip install torch-scatter==2.0.7 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-cluster==1.5.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-sparse==0.6.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-spline-conv==1.2.1 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-geometric==1.7.0 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
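
After installation, a quick sanity check can confirm that the pinned versions are the ones actually active in the environment. The sketch below is not part of the repo; `EXPECTED` and `find_mismatches` are hypothetical names, and it assumes the distributions are registered under the names `torch`, `transformers` and `torch-geometric`.

```python
from importlib import metadata

# Version pins copied from the dependency list above.
EXPECTED = {
    "torch": "1.8.0",
    "transformers": "3.4.0",
    "torch-geometric": "1.7.0",
}


def normalize(version):
    # Drop a local build suffix such as "+cu101" so "1.8.0+cu101" matches "1.8.0".
    return version.split("+", 1)[0]


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found is None or normalize(found) != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every pinned package matches; anything printed points at a package to reinstall.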

2. Download data

Download all the raw data -- ConceptNet, CommonsenseQA, OpenBookQA -- by running

./download_raw_data.sh

You can preprocess the raw data by running

CUDA_VISIBLE_DEVICES=0 python preprocess.py -p

You can specify the GPU you want to use at the beginning of the command with CUDA_VISIBLE_DEVICES=.... The script will:

Set up ConceptNet (e.g., extract English relations from ConceptNet, merge the original 42 relation types into 17 types)
Convert the QA datasets into .jsonl files (e.g., stored in data/csqa/statement/)
Identify all mentioned concepts in the questions and answers
Extract subgraphs for each q-a pair
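
To give a feel for the last two steps, here is a toy sketch of subgraph extraction for one q-a pair. It is not the repo's implementation: it assumes a simple 2-hop, all-pairs rule (keep the grounded concepts plus any node adjacent to two of them), and the graph, concept names and function names are illustrative only.

```python
from itertools import combinations

# Toy undirected knowledge graph: concept -> set of neighbouring concepts.
# These edges are made up for illustration and are not taken from ConceptNet.
TOY_GRAPH = {
    "bird": {"wing", "animal"},
    "wing": {"bird", "fly"},
    "fly": {"wing", "airplane"},
    "airplane": {"fly"},
    "animal": {"bird"},
}


def extract_subgraph(graph, question_concepts, answer_concepts):
    """Return the grounded concepts plus every node bridging a pair of them."""
    grounded = set(question_concepts) | set(answer_concepts)
    nodes = set(grounded)
    for a, b in combinations(sorted(grounded), 2):
        # A bridge node is adjacent to both concepts, giving a 2-hop path a - x - b.
        nodes |= graph.get(a, set()) & graph.get(b, set())
    return nodes


def induced_edges(graph, nodes):
    """Return the undirected edges whose endpoints are both inside nodes."""
    return {
        tuple(sorted((u, v)))
        for u in nodes
        for v in graph.get(u, set())
        if v in nodes
    }


nodes = extract_subgraph(TOY_GRAPH, ["bird"], ["fly"])
edges = induced_edges(TOY_GRAPH, nodes)
print(sorted(nodes), sorted(edges))
```

Here "wing" is pulled in because it connects the question concept "bird" to the answer concept "fly", while "animal" and "airplane" are left out.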

TL;DR. The preprocessing may take a long time; for your convenience, you can download all the processed data here into the top-level directory of this repo and run

unzip data_preprocessed.zip

Add MedQA-USMLE. Besides the commonsense QA datasets (CommonsenseQA, OpenBookQA) with the ConceptNet knowledge graph, we added a biomedical QA dataset (MedQA-USMLE) with a biomedical knowledge graph based on Disease Database and DrugBank. You can download all the data for this from [here]. Unzip it and put the medqa_usmle and ddb folders inside the data/ directory.

The resulting file structure should look like this:

.
├── README.md
└── data/
    ├── cpnet/                 (preprocessed ConceptNet)
    └── csqa/
        ├── train_rand_split.jsonl
        ├── dev_rand_split.jsonl
        ├── test_rand_split_no_answers.jsonl
        ├── statement/             (converted statements)
        ├── grounded/              (grounded entities)
        ├── graphs/                (extracted subgraphs)
        ├── ...
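
Before training, it can help to check that the layout above is in place. The following is a minimal sketch, not a script shipped with the repo; `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and the list only covers the entries shown in the tree.

```python
from pathlib import Path

# Paths relative to the data directory, copied from the tree above.
EXPECTED_PATHS = [
    "cpnet",
    "csqa/train_rand_split.jsonl",
    "csqa/dev_rand_split.jsonl",
    "csqa/test_rand_split_no_answers.jsonl",
    "csqa/statement",
    "csqa/grounded",
    "csqa/graphs",
]


def missing_paths(data_dir, expected=EXPECTED_PATHS):
    """Return the expected entries that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths("data"):
        print(f"missing: data/{rel}")
```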

3. Training GreaseLM

To train GreaseLM on CommonsenseQA, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh csqa --data_dir data/

You can specify up to 2 GPUs at the beginning of the command with CUDA_VISIBLE_DEVICES=....
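
If you wrap the training command in your own launcher, a small guard can catch a CUDA_VISIBLE_DEVICES value that lists more than two devices. This is only a sketch; `visible_gpu_ids` and `check_gpu_count` are hypothetical helpers and are not part of the repo.

```python
import os


def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of device id strings.

    An empty list means the variable is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def check_gpu_count(env=None, max_gpus=2):
    """Raise ValueError if more than max_gpus devices are listed."""
    ids = visible_gpu_ids(env)
    if len(ids) > max_gpus:
        raise ValueError(f"expected at most {max_gpus} GPUs, got {len(ids)}: {ids}")
    return ids


if __name__ == "__main__":
    print(check_gpu_count({"CUDA_VISIBLE_DEVICES": "0,1"}))
```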

Similarly, to train GreaseLM on OpenBookQA, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh obqa --data_dir data/

To train GreaseLM on MedQA-USMLE, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm__medqa_usmle.sh

4. Pretrained model checkpoints

You can download a pretrained GreaseLM model on CommonsenseQA here, which achieves an IH-dev acc. of 79.0 and an IH-test acc. of 74.0.

You can also download a pretrained GreaseLM model on OpenBookQA here, which achieves a test acc. of 84.8.

You can also download a pretrained GreaseLM model on MedQA-USMLE here, which achieves a test acc. of 38.5.

5. Evaluating a pretrained model checkpoint

To evaluate a pretrained GreaseLM model checkpoint on CommonsenseQA, run

CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh csqa --data_dir data/ --load_model_path /path/to/checkpoint

Again, you can specify up to 2 GPUs at the beginning of the command with CUDA_VISIBLE_DEVICES=....

Similarly, to evaluate a pretrained GreaseLM model checkpoint on OpenBookQA, run

CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh obqa --data_dir data/ --load_model_path /path/to/checkpoint

6. Use your own dataset

Convert your dataset to {train,dev,test}.statement.jsonl in .jsonl format (see data/csqa/statement/train.statement.jsonl)
Create a directory in data/{yourdataset}/ to store the .jsonl files
Modify preprocess.py and perform subgraph extraction for your data
Modify utils/parser_utils.py to support your own dataset
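
As a starting point for the first step, the sketch below turns a multiple-choice question into one statement record and writes records as .jsonl. The field names (`id`, `answerKey`, `question`, `statements`) and the way each statement is formed are assumptions here; compare the output against data/csqa/statement/train.statement.jsonl and adjust before using it.

```python
import json


def to_statement_record(qid, stem, choices, answer_index):
    """Build one record with one statement per answer choice.

    The statement for the correct choice is labelled True, all others False.
    Joining stem and choice with a space is a placeholder for a real
    question-to-statement conversion.
    """
    labels = "ABCDEFGH"
    return {
        "id": qid,
        "answerKey": labels[answer_index],
        "question": {
            "stem": stem,
            "choices": [{"label": labels[i], "text": c} for i, c in enumerate(choices)],
        },
        "statements": [
            {"label": i == answer_index, "statement": f"{stem} {c}"}
            for i, c in enumerate(choices)
        ],
    }


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as fout:
        for record in records:
            fout.write(json.dumps(record) + "\n")


record = to_statement_record("q1", "Where do birds fly?", ["sky", "sea"], 0)
print(json.dumps(record))
```

Calling `write_jsonl(records, "data/{yourdataset}/statement/train.statement.jsonl")` with your own records would then produce a file for the later preprocessing steps.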

Acknowledgment

This repo is built upon the following work:

QA-GNN: Question Answering using Language Models and Knowledge Graphs
https://github.com/michiyasunaga/qagnn

Many thanks to the authors and developers!

Comments

[Help] About the hyper-parameters to reproduce the result

@XikunZhang @michiyasunaga @roks

Hi,

Thanks for your great effort!

I've run the code in this repo with the same hyper-parameters provided in the script run_greaselm.sh, which are also the same as reported in the paper. But the results aren't as good as reported in the paper. For example, on csqa, the reported dev_acc and test_acc are 78.5 (±0.5) and 74.2 (±0.4) respectively, but the model I trained only reaches 77.48 and 73.01 respectively.

I've tried several random seeds, but the problem still exists. So could you please release the hyper-parameters (i.e. random seed) that you used when you trained the model?

Look forward to your response!

opened by yeeeqichen 2
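
For reference, the gap described in this issue can be expressed in units of the reported standard deviation. The short calculation below uses only the numbers quoted above and assumes the ± values are standard deviations over runs; `deviation_in_stds` is a hypothetical helper.

```python
def deviation_in_stds(observed, reported_mean, reported_std):
    """Return how many reported standard deviations observed lies from the mean."""
    return (observed - reported_mean) / reported_std


# Numbers quoted in the issue: reported 78.5 (±0.5) dev and 74.2 (±0.4) test,
# reproduced 77.48 dev and 73.01 test.
dev_gap = deviation_in_stds(77.48, 78.5, 0.5)
test_gap = deviation_in_stds(73.01, 74.2, 0.4)
print(f"dev gap: {dev_gap:.2f} std, test gap: {test_gap:.2f} std")
```

Under that assumption, the reproduced dev accuracy sits roughly two standard deviations below the reported mean and the test accuracy roughly three.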

Cannot reshape array of size 0 into shape (0)

Hi Xikun @XikunZhang ,

Thanks for your great work. When I preprocessed csqa, I hit this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 337, in concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3
    adj, concepts = concepts2adj(schema_graph)
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 128, in concepts2adj
    adj = coo_matrix(adj.reshape(-1, n_node))
ValueError: cannot reshape array of size 0 into shape (0)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocess.py", line 131, in <module>
    main()
  File "preprocess.py", line 125, in main
    rt_dic['func'](*rt_dic['args'])
  File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 512, in generate_adj_data_from_grounded_concepts__use_LM
    res3 = list(tqdm(p.imap(concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3, res2), total=len(res2)))
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
ValueError: cannot reshape array of size 0 into shape (0)

I have tried to fix it by editing the line https://github.com/snap-stanford/GreaseLM/blob/803946bba3273556c1ff2be6ad8b02850fe5972d/preprocess_utils/graph.py#L128 to just ignore the reshape method if the array has size 0:

try:
        adj = coo_matrix(adj.reshape(-1, n_node))
except:
        print("FAIL concepts2adj")

I think my edit is incorrect, because when running evaluation I got this error:

points/csqa/csqa_model.pt
***** hyperparameters *****
dataset: csqa
******************************
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
NLP
pid: 74920
screen: 

gpu: 1

torch version: 1.8.0+cu101
torch cuda version: 10.1
cuda is available: True
cuda device count: 1
cudnn version: 7603
wandb id:  1ziiml5l
loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
train_statement_path ./data//csqa/statement/train.statement.jsonl
num_choice 5
Loading sparse adj data...
loading adj matrices: 100%|███████████████████████████████████████████████████████████████████████| 48705/48705 [00:22<00:00, 2158.86it/s]
| ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate： 0.00 | qc_num: 5.46 | ac_num: 1.54 |
Traceback (most recent call last):
  File "greaselm.py", line 606, in <module>
    main(args)
  File "greaselm.py", line 546, in main
    evaluate(args, has_test_split, devices, kg)
  File "greaselm.py", line 449, in evaluate
    dataset = load_data(args, devices, kg)
  File "greaselm.py", line 50, in load_data
    dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
  File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 121, in __init__
    assert all(len(self.train_qids) == len(self.train_adj_data[0]) == x.size(0) for x in [self.train_labels] + self.train_encoder_data + self.train_decoder_data)
AssertionError

Could you give me some advice on how to fix the first error?

Thank you & BR,

opened by dxlong2000 2

retrieve graph for single sentence input

Hi, thank you for introducing such an intriguing work.

Your proposed subgraph retrieval process is customized for Q-A pair input.

What kinds of code snippets should I modify to retrieve a graph for just a single sentence input?

The reason why I ask is that I'd like to make a custom pipeline that samples ConceptNet's subgraph for given custom string input.

opened by kaiiwoo 1

The experience on complex questions with semantic nuance

I also want to try similar experiments on different complex questions with semantic nuance, but I can't find the classification method you used (prepositional phrases, negation terms, hedging terms) for the complex questions. If you still have the code for handling these questions, could you share it with me? Thanks!

opened by HAOChuzhan 0

Question about freezing LM parameter problem

Hi, I'm very interested in this model. I want to know why the LM parameters are frozen in the early epochs. To my knowledge, LM parameters are fine-tuned in the early epochs (1~3) and then frozen. I would really appreciate your help.

opened by lokking 0

FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

Hi Xikun,

Thanks for your great work. May I ask where I can get this inhouse_split_qids.txt file?

***** hyperparameters *****
dataset: csqa
******************************
wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
NLP
pid: 493
screen: 

gpu: 1

torch version: 1.8.0+cu101
torch cuda version: 10.1
cuda is available: True
cuda device count: 1
cudnn version: 7603
wandb id:  1iygcifx
loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
train_statement_path ./data//csqa/statement/train.statement.jsonl
num_choice 5
Loading sparse adj data...
| ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate： 0.00 | qc_num: 5.46 | ac_num: 1.54 |
Finish loading training data.
Loading sparse adj data...
| ori_adj_len: mu 12.16 sigma 10.18 | adj_len: 13.16 | prune_rate： 0.00 | qc_num: 5.34 | ac_num: 1.54 |
Finish loading dev data.
Loading sparse adj data...
| ori_adj_len: mu 12.02 sigma 9.17 | adj_len: 13.02 | prune_rate： 0.00 | qc_num: 5.48 | ac_num: 1.53 |
Finish loading test data.
Traceback (most recent call last):
  File "greaselm.py", line 606, in <module>
    main(args)
  File "greaselm.py", line 546, in main
    evaluate(args, has_test_split, devices, kg)
  File "greaselm.py", line 449, in evaluate
    dataset = load_data(args, devices, kg)
  File "greaselm.py", line 50, in load_data
    dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
  File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 144, in __init__
    with open(inhouse_train_qids_path, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

BR,

opened by dxlong2000 2

[Discussion] Relevance of GreaseLM results in light of 'GNN is a Counter?..' paper + dataset discussion

Hi,

Very interesting work on the combination of LM + KG! This is something I am looking into myself as a research project (https://github.com/apoorvumang/transformer-kgc), and I thought this would be a good place to discuss what datasets such models should be used on.

In the very recently released paper GNN is a Counter? Revisiting GNN for Question Answering, (code at https://github.com/anonymousGSC/graph-soft-counter), they show that a 1-dim GNN + LM is able to achieve almost SOTA results on both OpenBookQA and CommonsenseQA. In fact according to their numbers it even outperforms GreaseLM on both these datasets.

I would like to discuss a few things regarding the dataset situation:

The CommonsenseQA leaderboard no longer accepts ConceptNet-based submissions, which is quite a bummer, and OpenBookQA is extremely small (500 test and 500 dev questions only, around 5k train). Is it worth it (for me and others) to work with these datasets, given the findings of the 'GNN...' paper?

If not, could GreaseLM (and similar methods) be applied to regular KGQA datasets such as WebQuestionsSP, ComplexWebQuestions or GrailQA? This of course would be harder since it's no longer MCQ reasoning, but it might be more interesting and could give real evidence of LM + KG based reasoning.

Are there any other datasets apart from the ones I mentioned that could be relevant in this area? (MedQA-USMLE is of course one, but I feel it is quite new, and having another older/more established dataset would be an advantage.)

Looking forward to a healthy discussion! 😊

opened by apoorvumang 0

RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB

When we try to run greaselm.py, we get this error even with a minimum batch size of 8.

We tried batch sizes from 128 down to 8. Every time, it throws the error after some epochs, with a different amount of free memory reported. Can you help us resolve this issue and run the code?

logits, _ = model(*[x[a:b] for x in input_data])
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 85, in forward
  logits, attn = self.lmgnn(lm_inputs, concept_ids,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 217, in forward
  outputs, gnn_output = self.mp(input_ids, token_type_ids, attention_mask, output_mask, gnn_input, adj, node_type_ids, node_scores, special_nodes_mask, output_hidden_$
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 411, in forward
  encoder_outputs, _X = self.encoder(embedding_output,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 815, in forward
  _X = self.gnn_layers[gnn_layer_index](_X, edge_index, edge_type, _node_type, _node_feature_extra)
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
  result = self.forward(*input, **kwargs)
File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_gnn.py", line 91, in forward
  aggr_out = self.propagate(edge_index, x=x, edge_attr=edge_embeddings) #[N, emb_dim]
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 261, in propagate
  coll_dict = self.__collect__(self.__user_args__, edge_index, size,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 171, in __collect__
  data = self.__lift__(data, edge_index,
File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 141, in __lift__
  return src.index_select(self.node_dim, index)
RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 15.78 GiB total capacity; 14.28 GiB already allocated; 133.50 MiB free; 14.39 GiB reserved in tot

opened by tarun3300 3
