This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models"

Overview

GreaseLM: Graph REASoning Enhanced Language Models

This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models".

Usage

1. Dependencies

Run the following commands to create a conda environment (assuming CUDA 10.1):

conda create -y -n greaselm python=3.8
conda activate greaselm
pip install numpy==1.18.3 tqdm
pip install torch==1.8.0+cu101 torchvision -f https://download.pytorch.org/whl/torch_stable.html
pip install transformers==3.4.0 nltk spacy
pip install wandb
conda install -y -c conda-forge tensorboardx
conda install -y -c conda-forge tensorboard

# for torch-geometric
pip install torch-scatter==2.0.7 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-cluster==1.5.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-sparse==0.6.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-spline-conv==1.2.1 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
pip install torch-geometric==1.7.0 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html

2. Download data

Download all the raw data -- ConceptNet, CommonsenseQA, OpenBookQA -- by

./download_raw_data.sh

You can preprocess the raw data by running

CUDA_VISIBLE_DEVICES=0 python preprocess.py -p 
   

   

You can specify the GPU you want to use in the beginning of the command CUDA_VISIBLE_DEVICES=.... The script will:

  • Setup ConceptNet (e.g., extract English relations from ConceptNet, merge the original 42 relation types into 17 types)
  • Convert the QA datasets into .jsonl files (e.g., stored in data/csqa/statement/)
  • Identify all mentioned concepts in the questions and answers
  • Extract subgraphs for each q-a pair

TL;DR. The preprocessing may take long; for your convenience, you can download all the processed data here into the top-level directory of this repo and run

unzip data_preprocessed.zip

Add MedQA-USMLE. Besides the commonsense QA datasets (CommonsenseQA, OpenBookQA) with the ConceptNet knowledge graph, we added a biomedical QA dataset (MedQA-USMLE) with a biomedical knowledge graph based on Disease Database and DrugBank. You can download all the data for this from [here]. Unzip it and put the medqa_usmle and ddb folders inside the data/ directory.

The resulting file structure should look like this:

.
├── README.md
└── data/
    ├── cpnet/                 (preprocessed ConceptNet)
    └── csqa/
        ├── train_rand_split.jsonl
        ├── dev_rand_split.jsonl
        ├── test_rand_split_no_answers.jsonl
        ├── statement/             (converted statements)
        ├── grounded/              (grounded entities)
        ├── graphs/                (extracted subgraphs)
        ├── ...

3. Training GreaseLM

To train GreaseLM on CommonsenseQA, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh csqa --data_dir data/

You can specify up to 2 GPUs you want to use in the beginning of the command CUDA_VISIBLE_DEVICES=....

Similarly, to train GreaseLM on OpenbookQA, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm.sh obqa --data_dir data/

To train GreaseLM on MedQA-USMLE, run

CUDA_VISIBLE_DEVICES=0 ./run_greaselm__medqa_usmle.sh

4. Pretrained model checkpoints

You can download a pretrained GreaseLM model on CommonsenseQA here, which achieves an IH-dev acc. of 79.0 and an IH-test acc. of 74.0.

You can also download a pretrained GreaseLM model on OpenbookQA here, which achieves an test acc. of 84.8.

You can also download a pretrained GreaseLM model on MedQA-USMLE here, which achieves an test acc. of 38.5.

5. Evaluating a pretrained model checkpoint

To evaluate a pretrained GreaseLM model checkpoint on CommonsenseQA, run

CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh csqa --data_dir data/ --load_model_path /path/to/checkpoint

Again you can specify up to 2 GPUs you want to use in the beginning of the command CUDA_VISIBLE_DEVICES=....

SimilarlyTo evaluate a pretrained GreaseLM model checkpoint on OpenbookQA, run

CUDA_VISIBLE_DEVICES=0 ./eval_greaselm.sh obqa --data_dir data/ --load_model_path /path/to/checkpoint

6. Use your own dataset

  • Convert your dataset to {train,dev,test}.statement.jsonl in .jsonl format (see data/csqa/statement/train.statement.jsonl)
  • Create a directory in data/{yourdataset}/ to store the .jsonl files
  • Modify preprocess.py and perform subgraph extraction for your data
  • Modify utils/parser_utils.py to support your own dataset

Acknowledgment

This repo is built upon the following work:

QA-GNN: Question Answering using Language Models and Knowledge Graphs
https://github.com/michiyasunaga/qagnn

Many thanks to the authors and developers!

Comments
  • [Help] About the hyper-parameters to reproduce the result

    [Help] About the hyper-parameters to reproduce the result

    @XikunZhang @michiyasunaga @roks

    Hi,

    Thanks for your great effort!

    I've run the code in this repo with the same hyper-parameters provided in the script run_greaselm.sh, which are also the same as reported in the paper. But the results aren't as good as reported in the paper. For example, in csqa, the reported dev_acc and test_acc are 78.5(+-0.5) and 74.2(+-0.4) respectively, but the model I trained only performs 77.48 and 73.01 respectively.

    I've tried several random seeds, but the problem still exists. So could you please release the hyper-parameters(i.e. random seed) that you used when you train the model?

    Look forward to your response!

    opened by yeeeqichen 2
  • Cannot reshape array of size 0 into shape (0)

    Cannot reshape array of size 0 into shape (0)

    Hi Xikun @XikunZhang ,

    Thanks for your great work. When I preprocessed csqa, I have met this error:

    multiprocessing.pool.RemoteTraceback: 
    """
    Traceback (most recent call last):
      File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 125, in worker
        result = (True, func(*args, **kwds))
      File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 337, in concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3
        adj, concepts = concepts2adj(schema_graph)
      File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 128, in concepts2adj
        adj = coo_matrix(adj.reshape(-1, n_node))
    ValueError: cannot reshape array of size 0 into shape (0)
    """
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "preprocess.py", line 131, in <module>
        main()
      File "preprocess.py", line 125, in main
        rt_dic['func'](*rt_dic['args'])
      File "/data/xuanlong/Graph2Text/GreaseLM/preprocess_utils/graph.py", line 512, in generate_adj_data_from_grounded_concepts__use_LM
        res3 = list(tqdm(p.imap(concepts_to_adj_matrices_2hop_all_pair__use_LM__Part3, res2), total=len(res2)))
      File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
        for obj in iterable:
      File "/data/xuanlong/anaconda3/envs/greaselm/lib/python3.8/multiprocessing/pool.py", line 868, in next
        raise value
    ValueError: cannot reshape array of size 0 into shape (0)
    

    I have tried to fix it by editing the line https://github.com/snap-stanford/GreaseLM/blob/803946bba3273556c1ff2be6ad8b02850fe5972d/preprocess_utils/graph.py#L128 to just ignore the reshape method if the array has size 0:

    try:
            adj = coo_matrix(adj.reshape(-1, n_node))
    except:
            print("FAIL concepts2adj")
    

    I think that I edited in an incorrect way because when running evaluation, I got this error:

    points/csqa/csqa_model.pt
    ***** hyperparameters *****
    dataset: csqa
    ******************************
    wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
    ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
    NLP
    pid: 74920
    screen: 
    
    gpu: 1
    
    torch version: 1.8.0+cu101
    torch cuda version: 10.1
    cuda is available: True
    cuda device count: 1
    cudnn version: 7603
    wandb id:  1ziiml5l
    loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
    train_statement_path ./data//csqa/statement/train.statement.jsonl
    num_choice 5
    Loading sparse adj data...
    loading adj matrices: 100%|███████████████████████████████████████████████████████████████████████| 48705/48705 [00:22<00:00, 2158.86it/s]
    | ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate: 0.00 | qc_num: 5.46 | ac_num: 1.54 |
    Traceback (most recent call last):
      File "greaselm.py", line 606, in <module>
        main(args)
      File "greaselm.py", line 546, in main
        evaluate(args, has_test_split, devices, kg)
      File "greaselm.py", line 449, in evaluate
        dataset = load_data(args, devices, kg)
      File "greaselm.py", line 50, in load_data
        dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
      File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 121, in __init__
        assert all(len(self.train_qids) == len(self.train_adj_data[0]) == x.size(0) for x in [self.train_labels] + self.train_encoder_data + self.train_decoder_data)
    AssertionError
    

    Is it possible that you could give me some advices on how I can fix it (the first error).

    Thank you & BR,

    opened by dxlong2000 2
  • retrieve graph for single sentence input

    retrieve graph for single sentence input

    Hi, thank you for introducing such an intriguing work.

    your proposed sub graph retrieval process is customized to Q-A pair input.

    What kinds of code snippets should I modify to retrieve a graph for just a single sentence input?

    The reason why I ask is that I'd like to make a custom pipeline that samples ConceptNet's subgraph for given custom string input.

    opened by kaiiwoo 1
  • The experience on complex questions with semantic nuance

    The experience on complex questions with semantic nuance

    I also want to try similar experiments over different complex questions with semantic nuance. But I don't find your specific classification method (Prepositional Phrases, negation terms, hedging terms) for the complex problem. If you still save the code of dealing with the questions, could you share it with me? thanks!

    opened by HAOChuzhan 0
  • Question about freezing LM parameter problem

    Question about freezing LM parameter problem

    Hi, I'am very interesting in this model. I want to know why freeze LM parameters previous epochs. In my knowledge, LM parameters are fun-tuned in previous epochs(1~3) and then freeze it. I would really appreciate for your help.

    opened by lokking 0
  • FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

    FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'

    Hi Xikun,

    Thanks for your great work. May I ask where could I take this inhouse_split_qids.txt file?

    ***** hyperparameters *****
    dataset: csqa
    ******************************
    wandb: WARNING W&B installed but not logged in.  Run `wandb login` or set the WANDB_API_KEY env variable.
    ModelClass <class 'transformers.modeling_roberta.RobertaModel'>
    NLP
    pid: 493
    screen: 
    
    gpu: 1
    
    torch version: 1.8.0+cu101
    torch cuda version: 10.1
    cuda is available: True
    cuda device count: 1
    cudnn version: 7603
    wandb id:  1iygcifx
    loading from checkpoint: ./checkpoints/csqa/csqa_model.pt
    train_statement_path ./data//csqa/statement/train.statement.jsonl
    num_choice 5
    Loading sparse adj data...
    | ori_adj_len: mu 12.13 sigma 9.67 | adj_len: 13.13 | prune_rate: 0.00 | qc_num: 5.46 | ac_num: 1.54 |
    Finish loading training data.
    Loading sparse adj data...
    | ori_adj_len: mu 12.16 sigma 10.18 | adj_len: 13.16 | prune_rate: 0.00 | qc_num: 5.34 | ac_num: 1.54 |
    Finish loading dev data.
    Loading sparse adj data...
    | ori_adj_len: mu 12.02 sigma 9.17 | adj_len: 13.02 | prune_rate: 0.00 | qc_num: 5.48 | ac_num: 1.53 |
    Finish loading test data.
    Traceback (most recent call last):
      File "greaselm.py", line 606, in <module>
        main(args)
      File "greaselm.py", line 546, in main
        evaluate(args, has_test_split, devices, kg)
      File "greaselm.py", line 449, in evaluate
        dataset = load_data(args, devices, kg)
      File "greaselm.py", line 50, in load_data
        dataset = data_utils.GreaseLM_DataLoader(args.train_statements, args.train_adj,
      File "/data/xuanlong/Graph2Text/GreaseLM/utils/data_utils.py", line 144, in __init__
        with open(inhouse_train_qids_path, 'r') as fin:
    FileNotFoundError: [Errno 2] No such file or directory: 'data/csqa/inhouse_split_qids.txt'
    

    BR,

    opened by dxlong2000 2
  • [Discussion] Relevance of GreaseLM results in light of 'GNN is a Counter?..' paper + dataset discussion

    [Discussion] Relevance of GreaseLM results in light of 'GNN is a Counter?..' paper + dataset discussion

    Hi

    Very interesting work on the combination of LM + KG! This is something I am looking into myself as a research project (https://github.com/apoorvumang/transformer-kgc), and I thought this would be a good place to discuss what datasets such models should be used on.

    In the very recently released paper GNN is a Counter? Revisiting GNN for Question Answering, (code at https://github.com/anonymousGSC/graph-soft-counter), they show that a 1-dim GNN + LM is able to achieve almost SOTA results on both OpenBookQA and CommonsenseQA. In fact according to their numbers it even outperforms GreaseLM on both these datasets.

    I would like to discuss a few things regarding the dataset situation:

    1. CommonSenseQA leaderboard no longer accepts ConceptNet based submissions, which is quite a bummer, and OpenBookQA is extremely small (500 test and 500 dev questions only, around 5k train). Is it worth it (for me and others) to work with these datasets, given the findings of 'GNN...' paper?
    2. If not, could GreaseLM (and similar methods) be applied to regular KGQA datasets such WebQuestionsSP, ComplexWebQuestions or GrailQA? This of course would be harder since its no longer MCQ reasoning, but it might be more interesting and can give real evidence of LM + KG based reasoning.
    3. Is there any other datasets apart from the ones I mentioned that could be relevant in this area? (MedQA-USMLE is ofc one, but I feel it is quite new, and having another older/more established dataset would be an advantage)

    Looking forward to a healthy discussion! 😊

    opened by apoorvumang 0
  • RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB

    RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB

    When we are trying to run the greaselm.py we are getting this issue even if we run the batch size minimum of 8

    we tried from 128-8 every time, It throws the error with different memory size as free , after some epochs. can you help us here in solving this issue and run the code

    logits, _ = model(*[x[a:b] for x in input_data])
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
      result = self.forward(*input, **kwargs)
    File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 85, in forward
      logits, attn = self.lmgnn(lm_inputs, concept_ids,
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
      result = self.forward(*input, **kwargs)
    File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 217, in forward
      outputs, gnn_output = self.mp(input_ids, token_type_ids, attention_mask, output_mask, gnn_input, adj, node_type_ids, node_scores, special_nodes_mask, output_hidden_$
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
      result = self.forward(*input, **kwargs)
    File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 411, in forward
      encoder_outputs, _X = self.encoder(embedding_output,
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
      result = self.forward(*input, **kwargs)
    File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 815, in forward
      _X = self.gnn_layers[gnn_layer_index](_X, edge_index, edge_type, _node_type, _node_feature_extra)
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
      result = self.forward(*input, **kwargs)
    File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_gnn.py", line 91, in forward
      aggr_out = self.propagate(edge_index, x=x, edge_attr=edge_embeddings) #[N, emb_dim]
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 261, in propagate
      coll_dict = self.__collect__(self.__user_args__, edge_index, size,
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 171, in _collect_
      data = self.__lift__(data, edge_index,
    File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 141, in _lift_
      return src.index_select(self.node_dim, index)
    RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 15.78 GiB total capacity; 14.28 GiB already allocated; 133.50 MiB free; 14.39 GiB reserved in tot
    
    opened by tarun3300 3
Owner
null
Source code of our BMVC 2021 paper: AniFormer: Data-driven 3D Animation with Transformer

AniFormer This is the PyTorch implementation of our BMVC 2021 paper AniFormer: Data-driven 3D Animation with Transformer. Haoyu Chen, Hao Tang, Nicu S

null 7 Oct 22, 2021
This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).

TransBTS: Multimodal Brain Tumor Segmentation Using Transformer This repo is the official implementation for TransBTS: Multimodal Brain Tumor Segmenta

Raymond 247 Dec 28, 2022
PyTorch implementation of our Adam-NSCL algorithm from our CVPR2021 (oral) paper "Training Networks in Null Space for Continual Learning"

Adam-NSCL This is a PyTorch implementation of Adam-NSCL algorithm for continual learning from our CVPR2021 (oral) paper: Title: Training Networks in N

Shipeng Wang 34 Dec 21, 2022
This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.

Skeleton Aware Multi-modal Sign Language Recognition By Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li and Yun Fu. Smile Lab @ Northeastern

Isen (Songyao Jiang) 128 Dec 8, 2022
This repo includes our code for evaluating and improving transferability in domain generalization (NeurIPS 2021)

Transferability for domain generalization This repo is for evaluating and improving transferability in domain generalization (NeurIPS 2021), based on

gordon 9 Nov 29, 2022
[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.

TBE The source code for our paper "Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Le

Jinpeng Wang 150 Dec 28, 2022
This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network.

GPRGNN This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network. Hidden state feature extraction i

Jianhao 92 Jan 3, 2023
The dataset and source code for our paper: "Did You Ask a Good Question? A Cross-Domain Question IntentionClassification Benchmark for Text-to-SQL"

TriageSQL The dataset and source code for our paper: "Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text

Yusen Zhang 22 Nov 9, 2022
Source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated Recurrent Memory Network

KaGRMN-DSG_ABSA This repository contains the PyTorch source Code for our paper: Understand me, if you refer to Aspect Knowledge: Knowledge-aware Gated

XingBowen 4 May 20, 2022
Code for our NeurIPS 2021 paper 'Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation'

Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation (NeurIPS 2021) Code for our NeurIPS 2021 paper 'Exploiting the Intri

Shiqi Yang 53 Dec 25, 2022
Source code for our paper "Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations"

Source code for our paper "Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations" this repository is maintained by bo

Yuhan Liu 24 Nov 29, 2022
Source code for our paper "Empathetic Response Generation with State Management"

Source code for our paper "Empathetic Response Generation with State Management" this repository is maintained by both Jun Gao and Yuhan Liu Model Ove

Yuhan Liu 3 Oct 8, 2022
Source code for our paper "Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash"

Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash Abstract: Apple recently revealed its deep perceptual hashing system NeuralHash to

ml-research@TUDarmstadt 11 Dec 3, 2022
[AAAI2021] The source code for our paper 《Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion》.

DSM The source code for paper Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion Project Website; Datasets li

Jinpeng Wang 114 Oct 16, 2022
Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Retrieval.

Targeted Trojan-Horse Attacks on Language-based Image Retrieval Source code of our TTH paper: Targeted Trojan-Horse Attacks on Language-based Image Re

fine 7 Aug 23, 2022
A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently develop and compare their own methods.

Knodle (Knowledge-supervised Deep Learning Framework) - a new framework for weak supervision with neural networks. It provides a modularization for se

null 93 Nov 6, 2022
The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`

Dice Loss for NLP Tasks This repository contains code for Dice Loss for Data-imbalanced NLP Tasks at ACL2020. Setup Install Package Dependencies The c

null 223 Dec 17, 2022
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

null 19 Oct 27, 2022
This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (EMNLP 2020)

Towards Persona-Based Empathetic Conversational Models (PEC) This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (E

Zhong Peixiang 35 Nov 17, 2022