This repo presents code demos showing how to use our Graph4NLP library.

Overview

Deep Learning on Graphs for Natural Language Processing Demo

The repository contains code examples for DLG4NLP tutorials at NAACL 2021, SIGIR 2021, KDD 2021, IJCAI 2021, AAAI 2022 and TheWebConf 2022.

Slides can be downloaded from here.

Get Started

You will need to install our graph4nlp library in order to run the demo code. Please follow the environment setup instructions below, and refer to the graph4nlp repository page for more details on how to use the library.

Environment setup

  1. Create virtual environment
conda create --name graph4nlp python=3.8
conda activate graph4nlp
  2. Install graph4nlp library
  • Clone the GitHub repo
git clone -b [branch_version] https://github.com/graph4ai/graph4nlp.git
cd graph4nlp

Please choose the branch version corresponding to the demo version as shown in the table below.

demo version         library branch version
DLG4NLP@ICLR 2022    v0.5.5
TheWebConf 2022      v0.5.5
AAAI 2022            v0.5.5
CLIQ-ai 2021         stable_nov2021b
IJCAI 2021           stable_202108
KDD 2021             stable_202108
SIGIR 2021           stable
NAACL 2021           stable
  • Then run ./configure (or ./configure.bat if you are using Windows 10) to configure your installation. The configuration program will ask you to specify your CUDA version. If you do not have a GPU, please choose 'cpu'.
./configure
  • Finally, install the package
python setup.py install
  3. Install other packages
pip install torchtext
pip install notebook
  4. Set up StanfordCoreNLP (for static graph construction only; unnecessary for this demo because preprocessed data is provided)
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
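
The demo-to-branch table in the library installation step can also be written as a small lookup, which helps avoid cloning a branch that does not match the demo you want to run. The sketch below is only an illustration: the names DEMO_TO_BRANCH and clone_command are hypothetical helpers, not part of graph4nlp, and the mapping is copied from the table in this README.

```python
from typing import List

# Demo version -> library branch, copied from the table in this README.
DEMO_TO_BRANCH = {
    "DLG4NLP@ICLR 2022": "v0.5.5",
    "TheWebConf 2022": "v0.5.5",
    "AAAI 2022": "v0.5.5",
    "CLIQ-ai 2021": "stable_nov2021b",
    "IJCAI 2021": "stable_202108",
    "KDD 2021": "stable_202108",
    "SIGIR 2021": "stable",
    "NAACL 2021": "stable",
}

REPO_URL = "https://github.com/graph4ai/graph4nlp.git"


def clone_command(demo_version: str) -> List[str]:
    """Return the git clone command (as an argument list) for a demo version."""
    try:
        branch = DEMO_TO_BRANCH[demo_version]
    except KeyError:
        known = ", ".join(DEMO_TO_BRANCH)
        raise ValueError(
            f"Unknown demo version {demo_version!r}; expected one of: {known}"
        )
    return ["git", "clone", "-b", branch, REPO_URL]


# Prints: git clone -b v0.5.5 https://github.com/graph4ai/graph4nlp.git
print(" ".join(clone_command("AAAI 2022")))
```

The argument list can be passed to subprocess.run if you prefer to script the setup rather than type the command by hand.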

Start Jupyter notebook and run the demo

After completing the above steps, you can start the Jupyter notebook server to run the demo:

cd graph4nlp_demo/XYZ
jupyter notebook

Note that you will need to change XYZ to the specific folder name.

Additional Resources:

Comments
  • CUDA error: no kernel image

    I followed all the instructions in the repo to install it from source. Torch version: 1.10.2, torchtext: 0.11.2, CUDA: 11.1.

    But when I run https://github.com/graph4ai/graph4nlp_demo/blob/main/AAAI2022_demo/semantic_parsing.ipynb

    I get the error:

    RuntimeError: CUDA error: no kernel image is available for execution on the device
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    

    See the full error here:

    [ Using CUDA ]
    /home/synthesisproject/anaconda3/envs/g4nlp/lib/python3.8/site-packages/graph4nlp_cu111-0.5.5-py3.8.egg/graph4nlp/pytorch/modules/graph_embedding_learning/gat.py:259: UserWarning: The residual option must be False when num_heads > 1
      warnings.warn("The residual option must be False when num_heads > 1")
    /home/synthesisproject/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/cuda/__init__.py:143: UserWarning:
    NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
    The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
    If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

      warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

    RuntimeError                              Traceback (most recent call last)
    in
          1 # run the model
    ----> 2 runner = Jobs(opt)
          3 max_score = runner.train()
          4 print("Train finish, best val score: {:.3f}".format(max_score))
          5 test_score = runner.translate()

    in __init__(self, opt)
          7         self._build_device(self.opt)
          8         self._build_dataloader()
    ----> 9         self._build_model()
         10         self._build_optimizer()
         11         self._build_evaluation()

    in _build_model(self)
         75
         76     def _build_model(self):
    ---> 77         self.model = Graph2Seq.from_args(self.opt, self.vocab).to(self.device)
         78
         79     def _build_optimizer(self):

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in to(self, *args, **kwargs)
        897             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
        898
    --> 899         return self._apply(convert)
        900
        901     def register_backward_hook(

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
        568     def _apply(self, fn):
        569         for module in self.children():
    --> 570             module._apply(fn)
        571
        572         def compute_should_use_set_data(tensor, tensor_applied):

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
        568     def _apply(self, fn):
        569         for module in self.children():
    --> 570             module._apply(fn)
        571
        572         def compute_should_use_set_data(tensor, tensor_applied):

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
        568     def _apply(self, fn):
        569         for module in self.children():
    --> 570             module._apply(fn)
        571
        572         def compute_should_use_set_data(tensor, tensor_applied):

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
        568     def _apply(self, fn):
        569         for module in self.children():
    --> 570             module._apply(fn)
        571
        572         def compute_should_use_set_data(tensor, tensor_applied):

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/rnn.py in _apply(self, fn)
        187         self._flat_weights = [(lambda wn: getattr(self, wn) if hasattr(self, wn) else None)(wn) for wn in self._flat_weights_names]
        188         # Flattens params (on CUDA)
    --> 189         self.flatten_parameters()
        190
        191         return ret

    ~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/rnn.py in flatten_parameters(self)
        173                     if self.proj_size > 0:
        174                         num_weights += 1
    --> 175                     torch._cudnn_rnn_flatten_weight(
        176                         self._flat_weights, num_weights,
        177                         self.input_size, rnn.get_cudnn_mode(self.mode),

    RuntimeError: CUDA error: no kernel image is available for execution on the device
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    opened by vinven7 2
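
    The warning in the report above already names the mismatch: the GPU reports capability sm_86, while the installed PyTorch build lists only sm_37, sm_50, sm_60 and sm_70. A minimal sketch of that comparison follows. It assumes a simplified rule (binary code built for sm_XY runs on devices with the same major version and a minor version of at least Y) and ignores PTX entries such as compute_XX, so treat it as a rough diagnostic rather than PyTorch's own check. The function names are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple


def parse_arch(arch: str) -> Optional[Tuple[int, int]]:
    """Turn a plain 'sm_XY' string into (major, minor); return None otherwise."""
    match = re.fullmatch(r"sm_(\d+)(\d)", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_kernel_image(capability: Tuple[int, int], arch_list: Iterable[str]) -> bool:
    """Simplified check: is there a compiled arch usable on this device?"""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue  # skip entries this sketch does not understand
        arch_major, arch_minor = parsed
        if arch_major == major and arch_minor <= minor:
            return True
    return False


# Values taken from the warning in the report above (RTX A5000, sm_86).
reported = ["sm_37", "sm_50", "sm_60", "sm_70"]
print(has_kernel_image((8, 6), reported))  # False
```

    On a machine with PyTorch installed, the two inputs can be read from torch.cuda.get_device_capability() and torch.cuda.get_arch_list(). A False result suggests installing a PyTorch build compiled for the GPU's architecture before building graph4nlp.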
  • Cannot execute KGC and math word problem ipynb

    Hello, please let me know where I am going wrong. I have not made any changes and have followed the given setup instructions.

    For the math word problem: Screenshot (208)

    For KG completion: Screenshot (210)

    opened by Sanchita333 2
  • Cannot construct the IE Graph correctly in the CLIQ-ai2021_demo

    Hi, I encountered a problem when I tried to build the IE graph using the demo code for the classification problem in CLIQ-ai2021_demo. The only change I made was in the config file, setting graph_name: 'ie'.

    The error says

    "ValueError: num_samples should be a positive integer value, but got num_samples=0"

    I checked the built graph in data.pt and label.py, only to find they are null.

    Could someone explain why I encountered this error? Thank you!

    opened by liuchenzhengyi 2
  • UnpicklingError when trying KDD2021_demo's semantic_parsing in jupyter notebook

    🐛 Bug

    When trying KDD2021_demo semantic_parsing in jupyter notebook, I got the following error upon "Run the model"

    [ Using CUDA ]
    Loading pre-built vocab model stored in ../data/jobs\processed\node_emb_graph\vocab.pt
    ---------------------------------------------------------------------------
    UnpicklingError                           Traceback (most recent call last)
    <ipython-input-7-8ce32756f002> in <module>
          1 # run the model
    ----> 2 runner = Jobs(opt)
          3 max_score = runner.train()
          4 print("Train finish, best val score: {:.3f}".format(max_score))
          5 runner.load_checkpoint("best.pth")
    
    <ipython-input-2-a40dcc751a5f> in __init__(self, opt)
          6         self.use_coverage = self.opt["decoder_args"]["rnn_decoder_share"]["use_coverage"]
          7         self._build_device(self.opt)
    ----> 8         self._build_dataloader()
          9         self._build_model()
         10         self._build_loss_function()
    
    <ipython-input-2-a40dcc751a5f> in _build_dataloader(self)
         58 
         59         # Call the TREC dataset API
    ---> 60         dataset = JobsDataset(root_dir=self.opt["graph_construction_args"]["graph_construction_share"]["root_dir"],
         61                               pretrained_word_emb_name=self.opt["pretrained_word_emb_name"],
         62                               pretrained_word_emb_cache_dir=self.opt["pretrained_word_emb_cache_dir"],
    
    c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\datasets\jobs.py in __init__(self, root_dir, topology_builder, topology_subdir, pretrained_word_emb_name, pretrained_word_emb_url, pretrained_word_emb_cache_dir, graph_type, merge_strategy, edge_strategy, seed, word_emb_size, share_vocab, lower_case, thread_number, port, dynamic_graph_type, dynamic_init_topology_builder, dynamic_init_topology_aux_args)
         68         """
         69         # Initialize the dataset. If the preprocessed files are not found, then do the preprocessing and save them.
    ---> 70         super(JobsDataset, self).__init__(root_dir=root_dir, topology_builder=topology_builder,
         71                                           topology_subdir=topology_subdir, graph_type=graph_type,
         72                                           edge_strategy=edge_strategy, merge_strategy=merge_strategy,
    
    c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\data\dataset.py in __init__(self, root_dir, topology_builder, topology_subdir, share_vocab, **kwargs)
        692         self.data_item_type = Text2TextDataItem
        693         self.share_vocab = share_vocab
    --> 694         super(Text2TextDataset, self).__init__(root_dir, topology_builder, topology_subdir, **kwargs)
        695 
        696     def parse_file(self, file_path) -> list:
    
    c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\data\dataset.py in __init__(self, root, topology_builder, topology_subdir, tokenizer, lower_case, pretrained_word_emb_name, pretrained_word_emb_url, target_pretrained_word_emb_name, target_pretrained_word_emb_url, pretrained_word_emb_cache_dir, max_word_vocab_size, min_word_vocab_freq, use_val_for_vocab, seed, thread_number, port, timeout, **kwargs)
        370             self.val = data['val']
        371 
    --> 372         self.build_vocab()
        373 
        374     @property
    
    c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\data\dataset.py in build_vocab(self)
        640             data_for_vocab = self.val + data_for_vocab
        641 
    --> 642         vocab_model = VocabModel.build(saved_vocab_file=self.processed_file_paths['vocab'],
        643                                        data_set=data_for_vocab,
        644                                        tokenizer=self.tokenizer,
    
    c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\modules\utils\vocab_utils.py in build(cls, saved_vocab_file, data_set, tokenizer, lower_case, max_word_vocab_size, min_word_vocab_freq, pretrained_word_emb_name, pretrained_word_emb_url, target_pretrained_word_emb_name, target_pretrained_word_emb_url, pretrained_word_emb_cache_dir, word_emb_size, share_vocab)
        184             print('Loading pre-built vocab model stored in {}'.format(saved_vocab_file))
        185             with open(saved_vocab_file, 'rb') as f:
    --> 186                 vocab_model = pickle.load(f)
        187 
        188         else:
    
    UnpicklingError: A load persistent id instruction was encountered,
    but no persistent_load function was specified.
    

    To Reproduce

    Steps to reproduce the behavior:

    1. Following the instructions by the site to open semantic_parsing.
    2. Running through code segments.
    3. The error occurred when running "Run the model" code segment.

    Expected behavior

    Don't expect the error.

    Environment

    • Graph4NLP Version (e.g., 0.4.1): v0.4.1
    • Backend Library & Version (e.g., PyTorch 1.6.0): 1.8.0
    • OS (e.g., Linux): Windows 10
    • How you installed Graph4NLP (pip, source): followed "Installation via source code"
    • Build command you used (if compiling from source): As listed instructions
    • Python version: 3.8.10
    • CUDA/cuDNN version (if applicable): 11.1
    • GPU models and configuration (e.g. 2080Ti): RTX2060
    • Any other relevant information:
      • torchtext==0.9.0
      • pickle.compatible_formats ['1.0', '1.1', '1.2', '1.3', '2.0', '3.0', '4.0', '5.0']

    Additional context

    1. Started jupyter notebook in python virtual environment.
    2. Also tried setting config_file (in Set up the config segment) to the absolute path.
    3. text_classification example faced the same issue.
    opened by chris-opendata 2
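
    One known way to get this exact message is to call pickle.load on a zip archive: the archive starts with the bytes "PK", and pickle reads the leading "P" as its persistent-id opcode. PyTorch 1.6 and later write zip archives from torch.save by default, so a vocab.pt produced that way cannot be read with plain pickle.load. Whether that is what happened in the report above is an assumption, not something the report confirms. The sketch below uses only the standard library to reproduce the failure on a stand-in archive and to classify a file's contents. The helper names are hypothetical.

```python
import io
import pickle
import zipfile


def make_zip_bytes() -> bytes:
    """Build a tiny in-memory zip archive (a stand-in for a zip-format .pt file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("data.txt", "placeholder")
    return buffer.getvalue()


def diagnose(raw: bytes) -> str:
    """Classify raw file contents as 'zip', 'pickle', or 'unknown'.

    Only call this on files you trust: unpickling can execute arbitrary code.
    """
    if zipfile.is_zipfile(io.BytesIO(raw)):
        return "zip"
    try:
        pickle.loads(raw)
    except Exception:
        return "unknown"
    return "pickle"


zip_bytes = make_zip_bytes()
try:
    pickle.loads(zip_bytes)
except pickle.UnpicklingError as err:
    print("pickle refused the zip archive:", type(err).__name__)

print(diagnose(zip_bytes))                # zip
print(diagnose(pickle.dumps({"a": 1})))   # pickle
```

    If the check reports "zip" for the preprocessed vocab.pt, the data and the installed library branch probably come from different versions, so it is worth re-checking the branch table in the setup instructions.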
  • A typo under "Get Started" and missing link for "graph4nlp repository" under the same.

    Thanks for this wonderful repo!!

    There is a small typo in the below highlighted word and also a link is missing for graph4nlp repository page, You will need to install our graph4nlp library in order to run the demo code. Please follow the following environment setup instructinos.

    opened by karthikkaiplody 2
  • No graph4nlp_demo folder created when git clone

    I followed the instructions in the repo: created environment (python 3.8)
    git clone -b v0.5.5 https://github.com/graph4ai/graph4nlp.git

    While this created a graph4nlp folder, it does not have a graph4nlp_demo folder. I had to separately clone to the graph4nlp_demo repo

    opened by vinven7 1
  • Video Release

    Hello! Your NAACL2021 tutorial about gnn4nlp is so amazing. I am eager to see the video about it. Do you have a plan when to release it? Thanks so much

    opened by Hannibal046 1
  • TypeError: __init__() missing 1 required positional argument: 'graph_name'

    TypeError                                 Traceback (most recent call last)
    /tmp/ipykernel_23/1251835385.py in <module>
          7 print('\n' + config['out_dir'])
          8 
    ----> 9 runner = ModelHandler(config)
         10 t0 = time.time()
         11 
    
    /tmp/ipykernel_23/370220804.py in __init__(self, config)
          6         self.logger.write(self.config['out_dir'])
          7         self._build_device()
    ----> 8         self._build_dataloader()
          9         self._build_model()
         10         self._build_optimizer()
    
    /tmp/ipykernel_23/370220804.py in _build_dataloader(self)
         77                                   self.config['graph_type'] in ('node_emb', 'node_emb_refined') else None,
         78                               dynamic_init_topology_builder=dynamic_init_topology_builder,
    ---> 79                               dynamic_init_topology_aux_args={'dummy_param': 0})
         80 
         81         self.train_dataloader = DataLoader(dataset.train, batch_size=self.config['batch_size'], shuffle=True,
    
    TypeError: __init__() missing 1 required positional argument: 'graph_name'
    
    opened by kiloGrand 0
  • DependencyBasedGraphConstruction initialization error

    In the notebook https://github.com/graph4ai/graph4nlp_demo/blob/main/SIGIR2021_demo/text_classification.ipynb, on line 16 of class TextClassifier, initializing DependencyBasedGraphConstruction raises this error:

    self.graph_topology = DependencyBasedGraphConstruction( TypeError: __init__() got an unexpected keyword argument 'embedding_style'

    opened by mahmodDAHOL 0
Owner
Graph4AI