This repo presents various code demos showing how to use our Graph4NLP library.

Graph4AI

Last update: Dec 23, 2022


Overview

Deep Learning on Graphs for Natural Language Processing Demo

The repository contains code examples for DLG4NLP tutorials at NAACL 2021, SIGIR 2021, KDD 2021, IJCAI 2021, AAAI 2022 and TheWebConf 2022.

Slides can be downloaded from here.

Get Started

You will need to install our graph4nlp library in order to run the demo code. Please follow the environment setup instructions below. Please also refer to the graph4nlp repository page for more details on how to use the library.

Environment setup

Create virtual environment

conda create --name graph4nlp python=3.8
conda activate graph4nlp

Install graph4nlp library

Clone the GitHub repo

git clone -b [branch_version] https://github.com/graph4ai/graph4nlp.git
cd graph4nlp

Please choose the branch version corresponding to the demo version as shown in the table below.

demo version	library branch version
DLG4NLP@ICLR 2022	v0.5.5
TheWebConf 2022	v0.5.5
AAAI 2022	v0.5.5
CLIQ-ai 2021	stable_nov2021b
IJCAI 2021	stable_202108
KDD 2021	stable_202108
SIGIR 2021	stable
NAACL 2021	stable
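When scripting the clone step, the table above can be captured as a small lookup. This is only a sketch: the dictionary mirrors the table and the repository URL comes from the clone command above, while the helper name `clone_command` is made up for illustration.

```python
# Demo version -> library branch, copied from the table above.
DEMO_TO_BRANCH = {
    "DLG4NLP@ICLR 2022": "v0.5.5",
    "TheWebConf 2022": "v0.5.5",
    "AAAI 2022": "v0.5.5",
    "CLIQ-ai 2021": "stable_nov2021b",
    "IJCAI 2021": "stable_202108",
    "KDD 2021": "stable_202108",
    "SIGIR 2021": "stable",
    "NAACL 2021": "stable",
}

REPO_URL = "https://github.com/graph4ai/graph4nlp.git"


def clone_command(demo_version: str) -> str:
    """Build the git clone command for a demo version (hypothetical helper)."""
    branch = DEMO_TO_BRANCH[demo_version]  # KeyError for an unknown demo version
    return f"git clone -b {branch} {REPO_URL}"


print(clone_command("KDD 2021"))
```

Raising `KeyError` on an unknown demo name is deliberate here, so that a typo fails loudly instead of silently cloning the wrong branch.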

Then run ./configure (or ./configure.bat if you are using Windows 10) to configure your installation. The configuration program will ask you to specify your CUDA version. If you do not have a GPU, please choose 'cpu'.

./configure

Finally, install the package:

python setup.py install

Install other packages

pip install torchtext
pip install notebook

Set up StanfordCoreNLP (for static graph construction only, unnecessary for this demo because preprocessed data is provided)

Download StanfordCoreNLP
Go to the root folder and start the server

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
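Before running static graph construction, it can help to confirm that the server is actually accepting connections. The sketch below only checks that something is listening on the TCP port; it does not verify that the listener is StanfordCoreNLP. The host `localhost` is an assumption, and port 9000 is taken from the command above.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 9000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if server_is_listening():
    print("A server is listening on port 9000.")
else:
    print("Nothing is listening on port 9000; start the CoreNLP server first.")
```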

Start Jupyter notebook and run the demo

After completing the above steps, you can start the Jupyter notebook server to run the demo:

cd graph4nlp_demo/XYZ
jupyter notebook

Note that you will need to change XYZ to the specific folder name.
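To see which folder names can stand in for XYZ, one option is to list the sub-folders that contain notebooks. This is a minimal sketch, assuming the demo repository was cloned into a local folder named `graph4nlp_demo`; the helper name `demo_folders` is hypothetical.

```python
from pathlib import Path


def demo_folders(repo_root: str = "graph4nlp_demo") -> list:
    """List sub-folders of repo_root that contain at least one .ipynb file."""
    root = Path(repo_root)
    if not root.is_dir():
        return []  # the repository has not been cloned to this location
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and any(child.glob("*.ipynb"))
    )


for name in demo_folders():
    print(name)
```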

Additional Resources:

Comments

CUDA error: no kernel image
I followed all the instructions in the repo to install it from source. Versions: torch 1.10.2, torchtext 0.11.2, CUDA 11.1.

But when I run https://github.com/graph4ai/graph4nlp_demo/blob/main/AAAI2022_demo/semantic_parsing.ipynb

I get the error:

RuntimeError: CUDA error: no kernel image is available for execution on the device CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

See the full error here:

[ Using CUDA ]
/home/synthesisproject/anaconda3/envs/g4nlp/lib/python3.8/site-packages/graph4nlp_cu111-0.5.5-py3.8.egg/graph4nlp/pytorch/modules/graph_embedding_learning/gat.py:259: UserWarning: The residual option must be False when num_heads > 1
  warnings.warn("The residual option must be False when num_heads > 1")
/home/synthesisproject/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/cuda/__init__.py:143: UserWarning: NVIDIA RTX A5000 with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70. If you want to use the NVIDIA RTX A5000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

RuntimeError                              Traceback (most recent call last)
in
      1 # run the model
----> 2 runner = Jobs(opt)
      3 max_score = runner.train()
      4 print("Train finish, best val score: {:.3f}".format(max_score))
      5 test_score = runner.translate()

in __init__(self, opt)
      7         self._build_device(self.opt)
      8         self._build_dataloader()
----> 9         self._build_model()
     10         self._build_optimizer()
     11         self._build_evaluation()

in _build_model(self)
     75 
     76     def _build_model(self):
---> 77         self.model = Graph2Seq.from_args(self.opt, self.vocab).to(self.device)
     78 
     79     def _build_optimizer(self):

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in to(self, *args, **kwargs)
    897             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    898 
--> 899         return self._apply(convert)
    900 
    901     def register_backward_hook(

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/module.py in _apply(self, fn)
    568     def _apply(self, fn):
    569         for module in self.children():
--> 570             module._apply(fn)
    571 
    572         def compute_should_use_set_data(tensor, tensor_applied):

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/rnn.py in _apply(self, fn)
    187         self._flat_weights = [(lambda wn: getattr(self, wn) if hasattr(self, wn) else None)(wn) for wn in self._flat_weights_names]
    188         # Flattens params (on CUDA)
--> 189         self.flatten_parameters()
    190 
    191         return ret

~/anaconda3/envs/g4nlp/lib/python3.8/site-packages/torch-1.10.2-py3.8-linux-x86_64.egg/torch/nn/modules/rnn.py in flatten_parameters(self)
    173                 if self.proj_size > 0:
    174                     num_weights += 1
--> 175                 torch._cudnn_rnn_flatten_weight(
    176                     self._flat_weights, num_weights,
    177                     self.input_size, rnn.get_cudnn_mode(self.mode),

RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
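The UserWarning in the log above already points at the likely cause: the GPU reports sm_86, while the installed PyTorch build lists only sm_37, sm_50, sm_60 and sm_70. The sketch below re-creates that comparison under a simplifying assumption, namely that a build covers a device if it ships binaries for the same major compute capability. It ignores PTX (`compute_XX`) entries and finer minor-version rules, so treat it as a rough diagnostic rather than a statement of how PyTorch decides. The helper name `capability_is_covered` is hypothetical; `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are existing PyTorch calls and are only used if torch is importable.

```python
import re
from typing import Iterable, Tuple


def capability_is_covered(capability: Tuple[int, int], arch_list: Iterable[str]) -> bool:
    """Simplified check: does any sm_XY entry share the device's major version?"""
    major, _minor = capability
    supported_sm = []
    for arch in arch_list:
        match = re.match(r"sm_(\d+)", arch)  # skips compute_XX (PTX) entries
        if match:
            supported_sm.append(int(match.group(1)))
    return any(sm // 10 == major for sm in supported_sm)


# Values from the warning above: RTX A5000 (sm_86) against the listed architectures.
print(capability_is_covered((8, 6), ["sm_37", "sm_50", "sm_60", "sm_70"]))

try:
    import torch  # optional: only needed to inspect the local installation
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    cap = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    print(cap, archs, capability_is_covered(cap, archs))
```

If the check comes back false for the local device, installing a PyTorch build compiled for a CUDA version that supports the GPU (see the link in the warning) is the usual direction to investigate.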
opened by vinven7 2

Cannot execute KGC and math word problem ipynb

Hello, please let me know where I am going wrong. I have not made any changes and have followed the given setup instructions. The errors occur in both the math word problem and the KG completion notebooks.

opened by Sanchita333 2

Cannot construct the IE Graph correctly in the CLIQ-ai2021_demo

Hi, I ran into a problem when I tried to build the IE graph using the demo code for the classification problem in CLIQ-ai2021_demo. The only change I made was in the config file, where I set graph_name: 'ie'.

The error says

"ValueError: num_samples should be a positive integer value, but got num_samples=0"

I checked the built graph in data.pt and label.py, only to find they are null.

Could someone explain why I encountered this error? Thank you!

opened by liuchenzhengyi 2

UnpicklingError when trying KDD2021_demo's semantic_parsing in jupyter notebook

🐛 Bug

When trying KDD2021_demo semantic_parsing in jupyter notebook, I got the following error upon "Run the model"

[ Using CUDA ]
Loading pre-built vocab model stored in ../data/jobs\processed\node_emb_graph\vocab.pt
---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-7-8ce32756f002> in <module>
      1 # run the model
----> 2 runner = Jobs(opt)
      3 max_score = runner.train()
      4 print("Train finish, best val score: {:.3f}".format(max_score))
      5 runner.load_checkpoint("best.pth")

<ipython-input-2-a40dcc751a5f> in __init__(self, opt)
      6         self.use_coverage = self.opt["decoder_args"]["rnn_decoder_share"]["use_coverage"]
      7         self._build_device(self.opt)
----> 8         self._build_dataloader()
      9         self._build_model()
     10         self._build_loss_function()

<ipython-input-2-a40dcc751a5f> in _build_dataloader(self)
     58 
     59         # Call the TREC dataset API
---> 60         dataset = JobsDataset(root_dir=self.opt["graph_construction_args"]["graph_construction_share"]["root_dir"],
     61                               pretrained_word_emb_name=self.opt["pretrained_word_emb_name"],
     62                               pretrained_word_emb_cache_dir=self.opt["pretrained_word_emb_cache_dir"],

c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\datasets\jobs.py in __init__(self, root_dir, topology_builder, topology_subdir, pretrained_word_emb_name, pretrained_word_emb_url, pretrained_word_emb_cache_dir, graph_type, merge_strategy, edge_strategy, seed, word_emb_size, share_vocab, lower_case, thread_number, port, dynamic_graph_type, dynamic_init_topology_builder, dynamic_init_topology_aux_args)
     68         """
     69         # Initialize the dataset. If the preprocessed files are not found, then do the preprocessing and save them.
---> 70         super(JobsDataset, self).__init__(root_dir=root_dir, topology_builder=topology_builder,
     71                                           topology_subdir=topology_subdir, graph_type=graph_type,
     72                                           edge_strategy=edge_strategy, merge_strategy=merge_strategy,

c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\data\dataset.py in __init__(self, root_dir, topology_builder, topology_subdir, share_vocab, **kwargs)
    692         self.data_item_type = Text2TextDataItem
    693         self.share_vocab = share_vocab
--> 694         super(Text2TextDataset, self).__init__(root_dir, topology_builder, topology_subdir, **kwargs)
    695 
    696     def parse_file(self, file_path) -> list:

c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\data\dataset.py in __init__(self, root, topology_builder, topology_subdir, tokenizer, lower_case, pretrained_word_emb_name, pretrained_word_emb_url, target_pretrained_word_emb_name, target_pretrained_word_emb_url, pretrained_word_emb_cache_dir, max_word_vocab_size, min_word_vocab_freq, use_val_for_vocab, seed, thread_number, port, timeout, **kwargs)
    370             self.val = data['val']
    371 
--> 372         self.build_vocab()
    373 
    374     @property

c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\data\dataset.py in build_vocab(self)
    640             data_for_vocab = self.val + data_for_vocab
    641 
--> 642         vocab_model = VocabModel.build(saved_vocab_file=self.processed_file_paths['vocab'],
    643                                        data_set=data_for_vocab,
    644                                        tokenizer=self.tokenizer,

c:\users\chris\dev\pyvenv_gnn\lib\site-packages\graph4nlp_cu111-0.4.0-py3.8.egg\graph4nlp\pytorch\modules\utils\vocab_utils.py in build(cls, saved_vocab_file, data_set, tokenizer, lower_case, max_word_vocab_size, min_word_vocab_freq, pretrained_word_emb_name, pretrained_word_emb_url, target_pretrained_word_emb_name, target_pretrained_word_emb_url, pretrained_word_emb_cache_dir, word_emb_size, share_vocab)
    184             print('Loading pre-built vocab model stored in {}'.format(saved_vocab_file))
    185             with open(saved_vocab_file, 'rb') as f:
--> 186                 vocab_model = pickle.load(f)
    187 
    188         else:

UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

To Reproduce

Steps to reproduce the behavior:

Follow the instructions on the site to open semantic_parsing.
Run through the code segments.
The error occurs when running the "Run the model" code segment.

Expected behavior

Don't expect the error.

Environment

Graph4NLP Version (e.g., 0.4.1): v0.4.1
Backend Library & Version (e.g., PyTorch 1.6.0): 1.8.0
OS (e.g., Linux): Windows 10
How you installed Graph4NLP (pip, source): followed "Installation via source code"
Build command you used (if compiling from source): As listed instructions
Python version: 3.8.10
CUDA/cuDNN version (if applicable): 11.1
GPU models and configuration (e.g. 2080Ti): RTX2060
Any other relevant information:
- torchtext==0.9.0
- pickle.compatible_formats ['1.0', '1.1', '1.2', '1.3', '2.0', '3.0', '4.0', '5.0']

Additional context

Started jupyter notebook in python virtual environment.
Also tried setting config_file (in Set up the config segment) to the absolute path.
text_classification example faced the same issue.
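The UnpicklingError in the traceback above is what plain `pickle.load` raises when the byte stream contains persistent IDs but the reader defines no `persistent_load`. The sketch below reproduces that mechanism with the standard library only. It does not confirm what wrote `vocab.pt` in this report; one possibility, stated here as an assumption, is that the file was produced by a serializer that relies on persistent IDs (`torch.save` is one such serializer). The classes `ExternalBlob`, `BlobPickler` and `BlobUnpickler` are hypothetical stand-ins.

```python
import io
import pickle


class ExternalBlob:
    """Hypothetical stand-in for data kept outside the pickle stream."""

    def __init__(self, key):
        self.key = key


class BlobPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a persistent ID for ExternalBlob objects instead of pickling them.
        if isinstance(obj, ExternalBlob):
            return ("blob", obj.key)
        return None  # everything else is pickled normally


class BlobUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, key = pid
        if tag != "blob":
            raise pickle.UnpicklingError(f"unsupported persistent id: {pid!r}")
        return ExternalBlob(key)


buffer = io.BytesIO()
BlobPickler(buffer).dump({"vocab": ExternalBlob("weights")})
payload = buffer.getvalue()

# Plain pickle has no persistent_load, so reading the payload fails.
try:
    pickle.loads(payload)
except pickle.UnpicklingError as exc:
    print("plain pickle failed:", exc)

# An unpickler that defines persistent_load reads the same bytes.
restored = BlobUnpickler(io.BytesIO(payload)).load()
print(restored["vocab"].key)
```

The practical takeaway is that a file has to be read with the counterpart of whatever wrote it, so a mismatch between the library version that produced the preprocessed data and the installed one is worth ruling out.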

opened by chris-opendata 2

A typo under "Get Started" and missing link for "graph4nlp repository" under the same.

Thanks for this wonderful repo!!

There is a small typo in the last word of the passage quoted below, and the link for the graph4nlp repository page is missing: "You will need to install our graph4nlp library in order to run the demo code. Please follow the following environment setup instructinos."

opened by karthikkaiplody 2

No graph4nlp_demo folder created when git clone

I followed the instructions in the repo: I created the environment (Python 3.8) and ran
git clone -b v0.5.5 https://github.com/graph4ai/graph4nlp.git

While this created a graph4nlp folder, it does not contain a graph4nlp_demo folder. I had to clone the graph4nlp_demo repo separately.

opened by vinven7 1

Video Release

Hello! Your NAACL2021 tutorial about gnn4nlp is amazing, and I am eager to see the video of it. Do you plan to release it, and if so, when? Thanks so much

opened by Hannibal046 1

TypeError: __init__() missing 1 required positional argument: 'graph_name'

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_23/1251835385.py in <module>
      7 print('\n' + config['out_dir'])
      8 
----> 9 runner = ModelHandler(config)
     10 t0 = time.time()
     11 

/tmp/ipykernel_23/370220804.py in __init__(self, config)
      6         self.logger.write(self.config['out_dir'])
      7         self._build_device()
----> 8         self._build_dataloader()
      9         self._build_model()
     10         self._build_optimizer()

/tmp/ipykernel_23/370220804.py in _build_dataloader(self)
     77                                   self.config['graph_type'] in ('node_emb', 'node_emb_refined') else None,
     78                               dynamic_init_topology_builder=dynamic_init_topology_builder,
---> 79                               dynamic_init_topology_aux_args={'dummy_param': 0})
     80 
     81         self.train_dataloader = DataLoader(dataset.train, batch_size=self.config['batch_size'], shuffle=True,

TypeError: __init__() missing 1 required positional argument: 'graph_name'
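A "missing required positional argument" error like the one above often means the notebook was written against a different library version than the one installed, which is why the setup instructions ask for the branch matching the demo version. One way to see what the installed constructor actually expects is to inspect its signature. The sketch below uses only the standard library; `ExampleDataset` is a hypothetical stand-in rather than the real graph4nlp dataset class, and `required_init_args` is a made-up helper name.

```python
import inspect


def required_init_args(cls) -> list:
    """Names of __init__ parameters without a default value (excluding self)."""
    params = inspect.signature(cls.__init__).parameters
    named_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return [
        name
        for name, param in params.items()
        if name != "self"
        and param.default is inspect.Parameter.empty
        and param.kind in named_kinds  # skip *args and **kwargs
    ]


class ExampleDataset:
    """Hypothetical class used only to demonstrate the helper."""

    def __init__(self, root_dir, graph_name, topology_subdir="graph"):
        self.root_dir = root_dir
        self.graph_name = graph_name
        self.topology_subdir = topology_subdir


print(required_init_args(ExampleDataset))
```

Running the helper on the dataset class imported in the notebook shows whether `graph_name` is required by the installed version.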

opened by kiloGrand 0

DependencyBasedGraphConstruction initialization error

In this notebook, https://github.com/graph4ai/graph4nlp_demo/blob/main/SIGIR2021_demo/text_classification.ipynb, at line 16 of class TextClassifier, initializing DependencyBasedGraphConstruction raises this error:

self.graph_topology = DependencyBasedGraphConstruction(
TypeError: __init__() got an unexpected keyword argument 'embedding_style'

opened by mahmodDAHOL 0

