Deep GNN, Shallow Sampling

Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen

arxiv paper

Overview

We have implemented shaDow-GNN as a general and powerful pipeline for graph representation learning. The training of shaDow-GNN can be abstracted as three major steps:

Preprocessing

Preprocessing may include the following optional steps:

Smooth the node features
Augment the node features with training labels.

The first step is similar to what SGC and SIGN do, except that we adapt the original algorithms to the shaDow setting. The second step is inspired by methods on the OGB leaderboard and is only applicable in the transductive setting.
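To make the feature-smoothing idea concrete, here is a minimal sketch in the style of SGC: it multiplies the node features by the symmetrically normalized adjacency matrix (with self-loops) a fixed number of times. The function name `smooth_features`, the `num_hops` parameter, and the choice of normalization are assumptions for illustration; the repo's shaDow version of this step may differ.

```python
import numpy as np
import scipy.sparse as sp


def smooth_features(adj: sp.csr_matrix, feat: np.ndarray, num_hops: int = 2) -> np.ndarray:
    """Apply D^{-1/2} (A + I) D^{-1/2} to `feat` `num_hops` times (SGC-style)."""
    n = adj.shape[0]
    adj_loop = (adj + sp.eye(n, format="csr")).tocsr()  # add self-loops
    deg = np.asarray(adj_loop.sum(axis=1)).ravel()
    d_inv_sqrt = 1.0 / np.sqrt(deg)  # deg >= 1 thanks to the self-loops
    norm_adj = sp.diags(d_inv_sqrt) @ adj_loop @ sp.diags(d_inv_sqrt)
    out = feat.astype(np.float64)
    for _ in range(num_hops):
        out = norm_adj @ out  # one round of neighbor averaging
    return out


# Toy path graph 0 - 1 - 2 with one-dimensional features (illustrative only).
adj = sp.csr_matrix(np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=np.float64))
feat = np.array([[1.0], [0.0], [0.0]])
smoothed = smooth_features(adj, feat, num_hops=1)
```

After one hop, the feature mass of node 0 has spread to its neighbor (node 1) but not yet to node 2.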

Note that preprocessing is turned off in some cases (e.g., for results on ogbn-papers100M).

Training

shaDow-GNN currently supports six backbone architectures: GCN, GraphSAGE, GAT, GIN, JK-Net and SGC.

In addition, we support the following architectural extensions:

Samplers: k-hop, PPR, and the ensemble of them
Subgraph pooling / READOUT: CenterPooling, SortPooling, MaxPooling, MeanPooling, SumPooling
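As a rough illustration of the READOUT step, the sketch below pools the node embeddings of one subgraph into a single vector. It assumes that CenterPooling returns the embedding of the target (center) node and that the other modes reduce over all subgraph nodes; SortPooling is omitted. The function `readout` and its arguments are hypothetical and are not the repo's API.

```python
import numpy as np


def readout(node_emb: np.ndarray, center_idx: int, mode: str = "center") -> np.ndarray:
    """Pool a (num_nodes, dim) embedding matrix of one subgraph into a (dim,) vector."""
    if mode == "center":
        return node_emb[center_idx]   # assumed: embedding of the target node only
    if mode == "mean":
        return node_emb.mean(axis=0)
    if mode == "max":
        return node_emb.max(axis=0)
    if mode == "sum":
        return node_emb.sum(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")


# Three subgraph nodes with two-dimensional embeddings; node 0 is the target.
emb = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]])
pooled = {m: readout(emb, center_idx=0, mode=m) for m in ("center", "mean", "max", "sum")}
```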

Postprocessing

After the training is finished, we can reload the stored checkpoint to perform the following post-processing steps:

C&S (transductive only): we borrow the DGL implementation of C&S to smooth the predictions generated by shaDow-GNN.
Ensemble: Ensemble can be done either in an "end-to-end" fashion during the above training step, or as a postprocessing step. In the paper, our discussion is based on the "end-to-end" ensemble.

Hardware requirements

Due to its flexibility in minibatching, shaDow-GNN has modest hardware requirements for training and inference. Most of our experiments can be run on a desktop machine. Even the largest graph, with 111 million nodes, can be trained on a low-end server.

The main computation operations include:

Subgraph sampling, where we construct a local subgraph for each target node independently. This part is parallelized on CPU by C++ and OpenMP.
Forward / backward propagation of the GNN model. This part is parallelized on GPU via PyTorch.
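The subgraph sampling step can be sketched in pure Python as follows. This is a simplified k-hop sampler, not the repo's C++ / OpenMP implementation: the `budget` parameter (maximum neighbors kept per node per hop) and the function name `khop_subgraph` are assumptions made for illustration. Because each target is processed independently, the outer loop over targets is what the real sampler can parallelize.

```python
import numpy as np
import scipy.sparse as sp


def khop_subgraph(adj: sp.csr_matrix, target: int, num_hops: int, budget: int, rng):
    """Return (node ids, induced adjacency) of a sampled k-hop subgraph around `target`."""
    visited = [target]          # the target always gets local index 0
    seen = {target}
    frontier = [target]
    for _ in range(num_hops):
        next_frontier = []
        for v in frontier:
            neighs = adj.indices[adj.indptr[v]:adj.indptr[v + 1]]
            if len(neighs) > budget:  # keep at most `budget` neighbors per node
                neighs = rng.choice(neighs, size=budget, replace=False)
            for u in neighs:
                u = int(u)
                if u not in seen:
                    seen.add(u)
                    visited.append(u)
                    next_frontier.append(u)
        frontier = next_frontier
    nodes = np.array(visited, dtype=np.int64)
    sub_adj = adj[nodes][:, nodes]  # subgraph induced by the sampled nodes
    return nodes, sub_adj


# Toy undirected path graph 0 - 1 - 2 - 3 - 4 (illustrative only).
rows, cols = np.array([0, 1, 2, 3]), np.array([1, 2, 3, 4])
half = sp.csr_matrix((np.ones(4), (rows, cols)), shape=(5, 5))
adj = (half + half.T).tocsr()
nodes, sub_adj = khop_subgraph(adj, target=2, num_hops=1, budget=10, rng=np.random.default_rng(0))
```

A GNN of any depth can then be run on `sub_adj` alone, which is what keeps the per-target cost independent of the full graph size.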

We summarize the recommended minimum hardware spec for the three OGB graphs:

Graph	Num. nodes	CPU cores	CPU RAM	GPU memory
ogbn-arxiv	0.2M	4	8GB	4GB
ogbn-products	2.4M	4	32GB	4GB
ogbn-papers100M	111.1M	4	128GB	4GB

If you have more powerful machines, you can simply scale up the performance by increasing the batch size. In our experiments, we have tested on GPUs ranging from NVIDIA GeForce GTX 1650 (4GB) to NVIDIA GeForce RTX 3090 (24GB).

Data format

When you run shaDow-GNN for the first time, the graph data is converted from the OGB or GraphSAINT format into the shaDow-GNN format. The converted data files are (by default) stored in the ./data/<graph_name> directory.

NOTE: the initial data conversion may take a while for large graphs (e.g., for ogbn-papers100M). Please be patient.

General shaDow format

We briefly describe the shaDow data format. You should not need to worry about the details unless you want to prepare your own dataset. Each graph is defined by the following files:

adj_full_raw.npz / adj_full_raw.npy: The adjacency matrix of the full graph (consisting of all the train / valid / test nodes). It can either be a *.npz file of type scipy.sparse.csr_matrix, or a *.npy file containing the dictionary {'indptr': numpy.ndarray, 'indices': numpy.ndarray, 'data': numpy.ndarray}.
adj_train_raw.npz / adj_train_raw.npy: The adjacency matrix induced by all training nodes (ONLY used in inductive learning).
label_full.npy: The numpy.ndarray representing the labels of all the train / valid / test nodes. If this matrix is 2D, then a row is a one-hot encoding of the label(s) of a node. If this is 1D, then an element is the label index of a node. In any case, the first dimension equals the total number of nodes.
feat_full.npy: The numpy.ndarray representing the node features. The first dimension of the matrix equals the total number of nodes.
split.npy: The file stores a dictionary representing the train / valid / test splitting. The keys are train / valid / test. The values are numpy array of the node indices for the corresponding split.
(Optional) adj_full_undirected.npy: This is a cache file storing the graph after converting adj_full_raw into undirected (e.g., the raw graph of ogbn-arxiv is directed).
(Optional) adj_train_undirected.npy: Same as above, but converted from adj_train_raw into undirected.
(Optional) cpp/adj_<full|train>_<indices|indptr|data>.bin: These are the cache files for the C++ sampler. We store the corresponding *.npy / *.npz files as binary files so that the C++ sampler can directly load the graph without going through the layer of PyBind11 (see below). For gigantic graphs such as ogbn-papers100M, the conversion from numpy.ndarray to C++ vector seems to be slow (maybe an issue of PyBind11).
(Optional) ppr_float/<neighs|scores>_<transductive|inductive>_<ppr params>.bin: These are the cache files for the C++ PPR sampler. We store the PPR values and node indices for the close neighbors of each target as external binary files. Therefore, we do not need to run PPR multiple times when we perform parameter tuning (even though running PPR from scratch is still much cheaper than the model training).
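If you want to prepare your own dataset, the required (non-optional) files for the transductive case can be written roughly as follows. This is a minimal sketch based only on the file descriptions above: the toy graph, the dtypes, and the helper name `write_toy_graph` are assumptions, and the example writes to a temporary directory rather than ./data/<graph_name>. For inductive learning you would additionally need adj_train_raw.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sp


def write_toy_graph(root: str) -> None:
    """Write a 6-node toy graph using the file names described above."""
    num_nodes, feat_dim, num_classes = 6, 4, 3
    rng = np.random.default_rng(0)
    rows = np.array([0, 1, 2, 3, 4])
    cols = np.array([1, 2, 3, 4, 5])
    data = np.ones(len(rows), dtype=np.float32)
    adj = sp.csr_matrix((data, (rows, cols)), shape=(num_nodes, num_nodes))
    sp.save_npz(os.path.join(root, "adj_full_raw.npz"), adj)
    feat = rng.standard_normal((num_nodes, feat_dim)).astype(np.float32)
    np.save(os.path.join(root, "feat_full.npy"), feat)
    # 1D labels: one class index per node (the 2D one-hot form is also allowed).
    labels = rng.integers(0, num_classes, size=num_nodes)
    np.save(os.path.join(root, "label_full.npy"), labels)
    split = {"train": np.array([0, 1, 2, 3]), "valid": np.array([4]), "test": np.array([5])}
    np.save(os.path.join(root, "split.npy"), split, allow_pickle=True)


with tempfile.TemporaryDirectory() as root:
    write_toy_graph(root)
    adj_loaded = sp.load_npz(os.path.join(root, "adj_full_raw.npz"))
    feat_loaded = np.load(os.path.join(root, "feat_full.npy"))
    label_loaded = np.load(os.path.join(root, "label_full.npy"))
    # A dictionary saved with np.save comes back as a 0-d object array.
    split_loaded = np.load(os.path.join(root, "split.npy"), allow_pickle=True).item()
```

Note that the first dimension of the feature and label arrays matches the number of rows of the adjacency matrix, as required by the format.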

Graphs tested

To train shaDow-GNN on the six graphs evaluated in the paper:

For the three OGB graphs (i.e., ogbn-arxiv, ogbn-products, ogbn-papers100M), you don't need to manually download anything. Just execute the training command (see below).
For the three other graphs (i.e., Flickr, Reddit, Yelp), these are listed in the official GraphSAINT repo. Please manually download them from the link provided by GraphSAINT, and place all the downloaded files under the ./data/saint/<graph name>/ directory.
- E.g., for Flickr, the directory should look something like (note the lower case for graph name)

data/
└───saint/
    └───flickr/
        └───adj_full.npz
            class_map.json
            ...

The script for converting from OGB / SAINT into shaDow format is ./shaDow/data_converter.py. It is automatically invoked when you run training for the first time.

Build and Run

Clone the repo by (you need the --recursive flag to download pybind11 as submodule):

git clone <URL FOR THIS REPO> --recursive

Step 0: Create a virtual environment with Python 3.8 (lower versions of Python may not work; the version we use is 3.8.5).

Step 1: We need PyBind11 to link the C++ based sampler with the PyTorch based trainer. The ./shaDow/para_sampler/ParallelSampler.* contains the C++ code for the PPR and k-hop samplers. The ./shaDow/para_sampler/pybind11/ directory contains a copy of PyBind11.

Before training, we need to build the C++ sampler as a python package, so that it can be directly imported by the PyTorch trainer (just like we import any other python module). To do so, you need to install the following:

cmake (our version is 3.18.2. May be installed by conda install -c anaconda cmake)
ninja (our version is 1.10.2. May be installed by conda install -c conda-forge ninja)
pybind11 (our version is 2.6.2. May be installed by pip install pybind11)
OpenMP: normally openmp should already be included in the C++ compiler. If not, you may need to install it manually based on your C++ compiler version.

Then build the sampler. Run the following in your terminal

bash install.sh

On a Windows machine, you can instead execute .\install.bat.

NOTE: if the above does not work, we provide an alternative script to compile the C++ sampler. See the Troubleshooting section for details.

Step 2: Install all the other Python packages in your virtual environment.

pytorch==1.7.1 (CUDA 11)
Pytorch Geometric and its dependency packages (torch-scatter, torch-sparse, etc.)
- Follow the official instructions (see the "Installation via Binaries" section)
- We also explicitly use the torch_scatter functions to perform some graph operations for shaDow.
ogb>=1.2.4
dgl>=0.5.3 (only used by the postprocessing of C&S). Can be installed by pip or conda. See the official instruction
numpy>=1.19.2
scipy>=1.6.0
scikit-learn>=0.24.0
pyyaml>=5.4.1
argparse
tqdm

Step 3: Record your system information. We use the CONFIG.yml file to keep track of the meta information of your hardware / software system. Copy CONFIG_TEMPLATE.yml and name it CONFIG.yml. Edit the fields based on your machine specs.

In most cases, the only thing you need to overwrite is the max_threads field. This is used to control the parallelism of the C++ sampler. You can also set it to -1 so that OpenMP will automatically decide the number of threads for you.

Step 4: You should be able to run the training now. In general, just type:

python -m shaDow.main --configs <your config *.yml file> --dataset <name of the graph> --gpu <index of the available GPU>

where the *.yml file specifies the GNN architecture, sampler parameters and other hyperparameters. The name of the graph should correspond to the sub-directory name under ./data/. E.g., the graphs used in the paper correspond to flickr, reddit, yelp, arxiv, products, papers100M (we use all lowercase and omit the ogbn- prefix).

Step 5: Check the training logs. We use the following protocol for logging. Our principle is that the logs should enable you to completely reproduce your own previous runs.

Each run gets its own subdirectory in the format of ./<log dir>/<data>/<running|finished|crashed|killed>/<timestamp>-<githash>/..., where the subdirectory indicates the status of the run:
- running/: the training is still in progress.
- finished/: the training finishes normally. The logs will be moved from running/ to finished/.
- killed/: the training is killed (e.g., by CTRL-C).
- crashed/: the training crashes (e.g., bugs in the code, GPU / CPU out-of-memory, etc.).
In the subdirectory we should find the following files:
- *.yml: a copy of the *.yml file to launch the training
- epoch_<train|valid|test>.csv: CSV file logging the accuracy of each epoch
- final.csv: CSV file logging the final accuracy on the full train / valid / test sets.
- pytorch checkpoint: the model weights and optimizer states.
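Given this layout, a small helper can collect the run directories for a given status. The sketch below relies only on the directory structure described above; the function name `list_runs` is hypothetical, and sorting by name assumes that the timestamp prefix sorts chronologically, which we have not verified.

```python
import tempfile
from pathlib import Path


def list_runs(log_dir: str, data: str, status: str = "finished") -> list:
    """Return the run directories under <log_dir>/<data>/<status>/, sorted by name."""
    status_dir = Path(log_dir) / data / status
    if not status_dir.is_dir():
        return []
    return sorted(p for p in status_dir.iterdir() if p.is_dir())


# Demonstration on a throwaway directory tree with made-up run names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run_b", "run_a"):
        (Path(tmp) / "arxiv" / "finished" / name).mkdir(parents=True)
    finished = [p.name for p in list_runs(tmp, "arxiv", "finished")]
    crashed = list_runs(tmp, "arxiv", "crashed")
```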

Reproducing the results

We first describe the command for a single run. At the end of this section, we show the wrapper script for repeating the same configuration for 10 runs.

`ogbn-papers100M` (leaderboard result)

shaDow-SAGE

Run the following:

python -m shaDow.main --configs config_train/papers100M/sage_5_ppr_ogb.yml --dataset papers100M --gpu 0

shaDow-GAT

Run the following:

python -m shaDow.main --configs config_train/papers100M/gat_5_ppr_ogb.yml --dataset papers100M --gpu 0

We don't fix any random seeds (for the C++ sampler, the seed is set by the timestamp at which training is launched; see std::srand(std::time(0)) of shaDow/para_samplers/ParallelSampler.h). Over 10 runs, we get 0.6653 test accuracy for the shaDow-GAT model.

Repeat the same configuration 10 times

According to the OGB leaderboard instructions, we need to repeat the same configuration 10 times, and report the mean and std of accuracy. We provide a wrapper script for this purpose: ./scripts/train_multiple_runs.py.

NOTE: the wrapper script uses python subprocess to launch multiple runs. There seems to be some issue with redirecting the print-out messages of the training subprocess. It may appear that the program hangs without any output. This is due to output buffering. However, the training should actually be running in the background. You can check the corresponding log files in the running/ directory to see the accuracy per epoch being updated.

For example, to repeat the configuration 10 times:

python scripts/train_multiple_runs.py --dataset papers100M --configs config_train/papers100M/gat_5_ppr_ogb.yml --gpu 0 --repetition 10

where all the command line arguments are the same as the original training script (i.e., the shaDow.main module). The only additional flag is --repetition.
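If the buffered output is a problem, one possible workaround is to launch the runs yourself with Python's -u flag, which forces unbuffered stdout and stderr in the child process. The sketch below is not the repo's wrapper script; `build_command` and `run_repeated` are hypothetical helpers, and only the shaDow.main flags shown above are used.

```python
import subprocess
import sys


def build_command(configs: str, dataset: str, gpu: int) -> list:
    """Build the training command with unbuffered output (-u)."""
    return [
        sys.executable, "-u", "-m", "shaDow.main",
        "--configs", configs,
        "--dataset", dataset,
        "--gpu", str(gpu),
    ]


def run_repeated(configs: str, dataset: str, gpu: int, repetition: int) -> list:
    """Run the same configuration `repetition` times and return the exit codes."""
    codes = []
    for _ in range(repetition):
        # The child inherits our stdout, so its messages appear as they are printed.
        proc = subprocess.run(build_command(configs, dataset, gpu))
        codes.append(proc.returncode)
    return codes


cmd = build_command("config_train/papers100M/gat_5_ppr_ogb.yml", "papers100M", 0)
# To actually launch the runs from the repo root:
# run_repeated("config_train/papers100M/gat_5_ppr_ogb.yml", "papers100M", 0, 10)
```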

NOTE: due to the large graph size, each run on ogbn-papers100M takes quite some time to converge. It may make more sense to manually launch the training 10 times and manually calculate the mean and std based on the log files.
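Once you have collected the final test accuracy of each run from the logs, the mean and std can be computed with a few lines of Python. The accuracy values below are placeholders for illustration, not results from the paper, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; check which convention the leaderboard expects.

```python
import statistics


def summarize(accuracies: list) -> tuple:
    """Return (mean, sample standard deviation) of the per-run accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Placeholder values only; replace with the accuracies read from your own logs.
accs = [0.50, 0.52, 0.51, 0.49]
mean_acc, std_acc = summarize(accs)
print(f"test accuracy: {mean_acc:.4f} +/- {std_acc:.4f}")
```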

Troubleshooting

What if the sampler does not build successfully?

This may be an issue specific to your machine. You can file an issue or contact [email protected]. Alternatively, you can try another compilation script we provide, after manually building pybind11. See this installation instruction and do the following:

Download pybind11 from the official GitHub repo. Enter the pybind11 root dir.
Install pytest by, e.g., pip install -U pytest
mkdir build
cd build
cmake ..
- Note: if cmake is not installed, you can install it via Anaconda: conda install -c anaconda cmake
make check -j 4 (Windows machine does not seem to be well supported)
Install pybind11 package: python -m pip install pybind11

Then go back to the shaDow-GNN root directory and compile the C++ sampler by

./compile.sh

License

shaDow_GNN is released under an MIT license. Find out more about it here.



For more details, please see our paper, Deep Graph Neural Networks with Shallow Subgraph Samplers, available on arXiv (https://arxiv.org/abs/2012.01380).
