GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training @ KDD 2020

Overview




The original implementation of the paper GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training (KDD 2020).

GCC is a contrastive learning framework for unsupervised structural graph representation pre-training, and it achieves state-of-the-art results on 10 datasets across 3 graph mining tasks.
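
At a high level, pre-training treats two subgraphs sampled around the same vertex (e.g., by random walks with restart) as a positive pair and subgraphs sampled around other vertices as negatives, and optimizes an InfoNCE-style objective over the encoder's subgraph embeddings. The sketch below is only an illustration of that objective, not the repository's implementation; the encoder, the sampling, and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(q, k, temperature=0.07):
    """Illustrative InfoNCE loss over subgraph embeddings.

    q, k: (batch, dim) embeddings of two subgraphs sampled around the same
    vertices; row i of q and row i of k form a positive pair, and the other
    rows of k serve as negatives.
    """
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    logits = q @ k.t() / temperature                   # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives lie on the diagonal
    return F.cross_entropy(logits, labels)
```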

Installation

Requirements

Quick Start

Pretraining

Pre-training datasets

python scripts/download.py --url https://drive.google.com/open?id=1JCHm39rf7HAJSp-1755wa32ToHCn2Twz --path data --fname small.bin
# For regions where Google is not accessible, use
# python scripts/download.py --url https://cloud.tsinghua.edu.cn/f/b37eed70207c468ba367/?dl=1 --path data --fname small.bin

E2E

Pretrain E2E with K = 255:

bash scripts/pretrain.sh <gpu> --batch-size 256

MoCo

Pretrain MoCo with K = 16384; m = 0.999:

bash scripts/pretrain.sh <gpu> --moco --nce-k 16384
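
The two variants differ mainly in where the negatives come from: E2E uses the other samples in the current batch (hence K = 255 with a batch size of 256), while MoCo maintains a queue of K = 16384 keys produced by a momentum-updated copy of the encoder with m = 0.999. Below is a minimal, illustrative sketch of the MoCo bookkeeping, not the repository's code; names and shapes are assumptions.

```python
import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    """Let the key encoder slowly track the query encoder."""
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data.mul_(m).add_(p_q.data, alpha=1.0 - m)

@torch.no_grad()
def dequeue_and_enqueue(queue, ptr, keys):
    """Overwrite the oldest entries of the (K, dim) negative queue with new keys.

    Assumes K is a multiple of the batch size, so a write never wraps around.
    """
    batch_size = keys.size(0)
    queue[ptr : ptr + batch_size] = keys
    return (ptr + batch_size) % queue.size(0)
```

The queue decouples the number of negatives K from the batch size, which is why MoCo can use K = 16384 while E2E is limited to the current batch.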

Download Pretrained Models

Instead of pretraining from scratch, you can download our pretrained models.

python scripts/download.py --url https://drive.google.com/open?id=1lYW_idy9PwSdPEC7j9IH5I5Hc7Qv-22- --path saved --fname pretrained.tar.gz
# For regions where Google is not accessible, use
# python scripts/download.py --url https://cloud.tsinghua.edu.cn/f/cabec37002a9446d9b20/?dl=1 --path saved --fname pretrained.tar.gz

Downstream Tasks

Downstream datasets

python scripts/download.py --url https://drive.google.com/open?id=12kmPV3XjVufxbIVNx5BQr-CFM9SmaFvM --path data --fname downstream.tar.gz
# For regions where Google is not accessible, use
# python scripts/download.py --url https://cloud.tsinghua.edu.cn/f/2535437e896c4b73b6bb/?dl=1 --path data --fname downstream.tar.gz

Generate embeddings on multiple datasets with

bash scripts/generate.sh <gpu> <load_path> <dataset_1> <dataset_2> ...

For example:

bash scripts/generate.sh 0 saved/Pretrain_moco_True_dgl_gin_layer_5_lr_0.005_decay_1e-05_bsz_32_hid_64_samples_2000_nce_t_0.07_nce_k_16384_rw_hops_256_restart_prob_0.8_aug_1st_ft_False_deg_16_pos_32_momentum_0.999/current.pth usa_airport kdd imdb-binary

Node Classification

Unsupervised (Table 2 freeze)

Run baselines on multiple datasets with:

bash scripts/node_classification/baseline.sh <hidden_size> <baseline:prone/graphwave> usa_airport h-index

Evaluate GCC on multiple datasets:

bash scripts/generate.sh <gpu> <load_path> usa_airport h-index
bash scripts/node_classification/ours.sh <load_path> <hidden_size> usa_airport h-index
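
In the freeze setting the pretrained GCC encoder is not updated; the generated node embeddings are simply fed to a light-weight classifier. A rough, illustrative sketch of that kind of evaluation is shown below; the file locations and names are assumptions, not the actual interface of ours.sh.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical paths: assume generate.sh dumped one embedding per node
# and that node labels are available for the downstream dataset.
emb = np.load("saved/<model_name>/usa_airport.npy")  # (num_nodes, hidden_size)
labels = np.load("data/usa_airport/labels.npy")      # (num_nodes,)

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, emb, labels, cv=10, scoring="accuracy")
print(f"10-fold accuracy: {scores.mean():.4f}")
```
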
Supervised (Table 2 full)

Finetune GCC on multiple datasets:

bash scripts/finetune.sh <load_path> <gpu> usa_airport

Note that this fine-tunes the whole network, which takes much longer than the frozen experiments above.

Graph Classification

Unsupervised (Table 3 freeze)

bash scripts/generate.sh <gpu> <load_path> imdb-binary imdb-multi collab rdt-b rdt-5k
bash scripts/graph_classification/ours.sh <load_path> <hidden_size> imdb-binary imdb-multi collab rdt-b rdt-5k

Supervised (Table 3 full)

bash scripts/finetune.sh <load_path> <gpu> imdb-binary

Similarity Search (Table 4)

Run the baseline (graphwave) on multiple datasets with:

bash scripts/similarity_search/baseline.sh <hidden_size> graphwave kdd_icdm sigir_cikm sigmod_icde

Run GCC:

bash scripts/generate.sh <gpu> <load_path> kdd icdm sigir cikm sigmod icde
bash scripts/similarity_search/ours.sh <load_path> <hidden_size> kdd_icdm sigir_cikm sigmod_icde
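
Conceptually, the similarity-search evaluation embeds the nodes of two paired co-author graphs (e.g., kdd vs. icdm) with the frozen encoder, then retrieves, for each node in one graph, the most similar nodes in the other. A hedged, illustrative sketch of such a top-k retrieval is below; the array names, the use of cosine similarity, and the ground-truth format are assumptions rather than the evaluation script's exact procedure.

```python
import numpy as np

def top_k_hit_rate(emb_a, emb_b, ground_truth, k=10):
    """Fraction of nodes in graph A whose true match in graph B
    appears among the k most similar B-node embeddings."""
    emb_a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    emb_b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = emb_a @ emb_b.T                    # cosine similarity matrix
    top_k = np.argsort(-sim, axis=1)[:, :k]  # indices of the k nearest B nodes
    hits = [gt in row for gt, row in zip(ground_truth, top_k)]
    return float(np.mean(hits))
```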

Common Issues

"XXX file not found" when running pretraining/downstream tasks.
Please make sure you've downloaded the pretraining dataset or downstream task datasets according to GETTING_STARTED.md.
Server crashes/hangs after launching pretraining experiments.
In addition to GPUs, our pretraining stage requires a lot of computational resources, including CPU and RAM. If this happens, it usually means the CPU/RAM on your machine is exhausted. You can decrease `--num-workers` (the number of data-loading workers using the CPU) and `--num-copies` (the number of dataset copies held in RAM). With the lowest profile, try `--num-workers 1 --num-copies 1`, as in the example below.
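
For example, the MoCo pretraining command above can be run with the lowest profile as:

bash scripts/pretrain.sh <gpu> --moco --nce-k 16384 --num-workers 1 --num-copies 1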

If this still fails, please upgrade your machine :). In the meantime, you can still download our pretrained model and evaluate it on downstream tasks.

Having difficulty installing RDKit.
See the P.S. section in [this](https://github.com/THUDM/GCC/issues/12#issue-752080014) post.

Citing GCC

If you use GCC in your research or wish to refer to the baseline results, please use the following BibTeX.

@article{qiu2020gcc,
  title={GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training},
  author={Qiu, Jiezhong and Chen, Qibin and Dong, Yuxiao and Zhang, Jing and Yang, Hongxia and Ding, Ming and Wang, Kuansan and Tang, Jie},
  journal={arXiv preprint arXiv:2006.09963},
  year={2020}
}

Acknowledgements

Part of this code is inspired by Yonglong Tian et al.'s CMC: Contrastive Multiview Coding.

Comments
  • Can't run this code on low-end hardware

    Can't run this code on low-end hardware

    Hi, thank you for releasing your code. I am currently trying to reproduce the results of the E2E pre-training experiment.

    However, I can't run this code because my PC's configuration is too low.

    Is there any way to deal with this?

    Maybe by lowering the experiment's parameters? I have tried lowering the batch size to 16 several times, but it still doesn't work. I have no idea how to deal with it.

    How can I run this code with the following hardware?

    I am looking forward to your reply. Thank you very much.

    My PC's system parameters: (screenshot)

    opened by ChloeWongxt 5
  • Reproduce result of usa-airports

    Reproduce result of usa-airports

    Hi, thank you for releasing your code. I am currently trying to reproduce the node classification result on the US-Airport dataset, but I can't get accuracy as high as 68.3%. Are there any techniques I can use to get higher accuracy? Thanks!

    opened by larry2020626 5
  • [DGLError] dgl._ffi.base.DGLError: Check failed: fs: Filename is invalid

    [DGLError] dgl._ffi.base.DGLError: Check failed: fs: Filename is invalid

    Running the pre-training example fails.

    When I run this code, there are some problems with it.

    I have no idea what is happening. Could you help me? Thanks a lot.

    Paper: GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training. arXiv: https://arxiv.org/abs/2006.09963 Code: https://github.com/THUDM/GCC

    Error

    (GCC01) chloe@chloe-MS-7A74:~/Documents/00 Work/02 Xovee/01 Code/GCC$ bash scripts/pretrain.sh 0 --batch-size 256 Using backend: pytorch Namespace(alpha=0.999, aug='1st', batch_size=256, beta1=0.9, beta2=0.999, clip_norm=1.0, cv=False, dataset='dgl', degree_embedding_size=16, epochs=100, exp='Pretrain', finetune=False, fold_idx=0, freq_embedding_size=16, gpu=0, hidden_size=64, learning_rate=0.005, load_path=None, lr_decay_epochs=[120, 160, 200], lr_decay_rate=0.0, max_degree=512, max_edge_freq=16, max_node_freq=16, moco=False, model='gin', model_folder='saved/Pretrain_moco_False_dgl_gin_layer_5_lr_0.005_decay_1e-05_bsz_256_hid_64_samples_2000_nce_t_0.07_nce_k_32_rw_hops_256_restart_prob_0.8_aug_1st_ft_False_deg_16_pos_32_momentum_0.999', model_name='Pretrain_moco_False_dgl_gin_layer_5_lr_0.005_decay_1e-05_bsz_256_hid_64_samples_2000_nce_t_0.07_nce_k_32_rw_hops_256_restart_prob_0.8_aug_1st_ft_False_deg_16_pos_32_momentum_0.999', model_path='saved', momentum=0.9, nce_k=32, nce_t=0.07, norm=True, num_copies=6, num_layer=5, num_samples=2000, num_workers=12, optimizer='adam', positional_embedding_size=32, print_freq=10, readout='avg', restart_prob=0.8, resume='', rw_hops=256, save_freq=1, seed=0, set2set_iter=6, set2set_lstm_layer=3, subgraph_size=128, tb_folder='tensorboard/Pretrain_moco_False_dgl_gin_layer_5_lr_0.005_decay_1e-05_bsz_256_hid_64_samples_2000_nce_t_0.07_nce_k_32_rw_hops_256_restart_prob_0.8_aug_1st_ft_False_deg_16_pos_32_momentum_0.999', tb_freq=250, tb_path='tensorboard', weight_decay=1e-05) Use GPU: 0 for training setting random seeds before construct dataset 6.249996185302734 Traceback (most recent call last): File "train.py", line 818, in main(args) File "train.py", line 555, in main num_copies=args.num_copies File "/home/chloe/Documents/00 Work/02 Xovee/01 Code/GCC/gcc/datasets/graph_dataset.py", line 58, in init graph_sizes = dgl.data.utils.load_labels(dgl_graphs_file)[ File "/home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/data/graph_serialize.py", line 172, in load_labels metadata = _CAPI_DGLLoadGraphs(filename, [], True) File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.call File "dgl/_ffi/_cython/./function.pxi", line 222, in dgl._ffi._cy3.core.FuncCall File "dgl/_ffi/_cython/./function.pxi", line 211, in dgl._ffi._cy3.core.FuncCall3 File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL dgl._ffi.base.DGLError: [21:41:01] /opt/dgl/src/graph/graph_serialize.cc:193: Check failed: fs: Filename is invalid Stack trace: [bt] (0) /home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x22) [0x7fbe9e1ec782] [bt] (1) /home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/libdgl.so(dgl::serialize::LoadDGLGraphs(std::string const&, std::vector<unsigned long, std::allocator >, bool)+0xe7c) [0x7fbe9e859a5c] [bt] (2) /home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/libdgl.so(+0xd1f0eb) [0x7fbe9e85a0eb] [bt] (3) /home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/libdgl.so(DGLFuncCall+0x52) [0x7fbe9e7f46e2] [bt] (4) /home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x19cdb) [0x7fbef63a5cdb] [bt] (5) /home/chloe/anaconda3/envs/GCC01/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x1a25b) [0x7fbef63a625b] [bt] (6) python(_PyObject_FastCallKeywords+0x48b) [0x55db603c900b] [bt] (7) python(_PyEval_EvalFrameDefault+0x49b6) 
[0x55db6042d186] [bt] (8) python(_PyFunction_FastCallKeywords+0xfb) [0x55db603c120b]

    Environment

    scikit-learn==0.20.3 scipy==1.4.1 coverage==4.5.4 coveralls==1.9.2 black==19.3b0 pytest==5.3.2 networkx==2.3 numpy==1.18.2 matplotlib==3.1.0 seaborn==0.9.0 tqdm==4.43.0 tensorboard_logger==0.1.0

    torch~=1.5.1 dgl~=0.4.3.post2 pandas~=1.0.5 requests~=2.24.0 psutil~=5.7.2 joblib~=0.16.0

    Python 3.7, PyTorch 1.5.1, DGL 0.4.1, rdkit 2019.09.2.

    opened by ChloeWongxt 4
  • An error when finetuning on graph classification dataset

    An error when finetuning on graph classification dataset

    Hi,

    I encountered an error when fine-tuning on the imdb-binary dataset.

    The running command is

    bash scripts/finetune.sh saved/Pretrain_moco_True_dgl_gin_layer_5_lr_0.005_decay_1e-05_bsz_32_hid_64_samples_2000_nce_t_0.07_nce_k_163841_rw_hops_256_restart_prob_0.8_aug_1st_ft_False_deg_16_pos_32_momentum_0.999/ 0 imdb-binary

    The error message is AttributeError: 'GraphClassificationDatasetLabeled' object has no attribute 'dataset'

    Please see the attached screenshot.

    Thank you!

    opened by Kqiii 3
  • In the generation phase, is it necessary to increase the batch size to the length of the dataset?

    In the generation phase, is it necessary to increase the batch size to the length of the dataset?

    https://github.com/THUDM/GCC/blob/master/generate.py#L90

    I find that if I comment out this line, i.e. keep the batch size at 32, inference is much faster.

    Why do you set the batch size to the length of the dataset? Does it affect the performance?

    opened by JiangTanZJU 3
  • About downstream datasets

    About downstream datasets

    Hello, I want to run the code on the Cora and Citeseer datasets, but I found no downstream datasets with those names. Could you please provide the code for generating downstream datasets, or would you mind sharing the Cora and Citeseer files you have generated? Thanks a million!

    opened by flyz1 2
  • torch.utils.data.IterableDataset

    torch.utils.data.IterableDataset

    Hi: In graph_dataset.py, in class LoadBalanceGraphDataset(torch.utils.data.IterableDataset), self.num_samples defaults to 2000. I would like to understand the relation between this variable and the dataloader's batch_size. In my own experiment, my Dataset class is an IterableDataset, and when self.num_samples is not a multiple of train_loader.batch_size, it goes wrong.

    opened by wangzeyu135798 1
  • Can't do node_classification tasks on panther datasets

    Can't do node_classification tasks on panther datasets

    When I use the cikm dataset in the panther directory for the downstream node_classification task, data_util's class SSSingleDataset sets self.data = Data(x=None, edge_index=edge_index, y=None), while graph_dataset.py line 434 does self.num_classes = self.data.y.shape[1]; since self.data is that Data object and self.data.y doesn't exist, it fails.

    opened by wangzeyu135798 1
  • Possibly redundant BatchNorm layer?

    Possibly redundant BatchNorm layer?

    @xptree Thank you for the code. It seems that there are two cascaded BatchNorm layers in each GIN layer. I am wondering whether one of them is redundant.

    Specifically, in the UnsupervisedGIN class, BNs are instantiated (use_selayer is False, as in the code) and called during forward (see screenshots).

    Meanwhile, in the ApplyNodeFunc class, a BN is again instantiated and called during forward (use_selayer is False, as in the code; see screenshot).

    So in each layer, there are two cascaded BNs, between which there is only a ReLU activation.

    As a novice in GNNs, I have not seen such an implementation (cascaded BNs) elsewhere. Could you please explain why you did this? Does this implementation lead to better performance than keeping only one BN in each layer?

    Thank you!

    opened by zhikaili 1
  • Running experiments completely on CPU

    Running experiments completely on CPU

    Hi, thanks for the inspiring work of GCC.

    I wonder if there is a way to run GCC pretraining on CPU only. As far as I can tell, there is no easy way (such as specifying an option in the arguments) to do this.

    P.S. Installing RDKit is really painful on servers inside mainland China. After checking the code, I found that there is no need to install RDKit; all one has to do is copy DGL's GAT and GCN layer code into models/gat.py and models/gcn.py respectively. Please kindly correct me if I'm wrong.

    Any help is highly appreciated : )

    opened by AndyJZhao 1
  • it seems dgl-0.4.1 cannot work properly

    it seems dgl-0.4.1 cannot work properly

    When running the script, I get the following error:

    munmap_chunk(): invalid pointer scripts/pretrain.sh: line 10: 17039 Aborted (core dumped) python train.py --exp Pretrain --model-path saved --tb-path tensorboard --gpu $gpu $ARGS

    Updating dgl to 0.4.3 fixes it.

    opened by Coolgiserz 1
  • Bump numpy from 1.18.2 to 1.22.0

    Bump numpy from 1.18.2 to 1.22.0

    Bumps numpy from 1.18.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across applications such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Please help resolve a problem when running the code on CUDA 11.1 with DGL 0.7

    Please help resolve a problem when running the code on CUDA 11.1 with DGL 0.7

    dgl._ffi.base.DGLError: [15:08:17] /opt/dgl/include/dgl/packed_func_ext.h:117: Check failed: ObjectTypeChecker::Check(sptr.get()): Expected type graph.Graph but get graph.HeteroGraph Stack trace:

    How can I update this code to DGL 0.7.x with CUDA 11?

    dgl.contrib.sampling.random_walk_with_restart and dgl.contrib.sampling.random_walk cannot work in dgl-cu11; they need to be replaced by dgl.sampling.random_walk, but the parameters are different. How can I update this code to DGL 0.7.x with CUDA 11?

    my email: [email protected]

    opened by xdjwolf 1
  • finetune

    finetune

    Hi: When I fine-tune the pre-trained model on the downstream graph classification datasets RDT-B and RDT-M, I run into a problem. During fine-tuning, the training accuracy is very high, sometimes near 1, while the test accuracy is very low; it looks like an overfitting problem. Have you encountered this before, and if so, how did you deal with it? Thanks.

    opened by wangzeyu135798 2
Owner: THUDM (Data Mining Research Group at Tsinghua University)