
Heterogeneous Graph Benchmark

Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks.

Roadmap

We organize this repo by task, with one sub-folder per task. Currently, there are four tasks: node classification (NC), link prediction (LP), knowledge-aware recommendation (Recom), and text classification (TC).

Revisiting

This part corresponds to Section 3 and Table 1 of our paper.

Benchmarking and Refining

This part corresponds to Sections 4, 5, and 6 of our paper.

Note that the test labels have been randomly replaced to prevent data leakage. To obtain test scores, you must submit your predictions to our website.

For the node classification and link prediction tasks, you can submit online. For the recommendation task, the prediction files are too large to upload, so you have to evaluate offline yourself.

To list your method on the official leaderboard on the HGB website, submit your code or paper to us. Once it is verified, your method will be displayed on the leaderboard. (The request form is under development and will be available soon!)

More

This repo is under active development, so it contains some experiments beyond the paper, such as graph-based text classification. For more information, see our website. Contributions of new tasks, datasets, and methods to HGB are welcome!

We also provide an implementation of Simple-HGN in cogdl.

Citation

  • Title: Are we really making much progress? Revisiting, benchmarking and refining heterogeneous graph neural networks.
  • Authors: Qingsong Lv*, Ming Ding*, Qiang Liu, Yuxiang Chen, Wenzheng Feng, Siming He, Chang Zhou, Jianguo Jiang, Yuxiao Dong, Jie Tang.
  • In proceedings: KDD 2021.
Comments
  • Dataset about IMDB

    The IMDB dataset is multi-label, but I am wondering how to save the prediction file for evaluation, since I don't see it in run_multi.py. I would also like to know which sklearn API to use for evaluating multi-label predictions.
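    For reference, a minimal evaluation sketch with scikit-learn (the file paths and array names here are hypothetical; sklearn.metrics.f1_score accepts multi-label indicator arrays directly):

    ```python
    import numpy as np
    from sklearn.metrics import f1_score

    # Hypothetical files holding 0/1 indicator matrices of shape
    # (n_samples, n_labels): ground-truth labels and thresholded predictions.
    y_true = np.load("imdb_test_labels.npy")
    y_pred = np.load("imdb_test_preds.npy")

    # f1_score handles multi-label input when given indicator matrices.
    print("macro-f1:", f1_score(y_true, y_pred, average="macro"))
    print("micro-f1:", f1_score(y_true, y_pred, average="micro"))
    ```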

    opened by Theheavens 5
  • use /NC/benchmark/methods/GNN to run IMDB

    The results just can't reach those in the paper. When I ran IMDB, I got around macro-f1 = 0.44 and micro-f1 = 0.56. I used the parameters described in the paper: "We set 𝑑 = 64, 𝑛ℎ = 8 for all datasets. For IMDB, we set 𝑠 = 0.1 and 𝐿 = 5. We use feat = 0 for IMDB."

    I wonder why this happens. Thanks for your attention.

    opened by lanadelreyfan 1
  • about L2 norm in Simple HGN and a problem about the submission website

    Thanks for your team's good work, but I have a doubt about the final L2-norm trick used in Simple-HGN. From the paper's ablation study, the performance of Simple-HGN drops greatly without this trick (by 5-8%; it seems the high performance of Simple-HGN largely comes from it), which is quite strange.
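    For context, the trick in question L2-normalizes the final-layer output; a minimal PyTorch sketch (an illustration based on the paper's description, not the repo's exact code):

    ```python
    import torch
    import torch.nn.functional as F

    logits = torch.randn(100, 4)  # final-layer node representations

    # The "last L2 norm": rescale each node's output vector to unit length,
    # i.e. x / ||x||_2, before computing scores.
    normalized = F.normalize(logits, p=2, dim=1)
    ```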

    Also, the website https://www.biendata.xyz/competition/hgb-1 has had a problem recently. When I make a submission, it gets stuck at "evaluating" for a long time and never returns the results. Can you check it? Thank you.

    opened by SsrCode 1
  • Stuck at evaluation after submitting to node classification leaderboard

    After I made a submission to the node classification leaderboard, the interface got stuck at 100% upload with a greyed-out "Evaluating" button, never returning any results. Is there an issue with the evaluation server? Submitting to link prediction worked fine. Thanks!

    (Possible duplicate of #14 )

    opened by BarclayII 1
  • about meta-paths

    Hello. In the paper, the meta-paths for ACM are set as shown in the figure below, but the meta-paths for HAN in the code are different (https://github.com/THUDM/HGB/blob/master/NC/benchmark/methods/HAN/utils.py#L170). Do we need to change this ourselves? [image]

    I also ran the HAN code from the benchmark. With the meta-path settings in the code, the best validation macro-f1 is 0.87 and micro-f1 is 0.96; after 5 runs I submitted to biendata, and the test macro-f1 and micro-f1 are both around 0.86, still quite far from the roughly 90.8% reported in the paper. [image] If I set the meta-paths as in the paper (i.e., without PTP and with the extra citation/reference relations added), the best validation macro-f1 is 0.78 and micro-f1 is 0.90.

    I wonder what results others get when running ACM. Is there anything particular to watch out for in the ACM experiments?

    opened by NovelinUp 3
  • Question about feature preprocessing for certain datasets

    Dear authors,

    I find that this repo only provides the datasets for download but does not provide the preprocessing code for them.

    I have a question about the ACM and Freebase datasets: what are the input features for target-type nodes, and how do you assign features to the other non-target-type nodes? Thanks!
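    (For context, one common convention for featureless node types, which may or may not be what HGB uses, is to assign one-hot identity features; a minimal sketch:)

    ```python
    import torch

    # Assumption for illustration only: each node of a non-target type with
    # no raw attributes gets a distinct one-hot (identity) feature vector.
    num_nodes = 100
    onehot_feats = torch.eye(num_nodes)  # shape (num_nodes, num_nodes)
    ```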

    opened by eddiegaoo 0
  • convolution optimization

    Hi, I was checking the convolution code, and it seems there are expensive operations there that could be eliminated entirely.

    The code is:

    ```python
    # Look up one embedding per edge type and project it per attention head.
    e_feat = self.edge_emb(e_feat)
    e_feat = self.fc_e(e_feat).view(-1, self._num_heads, self._edge_feats)
    # Per-head attention scalars for edges, source nodes, and destination nodes.
    ee = (e_feat * self.attn_e).sum(dim=-1).unsqueeze(-1)
    el = (feat_src * self.attn_l).sum(dim=-1).unsqueeze(-1)
    er = (feat_dst * self.attn_r).sum(dim=-1).unsqueeze(-1)
    graph.srcdata.update({'ft': feat_src, 'el': el})
    graph.dstdata.update({'er': er})
    graph.edata.update({'ee': ee})
    # Attention logit per edge: source term + destination term + edge term.
    graph.apply_edges(fn.u_add_v('el', 'er', 'e'))
    e = self.leaky_relu(graph.edata.pop('e') + graph.edata.pop('ee'))
    ```

    Problem 1:

    ```python
    self.edge_emb = nn.Embedding(num_etypes, edge_feats)
    e_feat = self.edge_emb(e_feat)
    e_feat = self.fc_e(e_feat).view(-1, self._num_heads, self._edge_feats)
    ```

    Is it necessary to run a fully connected layer over the embeddings? As far as I understand, an embedding table can naturally learn the composed projection emb = self.fc_e(emb) on its own. This becomes even more expensive when you consider that the conv might have only ~20 edge types, yet this fully connected layer runs hundreds of thousands of times over the same 20 repeated types. (See the sketch below.)
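    A minimal sketch of this simplification (an illustration under the issue's assumptions, with hypothetical names and sizes):

    ```python
    import torch
    import torch.nn as nn

    num_etypes, num_heads, edge_feats = 20, 8, 64
    etype_ids = torch.randint(num_etypes, (100_000,))  # one type id per edge

    # Option A: drop fc_e and let the embedding table learn the projected
    # representation directly; an embedding followed by a linear layer is
    # still just one learnable vector per edge type.
    edge_emb = nn.Embedding(num_etypes, num_heads * edge_feats)
    e_feat = edge_emb(etype_ids).view(-1, num_heads, edge_feats)

    # Option B: keep fc_e, but project the ~20 type vectors once and gather,
    # instead of projecting hundreds of thousands of per-edge copies.
    small_emb = nn.Embedding(num_etypes, edge_feats)
    fc_e = nn.Linear(edge_feats, num_heads * edge_feats)
    per_type = fc_e(small_emb.weight)  # (num_etypes, num_heads * edge_feats)
    e_feat = per_type[etype_ids].view(-1, num_heads, edge_feats)
    ```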

    Problem 2:

    ```python
    self.attn_l = nn.Parameter(th.FloatTensor(size=(1, num_heads, out_feats)))
    self.attn_r = nn.Parameter(th.FloatTensor(size=(1, num_heads, out_feats)))
    self.attn_e = nn.Parameter(th.FloatTensor(size=(1, num_heads, edge_feats)))

    ee = (e_feat * self.attn_e).sum(dim=-1).unsqueeze(-1)
    el = (feat_src * self.attn_l).sum(dim=-1).unsqueeze(-1)
    er = (feat_dst * self.attn_r).sum(dim=-1).unsqueeze(-1)
    ```

    e_feat, feat_src, and feat_dst are already the outputs of a linear layer. Is it necessary to multiply them by another learned constant (called attention here)? I'd guess the linear layer can naturally absorb the same scaling. We can just say that:

    ```
    y = a*x + b              # output of the linear layer
    d*y = d*(a*x + b)        # scaling by a learned constant d
        = (d*a)*x + (d*b)
        = new_a*x + new_b    # which is just another linear layer
    ```

    If you remove these two parts, the attention can be computed simply as right + left + edge_emb (graph.apply_edges(fn.u_add_v('feat_src', 'feat_dst', 'e_feat'))), without doing all these transformations beforehand. See the sketch below.
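    A minimal sketch of the combined proposal (hypothetical layer names; per-head attention scalars come straight from small linear heads instead of attention-vector dot products):

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    num_nodes, num_edges = 1_000, 5_000
    num_etypes, in_feats, num_heads = 20, 64, 8

    # The a^T(Wx) products collapse into single small linear layers, and the
    # edge term into a plain per-type embedding lookup.
    fc_el = nn.Linear(in_feats, num_heads)            # replaces fc + attn_l
    fc_er = nn.Linear(in_feats, num_heads)            # replaces fc + attn_r
    edge_score = nn.Embedding(num_etypes, num_heads)  # replaces edge_emb + fc_e + attn_e

    h = torch.randn(num_nodes, in_feats)
    src = torch.randint(num_nodes, (num_edges,))
    dst = torch.randint(num_nodes, (num_edges,))
    etype_ids = torch.randint(num_etypes, (num_edges,))

    # Attention logit per edge and head: left term + right term + edge term.
    e = F.leaky_relu(fc_el(h)[src] + fc_er(h)[dst] + edge_score(etype_ids))
    ```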

    opened by fmellomascarenhas 1