
Heterogeneous Graph Benchmark

Revisiting, benchmarking, and refining Heterogeneous Graph Neural Networks.

Roadmap

We organize this repo by task, with one sub-folder per task. Currently, there are four tasks: node classification (NC), link prediction (LP), knowledge-aware recommendation (Recom), and text classification (TC).

Revisiting

This part corresponds to Section 3 and Table 1 of our paper.

Benchmarking and Refining

This part corresponds to Sections 4, 5, and 6 of our paper.

Note that the test labels have been randomly replaced to prevent data leakage. To obtain test scores, you must submit your predictions to our website.

For the node classification and link prediction tasks, you can submit online. For the recommendation task, the prediction files are too large to upload, so you have to evaluate offline yourself.

To list your method on the official leaderboard on the HGB website, submit your code or paper to us. Once it is verified, your method will be displayed on the leaderboard. (The request form is under development and will be available soon!)

More

This repo is under active development, so it contains some experiments beyond the paper, such as graph-based text classification. For more information, see our website. Contributions of new tasks, datasets, and methods to HGB are welcome!

We also provide an implementation of Simple-HGN in cogdl.

Citation

  • Title: Are we really making much progress? Revisiting, benchmarking and refining heterogeneous graph neural networks.
  • Authors: Qingsong Lv*, Ming Ding*, Qiang Liu, Yuxiang Chen, Wenzheng Feng, Siming He, Chang Zhou, Jianguo Jiang, Yuxiao Dong, Jie Tang.
  • In proceedings: KDD 2021.
Comments
  • Dataset about IMDB

    The IMDB dataset is multi-label, but I am wondering how to save the prediction file for evaluation, since I don't see it in run_multi.py. I would also like to know which sklearn API to use for evaluating multi-label predictions.
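    For reference, a minimal evaluation sketch with scikit-learn (the file paths and array names here are hypothetical; sklearn.metrics.f1_score accepts multi-label indicator arrays directly):

    ```python
    import numpy as np
    from sklearn.metrics import f1_score

    # Hypothetical files holding 0/1 indicator matrices of shape
    # (n_samples, n_labels): ground-truth labels and thresholded predictions.
    y_true = np.load("imdb_test_labels.npy")
    y_pred = np.load("imdb_test_preds.npy")

    # f1_score handles multi-label input when given indicator matrices.
    print("macro-f1:", f1_score(y_true, y_pred, average="macro"))
    print("micro-f1:", f1_score(y_true, y_pred, average="micro"))
    ```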

    opened by Theheavens 5
  • use /NC/benchmark/methods/GNN to run IMDB

    The results just can't reach those in the paper. When I ran IMDB, I got around macro-f1 = 0.44 and micro-f1 = 0.56. I used the parameters described in the paper: "We set 𝑑 = 64, 𝑛ℎ = 8 for all datasets. For IMDB, we set 𝑠 = 0.1 and 𝐿 = 5. We use feat = 0 for IMDB."

    I wonder why this happens. Thanks for your attention.

    opened by lanadelreyfan 1
  • about L2 norm in Simple HGN and a problem about the submission website

    Thanks for your team's good work, but I have a doubt about the final L2-norm trick used in Simple-HGN. From the paper's ablation study, the performance of Simple-HGN drops greatly without this trick (by 5-8%; it seems the high performance of Simple-HGN largely comes from it), which is quite strange.
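    For context, the trick in question L2-normalizes the final-layer output; a minimal PyTorch sketch (an illustration based on the paper's description, not the repo's exact code):

    ```python
    import torch
    import torch.nn.functional as F

    logits = torch.randn(100, 4)  # final-layer node representations

    # The "last L2 norm": rescale each node's output vector to unit length,
    # i.e. x / ||x||_2, before computing scores.
    normalized = F.normalize(logits, p=2, dim=1)
    ```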

    Also, the website https://www.biendata.xyz/competition/hgb-1 has had a problem recently. When I make a submission, it gets stuck at "evaluating" for a long time and never returns the results. Can you check it? Thank you.

    opened by SsrCode 1
  • Stuck at evaluation after submitting to node classification leaderboard

    After I made a submission to the node classification leaderboard, the interface got stuck at 100% upload with a greyed-out "Evaluating" button, never returning any results. Is there an issue with the evaluation server? Submitting to link prediction worked fine. Thanks!

    (Possible duplicate of #14 )

    opened by BarclayII 1
  • about meta-paths

    Hello. In the paper, the meta-paths for ACM are set as shown in the figure below, but the meta-paths for HAN in the code are different (https://github.com/THUDM/HGB/blob/master/NC/benchmark/methods/HAN/utils.py#L170). Do we need to change this ourselves? [image]

    I also ran the HAN code from the benchmark. With the meta-path settings in the code, the best validation macro-f1 is 0.87 and micro-f1 is 0.96; after 5 runs I submitted to biendata, and the test macro-f1 and micro-f1 are both around 0.86, still quite far from the roughly 90.8% reported in the paper. [image] If I set the meta-paths as in the paper (i.e., without PTP and with the extra citation/reference relations added), the best validation macro-f1 is 0.78 and micro-f1 is 0.90.

    I wonder what results others get when running ACM. Is there anything particular to watch out for in the ACM experiments?

    opened by NovelinUp 3
  • Question about feature preprocessing for certain datasets

    Dear authors,

    I find that this repo only provides the datasets for download but does not provide the preprocessing code for them.

    I have a question about the ACM and Freebase datasets: what are the input features for target-type nodes, and how do you assign features to the other non-target-type nodes? Thanks!
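    (For context, one common convention for featureless node types, which may or may not be what HGB uses, is to assign one-hot identity features; a minimal sketch:)

    ```python
    import torch

    # Assumption for illustration only: each node of a non-target type with
    # no raw attributes gets a distinct one-hot (identity) feature vector.
    num_nodes = 100
    onehot_feats = torch.eye(num_nodes)  # shape (num_nodes, num_nodes)
    ```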

    opened by eddiegaoo 0
  • convolution optimization

    Hi, I was checking the convolution code, and it seems there are expensive operations there that could be eliminated entirely.

    The code is:

    ```python
    # Look up one embedding per edge type and project it per attention head.
    e_feat = self.edge_emb(e_feat)
    e_feat = self.fc_e(e_feat).view(-1, self._num_heads, self._edge_feats)
    # Per-head attention scalars for edges, source nodes, and destination nodes.
    ee = (e_feat * self.attn_e).sum(dim=-1).unsqueeze(-1)
    el = (feat_src * self.attn_l).sum(dim=-1).unsqueeze(-1)
    er = (feat_dst * self.attn_r).sum(dim=-1).unsqueeze(-1)
    graph.srcdata.update({'ft': feat_src, 'el': el})
    graph.dstdata.update({'er': er})
    graph.edata.update({'ee': ee})
    # Attention logit per edge: source term + destination term + edge term.
    graph.apply_edges(fn.u_add_v('el', 'er', 'e'))
    e = self.leaky_relu(graph.edata.pop('e') + graph.edata.pop('ee'))
    ```

    Problem 1:

    ```python
    self.edge_emb = nn.Embedding(num_etypes, edge_feats)
    e_feat = self.edge_emb(e_feat)
    e_feat = self.fc_e(e_feat).view(-1, self._num_heads, self._edge_feats)
    ```

    Is it necessary to run a fully connected layer over the embeddings? As far as I understand, an embedding table can naturally learn the composed projection emb = self.fc_e(emb) on its own. This becomes even more expensive when you consider that the conv might have only ~20 edge types, yet this fully connected layer runs hundreds of thousands of times over the same 20 repeated types. (See the sketch below.)
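    A minimal sketch of this simplification (an illustration under the issue's assumptions, with hypothetical names and sizes):

    ```python
    import torch
    import torch.nn as nn

    num_etypes, num_heads, edge_feats = 20, 8, 64
    etype_ids = torch.randint(num_etypes, (100_000,))  # one type id per edge

    # Option A: drop fc_e and let the embedding table learn the projected
    # representation directly; an embedding followed by a linear layer is
    # still just one learnable vector per edge type.
    edge_emb = nn.Embedding(num_etypes, num_heads * edge_feats)
    e_feat = edge_emb(etype_ids).view(-1, num_heads, edge_feats)

    # Option B: keep fc_e, but project the ~20 type vectors once and gather,
    # instead of projecting hundreds of thousands of per-edge copies.
    small_emb = nn.Embedding(num_etypes, edge_feats)
    fc_e = nn.Linear(edge_feats, num_heads * edge_feats)
    per_type = fc_e(small_emb.weight)  # (num_etypes, num_heads * edge_feats)
    e_feat = per_type[etype_ids].view(-1, num_heads, edge_feats)
    ```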

    Problem 2:

    ```python
    self.attn_l = nn.Parameter(th.FloatTensor(size=(1, num_heads, out_feats)))
    self.attn_r = nn.Parameter(th.FloatTensor(size=(1, num_heads, out_feats)))
    self.attn_e = nn.Parameter(th.FloatTensor(size=(1, num_heads, edge_feats)))

    ee = (e_feat * self.attn_e).sum(dim=-1).unsqueeze(-1)
    el = (feat_src * self.attn_l).sum(dim=-1).unsqueeze(-1)
    er = (feat_dst * self.attn_r).sum(dim=-1).unsqueeze(-1)
    ```

    e_feat, feat_src, and feat_dst are already the outputs of a linear layer. Is it necessary to multiply them by another learned constant (called attention here)? I'd guess the linear layer can naturally absorb the same scaling. We can just say that:

    ```
    y = a*x + b              # output of the linear layer
    d*y = d*(a*x + b)        # scaling by a learned constant d
        = (d*a)*x + (d*b)
        = new_a*x + new_b    # which is just another linear layer
    ```

    If you remove these two parts, the attention can be computed simply as right + left + edge_emb (graph.apply_edges(fn.u_add_v('feat_src', 'feat_dst', 'e_feat'))), without doing all these transformations beforehand. See the sketch below.
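    A minimal sketch of the combined proposal (hypothetical layer names; per-head attention scalars come straight from small linear heads instead of attention-vector dot products):

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    num_nodes, num_edges = 1_000, 5_000
    num_etypes, in_feats, num_heads = 20, 64, 8

    # The a^T(Wx) products collapse into single small linear layers, and the
    # edge term into a plain per-type embedding lookup.
    fc_el = nn.Linear(in_feats, num_heads)            # replaces fc + attn_l
    fc_er = nn.Linear(in_feats, num_heads)            # replaces fc + attn_r
    edge_score = nn.Embedding(num_etypes, num_heads)  # replaces edge_emb + fc_e + attn_e

    h = torch.randn(num_nodes, in_feats)
    src = torch.randint(num_nodes, (num_edges,))
    dst = torch.randint(num_nodes, (num_edges,))
    etype_ids = torch.randint(num_etypes, (num_edges,))

    # Attention logit per edge and head: left term + right term + edge term.
    e = F.leaky_relu(fc_el(h)[src] + fc_er(h)[dst] + edge_score(etype_ids))
    ```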

    opened by fmellomascarenhas 1