TensorFlow implementation of Self-supervised Graph Learning for Recommendation

This is our TensorFlow implementation for our SIGIR 2021 paper:

Jiancan Wu, Xiang Wang, Fuli Feng, Xiangnan He, Liang Chen, Jianxun Lian, and Xing Xie. 2021. Self-supervised Graph Learning for Recommendation. Paper in arXiv.

Environment Requirement

The code runs well under Python 3.7.7. The required packages are as follows:

  • Tensorflow-gpu == 1.15.0
  • numpy == 1.19.1
  • scipy == 1.5.2
  • pandas == 1.1.1
  • cython == 0.29.21

Quick Start

First, compile the C++ evaluator with the following command:

python setup.py build_ext --inplace

If the compilation succeeds, the C++ evaluator is called automatically. Otherwise, the Python evaluator is used.

Note that the C++ implementation is much faster than the Python one.
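The fallback behaviour described above can be pictured with a small sketch. This is only an illustration of the general "try the compiled extension, else use Python" pattern; the module name `cpp_evaluator_hypothetical` and the toy precision metric are assumptions made for the example and are not names from this repository.

```python
import importlib


def load_evaluator(compiled_module="cpp_evaluator_hypothetical"):
    """Return (backend_name, evaluate_fn), preferring a compiled extension if importable."""
    try:
        # Hypothetical module name: stands in for the extension built by setup.py.
        module = importlib.import_module(compiled_module)
        return "cpp", module.evaluate
    except ImportError:
        # Fallback: a tiny pure-Python precision@k, standing in for the slower evaluator.
        def evaluate(ranked_items, ground_truth, k=20):
            hits = sum(1 for item in ranked_items[:k] if item in ground_truth)
            return hits / float(k)

        return "python", evaluate


backend, evaluate = load_evaluator()
print(backend, evaluate([3, 1, 7, 9], {1, 9}, k=4))
```

Because the hypothetical compiled module does not exist, the sketch takes the Python branch, which mirrors what happens when the build step is skipped or fails.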

For further details, please refer to NeuRec.

Second, specify the dataset and recommender in the configuration file NeuRec.properties.

Model-specific hyperparameters are in the configuration file ./conf/SGL.properties.

Some important hyperparameters (taking a 3-layer SGL-ED as an example):

yelp2018 dataset


amazon-book dataset


ifashion dataset


Finally, run main.py in an IDE or from the command line:

python main.py
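  • A few questions about the code


    First of all, thank you very much for sharing the SGL code. While reading it today, I found that I do not fully understand the sub_mat dict. Could you add some comments to explain it? This would mean a lot to a beginner. Thank you!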

        with tf.name_scope("input_data"):
            self.users = tf.placeholder(tf.int32, shape=(None,))
            self.pos_items = tf.placeholder(tf.int32, shape=(None,))
            self.neg_items = tf.placeholder(tf.int32, shape=(None,))
            # Placeholders for the two augmented subgraphs, stored as sparse triplets
            self.sub_mat = {}
            if self.aug_type in [0, 1]:
                # 0: Node Dropout; 1: Edge Dropout (one subgraph pair shared by all layers)
                self.sub_mat['adj_values_sub1'] = tf.placeholder(tf.float32)
                self.sub_mat['adj_indices_sub1'] = tf.placeholder(tf.int64)
                self.sub_mat['adj_shape_sub1'] = tf.placeholder(tf.int64)
                self.sub_mat['adj_values_sub2'] = tf.placeholder(tf.float32)
                self.sub_mat['adj_indices_sub2'] = tf.placeholder(tf.int64)
                self.sub_mat['adj_shape_sub2'] = tf.placeholder(tf.int64)
            else:
                # 2: Random Walk (a separate subgraph pair for each layer k)
                for k in range(1, self.n_layers + 1):
                    self.sub_mat['adj_values_sub1%d' % k] = tf.placeholder(tf.float32, name='adj_values_sub1%d' % k)
                    self.sub_mat['adj_indices_sub1%d' % k] = tf.placeholder(tf.int64, name='adj_indices_sub1%d' % k)
                    self.sub_mat['adj_shape_sub1%d' % k] = tf.placeholder(tf.int64, name='adj_shape_sub1%d' % k)
                    self.sub_mat['adj_values_sub2%d' % k] = tf.placeholder(tf.float32, name='adj_values_sub2%d' % k)
                    self.sub_mat['adj_indices_sub2%d' % k] = tf.placeholder(tf.int64, name='adj_indices_sub2%d' % k)
                    self.sub_mat['adj_shape_sub2%d' % k] = tf.placeholder(tf.int64, name='adj_shape_sub2%d' % k)
    opened by perveil 6
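One reading of the snippet quoted in this issue (an interpretation of the placeholder names, not the authors' explanation) is that each augmented subgraph is fed as a sparse triplet of indices, values, and shape, the three pieces a `tf.SparseTensor` is built from. The NumPy-only sketch below shows how such a triplet could be produced for edge dropout; the function name, the symmetric normalization, and the toy data are assumptions made for illustration.

```python
import numpy as np


def edge_dropout_triplet(user_idx, item_idx, n_users, n_items, drop_ratio, rng):
    """Build (indices, values, shape) of a normalized adjacency after edge dropout."""
    keep = rng.random(len(user_idx)) >= drop_ratio      # keep each edge with prob 1 - drop_ratio
    u = user_idx[keep]
    i = item_idx[keep] + n_users                        # item nodes are numbered after user nodes
    rows = np.concatenate([u, i])                       # symmetric graph: user->item and item->user
    cols = np.concatenate([i, u])
    n_nodes = n_users + n_items
    degree = np.bincount(rows, minlength=n_nodes).astype(np.float64)
    d_inv_sqrt = np.zeros(n_nodes)
    nonzero = degree > 0
    d_inv_sqrt[nonzero] = degree[nonzero] ** -0.5       # isolated nodes keep a factor of 0
    values = (d_inv_sqrt[rows] * d_inv_sqrt[cols]).astype(np.float32)
    indices = np.stack([rows, cols], axis=1).astype(np.int64)
    shape = np.array([n_nodes, n_nodes], dtype=np.int64)
    return indices, values, shape


rng = np.random.default_rng(0)
users = np.array([0, 0, 1, 2])                          # toy interactions (hypothetical data)
items = np.array([0, 1, 1, 2])
indices, values, shape = edge_dropout_triplet(users, items, 3, 3, 0.25, rng)
print(indices.shape, values.dtype, shape)
```

Under this reading, calling the function twice with independent randomness would give the `sub1` and `sub2` triplets, and the random-walk variant would repeat that once per layer, which matches the per-layer keys in the quoted code.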
  • Dataset



    opened by BOTAK0803 4
  • RuntimeWarning: divide by zero encountered in power d_inv = np.power(rowsum, -0.5).flatten()



    I encountered this warning while running your code. Do I need to add a check here, or is there a problem with my hyperparameter settings? I hope to get your guidance. Thank you!

    opened by OneSimplePerson 2
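The warning in this issue is what NumPy emits when `np.power` raises 0 to a negative power, which would happen if some row of the adjacency matrix sums to zero (for example, a node left without edges). A minimal sketch of one common guard is shown below; the helper name is hypothetical and this is not necessarily how the repository handles the case.

```python
import numpy as np


def safe_inv_sqrt_degree(rowsum):
    """Return rowsum ** -0.5 with zero-degree entries mapped to 0 instead of inf."""
    rowsum = np.asarray(rowsum, dtype=np.float64).flatten()
    d_inv = np.zeros_like(rowsum)
    nonzero = rowsum > 0
    # Only exponentiate strictly positive degrees, so no division by zero occurs.
    d_inv[nonzero] = np.power(rowsum[nonzero], -0.5)
    return d_inv


d_inv = safe_inv_sqrt_degree([4.0, 0.0, 1.0])
print(d_inv)
```

Mapping the zero-degree entries to 0 means an isolated node simply contributes nothing to the normalized adjacency.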
  • Question about the projection


    Hello, have you ever tried adding a projection head (e.g., an MLP) as in SimCLR? I have added such projections, but they do not help: performance is worse and convergence is relatively slow. Do you have any suggestions for a projection function in SGL?

    opened by hotchilipowder 2
  • Question about computing the batch loss


    For the various losses accumulated over each batch, why are they divided by data_iter.num_trainings (the size of the dataset) when they are printed?


    opened by hhmy27 2
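One possible explanation for the question in this issue, offered as an assumption rather than a statement about the repository's code: if each batch loss is a sum over the samples in the batch (not a mean), then dividing the accumulated total by the number of training instances yields the average loss per instance. The sketch below checks that arithmetic on made-up numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
per_sample_loss = rng.random(10)        # hypothetical losses for 10 training instances
batch_size = 4

total = 0.0
for start in range(0, len(per_sample_loss), batch_size):
    batch = per_sample_loss[start:start + batch_size]
    total += batch.sum()                # assumption: each batch loss is a SUM, not a mean

# Dividing by the number of instances recovers the per-instance average.
avg_by_instances = total / len(per_sample_loss)
print(np.isclose(avg_by_instances, per_sample_loss.mean()))
```

If the batch losses were means instead, the natural divisor would be the number of batches, so which divisor is appropriate depends on how the loss is reduced inside the model.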
  • Question about sec 4.3.3


    Hello, I have been working on SGL recently. Thank you for offering this project. Sec. 4.3.3 says:

    we contaminate the training set by adding a certain proportion of adversarial examples (i.e., 5%, 10%, 15%, 20% negative user-item interactions), while keeping the testing set unchanged. Figure 6 shows the results on Yelp2018 and Amazon-Book datasets.

    May I ask how to obtain the adversarial examples? Is it just a matter of adding random links?

    I use amazon-book (keeping ratings > 3) and add the interactions with a score of 1 or 2 as adversarial examples, but I do not observe such performance degradation.

    opened by hotchilipowder 2
  • Is it normal for the training speed to be slow?


    Something strange happened when I ran the code. I ran it on the yelp2018 dataset, but each epoch took about 15 minutes while my GPU was mostly idle. Is it normal for training to be this slow, or is there something I have missed?
    PS: I have compiled and used the Cython/C++ implementation for evaluation.

    opened by KevinChow666 2
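  • Ratio between the self-supervised auxiliary task loss and the supervised recommendation task loss


    Hello! Thank you very much for sharing the code. I noticed that the paper says your method is independent of the graph model. Have you run experiments on the ratio between the self-supervised auxiliary task loss and the supervised recommendation task loss? At what numerical ratio between the two losses does performance improve the most? Perhaps you could also share some ideas for tuning these parameters. Looking forward to your reply!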

    opened by XUPT-guoruihan 2
  • conf error:'./conf/MF.properties' is empty!


    Hello, I have changed my root path, but another error occurred.

    Here is the traceback:

    Traceback (most recent call last):
      File "/(#my root path#)/SGL/main.py", line 22, in <module>
        conf = Configurator(root_folder + "NeuRec.properties", default_section="hyperparameters")
      File "/(#my root path#)/SGL/util/configurator.py", line 68, in __init__
        self.alg_arg = self._read_config_file(arg_file)
      File "/(#my root path#)/SGL/util/configurator.py", line 88, in _read_config_file
        raise ValueError("'%s' is empty!" % filename)
    ValueError: './conf/MF.properties' is empty!

    opened by wangyifeibeijing 2
  • Error when running


    Hello, I have been reading your paper recently and tried running the code, but it fails with the following error:

    Traceback (most recent call last):
      File "main.py", line 23, in <module>
        seed = conf["seed"]
      File "/self-sup/SGL/util/configurator.py", line 128, in __getitem__
        raise KeyError("There are not the parameter named '%s'" % item)
    KeyError: "There are not the parameter named 'seed'"

    How can I solve this problem? Thank you.

    opened by lijhong 2
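The KeyError in this issue is raised when a parameter such as `seed` is looked up but is not present in the loaded configuration. The sketch below uses Python's standard `configparser` to show how a missing key can be detected before it is used. It is an illustration only: the project's own Configurator may parse files differently, and the file contents and the list of required keys here are hypothetical.

```python
import configparser

# Hypothetical file contents; the real NeuRec.properties may use different keys.
text = """
[hyperparameters]
recommender = SGL
"""

# With default_section set, the [hyperparameters] block becomes the parser defaults.
parser = configparser.ConfigParser(default_section="hyperparameters")
parser.read_string(text)
options = dict(parser.defaults())

required = ["recommender", "seed"]      # hypothetical list of required keys
missing = [key for key in required if key not in options]
print(missing)
```

In this toy file `seed` is reported as missing, so adding a line for it to the configuration (or passing it on the command line, if the program supports that) would be the first thing to check.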
  • Relationship between the final result and the two subgraph embeddings



    opened by FreddyGao 2
  • Detailed proof of Equation 15 in the paper


    Thanks for your great work. I saw the proofs of Equations 12, 13, and 14 in your supplementary material, but I still do not understand the detailed derivation of Equation 15, that is, why the L2 norm of c(v) is proportional to the following term: [image]

    opened by Lukangkang123 0
  • How to generate adversarial examples?


    Hi, thanks for your great work.

    I am confused about the experiment on robustness to noisy interactions in this paper.

    Towards this end, we contaminate the training set by adding a certain proportion of adversarial examples (i.e., 5%, 10%, 15%, 20% negative user-item interactions), while keeping the testing set unchanged.

    I tried sampling from interactions that do not appear in train.txt or test.txt, but I did not find much difference between LightGCN and SGL. I wonder whether it is appropriate to generate adversarial examples in this way.

    Looking forward to your reply.

    opened by FinchNie 1
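The sampling approach described in this issue (drawing user-item pairs that appear in neither the training nor the test set) can be sketched as follows. Whether this matches the procedure used in the paper is not stated in the source; the function name and the toy data are hypothetical.

```python
import numpy as np


def sample_noisy_interactions(observed, n_users, n_items, ratio, rng):
    """Sample about ratio * len(observed) (user, item) pairs not present in `observed`."""
    observed = set(observed)
    target = int(round(ratio * len(observed)))
    if target > n_users * n_items - len(observed):
        raise ValueError("not enough unobserved pairs to sample from")
    noisy = set()
    while len(noisy) < target:
        pair = (int(rng.integers(n_users)), int(rng.integers(n_items)))
        if pair not in observed:        # reject pairs already in train or test
            noisy.add(pair)
    return sorted(noisy)


rng = np.random.default_rng(0)
observed = [(u, u % 5) for u in range(10)]   # toy union of train and test interactions
noisy = sample_noisy_interactions(observed, n_users=10, n_items=5, ratio=0.2, rng=rng)
print(noisy)
```

The sampled pairs would then be appended to the training set only, leaving the test set unchanged as the quoted passage requires.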
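  • A question about a point in the paper


    It concerns the complexity of the self-supervised loss. The derivation given in the paper is: within one batch, numerator O(Bd) + denominator O(BMd) -> the total complexity for one batch is O(|E|d(2+|V|))

    I do not quite understand this derivation. I have not read the source code yet, so I do not know how it is actually implemented. My own derivation is: one epoch: user = bd + bmd = b(m+1)d; item = bd + bid = b(i+1)d; user + item = b(m+2+i)d -> O(bd(2+|V|)). The difference is whether the result is bd(2+|V|) or |E|d(2+|V|). I am a beginner, so the question may be a bit naive; I hope you can help. I was able to follow the rest of the paper.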

    opened by ithok 2
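One way to reconcile the two expressions in this issue, under stated assumptions, is that the issue author's count is per batch while the paper's figure is per epoch. Assume B is the batch size, d the embedding size, M and I the numbers of users and items with |V| = M + I, and |E| the number of training interactions, so that one epoch contains |E|/B batches. These definitions are assumptions made for the sketch, not a quotation from the paper.

```latex
\begin{align*}
\text{user side, one batch:}\quad & O(Bd) + O(BMd) = O\big(Bd(1+M)\big) \\
\text{item side, one batch:}\quad & O(Bd) + O(BId) = O\big(Bd(1+I)\big) \\
\text{both sides, one batch:}\quad & O\big(Bd(2+M+I)\big) = O\big(Bd(2+|V|)\big) \\
\text{one epoch, } |E|/B \text{ batches:}\quad & \frac{|E|}{B}\cdot O\big(Bd(2+|V|)\big) = O\big(|E|\,d\,(2+|V|)\big)
\end{align*}
```

Under these assumptions the per-batch cost agrees with the derivation in the issue, and multiplying by the number of batches gives the |E|d(2+|V|) form.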
  • some code errors


    Hello, I have recently been trying your method, but the code reports the following error:

    (tensor) algroup@algroup:~/lxj/SGL-main$ python main.py --recommender=SGL
    seed= 2021
    WARNING:tensorflow:From main.py:27: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

    split and save data...
    2022-03-07 10:08:39.052: amazon-book_given_u0_i0
    2022-03-07 10:08:39.053: Dataset name: amazon-book
    The number of users: 52643
    The number of items: 91599
    The number of ratings: 2984108
    Average actions of users: 56.69
    Average actions of items: 32.58
    The sparsity of the dataset: 99.938115%
    Item degree grouping...
    Traceback (most recent call last):
      File "main.py", line 33, in <module>
        dataset = Dataset(conf)
      File "/home/algroup/lxj/SGL-main/data/dataset.py", line 37, in __init__
        self._group_item_by_popularity()
      File "/home/algroup/lxj/SGL-main/data/dataset.py", line 262, in _group_item_by_popularity
        self.item_group[i] = i_degree[self.item_group_idx == i]
    AttributeError: 'Dataset' object has no attribute 'item_group'

    opened by libuyan-nuaa 9
