Codes for AAAI'21 paper 'Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation'

Overview

DHCN

Codes for AAAI 2021 paper 'Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation'.

Please note that the default Google Scholar entry for our paper links to an obsolete version with incorrect experimental results.

The latest version of our paper is available at:

https://ojs.aaai.org/index.php/AAAI/article/view/16578

Environment: Python 3, PyTorch 1.6.0, NumPy 1.18.1

Datasets are available on Dropbox (https://www.dropbox.com/sh/j12um64gsig5wqk/AAD4Vov6hUGwbLoVxh3wASg_a?dl=0). The datasets are already preprocessed and serialized with pickle.

For Diginetica, the best beta value is 0.01; for Tmall, the best beta value is 0.02.

You may encounter a CUDA error at line 50 or line 74 when running our code if your NumPy and PyTorch versions differ from ours. We have not yet found a way to resolve this version problem; if you run into it, please try changing your NumPy and PyTorch versions to match ours.
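
The tracebacks in the comments below show that this failure comes from applying np.sum to a list of CUDA tensors (for example, item_embeddings = np.sum(final, 0) in model.py). As a minimal sketch of one possible workaround, assuming final is such a list of per-layer embedding tensors, the aggregation can be kept in PyTorch so the tensors never have to be copied to host memory; this is not an official patch:

    import torch

    # Hypothetical workaround (not the authors' patch): sum the per-layer embeddings with
    # torch instead of NumPy, so CUDA tensors never need to be converted to NumPy arrays.
    final = [torch.randn(100, 64) for _ in range(4)]  # stand-in for the real per-layer embeddings
    item_embeddings = torch.sum(torch.stack(final, dim=0), dim=0)
    print(item_embeddings.shape)  # torch.Size([100, 64])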

Comments
  • The labels were leaked during the testing

    Thank you for sharing your work.

    However, we found a very serious problem in DHCN: the labels are leaked during testing.

    As mentioned in the paper, a sequence splitting method is used to augment and label the dataset. Figure 1 shows an example of the data after preprocessing.

    [Figure 1: example of the augmented data after preprocessing]

    Notice that the data is not shuffled during training and testing in DHCN, which means the last item of the previous sequence is the target item of the current sequence. (The target of each sequence is shown in Figure 2.)

    [Figure 2: the target item of each augmented sequence]

    In DHCN, each batch (100 sequences) is converted into a hypergraph, and the prediction for each sequence in the current batch is based on the items in that sequence and on the hypergraph. However, the previous sequences contain both the items and the target item of the current sequence, which means that in the hypergraph the items of each sequence are directly connected to its target item (the co-occurrence of items in previous sequences gives the target item a large influence on the current prediction, according to Equation 2 in the paper).

    In my opinion, we should not make predictions with the help of neighbor sequences in the test data, unless we shuffle the dataset or do not use the sequence splitting method to augment and label it. (CSRM uses neighbor sequences to help prediction, but it shuffles the dataset before training and testing.)

    This is the result of DHCN we obtained on Diginetica after data shuffling: Prec@20: 49.14, MRR@20: 16.05
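
    A minimal sketch of the shuffling described above, assuming a hypothetical layout in which sequences is a list of item-id lists and targets holds the corresponding labels (the real pickled format may differ):

        import numpy as np

        # Sketch: permute sessions and labels together so augmented sessions derived from
        # the same original sequence no longer sit next to each other in a batch.
        def shuffle_sessions(sequences, targets, seed=2021):
            rng = np.random.RandomState(seed)
            order = rng.permutation(len(sequences))
            return [sequences[i] for i in order], [targets[i] for i in order]

        # toy usage with made-up sessions
        seqs = [[1], [1, 2], [1, 2, 3], [5], [5, 6]]
        tgts = [2, 3, 4, 6, 7]
        print(shuffle_sessions(seqs, tgts))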

    opened by Hubert161 12
  • How to generate the session embedding during the testing phase

    Hi, thanks for sharing this interesting work.

    I am confused about how session embeddings are generated during the training and testing phases. Specifically, during training we can generate a session embedding via the hypergraph, which means we use the interactions in other sessions to learn the embeddings for items within the current batch.

    When it comes to testing, we should only have access to the interactions within the current session, without the interactions from other sessions. However, when I read the code, I found that in both the training and testing phases the build function always goes through the hypergraph conv layer and then generates an embedding for the current session; see https://github.com/xiaxin1998/DHCN/blob/aeb54db1057b668dab871cd9ea712dc4e90daec6/model.py#L129

    I am not sure whether I am missing some key detail, but I find it hard to understand how the current implementation generates session embeddings during the testing phase. Would you mind clarifying?

    opened by rowedenny 10
  • Data_preprocess

    Excuse me, at present I plan to add some other information from the dataset (such as timestamps) to the session-based recommendation task to improve recommendation accuracy, and I plan to cite and compare against your DHCN model. Following what you mentioned in the paper, I have downloaded the raw Tmall data from the IJCAI-15 competition, and it contains the information I want to use. May I ask how you preprocessed the raw Tmall data? If you could send me the preprocessing code, I would be very grateful. Thank you.

    opened by whatever0228 6
  • precision and recall

    Excuse me, why is the code gone? And why is the result of running the author's code very different from that in the paper (for example, on the Diginetica dataset, P@20 is about 50 in the paper, but Recall@20 is about 17 from the code)? Is something not set correctly?

    opened by tyh7425 6
  • Abnormal code result and the warning about divide by zero

    https://github.com/xiaxin1998/DHCN/blob/b798a7ba69c95bc68209feb493b23a07ed8b57ad/util.py#L40

    A warning will appear on this line of code:

    RuntimeWarning: divide by zero encountered in true_divide
      DH = H.T.multiply(1.0/H.sum(axis=1).reshape(1, -1))
    

    There is an element with value 0 in H.sum(axis=1), which eventually leads to inf values. Doesn't this affect the model?
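
    A minimal sketch of one way to guard that division (an assumption, not a confirmed fix): replace zero row sums with 1 before taking the reciprocal, so empty rows stay all-zero instead of producing inf:

        import numpy as np
        from scipy.sparse import csr_matrix

        # Guarded version of the util.py line (hypothetical fix, not from the authors).
        H = csr_matrix(np.array([[1, 0, 1],
                                 [0, 0, 0],  # an all-zero row would otherwise give 1/0 = inf
                                 [1, 1, 0]], dtype=float))
        row_sum = np.asarray(H.sum(axis=1)).reshape(1, -1)
        row_sum[row_sum == 0] = 1.0  # guard against division by zero
        DH = H.T.multiply(1.0 / row_sum)  # same operation as util.py, but without inf entries
        print(row_sum)  # [[2. 1. 2.]] -- the zero row sum was replaced by 1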

    I only tested the model on the Diginetica dataset, using the original code and the preprocessed data provided on Dropbox. I guess the preprocessing code should be the same as SR-GNN's, but the results I obtained are abnormal. I hope you can provide some suggestions.

    opened by Nishikata97 5
  • Garbled datasets

    The datasets downloaded from the link "https://www.dropbox.com/sh/j12um64gsig5wqk/AAD4Vov6hUGwbLoVxh3wASg_a?dl=0" are garbled. I tried to solve this problem by changing the encoding format and in other ways, but it still goes wrong. Could you please send me the dataset? My email address is [email protected]. I would appreciate your reply, thank you very much!

    opened by EnterUName 2
  • I want to use the sample dataset

    I want to test my model on the sample dataset (mentioned in your code), but I did not find it at the dataset link. Where can I get it, or is the sample generated according to certain rules? It would be appreciated if you could provide it!

    opened by zhangguozheng-1 2
  • TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    (pytorch) XXX@xxxxx:/DATA/XXX/DHCN$ python main.py --dataset Tmall --beta 0.02
    Traceback (most recent call last):
      File "main.py", line 5, in <module>
        from util import Data, split_validation
    ImportError: cannot import name 'Data' from 'util' (/DATA/XXX/DHCN/util.py)
    (pytorch) XXX@cvpruser:/DATA/XXX/DHCN$ python main.py --dataset Tmall --beta 0.02
    Namespace(batchSize=100, beta=0.02, dataset='Tmall', embSize=100, epoch=30, filter=False, l2=1e-05, layer=3, lr=0.001)
    /home/XXX/anaconda3/envs/pytorch/lib/python3.8/site-packages/numpy/core/_asarray.py:102: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
      return array(a, dtype, copy=False, order=order)
    -------------------------------------------------------
    epoch:  0
    start training:  2021-09-17 10:49:17.124433
    Traceback (most recent call last):
      File "main.py", line 71, in <module>
        main()
      File "main.py", line 53, in main
        metrics, total_loss = train_test(model, train_data, test_data)
      File "/DATA/XXX/DHCN/model.py", line 180, in train_test
        targets, scores, con_loss = forward(model, i, train_data)
      File "/DATA/XXX/DHCN/model.py", line 168, in forward
        item_emb_hg, sess_emb_hgnn, con_loss = model(session_item, session_len, D_hat, A_hat, reversed_sess_item, mask)
      File "/home/XXX/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/DATA/XXX/DHCN/model.py", line 153, in forward
        session_emb_lg = self.LineGraph(self.embedding.weight, D, A, session_item, session_len)
      File "/home/XXX/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/DATA/XXX/DHCN/model.py", line 75, in forward
        session_emb_lgcn = np.sum(session, 0)
      File "<__array_function__ internals>", line 5, in sum
      File "/home/XXX/anaconda3/envs/pytorch/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 2247, in sum
        return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
      File "/home/XXX/anaconda3/envs/pytorch/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
      File "/home/XXX/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor.py", line 643, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
    
    opened by sorrowyn 2
  • TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.  Process finished with exit code 1

    Line 40 of model.py, "item_embeddings = np.sum(final, 0) / (self.layers+1)", raises: TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    Process finished with exit code 1

    opened by shuqincao 1
  • Do you have any RuntimeError when you run this code?

    In file "model.py" ,line 51, in forward item_embeddings = np.sum(final, 0)

    RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

    opened by git-x5 1
Owner
Xin Xia
PhD Candidate at The University of Queensland
A Python implementation of LightFM, a hybrid recommendation algorithm.

LightFM Build status Linux OSX (OpenMP disabled) Windows (OpenMP disabled) LightFM is a Python implementation of a number of popular recommendation al

Lyst 4.2k Jan 2, 2023
A TensorFlow recommendation algorithm and framework in Python.

TensorRec A TensorFlow recommendation algorithm and framework in Python. NOTE: TensorRec is not under active development TensorRec will not be receivi

James Kirk 1.2k Jan 4, 2023
Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems.

Persine, the Persona Engine Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface a

Jonathan Soma 87 Nov 29, 2022
ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms

ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms, including but not limited to click-through-rate (CTR) prediction, learning-to-ranking (LTR), and Matrix/Tensor Embedding. The project objective is to develop a ecosystem to experiment, share, reproduce, and deploy in real world in a smooth and easy way (Hope it can be done).

LI, Wai Yin 90 Oct 8, 2022
A framework for large scale recommendation algorithms.

A framework for large scale recommendation algorithms.

Alibaba Group - PAI 880 Jan 3, 2023
Recommendation System to recommend top books from the dataset

recommendersystem Recommendation System to recommend top books from the dataset Introduction The recom.py is the main program code. The dataset is als

Vishal karur 1 Nov 15, 2021
An open source movie recommendation WebApp build by movie buffs and mathematicians that uses cosine similarity on the backend.

Movie Pundit Find your next flick by asking the (almost) all-knowing Movie Pundit Jump to Project Source » View Demo · Report Bug · Request Feature Ta

Kapil Pramod Deshmukh 8 May 28, 2022
Books Recommendation With Python

Books-Recommendation Business Problem During the last few decades, with the rise

Çağrı Karadeniz 7 Mar 12, 2022
Bert4rec for news Recommendation

News-Recommendation-system-using-Bert4Rec-model Bert4rec for news Recommendation

saran pandian 2 Feb 4, 2022
Graph Neural Networks for Recommender Systems

This repository contains code to train and test GNN models for recommendation, mainly using the Deep Graph Library (DGL).

null 217 Jan 4, 2023
QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

QRec is a Python framework for recommender systems (Supported by Python 3.7.4 and Tensorflow 1.14+) in which a number of influential and newly state-of-the-art recommendation models are implemented. QRec has a lightweight architecture and provides user-friendly interfaces. It can facilitate model implementation and evaluation.

Yu 1.4k Dec 27, 2022
Plex-recommender - Get movie recommendations based on your current PleX library

plex-recommender Description: Get movie/tv recommendations based on your current

null 5 Jul 19, 2022
Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

COTREC Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'. Requirements: Python 3.7, Pytorch 1.6.0 Best Hype

Xin Xia 42 Dec 9, 2022
ANKIT-OS/TG-SESSION-GENERATOR-BOTbisTG-SESSION-GENERATOR-BOT a special repository. Its Is A Telegram Bot To Generate String Session

ANKIT-OS/TG-SESSION-GENERATOR-BOTbisTG-SESSION-GENERATOR-BOT a special repository. Its Is A Telegram Bot To Generate String Session

ANKIT KUMAR 1 Dec 26, 2021
Official repository for "Exploiting Session Information in BERT-based Session-aware Sequential Recommendation", SIGIR 2022 short.

Session-aware BERT4Rec Official repository for "Exploiting Session Information in BERT-based Session-aware Sequential Recommendation", SIGIR 2022 shor

Jamie J. Seol 22 Dec 13, 2022
Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation'

OD-Rec Codes for SIGIR'22 Paper 'On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation' Paper, saved teacher models and Andro

Xin Xia 11 Nov 22, 2022
FastAPI Server Session is a dependency-based extension for FastAPI that adds support for server-sided session management

FastAPI Server-sided Session FastAPI Server Session is a dependency-based extension for FastAPI that adds support for server-sided session management.

DevGuyAhnaf 5 Dec 23, 2022
Official public repository of paper "Intention Adaptive Graph Neural Network for Category-Aware Session-Based Recommendation"

Intention Adaptive Graph Neural Network (IAGNN) This is the official repository of paper Intention Adaptive Graph Neural Network for Category-Aware Se

null 9 Nov 22, 2022
You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks.

AllSet This is the repo for our paper: You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks. We prepared all codes and a subse

Jianhao 51 Dec 24, 2022
ANKIT-OS/TG-SESSION-HACK-BOT: A Special Repository.Telegram Bot Which Can Hack The Victim By Using That Victim Session

ᵀᴱᴸᴱᴳᴿᴬᴹ ᴴᴬᶜᴷ ᴮᴼᵀ The owner would not be responsible for any kind of bans due to the bot. • ⚡ INSTALLING ⚡ • • Lᴀɴɢᴜᴀɢᴇs Aɴᴅ Tᴏᴏʟs • If

ANKIT KUMAR 2 Dec 24, 2021