GAN-based Matrix Factorization for Recommender Systems

Ervin Dervishaj

Last update: Nov 6, 2022

This repository contains the datasets' splits, the source code of the experiments and their results for the paper "GAN-based Matrix Factorization for Recommender Systems" (arXiv: https://arxiv.org/abs/2201.08042) accepted at the 37th ACM/SIGAPP Symposium on Applied Computing (SAC '22).

How to use this repo

This repo is based on a version of Recsys_Course_AT_PoliMi. To run the code and experiments, first set up a Python environment. Any environment manager will work, but we suggest conda because it makes our environment easier to recreate when using a GPU: conda can help install CUDA and the CUDA toolkit needed to use the available GPU(s). We highly recommend running this repo on a GPU since GAN-based recommenders require long training times.

Conda

Run the following command to create a new environment with Python 3.6.8 and install all requirements in file conda_requirements.txt:

conda create -n <name-env> python==3.6.8 --file conda_requirements.txt

The file conda_requirements.txt also contains the packages cudatoolkit==9.0 and cudnn==7.1.2. These are managed by conda and are installed separately from any other versions you might already have.

Activate the newly created environment:

conda activate <name-env>

Then install the following packages with pip inside the activated environment, since they are not found in the main channel of conda and the conda-forge channel only holds old versions of them:

pip install scikit-optimize==0.7.2 telegram-send==0.25

Virtualenv & Pip

First download and install Python 3.6.8 from python.org. Then install virtualenv:

python -m pip install --user virtualenv

Now create a new environment with virtualenv (by default it uses the Python version it was installed with):

virtualenv <path-to-new-env>

Activate the new environment with:

source <path-to-new-env>/bin/activate

Install the required packages through the file pip_requirements.txt:

pip install -r pip_requirements.txt

Note that if you intend to use a GPU and install the required packages with virtualenv and pip, you need to install cudatoolkit==9.0 and cudnn==7.1.2 separately, following the instructions for your GPU on nvidia.com.

Before running any experiment or algorithm you need to compile the Cython code that some of the recommenders rely on. You can compile all of it with the following command:

python run_compile_all_cython.py

N.B. You need to have the following packages installed before compiling: gcc and python3-dev.

N.B. Since the experiments can take a long time, the code notifies you on your Telegram account when the experiments start and end. Either configure telegram-send as indicated on https://pypi.org/project/telegram-send/#installation or delete the lines containing telegram-send inside RecSysExp.py.

Running experiments

All results presented in the paper are already provided in this repository. In case you want to re-run the experiments, below you can find the steps for each one of them.

Comparison with baselines¹

In order to run all the comparisons with the baselines use the file RecSysExp.py. First compute for each dataset the 5 mutually exclusive sets:

- Training set: once the best hyperparameters of the recommender are found, it is trained one final time on this set.
- Training set small: the recommender is first trained on this small training set with the aim of finding the best hyperparameters.
- Early stopping set: validation set used to apply early stopping during hyperparameter tuning.
- Validation set: the recommender with the current hyperparameter values is tested against this set.
- Test set: once the best hyperparameters are found, the recommender is finally tested on this set. The results presented are the ones on this set.
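
To illustrate what "mutually exclusive" means for these sets, here is a minimal sketch that partitions interaction indices into five non-overlapping groups. It takes the description above literally. The function name `disjoint_split` and the proportions in `FRACTIONS` are hypothetical, and the splitting logic in RecSysExp.py may well differ (for example by splitting per user or by holding sets out of one another).

```python
import random

# Hypothetical proportions; the values used by the repository are not stated here.
FRACTIONS = {
    "train": 0.4,
    "train_small": 0.2,
    "early_stopping": 0.1,
    "validation": 0.1,
    "test": 0.2,
}


def disjoint_split(n_interactions, fractions, seed=0):
    """Partition the indices 0..n_interactions-1 into named, non-overlapping sets."""
    order = list(range(n_interactions))
    random.Random(seed).shuffle(order)  # seeded shuffle for reproducibility

    names = list(fractions)
    splits, start = {}, 0
    for position, name in enumerate(names):
        if position == len(names) - 1:
            end = n_interactions  # last set absorbs any rounding remainder
        else:
            end = start + int(round(fractions[name] * n_interactions))
        splits[name] = order[start:end]
        start = end
    return splits


splits = disjoint_split(1000, FRACTIONS, seed=1)
for name, indices in splits.items():
    print(name, len(indices))
```

Each index ends up in exactly one set, so no interaction used for tuning can leak into the test set.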

Compute the splits for each dataset with the following command:

python RecSysExp.py --build-dataset <dataset-name>

To run the tuning of a recommender use the following command:

python RecSysExp.py <dataset-name> <recommender-name> [--user | --item] [<similarity-type>]

- dataset-name is a value among: 1M, hetrec2011, LastFM.
- recommender-name is a value among: TopPop, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, CAAE, CFGAN, GANMF.
- --user or --item is a flag used only for GAN-based recommenders. It denotes the user/item-based training procedure for the selected recommender.
- similarity-type is a value among: cosine, jaccard, tversky, dice, euclidean, asymmetric. It is used only for ItemKNN recommender.

All results, best hyperparameters and dataset splits are saved in the experiments directory.
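
If you want to script several tuning runs, the rules above can be captured in a small helper that builds the argument list for each run. This is only a sketch: the helper name `tuning_command` and its keyword arguments are hypothetical, and it assumes that CAAE, CFGAN and GANMF are the GAN-based recommenders. It uses only the arguments documented above.

```python
DATASETS = ("1M", "hetrec2011", "LastFM")
RECOMMENDERS = ("TopPop", "PureSVD", "ALS", "SLIMBPR", "ItemKNN",
                "P3Alpha", "CAAE", "CFGAN", "GANMF")
GAN_RECOMMENDERS = ("CAAE", "CFGAN", "GANMF")  # assumption: these are the GAN-based ones
SIMILARITIES = ("cosine", "jaccard", "tversky", "dice", "euclidean", "asymmetric")


def tuning_command(dataset, recommender, mode=None, similarity=None):
    """Build the argv list for one RecSysExp.py tuning run."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %s" % dataset)
    if recommender not in RECOMMENDERS:
        raise ValueError("unknown recommender: %s" % recommender)

    cmd = ["python", "RecSysExp.py", dataset, recommender]
    # --user / --item only applies to GAN-based recommenders.
    if recommender in GAN_RECOMMENDERS and mode is not None:
        if mode not in ("user", "item"):
            raise ValueError("mode must be 'user' or 'item'")
        cmd.append("--" + mode)
    # The similarity type only applies to ItemKNN.
    if recommender == "ItemKNN" and similarity is not None:
        if similarity not in SIMILARITIES:
            raise ValueError("unknown similarity: %s" % similarity)
        cmd.append(similarity)
    return cmd


print(" ".join(tuning_command("1M", "GANMF", mode="user")))
print(" ".join(tuning_command("LastFM", "ItemKNN", similarity="cosine")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.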

Testing on test set with best hyperparameters

In order to test each tuned recommender on the test set (which is created when tuning the hyperparameters) run the following command:

python RunBestParameters.py <dataset-name> <recommender-name> [--user | --item] [<similarity-type>] [--force] [--bp <best-params-dir>]

- dataset-name is a value among: 1M, hetrec2011, LastFM.
- recommender-name is a value among: TopPop, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha, CAAE, CFGAN, GANMF.
- --user or --item is a flag used only for GAN-based recommenders. It denotes the user/item-based training procedure for the selected recommender.
- similarity-type is a value among: cosine, jaccard, tversky, dice, euclidean, asymmetric. It is used only for ItemKNN recommender.
- --force is a flag that forces the computation of the results on the test set. By default, if the result for the tuple (dataset, recommender) already exists in the test_results directory, the computation is not performed.
- --bp sets the directory where the best parameters (best_params.pkl) are located for this combination of (dataset, recommender). By default it is the experiments directory.

The results are saved in the test_results directory.
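
The interplay of --force and --bp can be pictured with the sketch below. It is not the code of RunBestParameters.py: the helper names, the result file name and the assumption that best_params.pkl holds a pickled dict are all hypothetical. Only the file name best_params.pkl and the skip-unless-forced rule come from the description above.

```python
import os
import pickle
import tempfile


def load_best_params(bp_dir):
    """Read best_params.pkl from bp_dir (assumed here to contain a dict)."""
    with open(os.path.join(bp_dir, "best_params.pkl"), "rb") as handle:
        return pickle.load(handle)


def should_compute(result_file, force=False):
    """Skip an existing result unless the run is forced."""
    return force or not os.path.exists(result_file)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the output of a tuning run; the key is made up.
    with open(os.path.join(tmp, "best_params.pkl"), "wb") as handle:
        pickle.dump({"num_factors": 10}, handle)
    params = load_best_params(tmp)

    result_file = os.path.join(tmp, "result.txt")  # hypothetical file name
    first = should_compute(result_file)             # no result yet
    open(result_file, "w").close()                  # pretend the test run finished
    second = should_compute(result_file)            # result exists, not forced
    forced = should_compute(result_file, force=True)

print(params, first, second, forced)
```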

Ablation study

To run the ablation study, use the script AblationStudy.py as follows:

python AblationStudy.py <dataset-name> [binGANMF | feature-matching [--user | --item]]

- dataset-name is a value among: 1M, hetrec2011, LastFM.
- binGANMF runs the first ablation study, the GANMF model with a binary classifier discriminator. This tunes the recommender with RecSysExp.py and then evaluates it with RunBestParameters.py on the test set.
- --user or --item is a flag that sets the training procedure for binGANMF recommender.
- feature-matching runs the second ablation study, the effect of the feature matching loss and the user-user similarity heatmaps. The results are saved in the feature_matching directory.
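
For readers unfamiliar with the two quantities in the second ablation, the sketch below shows one common form of each. The feature matching loss is written in its generic form, the squared L2 distance between the mean feature vectors of a real and a generated batch. The exact loss used by GANMF is defined in the paper and may differ. The similarity matrix uses cosine similarity, which is an assumption about how the heatmaps are computed. Both function names are hypothetical.

```python
import math


def feature_matching_loss(real_feats, fake_feats):
    """Squared L2 distance between the mean feature vectors of two batches."""
    dim = len(real_feats[0])
    mean_real = [sum(row[k] for row in real_feats) / len(real_feats) for k in range(dim)]
    mean_fake = [sum(row[k] for row in fake_feats) / len(fake_feats) for k in range(dim)]
    return sum((r - f) ** 2 for r, f in zip(mean_real, mean_fake))


def cosine_similarity_matrix(profiles):
    """User-user cosine similarities; each row of `profiles` is one user vector."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in profiles]
    sim = []
    for i, a in enumerate(profiles):
        sim_row = []
        for j, b in enumerate(profiles):
            dot = sum(x * y for x, y in zip(a, b))
            denom = norms[i] * norms[j]
            sim_row.append(dot / denom if denom else 0.0)  # guard empty profiles
        sim.append(sim_row)
    return sim


# Toy discriminator features for a real and a generated batch.
real = [[1.0, 0.0], [3.0, 2.0]]
fake = [[2.0, 1.0], [2.0, 3.0]]
loss = feature_matching_loss(real, fake)

# Toy user profiles: users 0 and 1 are identical, user 2 shares no items with them.
sim = cosine_similarity_matrix([[1, 0, 1], [1, 0, 1], [0, 1, 0]])
print(loss, sim)
```

A matrix like `sim` is what a heatmap would visualise: similar users produce bright off-diagonal cells, while unrelated users stay near zero.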

MF model of GANMF

To run the qualitative study on the MF learned by GANMF, use the script MFLearned.py as follows:

python MFLearned.py

It executes both experiments and the results are saved in the latent_factors directory.

¹ For the baselines Top Popular, PureSVD, ALS, SLIMBPR, ItemKNN, P3Alpha and for model evaluation, we have used the implementations from Recsys_Course_AT_PoliMi.


