A framework for large scale recommendation algorithms.

Overview

Introduction to EasyRec

What is EasyRec?

EasyRec is an easy-to-use framework for recommendation.

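EasyRec aims to be an easy-to-use deep learning recommendation framework for industry, supporting large-scale training, evaluation, export, and deployment. It implements industry-leading models for ranking, retrieval, and multi-objective learning, and supports hyperparameter search, which significantly reduces the complexity and effort of modeling.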

Why EasyRec?

Run everywhere

Diversified input data

Simple to config

  • Flexible feature config and simple model config
  • Efficient and robust feature generation (used in Taobao)
  • Nice web interface in development

It is smart

Large scale and easy deployment

  • Support large scale embedding, incremental saving
  • Many parallel strategies: ParameterServer, Mirrored, MultiWorker
  • Easy deployment to EAS: automatic scaling, easy monitoring
  • Consistency guarantee between training and serving

A variety of models

Easy to customize

Get Started

  • Download
    git clone https://github.com/AlibabaPAI/EasyRec.git
    wget https://easyrec.oss-cn-beijing.aliyuncs.com/data/easyrec_data_20210818.tar.gz
    sh scripts/gen_proto.sh
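
The download step above leaves a `.tar.gz` archive on disk. As a minimal sketch, assuming the archive sits in the current directory, it can be unpacked with Python's standard library. The helper name `extract_archive` is hypothetical and not part of EasyRec; the contents and expected location of the archive are not specified here.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Unpack a .tar.gz archive into dest_dir and return its top-level names.

    Hypothetical helper for illustration; it is not an EasyRec API.
    """
    dest = Path(dest_dir).resolve()
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return sorted({m.name.split("/")[0] for m in members})
```

For the file fetched above, the call would be `extract_archive("easyrec_data_20210818.tar.gz")`. The path check is a deliberate precaution against archives whose entries point outside the target directory.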
Comments
  • update hpo nni

    1. update hpo nni
    • merge oss_config, odps_config, easyrec_cmd_config, metric_config -> config
    • add exit code
    • add easyrec_cmd compatibility ('', "", whitespace)
    • update log
    • exit when an error occurs
    2. update docs
    ci_test_passed ci_py3_test_passed ci_py3_tf25_test_passed 
    opened by yjjinjie 15
  • AttributeError: module 'tensorflow._api.v2.data' has no attribute 'TableRecordDataset'

    When running the command to train the model:

    CUDA_VISIBLE_DEVICES=0 python3 -m easy_rec.python.train_eval --pipeline_config_path deepfm.config
    

    Something went wrong

    AttributeError: module 'tensorflow._api.v2.data' has no attribute 'TableRecordDataset'
    

    The operating environment is as follows:

    os: Ubuntu 16.04
    python: 3.8.15
    tensorflow-gpu: 2.11.0
    easy-rec: 0.6.0 (installed from git)
    
    opened by aiot-tech 2
  • [bugfix]: incr record add support for int32 indices; refactor embedding name to ids mapping

    Add int32 indices support to incr record; refactor the embedding name-to-ids mapping to ensure online/offline consistency; make the DataHub input filter out inactive input shards and retry when the network is unstable.

    ci_test_passed ci_py3_test_passed 
    opened by chengmengli06 7
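
For the `TableRecordDataset` AttributeError reported above, one possible cause, which the report itself does not confirm, is that the installed TensorFlow build does not provide `tf.data.TableRecordDataset`. A minimal sketch for checking this before launching training follows. The helper `has_attr_path` is hypothetical and not part of EasyRec or TensorFlow, and the demonstration uses a stand-in object so that it runs without TensorFlow installed.

```python
from types import SimpleNamespace


def has_attr_path(root, dotted_path):
    """Return True if every attribute in dotted_path resolves on root.

    Hypothetical helper for illustration only.
    """
    obj = root
    for name in dotted_path.split("."):
        if not hasattr(obj, name):
            return False
        obj = getattr(obj, name)
    return True


# Stand-in for a TensorFlow module that lacks TableRecordDataset (an assumption
# made for this demo; a real check would pass the imported tensorflow module).
fake_tf = SimpleNamespace(data=SimpleNamespace(Dataset=object))
print(has_attr_path(fake_tf, "data.Dataset"))
print(has_attr_path(fake_tf, "data.TableRecordDataset"))
```

With TensorFlow installed, the same check would be `import tensorflow as tf` followed by `has_attr_path(tf, "data.TableRecordDataset")`. A `False` result would suggest choosing an input type in the pipeline config that does not depend on that dataset class.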
Releases(v0.6.0)
  • v0.6.0(Nov 28, 2022)

    What's Changed

    • [bugfix]: update python publish ci yml by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/294
    • [feat]: CSVPredictor support embedding input by @chenglongliu123 in https://github.com/alibaba/EasyRec/pull/298
    • [doc]: update nni hpo docs by @yjjinjie in https://github.com/alibaba/EasyRec/pull/299
    • [doc]: add docs for ev and sample_weight by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/305
    • [bugfix]: Fix dist eval ps high memory bug by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/302
    • update hpo docs and remove tensorflow in runtime.txt by @yjjinjie in https://github.com/alibaba/EasyRec/pull/307
    • [bugfix]: fix negative sample using hive input bug by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/309
    • [bugfix]: fix GAUC/Session AUC metric when the id field is not configured in feature groups by @yangxudong in https://github.com/alibaba/EasyRec/pull/315
    • [bugfix]: improve ckpt restore performance by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/311
    • [docs] update fine_tune_checkpoint specification by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/317
    • [bugfix]: fix share input bug by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/313
    • [feat]: add udf for label by @dawn310826 in https://github.com/alibaba/EasyRec/pull/303

    Full Changelog: https://github.com/alibaba/EasyRec/compare/v0.5.6...v0.6.0

    easy_rec-0.6.0-py2.py3-none-any.whl(15.46 MB)
  • v0.5.6(Oct 15, 2022)

    What's Changed

    • [bugfix]: fix load_fg_json_to_config, make it reentrant by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/204

    • [bugfix]: fix slow node crash in hive predictor by @dawn310826 in https://github.com/alibaba/EasyRec/pull/220

    • [bugfix]: fix bug of softmax_loss_with_negative_mining loss by @yangxudong in https://github.com/alibaba/EasyRec/pull/237

    • [bugfix]: fix bug for distribute eval worker hangs by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/252

    • [bugfix]: fix embedding variable restore bug by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/234

    • [bugfix]: fix share not used bug by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/233

    • [bugfix]: fix bug of sequence feature rtp input by @yangxudong in https://github.com/alibaba/EasyRec/pull/258

    • [bugfix]: fix bug for sequence feature wide and deep by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/270

    • [bugfix]: export user_emb and item_emb used to calculate similarity by @cosmozhang1995 in https://github.com/alibaba/EasyRec/pull/277

    • [feature]: add docker file by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/211

    • [feature]: support gzipped csv files by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/203

    • [feature]: add hive rtp input by @dawn310826 in https://github.com/alibaba/EasyRec/pull/200

    • [feature]: add ngpu train scripts by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/213

    • [feature]: add hadoop env in Dockerfile by @tiankongdeguiji in https://github.com/alibaba/EasyRec/pull/214

    • [feature]: update hive predictor: load predict result to hive by @dawn310826 in https://github.com/alibaba/EasyRec/pull/215

    • [feature]: add support for clear_model and clear_export option by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/218

    • [feature]: add CMBF model by @yangxudong in https://github.com/alibaba/EasyRec/pull/217

    • [feature]: add nni hpo, json.load -> yaml.safe_load, code style by @yjjinjie in https://github.com/alibaba/EasyRec/pull/222

    • [feature]: add CMBF layer for DBMTL model by @yangxudong in https://github.com/alibaba/EasyRec/pull/223

    • [feature]: support checkpoint_path dir by @dawn310826 in https://github.com/alibaba/EasyRec/pull/225

    • [feature]: optimize io cost of hiveinput and add hiveparquetinput by @dawn310826 in https://github.com/alibaba/EasyRec/pull/224

    • [feature]: modify document structure by @yangxudong in https://github.com/alibaba/EasyRec/pull/226

    • [feature]: add support for incremental_train.md by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/228

    • [feature]: add support easyrec distribute metrics by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/201

    • [feature]: add support for frequency filter and steps_to_alive for embedding variable by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/231

    • [feature]: add incremental update for odl by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/257

    • [feature]: support csv with header by @dawn310826 in https://github.com/alibaba/EasyRec/pull/285

    • [feature]: add designer doc by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/287

    • [feature]: support gnn on datascience by @dawn310826 in https://github.com/alibaba/EasyRec/pull/229

    • [feature]: support hit rate on ds by @dawn310826 in https://github.com/alibaba/EasyRec/pull/244

    • [feature]: add user define eval_result_path by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/245

    • [feature]: fix failure to assign the default value of an rtp input field by @yangxudong in https://github.com/alibaba/EasyRec/pull/232

    • [feature]: add support for tf25 py3 test by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/255

    • [feature]: support multi loss for multi task learning model by @yangxudong in https://github.com/alibaba/EasyRec/pull/260

    • [feature]: support only sequence feature && fix neg sampler bug for sequence feature by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/264

    • [feature]: add aux_hist_seq by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/265

    • [feature]: add Uniter model and Uniter layer for DBMTL model by @yangxudong in https://github.com/alibaba/EasyRec/pull/266

    • [feature]: add sequence feature negative sample process by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/267

    • [feature]: features/cmbf by @yangxudong in https://github.com/alibaba/EasyRec/pull/268

    • [feature]: add dlc tutorial by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/272

    • [feature]: fix rtp_native documentation: add reserve_default: false in fg.json by @cosmozhang1995 in https://github.com/alibaba/EasyRec/pull/278

    • [feature]: add pycharm vscode docker environment by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/293

    New Contributors

    • @yjjinjie made their first contribution in https://github.com/alibaba/EasyRec/pull/222
    • @chenglongliu123 made their first contribution in https://github.com/alibaba/EasyRec/pull/279

    Full Changelog: https://github.com/alibaba/EasyRec/compare/v0.4.7...v0.5.6

  • v0.4.7(May 26, 2022)

    What's Changed

    • [feat]: add distribute eval for ds environment by @lgqfhwy in https://github.com/alibaba/EasyRec/pull/167
    • [feat]: add pre-check by @dawn310826 in https://github.com/alibaba/EasyRec/pull/159
    • [feat]: add f1 reweighted loss for multi tower model & make odpsRtpInput support extra input column
    • [feat]: add mlperf config on criteo by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/128
    • [feat]: add mind neg sam by @chengmengli06 in https://github.com/alibaba/EasyRec/pull/108
    • [doc]: autocross doc update by @weidankong in https://github.com/alibaba/EasyRec/pull/191
    • [feat]: refactor predictor by @dawn310826 in https://github.com/alibaba/EasyRec/pull/186
    • [bugfix]: fix bug in dropoutnet.md by @kinghuin in https://github.com/alibaba/EasyRec/pull/194
    • [bugfix]: fix several code bugs and document bugs.

    Docker

    • mybigpai-registry.cn-beijing.cr.aliyuncs.com/easyrec/easyrec:py36-tf1.15-0.4.7

    New Contributors

    • @muxuezi made their first contribution in https://github.com/alibaba/EasyRec/pull/192
    • @weidankong made their first contribution in https://github.com/alibaba/EasyRec/pull/191
    • @kinghuin made their first contribution in https://github.com/alibaba/EasyRec/pull/194
    • Many thanks to all the contributors for their contributions!
    easy_rec-0.4.7-py2.py3-none-any.whl(4.12 MB)
  • v0.4.4(Apr 19, 2022)

  • v0.4.2(Apr 19, 2022)

  • v0.4.0(Mar 10, 2022)

    Major Features and Improvements

    • fix final exporter bug under ps-worker-evaluator mode;
    • fix fg export bug for multi-task models;
    • add dlrm models;
    • add vector retrieve function;
    • add interface to export checkpoints which can run on Alibaba RTP (RealTime serving Platform);
    • add support for hive table as data input;
    • add support for exporting embedding variables (kv embedding) separately from dense variables;
    • add support for multiple optimizers and gradient freezing.

    Download

    easy_rec-0.4.0-py2.py3-none-any.whl

  • v0.3.0(Jan 22, 2022)

    Release 0.3.0

    Major Features and Improvements

    • Add support for sequence feature #62
    • Update esmm loss function #76
    • Add auc.num_thresholds setting #76
    • Add Cold start model #80
    • Add Collaborative Metric Learning i2i model #80
    • Add TextCNN model #80
    • Compatible with DataScience environment #86
    • Support offline inference on DataScience #100
    • Add Unit Test #91

    Bug Fixes and Other Changes

    • Fix bug in partitioned variable restore #71
    • Fix bug in variational dropout and rocket launching #93

    Documentation update

    • Add document for Cold start model, Collaborative Metric Learning i2i model and TextCNN model #80
  • v0.2.0(Nov 17, 2021)

    Release 0.2.0

    Major Features and Improvements

    • Support for early stop #1
    • Support for stop by num_epoch #5
    • Release MultiTask model PLE #8
    • Compatible with feature_configs and feature_config.features #48
    • import fg_json to easyrec_config #54
    • upgrade params in feature config to higher precision #65
    • add write_graph option to train config. #65
    • Compatible with FG dtype and EasyRec dtype #66

    Bug Fixes and Other Changes

    • fix hyperparams_builder bug in tf2 #64
    • fix default value bug in inference #66
    • fix workqueue dtype bug in OdpsRTPInput #67

    Documentation update

    • update tutorial doc #56 #68
    • update train doc #4
    • update hpo doc #4
    • update feature doc #6
    • update inference doc #7 #56
    • update develop doc #56
    • update optimize doc #56
    • update release doc #56
    • update readme #56 #58
  • v0.1.4(Aug 23, 2021)

  • v0.1.0(Aug 23, 2021)

Owner
Alibaba Group - PAI
Platform of AI