Python-based GBDT implementation

Sberbank AI Lab

Last update: Sep 21, 2022


Overview

Py-boost: a research tool for exploring GBDTs

Modern gradient boosting toolkits are highly complex and written in low-level programming languages. As a result:

It is hard to customize them to suit one’s needs
New ideas and methods are not easy to implement
It is difficult to understand how they work

Py-boost is a Python-based gradient boosting library that aims to overcome these problems.

Authors: Anton Vakhrushev, Leonid Iosipoi.

Py-boost Key Features

Simple. Py-boost is a simplified gradient boosting library, but it supports all the main features and hyperparameters available in other implementations.

Fast with GPU. Although Py-boost is written in Python, it runs on GPU only and relies on Python GPU libraries such as CuPy and Numba.

Easy to customize. Py-boost can be customized easily even by those unfamiliar with GPU programming (just replace np with cp). What can be customized? Almost everything, via custom callbacks. Examples: row/column sampling strategy, training control, losses/metrics, multioutput handling strategy.
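The "replace np with cp" idea can be illustrated with a small sketch. The function below is hypothetical and is not Py-boost's callback or loss interface; it only shows that array code written against the NumPy API can usually run on GPU by swapping in CuPy as the array module. The name `mse_grad_hess` and the `xp` parameter are assumptions made for this example.

```python
import numpy as np

try:
    import cupy as cp  # GPU array library with a NumPy-like API
except ImportError:
    cp = None  # CuPy is absent; the NumPy path below still runs


def mse_grad_hess(y_true, y_pred, xp=np):
    """Per-element gradient and hessian of 0.5 * (y_pred - y_true) ** 2.

    `xp` is the array module: pass `np` for CPU arrays or `cp` for GPU arrays.
    Hypothetical helper, not part of Py-boost.
    """
    grad = y_pred - y_true       # first derivative w.r.t. y_pred
    hess = xp.ones_like(y_pred)  # second derivative is constant 1
    return grad, hess


# CPU run with NumPy arrays.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])
grad, hess = mse_grad_hess(y_true, y_pred)
print(grad, hess)
```

On a machine with CuPy and a GPU, the same function should work unchanged when called as `mse_grad_hess(cp.asarray(y_true), cp.asarray(y_pred), xp=cp)`.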

Installation

Before installing py-boost via pip, make sure cupy is installed. You can install both at once:

pip install -U cupy-cuda110 py-boost

Note: replace cupy-cuda110 with the package that matches your CUDA version! For details, see this guide.
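After installation, a quick check can confirm that CuPy is importable and can reach the CUDA runtime. This is a minimal sketch, not part of Py-boost; the helper name `gpu_stack_status` is made up for this example, and it relies only on `cupy.__version__` and `cupy.cuda.runtime.runtimeGetVersion()`.

```python
def gpu_stack_status():
    """Return a one-line description of the local CuPy/CUDA setup."""
    try:
        import cupy
    except ImportError:
        return "cupy is not installed"
    try:
        # Integer such as 11000 for CUDA 11.0.
        runtime = cupy.cuda.runtime.runtimeGetVersion()
    except Exception as exc:  # missing driver, no GPU, etc.
        return f"cupy {cupy.__version__} is installed, but CUDA is unavailable: {exc}"
    return f"cupy {cupy.__version__}, CUDA runtime {runtime}"


status = gpu_stack_status()
print(status)
```

If the reported CUDA runtime does not match the installed cupy package suffix, reinstalling with the matching package is the likely fix.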

Quick tour

Py-boost is easy to use because its interface is similar to scikit-learn's. For usage examples, please see:

Tutorial_1_Basics for simple usage examples
Tutorial_2_Advanced_multioutput for advanced multioutput features
Tutorial_3_Custom_features for examples of customization

More examples are coming soon.
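As a rough illustration of the scikit-learn-like workflow, the sketch below trains a regressor on synthetic data. It assumes that the package exposes a `GradientBoosting` class in `py_boost`, that `'mse'` is accepted as the loss name, and that the model has `fit` and `predict` methods; Tutorial_1_Basics is the authoritative reference. The helpers `make_toy_data` and `fit_and_predict` are hypothetical names introduced here.

```python
import numpy as np


def make_toy_data(n_rows=1000, n_cols=10, seed=0):
    """Synthetic regression data: y depends linearly on the first column."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_cols)).astype(np.float32)
    noise = rng.normal(scale=0.1, size=n_rows).astype(np.float32)
    y = X[:, 0] * 2.0 + noise
    return X, y


def fit_and_predict(X_train, y_train, X_test):
    """Train a booster and return predictions for X_test."""
    # Imported lazily because py_boost needs CuPy and a CUDA-capable GPU.
    # Assumption: class name, 'mse' loss name, and fit/predict methods.
    from py_boost import GradientBoosting

    model = GradientBoosting('mse')
    model.fit(X_train, y_train)
    return model.predict(X_test)


X, y = make_toy_data()
print(X.shape, y.shape)
```

On a machine with a GPU and py-boost installed, `fit_and_predict(X[:800], y[:800], X[800:])` would train on the first 800 rows and predict the remaining 200.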

Other Sber AI Lab Projects

LightAutoML: https://github.com/sberbank-ai-lab/LightAutoML
AutoWoE: https://github.com/sberbank-ai-lab/AutoMLWhitebox
RePlay: https://github.com/sberbank-ai-lab/RePlay


