Python-based implementations of algorithms for learning on imbalanced data.

Overview

ND DIAL: Imbalanced Algorithms

Minimalist Python-based implementations of algorithms for imbalanced learning. Includes deep and representational learning algorithms (implemented via TensorFlow). Below is a list of the methods currently implemented.

  • Undersampling
    1. Random Majority Undersampling with/without Replacement
  • Oversampling
    1. SMOTE - Synthetic Minority Over-sampling Technique [1]
    2. DAE - Denoising Autoencoder [2] (TensorFlow)
    3. GAN - Generative Adversarial Network [3] (TensorFlow)
    4. VAE - Variational Autoencoder [4] (TensorFlow)
  • Ensemble Sampling
    1. RAMOBoost [5]
    2. RUSBoost [6]
    3. SMOTEBoost [7]

References:

[1] : N. V. Chawla, K. W. Bowyer, L. O. Hall, and P. Kegelmeyer. "SMOTE: Synthetic Minority Over-Sampling Technique." Journal of Artificial Intelligence Research (JAIR), 2002.
[2] : P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. "Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion". Journal of Machine Learning Research (JMLR), 2010.
[3] : I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. "Generative Adversarial Nets". Advances in Neural Information Processing Systems 27 (NIPS), 2014.
[4] : D. P. Kingma and M. Welling. "Auto-Encoding Variational Bayes". arXiv preprint arXiv:1312.6114, 2013.
[5] : S. Chen, H. He, and E. A. Garcia. "RAMOBoost: Ranked Minority Oversampling in Boosting". IEEE Transactions on Neural Networks, 2010.
[6] : C. Seiffert, T. M. Khoshgoftaar, J. V. Hulse, and A. Napolitano. "RUSBoost: Improving Classification Performance when Training Data is Skewed". International Conference on Pattern Recognition (ICPR), 2008.
[7] : N. V. Chawla, A. Lazarevic, L. O. Hall, and K. W. Bowyer. "SMOTEBoost: Improving Prediction of the Minority Class in Boosting." European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), 2003.
You might also like...
Uplift modeling and causal inference with machine learning algorithms
Uplift modeling and causal inference with machine learning algorithms

Disclaimer This project is stable and being incubated for long-term support. It may contain new experimental code, for which APIs are subject to chang

Machine Learning Algorithms

Machine-Learning-Algorithms In this project, the dataset was created through a survey opened on Google forms. The purpose of the form is to find the p

Machine learning algorithms implementation

Machine learning algorithms implementation This repository consisits of implementation of various machine learning algorithms. The algorithms implemen

Machine Learning Algorithms ( Desion Tree, XG Boost, Random Forest )
Machine Learning Algorithms ( Desion Tree, XG Boost, Random Forest )

implementation of machine learning Algorithms such as decision tree and random forest and xgboost on darasets then compare results for each and implement ant colony and genetic algorithms on tsp map, play blackjack game and robot in grid world and evaluate reward for it

A data preprocessing package for time series data. Design for machine learning and deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

Distributed Evolutionary Algorithms in Python

DEAP DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data stru

Implementation of different ML Algorithms from scratch, written in Python 3.x
Implementation of different ML Algorithms from scratch, written in Python 3.x

Implementation of different ML Algorithms from scratch, written in Python 3.x

Implementation of linesearch Optimization Algorithms in Python

Nonlinear Optimization Algorithms During my time as Scientific Assistant at the Karlsruhe Institute of Technology (Germany) I implemented various Opti

An open-source library of algorithms to analyse time series in GPU and CPU.
An open-source library of algorithms to analyse time series in GPU and CPU.

An open-source library of algorithms to analyse time series in GPU and CPU.

Comments
  • is:issue is:open Multi-class SMOTEBoost

    is:issue is:open Multi-class SMOTEBoost

    Just curious to ask, does this package supports a multi-class problem with a multi-minority scenario? Not sure if I read this correctly, but it seems to me the support is only for the binary class problem:

    if minority_target is None:
        # Determine the minority class label.
        stats_c_ = Counter(y)
        maj_c_ = max(stats_c_, key=stats_c_.get)
        min_c_ = min(stats_c_, key=stats_c_.get)
        self.minority_target = min_c_
    else:
        self.minority_target = minority_target
    

    In my current task, I have a multi-majority multi-minority scenario:

    Class = 0,	Count = 18749,	Percentage = 22.01
    Class = 1,	Count = 3482,	Percentage = 4.09
    Class = 2,	Count = 9566,	Percentage = 11.23
    Class = 3,	Count = 49741,	Percentage = 58.4
    Class = 4,	Count = 3634,	Percentage = 4.27
    
    opened by arilwan 1
Owner
DIAL | Notre Dame
DIAL | Notre Dame
LibRerank is a toolkit for re-ranking algorithms. There are a number of re-ranking algorithms, such as PRM, DLCM, GSF, miDNN, SetRank, EGRerank, Seq2Slate.

LibRerank LibRerank is a toolkit for re-ranking algorithms. There are a number of re-ranking algorithms, such as PRM, DLCM, GSF, miDNN, SetRank, EGRer

null 126 Dec 28, 2022
Implementations of Machine Learning models, Regularizers, Optimizers and different Cost functions.

Linear Models Implementations of LinearRegression, LassoRegression and RidgeRegression with appropriate Regularizers and Optimizers. Linear Regression

Keivan Ipchi Hagh 1 Nov 22, 2021
A Python Module That Uses ANN To Predict A Stocks Price And Also Provides Accurate Technical Analysis With Many High Potential Implementations!

Stox A Module to predict the "close price" for the next day and give "technical analysis". It uses a Neural Network and the LSTM algorithm to predict

Stox 31 Dec 16, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 7, 2023
LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms

LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms Based on the work by Smith et al. (2021) Query

null 5 Aug 6, 2022
Simple linear model implementations from scratch.

Hand Crafted Models Simple linear model implementations from scratch. Table of contents Overview Project Structure Getting started Citing this project

Jonathan Sadighian 2 Sep 13, 2021
Implemented four supervised learning Machine Learning algorithms

Implemented four supervised learning Machine Learning algorithms from an algorithmic family called Classification and Regression Trees (CARTs), details see README_Report.

Teng (Elijah)  Xue 0 Jan 31, 2022
scikit-multimodallearn is a Python package implementing algorithms multimodal data.

scikit-multimodallearn is a Python package implementing algorithms multimodal data. It is compatible with scikit-learn, a popul

null 12 Jun 29, 2022
A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and A* Search (Manhattan Distance Heuristic)

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and the A* Search (using the Manhattan Distance Heuristic)

null 17 Aug 14, 2022
Metric learning algorithms in Python

metric-learn: Metric Learning in Python metric-learn contains efficient Python implementations of several popular supervised and weakly-supervised met

null 1.3k Dec 28, 2022