[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets

Last update: Jan 4, 2023


Introduction

This repo contains the source code accompanying the paper:

Well-tuned Simple Nets Excel on Tabular Datasets

Authors: Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka

Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performance of neural networks lies in rethinking the joint and simultaneous application of a large set of modern regularization techniques. As a result, we propose regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination/cocktail of 13 regularization techniques for each dataset using a joint optimization over the decision on which regularizers to apply and their subsidiary hyperparameters.

We assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that: (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.

News: Our work has been accepted at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021).

Setting up the virtual environment

Our work is built on top of AutoPyTorch. To look at our implementation of the regularization cocktail ingredients, you can do the following:

git clone https://github.com/automl/Auto-PyTorch.git
cd Auto-PyTorch/
git checkout regularization_cocktails

To install the version of AutoPyTorch that features our work, you can use these additional commands:

# The following commands assume the user is in the cloned directory
conda create -n reg_cocktails python=3.8
conda activate reg_cocktails
conda install gxx_linux-64 gcc_linux-64 swig
cat requirements.txt | xargs -n 1 -L 1 pip install
python setup.py install

Running the Regularization Cocktail code

The main files for running the regularization cocktails, main_experiment.py and refit_experiment.py, are in the cocktails folder. The first module starts a full HPO search. The second module refits on datasets for which the time budget was not sufficient to both perform the full HPO search and complete the refit of the incumbent hyperparameter configuration.

The main arguments for main_experiment.py:

--task_id: The OpenML task id, which identifies the dataset used in the experiment.
--wall_time: The total runtime budget in seconds, covering both the HPO search and the final refit.
--func_eval_time: The maximal time in seconds for one function evaluation, i.e. the evaluation of one hyperparameter configuration.
--epochs: The number of epochs each hyperparameter configuration is trained for.
--seed: The seed to be used for the run.
--tmp_dir: The temporary directory for the results to be stored in.
--output_dir: The output directory for the results to be stored in.
--nr_workers: The number of workers, which corresponds to the number of hyperparameter configurations run in parallel.
--nr_threads: The number of threads.
--cash_cocktail: An important flag that activates the regularization cocktail formulation.
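
To make the argument list above concrete, here is a minimal argparse sketch of a parser that accepts the documented flags. It is not the repository's code: the helper names (str_to_bool, build_parser), the argument types, and every default value are assumptions chosen for illustration, and only the ten flags listed above are covered.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Turn 'True'/'False' strings into booleans (the README passes `--cash_cocktail True`)."""
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the arguments listed above.

    All defaults are placeholders, not the values used by main_experiment.py.
    """
    parser = argparse.ArgumentParser(description="Sketch of the documented arguments")
    parser.add_argument("--task_id", type=int, required=True, help="OpenML task id")
    parser.add_argument("--wall_time", type=int, default=600, help="total budget in seconds")
    parser.add_argument("--func_eval_time", type=int, default=60, help="seconds per evaluation")
    parser.add_argument("--epochs", type=int, default=10, help="epochs per configuration")
    parser.add_argument("--seed", type=int, default=42, help="seed for the run")
    parser.add_argument("--tmp_dir", type=str, default="./tmp", help="temporary results directory")
    parser.add_argument("--output_dir", type=str, default="./output", help="final results directory")
    parser.add_argument("--nr_workers", type=int, default=1, help="configurations run in parallel")
    parser.add_argument("--nr_threads", type=int, default=1, help="number of threads")
    parser.add_argument("--cash_cocktail", type=str_to_bool, default=False,
                        help="activate the regularization cocktail formulation")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
example = build_parser().parse_args(
    "--task_id 233088 --wall_time 600 --func_eval_time 60 "
    "--epochs 10 --seed 42 --cash_cocktail True".split()
)
print(example.task_id, example.wall_time, example.cash_cocktail)
```

Parsing the boolean through a small converter keeps the `--cash_cocktail True` spelling working; a plain `type=bool` would treat any non-empty string, including "False", as true.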

A minimal example of running the regularization cocktails:

python main_experiment.py --task_id 233088 --wall_time 600 --func_eval_time 60 --epochs 10 --seed 42 --cash_cocktail True

The example above runs the regularization cocktails for 10 minutes on task 233088, with a limit of 60 seconds per function evaluation. Every hyperparameter configuration is evaluated for 10 epochs, and seed 42 is used for the experiment and the data splits.
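
As a rough sanity check on these time settings, the sketch below computes how many evaluations fit into the wall time if every evaluation runs up to its time limit. The function name and the nr_workers default of 1 are assumptions, and the estimate ignores scheduling overhead and the final refit, which shares the same budget; evaluations that finish early leave room for more.

```python
def evaluations_at_full_limit(wall_time: int, func_eval_time: int, nr_workers: int = 1) -> int:
    """Number of evaluations that fit if each one uses its whole time limit.

    Hypothetical helper: ignores overhead and the final refit.
    """
    if wall_time < 0 or func_eval_time <= 0 or nr_workers <= 0:
        raise ValueError("wall_time must be >= 0; func_eval_time and nr_workers must be > 0")
    # Each worker processes evaluations one after another within the wall time.
    return (wall_time // func_eval_time) * nr_workers


# Values taken from the minimal example command: 600 s wall time, 60 s per evaluation.
print(evaluations_at_full_limit(600, 60))  # 10
```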

A minimal example of running only one regularization method:

python main_experiment.py --task_id 233088 --wall_time 600 --func_eval_time 60 --epochs 10 --seed 42 --use_weight_decay

If you would like to investigate individual regularization methods, you can look at the arguments that control them in main_experiment.py. Additionally, if you want to remove the limit on the number of hyperparameter configurations, you can remove the following lines:

smac_scenario_args={
    'runcount_limit': number_of_configurations_limit,
}
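
Instead of deleting these lines, one option is to make the limit optional. The sketch below is an assumption about how that could look, not code from the repository: build_smac_scenario_args is a hypothetical helper, and how the resulting dictionary is handed on to the search is left as in the original script.

```python
from typing import Optional


def build_smac_scenario_args(number_of_configurations_limit: Optional[int] = None) -> dict:
    """Return the scenario arguments, adding 'runcount_limit' only when a limit is given."""
    scenario_args = {}
    if number_of_configurations_limit is not None:
        if number_of_configurations_limit <= 0:
            raise ValueError("the configuration limit must be a positive integer")
        scenario_args["runcount_limit"] = number_of_configurations_limit
    return scenario_args


# With a limit, the dictionary matches the snippet above; without one it is empty.
limited = build_smac_scenario_args(100)
unlimited = build_smac_scenario_args()
print(limited, unlimited)
```

With no limit set, the search is bounded only by the time budget given through --wall_time.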

Plots

The plots included in our paper were generated by the functions in the module results.py. As most of the function docstrings mention, the functions that produce the baseline diagrams and plots expect the following folder structure:

common_result_folder/baseline/results.csv

There are functions inside the module itself that generate the results.csv files.
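
To illustrate the expected layout, the following sketch collects every results.csv found one level below a common result folder. It is a standalone illustration and does not use results.py; the function name, the baseline name "xgboost", the column names, and the values in the demo file are all made up for the example.

```python
import csv
import tempfile
from pathlib import Path


def load_baseline_results(common_result_folder) -> dict:
    """Map each baseline folder name to the rows of its results.csv."""
    results = {}
    root = Path(common_result_folder)
    # Matches common_result_folder/<baseline>/results.csv
    for csv_path in sorted(root.glob("*/results.csv")):
        with csv_path.open(newline="") as handle:
            results[csv_path.parent.name] = list(csv.DictReader(handle))
    return results


# Demo with dummy data in a temporary folder (columns and values are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    baseline_dir = Path(tmp) / "xgboost"
    baseline_dir.mkdir()
    (baseline_dir / "results.csv").write_text("task_id,score\n233088,0.5\n")
    demo = load_baseline_results(tmp)

print(demo)
```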

Baselines

The code for running the baselines can be found in the baselines folder.

TabNet, XGBoost, and CatBoost can be found in the baselines/bohb folder.
The other baselines, such as AutoGluon, auto-sklearn, and Node, can be found in the folders of the same name.

TabNet, XGBoost, CatBoost and AutoGluon have the same two main files as our regularization cocktails, main_experiment.py and refit_experiment.py.

Citation

@article{kadra2021regularization,
  title={Regularization is all you Need: Simple Neural Nets can Excel on Tabular Data},
  author={Kadra, Arlind and Lindauer, Marius and Hutter, Frank and Grabocka, Josif},
  journal={arXiv preprint arXiv:2106.11189},
  year={2021}
}

Comments

AttributeError: 'NoneType' object has no attribute 'predict'
Hi there,

I stumbled across your project and paper and it seemed highly interesting. Unfortunately, I cannot successfully execute your project with my dataset. The following error occurs consistently

Traceback (most recent call last):
  File "cocktails/main_experiment.py", line 399, in <module>
    train_predictions = fitted_pipeline.predict(X_train)
AttributeError: 'NoneType' object has no attribute 'predict'

I have checked and double-checked the dataset but cannot find "the problem". All values are fine, properly scaled, etc. The only thing that is maybe remarkable is that the dataset is a bit wide: 80000 x 650, but nothing too out of bounds.

Do you have an idea what I should check? Thanks for your support

Kind regards
opened by DonIvanCorleone 7
Double definitions in search space

I ran into problems that were solved when I replaced this if with an elif. I only understand the code at a very surface level, though, so I am not sure this is the right fix. Two things seem to happen: (i) the same node (imputer) is assigned twice, and (ii) the one-hot encoder is not supported for non-categorical features. https://github.com/releaunifreiburg/WellTunedSimpleNets/blob/be5b3f3ccd3a5000b5108c19dc65c05357428fbc/utilities.py#L304

opened by SamuelGabriel 2
How data augmentation is applied to tabular data?

Hi author, thanks for sharing the code. I'm wondering how data augmentation strategies like mix-up, cut-out, cut-mix, etc., can be applied to tabular data (I understand they are usually applied to images though). Please advise, many thanks.

opened by hellowangqian 2

Some confusion with "cash_cocktail" option

I ran into your paper not too long ago and found it pretty interesting, thanks for sharing the code.

I'm looking to run this on my own dataset and I'm a bit confused as to the cash_cocktail option in main_experiment.py. My impression is that this automatically turns on all the options to search for HPO but when I run the code I get output that looks like this:

{'task_id': 233088, 'wall_time': 9000, 'func_eval_time': 1000, 'epochs': 105, 'seed': 11, 'tmp_dir': './runs/autoPyTorch_cocktails', 'output_dir': './runs/autoPyTorch_cocktails', 'nr_workers': 6, 'nr_threads': 1, 'cash_cocktail': True, 'use_swa': [False], 'use_se': [False], 'use_lookahead': [False], 'use_weight_decay': [False], 'use_batch_normalization': [False], 'use_skip_connection': [False], 'use_dropout': [False], 'mb_choice': 'none', 'augmentation': 'standard'}
{'task_id': 233088, 'wall_time': 9000, 'func_eval_time': 1000, 'epochs': 105, 'seed': 11, 'tmp_dir': './runs/autoPyTorch_cocktails', 'output_dir': './runs/autoPyTorch_cocktails', 'nr_workers': 6, 'nr_threads': 1, 'cash_cocktail': True, 'use_swa': [False], 'use_se': [False], 'use_lookahead': [False], 'use_weight_decay': [False], 'use_batch_normalization': [False], 'use_skip_connection': [False], 'use_dropout': [False], 'mb_choice': 'none', 'augmentation': 'standard'}

It looks like the options aren't being used? e.g. 'use_swa': [False], 'use_se': [False], 'use_lookahead': [False],...

I guess to rephrase the question: if one were to use your code on their private dataset, what options do you need to pass in to ensure that you are doing the full HPO? is it just the cash_cocktail flag?

opened by vlawhern 1

parameters change to make it work for current version of AutoPyTorch 0.2.1

Hi,

thanks for sharing the great work.

It seems that the code is not compatible with the current version of AutoPyTorch. I made the following changes in utilities.py to make it work. Can you check whether my changes below are reasonable?

Attached is the utilities.py diff file as text:

utilities.py.diff.txt

opened by luotongml 0

