Benchmarking Multimodal AutoML for Tabular Data with Text Fields
Repository for the NeurIPS 2021 Datasets and Benchmarks Track submission "Benchmarking Multimodal AutoML for Tabular Data with Text Fields" (Link, Full Paper with Appendix). An earlier version of the paper, titled "Multimodal AutoML on Structured Tables with Text Fields" (Link), was accepted to the ICML 2021 AutoML Workshop as an oral presentation. We have since extended the benchmark with more datasets, so the version used in the workshop paper is archived in the icml_workshop branch.
This benchmark contains a diverse collection of tabular datasets. Each dataset contains numeric/categorical columns as well as text columns. The goal is to evaluate the performance of (automated) ML systems for supervised learning (classification and regression) on such multimodal data. The folder multimodal_text_benchmark/scripts/benchmark/ provides Python scripts to run different variants of the AutoGluon and H2O AutoML tools on the benchmark.
Datasets used in the Benchmark
Here's a brief summary of the datasets in our benchmark. Each dataset is described in greater detail in the multimodal_text_benchmark/ folder.
ID | key | #Train | #Test | Task | Metric | Prediction Target |
---|---|---|---|---|---|---|
prod | product_sentiment_machine_hack | 5,091 | 1,273 | multiclass | accuracy | sentiment related to product |
salary | data_scientist_salary | 15,841 | 3,961 | multiclass | accuracy | salary range in data scientist job listings |
airbnb | melbourne_airbnb | 18,316 | 4,579 | multiclass | accuracy | price of Airbnb listing |
channel | news_channel | 20,284 | 5,071 | multiclass | accuracy | category of news article |
wine | wine_reviews | 84,123 | 21,031 | multiclass | accuracy | variety of wine |
imdb | imdb_genre_prediction | 800 | 200 | binary | roc_auc | whether film is a drama |
fake | fake_job_postings2 | 12,725 | 3,182 | binary | roc_auc | whether job postings are fake |
kick | kick_starter_funding | 86,052 | 21,626 | binary | roc_auc | will Kickstarter get funding |
jigsaw | jigsaw_unintended_bias100K | 100,000 | 25,000 | binary | roc_auc | whether comments are toxic |
qaa | google_qa_answer_type_reason_explanation | 4,863 | 1,216 | regression | r2 | type of answer |
qaq | google_qa_question_type_reason_explanation | 4,863 | 1,216 | regression | r2 | type of question |
book | bookprice_prediction | 4,989 | 1,248 | regression | r2 | price of books |
jc | jc_penney_products | 10,860 | 2,715 | regression | r2 | price of JC Penney products |
cloth | women_clothing_review | 18,788 | 4,698 | regression | r2 | review score |
ae | ae_price_prediction | 22,662 | 5,666 | regression | r2 | American-Eagle item prices |
pop | news_popularity2 | 24,007 | 6,002 | regression | r2 | news article popularity online |
house | california_house_price | 24,007 | 6,002 | regression | r2 | sale price of houses in California |
mercari | mercari_price_suggestion100K | 100,000 | 25,000 | regression | r2 | price of Mercari products |
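In code, the key column above is the registered dataset name, while ID is the short alias used in the paper. Here is a small sketch of translating between the two, assuming (as the usage further below suggests) that TEXT_BENCHMARK_ALIAS_MAPPING maps short aliases to full dataset keys:

from auto_mm_bench.datasets import TEXT_BENCHMARK_ALIAS_MAPPING

# Assumed mapping direction: short paper alias -> registered dataset key
print(TEXT_BENCHMARK_ALIAS_MAPPING['prod'])  # expected output: product_sentiment_machine_hack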
License
The versions of the datasets in this benchmark are released under the CC BY-NC-SA license. Note that the datasets in this benchmark are modified versions of previously publicly available originals; we do not own any of the datasets in the benchmark. Any data from this benchmark that has previously been published elsewhere falls under the original license from which the data originated. Please refer to the licenses of each original source linked in multimodal_text_benchmark/README.md.
Install the Benchmark Suite
cd multimodal_text_benchmark
# Install the benchmarking suite
python3 -m pip install -U -e .
You can do a quick test of the installation by going to the tests folder:
cd multimodal_text_benchmark/tests
python3 -m pytest test_datasets.py
To work with one of the datasets, use the following code:
from auto_mm_bench.datasets import dataset_registry

print(dataset_registry.list_keys())  # list of all dataset names
dataset_name = 'product_sentiment_machine_hack'
# Create the train and test splits of the chosen dataset
train_dataset = dataset_registry.create(dataset_name, 'train')
test_dataset = dataset_registry.create(dataset_name, 'test')
# The .data attribute holds the underlying table (a pandas DataFrame)
print(train_dataset.data)
print(test_dataset.data)
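Each split's table mixes feature columns with the prediction target. Below is a minimal sketch for separating the two, assuming the dataset object exposes the target names through a label_columns attribute (an assumption here; verify the attribute name in auto_mm_bench/datasets.py):

# 'label_columns' is an assumed attribute name -- check auto_mm_bench/datasets.py
label_cols = train_dataset.label_columns
features = train_dataset.data.drop(columns=label_cols)
targets = train_dataset.data[label_cols]
print(features.shape, targets.shape)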
To access all datasets that comprise the benchmark:
from auto_mm_bench.datasets import create_dataset, TEXT_BENCHMARK_ALIAS_MAPPING

# Instantiate every dataset in the benchmark by its registered name
for dataset_name in list(TEXT_BENCHMARK_ALIAS_MAPPING.values()):
    print(dataset_name)
    dataset = create_dataset(dataset_name)
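Since every dataset exposes its table through .data, you can re-derive the #Train/#Test columns of the table above. A minimal sketch:

from auto_mm_bench.datasets import dataset_registry, TEXT_BENCHMARK_ALIAS_MAPPING

# Print the number of train/test rows for every dataset in the benchmark
for dataset_name in TEXT_BENCHMARK_ALIAS_MAPPING.values():
    train_data = dataset_registry.create(dataset_name, 'train').data
    test_data = dataset_registry.create(dataset_name, 'test').data
    print(f'{dataset_name}: #Train={len(train_data)}, #Test={len(test_data)}')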
Run Experiments
Go to multimodal_text_benchmark/scripts/benchmark to see how to run some baseline ML methods over the benchmark.
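For a quick end-to-end run outside those scripts, the following minimal sketch fits AutoGluon's TabularPredictor on one dataset. It assumes AutoGluon is installed (pip install autogluon) and that 'Sentiment' is the label column of product_sentiment_machine_hack; inspect train_dataset.data.columns to confirm before running:

from autogluon.tabular import TabularPredictor
from auto_mm_bench.datasets import dataset_registry

dataset_name = 'product_sentiment_machine_hack'
train_dataset = dataset_registry.create(dataset_name, 'train')
test_dataset = dataset_registry.create(dataset_name, 'test')

# 'Sentiment' is an assumed label-column name for this dataset
predictor = TabularPredictor(label='Sentiment', eval_metric='accuracy')
predictor.fit(train_dataset.data)
print(predictor.evaluate(test_dataset.data))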
References
BibTeX entry of the ICML Workshop Version:
@article{agmultimodaltext,
  title={Multimodal AutoML on Structured Tables with Text Fields},
  author={Shi, Xingjian and Mueller, Jonas and Erickson, Nick and Li, Mu and Smola, Alexander},
  journal={8th ICML Workshop on Automated Machine Learning (AutoML)},
  year={2021}
}