DISTIL: Deep dIverSified inTeractIve Learning.

decile-team

Last update: Dec 6, 2022

Overview

Cut down your labeling cost and time by 3x-5x!

What is DISTIL?

DISTIL is an active learning toolkit that implements a number of state-of-the-art active learning strategies, with a particular focus on active learning in the deep learning setting. DISTIL is built on PyTorch and decouples the training loop from the active learning algorithm, which lets users control the training procedure and the model. New active learning algorithms can be incorporated with minimal changes to existing code. DISTIL also supports active learning on your custom dataset and lets you experiment on well-known datasets. We are continuously incorporating newer and better active learning selection strategies into DISTIL, and we plan to expand the supported algorithms to settings beyond the currently supported supervised classification setting.

Key Features of DISTIL

Decouples the active learning strategy from the training loop, allowing users to modify the training and/or the active learning strategy
Implements faster and more efficient versions of several active learning strategies
Contains most state-of-the-art active learning algorithms
Allows running basic experiments with just one command
Presents an interface to various active learning strategies through only a couple of lines of code
Requires only minimal changes to the configuration files to run your own experiments
Achieves higher test accuracies with less training data, allowing a large reduction in labeling cost and time
Requires minimal changes to add it to existing training structures
Contains recipes, tutorials, and benchmarks for all active learning algorithms on many deep learning datasets
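
To illustrate the decoupling named in the first feature above, here is a minimal sketch in plain Python. It is not DISTIL's API: ToyModel, EntropyStrategy, and oracle are hypothetical stand-ins, and entropy sampling is used only as a simple example of a selection rule. The one detail taken from this page is that a strategy exposes a select() method. Training stays in user code, and the strategy only ranks unlabeled points.

```python
import math


class ToyModel:
    """Hypothetical 1-D threshold classifier standing in for a PyTorch model."""

    def __init__(self):
        self.threshold = 0.0

    def fit(self, xs, ys):
        # User-controlled "training loop": place the threshold midway
        # between the means of the two labeled classes.
        neg = [x for x, y in zip(xs, ys) if y == 0]
        pos = [x for x, y in zip(xs, ys) if y == 1]
        self.threshold = (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2

    def predict_proba(self, x):
        # Probability of class 1 from a logistic curve centred on the threshold.
        return 1.0 / (1.0 + math.exp(-(x - self.threshold)))


class EntropyStrategy:
    """Hypothetical strategy: it reads model outputs and never trains."""

    def __init__(self, model, unlabeled):
        self.model = model
        self.unlabeled = unlabeled

    def select(self, budget):
        def entropy(x):
            p = self.model.predict_proba(x)
            return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

        # Indices of the `budget` most uncertain unlabeled points.
        ranked = sorted(range(len(self.unlabeled)),
                        key=lambda i: entropy(self.unlabeled[i]),
                        reverse=True)
        return ranked[:budget]


def oracle(x):
    # Hypothetical labeler; the true class boundary is at x = 2.
    return int(x > 2.0)


labeled_x, labeled_y = [-4.0, 6.0], [0, 1]
unlabeled = [-3.0, -1.0, 0.5, 1.5, 2.5, 3.5, 5.0]
model = ToyModel()

for _ in range(2):
    model.fit(labeled_x, labeled_y)               # training stays with the user
    strategy = EntropyStrategy(model, unlabeled)  # the strategy only selects
    picked = strategy.select(budget=2)
    for i in sorted(picked, reverse=True):        # pop higher indices first
        x = unlabeled.pop(i)
        labeled_x.append(x)
        labeled_y.append(oracle(x))

model.fit(labeled_x, labeled_y)
```

Because the strategy sees the model only through predict_proba, swapping in a different selection rule or a different training procedure touches one side of the loop and leaves the other unchanged.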

Starting with DISTIL

DISTIL can be installed using the following means:

From Git Repository

git clone https://github.com/decile-team/distil.git
cd distil
pip install -r requirements/requirements.txt

Pip Installation

pip install decile-distil

First Steps

To better understand DISTIL's functionality, we have provided example Jupyter notebooks in the tutorials folder, which can be easily executed using Google Colab. We also provide a simple AL training loop that runs experiments using a provided configuration file. To run this loop, run the following from the base folder (the config path shown assumes the repository was cloned to /content/distil, as on Google Colab; adjust it to your own checkout):

python train.py --config_path=/content/distil/configs/config_svhn_resnet_randomsampling.json

You can use the default configurations that we have provided in the configs folder, or you can write your own. To write a custom configuration file for training, refer to the DISTIL Configuration File Documentation.
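
As a rough illustration of deriving a custom configuration from an existing one, the sketch below copies a base configuration, overrides two fields, and writes the result to a JSON file. Every key in base_config is a hypothetical placeholder, not DISTIL's schema; the real fields are described in the configuration file documentation. The file name pattern config_<dataset>_<model>_<strategy>.json is only inferred from the config file names mentioned on this page, and derive_config is a helper invented for this example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical placeholder keys; consult the configuration documentation
# for the fields DISTIL actually expects.
base_config = {
    "dataset": "svhn",
    "model": "resnet",
    "strategy": "randomsampling",
    "budget": 1000,
    "rounds": 10,
}


def derive_config(base, **overrides):
    """Return a copy of `base` with selected existing fields replaced."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown config fields: {sorted(unknown)}")
    return {**base, **overrides}


custom = derive_config(base_config, strategy="badge", budget=500)

# Write the derived configuration to a temporary folder.
config_dir = Path(tempfile.mkdtemp())
file_name = "config_{dataset}_{model}_{strategy}.json".format(**custom)
config_path = config_dir / file_name
config_path.write_text(json.dumps(custom, indent=2))

print(config_path.name)  # config_svhn_resnet_badge.json
```

The resulting path could then be passed to train.py through --config_path. Rejecting unknown fields in derive_config is a design choice for this sketch: it catches a mistyped key before an experiment starts.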

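DISTIL implements a range of active learning algorithms, including random sampling, BADGE, and the SMI, SCMI, and SCG strategies; the Publications section lists the papers behind many of them.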

To learn more about different active learning algorithms, check out the Active Learning Strategies Survey Blog.

Documentation

Learn more about DISTIL by reading our documentation.

Mailing List

To receive updates about DISTIL and to be a part of the community, join the Decile_DISTIL_Dev group.

https://groups.google.com/forum/#!forum/Decile_DISTIL_Dev/join

Acknowledgment

This library takes inspiration from, builds upon, and uses pieces of code from several open source codebases. These include Kuan-Hao Huang's deep active learning repository, Jordan Ash's BADGE repository, and Andreas Kirsch's and Joost van Amersfoort's BatchBALD repository. DISTIL also uses submodlib for submodular optimization.

Team

DISTIL is created and maintained by Nathan Beck, Suraj Kothawade, Durga Sivasubramanian, Apurva Dani, Rishabh Iyer, and Ganesh Ramakrishnan. We look forward to making DISTIL more community driven. Please use it and contribute to it for your active learning research, and feel free to use it for your commercial projects. We will add the major contributors here.

Resources

YouTube Tutorials on DISTIL:

Blog Articles

Publications

[1] Settles, Burr. Active learning literature survey. University of Wisconsin-Madison Department of Computer Sciences, 2009.

[2] Wang, Dan, and Yi Shang. "A new active labeling method for deep learning." 2014 International joint conference on neural networks (IJCNN). IEEE, 2014

[3] Kai Wei, Rishabh Iyer, Jeff Bilmes, Submodularity in data subset selection and active learning, International Conference on Machine Learning (ICML) 2015

[4] Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, and Alekh Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds. CoRR, 2019. URL: http://arxiv.org/abs/1906.03671, arXiv:1906.03671.

[5] Sener, Ozan, and Silvio Savarese. "Active learning for convolutional neural networks: A core-set approach." ICLR 2018.

[6] Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, and Rishabh Iyer, GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning, 35th AAAI Conference on Artificial Intelligence, AAAI 2021

[7] Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, and Ganesh Ramakrishnan, Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision, 7th IEEE Winter Conference on Applications of Computer Vision (WACV), 2019 Hawaii, USA

[8] Wei, Kai, et al. "Submodular subset selection for large-scale speech training data." 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014.

[9] Ducoffe, Melanie, and Frederic Precioso. "Adversarial active learning for deep networks: a margin based approach." arXiv preprint arXiv:1802.09841 (2018).

[10] Gal, Yarin, Riashat Islam, and Zoubin Ghahramani. "Deep bayesian active learning with image data." International Conference on Machine Learning. PMLR, 2017.

[11] Suraj Kothawade, Nathan Beck, Krishnateja Killamsetty, and Rishabh Iyer, “SIMILAR: Submodular Information Measures based Active Learning in Realistic Scenarios,” To Appear In Neural Information Processing Systems, NeurIPS 2021.

Comments

Different results

Hello authors, it's great that you published the source code, but the default hyperparameters in config_cifar10_resnet_badge.json do not reproduce the accuracy reported for the BADGE strategy in your project. Could you please share the hyperparameters used to produce the graph in the README?

opened by chipsh 2

Eval Mode in select()

SMI, SCMI, and SCG need to have self.model.eval() in their select() methods. Without it, the embedding computation will not necessarily be the same for similar points, which may cause performance degradation.

opened by nab170130 1

Update README

Add the effective evaluation paper

Add links to the SIMILAR and effective eval papers

Remove the results from the main README for now and instead link to the benchmark folders for each case. Each folder will have the detailed results for the corresponding case.

opened by rishabhk108 1

Update Interface for Scalability and Add New Content

This merge request replaces the old interface used in DISTIL that requires numpy arrays. To better adapt to larger (and otherwise different) datasets, the new strategies now take torch.utils.data.Dataset objects. This simplifies dataset management and trims the need for many added utilities.

Furthermore, submodular optimization has been changed to use submodlib, which is closely tied to DECILE. As a result, many of the utilities revolving around submodular optimization and disparity (dispersion) functions have been pruned as well. Any missing functionality should be implemented as part of submodlib.

Lastly, this merge request adds new documentation, updated examples, and the complete updated benchmark profile of the Effective Evaluation of Deep Active Learning on Image Classification Tasks.

opened by nab170130 0

Submodlib integration - Multiple changes and feature additions

Main change: Pre-compute kernel before SIM function instantiations

Feature addition: Facilitate feature extraction from any layer in the neural network

Multiple other changes in commit messages

opened by surajkothawade 0

Doc Index Plots and Utils Docstrings

Added plots to the index page of the documentation. Added docstrings to utils files, but did not specify autodoc construction to be included in documentation.

opened by nab170130 0

Grammar-check readme, fix file names and imports in all files

Grammar-checked the readme file. Fixed the file names so that all are lowercase, matching convention. Fixed imports in ALL files. Removed the import block in the __init__.py file in active_learning_strategies and fixed imports in all relevant files. Each file was tested: all attempted imports worked, and every example / testing script and notebook worked.

Note: THE GOOGLE COLAB NOTEBOOKS THAT ARE LINKED IN THE README NEED TO HAVE THE CHANGES IN THE IMPORT STATEMENTS PRESENT IN THE NOTEBOOKS FOLDER. MAKE SURE TO INCORPORATE THOSE CHANGES WHEN MERGING.

opened by nab170130 0

Merge Grammar-Checked README and Fixed File Names

Grammar-checked the README present from the commit on the main branch that this current branch originated. Fixed the file names such that all are lower case. Fixed the affected import statements. Note: Any notebook dependent on everything within the distil folder needs to have their import statements checked / fixed!

opened by nab170130 0

Device fix and import trimming

Added optional device parameter to each learning strategy and otherwise ensured that the specified device was being used in all locations where a torch object was being moved. Further trimmed imported but never used libraries.

opened by nab170130 0

Used with Custom Datasets

Though the README says that DISTIL could be used with custom datasets, there are no tutorials to support this claim, and I have not been able to use any Kaggle datasets either. Please include instructions for using a dataset other than the pre-defined ones.

opened by svpowell 0

Is there code to draw the diagram under the experiment_plot folder

Hi, thank you very much for the toolkit. I want to plot the experimental comparison results, but I don't know how to produce the same plots as in the paper. Could you provide the code for these plots? Thanks.

opened by InstantWindy 0

Semi-Supervised Learning

Hello!

Is it possible to perform Active-Learning with a combination of SSL methods such as Virtual Adversarial Training (VAT), Entropy Minimization (EntMin), etc?

I believe that this would be the major benefit of using DL for active learning. Otherwise, one can use an easier model to train & tune after each iteration.

Do you also think Extreme Learning Machine could be useful as a one-shot learning method to speed up the active-learning iterations with your library?

opened by kayuksel 0

Releases (0.2.0)

0.2.0 (May 8, 2021)
0.0.3 (Jan 6, 2021)

Owner

decile-team

DECILE: Data EffiCient machIne LEarning

GitHub https://decile-team-distil.readthedocs.io/en/latest/index.html

