DeepAL: Deep Active Learning in Python

Overview

Python implementations of the following active learning algorithms (a short illustrative sketch of the uncertainty-based criteria is shown after the list):

  • Random Sampling
  • Least Confidence [1]
  • Margin Sampling [2]
  • Entropy Sampling [3]
  • Uncertainty Sampling with Dropout Estimation [4]
  • Bayesian Active Learning Disagreement [4]
  • Core-Set Selection [5]
  • Adversarial Margin [6]
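
As a quick illustration of the uncertainty-based criteria above (Least Confidence, Margin Sampling, Entropy Sampling), the scores can be computed from a model's softmax outputs roughly as follows. This is a sketch for orientation only; the function name uncertainty_scores is hypothetical and not part of DeepAL's API.

    import numpy as np

    def uncertainty_scores(probs, criterion="entropy"):
        """Score unlabeled samples from softmax outputs (higher = more uncertain).

        probs: array of shape (n_samples, n_classes), rows summing to 1.
        Illustrative sketch only; not DeepAL's actual implementation.
        """
        if criterion == "least_confidence":
            # 1 - probability of the predicted (most likely) class
            return 1.0 - probs.max(axis=1)
        if criterion == "margin":
            # Negative gap between the top-2 class probabilities
            # (a smaller gap means a harder, more informative sample)
            sorted_probs = np.sort(probs, axis=1)
            return -(sorted_probs[:, -1] - sorted_probs[:, -2])
        if criterion == "entropy":
            # Shannon entropy of the predictive distribution
            return -(probs * np.log(probs + 1e-12)).sum(axis=1)
        raise ValueError(f"unknown criterion: {criterion}")

    # Example: index of the most uncertain sample (here n_query = 1);
    # probs would come from the trained model's predictions on the unlabeled pool
    probs = np.array([[0.90, 0.05, 0.05], [0.40, 0.35, 0.25]])
    query_idxs = np.argsort(uncertainty_scores(probs, "margin"))[-1:]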

Prerequisites

  • numpy 1.21.2
  • scipy 1.7.1
  • pytorch 1.10.0
  • torchvision 0.11.1
  • scikit-learn 1.0.1
  • tqdm 4.62.3
  • ipdb 0.13.9

You can also use the following command to create the conda environment:

conda env create -f environment.yml
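
If you prefer pip, the pinned versions listed above can also be installed directly (an untested convenience command, not part of the repository's documented setup):

    pip install numpy==1.21.2 scipy==1.7.1 torch==1.10.0 torchvision==0.11.1 scikit-learn==1.0.1 tqdm==4.62.3 ipdb==0.13.9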

Demo

  python demo.py \
      --n_round 10 \
      --n_query 1000 \
      --n_init_labeled 10000 \
      --dataset_name MNIST \
      --strategy_name RandomSampling \
      --seed 1

Please refer here for more details.
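
Conceptually, the demo runs a standard pool-based active learning loop: starting from n_init_labeled labeled examples, it performs n_round rounds, each time querying n_query new labels with the chosen strategy and retraining the model. A generic sketch of such a loop is shown below; the object names and methods (strategy.query, dataset.label, net.train, net.evaluate) are hypothetical and only illustrate the control flow, not the repository's exact API.

    def active_learning_loop(dataset, net, strategy, n_round, n_query):
        """Generic pool-based active learning loop (illustrative sketch only)."""
        # Train an initial model on the seed labeled set
        net.train(dataset.labeled_data())
        for rd in range(1, n_round + 1):
            # 1) Ask the strategy for the n_query most informative unlabeled samples
            query_idxs = strategy.query(n_query)
            # 2) Reveal their labels, i.e. move them into the labeled pool
            dataset.label(query_idxs)
            # 3) Retrain on the enlarged labeled pool and report test accuracy
            net.train(dataset.labeled_data())
            print(f"Round {rd} testing accuracy: {net.evaluate(dataset.test_data()):.4f}")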

Citing

If you use our code in your research or applications, please consider citing our paper.

@article{Huang2021deepal,
    author    = {Kuan-Hao Huang},
    title     = {DeepAL: Deep Active Learning in Python},
    journal   = {arXiv preprint arXiv:2111.15258},
    year      = {2021},
}

References

[1] A Sequential Algorithm for Training Text Classifiers, SIGIR, 1994

[2] Active Hidden Markov Models for Information Extraction, IDA, 2001

[3] Active Learning Literature Survey, University of Wisconsin-Madison Department of Computer Sciences, 2009

[4] Deep Bayesian Active Learning with Image Data, ICML, 2017

[5] Active Learning for Convolutional Neural Networks: A Core-Set Approach, ICLR, 2018

[6] Adversarial Active Learning for Deep Networks: a Margin Based Approach, arXiv, 2018

Comments
  • Error with coreset strategy

    I tried the coreset strategy with various initializations of NUM_INIT_LB and NUM_QUERY. All of them produce errors as shown below. It looks like 'sols{}.pkl' is not being generated.

    (pytorch_p36) [ec2-user@ip-172-31-38-51 deep-active-learning]$ python run.py
    /home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
      import imp
    number of labeled pool: 1000
    number of unlabeled pool: 39000
    number of testing pool: 10000
    MNIST
    SEED 1
    CoreSet
    Round 0
    testing accuracy 0.7458
    Round 1
    calculate distance matrix
    /home/ec2-user/src/deep-active-learning/query_strategies/core_set.py:24: RuntimeWarning: invalid value encountered in sqrt
      dist_mat = np.sqrt(dist_mat)
    0:00:22.086067
    calculate greedy solution
    greedy solution 0/50
    greedy solution 10/50
    greedy solution 20/50
    greedy solution 30/50
    greedy solution 40/50
    0:00:13.353899

    /home/ec2-user/src/deep-active-learning/query_strategies/core_set.py(64)query()
         63
    ---> 64     sols = pickle.load(open('sols{}.pkl'.format(SEED), 'rb'))
         65

    ipdb>

    Exiting Debugger.

    opened by kaushikpavani 4
  • Mapping between paper names to python class

    Hey, thanks for this awesome repository. A quick question: is there a mapping between the paper names and the Python code classes? Specifically, I'm interested in "Active Learning for Convolutional Neural Networks: A Core-Set Approach", which is supposed to be one of these: "RandomSampling", "LeastConfidence", "MarginSampling", "EntropySampling", "LeastConfidenceDropout", "MarginSamplingDropout", "EntropySamplingDropout", "KMeansSampling", "KCenterGreedy", "BALDDropout", "AdversarialBIM", "AdversarialDeepFool".

    opened by kobybibas 2
  • Confusion about network constructor

    It seems that each time I call Net.train(self, data), a new network with freshly initialized parameters is constructed (as shown in https://github.com/ej0cl6/deep-active-learning/blob/563723356421bc7d82e3496700265992cf7fcb06/nets.py#L17). As far as I know, in active learning settings the network's parameters are usually trained continuously after each query, rather than constructing a new network and training from scratch. So perhaps the constructor of class Net should look like:

    class Net:
        def __init__(self, net, params, device):
            self.net = net
            self.params = params
            self.device = device
            self.clf = self.net().to(self.device)    # add
            
        def train(self, data):
            n_epoch = self.params['n_epoch']
            # self.clf = self.net().to(self.device)    # remove
            self.clf.train()
            optimizer = optim.SGD(self.clf.parameters(), **self.params['optimizer_args'])
    

    I'm a beginner in deep active learning, so the above may just be a misunderstanding on my part. Looking forward to your reply. Thank you.

    opened by UnpureRationalist 2
  • TypeError: missing arguments with ALBL strategy

    There is a bug in deep-active-learning/query_strategies/active_learning_by_learning.py line 9.

    self.strategy_list.append(RandomSampling(X, Y, idxs_lb, args))

    I got the error: TypeError: __init__() missing 2 required positional arguments: 'handler' and 'args'

    The fix should be:

    self.strategy_list.append(RandomSampling(X, Y, idxs_lb, net, handler, args))

    opened by kaushikpavani 2
  • Is there a mistake in Deepfool implementation?

    I believe there is a mistake in this line in adversarial_deepfool.py:

        if value_i < value_l:
            ri = value_i/np.linalg.norm(wi.numpy().flatten()) * wi
            value_l = value_i  # <--- this line should be added here, since otherwise value_i is just always the last value in the for loop?

    opened by saifullah3396 1
  • How can I add my own data set for active learning?

    Hello, thank you for implementing a code base that integrates multiple active learning methods! It's great work! I was wondering how I can add my own data sets and use this code to do active learning?

    opened by Djn-swjtu 1
  • Can this pipeline be applied to any deep learning model?

    Hey there,

    I'm looking to explore deep learning with active learning and semi-automated labels for a video classification project I have. I currently use a bidirectional LSTM and may also explore transformers, gradient-boosted decision trees, and SVMs. For LSTMs and transformers, is it possible for me to use my LSTM model with the current framework?

    • Tony
    opened by Tony363 1
  • TensorFlow Support

    Hi there, thanks for your great work. I'm wondering if you know of any implementation that supports TensorFlow, or do you have any plans to implement one as well?

    Thanks.

    opened by hadisaadat 1
  • Custom dataset for image classification

    Firstly, I would like to thank you for the library. Is there a way to use this framework on custom image datasets for image classification with active learning?

    opened by JosianeUwineza 1
  • CoreSet AL algorithm

    Hi, thanks for your excellent code. I have a question regarding the CoreSet AL algorithm. Your previous version included this algorithm, but now it is gone. May I know the reason? Thanks a lot in advance!

    opened by luuyin 1
  • Gurobi License

    Thanks for your incredible code. Everything works until I run model.optimize() in full_solver_gurobi.py, which raises the error: 'GurobiError: Model too large for size-limited license; visit https://www.gurobi.com/free-trial for a full license'. Gurobi does not seem to be free for large models (I ran 35000 vectors of size 512). I just want to quickly reproduce the reported results of the paper, not for any commercial purpose. Is there any other way to reproduce the results of the Core-Set paper or to get a free Gurobi license?

    opened by vietth-bka 1
Owner
Kuan-Hao Huang