Anytime Learning At Macroscale

Overview

On Anytime Learning At Macroscale

Learning from sequential data dumps

(key) Requirements

  • Python 3.7
  • Pytorch 1.9.0
  • Hydra 1.1.0 (pip install hydra-core & pip install hydra-submitit-launcher)

Structure

├── crlapi           
  ├── benchmark.py    # Creates the data stream, feeds it to the model and evaluates it
  ├── core.py         # Abstract classes for 
  ├── logger.py   
  ├── sl
    ├── architectures
      ├── ...         # NN architectures used in this project
    ├── clmodels
      ├── ...         # Models (e.g. Single, gEns, ..., )
    ├── streams
      ├── ...         # CIFAR and MNIST stream implementatins

Running Experiments

To run experiments, you need to call the dataset specific run file, and you need to pass the configuration of the run. We have place the configurations in the previous directory (../configs). The config structure is as follows

    ├── configs
        ├── mnist
           ├── run.py                 # run file
           ├── test_usage_gmoe.yaml   # This is the "gMoE" model
           ├── test_finetune_mlp.yaml # This is the "Single Model"
           ... 
        ├── cifar
           ├── run.py                 # run file
           ├── test_finetune_vgg.yaml # This is the "Single Model"
           ├── test_usage_gmoe.yaml   # This is the "gMoE" model
           ...

To run an e.g. mnist gMoE run, the command is (launched from the directory just above (so cd ..)

PYTHONPATH=./ python configs/mnist/run.py -cn test_usage_gmoe n_megabatches=2 replay=1 clmodel.max_epochs=200 

Important arguments

n_megabatches : controls the number of megabatches. So n_megabatches=1 is your regular full dataset training
replay : whether to use replay or not
clmodel.init_from_scratch : whether to reinitialize the model at every MB. Should only be used when replay=1
device : use cuda or cpu depending on your hardware

License

alma is released under the MIT license. See LICENSE for additional details about it. See also our Terms of Use and Privacy Policy.

You might also like...
cuML - RAPIDS Machine Learning Library
cuML - RAPIDS Machine Learning Library

cuML - GPU Machine Learning Algorithms cuML is a suite of libraries that implement machine learning algorithms and mathematical primitives functions t

A modular active learning framework for Python
A modular active learning framework for Python

Modular Active Learning framework for Python3 Page contents Introduction Active learning from bird's-eye view modAL in action From zero to one in a fe

mlpack: a scalable C++ machine learning library --
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

A library of extension and helper modules for Python's data analysis and machine learning libraries.
A library of extension and helper modules for Python's data analysis and machine learning libraries.

Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks. Sebastian Raschka 2014-2021 Links Doc

50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster
50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster

[Due to the time taken @ uni, work + hell breaking loose in my life, since things have calmed down a bit, will continue commiting!!!] [By the way, I'm

Machine Learning toolbox for Humans

Reproducible Experiment Platform (REP) REP is ipython-based environment for conducting data-driven research in a consistent and reproducible way. Main

Sequence learning toolkit for Python

seqlearn seqlearn is a sequence classification toolkit for Python. It is designed to extend scikit-learn and offer as similar as possible an API. Comp

Simple structured learning framework for python

PyStruct PyStruct aims at being an easy-to-use structured learning and prediction library. Currently it implements only max-margin methods and a perce

Comments
  • Adding Code of Conduct file

    Adding Code of Conduct file

    This is pull request was created automatically because we noticed your project was missing a Code of Conduct file.

    Code of Conduct files facilitate respectful and constructive communities by establishing expected behaviors for project contributors.

    This PR was crafted with love by Facebook's Open Source Team.

    CLA Signed 
    opened by facebook-github-bot 0
  • Adding Contributing file

    Adding Contributing file

    This is pull request was created automatically because we noticed your project was missing a Contributing file.

    CONTRIBUTING files explain how a developer can contribute to the project - which you should actively encourage.

    This PR was crafted with love by Facebook's Open Source Team.

    CLA Signed 
    opened by facebook-github-bot 0
Owner
Meta Research
Meta Research
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Augusto Almeida 84 Nov 25, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

Vowpal Wabbit 8.1k Dec 30, 2022
Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft 366 Jan 3, 2023
A data preprocessing package for time series data. Design for machine learning and deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

Allen Chiang 152 Jan 7, 2023
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.

A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.

Daniel Formoso 5.7k Dec 30, 2022
A comprehensive repository containing 30+ notebooks on learning machine learning!

A comprehensive repository containing 30+ notebooks on learning machine learning!

Jean de Dieu Nyandwi 3.8k Jan 9, 2023
MIT-Machine Learning with Python–From Linear Models to Deep Learning

MIT-Machine Learning with Python–From Linear Models to Deep Learning | One of the 5 courses in MIT MicroMasters in Statistics & Data Science Welcome t

null 2 Aug 23, 2022
CD) in machine learning projectsImplementing continuous integration & delivery (CI/CD) in machine learning projects

CML with cloud compute This repository contains a sample project using CML with Terraform (via the cml-runner function) to launch an AWS EC2 instance

Iterative 19 Oct 3, 2022
Implemented four supervised learning Machine Learning algorithms

Implemented four supervised learning Machine Learning algorithms from an algorithmic family called Classification and Regression Trees (CARTs), details see README_Report.

Teng (Elijah)  Xue 0 Jan 31, 2022
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

Chao Ma 3k Jan 8, 2023