Learning where to learn - Gradient sparsity in meta and continual learning

Johannes Oswald

Last update: Dec 9, 2022

Related tags

Deep Learning learning_where_to_learn

Overview

Learning where to learn - Gradient sparsity in meta and continual learning

In this paper, we investigate gradient sparsity found by MAML in various continual and few-shot learning scenarios.
Instead of only learning the initialization of neural network parameters, we additionally meta-learn parameters underneath a step function that stops gradient descent when smaller then 0.

We term this version Sparse-MAML - Link to the paper here.

Interestingly, we see that structured sparsity emerges in both the classic 4-layer ConvNet as well as a ResNet-12 for few-shot learning. This is accompanied by improved robustness and generalisation across many hyperparameters.

Note that Sparse-MAML is an extremely simple variant of MAML that possesses only the possibility to shut on/off training of specific parameters compared to proper gradient modulation.

This codebase implents the few-shot learning experiments that are presented in the paper. To reproduce the results in the paper, please follow these instructions:

Installation

#1. Install a conda env:

conda create -n sparse-MAML

#2. Activate the env:

source activate sparse-MAML

#3. Install anaconda:

conda install anaconda

#4. Install extra requiremetns (make sure you use the correct pip3):

pip3 install -r requirements.txt

#5. Run:

chmod u+x run_sparse_MAML.sh

#6. Execute:

./run_sparse_MAML.sh

Results

MiniImageNet Few-Shot	MAML	ANIL	BOIL	sparse-MAML	sparse-ReLU-MAML
5-way 5-shot \| ConvNet	63.15	61.50	66.45	67.03	64.84
5-way 1-shot \| ConvNet	48.07	46.70	49.61	50.35	50.39
5-way 5-shot \| ResNet12	69.36	70.03	70.50	70.02	73.01
5-way 1-shot \| ResNet12	53.91	55.25	-	55.02	56.39

BOIL results are taken from the original paper.

This code based is heavily build on top of torchmeta.

You might also like...

Avalanche RL: an End-to-End Library for Continual Reinforcement Learning

43 Dec 24, 2022

Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper

Continual Learning With Filter Atom Swapping Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper If find t

11 Aug 29, 2022

Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR 2022)

The Official Implementation of CLIB (Continual Learning for i-Blurry) Online Continual Learning on Class Incremental Blurry Task Configuration with An

34 Oct 26, 2022

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

H2O H2O is an in-memory platform for distributed, scalable machine learning. H2O uses familiar interfaces like R, Python, Scala, Java, JSON and the Fl

6.1k Jan 5, 2023

Comments

Undefined variables and data pre-processing
Hi,

Thanks for your nice work!!

I have several questions.

Some variables are not defined before being used, e.g., class_augmentations in https://github.com/Johswald/learning_where_to_learn/blob/1441fec556a570a6bd688e0581352c830dba6f88/utils/utils.py#L120

How do U pre-process the data for each task? In other words, how do U get the data for each task in detail? Where could we find the related code?

Best.
opened by TiankaiHang 1
CVE-2007-4559 Patch

Patching CVE-2007-4559

Hi, we are security researchers from the Advanced Research Center at Trellix. We have began a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15 year old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsantized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

If you have further questions you may contact us through this projects lead researcher Kasimir Schulz.

opened by TrellixVulnTeam 0

Learning where to learn - Gradient sparsity in meta and continual learning

Related tags

Overview

Learning where to learn - Gradient sparsity in meta and continual learning

Installation

Results

You might also like...

Avalanche RL: an End-to-End Library for Continual Reinforcement Learning

Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper

Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR 2022)

CoReD: Generalizing Fake Media Detection with Continual Representation using Distillation (ACMMM'21 Oral Paper)

ICSS - Interactive Continual Semantic Segmentation

[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

Official repository for the paper "Self-Supervised Models are Continual Learners" (CVPR 2022)

[CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation

Comments

Undefined variables and data pre-processing

CVE-2007-4559 Patch

Patching CVE-2007-4559

Owner

Johannes Oswald

Repository for "Exploring Sparsity in Image Super-Resolution for Efficient Inference", CVPR 2021

[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity

Implementation of "Meta-rPPG: Remote Heart Rate Estimation Using a Transductive Meta-Learner"

This project uses reinforcement learning on stock market and agent tries to learn trading. The goal is to check if the agent can learn to read tape. The project is dedicated to hero in life great Jesse Livermore.

Cl datasets - PyTorch image dataloaders and utility functions to load datasets for supervised continual learning

PyTorch implementation of: Michieli U. and Zanuttigh P., "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations", CVPR 2021.

Official code of CVPR 2021's PLOP: Learning without Forgetting for Continual Semantic Segmentation

CL-Gym: Full-Featured PyTorch Library for Continual Learning

PyTorch implementation of our Adam-NSCL algorithm from our CVPR2021 (oral) paper "Training Networks in Null Space for Continual Learning"