Continuum Learning with GEM: Gradient Episodic Memory

Facebook Research

Last update: Dec 27, 2022

Related tags

Deep Learning GradientEpisodicMemory

Overview

Gradient Episodic Memory for Continual Learning

Source code for the paper:

@inproceedings{GradientEpisodicMemory,
    title={Gradient Episodic Memory for Continual Learning},
    author={Lopez-Paz, David and Ranzato, Marc'Aurelio},
    booktitle={NIPS},
    year={2017}
}

To replicate the experiments, execute ./run_experiments.sh.

This source code is released under an Attribution-NonCommercial 4.0 International license; find out more about it here.

Comments

ValueError: matrix G is not positive definite
Hey team

Thank you so much for releasing the code for your paper. It is very useful. I am facing one problem with the code.

I am training my code using a series of tasks. The code works fine for initial tasks but once in a while it would crash with the following error:

v = quadprog.solve_qp(P, q, G, h)[0]
File "quadprog/quadprog.pyx", line 104, in quadprog.solve_qp
ValueError: matrix G is not positive definite

I can think of one reason why this could happen. Let's say that the gradient for the current task points in a direction completely opposite to the gradient for the i-th previous task. In that case, when we project the current gradient with respect to the episodic gradient, the projection turns out to be a non-positive-definite matrix (in the extreme case, the projection is exactly the zero vector).

Do you know of any other reasons why this might happen, or whether I could get around this problem by tweaking some hyperparameters?
opened by shagunsodhani 6
Memory Error while running the code

Can you please let me know the system specs on which the model was trained? I am trying to run gem.py on an RTX 2080 GPU, but I am getting a memory error. Thank you.

opened by vidit98 1
Incremental Learning

Can incremental learning be achieved? That is, if a model can already recognize images of cats and dogs, can new categories be added on top of it without retraining on all the data?

opened by LIMr1209 1
ValueError: matrix G is not positive definite

Thanks for your great work and for releasing this code.

I am having the same problem. As suggested I have done the following:

G = np.eye(t) + np.eye(t)*0.00001

But I still get the same error:

v = quadprog.solve_qp(P, q, G, h)[0]
File "quadprog/quadprog.pyx", line 104, in quadprog.solve_qp
ValueError: matrix G is not positive definite

Any suggestions?

opened by andreamad8 1
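
A note on the error above, offered as a sketch rather than a confirmed fix. To my knowledge, quadprog.solve_qp names its first positional argument G, so in the call solve_qp(P, q, G, h) the message most likely refers to the matrix passed first (P here), not to the constraint matrix that the calling code happens to call G. If that is right, adding a small multiple of the identity to np.eye(t) changes nothing, and the regularization would have to go on P instead. The helper name make_positive_definite, the eps schedule, and the toy memories matrix below are hypothetical and are not taken from gem.py.

```python
import numpy as np


def make_positive_definite(P, eps=1e-3, max_tries=10):
    """Symmetrize P, then add eps * I (growing eps tenfold per try)
    until a Cholesky factorization succeeds.

    Hypothetical helper for illustration; not part of gem.py or quadprog.
    """
    P = 0.5 * (P + P.T)
    eye = np.eye(P.shape[0])
    for _ in range(max_tries):
        candidate = P + eps * eye
        try:
            # Cholesky only succeeds for positive definite matrices.
            np.linalg.cholesky(candidate)
            return candidate
        except np.linalg.LinAlgError:
            eps *= 10.0
    raise ValueError("could not make the matrix positive definite")


# Toy case: two exactly opposite memory gradients give a singular Gram matrix.
memories = np.array([[1.0, 0.0], [-1.0, 0.0]])
P = memories @ memories.T            # [[1, -1], [-1, 1]], eigenvalues 0 and 2
P_reg = make_positive_definite(P)    # eigenvalues shifted up by eps
```

The regularized matrix would then be passed as the first argument to the solver in place of P. Whether a fixed eps is enough depends on the scale of the gradients, which is why the sketch grows it until the factorization succeeds.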
Unable to run the code

Hi,

While trying to replicate the ICARL code, the following line raises an error: https://github.com/facebookresearch/GradientEpisodicMemory/blob/master/model/icarl.py#L141

The dictionary self.mem_class_x has tensors as keys, but it is accessed with integers. This leads to a KeyError.

Can you suggest something?

opened by gravity1989 1
Small fix in results/plot_results.py

The script "plot_results.py" was not working, so I changed acc, bwt, fwt = data[3][5:] to acc, bwt, fwt = data[3][:]

(I also added a gitignore and an option to not download the data each time the script is run :) )

opened by TLESORT 1
Question about EWC Implementation

Hello developers! I looked at the implementation of EWC and got confused about how the diagonal of the empirical Fisher Information Matrix is computed. In https://github.com/facebookresearch/GradientEpisodicMemory/blob/34c6b8e9a0607db7567301c48b727430d20bee7e/model/ewc.py#L88 the square of the sum of the gradients of all data points in the memory is computed, but this is not equal to the sum of the squares of the gradients. The latter should be used to compute the empirical Fisher; see https://wiseodd.github.io/techblog/2018/03/11/fisher-information/. Could you confirm this?

opened by yunboouyang 0
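
The distinction raised in this question can be seen on a toy example. The per-sample gradients below are made-up numbers, not output from ewc.py, and the variable names are hypothetical; the sketch only illustrates that squaring the summed gradient and summing the squared gradients can give very different values.

```python
import numpy as np

# Hypothetical per-sample gradients: 3 samples (rows), 2 parameters (columns).
per_sample_grads = np.array([
    [1.0, -1.0],
    [2.0, 1.0],
    [-3.0, 0.0],
])

# Sum of squares: square each sample's gradient, then sum over samples.
# This is the quantity the question says the empirical Fisher diagonal needs.
sum_of_squares = (per_sample_grads ** 2).sum(axis=0)   # -> [14., 2.]

# Square of sum: sum the gradients over samples first, then square.
# Gradients with opposite signs cancel before squaring.
square_of_sum = per_sample_grads.sum(axis=0) ** 2      # -> [0., 0.]
```

In this example the gradients cancel exactly, so the square of the sum is zero for both parameters even though every sample has a nonzero gradient.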

