PyContinual (An Easy and Extendible Framework for Continual Learning)

Overview

PyContinual is an easy and extendible framework for continual learning.

Easy to Use

You can simply change the baseline, backbone and task, and you are ready to go. Here is an example:

	# --backbone: bert_adapter, or other backbones (bert, w2v, ...)
	# --baseline: ctr, or other available baselines (classic, ewc, ...)
	# --task: asc, or other available tasks/datasets (dsc, newsgroup, ...)
	# --scenario: til_classification, or other available scenarios (dil_classification, ...)
	# --idrandom: which random task sequence to use
	# --use_predefine_args: use pre-defined arguments
	python run.py \
	    --bert_model 'bert-base-uncased' \
	    --backbone bert_adapter \
	    --baseline ctr \
	    --task asc \
	    --eval_batch_size 128 \
	    --train_batch_size 32 \
	    --scenario til_classification \
	    --idrandom 0 \
	    --use_predefine_args
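
For orientation, the ewc baseline refers to Elastic Weight Consolidation, which penalizes changes to parameters that were important for earlier tasks. Below is a generic sketch of the EWC regularizer, not this repo's implementation; all names are illustrative:

	import torch

	def ewc_penalty(model, fisher, old_params, lam=5000.0):
	    """Generic EWC penalty: lam/2 * sum_i F_i * (theta_i - theta_i*)^2.

	    fisher and old_params are dicts keyed by parameter name, holding the
	    diagonal Fisher estimates and the parameters frozen after the old task.
	    """
	    penalty = 0.0
	    for name, p in model.named_parameters():
	        if name in fisher:
	            penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
	    return 0.5 * lam * penalty

	# in a training step (sketch): loss = task_loss + ewc_penalty(model, fisher, old_params)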

Easy to Extend

You only need to write your own ./dataloader, ./networks and ./approaches modules, and you are ready to go! A minimal skeleton is sketched below.
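
For orientation, here is a minimal, purely illustrative skeleton of a custom approach (the class and method names are hypothetical, not the framework's actual interface; see run_own.md for the real contract):

	# ./approaches/my_approach.py -- hypothetical skeleton, not the framework's real API
	import torch
	import torch.nn.functional as F

	class MyApproach:
	    """A custom continual-learning approach: one training cycle per task."""

	    def __init__(self, model, args):
	        self.model = model
	        self.optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)

	    def train(self, task_id, train_loader):
	        self.model.train()
	        for inputs, targets in train_loader:
	            self.optimizer.zero_grad()
	            loss = F.cross_entropy(self.model(inputs), targets)
	            loss.backward()
	            self.optimizer.step()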

Introduction

Continual learning has drawn increasing attention in recent years. This repo contains PyTorch implementations of a set of (improved) state-of-the-art methods that share the same training and evaluation pipeline.

This repository contains the code for the papers listed in the Reference section below.

Features

  • Datasets: It currently supports Language Datasets (Document/Sentence/Aspect Sentiment Classification, Natural Language Inference, Topic Classification) and Image Datasets (CelebA, CIFAR10, CIFAR100, FashionMNIST, F-EMNIST, MNIST, VLCS)
  • Scenarios: It currently supports Task Incremental Learning and Domain Incremental Learning
  • Training Modes: It currently supports single-GPU training; it can also be adapted to multi-node distributed training and mixed-precision training (see the sketch below).
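
For mixed precision, PyTorch's native AMP utilities are one option. A minimal sketch of a training step using torch.cuda.amp (a generic example, not code from this repo):

	import torch

	scaler = torch.cuda.amp.GradScaler()

	def amp_step(model, optimizer, inputs, targets):
	    optimizer.zero_grad()
	    with torch.cuda.amp.autocast():   # run the forward pass in mixed precision
	        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
	    scaler.scale(loss).backward()     # scale the loss to avoid fp16 gradient underflow
	    scaler.step(optimizer)            # unscales gradients, then calls optimizer.step()
	    scaler.update()                   # adjust the scale factor for the next iteration
	    return loss.item()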

Architecture

./res: all results are saved in this folder
./dat: processed data
./data: raw data
./dataloader: dataloaders for the different datasets
./approaches: code for training
./networks: code for network architectures
./data_seq: reference task sequences (e.g. asc_random)
./tools: code for preparing the data

Setup

  • If you want to run the existing systems, please see run_exist.md
  • If you want to expand the framework with your own model, please see run_own.md
  • If you want to see the full list of baselines and variants, please see baselines.md

Reference

If you use this code, parts of it, or developments from it, please consider citing the references below.

@inproceedings{ke2021achieve,
  title={Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning},
  author={Ke, Zixuan and Liu, Bing and Ma, Nianzu and Xu, Hu and Shu, Lei},
  booktitle={NeurIPS},
  year={2021}
}

@inproceedings{ke2021contrast,
  title={CLASSIC: Continual and Contrastive Learning of Aspect Sentiment Classification Tasks},
  author={Ke, Zixuan and Liu, Bing and Xu, Hu and Shu, Lei},
  booktitle={EMNLP},
  year={2021}
}

@inproceedings{ke2021adapting,
  title={Adapting BERT for Continual Learning of a Sequence of Aspect Sentiment Classification Tasks},
  author={Ke, Zixuan and Xu, Hu and Liu, Bing},
  booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  pages={4746--4755},
  year={2021}
}

@inproceedings{ke2020continualmixed,
  title={Continual Learning of a Mixed Sequence of Similar and Dissimilar Tasks},
  author={Ke, Zixuan and Liu, Bing and Huang, Xingchang},
  booktitle={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

@inproceedings{ke2020continual,
  title={Continual Learning with Knowledge Transfer for Sentiment Classification},
  author={Ke, Zixuan and Liu, Bing and Wang, Hao and Shu, Lei},
  booktitle={ECML-PKDD},
  year={2020}
}

Contact

Please drop an email to Zixuan Ke, Xingchang Huang or Nianzu Ma if you have any questions regarding the code. We thank Bing Liu, Hu Xu and Lei Shu for their valuable comments and opinions.

Comments
  • Import error after executing B-CL script

    Hi, I encountered an ImportError when executing the following script: ./commands/til_classification/asc/run_train_bert_adapter_capsule_mask_ncl.sh. The error message is as follows:

        Inits...
        Traceback (most recent call last):
          File "run.py", line 81, in <module>
            net=import_modules.network.Net(taskcla,args=args)
          File "/tmp2/mistel/test/PyContinual/src/networks/classification/bert_adapter_capsule_mask.py", line 19, in __init__
            self.bert = MyBertModel.from_pretrained(args.bert_model,config=config,args=args)
          File "/home/mistel/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1385, in from_pretrained
            model = cls(config, *model_args, **model_kwargs)
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 664, in __init__
            self.encoder = MyBertEncoder(config,args)
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 521, in __init__
            self.layer = nn.ModuleList([MyBertLayer(config,args) for _ in range(config.num_hidden_layers)])
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 521, in <listcomp>
            self.layer = nn.ModuleList([MyBertLayer(config,args) for _ in range(config.num_hidden_layers)])
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 363, in __init__
            self.attention = MyBertAttention(config,args)
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 142, in __init__
            self.output = MyBertSelfOutput(config,args)
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 86, in __init__
            from networks.classification.adapters import BertAdapterCapsuleMask
        ModuleNotFoundError: No module named 'networks.classification.adapters'

    After I changed the import path in my_transformers.py from "from networks.classification.adapters import BertAdapterCapsuleMask" to "from networks.base.adapters import BertAdapterCapsuleMask", the following message occurred:

        Inits... apply to attention BertAdapter BertAdapterMask apply_one_layer_shared CapsuleLayer
        Traceback (most recent call last):
          File "run.py", line 81, in <module>
            net=import_modules.network.Net(taskcla,args=args)
          File "/tmp2/mistel/test/PyContinual/src/networks/classification/bert_adapter_capsule_mask.py", line 19, in __init__
            self.bert = MyBertModel.from_pretrained(args.bert_model,config=config,args=args)
          File "/home/mistel/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1385, in from_pretrained
            model = cls(config, *model_args, **model_kwargs)
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 664, in __init__
            self.encoder = MyBertEncoder(config,args)
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 521, in __init__
            self.layer = nn.ModuleList([MyBertLayer(config,args) for _ in range(config.num_hidden_layers)])
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 521, in <listcomp>
            self.layer = nn.ModuleList([MyBertLayer(config,args) for _ in range(config.num_hidden_layers)])
          File "/tmp2/mistel/test/PyContinual/src/./networks/base/my_transformers.py", line 101, in __init__
            self.adapter_capsule_mask = BertAdapterCapsuleMask(args)
          File "/tmp2/mistel/test/PyContinual/src/networks/base/adapters.py", line 628, in __init__
            self.capsule_net = CapsNet(config)
          File "/tmp2/mistel/test/PyContinual/src/networks/base/adapters.py", line 667, in __init__
            self.tsv_capsules = CapsuleLayer(config,'tsv')
          File "/tmp2/mistel/test/PyContinual/src/networks/base/adapters.py", line 698, in __init__
            self.tsv = torch.tril(torch.ones(config.ntasks,config.ntasks)).data.cuda() # for backward
        RuntimeError: CUDA error: out of memory

    It seems the second error occurred after I changed CUDA_VISIBLE_DEVICES=1 to CUDA_VISIBLE_DEVICES=0 in ./commands/til_classification/asc/run_train_bert_adapter_capsule_mask_ncl.sh. However, in my environment cuda:0 is an RTX 3090 and cuda:1 is a 1080 Ti, so running out of memory on the larger card does not make sense, and I wonder whether it is an error in adapters.py.

    Thanks for your patience. Best regards.

    opened by mistel1225 3
  • Undefined args.aux_net

    Hi ~

    Thanks for your framework for continual learning in NLP! However, I find that args.aux_net in https://github.com/ZixuanKe/PyContinual/blob/9beb0daced7746aaf83be4e99b8aa764b0c46056/src/run.py#L231 is never defined.

    Best, Qi-Wei

    opened by wangkiw 2
  • Question for paper "Adapting BERT for Continual Learning of a Sequence of Aspect Sentiment Classification Tasks"

    In the Task Capsule Layer (TCL), each capsule represents a task, and TCL prepares low-level features derived from each task; a capsule is therefore added to TCL for every new task. What I don't understand is: when the model learns a new task, does it create a capsule only for the new task, or does it create capsules for all tasks?

    opened by leducthanguet 2
  • Bug occurs when you delete unblock_attention in args

    It seems that after you delete unblock_attention in args, this bug occurs when running B-CL:

        Traceback (most recent call last):
          File "run.py", line 186, in <module>
            appr.train(task,train_dataloader,valid_dataloader,num_train_steps,train,valid)
          File "/content/PyContinual/src/approaches/classification/bert_adapter_capsule_mask.py", line 60, in train
            global_step=self.train_epoch(t,train,iter_bar, optimizer,t_total,global_step)
          File "/content/PyContinual/src/approaches/classification/bert_adapter_capsule_mask.py", line 137, in train_epoch
            p.grad.data*=self.model.get_view_for_tsv(n,t) #open for general
          File "/content/PyContinual/src/networks/classification/bert_adapter_capsule_mask.py", line 193, in get_view_for_tsv
            if not self.args.unblock_attention:
        AttributeError: 'Namespace' object has no attribute 'unblock_attention'
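
    One defensive workaround (a sketch, not an official fix) would be to fall back to a default when the flag is absent, e.g. in get_view_for_tsv:

        # sketch: tolerate a missing command-line flag by assuming a default value
        if not getattr(self.args, 'unblock_attention', False):
            ...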

    opened by Si1verBul13tzxc 2
  • Missing files import_til(dil)_extraction

    code in run.py:

        # Args -- Experiment
        if args.scenario == 'til_extraction':
            import import_til_extraction as import_modules
        elif args.scenario == 'dil_extraction':
            import import_dil_extraction as import_modules
        else:
            import import_classification as import_modules

    It seems that import_til_extraction and import_dil_extraction are missing? Or are these two files the same as import_classification?

    opened by szhsjsu 2
  • Can CTR be extended to CNN backbones?

    Hello, I admire your wonderful work Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning, and I want to extend the CTR method to CNN backbones with image datasets. Is this possible? I notice that it seems impossible according to your baselines.md file.

    opened by Quantum-matrix 1
  • Reproduce the results from the paper

    Hello, thank you very much for the framework implementation. I would like to reproduce the results from the paper "Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning". Could you please provide the configuration file with the parameters of the experiments? With what parameters should I run the code to reproduce the results? Many thanks in advance!

    opened by J4nn4 1
  • remove trailing space from til_classification arg

    There is a trailing space in til_classification. Without this fix, you need to put til_classification in quotes with a trailing space when passing it as a command-line argument, which isn't the case for dil_classification.

    opened by modaccount 1
  • On 20News data processing

    When I re-run the existing system for 20news, I find that the input text looks like: "Newsgroups: rec.motorcycles\nPath: cantaloupe.srv.cs.cmu.edu!ro ...etc".

    Am I right that you do not discard the header (which often contains the name of the newsgroup label) during data processing?
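
    For reference, scikit-learn's 20 Newsgroups loader can strip this metadata directly (a generic sketch; this repo prepares the raw data via its own ./tools scripts and may behave differently):

        from sklearn.datasets import fetch_20newsgroups

        # remove=('headers', 'footers', 'quotes') drops metadata that can leak the label
        data = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'))
        print(data.data[0][:200])  # body text only, no "Newsgroups:" header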

    opened by Chen-Hailin 1
  • Add baselines

    1. Add the generation and token classification models for the Der++, A-GEM, LDBR, DyTox and DualPrompt baselines.
    2. Add RoBERTa for prompt-based methods (previously only BART).
    3. Add buffer.py in ./networks for replay-based methods (see the sketch after this list).
    4. Note that LDBR and DyTox are not suitable for generation and NER: LDBR adds a 'sep' token into the input to perform an NSP task, and DyTox uses a task-attention block that attends only to a task token (outputting a sentence-level representation). They are therefore not included in this update.
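
    For readers unfamiliar with replay buffers, a generic reservoir-sampling buffer looks roughly like the sketch below (purely illustrative; not the buffer.py added in this PR):

        import random

        class ReservoirBuffer:
            """Keep a uniform random sample of all examples seen so far."""

            def __init__(self, capacity):
                self.capacity = capacity
                self.data = []
                self.seen = 0

            def add(self, example):
                self.seen += 1
                if len(self.data) < self.capacity:
                    self.data.append(example)
                else:
                    # keep each incoming example with probability capacity / seen
                    idx = random.randrange(self.seen)
                    if idx < self.capacity:
                        self.data[idx] = example

            def sample(self, k):
                return random.sample(self.data, min(k, len(self.data)))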
    opened by linhaowei1 0