LightningFSL: Pytorch-Lightning implementations of Few-Shot Learning models.

Xu Luo

Last update: Dec 11, 2022

Overview

LightningFSL: Few-Shot Learning with Pytorch-Lightning

This repo provides a number of pytorch-lightning implementations of FSL algorithms, including two official ones:

Boosting Few-Shot Classification with View-Learnable Contrastive Learning (ICME 2021)

Rectifying the Shortcut Learning of Background for Few-Shot Learning (NeurIPS 2021)

Advantages
Few-shot classification Results
- miniImageNet results
General Guide

Advantages:

This repository is built on top of LightningCLI, which is very convenient to use once you are familiar with it.

Enabling multi-GPU training
- Our implementation of the FSL framework allows DistributedDataParallel (DDP) to be used when training few-shot learning models, which, to the best of our knowledge, was not available before. Previous work uses DataParallel (DP) instead, which is inefficient and requires more computation and storage. We achieve this by modifying the DDP sampler of PyTorch, making it possible to sample few-shot learning tasks across devices. See dataset_and_process/samplers.py for details.
High reimplemented accuracy
- Our reimplementations of some FSL algorithms achieve strong performance. For example, our ResNet12 implementations of ProtoNet and Cosine Classifier achieve 76+ and 80+ accuracy, respectively, on the 5-way 5-shot (5w5s) task of miniImageNet. All results can be reproduced using the pre-defined configuration files in config/.
Quick and convenient creation of new algorithms
- Pytorch-lightning provides our codebase with a clean and modular structure. Built on top of LightningCLI, our codebase unifies the necessary basic components of FSL, making it easy to implement a brand-new algorithm. An implementation of an algorithm usually requires only three short additional files: one specifying the LightningModule, one specifying the classifier head, and one specifying all configurations. For example, see the code of ProtoNet (modules/PN.py, architectures/classifier/proto_head.py) and cosine classifier (modules/cosine_classifier.py, architectures/classifier/CC_head.py).
Easy reproducibility
- Every run saves a full copy of the yaml configuration file in the logging directory, enabling exact reproducibility (by reusing that saved yaml file directly instead of creating a new one).
Enabling both episodic/non-episodic algorithms
- Switch between the two with a single parameter, is_meta, in the configuration file.
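
To make the multi-GPU point above more concrete, here is a minimal pure-Python sketch of how few-shot tasks could be divided among devices. It is not the code in dataset_and_process/samplers.py: the class name `DistributedEpisodeSampler` and all of its arguments are hypothetical, and the strided split is only one possible assumption about how episodes are shared. Every rank draws the same episode list from a shared seed and keeps its own share, so no two ranks process the same position in that list.

```python
import random
from collections import defaultdict


class DistributedEpisodeSampler:
    """Hypothetical sketch: yields N-way K-shot episodes, split across ranks."""

    def __init__(self, labels, way, shot, query, num_episodes,
                 num_replicas=1, rank=0, seed=0):
        if not 0 <= rank < num_replicas:
            raise ValueError("rank must be in [0, num_replicas)")
        self.way, self.shot, self.query = way, shot, query
        self.num_episodes = num_episodes
        self.num_replicas, self.rank, self.seed = num_replicas, rank, seed
        self.epoch = 0
        # Group sample indices by class label.
        self.by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.by_class[label].append(idx)

    def set_epoch(self, epoch):
        # Changing the epoch changes the seed, so each epoch draws new episodes.
        self.epoch = epoch

    def __len__(self):
        # Episodes per rank; any remainder is dropped so all ranks stay in step.
        return self.num_episodes // self.num_replicas

    def __iter__(self):
        # All ranks share the seed, so they generate the same episode list.
        rng = random.Random(self.seed + self.epoch)
        classes = sorted(self.by_class)
        per_class = self.shot + self.query
        total = len(self) * self.num_replicas
        for episode_id in range(total):
            episode = []
            for c in rng.sample(classes, self.way):
                episode.extend(rng.sample(self.by_class[c], per_class))
            # Each rank keeps only its own strided share of the list.
            if episode_id % self.num_replicas == self.rank:
                yield episode


labels = [i // 20 for i in range(100)]  # toy data: 5 classes, 20 samples each
samplers = [
    DistributedEpisodeSampler(labels, way=3, shot=1, query=2,
                              num_episodes=8, num_replicas=2, rank=r)
    for r in range(2)
]
episodes = [list(s) for s in samplers]  # one list of episodes per rank
```

Each yielded episode is a flat list of dataset indices, `shot + query` per sampled class, which a data loader could consume as one batch.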

Implemented Few-shot classification Results

Results of our implementations on few-shot classification datasets. We report the average accuracy over 2,000 randomly sampled episodes, repeated 5 times, for 1-shot and 5-shot evaluation, with 95% confidence intervals.

miniImageNet Dataset

Models	Backbone	5-way 1-shot	5-way 5-shot	pretrained models
Prototypical Networks	ResNet12	61.19+-0.40	76.50+-0.45	link
Cosine Classifier	ResNet12	63.89+-0.44	80.94+-0.05	link
Meta-Baseline	ResNet12	62.65+-0.65	79.10+-0.29	link
S2M2	WRN-28-10	58.85+-0.20	81.83+-0.15	link
S2M2+Logistic_Regression	WRN-28-10	62.36+-0.42	82.01+-0.24
MoCo-v2(unsupervised)	ResNet12	52.03+-0.33	72.94+-0.29	link
Exemplar-v2	ResNet12	59.02+-0.24	77.23+-0.16	link
PN+CL	ResNet12	63.44+-0.44	79.42+-0.06	link
COSOC	ResNet12	69.28+-0.49	85.16+-0.42	link
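
For readers unfamiliar with the "mean+-interval" notation in the table, the sketch below shows one common way to turn per-episode accuracies into a mean and a 95% confidence interval, using the normal approximation. How the repository combines its five repetitions is not stated here, so this is an assumption; the function name `mean_and_ci95` and the sample numbers are made up for illustration.

```python
import math
import statistics


def mean_and_ci95(accuracies):
    """Return (mean, half-width of the 95% confidence interval)."""
    n = len(accuracies)
    if n < 2:
        raise ValueError("need at least two episodes")
    mean = statistics.fmean(accuracies)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(accuracies) / math.sqrt(n)
    # 1.96 is the two-sided 95% quantile of the standard normal distribution.
    return mean, 1.96 * sem


episode_acc = [60.0, 64.0, 62.0, 58.0, 66.0]  # made-up per-episode accuracies (%)
mean, half_width = mean_and_ci95(episode_acc)
print(f"{mean:.2f}+-{half_width:.2f}")  # prints 62.00+-2.77
```

With thousands of episodes the interval shrinks roughly with the square root of the episode count, which is why few-shot results are averaged over many sampled tasks.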

General Guide

To understand the code correctly, it is highly recommended to first skim the pytorch-lightning documentation, especially the part on LightningCLI. It won't be a long journey, since pytorch-lightning is built on top of PyTorch.

Installation

Just run the command:

pip install -r requirements.txt

Running an implemented few-shot model

Downloading Datasets:
- miniImageNet, miniImageNet-new (in COSOC)
Training (Except for Meta-baseline and COSOC):
- Choose the corresponding configuration file in 'config' (e.g. set_config_PN.py for the PN model). Inside it, set the parameter 'is_test' to False, and set the GPU ids (multi-GPU or not), dataset directory, logging directory, and any other parameters you would like to change.
- Modify the first line in run.sh (e.g., python config/set_config_PN.py).
- To start training, run the command bash run.sh
Training Meta-baseline:
- This is a two-stage algorithm: the first stage is CE-loss pre-training, followed by ProtoNet fine-tuning, so two training runs are needed. The first run uses the configuration file config/set_config_meta_baseline_pretrain.py. The second uses config/set_config_meta_baseline_finetune.py, with the path of the pre-trained model from the first stage specified by the parameter pre_trained_path in the configuration file.
Training COSOC:
- For pre-training Exemplar, choose configuration file config/set_config_MoCo.py and set parameter is_exampler to True.
- To run the COS algorithm, run the command python COS.py --save_dir [save_dir] --pretrained_Exemplar_path [model_path] --dataset_path [data_path]. [save_dir] specifies the directory in which all foreground objects are saved; [model_path] and [data_path] specify the paths of the pre-trained model and the dataset, respectively.
- To run an FSL algorithm with COS, choose the configuration file config/set_config_COSOC.py, set the parameter data["train_dataset_params"] to the directory of the data saved by the COS algorithm, and set pre_trained_path to the directory of the pre-trained Exemplar.
Testing:
- Choose the same configuration file as for training, set the parameter is_test to True, set pre_trained_path to the directory of the checkpoint model (with suffix '.ckpt'), and set other parameters (e.g. shot, batchsize) as you desire.
- Modify the first line in run.sh (e.g., python config/set_config_PN.py).
- To start testing, run the command bash run.sh
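
In a two-stage scheme such as the Meta-baseline training described above, the classifier trained with the CE loss in stage one generally does not have the same parameters as a ProtoNet-style head in stage two, so one common approach is to carry over only the feature-extractor weights. The sketch below illustrates that idea with plain dictionaries standing in for a checkpoint's state_dict. The key prefixes `backbone.` and `classifier.` and the helper `keep_prefix` are assumptions for illustration, not the repository's actual loading code.

```python
def keep_prefix(state_dict, prefix):
    """Return the entries of state_dict whose key starts with prefix."""
    return {k: v for k, v in state_dict.items() if k.startswith(prefix)}


# Stand-in for a stage-one checkpoint; real values would be tensors.
pretrained = {
    "backbone.conv1.weight": [0.1, 0.2],
    "backbone.bn1.bias": [0.0],
    "classifier.weight": [0.3],
    "classifier.bias": [0.4],
}

# Drop the stage-one classifier and keep only the feature extractor.
backbone_only = keep_prefix(pretrained, "backbone.")
print(sorted(backbone_only))
```

With PyTorch, a filtered dictionary like this would typically be passed to `load_state_dict` with `strict=False`, so that head parameters absent from the checkpoint keep their fresh initialization.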

Creating a new few-shot algorithm

It is quite simple to implement your own algorithm. Most algorithms only need a new LightningModule and a classifier head. We give a brief description of the code structure here.

run.py

This file usually does not need to be modified. The file run.py wraps the whole training and testing procedure of an FSL algorithm, for which all configurations are specified by an individual yaml file contained in the /config folder; see config/set_config_PN.py for an example. The file run.py contains a Python class Few_Shot_CLI, inherited from LightningCLI. It adds new hyperparameters (also specified in the configuration file) as well as a testing process for FSL.

FewShotModule

Needs modification. The folder modules contains the LightningModules for FSL models, which specify the model components, optimizers, logging metrics and train/val/test processes. Notably, modules/base_module.py contains the template module for all FSL models. All other modules inherit from the base module; see modules/PN.py and modules/cosine_classifier.py for how episodic/non-episodic models inherit from it.

architectures

Needs modification. We divide general FSL architectures into a feature extractor and a classification head, specified in architectures/feature_extractor and architectures/classifier, respectively. These are ordinary PyTorch nn modules, which are embedded in the LightningModule mentioned above. The recommended feature extractor is ResNet12, which is popular and shows promising performance. The classification head, however, varies across algorithms and needs a specific design.
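
As an illustration of what a classification head does once features have been extracted, here is a minimal sketch of a Prototypical-Networks-style head: class prototypes are the mean support features, and a query is assigned to the nearest prototype. It uses plain Python lists instead of tensors, and the function names `prototypes` and `classify` are hypothetical; it is not the code in architectures/classifier/proto_head.py.

```python
def prototypes(support_features, support_labels):
    """Average the support feature vectors of each class."""
    sums, counts = {}, {}
    for feat, label in zip(support_features, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], feat)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def classify(query_feature, protos):
    """Assign the query to the class with the nearest prototype."""
    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda c: sq_dist(query_feature, protos[c]))


# A toy 2-way 2-shot episode with 2-D features.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = [0, 0, 1, 1]
protos = prototypes(support, labels)       # {0: [0.0, 1.0], 1: [5.0, 4.0]}
prediction = classify([1.0, 1.5], protos)  # 0, the closer prototype
```

A real head would operate on batched tensors and return distance-based logits for a loss function, but the split between computing prototypes and scoring queries stays the same.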

Datasets and DataModule

This usually does not need to be modified. Pytorch-lightning unifies data processing across training, validation and testing into a single LightningDataModule. We design such a datamodule for FSL in dataset_and_process/datamodules/few_shot_datamodule.py, enabling episodic/non-episodic sampling and DDP for fast multi-GPU training. The datasets themselves are defined in dataset_and_process/datasets as ordinary PyTorch dataset classes. There is no need to modify the dataset module unless new datasets are involved.

Callbacks and Plugins

These usually do not need to be modified. See the pytorch-lightning documentation for detailed introductions to callbacks and plugins. They are additional functionalities added to the system in a modular fashion.

Configuration

Needs modification. See LightningCLI for how a yaml configuration file works. Each algorithm needs its own configuration file, though most of the configurations are the same across algorithms. It is therefore convenient to copy an existing configuration and adapt it for a new algorithm.
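
The copy-and-adapt workflow can be sketched as a small helper that deep-copies a base configuration and merges in overrides. This is only an illustration: `derive_config` is a hypothetical helper, and the nesting of the dictionary is an assumption. Only the parameter names is_test, is_meta and pre_trained_path come from this guide; the repository's real configuration files may be organized differently.

```python
import copy


def derive_config(base, overrides):
    """Return a deep copy of base with overrides merged in recursively."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            result[key] = derive_config(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical base configuration (the "model" nesting is an assumption).
base = {
    "is_test": False,
    "model": {"is_meta": True, "pre_trained_path": None},
}

# Derive a testing configuration without touching the base one.
test_config = derive_config(
    base, {"is_test": True, "model": {"pre_trained_path": "path/to/model.ckpt"}}
)
```

Because the base dictionary is never mutated, the same starting point can be reused for several algorithms or for separate training and testing runs.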

Comments

Problem of reproducing Meta-baseline

Excuse me? It seems that I can't reproduce the performance of Meta-baseline that you report (62). I followed the instructions for the two-stage training and got an accuracy of 76 for the pre-trained model and 60 for the fine-tuned model (both on the val set). I also set the training shot to 5 as you mentioned before. It didn't work either.

opened by LIUZIJING-CHN 16
Effect of projection head

Hi authors, thanks for your wonderful work!

I have some questions regarding the projection head in MoCo/Exemplar. Have you experimented with the pre-trained MoCo/Exemplar with and without a nonlinear projection head? How did they perform? Also, in the code, you commented that Surprisingly, pure contrastive-pretrained model performs very well on Few-Shot Learning. Is there any possibility for you to remember the approximate performance gap, i.e., pure MoCo pre-trained v.s. Exemplar pre-trained?

opened by Jiawei-Yang 4
I can’t reproduce the performance of PN

I can’t reproduce the performance of PN. In your readme, the accuracy of PN reaches 61.09. However, I can only reproduce 59.26 on the test set, while I get 62.31 on the val set. I trained it on 3090*2 and used the same config as set_config_PN.py. Are there any details I haven't noticed?
good first issue

opened by skingorz 4
Problem when reproducing Meta-baseline

Excuse me, after I finished the pretraining of Meta-baseline, I got an error when loading the pretrained model in the finetune stage: RuntimeError: Error(s) in loading state_dict for ProtoNet: Missing key(s) in state_dict "classifier.scale_cls". Unexpected key(s) in state_dict: "classifier.weight", "classifier.bias". It seems that there may be a wrong setting in the file set_config_meta_baseline_finetune.py

opened by LIUZIJING-CHN 2
About a license of the repository

Thanks for sharing the repository. Can you claim a LICENSE of the code repository?

E.g. If I use this repo for a baseline of the machine learning competition (modify, share, commercial use), is it allowed?

opened by bilzard 2
Problems occurred when reimplementing COSOC
Hi, when I tried to reimplement COSOC, I ran into two problems:

1. Multi-GPU training: I followed the guidance "4. Training COSOC" and finished the training of Exemplar and the run of the COS algorithm. However, when I tried to use 2 TITAN V (12G) GPUs to run the FSL algorithm with COS, it caused the error "CUDA out of memory". More precisely, the error occurred as soon as validation started (training ran normally before validation began). I didn't modify any hyperparameters; the batchsize in this stage is still 128.

2. Training with a single GPU and smaller batchsize: Given problem 1, I also tried training on a single GPU with batchsize=32 (the maximum supportable batch size). But the validation results looked as if nothing had been learned (36/60 epochs):

I would be very thankful for your reply!
bug
opened by Taylorfire 2
About pretrained models

I followed the general guide and want to train and test COSOC on miniImageNet, but I am not certain about the pretrained model you provide in the table. Is it a pretrained model used for training an Exemplar, or is it the trained Exemplar itself (meaning it can be used for running the COS algorithm)?

opened by Taylorfire 2
About the performance without multi-cropping?

Hi, thanks for your nice work.

Since the results reported in the paper are based on multi-cropping, I would like to know if there are raw results that do not use multi-cropping at evaluation?
good first issue

opened by zhanyuanyang 2
The manual data in COSOC.

I'm very interested in your work published on NIPS about the influence of background on FSL. It's time-consuming to crop the image manually. Will you open source the related data in the future?
enhancement

opened by skingorz 2
Cannot find the dataset

Hi, I'm trying to re-implement the COSOC, but I don't know how to get the following datasets. One is "miniImageNet_contra" for pretraining Exemplar, and the other is the "miniImageNet_prob_crop" for training COSOC. Are these different from miniImageNet? Or is it okay to use miniImageNet?

I'd really appreciate it if you could answer!

opened by ssdred1250 1
How to train PN network to use different ways in training and testing

I tried to modify the dataset and base model in their init functions, but I couldn't get hyperparameters like val_ways and train_ways to work. How can I modify the code to implement this? I haven't used PyTorch Lightning before.

opened by tinywaferShark 3


Owner

Xu Luo

PyTorch implementation of D2C: Diffuison-Decoding Models for Few-shot Conditional Generation.

Pytorch Implementation for CVPR2018 Paper: Learning to Compare: Relation Network for Few-Shot Learning

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

True Few-Shot Learning with Language Models

Prototypical Networks for Few shot Learning in PyTorch

Pytorch implementation of the paper "Optimization as a Model for Few-Shot Learning"

mmfewshot is an open source few shot learning toolbox based on PyTorch

Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, arXiv 2021

Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch

Pytorch implementation of few-shot semantic image synthesis

(ICCV'21) Official PyTorch implementation of Relational Embedding for Few-Shot Classification

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

The Pytorch code of "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification", CVPR 2022 (Oral).

A general framework for deep learning experiments under PyTorch based on pytorch-lightning

Few-shot Learning of GPT-3

Library of various Few-Shot Learning frameworks for text classification

Few-Shot Graph Learning for Molecular Property Prediction