LightningFSL: Pytorch-Lightning implementations of Few-Shot Learning models.

Overview

LightningFSL: Few-Shot Learning with Pytorch-Lightning

LICENSE Python last commit

In this repo, a number of pytorch-lightning implementations of FSL algorithms are provided, including two official ones

Boosting Few-Shot Classification with View-Learnable Contrastive Learning (ICME 2021)

Rectifying the Shortcut Learning of Background for Few-Shot Learning (NeurIPS 2021)

Contents

  1. Advantages
  2. Few-shot classification Results
  3. General Guide

Advantages:

This repository is built on top of LightningCLI, which is very convenient to use after being familiar with this tool.

  1. Enabling multi-GPU training
    • Our implementation of FSL framework allows DistributedDataParallel (DDP) to be included in the training of Few-Shot Learning, which is not available before to the best of our knowledge. Previous researches use DataParallel (DP) instead, which is inefficient and requires more computation storages. We achieve this by modifying the DDP sampler of Pytorch, making it possible to sample few-shot learning tasks among devices. See dataset_and_process/samplers.py for details.
  2. High reimplemented accuracy
    • Our reimplementations of some FSL algorithms achieve strong performance. For example, our ResNet12 implementation of ProtoNet and Cosine Classifier achieves 76+ and 80+ accuracy on 5w5s task of miniImageNet, respectively. All results can be reimplemented using pre-defined configuration files in config/.
  3. Quick and convenient creation of new algorithms
    • Pytorch-lightning provides our codebase with a clean and modular structure. Built on top of LightningCLI, our codebase unifies necessary basic components of FSL, making it easy to implement a brand-new algorithm. An impletation of an algorithm usually only requires three short additional files, one specifying the lightningModule, one specifying the classifer head, and the last one specifying all configurations. For example, see the code of ProtoNet (modules/PN.py, architectures/classifier/proto_head.py) and cosine classifier (modules/cosine_classifier.py, architectures/classifier/CC_head.py.
  4. Easy reproducability
    • Every time of running results in a full copy of yaml configuration file in the logging directory, enabling exact reproducability (by using the direct yaml file instead of creating a new one).
  5. Enabling both episodic/non-episodic algorithms
    • Switching with a single parameter is_meta in the configuration file.

Implemented Few-shot classification Results

Implemented results on few-shot classification datasets. The average results of 2,000 randomly sampled episodes repeated 5 times for 1/5-shot evaluation with 95% confidence interval are reported.

miniImageNet Dataset

Models Backbone 5-way 1-shot 5-way 5-shot pretrained models
Protypical Networks ResNet12 61.19+-0.40 76.50+-0.45 link
Cosine Classifier ResNet12 63.89+-0.44 80.94+-0.05 link
Meta-Baseline ResNet12 62.65+-0.65 79.10+-0.29 link
S2M2 WRN-28-10 58.85+-0.20 81.83+-0.15 link
S2M2+Logistic_Regression WRN-28-10 62.36+-0.42 82.01+-0.24
MoCo-v2(unsupervised) ResNet12 52.03+-0.33 72.94+-0.29 link
Exemplar-v2 ResNet12 59.02+-0.24 77.23+-0.16 link
PN+CL ResNet12 63.44+-0.44 79.42+-0.06 link
COSOC ResNet12 69.28+0.49 85.16+-0.42 link

General Guide

To understand the code correctly, it is highly recommended to first quickly go through the pytorch-lightning documentation, especially LightningCLI. It won't be a long journey since pytorch-lightning is built on the top of pytorch.

Installation

Just run the command:

pip install -r requirements.txt

running an implemented few-shot model

  1. Downloading Datasets:
  2. Training (Except for Meta-baseline and COSOC):
    • Choose the corresponding configuration file in 'config'(e.g.set_config_PN.py for PN model), set inside the parameter 'is_test' to False, set GPU ids (multi-GPU or not), dataset directory, logging dir as well as other parameters you would like to change.
    • modify the first line in run.sh (e.g., python config/set_config_PN.py).
    • To begin the running, run the command bash run.sh
  3. Training Meta-baseline:
    • This is a two-stage algorithm, with the first stage being CEloss-pretraining, followed by ProtoNet finetuning. So a two-stage training is need. The first training uses the configuration file config/set_config_meta_baseline_pretrain.py. The second uses config/set_config_meta_baseline_finetune.py, with pre-training model path from the first stage, specified by the parameterpre_trained_path in the configuration file.
  4. Training COSOC:
    • For pre-training Exemplar, choose configuration file config/set_config_MoCo.py and set parameter is_exampler to True.
    • For runing COS algorithm, run the command python COS.py --save_dir [save_dir] --pretrained_Exemplar_path [model_path] --dataset_path [data_path]. [save_dir] specifies the saving directory of all foreground objects, [model_path] and [data_path] specify the pathes of pre-trained model and datasets, respectively.
    • For runing a FSL algorithm with COS, choose configuration file config/set_config_COSOC.py and set parameter data["train_dataset_params"] to the directory of saved data of COS algorithm, pre_trained_path to the directory of pre-trained Exemplar.
  5. Testing:
    • Choose the same configuration file as training, set parameter is_test to True, pre_trained_path to the directory of checkpoint model (with suffix '.ckpt'), and other parameters (e.g. shot, batchsize) as you disire.
    • modify the first line in run.sh (e.g., python config/set_config_PN.py).
    • To begin the testing, run the command bash run.sh

Creating a new few-shot algorithm

It is quite simple to implement your own algorithm. most of algorithms only need creation of a new LightningModule and a classifier head. We give a breif description of the code structure here.

run.py

It is usually not needed to modify this file. The file run.py wraps the whole training and testing procedure of a FSL algorithm, for which all configurations are specified by an individual yaml file contained in the /config folder; see config/set_config_PN.py for example. The file run.py contains a python class Few_Shot_CLI, inherited from LightningCLI. It adds new hyperpameters (Also specified in configuration file) as well as testing process for FSL.

FewShotModule

Need modification. The folder modules contains LightningModules for FSL models, specifying model components, optimizers, logging metrics and train/val/test processes. Notably, modules/base_module.py contains the template module for all FSL models. All other modules inherit the base module; see modules/PN.py and modules/cosine_classifier.py for how episodic/non-episodic models inherit from the base module.

architectures

Need modification. We divide general FSL architectures into feature extractor and classification head, specified respectively in architectures/feature_extractor and architectures/classifier. These are just common nn modules in pytorch, which shall be embedded in LightningModule mentioned above. The recommended feature extractor is ResNet12, which is popular and shows promising performance. The classification head, however, varies with algorithms and need specific designs.

Datasets and DataModule

It is usually not needed for modification. Pytorch-lightning unifies data processing across training, val and testing into a single LightningDataModule. We disign such a datamodule in dataset_and_process/datamodules/few_shot_datamodule.py for FSL, enabling episodic/non-episodic sampling and DDP for multi-GPU fast training. The definition of Dataset itself is in dataset_and_process/datasets, specified as common pytorch datasets class. There is no need to modify the dataset module unless new datasets are involved.

Callbacks and Plugins

It is usually not needed for modification. See documentation of pytorch-lightning for detailed introductions of callbacks and Plugins. They are additional functionalities added to the system in a modular fashion.

Configuration

Need modification. See LightningCLI for how a yaml configuration file works. For each algorithm, there needs one specific configuration file, though most of the configurations are the same across algorithms. Thus it is convenient to copy one configuration and change it for a new algorithm.

Comments
  • Problem of reproducing Meta-baseline

    Problem of reproducing Meta-baseline

    Excuse me? It seems that I can't reproduce the the performance of meta-baseline as you give at 62. I followed the instruction of the two-stage training, resulting accuracy of 76 for pre-trained model and 60 for fine-tuned model (both on val set). And I set the training shot to 5 as you mentioned before. It didn't work either.

    opened by LIUZIJING-CHN 16
  • Effect of projection head

    Effect of projection head

    Hi authors, thanks for your wonderful work!

    I have some questions regarding the projection head in MoCo/Exemplar. Have you experimented with the pre-trained MoCo/Exemplar with and without a nonlinear projection head? How did they perform? Also, in the code, you commented that Surprisingly, pure contrastive-pretrained model performs very well on Few-Shot Learning. Is there any possibility for you to remember the approximate performance gap, i.e., pure MoCo pre-trained v.s. Exemplar pre-trained?

    opened by Jiawei-Yang 4
  • I can’t reproduce the performance of PN

    I can’t reproduce the performance of PN

    I can’t reproduce the performance of PN. In your readme,the accuracy of PN can reach 61.09. However only 59.26 on test set I I can reproduce while 62.31 on val set. I train it on 3090*2 and use the same config as set_config_PN.py. Are there any detiles I haven't noticed?

    good first issue 
    opened by skingorz 4
  • Problem when reproducing Meta-baseline

    Problem when reproducing Meta-baseline

    Excuse me, After I finished the pretraining of meta-baseline, I got an error when loading the pretrained model in finetune stage. RuntimeError: Error(s) in loading state_dict for ProtoNet: Missing key(s) in state_dict "classifier.scale_cls". Unexpected key(s) in state_dict: "classifier.weight", "classifier.bias". It seems that you may have a wrong setting in the file set_config_meta_baseline_finetune.py

    opened by LIUZIJING-CHN 2
  • About a license of the repository

    About a license of the repository

    Thanks for sharing the repository. Can you claim a LICENSE of the code repository?

    E.g. If I use this repo for a baseline of the machine learning competition (modify, share, commercial use), is it allowed?

    opened by bilzard 2
  • Problems occured when reimplement  COSOC

    Problems occured when reimplement COSOC

    Hi, when i tried to reimplement COSOC, I was confronted with two problems:

    1. Multi-GPU training: I followed the guidance "4. Training COSOC" and finished the training of examplar and running of COS algorithm. However, when I tried to use 2 TITAN V(12G) GPUs to running FSL algorithm with COS, it would cause error "CUDA out of memory". More exactly, as long as validation went on, it would cause such a problem. (When training was on but validation hadn't start it went on normally.) I didn't modify any hyperparameters, the batchsize in this stage is still 128. 1 error-2

    2. Training with single GPU and smaller batchsize: Given the problem in 1, I also tried training on single GPU with batchsize=32(the max supportable bs). But the validation results seemed as if nothing had been learned (36/60 epochs): error-3

    It would be very thankful for your reply!

    bug 
    opened by Taylorfire 2
  • About pretrained models

    About pretrained models

    I follow the general guide and want to train and test COSOC on miniImageNet. But I am not certain about the pretrained model you provide in the chart. Is the pretrained model used for training a examplar or just the result of a trained examplar (means it can be used for running COS algorithm)?

    opened by Taylorfire 2
  • About the performance without multi-cropping?

    About the performance without multi-cropping?

    Hi, thanks for your nice work.

    Since the results reported in the paper are based on multi-cropping, I would like to know if there are raw results that do not use multi-cropping at evaluation?

    good first issue 
    opened by zhanyuanyang 2
  • The manual data in COSOC.

    The manual data in COSOC.

    I'm very interested in your work published on NIPS about the influence of background on FSL. It's time-consuming to crop the image manually. Will you open source the related data in the future?

    enhancement 
    opened by skingorz 2
  • Cannot find the dataset

    Cannot find the dataset

    Hi, I'm trying to re-implement the COSOC, but I don't know how to get the following datasets. One is "miniImageNet_contra" for pretraining Exemplar, and the other is the "miniImageNet_prob_crop" for training COSOC. Are these different from miniImageNet? Or is it okay to use miniImageNet?

    I'd really appreciate it if you could answer!

    opened by ssdred1250 1
  •  How to train PN network to use different ways in training and testing

    How to train PN network to use different ways in training and testing

    I try to modify the dataset and basemodel in their init function. But it's a pity that I can't use these hyperparameters like val_ways train_ways. How to modify the code to implement this function? I haven't used Lighting torch before.

    opened by tinywaferShark 3
Owner
Xu Luo
M.S. student of SMILE Lab, UESTC
Xu Luo
PyTorch implementation of D2C: Diffuison-Decoding Models for Few-shot Conditional Generation.

D2C: Diffuison-Decoding Models for Few-shot Conditional Generation Project | Paper PyTorch implementation of D2C: Diffuison-Decoding Models for Few-sh

Jiaming Song 90 Dec 27, 2022
Pytorch Implementation for CVPR2018 Paper: Learning to Compare: Relation Network for Few-Shot Learning

LearningToCompare Pytorch Implementation for Paper: Learning to Compare: Relation Network for Few-Shot Learning Howto download mini-imagenet and make

Jackie Loong 246 Dec 19, 2022
Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pytorch Lightning 1.4k Jan 1, 2023
True Few-Shot Learning with Language Models

This codebase supports using language models (LMs) for true few-shot learning: learning to perform a task using a limited number of examples from a single task distribution.

Ethan Perez 124 Jan 4, 2023
Prototypical Networks for Few shot Learning in PyTorch

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 835 Jan 8, 2023
Pytorch implementation of the paper "Optimization as a Model for Few-Shot Learning"

Optimization as a Model for Few-Shot Learning This repo provides a Pytorch implementation for the Optimization as a Model for Few-Shot Learning paper.

Albert Berenguel Centeno 238 Jan 4, 2023
mmfewshot is an open source few shot learning toolbox based on PyTorch

OpenMMLab FewShot Learning Toolbox and Benchmark

OpenMMLab 514 Dec 28, 2022
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Erik Linder-Norén 21.8k Jan 9, 2023
Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

Clova AI Research 97 Dec 23, 2022
Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, arXiv 2021

Hypercorrelation Squeeze for Few-Shot Segmentation This is the implementation of the paper "Hypercorrelation Squeeze for Few-Shot Segmentation" by Juh

Juhong Min 165 Dec 28, 2022
Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch

Cross Transformers - Pytorch (wip) Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch Install $ pip install cross-t

Phil Wang 40 Dec 22, 2022
Pytorch implementation of few-shot semantic image synthesis

Few-shot Semantic Image Synthesis Using StyleGAN Prior Our method can synthesize photorealistic images from dense or sparse semantic annotations using

null 40 Sep 26, 2022
(ICCV'21) Official PyTorch implementation of Relational Embedding for Few-Shot Classification

Relational Embedding for Few-Shot Classification (ICCV 2021) Dahyun Kang, Heeseung Kwon, Juhong Min, Minsu Cho [paper], [project hompage] We propose t

Dahyun Kang 82 Dec 24, 2022
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

?? Flamingo - Pytorch Implementation of Flamingo, state-of-the-art few-shot visual question answering attention net, in Pytorch. It will include the p

Phil Wang 630 Dec 28, 2022
The Pytorch code of "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification", CVPR 2022 (Oral).

DeepBDC for few-shot learning        Introduction In this repo, we provide the implementation of the following paper: "Joint Distribution Matters: Dee

FeiLong 116 Dec 19, 2022
A general framework for deep learning experiments under PyTorch based on pytorch-lightning

torchx Torchx is a general framework for deep learning experiments under PyTorch based on pytorch-lightning. TODO list gan-like training wrapper text

Yingtian Liu 6 Mar 17, 2022
Few-shot Learning of GPT-3

Few-shot Learning With Language Models This is a codebase to perform few-shot "in-context" learning using language models similar to the GPT-3 paper.

Tony Z. Zhao 224 Dec 28, 2022
Library of various Few-Shot Learning frameworks for text classification

FewShotText This repository contains code for the paper A Neural Few-Shot Text Classification Reality Check Environment setup # Create environment pyt

Thomas Dopierre 47 Jan 3, 2023
Few-Shot Graph Learning for Molecular Property Prediction

Few-shot Graph Learning for Molecular Property Prediction Introduction This is the source code and dataset for the following paper: Few-shot Graph Lea

Zhichun Guo 94 Dec 12, 2022