Self-training for Few-shot Transfer Across Extreme Task Differences

Cheng Perng Phoo

Last update: Oct 31, 2022

Self-training for Few-shot Transfer Across Extreme Task Differences (STARTUP)

Introduction

This repo contains the official implementation of the following ICLR 2021 paper:

Title: Self-training for Few-shot Transfer Across Extreme Task Differences
Authors: Cheng Perng Phoo, Bharath Hariharan
Institution: Cornell University
Arxiv: https://arxiv.org/abs/2010.07734
Abstract:
Most few-shot learning techniques are pre-trained on a large, labeled "base dataset". In problem domains where such large labeled datasets are not available for pre-training (e.g., X-ray, satellite images), one must resort to pre-training in a different "source" problem domain (e.g., ImageNet), which can be very different from the desired target task. Traditional few-shot and transfer learning techniques fail in the presence of such extreme differences between the source and target tasks. In this paper, we present a simple and effective solution to tackle this extreme domain gap: self-training a source domain representation on unlabeled data from the target domain. We show that this improves one-shot performance on the target domain by 2.9 points on average on the challenging BSCD-FSL benchmark consisting of datasets from multiple domains.

Requirements

This codebase is tested with:

PyTorch 1.7.1
Torchvision 0.8.2
NumPy
Pandas
wandb (used for logging; see https://wandb.ai/)

Running Experiments

Step 0: Dataset Preparation

MiniImageNet and CD-FSL: Download the datasets for the CD-FSL benchmark by following steps 1 and 2 here: https://github.com/IBM/cdfsl-benchmark
tieredImageNet: Prepare the tieredImageNet dataset following https://github.com/mileyan/simple_shot. Note that after running the preparation script, you will need to split the saved images into three folders: train, val, and test.
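
The folder split for tieredImageNet mentioned above could be scripted along the following lines. This is only a sketch under stated assumptions: the helper name split_into_folders is hypothetical, and it assumes you have one CSV file per split with a header row and filename,label columns. The actual output of the simple_shot preparation script may be organized differently, so adjust the column names and paths to what you have on disk.

```python
import csv
import shutil
from pathlib import Path


def split_into_folders(image_root, split_csvs, out_root):
    """Copy images into out_root/<split>/<label>/ based on per-split CSV files.

    Assumption (not taken from the repo): every CSV has a header row with
    the columns `filename` and `label`, and `filename` is relative to
    `image_root`.
    """
    counts = {}
    for split, csv_path in split_csvs.items():
        copied = 0
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                src = Path(image_root) / row["filename"]
                dst_dir = Path(out_root) / split / row["label"]
                dst_dir.mkdir(parents=True, exist_ok=True)
                # copy2 keeps file metadata; the source images are left in place
                shutil.copy2(src, dst_dir / src.name)
                copied += 1
        counts[split] = copied
    return counts


# Example call (all paths are placeholders):
# split_into_folders(
#     "tiered_images/",
#     {"train": "train.csv", "val": "val.csv", "test": "test.csv"},
#     "tieredImageNet/",
# )
```

The function copies rather than moves files, so it can be rerun safely if a split file turns out to be wrong.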

Step 1: Teacher Training on the Base Dataset

We provide scripts to produce teachers for different base datasets. The steps are the same for every base dataset:

Go into the directory teacher_miniImageNet/ (teacher_ImageNet/ for ImageNet).
Resolve the TODO: items in run.sh and configs.py (if applicable).
Run bash run.sh to produce the teachers.

Note that for miniImageNet and tieredImageNet, the training script is adapted from the official script provided by the CD-FSL benchmark. For ImageNet, we simply download the pre-trained models from PyTorch and convert them to the relevant format.

Step 2: Student Training

To train the STARTUP representation, follow these steps:

Go into the directory student_STARTUP/ (student_STARTUP_no_self_supervision/ for the version without SimCLR).
Resolve the TODO: items in run.sh and configs.py.
Run bash run.sh to produce the student/STARTUP representation.
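
For intuition, the self-training idea behind the student can be sketched as a distillation term: the student is trained to match the teacher's soft predictions on unlabeled target-domain images. The snippet below is a minimal illustration, not the code in student_STARTUP/. The function name self_training_loss is hypothetical, and the choice of a KL divergence on softmax outputs is an assumption made for this sketch; the full training objective in the repo may combine further terms (for example, the SimCLR loss mentioned above).

```python
import torch
import torch.nn.functional as F


def self_training_loss(student_logits, teacher_logits):
    """KL(teacher || student) on a batch of unlabeled images, averaged per sample.

    Both tensors have shape (batch_size, num_base_classes).
    """
    # The teacher provides fixed soft pseudo-labels, so no gradient flows into it.
    teacher_probs = F.softmax(teacher_logits.detach(), dim=1)
    student_log_probs = F.log_softmax(student_logits, dim=1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.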

Step 3: Evaluation

To evaluate different representations, go into evaluation/, resolve the TODO: items in run.sh and configs.py, and run bash run.sh.

Notes

When producing the results for the submitted paper, we did not set torch.backends.cudnn.deterministic and torch.backends.cudnn.benchmark properly, which caused non-deterministic behavior. We have rerun our experiments, and the updated numbers can be found here: https://docs.google.com/spreadsheets/d/1O1e9xdI1SxVvRWK9VVxcO8yefZhePAHGikypWfhRv8c/edit?usp=sharing. Although some of the numbers have changed, the conclusions of the paper remain unchanged: STARTUP outperforms all the baselines, bringing tremendous improvements to cross-domain few-shot learning.
All training was done on an Nvidia Titan RTX GPU. Evaluation of the different representations was performed on an Nvidia RTX 2080 Ti. In both cases, CUDA 11 was used.
This repo is built upon the official CD-FSL benchmark repo: https://github.com/IBM/cdfsl-benchmark/tree/9c6a42f4bb3d2638bb85d3e9df3d46e78107bc53. We thank the creators of the CD-FSL benchmark for releasing code to the public.
If you find this codebase or STARTUP useful, please consider citing our paper:

@inproceedings{phoo2021STARTUP,
    title={Self-training for Few-shot Transfer Across Extreme Task Differences},
    author={Phoo, Cheng Perng and Hariharan, Bharath},
    booktitle={Proceedings of the International Conference on Learning Representations},
    year={2021}
}
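
The cuDNN flags mentioned in the first note above are typically set together with the random seeds before any training starts. A minimal sketch follows; the helper name seed_everything and the seed value are illustrative and not taken from this repo. Even with these settings, some GPU operations may remain non-deterministic.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int) -> None:
    """Seed the common RNGs and ask cuDNN for deterministic behavior."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Safe to call on CPU-only machines; it is ignored when CUDA is unavailable.
    torch.cuda.manual_seed_all(seed)
    # Use deterministic cuDNN kernels and disable the auto-tuner, which may
    # otherwise pick different algorithms from run to run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Example: call once at the top of a training script.
# seed_everything(1)
```

Disabling the cuDNN auto-tuner can slow training down, which is the usual price for repeatable numbers.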


Comments

Public Pretrained models

Hi, thanks for your interesting work and transparency. Although you have provided scripts to reproduce all models, could you also make the pretrained weights public? It would be much appreciated. Thank you.

opened by lehduong 1

csv file of unlabeled data

Hi, when I run your code, I get the error below.

FileNotFoundError: [Errno 2] File datasets/split_seed_1/ISIC_unlabeled_20.csv does not exist: 'datasets/split_seed_1/ISIC_unlabeled_20.csv'

I think the CSV files for the 20% unlabeled data of the target domains are missing. Could you upload them?

opened by MyeongJin-Kim 1

AssertionError: can only test a child process

Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7f6d5cf914c0>
......
assert self._parent_pid == os.getpid(), 'can only test a child process'
AssertionError: can only test a child process

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████▉| 999/1000 [6:35:47<00:21, 21.14s/it

An AssertionError occurred. I would like to know whether this error affects the subsequent training process.

opened by fikry102 2
