Meta-Learning for Semi-Supervised Few-Shot Classification

Overview

few-shot-ssl-public

Code for the paper Meta-Learning for Semi-Supervised Few-Shot Classification. [arxiv]

Dependencies

  • cv2
  • numpy
  • pandas
  • python 2.7 / 3.5+
  • tensorflow 1.3+
  • tqdm

Our code is tested on Ubuntu 14.04 and 16.04.

Setup

First, designate a folder to be your data root:

export DATA_ROOT={DATA_ROOT}

Then, set up the datasets following the instructions in the subsections.

Omniglot

[Google Drive] (9.3 MB)

# Download and place "omniglot.tar.gz" in "$DATA_ROOT/omniglot".
mkdir -p $DATA_ROOT/omniglot
cd $DATA_ROOT/omniglot
mv ~/Downloads/omniglot.tar.gz .
tar -xzvf omniglot.tar.gz
rm -f omniglot.tar.gz

miniImageNet

[Google Drive] (1.1 GB)

Update: Python 2 and 3 compatible version: [train] [val] [test]

# Download and place "mini-imagenet.tar.gz" in "$DATA_ROOT/mini-imagenet".
mkdir -p $DATA_ROOT/mini-imagenet
cd $DATA_ROOT/mini-imagenet
mv ~/Downloads/mini-imagenet.tar.gz .
tar -xzvf mini-imagenet.tar.gz
rm -f mini-imagenet.tar.gz

tieredImageNet

[Google Drive] (12.9 GB)

# Download and place "tiered-imagenet.tar" in "$DATA_ROOT/tiered-imagenet".
mkdir -p $DATA_ROOT/tiered-imagenet
cd $DATA_ROOT/tiered-imagenet
mv ~/Downloads/tiered-imagenet.tar .
tar -xvf tiered-imagenet.tar
rm -f tiered-imagenet.tar

Note: Please make sure that the following hardware requirements are met before running tieredImageNet experiments.

  • Disk: 30 GB
  • RAM: 32 GB

Core Experiments

Please run the following scripts to reproduce the core experiments.

# Clone the repository.
git clone https://github.com/renmengye/few-shot-ssl-public.git
cd few-shot-ssl-public

# To train a model.
python run_exp.py --data_root $DATA_ROOT             \
                  --dataset {DATASET}                \
                  --label_ratio {LABEL_RATIO}        \
                  --model {MODEL}                    \
                  --results {SAVE_CKPT_FOLDER}       \
                  [--disable_distractor]

# To test a model.
python run_exp.py --data_root $DATA_ROOT             \
                  --dataset {DATASET}                \
                  --label_ratio {LABEL_RATIO}        \
                  --model {MODEL}                    \
                  --results {SAVE_CKPT_FOLDER}       \
                  --eval --pretrain {MODEL_ID}       \
                  [--num_unlabel {NUM_UNLABEL}]      \
                  [--num_test {NUM_TEST}]            \
                  [--disable_distractor]             \
                  [--use_test]
  • Possible {MODEL} options are basic, kmeans-refine, kmeans-refine-radius, and kmeans-refine-mask.
  • Possible {DATASET} options are omniglot, mini-imagenet, and tiered-imagenet.
  • Use {LABEL_RATIO} 0.1 for omniglot and tiered-imagenet, and 0.4 for mini-imagenet.
  • Replace {MODEL_ID} with the model ID obtained from the training program.
  • Replace {SAVE_CKPT_FOLDER} with the folder where you save your checkpoints.
  • Add additional flags --num_unlabel 20 --num_test 20 for testing mini-imagenet and tiered-imagenet models, so that each episode contains 20 unlabeled images per class and 20 query images per class.
  • Add an additional flag --disable_distractor to remove all distractor classes in the unlabeled images.
  • Add an additional flag --use_test to evaluate on the test set instead of the validation set.
  • For more command-line details, see run_exp.py. A fully substituted example follows.
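
For example, a kmeans-refine run on mini-imagenet assembled from the flags above might look like the following. The results folder here is an arbitrary illustrative choice, and {MODEL_ID} is still the ID printed by the training run.

# Example: train, then evaluate with 20 unlabeled and 20 query images per class.
python run_exp.py --data_root $DATA_ROOT             \
                  --dataset mini-imagenet            \
                  --label_ratio 0.4                  \
                  --model kmeans-refine              \
                  --results ./results

python run_exp.py --data_root $DATA_ROOT             \
                  --dataset mini-imagenet            \
                  --label_ratio 0.4                  \
                  --model kmeans-refine              \
                  --results ./results                \
                  --eval --pretrain {MODEL_ID}       \
                  --num_unlabel 20 --num_test 20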

Simple Baselines for Few-Shot Classification

Please run the following script to reproduce a suite of baseline results.

python run_baseline_exp.py --data_root $DATA_ROOT    \
                           --dataset {DATASET}
  • Possible {DATASET} options are omniglot, mini-imagenet, and tiered-imagenet.

Run over Multiple Random Splits

Please run the following script to reproduce results over 10 random labeled/unlabeled splits and to test the model with different numbers of unlabeled items per episode. The default seeds are 0, 1001, ..., 9009.

python run_multi_exp.py --data_root $DATA_ROOT       \
                        --dataset {DATASET}          \
                        --label_ratio {LABEL_RATIO}  \
                        --model {MODEL}              \
                        [--disable_distractor]       \
                        [--use_test]
  • Possible {MODEL} options are basic, kmeans-refine, kmeans-refine-radius, and kmeans-refine-mask.
  • Possible {DATASET} options are omniglot, mini-imagenet, and tiered-imagenet.
  • Use {LABEL_RATIO} 0.1 for omniglot and tiered-imagenet, and 0.4 for mini-imagenet.
  • Add an additional flag --disable_distractor to remove all distractor classes in the unlabeled images.
  • Add an additional flag --use_test to evaluate on the test set instead of the validation set.

Citation

If you use our code, please consider citing the following:

  • Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua B. Tenenbaum, Hugo Larochelle and Richard S. Zemel. Meta-Learning for Semi-Supervised Few-Shot Classification. In Proceedings of the 6th International Conference on Learning Representations (ICLR), 2018.
@inproceedings{ren18fewshotssl,
  author   = {Mengye Ren and 
              Eleni Triantafillou and 
              Sachin Ravi and 
              Jake Snell and 
              Kevin Swersky and 
              Joshua B. Tenenbaum and 
              Hugo Larochelle and 
              Richard S. Zemel},
  title    = {Meta-Learning for Semi-Supervised Few-Shot Classification},
  booktitle= {Proceedings of 6th International Conference on Learning Representations {ICLR}},
  year     = {2018},
}
Comments
  • How to interpret the pkl files?


    I read the files with the following codes:

    import pickle as pkl
    
    with open("val_images_png.pkl", "rb") as f:
        data = pkl.load(f)
    

    Then I found that data is a list of (n, 1) arrays. For example,

    print(data[0].shape) # (18068, 1)
    

    I think each array corresponds to an image. However, 18068 is not even divisible by 3, so it cannot be a raw RGB image.

    How could I convert each array to an image with shape (H, W, 3)? Thanks.
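
    A plausible reading, given the _png suffix in the file name: each (n, 1) uint8 array holds a PNG-encoded byte buffer rather than raw pixels, which would also explain why its length is not divisible by 3. A minimal sketch under that assumption, using the cv2 dependency listed above:

    import pickle as pkl

    import cv2
    import numpy as np

    with open("val_images_png.pkl", "rb") as f:
        data = pkl.load(f)

    # Assumed format: each (n, 1) uint8 array is a PNG byte stream.
    buf = data[0].reshape(-1).astype(np.uint8)  # flatten to a 1-D buffer
    img = cv2.imdecode(buf, cv2.IMREAD_COLOR)   # decode to (H, W, 3), BGR order
    print(img.shape)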

    opened by purboo 4
  • Super-class Labels of tieredImageNet


    The general labels in each split seem to have an extra, corrupted(?) value that many data points take on. For instance in the training set there are 20 labels {0, ..., 19} but there is a label value 20 used by 341546 data points.

    Should all of these data points be excluded, or is there a way to generate correct labels for these?

    Apologies if I am confused in my understanding of your dataset. Thank you for working on a hierarchical few-shot dataset.

    Here is some illustrative code from my exploration of the data:

    >>> train['label_general_str']
    ['garment',
     'musical instrument, instrument',
     'restraint, constraint',
     'feline, felid',
     'instrument',
     'hound, hound dog',
     'electronic equipment',
     'passerine, passeriform bird',
     'ungulate, hoofed mammal',
     'aquatic bird',
     'snake, serpent, ophidian',
     'primate',
     'protective covering, protective cover, protect',
     'terrier',
     'saurian',
     'building, edifice',
     'establishment',
     'tool',
     'craft',
     'game equipment']
    >>> len(train['label_general_str'])
    20
    >>> train['label_general'].max()  # should be 19
    20
    >>> uniq, count = np.unique(train['label_general'], return_counts=True)
    >>> count
    array([  1300,   1300,   1216,   1300,   1300,   1300,   1300,   1300,
             1300,   2600,   1300,   2449,   2600,  11700,   2590,  10258,
            13587,  13000,  24158,  11291, 341546])  # many invalid points with last value
    
    opened by shelhamer 3
  • Data Format in tiered imagenet


    Hi, thank you for sharing the code and data. I have a question about the tieredImageNet data. When I loaded the .pkl file, I found the data is a list of numpy arrays, but each array does not have shape (img_size * img_size * 3) and cannot be reshaped into that form. Each numpy array should be an image, right? Also, the array sizes vary, from around 8,000 to around 19,000.

    opened by ylfzr 3
  • A doubt regarding the test data.


    Hey @renmengye, I have a doubt regarding the paper. Suppose we trained the model with 5 classes in each episode, and I have a total of 40 classes. After training, I have test data where I have to classify each image into one of the 40 categories. How can I do that? The most logical way is to compute a normalized probability for each class and then assign each image to the class with the highest probability, but this does not seem quite right, since the model was trained with only 5 classes per episode. Kindly help me here. Regards
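
    One common approach with prototype-based models like those in this repo (a sketch of the general idea, not the authors' prescribed procedure): episodic 5-way training does not limit the classifier at test time, since classification reduces to finding the nearest class prototype in the learned embedding space. You can therefore build prototypes for all 40 classes from their labeled examples and assign each test image to the closest one. A minimal numpy sketch, where the embeddings are assumed to come from the trained encoder:

    import numpy as np

    def nearest_prototype(support_emb, support_lbl, query_emb):
        # support_emb: (N, D) embeddings of labeled images (hypothetical output
        # of the trained encoder); support_lbl: (N,) integer labels in [0, C);
        # query_emb: (Q, D) embeddings of the test images to classify.
        classes = np.unique(support_lbl)
        protos = np.stack([support_emb[support_lbl == c].mean(axis=0)
                           for c in classes])  # (C, D) per-class mean embeddings
        # Squared Euclidean distance from every query to every prototype: (Q, C).
        d2 = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
        return classes[d2.argmin(axis=1)]  # label of the nearest prototype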

    opened by Hsankesara 2
  • Unpickling is not successful


    Hi,

    Thank you for the paper and for sharing the code with the community.

    When running 'run_exp.py' on mini-imagenet, I get the following error and have not been able to resolve it: No such file or directory: 'data/mini-imagenet/images/n0153282900000005.jpg'

    I have downloaded and placed the .pkl files as instructed, so this error suggests that unpickling did not complete successfully. I would appreciate any help.

    opened by nlrahimi 0
  • Question about the number of unlabelled samples.


    I am trying to reproduce your SSL experiments and am a bit confused about the exact number of unlabelled images.

    In your paper you write:

    We used M = 5 for training and M = 20 for testing in most cases, thus measuring the ability of the models to generalize to a larger unlabeled set size.

    In your README.md under 'Core Experiments' it says:

    Add additional flags --num_unlabel 20 --num_test 20 for testing mini-imagenet and tiered-imagenet models, so that each episode contains 20 unlabeled images per class and 20 query images per class.

    Could you please specify which experiments were carried out with M=5 and which with M=20? Moreover, for which experiments is there a difference in M between meta-train and meta-test time?

    opened by carmete 0
  • Number of distractors


    How to specify the number of distractors? I tried the following modification at run_exp.py line 329:

    meta_train_dataset = get_dataset(
        FLAGS.dataset, train_split_name, nclasses_train, nshot,
        num_distractor=NUMBER_OF_DISTRACTORS,

    but the batch size of the unlabelled data became nclasses_train*num_unlabel + num_unlabel*num_distractor. According to your paper, the unlabelled batch size should be nclasses_train*(num_unlabel + num_distractor).

    Any suggestion?

    opened by YusukeO 0
  • ImportError: cannot import name OmniglotEpisode


     File "/home/dd/few-shot-ssl-public/run_baselines_exp.py", line 57, in <module>
        from fewshot.data.omniglot import OmniglotEpisode
    ImportError: cannot import name OmniglotEpisode
    
    opened by dongzhuoyao 0
  • How can I use "DATASET_REGISTRY" and "CONFIG_REGISTRY"?

    How can I use "DATASET_REGISTRY" and "CONFIG_REGISTRY" in the code for this paper "META-LEARNING FOR SEMI-SUPERVISED FEW-SHOT CLASSIFICATION"?

    opened by bbcenglish 0
  • Cannot reshape array of size 0 into shape (0,newaxis)


    When I run the baseline file, it shows an error: cannot reshape array of size 0 into shape (0,newaxis), at File "/home/xxx/few-shot/run_baselines_exp.py", line 413, in get_nn_fit: x_test_ = x_test.reshape([x_test[ii].shape[0], -1]). Are the flags I set wrong? My settings are as follows:

    flags.DEFINE_integer("nclasses_eval", default=5, help="Number of classes for testing")
    flags.DEFINE_integer("nclasses_train", default=5, help="Number of classes for training")
    flags.DEFINE_integer("nshot", default=1, help="1 nshot")
    flags.DEFINE_integer("num_eval_episode", default=600, help="Number of evaluation episodes")
    flags.DEFINE_integer("num_test", default=-1, help="-1 Number of test images per episode")
    flags.DEFINE_integer("num_unlabel", default=5, help="5 Number of unlabeled for training")
    flags.DEFINE_integer("seed", default=0, help="Random seed")

    opened by 1585231086 0
  • Question for Omniglot dataset setting


    Hi, I have a question for Omniglot dataset setting. In the paper, for test episode,

    We used M = 5 for training and M = 20 for testing in most cases, thus measuring the ability of the models to generalize to a larger unlabeled set size.

    But in the Omniglot dataset there are only 20 samples per class, so I think M (the number of unlabeled data in the support set for one class) should be less than 20. How did you set M for the experiments on the Omniglot dataset?

    opened by kimwj94 0
  • What does m_dist_1 += tf.to_float(tf.equal(m_dist_1, 0.0)) mean?


    In clustering, I don't understand this code:

    # Run clustering.
    for tt in range(num_cluster_steps):
          protos_1 = tf.expand_dims(protos, 2)
          protos_2 = tf.expand_dims(h_unlabel, 1)
          pair_dist = tf.reduce_sum((protos_1 - protos_2)**2, [3])  # [B, K, N]
          m_dist = tf.reduce_mean(pair_dist, [2])  # [B, K]
          m_dist_1 = tf.expand_dims(m_dist, 1)  # [B, 1, K]
          m_dist_1 += tf.to_float(tf.equal(m_dist_1, 0.0))
    

    Does m_dist_1 += tf.to_float(tf.equal(m_dist_1, 0.0)) mean that if the distance from the cluster center is 0, then 1 is added? But why add 1?
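
    For what the line itself computes: tf.equal(m_dist_1, 0.0) converts to 1.0 exactly where an entry is zero and 0.0 elsewhere, so the statement replaces zero mean distances with 1 and leaves every other entry unchanged. A tiny numpy equivalent is below; the usual motive for this pattern is to make the tensor safe to use as a denominator later, though that is an assumption here since the follow-up code is not shown.

    import numpy as np

    m_dist_1 = np.array([[0.0, 2.5, 0.0, 4.0]])
    # Same effect as: m_dist_1 += tf.to_float(tf.equal(m_dist_1, 0.0))
    m_dist_1 += (m_dist_1 == 0.0).astype(np.float32)
    print(m_dist_1)  # [[1.  2.5 1.  4. ]] -- zeros became ones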

    opened by jinghanSunn 1
  • Which split files should I use to generate data exactly the same as the npy dataset to tiered?


    In this repository, quite a few csv files are provided. Which three files are the original splits proposed in the paper that should be used for reporting performance in new papers?

    opened by smallbox120 1