Meta Pseudo Labels

Overview

This is an unofficial PyTorch implementation of Meta Pseudo Labels. The official TensorFlow implementation is here.

Results

|                          | CIFAR-10-4K  | SVHN-1K      | ImageNet-10% |
|--------------------------|--------------|--------------|--------------|
| Paper (w/ finetune)      | 96.11 ± 0.07 | 98.01 ± 0.07 | 73.89        |
| This code (w/o finetune) | 94.46        | -            | -            |
| This code (w/ finetune)  | WIP          | -            | -            |
| Acc. curve               | link         | -            | -            |

Usage

Train the model with 4,000 labeled examples from the CIFAR-10 dataset:

python main.py --seed 5 --name cifar10-4K.5 --dataset cifar10 --num-classes 10 --num-labeled 4000 --expand-labels --total-steps 300000 --eval-step 1000 --randaug 2 16 --batch-size 128 --lr 0.05 --weight-decay 5e-4 --ema 0.995 --nesterov --mu 7 --label-smoothing 0.15 --temperature 0.7 --threshold 0.6 --lambda-u 8 --warmup-steps 5000 --uda-steps 5000 --amp

Train the model with 10,000 labeled examples from the CIFAR-100 dataset using DistributedDataParallel:

python -m torch.distributed.launch --nproc_per_node 4 main.py --seed 5 --name cifar100-10K.5 --dataset cifar100 --num-classes 100 --num-labeled 10000 --expand-labels --total-steps 300000 --eval-step 1000 --randaug 2 16 --batch-size 32 --lr 0.05 --weight-decay 5e-4 --ema 0.995 --nesterov --mu 7 --label-smoothing 0.15 --temperature 0.7 --threshold 0.6 --lambda-u 8 --warmup-steps 5000 --uda-steps 5000 --amp

Monitoring training progress

tensorboard --logdir results

Requirements

  • python 3.6+
  • torch 1.7+
  • torchvision 0.8+
  • tensorboard
  • numpy
  • tqdm

Comments
  • Encounter with error in DDP

    Nice work! But I've run into a strange issue when using multiple GPUs. The same code works well on a single GPU, but when I run it on multiple GPUs using DDP, I get the following error:

    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [128]] is at version 4; expected version 3 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

    Does anybody else meet the same issue? Hoping for your reply, thanks!

    opened by Yuuuuuuuuuuuuuuuuuummy 4
  • Teacher unsupervised loss mask multiplied outside sum

    First, I have to say this is such great work, and it has been a great help in understanding the MPL code for someone not coming from a TF background.

    https://github.com/kekmodel/MPL-pytorch/blob/c66b4d3c45bc15c4be915b82eda50044f8cb3c04/main.py#L180-L182

    I would not say I am confident that this is an issue, but it is an inconsistency I would like to bring to your attention in case it is. In the original code, they multiply the mask inside the sum, as shown below, but yours multiplies it outside. Does this change the values from what they would be if multiplied inside, or does it not matter? (A small equivalence check follows the screenshot.)

    [screenshot: the original TF code, which multiplies the mask inside the sum]
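
    For what it's worth: since the mask is a per-example scalar and the inner sum runs over classes, multiplying the mask inside or outside that sum is algebraically identical by distributivity. A minimal check (made-up tensors, not the repo's variables):

    import torch

    per_class = torch.randn(4, 10)           # per-example, per-class loss terms
    mask = (torch.rand(4) > 0.5).float()     # per-example confidence mask

    inside = (per_class * mask.unsqueeze(1)).sum(dim=-1)   # mask inside the class sum
    outside = per_class.sum(dim=-1) * mask                 # mask outside the class sum
    print(torch.allclose(inside, outside))                 # True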

    opened by dagleaves 4
  • Student should set to eval() before calculating s_logits_l ?

    If you don't do this, the model's output will be stochastic because of dropout etc. Then your s_loss_l_new and s_loss_l_old will be noisy too, and your dot_product won't give the right feedback. Maybe this issue affects your results.
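
    A minimal sketch of the suggestion (illustrative only; the variable names are assumed to match the training loop): switch the student to eval mode for the labeled-loss measurement used in the feedback signal, so dropout does not add noise, then switch back:

    student_model.eval()                      # disable dropout for a deterministic read
    with torch.no_grad():
        s_logits_l = student_model(images_l)
        s_loss_l_old = F.cross_entropy(s_logits_l, targets)
    student_model.train()                     # restore training mode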

    opened by Vurkty 3
  • dumb theoretical questions

    I have not read the paper in its entirety, so forgive me, but is there a way that information seeps from the unlabeled dataset into the labeled dataset? And if it doesn't, can we use any dataset in place of the unlabeled dataset, or does it have to look 'similar' to the labeled dataset? Thank you.

    opened by hlsfin 2
  • Training set should shuffle?

    Here, when loading the data, it seems we should shuffle it.

    def cifar10_train(params, batch_size=None):
      """Load CIFAR-10 data."""
      shuffle_size = batch_size * 16
    
      filenames = [os.path.join(CIFAR_PATH, 'train.bin')]
      record_bytes = 1 + (3 * 32 * 32)
      dataset = tf.data.FixedLengthRecordDataset(filenames, record_bytes)
      dataset = dataset.map(
          lambda x: _cifar10_parser(params, x, training=True),
          num_parallel_calls=tf.data.experimental.AUTOTUNE)
      dataset = dataset.shuffle(shuffle_size).repeat()
      dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)
      dataset = dataset.batch(batch_size=batch_size, drop_remainder=True)
      dataset = _optimize_dataset(dataset)
    
      return dataset
    

    In your code, you didn't shuffle the dataset when loading it,

    labeled_loader = DataLoader(
            labeled_dataset,
            sampler=train_sampler(labeled_dataset),
            batch_size=args.batch_size,
            num_workers=args.workers,
            drop_last=True)  # the default of `shuffle` is False
    

    I don't really know if I am quoting the wrong TF code block for loading data, since I don't know much about TF 1.x, and I know you've shuffled the labeled dataset in x_u_split(), so I'm just posting my question here; it may not be an issue.

    I think it makes sense to set shuffle to True for at least one of labeled_loader and unlabeled_loader, so we can guarantee that no two epochs see the same series of batches; a sketch of the change follows.
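
    A minimal sketch of the suggested change (an assumption about intent, for single-process runs; note that DataLoader forbids passing both a sampler and shuffle=True):

    from torch.utils.data import DataLoader

    labeled_loader = DataLoader(
        labeled_dataset,
        shuffle=True,              # or sampler=RandomSampler(labeled_dataset)
        batch_size=args.batch_size,
        num_workers=args.workers,
        drop_last=True)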

    opened by ifsheldon 2
  • How to train your code?? (ImportError: cannot import name 'amp' from 'torch.cuda')

    Hello Sir,

    I tried to train your code, but I ran into an error.

    My start command is as follows.

    python main.py  \
            --seed 5 \
            --name cifar10-4K.5 \
            --dataset cifar10 \
            --num-classes 10 \
            --num-labeled 4000 \
            --expand-labels \
            --total-steps 300000 \
            --eval-step 1000 \
            --randaug 2 16 \
            --batch-size 128 \
            --lr 0.05 \
            --weight-decay 5e-4 \
            --dense-dropout 0.2 \
            --ema 0.995 \
            --nesterov \
            --mu 7 \
            --label-smoothing 0.15 \
            --temperature 0.7 \
            --threshold 0.6 \
            --lambda-u 8 \
            --warmup-steps 5000 \
            --uda-steps 5000 \
            --student-wait-steps 3000 \
            --amp
    
    Traceback (most recent call last):
      File "main.py", line 10, in <module>
        from torch.cuda import amp
    ImportError: cannot import name 'amp' from 'torch.cuda'
    

    How can I train your code?

    Thanks..
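
    For reference, torch.cuda.amp was added in PyTorch 1.6, and this repo's requirements list torch 1.7+, so this ImportError usually means an older PyTorch is installed. A quick check:

    import torch

    print(torch.__version__)                  # torch.cuda.amp needs >= 1.6
    assert hasattr(torch.cuda, "amp"), "upgrade PyTorch to use --amp"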

    opened by edwardcho 2
  • Why use aug image as model input?

    Thank you for your implementation! I have some points of confusion.

    In your code, I think images_uw is the unlabeled original image, and images_us is the unlabeled RandAugmented image.

    In paper v3 (https://arxiv.org/pdf/2003.10580v3.pdf), Algorithm 1, when "Update the student using the pseudo label", the authors use the unlabeled original image to update the StudentModel, but in your code,

    s_images = torch.cat((images_l, images_us))
    s_logits = student_model(s_images)
    .....
    s_loss = criterion(s_logits_us, hard_pseudo_label)
    

    Why do you use images_us (the RandAugmented image) as the StudentModel input to compute s_loss instead of images_uw?

    The same question applies to Algorithm 1's "Compute the teacher's feedback coefficient as in Equation 12" and "Compute the teacher's gradient from the student's feedback": the authors use the unlabeled original image as the input, but in your code every unlabeled input is RandAugmented, not original. Can I ask why?

    Thank you!

    opened by Duplex3345678 2
  • Question about implementation of calculating second-order derivative

    First of all, thank you for the great PyTorch implementation.

    But I can't see how the optimization procedure of the teacher network (Eq. 3 of the paper) is implemented. The code should calculate a second-order derivative during training, which seems to be missing from the current version. Would you check the code again and let me know whether there is something I'm missing?

    Thank you!
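
    For context, the paper's appendix derives a first-order approximation rather than an explicit second-order derivative: the teacher's MPL objective is weighted by the scalar change in the student's labeled loss across the student update. A hedged sketch of that idea (variable names are illustrative, not necessarily the repo's):

    # Scalar feedback: how much the student's labeled loss improved
    # after learning from the teacher's pseudo labels.
    dot_product = s_loss_l_old - s_loss_l_new

    # Reinforce the teacher toward pseudo labels that helped the student.
    _, hard_pseudo_label = torch.max(t_logits_us.detach(), dim=-1)
    t_loss_mpl = dot_product * F.cross_entropy(t_logits_us, hard_pseudo_label)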

    opened by kdwonn 2
  • TypeError: __init__() got an unexpected keyword argument 'label_smoothing'

    Has anyone met this problem? I could not deal with it. Detail:

    TypeError: __init__() got an unexpected keyword argument 'label_smoothing'

    It is raised from main.py line 546 (criterion = create_loss_fn(args)) via utils.py line 32 (criterion = nn.CrossEntropyLoss(label_smoothing=args.label_smoothing)).

    Thanks!
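
    For reference, the label_smoothing argument was added to nn.CrossEntropyLoss in PyTorch 1.10; on older versions the constructor raises exactly this TypeError. A quick check:

    import torch
    import torch.nn as nn

    print(torch.__version__)                  # label_smoothing needs >= 1.10
    criterion = nn.CrossEntropyLoss(label_smoothing=0.15)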

    opened by silencessss 1
  • ValueError: Caught ValueError in DataLoader worker process 0.

    Hello, thank you for your code. I have a problem:

    Traceback (most recent call last):
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\main.py", line 156, in train_loop
        (images_uw, images_us), _ = unlabeled_iter.next()
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
        data = self._next_data()
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
        return self._process_data(data)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
        data.reraise()
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\_utils.py", line 429, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
        data = fetcher.fetch(index)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\data.py", line 180, in __getitem__
        img = self.transform(img)
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\data.py", line 159, in __call__
        aug = self.aug(x)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torchvision\transforms\transforms.py", line 60, in __call__
        img = t(img)
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\augmentation.py", line 211, in __call__
        img = op(img, v=self.m, max_v=max_v, bias=bias)
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\augmentation.py", line 165, in TranslateYConst
        return img.transform(img.size, PIL.Image.AFFINE, (1, 0, 0, 0, 1, v), RESAMPLE_MODE)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\PIL\Image.py", line 2490, in transform
        im.__transformer(
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\PIL\Image.py", line 2556, in __transformer
        raise ValueError(
    ValueError: Unknown resampling filter (None). Use Image.NEAREST (0), Image.BILINEAR (2) or Image.BICUBIC (3)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\main.py", line 633, in <module>
        main()
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\main.py", line 626, in main
        train_loop(args, labeled_loader, unlabeled_loader, test_loader,
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\main.py", line 162, in train_loop
        (images_uw, images_us), _ = unlabeled_iter.next()
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
        data = self._next_data()
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
        return self._process_data(data)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
        data.reraise()
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\_utils.py", line 429, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
        data = fetcher.fetch(index)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\data.py", line 180, in __getitem__
        img = self.transform(img)
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\data.py", line 159, in __call__
        aug = self.aug(x)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\torchvision\transforms\transforms.py", line 60, in __call__
        img = t(img)
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\augmentation.py", line 211, in __call__
        img = op(img, v=self.m, max_v=max_v, bias=bias)
      File "C:\Users\user\Desktop\Pyler\MPL-pytorch\augmentation.py", line 111, in ShearX
        return img.transform(img.size, PIL.Image.AFFINE, (1, v, 0, 0, 1, 0), RESAMPLE_MODE)
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\PIL\Image.py", line 2490, in transform
        im.__transformer(
      File "C:\Users\user\anaconda3\envs\pyler\lib\site-packages\PIL\Image.py", line 2556, in __transformer
        raise ValueError(
    ValueError: Unknown resampling filter (None). Use Image.NEAREST (0), Image.BILINEAR (2) or Image.BICUBIC (3)

    Something went wrong. I think it is because of the PyTorch version; mine is 1.8.0. How can I fix it?

    Thank you
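
    The trace shows RESAMPLE_MODE resolving to None inside augmentation.py; one possible workaround (an untested assumption, not the maintainer's fix) is to pin a valid PIL filter there:

    import PIL.Image

    # Any of the filters named in the error message works here.
    RESAMPLE_MODE = PIL.Image.BICUBIC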

    opened by ParkDongChan 1
  • learning rate not match?

    In the link, I see that the learning rate (of the student?) starts from 0.5, but if I understand your code correctly, the learning rate should be 0 during the first 3000 steps when training the CIFAR-10 model, since you specified num_wait_steps to be 3000. You also specified num_warmup_steps to be 5000. Can you please explain a bit? Thanks!
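
    For reference, a wait-then-warmup schedule of the kind described could look like the sketch below (illustrative, using the step counts above; the repo's actual scheduler may also apply cosine decay):

    def lr_lambda(step, wait=3000, warmup=5000):
        if step < wait:
            return 0.0                        # student waits: LR stays 0
        if step < wait + warmup:
            return (step - wait) / warmup     # linear warmup to the base LR
        return 1.0                            # decay phase omitted for brevity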

    opened by ifsheldon 1
  • About the derivation in the Appendix

    This is very good work!

    But I am confused about the derivation of Equation 10 in the appendix. How can we apply the REINFORCE identity to obtain Equation 10?

    [screenshot: Equation 10 from the paper's appendix]
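
    For context, the general REINFORCE (log-derivative) identity being invoked is:

    \nabla_\theta \, \mathbb{E}_{x \sim p_\theta}[f(x)] = \mathbb{E}_{x \sim p_\theta}\left[ f(x) \, \nabla_\theta \log p_\theta(x) \right]

    Taking f to be the student's loss and p_\theta the teacher's pseudo-label distribution turns the gradient of the expectation into an expectation of a weighted log-likelihood gradient, which appears to be the step used in Equation 10.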

    opened by blue-blue272 0
  • Accuracy with or without finetune

    Hi, thanks for your great code. I am confused about the accuracy with and without finetuning on CIFAR-10-4K. The results show that accuracy with and without finetuning on CIFAR-10-4K is 96.01% and 96.08%, which means there is no significant gain from finetuning. Does this mean we could train our student model using only unlabeled data (with pseudo labels generated by the teacher model) and obtain accuracy almost equal to training with labeled data? And would that indirectly indicate that the pseudo labels generated by the teacher model are high quality? I would appreciate it if you could give me a response. Thanks!

    opened by AdventureStory 2
  • Incorrect cross entropy?

    https://github.com/kekmodel/MPL-pytorch/blob/7fb5b40cd53179bf4c09ef0f916815c3272d3e9d/main.py#L197

    At this point, hard_pseudo_label is a batch-size-by-1 array of integer indices, not a one-hot encoding.

    Is this correct when calculating cross entropy?
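
    For reference, PyTorch's cross entropy expects integer class indices as targets (a one-hot encoding is not required), so this usage is consistent with the API:

    import torch
    import torch.nn.functional as F

    logits = torch.randn(4, 10)              # (batch, num_classes)
    targets = torch.tensor([3, 1, 0, 7])     # class indices, shape (batch,)
    loss = F.cross_entropy(logits, targets)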

    opened by conorturner 0
  • Why do we expand labels?

    https://github.com/kekmodel/MPL-pytorch/blob/7fb5b40cd53179bf4c09ef0f916815c3272d3e9d/data.py#L138

    Hey all,

    I've been working my way through this implementation and cannot work out why the expand-labels option exists. It seems that even if the labels aren't expanded, the data loader will loop anyway.

    Can anyone explain why this is needed?

    opened by conorturner 0
Owner
Jungdae Kim
AI research engineer