Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

Overview

A Unified Objective for Novel Class Discovery

This is the official repository for the paper:

A Unified Objective for Novel Class Discovery
Enrico Fini, Enver Sangineto, Stéphane Lathuilière, Zhun Zhong, Moin Nabi, Elisa Ricci
ICCV 2021 (Oral)

Paper: ArXiv
Project Page: Website

Abstract: In this paper, we study the problem of Novel Class Discovery (NCD). NCD aims at inferring novel object categories in an unlabeled set by leveraging from prior knowledge of a labeled set containing different, but related classes. Existing approaches tackle this problem by considering multiple objective functions, usually involving specialized loss terms for the labeled and the unlabeled samples respectively, and often requiring auxiliary regularization terms. In this paper we depart from this traditional scheme and introduce a UNified Objective function (UNO) for discovering novel classes, with the explicit purpose of favoring synergy between supervised and unsupervised learning. Using a multi-view self-labeling strategy, we generate pseudo-labels that can be treated homogeneously with ground truth labels. This leads to a single classification objective operating on both known and unknown classes. Despite its simplicity, UNO outperforms the state of the art by a significant margin on several benchmarks (+10% on CIFAR-100 and +8% on ImageNet).



A visual comparison of our UNified Objective (UNO) with previous works.



Overview of the proposed architecture.


Installation

Our implementation is based on PyTorch and PyTorch Lightning. Logging is performed using Wandb. We recommend using conda to create the environment and install dependencies:

conda create --name uno python=3.8
conda activate uno
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=XX.X -c pytorch
pip install pytorch-lightning==1.1.3 lightning-bolts==0.3.0 wandb sklearn
mkdir -p logs/wandb checkpoints

Select the appropriate cudatoolkit version according to your system. Optionally, you can also replace pillow with pillow-simd (if your machine supports it) for faster data loading:

pip uninstall pillow
CC="cc -mavx2" pip install -U --force-reinstall pillow-simd

Datasets

For CIFAR10 and CIFAR100 you can just pass --download and the datasets will be automatically downloaded in the directory specified with --data_dir YOUR_DATA_DIR. For ImageNet you will need to follow the instructions on this website.
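
For example, combining these flags with the CIFAR10 pretraining command from the Commands section gives something like:

python main_pretrain.py --dataset CIFAR10 --download --data_dir YOUR_DATA_DIR --gpus 1 --precision 16 --max_epochs 200 --batch_size 256 --num_labeled_classes 5 --num_unlabeled_classes 5 --comment 5_5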

Checkpoints

All checkpoints (after the pretraining phase) are available on Google Drive. We recommend using gdown to download them directly to your server. First, install gdown with the following command:

pip install gdown

Then, open the Google Drive folder, choose the checkpoint you want to download, right-click and select Get link > Copy link. For instance, for CIFAR10 the link will look something like this:

https://drive.google.com/file/d/1Pa3qgHwK_1JkA-k492gAjWPM5AW76-rl/view?usp=sharing

Now, remove /view?usp=sharing and replace file/d/ with uc?id=. Finally, download the checkpoint by running the following command:

gdown https://drive.google.com/uc?id=1Pa3qgHwK_1JkA-k492gAjWPM5AW76-rl
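
If you prefer to script this transformation, a minimal Python helper (illustrative, not part of the repo) could look like this:

def drive_share_to_gdown(url):
    """Convert .../file/d/<ID>/view?usp=sharing into .../uc?id=<ID> for gdown."""
    file_id = url.split("/file/d/")[1].split("/")[0]
    return "https://drive.google.com/uc?id=" + file_id

print(drive_share_to_gdown("https://drive.google.com/file/d/1Pa3qgHwK_1JkA-k492gAjWPM5AW76-rl/view?usp=sharing"))
# https://drive.google.com/uc?id=1Pa3qgHwK_1JkA-k492gAjWPM5AW76-rl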

Logging

Logging is performed with Wandb. Please create an account and specify your --entity YOUR_ENTITY and --project YOUR_PROJECT. For debugging, or if you do not want all the perks of Wandb, you can disable logging by passing --offline.
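
For example, the CIFAR10 pretraining command from the Commands section, with logging directed to your own Wandb workspace (YOUR_ENTITY and YOUR_PROJECT are placeholders), would look like:

python main_pretrain.py --dataset CIFAR10 --gpus 1 --precision 16 --max_epochs 200 --batch_size 256 --num_labeled_classes 5 --num_unlabeled_classes 5 --comment 5_5 --entity YOUR_ENTITY --project YOUR_PROJECT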

Commands

Pretraining

Running pretraining on CIFAR10 (5 labeled classes):

python main_pretrain.py --dataset CIFAR10 --gpus 1  --precision 16 --max_epochs 200 --batch_size 256 --num_labeled_classes 5 --num_unlabeled_classes 5 --comment 5_5

Running pretraining on CIFAR100-80 (80 labeled classes):

python main_pretrain.py --dataset CIFAR100 --gpus 1 --precision 16 --max_epochs 200 --batch_size 256 --num_labeled_classes 80 --num_unlabeled_classes 20 --comment 80_20

Running pretraining on CIFAR100-50 (50 labeled classes):

python main_pretrain.py --dataset CIFAR100 --gpus 1 --precision 16 --max_epochs 200 --batch_size 256 --num_labeled_classes 50 --num_unlabeled_classes 50 --comment 50_50

Running pretraining on ImageNet (882 labeled classes):

python main_pretrain.py --gpus 2 --num_workers 8 --distributed_backend ddp --sync_batchnorm --precision 16 --dataset ImageNet --data_dir PATH/TO/IMAGENET --max_epochs 100 --warmup_epochs 5 --batch_size 256 --num_labeled_classes 882 --num_unlabeled_classes 30 --comment 882_30

Discovery

Running discovery on CIFAR10 (5 labeled classes, 5 unlabeled classes):

python main_discover.py --dataset CIFAR10 --gpus 1 --precision 16 --max_epochs 200 --batch_size 256 --num_labeled_classes 5 --num_unlabeled_classes 5 --pretrained PATH/TO/CHECKPOINTS/pretrain-resnet18-CIFAR10.cp --num_heads 4 --comment 5_5

Running discovery on CIFAR100-20 (80 labeled classes, 20 unlabeled classes):

python main_discover.py --dataset CIFAR100 --gpus 1 --max_epochs 200 --batch_size 256 --num_labeled_classes 80 --num_unlabeled_classes 20 --pretrained PATH/TO/CHECKPOINTS/pretrain-resnet18-CIFAR100-80_20.cp --num_heads 4 --comment 80_20 --precision 16

Running discovery on CIFAR100-50 (50 labeled classes, 50 unlabeled classes):

python main_discover.py --dataset CIFAR100 --gpus 1 --max_epochs 200 --batch_size 256 --num_labeled_classes 50 --num_unlabeled_classes 50 --pretrained PATH/TO/CHECKPOINTS/pretrain-resnet18-CIFAR100-50_50.cp --num_heads 4 --comment 50_50 --precision 16

Running discovery on ImageNet (882 labeled classes, 30 unlabeled classes)

python main_discover.py --dataset ImageNet --gpus 2 --num_workers 8 --distributed_backend ddp --sync_batchnorm --precision 16  --data_dir PATH/TO/IMAGENET --max_epochs 60 --base_lr 0.02 --warmup_epochs 5 --batch_size 256 --num_labeled_classes 882 --num_unlabeled_classes 30 --num_heads 3 --pretrained PATH/TO/CHECKPOINTS/pretrain-resnet18-ImageNet.cp --imagenet_split A --comment 882_30-A

NOTE: to run ImageNet splits B or C, just pass --imagenet_split B or --imagenet_split C.
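
For example, a split-B run would presumably mirror the command above, changing only the split and the comment tag:

python main_discover.py --dataset ImageNet --gpus 2 --num_workers 8 --distributed_backend ddp --sync_batchnorm --precision 16 --data_dir PATH/TO/IMAGENET --max_epochs 60 --base_lr 0.02 --warmup_epochs 5 --batch_size 256 --num_labeled_classes 882 --num_unlabeled_classes 30 --num_heads 3 --pretrained PATH/TO/CHECKPOINTS/pretrain-resnet18-ImageNet.cp --imagenet_split B --comment 882_30-B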

Citation

If you like our work, please cite our paper:

@InProceedings{fini2021unified,
    author    = {Fini, Enrico and Sangineto, Enver and Lathuilière, Stéphane and Zhong, Zhun and Nabi, Moin and Ricci, Elisa},
    title     = {A Unified Objective for Novel Class Discovery},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2021}
}
Comments
  • swapped_prediction computation

    https://github.com/DonkeyShot21/UNO/blob/50022c9e4a9d9eedd5916a4c12dcd5e7d793b6d9/main_discover.py#L180

    Thanks for your nice work~

    However, I have a question: why do we compute swapped_prediction in a loop over num_heads? It seems the swapped_prediction computation is identical for every head.

    Hoping for your reply.

    opened by DeepTecher 9
  • How to implement the unconcat version?

    Hi Enrico,

    Thanks for your work and clean code! From your paper, I see that accuracy increases dramatically after concatenating the labeled and unlabeled logits, and I would really like to see how to implement the unconcatenated version. Looking forward to your reply.

    opened by ziyunli-2023 7
  • Reproducing the paper results

    Dear authors,

    Thank you for your exciting work and very clean code. I am having trouble reproducing the results mentioned in the paper and would appreciate it if you could help me.

    1. Reproducing the UNO results from Table 4. I was trying to get the scores on the samples of novel classes from the test split (Table 4 in the paper).

    I have executed the commands for CIFAR10, CIFAR80-20, CIFAR50-50, and used Wandb for logging. However, the results on all datasets did not match the ones that I see in the paper. I took the results from incremental/unlabel/test/acc.

    | Dataset | Paper (avg/best) | Reproduced (avg/best) |
    | -- | -- | -- |
    | CIFAR10 | 93.3 / 93.3 | 90.8 / 90.8 |
    | CIFAR80-20 | 72.7 / 73.1 | 65.3 / 65.3 |
    | CIFAR50-50 | 50.6 / 50.7 | 44.9 / 45.7 |

    (screenshot of the Wandb metrics)

    Potential issues:

    • I am not using the exact versions of the packages mentioned in your ReadMe, and for that reason I have run the CIFAR80-20 experiment twice, manually setting the seed (as in the RankStats repo); however, I obtained very similar results. I also would not expect a ~7% difference on CIFAR80-20 just due to the package versions.
    • I may be using the wrong metric from wandb (I have used incremental/unlabel/test/acc). However, if you check my screenshot, for CIFAR80-20 all the other metrics are significantly different anyway (the value close to 72.7/73.1 does not appear anywhere).

    2. How exactly was the RankStats algorithm evaluated on CIFAR50-50?

    Could you please share whether you performed any hyperparameter tuning for the CIFAR50-50 dataset when running the RankStats algorithm on it? I ran multiple experiments and my training was very unstable; the algorithm always ends up scoring ~20/17 on known/novel classes.

    Thanks a lot for your time.

    opened by vlfom 7
  • a lot of questions about how to reproduce and cite the experimental results

    Thank you for sharing the code of your paper! I have a lot of questions about how to reproduce and cite its experimental results.

    1. My first question is how to reproduce the results in the original paper. I noticed that a new version (UNO v2) was released with higher performance, and meanwhile a lot of hyperparameters have been changed. I wonder what values these hyperparameters were set to in your original paper. Here is my guess:
    • For multi-view, only two large crops were used to build the swapped predictions loss. The two small crops were added in UNO v2.
    • For the base learning rate, it was originally set to base_lr=0.1 as described in the paper, not the value base_lr=0.4 in the current commit.
    • For the batch size, it was originally set to 512 as described in the paper. (Actually, I am not sure since it was set to 256 in the earlier commits of this repo).
    • For the discovery epochs, it was originally set to max_epochs=200 for all datasets as described in the paper, rather than max_epochs=500 for CIFAR10/CIFAR100-20/CIFAR100-50 and max_epochs=60 for ImageNet as in the current commit.
    • As for data augmentations, Solarize and Equalize were only added in UNO v2 and were not used in the original paper.
    2. My second question is how to cite the experimental results. I noticed that a lot of training tricks (i.e., doubled training epochs, two extra small crops for multi-view, and more data augmentation transformations) were used in UNO v2. However, some of these tricks make for an unfair comparison with previous work. For example, the representative previous work RS [1,2] used only batch_size=128 and max_epochs=200, without any of the complex augmentations used in your paper.

    And as shown in your update as well as in my experiments, some changes, like the number of discovery epochs or the batch size, had significant effects on the final performance. So I am really confused about how to make fair comparisons...

    References:
    [1] Automatically Discovering and Learning New Visual Categories with Ranking Statistics. ICLR 2020.
    [2] AutoNovel: Automatically Discovering and Learning Novel Visual Categories. TPAMI, 2021.

    opened by vicmax 6
  • The results on CIFAR10

    Hi, author:

    Thanks for your nice code. I benefit a lot from it.

    For UNO v2, I can reproduce the results on CIFAR100, but I cannot reproduce the results on CIFAR10 (93.6±0.2), which are worse than UNO v1 (96.1±0.5). I have tried many times but haven't figured out why.

    I would very much appreciate it if you could provide some help.

    opened by kleinzcy 2
  • Clarification question on num_large_crops

    Hi Enrico,

    Just a clarification question. If I understand correctly, num_large_crops basically controls the number of augmented versions of an image. Can you confirm?

    As num_large_crops is set to 2, we basically have two augmented versions of the same image in each mini-batch. These are indeed used for swapped prediction. Is my understanding correct?

    Please do get back when you get a chance.

    Thanks, Joseph

    opened by JosephKJ 2
  • How long do the ImageNet experiments take?

    Hi Enrico,

    Thanks again for your nice work and clean code.

    I am just wondering how long the ImageNet experiments take, say, ImageNet-A. And what GPU are you using?

    Thanks, Joseph

    opened by JosephKJ 2
  • loss_per_head seems wrong

    Hi, I think the calculation of loss_per_head is wrong.

    def cross_entropy_loss(self, preds, targets):
        # preds: (n_heads, batch_size, logits)
        preds = F.log_softmax(preds / self.hparams.temperature, dim=-1)
        return -torch.mean(torch.sum(targets * preds, dim=-1))

    This code averages over both dims 0 and 1, so the loss is not head-wise. When executing self.loss_per_head += loss_cluster.clone().detach(), every head accumulates the same mean value.
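
    A quick shape check (with made-up sizes: 4 heads, batch of 8, 10 logits) illustrates the problem; averaging only over the batch dimension keeps one loss per head:

    import torch
    import torch.nn.functional as F

    preds = torch.randn(4, 8, 10)                       # n_heads, batch_size, logits
    targets = F.softmax(torch.randn(4, 8, 10), dim=-1)  # dummy soft targets
    ce = -torch.sum(targets * F.log_softmax(preds, dim=-1), dim=-1)  # (n_heads, batch_size)
    print(torch.mean(ce).shape)          # torch.Size([]) -- a single scalar shared by all heads
    print(torch.mean(ce, dim=-1).shape)  # torch.Size([4]) -- one loss value per head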

        return -torch.mean(torch.sum(targets * preds, dim=-1), dim=-1)  # revised: one loss per head

    I think this would be good. If there's anything wrong with my understanding, please correct me. Thanks.

    opened by haoosz 2
  • A question about Eq. 4

    Excellent work. I have a question about Eq. 4. Is optimizing Eq. 4 with the Sinkhorn-Knopp algorithm equivalent to a cross-entropy loss with label smoothing? What are the advantages of using the Sinkhorn-Knopp algorithm instead of a plain cross-entropy loss? Thank you! Looking forward to your reply.
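
    For reference, the Sinkhorn-Knopp step I am asking about looks roughly like this (a SwAV-style sketch, not necessarily the exact code in this repo):

    import torch

    @torch.no_grad()
    def sinkhorn(logits, eps=0.05, n_iters=3):
        # logits: (batch_size, num_classes) outputs of the unlabeled head
        Q = torch.exp(logits / eps).t()      # (num_classes, batch_size)
        Q /= Q.sum()
        K, B = Q.shape
        for _ in range(n_iters):
            Q /= Q.sum(dim=1, keepdim=True)  # rows -> uniform marginal over classes
            Q /= K
            Q /= Q.sum(dim=0, keepdim=True)  # columns -> uniform marginal over samples
            Q /= B
        return (Q * B).t()                   # soft pseudo-labels; each row sums to 1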

    opened by NanAlbert 2
  • Apply to a custom dataset

    Hi, thanks for your great work! I wonder if this implementation can be used to detect totally new classes without training, just using the pre-trained model? Thank you!

    opened by hosea7456 2
  • UNO_V2 results

    For the UNO v2 results on CIFAR100, is the accuracy you provide the mean over multiple runs, taking the best head in each run? Do you also have the standard deviation? I would also like to know whether the accuracy is taken from the best validation epoch or from the last epoch.

    opened by ziyunli-2023 1
  • Issues with saving and loading checkpoints when using multiple gpus.

    Hi, thank you for sharing your code.

    I'm trying to follow your instructions, but when I run the discovery code it fails to load the pretrained model. My environment is:

    • Ubuntu 16.04 LTS
    • 2x NVIDIA RTX 3090
    • python 3.8, cuda 11.0, pytorch 1.7.1, torchvision 0.8.2
    • same versions of pytorch-lightning and lightning-bolts as in the repo

    The error is:

    Traceback (most recent call last):
      File "main_discover.py", line 280, in <module>
        main(args)
      File "main_discover.py", line 266, in main
        model = Discoverer(**args.__dict__)
      File "main_discover.py", line 70, in __init__
        state_dict = torch.load(self.hparams.pretrained, map_location=self.device)
      File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 594, in load
        return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
      File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 853, in _load
        result = unpickler.load()
      File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 845, in persistent_load
        load_tensor(data_type, size, key, _maybe_decode_ascii(location))
      File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 833, in load_tensor
        storage = zip_file.get_storage_from_record(name, size, dtype).storage()
    RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading file data/94820505364016: invalid header or archive is corrupted

    I believe it's due to distributed data parallel (ddp), but how can I stop multiple cards from saving the model?

    opened by dhkim2810 0
Owner

Enrico Fini, PhD Student at University of Trento