PyTorch implementation of the paper SPICE: Semantic Pseudo-labeling for Image Clustering

Chuang Niu

Last update: Dec 15, 2022


Overview

SPICE: Semantic Pseudo-labeling for Image Clustering

By Chuang Niu and Ge Wang

This is a PyTorch implementation of the paper. (Still being updated.)

State of the art (SOTA) on 5 benchmarks. Please refer to Papers With Code for Image Clustering.

Installation

Please refer to requirement.txt for all required packages. Assuming Anaconda with Python 3.7, a step-by-step example for installing this project is as follows:

conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
conda install -c conda-forge addict tensorboard python-lmdb
conda install matplotlib scipy scikit-learn pillow

Then clone this repo:

git clone https://github.com/niuchuangnn/SPICE.git
cd SPICE

Data

Prepare datasets of interest as described in dataset.md.

Training

Read the training tutorial for details.

Evaluation

Evaluation of SPICE-Self:

python tools/eval_self.py --config-file configs/stl10/eval.py --weight PATH/TO/MODEL --all 1

Evaluation of SPICE-Semi:

python tools/eval_semi.py --load_path PATH/TO/MODEL --net WideResNet --widen_factor 2 --data_dir PATH/TO/DATA --dataset cifar10 --all 1

Read the evaluation tutorial for more descriptions about the evaluation and the visualization of learned clusters.

Model Zoo

All trained models in our paper are available as follows.

Dataset	Version	ACC	NMI	ARI	Model link
STL10	SPICE-Self	91.0	82.0	81.5	Model
	SPICE	93.8	87.2	87.0	Model
	SPICE-Self*	89.9	80.9	79.7	Model
	SPICE*	92.9	86.0	85.3	Model
CIFAR10	SPICE-Self	83.8	73.4	70.5	Model
	SPICE	92.6	86.5	85.2	Model
	SPICE-Self*	84.9	74.5	71.8	Model
	SPICE*	91.7	85.8	83.6	Model
CIFAR100	SPICE-Self	46.8	44.8	29.4	Model
	SPICE	53.8	56.7	38.7	Model
	SPICE-Self*	48.0	45.0	30.8	Model
	SPICE*	58.4	58.3	42.2	Model
ImageNet-10	SPICE-Self	96.9	92.7	93.3	Model
	SPICE	96.7	91.7	92.9	Model
ImageNet-Dog	SPICE-Self	54.6	49.8	36.2	Model
	SPICE	55.4	50.4	34.3	Model
TinyImageNet	SPICE-Self	30.5	44.9	16.3	Model
	SPICE-Self*	29.2	52.5	14.5	Model

More models based on ResNet18 for both SPICE-Self* and SPICE-Semi*.

Dataset	Version	ACC	NMI	ARI	Model link
STL10	SPICE-Self*	86.2	75.6	73.2	Model
	SPICE*	92.0	85.2	83.6	Model
CIFAR10	SPICE-Self*	84.5	73.9	70.9	Model
	SPICE*	91.8	85.0	83.6	Model
CIFAR100	SPICE-Self*	46.8	45.7	32.1	Model
	SPICE*	53.5	56.5	40.4	Model
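
The ACC, NMI, and ARI columns in the tables above are standard clustering metrics. As a rough illustration of how ACC is commonly computed for clustering (not necessarily the exact code used in this repo), predicted cluster ids are first matched one-to-one to ground-truth classes with the Hungarian algorithm, and accuracy is measured under that best matching. The helper name `clustering_accuracy` is hypothetical. It returns a fraction, whereas the tables report percentages.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def clustering_accuracy(y_true, y_pred):
    """Best-match clustering accuracy (illustrative helper, not the repo's code)."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    n = int(max(y_true.max(), y_pred.max())) + 1

    # counts[i, j] = number of samples in predicted cluster i with true class j
    counts = np.zeros((n, n), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        counts[p, t] += 1

    # The Hungarian algorithm minimizes cost, so negate to maximize matches.
    rows, cols = linear_sum_assignment(-counts)
    return counts[rows, cols].sum() / y_true.size


# Cluster ids are a permutation of the class ids, so accuracy is perfect.
print(clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2]))
```

NMI and ARI are available in scikit-learn as `normalized_mutual_info_score` and `adjusted_rand_score`, should you want to check the other two columns on your own predictions.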

Acknowledgement for reference repos

Citation

@misc{niu2021spice,
      title={SPICE: Semantic Pseudo-labeling for Image Clustering}, 
      author={Chuang Niu and Ge Wang},
      year={2021},
      eprint={2103.09382},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Comments

Issue in replicating results on CIFAR10

I am trying to train the model and replicate the results for the CIFAR10 dataset. I downloaded the dataset as per the instructions and followed the training steps described in the attached file, trainingStepsCIFAR10.md. It contains the commands and configs used (where they differ from the default configs in the repo) along with output snippets.

MoCo trains without problems. I think the cluster heads are not being trained, as accuracy does not rise above 26%. I also tried a lower learning rate of 0.0005, but it didn't help. The step that finds locally consistent examples fails to find any at ratio_confident=0.99 (the provided default) and finds only 6770 at ratio_confident=0.9.

Is this normal? If not, can you give me some direction on where the issue might be and how it can be resolved?

Detailed logs of the cluster head training are in the file spice-self-logs.txt. There are a few broken pipe errors. I think those are due to using Jupyter for training and reconnecting to the Jupyter server. They shouldn't affect the convergence of the model.

opened by ML-Guy 21
question about parameters

Dear author, the image size of my own dataset is 224*224. In train_moco.py, how should I change the learning rate and batch_size?

When I ran with batch_size=128 and left lr unchanged, the running loss was about 9.5.
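
As a point of reference for the loss value quoted above: a contrastive (InfoNCE) loss with one positive and K queued negatives sits at roughly ln(K + 1) when the model is no better than a uniform guess. The sketch below assumes a queue size of 12800, which is the `--moco-k` value that appears in a CIFAR10 command elsewhere on this page. The poster's actual queue size is not stated, and the function name is hypothetical.

```python
import math


def chance_level_infonce_loss(num_negatives: int) -> float:
    """Cross-entropy of a uniform guess over 1 positive + num_negatives negatives."""
    return math.log(num_negatives + 1)


# Assumed queue size (--moco-k 12800); the poster's setting is unknown.
print(round(chance_level_infonce_loss(12800), 2))  # 9.46
```

Under that assumption, a running loss that stays near 9.5 would suggest the encoder has not yet learned to separate positives from negatives. A loss well below that value would indicate progress.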

opened by anewusername77 11

Does train_self_v2.py need ground-truth labels?

    targets_i.append(gt_cluster_labels[h][img_idx_i].to(cfg.gpu, non_blocking=True))

    loss_dict = model(imgs_i, target=targets_i, forward_type="loss")

    loss = sum(loss for loss in loss_dict.values())
    loss_mean = loss / num_heads

    optimizer.zero_grad()
    loss_mean.backward()
    optimizer.step()

I want to train on my own data WITHOUT ground-truth labels, but it seems this step needs labels?

opened by anewusername77 4

A quick question

Dear authors of SPICE,

I am very impressed by your work. Just a quick question: how is the semantic prediction matrix P obtained from the features?

Thanks!

opened by zhenxianglance 4
What if I don't have a reliable dataset with labels?

As the title says, can I train this model and use it to label my data? The images have no class labels at all. I just want to assign each one to a specific class (the classes don't need to have a meaning).

opened by anewusername77 4
Error from local_consistency: cannot create reliable_labels, and how to get predicted labels for all the ground-truth clusters

I trained the feature model on a custom dataset and ran the stage 2 training using the train_self_v2.py script.

In the stage 2 evaluation metrics, however, the accuracy was always -1 (it never improved over 56 epochs). This was because the number of clusters created was not equal to the number of ground-truth labels, but the training still completed.

Afterwards, when running the local_consistency.py script, the model output for the local_consistency forward type was empty:

idx_select, labels_select = model(feas_sim=feas_sim, scores=scores, forward_type="local_consistency")

I tried different ratio_confident and score_th values, but idx_select and labels_select are always tensor([ ], dtype=torch.int64) tensor([ ], dtype=torch.int64)

As a result, the reliable_labels are not created.
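
To make the effect of the two thresholds concrete, here is a minimal NumPy sketch of a local-consistency filter. It is an assumption about the general idea (keep a sample only if enough of its nearest neighbors share its predicted cluster and its own score is high enough), not the repository's `local_consistency` implementation. The function `select_reliable` and the parameter `num_neighbors` are hypothetical, and only the names `ratio_confident` and `score_th` are borrowed from the post above.

```python
import numpy as np


def select_reliable(features, scores, num_neighbors, ratio_confident, score_th):
    """Illustrative filter: indices and labels of locally consistent samples."""
    sim = features @ features.T                 # cosine similarity for L2-normalized rows
    labels = scores.argmax(axis=1)              # predicted cluster per sample
    confidence = scores.max(axis=1)             # score of the predicted cluster
    # Indices of the most similar samples (each sample is its own neighbor too).
    nn_idx = np.argsort(-sim, axis=1)[:, :num_neighbors]
    # Fraction of neighbors that share the sample's predicted cluster.
    agreement = (labels[nn_idx] == labels[:, None]).mean(axis=1)
    keep = (agreement >= ratio_confident) & (confidence >= score_th)
    return np.flatnonzero(keep), labels[keep]


# Toy data: two tight groups of three samples; the last sample is mispredicted.
features = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)
scores = np.array([[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 2 + [[0.6, 0.4]])

strict_idx, _ = select_reliable(features, scores, 3, ratio_confident=0.99, score_th=0.8)
loose_idx, _ = select_reliable(features, scores, 3, ratio_confident=0.6, score_th=0.8)
print(strict_idx, loose_idx)
```

In this toy setup a single mispredicted neighbor is enough to drop a whole group at the strict threshold. If the cluster heads have not converged, an empty selection at every threshold is therefore plausible under this kind of filter.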

opened by Balakumaran-kandula 3
embeddings TSNE

Hi,

I want to plot the embeddings' tsne. I am using feas_moco_512_l2.npy as the embeddings and datasets/stl10/stl10_binary/fold_indices.txt as the labels. The tsne plot looks a bit random though I can see different clusters... It seems that "fold_indices.txt" is not what I think it is. I've also noticed that there are some indices that correspond to multiple different classes (for example index 4555 shows up for both class '1' and class '8' in "fold_indices.txt").
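
For context, in the STL-10 binary release the class labels are stored in the `train_y.bin` and `test_y.bin` files as one unsigned byte per image with values 1 to 10, while `fold_indices.txt` lists image indices for the predefined training folds rather than classes. A minimal sketch for reading a label file follows. The helper name is hypothetical, and whether the rows of `feas_moco_512_l2.npy` follow the same order as these label files is an assumption that should be checked against the dataset class used to extract the embeddings.

```python
import numpy as np


def load_stl10_labels(path):
    """Read an STL-10 label file (e.g. train_y.bin) and return 0-based class ids."""
    labels = np.fromfile(path, dtype=np.uint8)  # one byte per image, values 1..10
    return labels.astype(np.int64) - 1
```

The returned array can then be passed as the color argument of a t-SNE scatter plot, once the ordering of embeddings and labels has been confirmed to match.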

Any ideas?

Thanks!

opened by ofir-dl 3
Does SPICE need targets(label for data on supervised learning)?

Hello Sir,

I saw targets in CIFAR10.py. Does SPICE need targets (labels, as in supervised learning)? If so, why are targets needed in your process?

Thanks, Edward Cho.

opened by edwardcho 2
I cannot train the model on an RTX 3090

UserWarning: NVIDIA GeForce RTX 3090 with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37. If you want to use the NVIDIA GeForce RTX 3090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

opened by YexiongLin 2
Evaluation on CIFAR100

Hi, thanks for sharing your code. I want to run an evaluation on CIFAR100. I used the following nets and weights, but I got a mismatch error while loading the state_dict:

--load_path ./model_zoo/model_cifar100_cls_res18.pth --net resnet18_cifar
and
--load_path ./model_zoo/model_cifar100.pth --net WideResNet

Could you please tell me which net and model I should use for CIFAR100? Thanks

opened by nahidq 2
invalid ELF header

When trying to run your code I get the following error. Do you have any suggestions as to what I might be doing wrong?

(base) anders@CenturionUbuntu:~/PycharmProjects/Spice/SPICE$ python tools/eval_self.py --config-file configs/cifar10/eval.py --weight model_zoo/model_cifar10.pth.tar --all 1
Traceback (most recent call last):
  File "tools/eval_self.py", line 12, in <module>
    import torch
  File "/home/anders/anaconda3/lib/python3.8/site-packages/torch/__init__.py", line 188, in <module>
    _load_global_deps()
  File "/home/anders/anaconda3/lib/python3.8/site-packages/torch/__init__.py", line 141, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/anders/anaconda3/lib/python3.8/ctypes/__init__.py", line 381, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/anders/anaconda3/lib/python3.8/site-packages/torch/lib/../../../../libcufft.so.10: invalid ELF header

opened by PhoQus 2
suggested bug fix + semi-evaluation question
Hi! Great work.

To allow training MoCo on the CIFAR datasets, I suggest you change line 184 in 'tools/train_moco.py'

from:

model = moco.builder.MoCo( base_model, args.moco_dim, args.moco_k, args.moco_m, args.moco_t, args.mlp)

to:

model = moco.builder.MoCo( base_model, args.moco_dim, args.moco_k, args.moco_m, args.moco_t, args.mlp, args.img_size)

Another question: do you perform your final evaluation (SPICE-Semi) on the resulting 'model_best.pth' or on 'latest_model.pth'? I haven't seen any reference to it. Thanks!
opened by DanielShalam 0
SPICE on single GPU

It seems you have prepared arguments for running the script on a single GPU (--multiprocessing-distributed and --gpu), but when I use them I run into an "Only DistributedDataParallel is supported." exception. When I run the script with only "--gpu 0" as an argument, I get an error originating in the spawn.py code.

I only have access to a computer with single Nvidia 3090 GPU and want to test your model as part of my Master's thesis. Does it mean that I would need to implement a version for 1 GPU myself, or am I missing something in your implementation?

opened by TomasPlachy 0
If I want to remove targets (labels), how to change your code?

Hello Sir,

I think that clustering should not need targets (labels), but I found targets in your code (cifar.py).

So I want to remove the code related to targets (labels). How should I change your code? Is it possible?

Thanks, Edward Cho

opened by edwardcho 0

Using CIFAR10, I got an error at step 4 of training (4. Determine reliable images).

Hello Sir,

As mentioned in the title, I tried to train your code.

When I trained by following your tutorial (https://github.com/niuchuangnn/SPICE/blob/main/train.md), I got an error.

4. Determine reliable images

python tools/local_consistency.py --config-file ./configs/cifar10/eval.py --embedding ./results/cifar10/embedding/feas_moco_512_l2.npy

This is the error message:

Traceback (most recent call last):
  File "tools/local_consistency.py", line 141, in <module>
    main()
  File "tools/local_consistency.py", line 131, in main
    acc = calculate_acc(labels_select, gt_labels_select)
  File "./spice/utils/evaluation.py", line 21, in calculate_acc
    assert len(np.unique(ypred)) == len(np.unique(y))
AssertionError

So I checked labels_select and gt_labels_select,

len(labels_select) : 10
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
len(gt_labels_select) : 10
[8 5 8 5 3 4 1 5 1 3]
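
The failing assertion can be reproduced directly from the values printed above. This is a minimal sketch that uses only NumPy and the two arrays from the post. It shows that the check compares the number of distinct predicted clusters with the number of distinct ground-truth classes among the selected samples.

```python
import numpy as np

# Values copied from the output above.
labels_select = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
gt_labels_select = np.array([8, 5, 8, 5, 3, 4, 1, 5, 1, 3])

n_pred = len(np.unique(labels_select))    # 10 distinct predicted clusters
n_gt = len(np.unique(gt_labels_select))   # 5 distinct ground-truth classes: 1, 3, 4, 5, 8

# Same condition as the assert in calculate_acc; False for these values.
print(n_pred, n_gt, n_pred == n_gt)
```

Only ten samples were selected here, one per predicted cluster, and they cover just five ground-truth classes, so the two counts cannot match.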

What should I do about this problem?

Thanks, Edward Cho.

opened by edwardcho 0

SPICE training Error

Hi, I have a question. After following the installation steps, I started training with this command from the training tutorial: python tools/train_moco.py --img_size 32 --moco-k 12800 --arch resnet18_cifar --save_folder ./results/cifar10/moco_res18_cls --resume ./results/cifar10/moco_res18_cls/checkpoint_last.pth.tar --data_type cifar10 --data ./datasets/cifar10 --all 0

Below is the error:

(u) C:\Users\Kelly>cd SPICE

(u) C:\Users\Kelly\SPICE>python tools/train_moco.py --img_size 32 --moco-k 12800 --arch resnet18_cifar --save_folder ./results/cifar10/moco_res18_cls --resume ./results/cifar10/moco_res18_cls/checkpoint_last.pth.tar --data_type cifar10 --data ./datasets/cifar10 --all 0
Use GPU: 0 for training
Traceback (most recent call last):
  File "tools/train_moco.py", line 453, in <module>
    main()
  File "tools/train_moco.py", line 145, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "C:\Users\Kelly.conda\envs\u\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "C:\Users\Kelly.conda\envs\u\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes
    while not context.join():
  File "C:\Users\Kelly.conda\envs\u\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "C:\Users\Kelly.conda\envs\u\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
    fn(i, *args)
  File "C:\Users\Kelly\SPICE\tools\train_moco.py", line 170, in main_worker
    dist.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
  File "C:\Users\Kelly.conda\envs\u\lib\site-packages\torch\distributed\distributed_c10d.py", line 602, in init_process_group
    default_pg = _new_process_group_helper(
  File "C:\Users\Kelly.conda\envs\u\lib\site-packages\torch\distributed\distributed_c10d.py", line 727, in _new_process_group_helper
    raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in

How can I solve this? Thanks, Kelly

opened by kkellyk 0

