Hey,
I found a difference between the output of the InstanceLoss implemented below and the NT-Xent loss taken from SimCLR (https://github.com/Spijkervet/SimCLR/blob/master/simclr/modules/nt_xent.py).
Although the two functions look very similar, their outputs seem to be different. Could you please look into it and share your insights?
import torch
import torch.nn as nn


class InstanceLoss(nn.Module):
    def __init__(self, batch_size, temperature, device):
        super(InstanceLoss, self).__init__()
        self.batch_size = batch_size
        self.temperature = temperature
        self.device = device
        self.mask = self.mask_correlated_samples(batch_size)
        self.criterion = nn.CrossEntropyLoss(reduction="sum")

    def mask_correlated_samples(self, batch_size):
        # Mask out the diagonal and the positive pairs, keeping only the negatives.
        N = 2 * batch_size
        mask = torch.ones((N, N))
        mask = mask.fill_diagonal_(0)
        for i in range(batch_size):
            mask[i, batch_size + i] = 0
            mask[batch_size + i, i] = 0
        mask = mask.bool()
        return mask

    def forward(self, z_i, z_j):
        N = 2 * self.batch_size
        z = torch.cat((z_i, z_j), dim=0)
        # Pairwise similarities as raw dot products, scaled by the temperature.
        sim = torch.matmul(z, z.T) / self.temperature
        sim_i_j = torch.diag(sim, self.batch_size)
        sim_j_i = torch.diag(sim, -self.batch_size)
        positive_samples = torch.cat((sim_i_j, sim_j_i), dim=0).reshape(N, 1)
        negative_samples = sim[self.mask].reshape(N, -1)
        # The positive logit sits in column 0, so every label is 0.
        labels = torch.zeros(N).to(positive_samples.device).long()
        logits = torch.cat((positive_samples, negative_samples), dim=1)
        loss = self.criterion(logits, labels)
        loss /= N
        return loss
class NT_Xent(nn.Module):
    """
    More than inspired by https://github.com/Spijkervet/SimCLR/blob/master/modules/nt_xent.py

    Notes
    =====
    With this PyTorch implementation you don't actually need to L2-normalize the inputs;
    the results will be identical, as shown when you run this file.
    """

    def __init__(self, batch_size, temperature, device):
        super(NT_Xent, self).__init__()
        self.batch_size = batch_size
        self.temperature = temperature
        self.mask = self.get_correlated_samples_mask()
        self.device = device
        self.criterion = nn.CrossEntropyLoss(reduction="sum")
        self.similarity_f = nn.CosineSimilarity(dim=2)

    def forward(self, z_i, z_j):
        """
        We do not sample negative examples explicitly.
        Instead, given a positive pair, similar to (Chen et al., 2017), we treat the other
        2(N - 1) augmented examples within a minibatch as negative examples.
        """
        p1 = torch.cat((z_i, z_j), dim=0)
        # Pairwise cosine similarities, scaled by the temperature.
        sim = self.similarity_f(p1.unsqueeze(1), p1.unsqueeze(0)) / self.temperature
        sim_i_j = torch.diag(sim, self.batch_size)
        sim_j_i = torch.diag(sim, -self.batch_size)
        positive_samples = torch.cat((sim_i_j, sim_j_i), dim=0).reshape(self.batch_size * 2, 1)
        negative_samples = sim[self.mask].reshape(self.batch_size * 2, -1)
        labels = torch.zeros(self.batch_size * 2).to(self.device).long()
        logits = torch.cat((positive_samples, negative_samples), dim=1)
        loss = self.criterion(logits, labels)
        loss /= 2 * self.batch_size
        return loss

    def get_correlated_samples_mask(self):
        mask = torch.ones((self.batch_size * 2, self.batch_size * 2), dtype=torch.bool)
        mask = mask.fill_diagonal_(0)
        for i in range(self.batch_size):
            mask[i, self.batch_size + i] = 0
            mask[self.batch_size + i, i] = 0
        return mask
a, b = torch.rand(8, 12), torch.rand(8, 12)
a_norm, b_norm = torch.nn.functional.normalize(a), torch.nn.functional.normalize(b)
cosine_sim = torch.nn.CosineSimilarity()
instance_loss = InstanceLoss(8, 0.5, "cpu")
ntxent_loss = NT_Xent(8, 0.5, "cpu")
print('Cosine')
print(cosine_sim(a, b))
print(cosine_sim(a_norm, b_norm))
print('NT Xent')
print(ntxent_loss(a, b))
print(ntxent_loss(a_norm, b_norm))
print('Instance')
print(instance_loss(a, b))
print(instance_loss(a_norm, b_norm))
Output:
Cosine
tensor([0.6606, 0.7330, 0.7845, 0.8602, 0.6992, 0.8224, 0.7167, 0.7500])
tensor([0.6606, 0.7330, 0.7845, 0.8602, 0.6992, 0.8224, 0.7167, 0.7500])
NT Xent
tensor(2.7081)
tensor(2.7081)
Instance
tensor(3.1286)
tensor(2.7081)
As you can see, InstanceLoss gives a different result for (a, b) than for (a_norm, b_norm), whereas the cosine similarity and NT_Xent outputs are unchanged by the normalization, and the two losses only agree once the inputs are L2-normalized.
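In case it helps narrow things down, here is a minimal diagnostic sketch (my own, not taken from either implementation) that isolates the one step where the two classes differ: InstanceLoss scores pairs with a raw dot product (torch.matmul(z, z.T)), while NT_Xent uses nn.CosineSimilarity. My assumption is that the two only coincide when the rows of z are unit-norm, which would match the outputs above, but I may be missing something.

import torch

torch.manual_seed(0)
z = torch.rand(16, 12)  # stand-in for torch.cat((z_i, z_j), dim=0)

# Similarity matrix the way InstanceLoss computes it: raw dot products
# (the division by temperature is dropped here since both classes use the same value).
dot_sim = torch.matmul(z, z.T)

# Similarity matrix the way NT_Xent computes it: cosine similarity.
cos = torch.nn.CosineSimilarity(dim=2)
cos_sim = cos(z.unsqueeze(1), z.unsqueeze(0))

print(torch.allclose(dot_sim, cos_sim, atol=1e-6))            # False: dot product != cosine on raw inputs

z_norm = torch.nn.functional.normalize(z, dim=1)
dot_sim_norm = torch.matmul(z_norm, z_norm.T)
cos_sim_norm = cos(z_norm.unsqueeze(1), z_norm.unsqueeze(0))

print(torch.allclose(dot_sim_norm, cos_sim_norm, atol=1e-6))  # True once the rows are unit-norm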
Colab notebook:
https://github.com/Spijkervet/SimCLR/blob/master/simclr/modules/nt_xent.py
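If the intent is for InstanceLoss to behave like NT_Xent regardless of the scale of the inputs, one possible fix (my suggestion, not something from either repo) would be to L2-normalize the embeddings before the dot product, e.g. via a small wrapper around the class defined above:

import torch
import torch.nn.functional as F

class NormalizedInstanceLoss(InstanceLoss):
    """Hypothetical variant: L2-normalize the embeddings before the dot product."""
    def forward(self, z_i, z_j):
        return super().forward(F.normalize(z_i, dim=1), F.normalize(z_j, dim=1))

# With this change the two losses should agree even on raw (unnormalized) inputs:
torch.manual_seed(0)
a, b = torch.rand(8, 12), torch.rand(8, 12)
print(NormalizedInstanceLoss(8, 0.5, "cpu")(a, b))  # expected to match the line below
print(NT_Xent(8, 0.5, "cpu")(a, b))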