The implementation of the CVPR2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes"

Overview

STAR-FC

This code is the implementation for the CVPR 2021 paper "Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes" 🌟 🌟 .

🎓 Requirements

  • Python = 3.6
  • PyTorch = 1.2.0
  • faiss

🧚 Hardware

The hardware we used in this work is as follows:

(Hardware specification image in the original README.)

🍰 Datasets

cd STAR-FC

Create a new folder for training data:

mkdir data

To run the code, please download the refined MS1M dataset and partition it into 10 splits, then construct the data directory as follows:

|——data
   |——features
      |——part0_train.bin
      |——part1_test.bin
      |——...
      |——part9_test.bin
   |——labels
      |——part0_train.meta
      |——part1_test.meta
      |——...
      |——part9_test.meta
   |——knns
      |——part0_train/faiss_k_80.npz
      |——part1_test/faiss_k_80.npz
      |——...
      |——part9_test/faiss_k_80.npz
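
A minimal loading sketch, assuming the .bin/.meta files follow the learn-to-cluster convention (a raw float32 feature matrix and one integer label per line; 256 is the feature dimension of the released MS1M features and may differ for other data):

import numpy as np

def load_feats(path, feat_dim=256):
    # raw float32 binary reshaped to (N, feat_dim); feat_dim is an assumption
    feats = np.fromfile(path, dtype=np.float32)
    assert feats.size % feat_dim == 0
    return feats.reshape(-1, feat_dim)

def load_labels(path):
    # one integer identity label per line
    with open(path) as f:
        return np.array([int(line.strip()) for line in f], dtype=np.int64)

feats = load_feats('data/features/part1_test.bin')
labels = load_labels('data/labels/part1_test.meta')
assert feats.shape[0] == labels.shape[0]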

We have used the data from: https://github.com/yl-1993/learn-to-cluster
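
If you need to regenerate the knns/*/faiss_k_80.npz files yourself, the sketch below shows an 80-NN search with faiss under the same convention; the exact npz key and array layout this repo expects should be verified against utils/knn.py before relying on it:

import os
import faiss
import numpy as np

def build_knn_file(feats, k=80, out_path='data/knns/part1_test/faiss_k_80.npz'):
    # L2-normalize so inner product equals cosine similarity
    feats = feats.astype(np.float32)
    faiss.normalize_L2(feats)
    index = faiss.IndexFlatIP(feats.shape[1])
    index.add(feats)
    sims, nbrs = index.search(feats, k)  # both (N, k)
    # neighbors and (1 - cosine) distances stacked to shape (N, 2, k);
    # the 'data' key mirrors the learn-to-cluster files but is an assumption
    knns = np.stack([nbrs.astype(np.float32), 1.0 - sims], axis=1)
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    np.savez_compressed(out_path, data=knns)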

🍬 Model

Put the pretrained models Backbone.pth and Head.pth in ./pretrained_model/. Our trained models will be released soon.

☘️ Training

Adjust the configuration in ./src/configs/cfg_gcn_ms1m.py, then run the algorithm as follows:

cd STAR-FC
sh scripts/train_gcn_ms1m.sh

🌵 Testing

Adjust the configuration in ./src/configs/cfg_gcn_ms1m.py, then run the algorithm as follows:

cd STAR-FC
python test_final.py

Acknowledgement

This code is based on the publicly available face clustering codebase https://github.com/yl-1993/learn-to-cluster.

Citation

Please cite the following paper if you use this repository in your research.

@inproceedings{shen2021starfc,
   author={Shen, Shuai and Li, Wanhua and Zhu, Zheng and Huang, Guan and Du, Dalong and Lu, Jiwen and Zhou, Jie},
   title={Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes},
   booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
   year={2021}
}
Comments
  • There is a problem when running with test_final.py

    Traceback (most recent call last):
      File "test_final.py", line 167, in
        gt_labels = np.load('./pretrained_model/gt_labels.npy')
      File "/mnt/lustre/caoguoliang1/anaconda3/envs/cluster/lib/python3.6/site-packages/numpy/lib/npyio.py", line 416, in load
        fid = stack.enter_context(open(os_fspath(file), "rb"))
    FileNotFoundError: [Errno 2] No such file or directory: './pretrained_model/gt_labels.npy'

    What is gt_labels.npy?

    opened by GlennCGL 2
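
    gt_labels.npy is not shipped with the repo. Assuming it simply stores the ground-truth identity labels of the test split in feature order, a minimal sketch to produce it from the corresponding .meta file (the split name is illustrative):

    import numpy as np

    # assumption: one integer label per line, same order as the .bin features
    with open('data/labels/part1_test.meta') as f:
        gt_labels = np.array([int(line.strip()) for line in f], dtype=np.int64)

    np.save('./pretrained_model/gt_labels.npy', gt_labels)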
  • Why is the bcubed fscore so poor?

    I tested ms1m part1_test.bin with the model you provided. The result:

    [Time] evaluate with pairwise consumes 0.0512 s
    ave_pre: 0.9559, ave_rec: 0.8870, fscore: 0.9202
    [Time] evaluate with bcubed consumes 20.8486 s
    ave_pre: 0.0858, ave_rec: 1.0000, fscore: 0.1581
    [Time] evaluate with nmi consumes 0.1714 s
    nmi: 0.9738

    opened by xiangliu886 1
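
    For reference, B-cubed averages per-sample precision and recall, so a single huge (or heavily fragmented) predicted cluster can drag the score down even when the pairwise F-score looks fine. An unoptimized sketch of the metric itself, independent of this repo's evaluation code:

    from collections import defaultdict

    def bcubed(gt_labels, pred_labels):
        pred_clusters, gt_clusters = defaultdict(list), defaultdict(list)
        for i, (g, p) in enumerate(zip(gt_labels, pred_labels)):
            pred_clusters[p].append(i)
            gt_clusters[g].append(i)
        precision = recall = 0.0
        for g, p in zip(gt_labels, pred_labels):
            # samples sharing both the predicted cluster and the true identity
            same = sum(gt_labels[j] == g for j in pred_clusters[p])
            precision += same / len(pred_clusters[p])
            recall += same / len(gt_clusters[g])
        n = len(gt_labels)
        pre, rec = precision / n, recall / n
        return pre, rec, 2 * pre * rec / (pre + rec)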
  • The results are not robust on new datasets

    Hi, I have trained and tested STAR-FC with the public person datasets MSMT and Market, but got very poor results. Could you tell me why?


    2022-03-18 18:18:19.397 real clusters: 3060, predict clusters: 7993

    2022-03-18 18:18:19.397 [Time] evaluate with pairwise consumes 0.0095 s

    2022-03-18 18:18:19.397 ave_pre: 0.0032, ave_rec: 0.8302, fscore: 0.0064

    2022-03-18 18:18:19.397 [Time] evaluate with bcubed consumes 4.2473 s

    2022-03-18 18:18:19.397 ave_pre: 0.4681, ave_rec: 0.8321, fscore: 0.5991

    2022-03-18 18:18:19.397 [Time] evaluate with nmi consumes 0.0290 s

    2022-03-18 18:18:19.397 nmi: 0.7601

    2022-03-18 18:18:19.397 avg_acc: 0.8054215519079088

    2022-03-18 18:18:19.397 pairwise: 0.006409889997558421

    2022-03-18 18:18:19.397 bcubed: 0.5991425302360325

    2022-03-18 18:18:19.397 nmi: 0.7600691590384783

    opened by shuxjweb 1
  • About WebFace42M's feature file

    Very good job! I would like to know whether the authors can release WebFace42M's feature file and its train/test splits, so that more researchers can follow and cite this work.

    opened by slacklife 1
  • Too many samples

    Hi, I followed your example, didn't change the parameters, just replaced the dataset, and then I was reminded that the number of samples is too large. How can I fix it?

    opened by 1017549629 1
  • About perform_val in training

    Hi! I am checking train_gcn.py and noticed something weird. In line 32, def perform_val(), test_inst_num is defined as the length of test_idx2lb, which is the total number of samples in the test set. However, pair_a and pair_b are defined as lists of k duplicates of each sample and their k nearest neighbors. When looping over the patches, the patch size is defined as patch_size = int(test_inst_num / patch_num). It seems the loop only covers the sample size in total, instead of k times the sample size, which is the total number of pairs. Does this mean only the initial portions of pair_a and pair_b are evaluated? The average_acc is also obtained by dividing sum_acc by test_inst_num, which covers only a portion of pair_a and pair_b.

    Another question: why test the pair-wise accuracy only on the kNN pairs instead of all pairs? The kNN pairs are selected because they are close to each other; won't this cause bias, since they are very likely to be from the same cluster?

    opened by RealNewNoob 1
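
    To make the bookkeeping concrete, a sketch of evaluating all N*k kNN pairs in patches (raw cosine similarity stands in here for the GCN edge score used in the actual perform_val; names follow the issue's description, not the repo's exact code):

    import numpy as np

    def knn_pair_accuracy(feats, labels, knn, patch_num=100, sim_thresh=0.5):
        # feats: (N, D) L2-normalized, labels: (N,), knn: (N, k) neighbor indices
        n, k = knn.shape
        pair_a = np.repeat(np.arange(n), k)      # each sample repeated k times
        pair_b = knn.reshape(-1)                 # its k nearest neighbors
        total_pairs = pair_a.shape[0]            # N * k, not N
        patch_size = int(np.ceil(total_pairs / patch_num))
        correct = 0
        for start in range(0, total_pairs, patch_size):
            a = pair_a[start:start + patch_size]
            b = pair_b[start:start + patch_size]
            sim = np.sum(feats[a] * feats[b], axis=1)   # cosine similarity
            correct += np.sum((sim > sim_thresh) == (labels[a] == labels[b]))
        return correct / total_pairs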
  • About sampling strategy and clustering setting

    Congratulations on your publication! I am reading your code and paper; however, I have a question about the sampling policy. In your paper, you mention M = 2 and N = 750, so two seeds and their nearest 750 clusters are selected before CR, which makes a total of 1500. However, in train_gcn.py line 146, for batch in range(cls_num): it seems all the clusters are looped over, and for each of them a total of 1300 + 200 = 1500 clusters are sampled before CR. In every training step, the features from these clusters are used to construct the affinity graph after SR. Did I miss something?

    opened by RealNewNoob 1
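
    As a point of reference, a rough paraphrase of the paper-level sampling (M seed clusters plus their N nearest clusters, then random node retention) with the numbers quoted in this thread; this sketches Algorithm 1 as described, not train_gcn.py:

    import numpy as np

    def sample_subgraph(cluster_centers, cluster_to_nodes, rng,
                        M=2, N=750, keep_ratio=0.9):
        # cluster_centers: (C, D) mean feature per ground-truth cluster
        # cluster_to_nodes: dict cluster_id -> list of node indices
        C = cluster_centers.shape[0]
        seeds = rng.choice(C, size=M, replace=False)
        selected = set(seeds.tolist())
        for s in seeds:
            # the N clusters closest to each seed (seed itself excluded)
            d = np.linalg.norm(cluster_centers - cluster_centers[s], axis=1)
            selected.update(np.argsort(d)[1:N + 1].tolist())
        # node-level random sampling: keep roughly keep_ratio of each cluster
        nodes = []
        for c in selected:
            members = np.asarray(cluster_to_nodes[c])
            nodes.extend(members[rng.random(len(members)) < keep_ratio].tolist())
        return np.asarray(nodes)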
  • Inference time comparison between GCN-V and STAR-FC

    In Table 4, the inference time of STAR-FC is 310 s, which is faster than the 609 s of GCN-V+E. But did you compare the inference time between STAR-FC and GCN-V (without the GCN-E part)? Which one is faster? In my experiments, the accuracy of GCN-V is high enough, and the inference time of GCN-E takes up most of the inference time of GCN-V+E, so in most scenarios GCN-E is not needed.

    opened by marigoold 1
  • What is `cfg.cluster_num`?

    Hi, congratulations on your work! I have some questions, maybe you could help? Thanks!

    1. What is cfg.cluster_num?
    2. Is it the N in the paper's Algorithm 1?
    3. Why is it increased by 200 in train_gcn.py line 143?
    opened by RHxW 1
  • When knn_method = "faiss_gpu", the code runs with an error

    Because the code builds the kNN graph with faiss in CPU mode, we changed the config from knn_method = "faiss" to knn_method = "faiss_gpu", and then it runs with an error.

    opened by hx121071 0
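
    For illustration, a plain faiss GPU k-NN search (outside this codebase) usually looks like the sketch below; whether utils/knn.py can be switched to knn_method = "faiss_gpu" without further changes has to be checked against the actual class definitions there:

    import faiss
    import numpy as np

    def knn_search_gpu(feats, k=80, gpu_id=0):
        feats = np.ascontiguousarray(feats, dtype=np.float32)
        faiss.normalize_L2(feats)                 # cosine via inner product
        cpu_index = faiss.IndexFlatIP(feats.shape[1])
        res = faiss.StandardGpuResources()
        gpu_index = faiss.index_cpu_to_gpu(res, gpu_id, cpu_index)
        gpu_index.add(feats)
        sims, nbrs = gpu_index.search(feats, k)
        return nbrs, 1.0 - sims                   # neighbor ids, (1 - cosine)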
  • NameError: name 'knn_dynamic' is not defined

    Hi, thanks for the repo.

    When I run python test_final.py:

    Traceback (most recent call last):
      File "STAR-FC/test_final.py", line 4, in
        from evaluation.evaluate import evaluate
      File "STAR-FC/evaluation/__init__.py", line 5, in
        from .evaluate import evaluate
      File "STAR-FC/evaluation/evaluate.py", line 9, in
        from utils import Timer, TextColors
      File "STAR-FC/utils/__init__.py", line 5, in
        from .knn import *
      File "STAR-FC/utils/knn.py", line 409, in
        class knn_faiss_dynamic(knn_dynamic):
    NameError: name 'knn_dynamic' is not defined

    opened by AliRezaSafaei9494 0
  • About the experimental results

    Hello, can the experimental results in the paper be reproduced with the hyperparameter settings given in the paper? I used the settings you mention (taking 1300 of the 1500 nearest clusters, with 90% node sampling, use_Sim = True, threshold set to 0, knn set to 80). By batch 3899 of the first epoch the training loss drops to about 0.02, but the pairwise ave_pre on the test set is very low. I don't understand why. Thanks!

    opened by joewybean 1