Official implementation of CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21

Sunghwan Hong

Last update: Jan 4, 2023

Related tags

Deep Learning computer-vision deep-learning pytorch neurips semantic-correspondence neurips-2021

Overview

CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21

For more information, check out the paper on [arXiv].

Training code for different backbones, along with their evaluations, will be added soon.

Check out our new paper! [arXiv]

Network

Our model CATs is illustrated below:

Environment Settings

git clone https://github.com/SunghwanHong/CATs
cd CATs

conda create -n CATs python=3.6
conda activate CATs

pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -U scikit-image
pip install git+https://github.com/albumentations-team/albumentations
pip install tensorboardX termcolor timm tqdm requests pandas

Evaluation

Download the pre-trained weights from Link.
All datasets are automatically downloaded into the directory specified by the datapath argument.

Result on SPair-71k: (PCK 49.9%)

  python test.py --pretrained "/path_to_pretrained_model/spair" --benchmark spair

Result on SPair-71k, feature backbone frozen: (PCK 42.4%)

  python test.py --pretrained "/path_to_pretrained_model/spair_frozen" --benchmark spair

Results on PF-PASCAL: (PCK 75.4%, 92.6%, 96.4%)

  python test.py --pretrained "/path_to_pretrained_model/pfpascal" --benchmark pfpascal

Results on PF-PASCAL, feature backbone frozen: (PCK 67.5%, 89.1%, 94.9%)

  python test.py --pretrained "/path_to_pretrained_model/pfpascal_frozen" --benchmark pfpascal

Acknowledgement

We borrow code from public projects (huge thanks to all the projects). We mainly borrow code from DHPF and GLU-Net.

BibTeX

If you find this research useful, please consider citing:

@inproceedings{cho2021cats,
  title={CATs: Cost Aggregation Transformers for Visual Correspondence},
  author={Cho, Seokju and Hong, Sunghwan and Jeon, Sangryul and Lee, Yunsung and Sohn, Kwanghoon and Kim, Seungryong},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}

Comments

transfer the target image's points to source image?

In your training and testing code, the target image's points are transferred to the source image, which differs from the setting in DHPF and CHM. Is there anything wrong?

opened by willer94 4
Visualization results

Hi, thanks for your great work. I would like to see the visualization results. How can I draw the correspondences between two images, as in Figure 5 of your paper? Do you have code for that? Thanks!
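For readers with the same question, here is a minimal sketch of one common way to draw matches: place the two images side by side and connect each source keypoint to its matched target keypoint. It is not the authors' visualization code. The function names `side_by_side_segments` and `draw_matches` are hypothetical, and the sketch assumes images are H x W x 3 NumPy arrays and keypoints are (x, y) pixel coordinates.

```python
def side_by_side_segments(src_kps, trg_kps, src_width):
    """Return one line segment per match for a canvas where the target
    image sits directly to the right of the source image."""
    return [
        ((sx, sy), (tx + src_width, ty))  # shift target x by the source width
        for (sx, sy), (tx, ty) in zip(src_kps, trg_kps)
    ]


def draw_matches(src_img, trg_img, src_kps, trg_kps, out_path):
    """Save a side-by-side figure with one line per keypoint match."""
    # Imported lazily so the geometry helper above has no dependencies.
    import numpy as np
    import matplotlib
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    src_h, src_w = src_img.shape[:2]
    trg_h, trg_w = trg_img.shape[:2]
    canvas = np.zeros((max(src_h, trg_h), src_w + trg_w, 3), dtype=src_img.dtype)
    canvas[:src_h, :src_w] = src_img
    canvas[:trg_h, src_w:] = trg_img

    fig, ax = plt.subplots()
    ax.imshow(canvas)
    for (x0, y0), (x1, y1) in side_by_side_segments(src_kps, trg_kps, src_w):
        ax.plot([x0, x1], [y0, y1], marker="o", linewidth=1)
    ax.axis("off")
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
```

In practice `trg_kps` would be the ground-truth target keypoints and `src_kps` the positions predicted by the model (or the reverse, depending on the transfer direction used).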

opened by Acero522 3
Potential bug?

Hi Sunghwan Hong, Thanks for sharing the code. I have one question:

Will this be a potential bug at line https://github.com/SunghwanHong/Cost-Aggregation-transformers/blob/main/utils_training/evaluation.py#L32?

Should it be 'trg_kps' instead of src_kps?

Thanks

opened by dutran 2
Data download error.

Hi, thanks for the great work!

When I run train.py, the data is not downloaded. I get the error tarfile.ReadError: not a gzip file.
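This error usually means the file on disk is not the archive that was expected; one possible cause (an assumption, not confirmed for this repository) is that the download returned an HTML page instead of the dataset. A minimal sketch for checking the file before extraction, using only the standard library, is shown below. The helper names `looks_like_gzip` and `safe_extract` are hypothetical and not part of the repository.

```python
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def looks_like_gzip(path):
    """Return True if the file begins with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def safe_extract(archive_path, dest_dir):
    """Extract a .tar.gz archive, failing early with a readable message
    when the file is not gzip data (for example, a saved HTML page)."""
    if not looks_like_gzip(archive_path):
        with open(archive_path, "rb") as f:
            head = f.read(80)
        raise ValueError(
            f"{archive_path} is not a gzip file; first bytes: {head!r}"
        )
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
```

If the check fails, deleting the partial file and downloading the dataset again, manually if needed, is the usual fix.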

Can you give dataset links that we can download manually?

Thank you!

opened by SirojbekSafarov 2
some questions.

Hi, I am trying to understand your code. I have to admit that your code is very clean and well arranged. I have two questions after reading it.

May I ask why you picked hyperpixel_ids=[0,8,20,21,26,28,29,30]?

Have you tried backbones other than resnet101? Also, do you have any advice on how I could improve on your code? I tried resnet152 as the backbone, but it does not perform better than resnet101. I guess this is because I used suboptimal hyperpixel_ids.
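To illustrate why the same ids may not transfer between backbones, the sketch below maps a hyperpixel id to a stage and block index. It assumes the indexing convention used in hyperpixel-style code (id 0 is the stem output, and ids 1, 2, ... count bottleneck blocks in order); this convention is an assumption here, not something stated in this README. The per-stage block counts are those of torchvision's ResNet variants, and `locate_hyperpixel` is a hypothetical helper.

```python
# Bottleneck blocks per stage (layer1..layer4) in torchvision's ResNets.
BLOCKS_PER_STAGE = {
    "resnet101": [3, 4, 23, 3],
    "resnet152": [3, 8, 36, 3],
}


def locate_hyperpixel(layer_id, backbone):
    """Map a hyperpixel id to (stage name, block index within the stage).

    Assumes id 0 is the stem output and ids >= 1 enumerate bottleneck
    blocks in forward order.
    """
    if layer_id == 0:
        return ("stem", None)
    remaining = layer_id - 1
    for stage, n_blocks in enumerate(BLOCKS_PER_STAGE[backbone], start=1):
        if remaining < n_blocks:
            return (f"layer{stage}", remaining)
        remaining -= n_blocks
    raise ValueError(f"layer id {layer_id} is out of range for {backbone}")


ids = [0, 8, 20, 21, 26, 28, 29, 30]
for name in ("resnet101", "resnet152"):
    print(name, [locate_hyperpixel(i, name) for i in ids])
```

Under this assumption, id 8 is the first block of layer3 and id 30 is its last block in resnet101, whereas in resnet152 id 8 falls inside layer2 and id 30 lands in the middle of layer3. The ids would therefore likely need to be re-selected for a deeper backbone.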

Thank you very much.
opened by 5100117 2
PCK evaluation on PF-Willow

Hi, thanks for the great work!

I think you have an error in the PCK-threshold of PF-Willow. It is computed as the difference between the maximum and minimum kp. However, since you always pad the kp with -1, the minimum is always -1, which is not the actual coordinate of the minimum keypoint. Therefore, the pckthreshold is artificially large and therefore the results artificially high. I know this error is present in multiple works, for example also in DHPF. However, the comparison is not fair to methods that actually use the correct metric, like NC-Net.
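The effect described above can be reproduced with a small, self-contained sketch. It is not the repository's evaluation code: it assumes the threshold is the larger side of the keypoint bounding box and that keypoint lists are padded with (-1, -1). The names `bbox_threshold` and `pck` are hypothetical.

```python
def bbox_threshold(kps, pad_value=-1, mask_padding=True):
    """Larger side of the keypoint bounding box.

    kps is a list of (x, y) pairs, possibly padded with
    (pad_value, pad_value). With mask_padding=False the padding
    entries are wrongly included in the min/max.
    """
    if mask_padding:
        kps = [(x, y) for x, y in kps if (x, y) != (pad_value, pad_value)]
    xs = [x for x, _ in kps]
    ys = [y for _, y in kps]
    return max(max(xs) - min(xs), max(ys) - min(ys))


def pck(pred, gt, threshold, alpha=0.1):
    """Fraction of predictions within alpha * threshold of the ground truth."""
    correct = 0
    for (px, py), (gx, gy) in zip(pred, gt):
        dist = ((px - gx) ** 2 + (py - gy) ** 2) ** 0.5
        if dist <= alpha * threshold:
            correct += 1
    return correct / len(gt)


gt = [(100, 120), (180, 200), (150, 160)]
padded_gt = gt + [(-1, -1)] * 5            # padded to a fixed length
pred = [(x + 10, y) for x, y in gt]        # every prediction is 10 px off

correct_thr = bbox_threshold(padded_gt, mask_padding=True)    # 80
inflated_thr = bbox_threshold(padded_gt, mask_padding=False)  # 201

print(pck(pred, gt, correct_thr))   # 0.0: 10 px > 0.1 * 80
print(pck(pred, gt, inflated_thr))  # 1.0: 10 px <= 0.1 * 201
```

In this toy case the same predictions score 0% with the masked threshold and 100% with the padded one, which is the kind of inflation the comment describes.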

opened by PruneTruong 1
Visualization again

Hi,

As mentioned in issue 7, I would like to see the semantic correspondence matching between any of the image pairs in the test dataset.

Will you update the GitHub repository with that script?

opened by sanjanagovind 0

Owner

Sunghwan Hong

M.S./Ph.D Integrated (2021.03 - present)

GitHub

Official implementation of NeurIPS 2021 paper "Contextual Similarity Aggregation with Self-attention for Visual Re-ranking"

CSA: Contextual Similarity Aggregation with Self-attention for Visual Re-ranking PyTorch training code for CSA (Contextual Similarity Aggregation). We

19 Oct 21, 2022

ML model to classify between cats and dogs

Cats-and-dogs-classifier This is my first ML model which can classify between cats and dogs. Here the accuracy is around 75%, however , the accuracy c

4 Aug 20, 2021

Just Randoms Cats with python

Random-Cat Just Randoms Cats with python.

2 Dec 21, 2021

An official implementation of "SFNet: Learning Object-aware Semantic Correspondence" (CVPR 2019, TPAMI 2020) in PyTorch.

PyTorch implementation of SFNet This is the implementation of the paper "SFNet: Learning Object-aware Semantic Correspondence". For more information,

87 Dec 30, 2022

Official Pytorch implementation of 'GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network' (NeurIPS 2020)

Official implementation of GOCor This is the official implementation of our paper : GOCor: Bringing Globally Optimized Correspondence Volumes into You

71 Nov 18, 2022

Official repository for Few-shot Image Generation via Cross-domain Correspondence (CVPR '21)

Few-shot Image Generation via Cross-domain Correspondence Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zh

251 Dec 11, 2022

Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

117 Dec 11, 2022

This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

ERASOR (RA-L'21 with ICRA Option) Official page of "ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point C

225 Dec 29, 2022

Official repository for the CVPR 2021 paper "Learning Feature Aggregation for Deep 3D Morphable Models"

Deep3DMM Official repository for the CVPR 2021 paper Learning Feature Aggregation for Deep 3D Morphable Models. Requirements This code is tested on Py

38 Dec 27, 2022

A PyTorch implementation of "DGC-Net: Dense Geometric Correspondence Network"

DGC-Net: Dense Geometric Correspondence Network This is a PyTorch implementation of our work "DGC-Net: Dense Geometric Correspondence Network" TL;DR A

191 Dec 16, 2022

PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence) and pre-trained model on ImageNet dataset

Reference-Based-Sketch-Image-Colorization-ImageNet This is a PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization usin

11 Jul 28, 2022

The PyTorch implementation of DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision The PyTorch implementation of DiscoBox: Weakly Supe

1 Oct 23, 2021

A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network Requirements pytorch 1.1+ torchvision 0.3+ pyclipper opencv3 gcc

400 Dec 26, 2022


CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching（CVPR2021）

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks

An open-source, low-cost, image-based weed detection device for fallow scenarios.

Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting

Implement some metaheuristics and cost functions

Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks
