Code release for "COTR: Correspondence Transformer for Matching Across Images"

COTR: Correspondence Transformer for Matching Across Images

This repository contains the inference code for COTR. We plan to release the training code in the future. COTR establishes correspondences in a functional, end-to-end fashion. It solves both dense and sparse correspondence problems within the same framework.

Demos

Check out our demo video here.

1. Install environment

Our implementation is based on PyTorch. Install the conda environment by: conda env create -f environment.yml.

Activate the environment by: conda activate cotr_env.

Note that we use scipy=1.2.1.

2. Download the pretrained weights

Download the pretrained weights here. Extract them into ./out, such that the weights file is at ./out/default/checkpoint.pth.tar.

3. Single image pair demo

python demo_single_pair.py --load_weights="default"

Example sparse output:

Example dense output with triangulation:

Note: This example uses 10K valid sparse correspondences to densify.
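To illustrate what densifying by triangulation can look like, here is a minimal sketch. It is not taken from this repository: it assumes sparse correspondences stored as rows of (x, y, x', y') and uses SciPy's LinearNDInterpolator, which builds a Delaunay triangulation of the query points internally and interpolates linearly inside each triangle. The function name densify is hypothetical.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator


def densify(sparse_corr, height, width):
    """Interpolate sparse matches to a dense (height, width, 2) map.

    sparse_corr: array of shape (N, 4), rows are (x, y, x', y') (assumed layout).
    Pixels outside the convex hull of the (x, y) points come back as NaN.
    """
    src = sparse_corr[:, :2]
    dst = sparse_corr[:, 2:]
    # Delaunay triangulation over src, linear interpolation of dst per triangle.
    interp = LinearNDInterpolator(src, dst)
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    return interp(xs, ys)


# Synthetic check: a pure translation by (+5, -3) should be reproduced.
rng = np.random.default_rng(0)
h, w = 20, 30
corners = np.array([[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]], dtype=float)
src = np.vstack([corners, rng.uniform([0, 0], [w - 1, h - 1], size=(200, 2))])
sparse = np.hstack([src, src + np.array([5.0, -3.0])])
dense = densify(sparse, h, w)
```

With more sparse correspondences the triangles become smaller, so the piecewise-linear approximation follows the true mapping more closely.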

4. Facial landmarks demo

python demo_face.py --load_weights="default"

Example:

5. Homography demo

python demo_homography.py --load_weights="default"

Citation

If you use this code in your research, cite the paper:

@article{jiang2021cotr,
  title={{COTR: Correspondence Transformer for Matching Across Images}},
  author={Wei Jiang and Eduard Trulls and Jan Hosang and Andrea Tagliasacchi and Kwang Moo Yi},
  booktitle={arXiv preprint},
  publisher_page={https://arxiv.org/abs/2103.14167},
  year={2021}
}
Comments
  • Matching time

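    Hello, and thank you for the excellent work. I have a question: how should I interpret querying one point at a time while achieving 35 correspondences per second? "Our currently non-optimized prototype implementation queries one point at a time, and achieves 35 correspondences per second on a NVIDIA RTX 3090 GPU." I have been running your code recently. On an NVIDIA RTX 3090 GPU, matching with demo_single_pair.py took about 30 s. Is this normal? Thank you!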

    opened by zhirui-gao 19
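As a rough sanity check on the figures quoted in this question, the arithmetic can be sketched as follows. It assumes the quoted rate of 35 correspondences per second holds throughout and ignores model loading, image preprocessing, and any extra passes the demo may perform, so it is only an estimate. The helper names are hypothetical.

```python
def estimated_seconds(num_correspondences, rate_per_second=35.0):
    """Time needed to produce a given number of correspondences at a fixed rate."""
    return num_correspondences / rate_per_second


def correspondences_in(seconds, rate_per_second=35.0):
    """Number of correspondences produced in a given time at a fixed rate."""
    return seconds * rate_per_second


for n in (100, 1000, 10000):
    print(n, "correspondences:", round(estimated_seconds(n), 1), "s")

print("30 s at 35/s:", correspondences_in(30), "correspondences")
```

Under these assumptions, a 30 s run corresponds to roughly a thousand correspondences.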
  • find the coordinates of the corresponding point (x', y') on another picture.

    Thank you for the outstanding work you do. I would like to ask if it is possible to enter the coordinates of a point (x, y) and find the coordinates of the corresponding point (x', y') on another picture.

    opened by lllllialois 9
  • patch partition?

    Thank you for such an excellent job. I have some questions about COTR. During training, do you divide the scene images into 256*256 patches according to certain rules after scaling, and then input them into the network? (I'm not sure where this step is implemented in the program.) How is corrs partitioned? Can the corresponding point end up in a different patch? How should this be handled? Is the validation process similar to the training process after the split iteration?

    opened by zbc-l 5
  • How is the warped image in Figure 9 generated?

    Hi, thanks for the great work! I'm curious how you generate the warped image in Figure 9 from dense flow. If I understand correctly, you input a pixel coordinate (x, y) in img1 and get its corresponding coordinate (x', y') in img2. Then you copy the RGB value at (x, y) to (x', y') in img2, and repeat this for all coordinates in img1. Am I correct? Or is there a more efficient way of doing so (like the one you mentioned in #28)?

    opened by Wuziyi616 4
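The copy-each-pixel procedure described in this question can be sketched in a vectorized form. This is only an illustration of that description, not the method used to produce Figure 9. It assumes a dense map of shape (H, W, 2) that stores (x', y') for every pixel of img1, and the function name forward_warp is hypothetical.

```python
import numpy as np


def forward_warp(img1, dense_corr, out_hw=None):
    """Copy each img1 pixel at (x, y) to its predicted location (x', y').

    img1: (H, W, C) image; dense_corr: (H, W, 2) holding (x', y') per pixel.
    Targets are rounded to the nearest pixel. Unfilled pixels stay zero, and
    when several sources hit the same target, one of them wins.
    """
    h_out, w_out = out_hw if out_hw is not None else img1.shape[:2]
    warped = np.zeros((h_out, w_out) + img1.shape[2:], dtype=img1.dtype)
    # Ignore NaN/inf predictions before casting to integers.
    finite = np.isfinite(dense_corr).all(axis=-1)
    xy = np.rint(np.where(finite[..., None], dense_corr, -1)).astype(int)
    x2, y2 = xy[..., 0], xy[..., 1]
    valid = finite & (x2 >= 0) & (x2 < w_out) & (y2 >= 0) & (y2 < h_out)
    warped[y2[valid], x2[valid]] = img1[valid]
    return warped


# Toy example: every pixel moves one column to the right.
img = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
xs, ys = np.meshgrid(np.arange(5), np.arange(4))
corr = np.stack([xs + 1, ys], axis=-1).astype(float)
out = forward_warp(img, corr)
```

Forward copying of this kind leaves holes wherever no source pixel lands, which is one reason backward warping (sampling img1 at the location matched to each target pixel) is often preferred.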
  • Question

    What does the dense correspondence map in Figure 1 mean, how can I obtain it, and how is it represented numerically? I only know that it is the dense correspondence between the two images. What does the color-coded 'x' channel mean?

    opened by j1o2h3n 4
  • TypeError: 'NoneType' object is not callable

    Thank you very much for your open source code! When I run python demo_single_pair.py --load_weights="default", this error appears. Could you give me some debugging advice?

    opened by USTC-wlsong 4
  • Possible redundancy in the code

    Hi, I notice that when constructing the Transformer, you always return the intermediate features at this line. However, after feeding them to MLP for corr regression, you only take the prediction over the last layer at this line. So I guess maybe you can set return_intermediate=False to save some memory/computation?

    opened by Wuziyi616 3
  • Dense optical flow as in paper Figure 1 (c)

    Hi, thanks for the great work! I wonder how I can estimate the optical flow between two images. Say img1 is of shape [H, W]; can I simply reshape the grid coordinates to [H*W, 2] and pass them as queries_a, as in this demo?

    opened by Wuziyi616 3
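The grid construction described in this question can be sketched as below. It covers only the coordinate bookkeeping. Whether queries_a expects pixel or normalized coordinates is not stated here, so pixel coordinates in (x, y) order are an assumption, and both helper names are hypothetical.

```python
import numpy as np


def make_grid_queries(height, width):
    """Return all pixel coordinates as an (H*W, 2) array of (x, y), row-major."""
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    return np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float32)


def to_flow(queries, matches, height, width):
    """Turn matched coordinates back into an (H, W, 2) displacement field."""
    return (matches - queries).reshape(height, width, 2)


queries = make_grid_queries(3, 4)
# Stand-in for the matcher output: every point shifted by (+2, -1).
fake_matches = queries + np.array([2.0, -1.0], dtype=np.float32)
flow = to_flow(queries, fake_matches, 3, 4)
```

Because the queries are laid out row by row, reshaping the matcher output to (H, W, 2) restores the image layout without any extra indexing.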
  • Question

    Hello, when running the code with the pre-trained model, I get RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB (GPU 0; 7.79 GiB total capacity; 2.90 GiB already allocated; 1.83 GiB free; 4.80 GiB reserved in total by PyTorch). Is there any solution? For example, which parameters should I adjust?

    opened by Lucifer1002 2
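One general way to lower peak memory is to process queries in smaller chunks. The sketch below shows only that pattern. The callable match_fn is a hypothetical stand-in for whatever function maps a batch of query points to their matches, and no claim is made about which parameter in this repository controls the batch size.

```python
import numpy as np


def match_in_chunks(match_fn, queries, chunk_size=1000):
    """Apply match_fn to queries in chunks and concatenate the results.

    match_fn: hypothetical callable taking a (K, 2) array, returning a (K, 2) array.
    Smaller chunk_size lowers peak memory at the cost of more calls.
    """
    outputs = []
    for start in range(0, len(queries), chunk_size):
        outputs.append(match_fn(queries[start:start + chunk_size]))
    if not outputs:
        return np.empty((0, 2), dtype=float)
    return np.concatenate(outputs, axis=0)


# Toy matcher that records the chunk sizes it receives.
seen_sizes = []


def fake_match(chunk):
    seen_sizes.append(len(chunk))
    return chunk + 1.0


result = match_in_chunks(fake_match, np.zeros((25, 2)), chunk_size=10)
```

Reducing the input image resolution is another common mitigation, with a possible cost in accuracy.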
  • Rotation angle

    Hello, I would like to ask, when COTR extracts the common view area, for some scenes with too large rotation angle, the formula area cannot be extracted. What is the possible reason for this phenomenon?

    opened by Lucifer1002 1
  • Match time

    Hello, about COTR, if I use other feature extraction methods to get the feature point positions of the image and input them, can I reduce the time of COTR feature matching?

    opened by Lucifer1002 1
  • How can I ensure the smoothness of point movement when key point tracking is performed on the video?

    How can I ensure the smoothness of point movement when key point tracking is performed on the video? I am trying to find the key points frame by frame, but it is very un-smooth and will jump and drift repeatedly.

    opened by lllllialois 0
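A common generic remedy for jitter in frame-by-frame tracking is temporal smoothing of the predicted positions. The sketch below applies an exponential moving average to one keypoint track. It is post-processing that is independent of COTR, the function name smooth_track is hypothetical, and the value of alpha is an arbitrary choice.

```python
def smooth_track(points, alpha=0.5):
    """Exponentially smooth a sequence of (x, y) positions.

    alpha close to 1 follows the raw predictions closely;
    alpha close to 0 smooths more but lags behind fast motion.
    """
    smoothed = []
    prev = None
    for x, y in points:
        if prev is None:
            prev = (float(x), float(y))
        else:
            prev = (alpha * x + (1 - alpha) * prev[0],
                    alpha * y + (1 - alpha) * prev[1])
        smoothed.append(prev)
    return smoothed


track = smooth_track([(0, 0), (10, 0), (10, 0)], alpha=0.5)
```

Smoothing suppresses small jumps but cannot correct a track that has drifted onto the wrong point, so it is usually combined with some consistency check.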
  • About ETH3D evaluation

    Hi Wei, thanks for sharing the code.

    Would it be possible to provide the ETH3D evaluation code? I would like to understand how data flows through the model's forward pass.

    Look forward to your reply. Regards

    opened by CARRLIANSS 3
  • Sharing raw data of ETH3D and KITTI

    Hi everyone:

    I'd like to share the raw output from COTR for ETH3D and KITTI dataset.

    ETH3D eval: https://drive.google.com/file/d/1pfAuHRK7FvB6Hc9Rru-beH6F-2lpZAk6/view?usp=sharing

    KITTI: https://drive.google.com/file/d/1SiN5UbqautqosUCInQN2WhyxbRcbWt8b/view?usp=sharing

    The format is: {src_id}->{tgt_id}.npy, and I saved the results as a dictionary. There are several keys: "raw_corr", "drifting_forward", and "drifting_backward". "raw_corr" contains the raw sparse correspondences in XYXY format, and "drifting_forward" and "drifting_backward" are the masks used to filter out drifted predictions.

    documentation 
    opened by jiangwei221 10
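A file in the format described above could be read roughly as follows. This is a sketch under several assumptions that the post does not confirm: that the dictionary was written with np.save (so np.load with allow_pickle=True followed by .item() recovers it), that the two masks are boolean arrays with one entry per correspondence, and that True marks a drifted prediction. The helper names are hypothetical, and the demo writes a small synthetic file instead of using the shared data.

```python
import os
import tempfile

import numpy as np


def load_cotr_result(path):
    """Load one {src_id}->{tgt_id}.npy file and return its three entries."""
    data = np.load(path, allow_pickle=True).item()
    return data["raw_corr"], data["drifting_forward"], data["drifting_backward"]


def filter_drift(raw_corr, drift_fwd, drift_bwd, true_means_drift=True):
    """Keep correspondences that neither mask flags as drifted.

    The mask polarity is an assumption; flip true_means_drift if True means valid.
    """
    if true_means_drift:
        keep = ~(drift_fwd | drift_bwd)
    else:
        keep = drift_fwd & drift_bwd
    return raw_corr[keep]


# Synthetic stand-in for a shared result file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "0->1.npy")
np.save(path, {
    "raw_corr": np.arange(12.0).reshape(3, 4),  # rows: x, y, x', y'
    "drifting_forward": np.array([False, True, False]),
    "drifting_backward": np.array([False, False, True]),
})
raw, fwd, bwd = load_cotr_result(path)
kept = filter_drift(raw, fwd, bwd)
```

Inspecting the dtype and shape of the two masks in a real file is a quick way to check these assumptions before relying on the filter.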
  • About HPatches datasets

    Thanks very much for your great work! I would like to know how you test and evaluate on the HPatches dataset (in the code). Can you tell me how to get the relevant code?

    opened by ifuramango 2
  • training

    Hello, I would like to ask whether you use the complete MegaDepth dataset for training or only a selected part of it. If it is convenient, could you provide the training data?

    opened by Lucifer1002 3
Owner
UBC Computer Vision Group
University of British Columbia Computer Vision Group