Patch2Pix for Accurate Image Correspondence Estimation
This repository contains the PyTorch implementation of our paper accepted at CVPR 2021: Patch2Pix: Epipolar-Guided Pixel-Level Correspondences. [Paper] [Video].
To use our code, first download the repository:
git clone [email protected]:GrumpyZhou/patch2pix.git
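The remaining commands in this README assume you are inside the repository root (patch2pix/), so change into it after cloning:
cd patch2pix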
Setup Running Environment
The code has been tested on Ubuntu 16.04 and 18.04 with Python 3.7 + PyTorch 1.7.0 + CUDA 10.2.
We recommend using Anaconda to manage packages and to reproduce the paper results. Run the following lines to automatically set up a ready-to-use environment for our code.
conda env create -f environment.yml
conda activate patch2pix
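If you prefer not to use conda, a rough equivalent with a plain virtual environment is sketched below. Only the PyTorch version is taken from the tested setup above; the extra pip packages are assumptions, so treat environment.yml as the authoritative dependency list.
# Minimal sketch, assuming Python 3.7 and CUDA 10.2 are installed system-wide
python3.7 -m venv patch2pix-env
source patch2pix-env/bin/activate
# torchvision 0.8.1 targets torch 1.7.0
pip install torch==1.7.0 torchvision==0.8.1
# Assumed extras for the notebooks and data scripts -- check environment.yml
pip install opencv-python matplotlib jupyter visdom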
Download Pretrained Models
To run our examples, one first needs to download our pretrained Patch2Pix model. To further train a Patch2Pix model, one also needs the pretrained NCNet model. We provide the download links in pretrained/download.sh. To download both, run:
cd pretrained
bash download.sh
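After the script finishes, the checkpoints should sit directly under pretrained/. A quick sanity check is sketched below; the exact Patch2Pix checkpoint filename may differ, but the NCNet checkpoint is the file referenced by --pretrain in the training command further down.
# Expect at least pretrained/ncn_ivd_5ep.pth among the listed files
ls -lh pretrained/*.pth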
Evaluation
To reproduce our evaluations on the HPatches, Aachen and InLoc benchmarks, we refer you to our toolbox for image matching: image-matching-toolbox. There you can also find implementations to reproduce the results of the other state-of-the-art methods that we compared against in our paper.
Matching Examples
The notebook examples/visualize_matches.ipynb shows how to obtain matches for a pair of images using both Patch2Pix (our pretrained model) and NCNet (our adapted version). The example image pairs are borrowed from D2Net; you can easily replace them with your own images.
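To open the notebook locally (assuming Jupyter is available in the patch2pix environment):
conda activate patch2pix
jupyter notebook examples/visualize_matches.ipynb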
Training
The following steps are necessary only if you want to train a model yourself.
Data preparation
We use the MegaDepth dataset for training. To keep more data for training, we did not split a validation set from MegaDepth; instead, we use the validation splits of the PhotoTourism dataset. The following steps describe how to prepare the same training and validation data that we used.
Prepare Training Data
- We preprocess the MegaDepth dataset following the preprocessing steps proposed by D2Net. For details, please check the "Downloading and preprocessing the MegaDepth dataset" section of their GitHub documentation.
- Place the processed MegaDepth dataset under the data/ folder and name it MegaDepth_undistort, or create a symbolic link for it (a symlink sketch is given at the end of this subsection).
- You can directly download our pre-computed training pairs using our download script.
cd data_pairs
bash download.sh
In case you want to generate pairs with different settings, we provide a notebook to generate pairs from scratch. Once you finish steps 1 and 2, the training pairs can be generated using our notebook data_pairs/prep_megadepth_training_pairs.ipynb.
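If the processed dataset already lives elsewhere on disk, a symbolic link is enough. The sketch below assumes your preprocessed copy sits at /path/to/MegaDepth_undistort, which is a placeholder path.
# Run from the repository root; replace the source path with your own location
mkdir -p data
ln -s /path/to/MegaDepth_undistort data/MegaDepth_undistort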
Prepare Validation Data
- Use our script to download and extract the subset of train and val sequences from the PhotoTourism dataset.
cd data
bash prepare_immatch_val_data.sh
- Precompute pairwise image overlaps for fast loading of validation pairs.
# Under the root folder: patch2pix/
python -m data_pairs.precompute_immatch_val_ovs \
--data_root data/immatch_benchmark/val_dense
Training Examples
To train our best model:
python -m train_patch2pix --gpu 0 \
--epochs 25 --batch 4 \
--save_step 1 --plot_counts 20 --data_root 'data' \
--change_stride --panc 8 --ptmax 400 \
--pretrain 'pretrained/ncn_ivd_5ep.pth' \
-lr 0.0005 -lrd 'multistep' 0.2 5 \
--cls_dthres 50 5 --epi_dthres 50 5 \
-o 'output/patch2pix'
The above command saves the log file and checkpoints to the output folder specified by -o. Our best model was trained on a 48GB GPU. To train on a smaller GPU, e.g. one with 12GB, you can either set --batch 1 or --ptmax 250, which defines the maximum number of match proposals to be refined per image pair. However, in our experience those changes may also decrease the training performance. Note that during testing our network requires only a 12GB GPU.
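For example, a reduced-memory variant of the command above for a ~12GB GPU could look as follows. Only --batch and --ptmax differ from the best-model command (you may only need to lower one of them), and the output path is just a placeholder:
python -m train_patch2pix --gpu 0 \
--epochs 25 --batch 1 \
--save_step 1 --plot_counts 20 --data_root 'data' \
--change_stride --panc 8 --ptmax 250 \
--pretrain 'pretrained/ncn_ivd_5ep.pth' \
-lr 0.0005 -lrd 'multistep' 0.2 5 \
--cls_dthres 50 5 --epi_dthres 50 5 \
-o 'output/patch2pix_small'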
Usage of Visdom Server
Our training script monitors the training process using Visdom. To enable monitoring, you need to:
- Run a Visdom server on your localhost, for example:
# Feel free to change the port
python -m visdom.server -port 9333 \
-env_path ~/.visdom/patch2pix
- Append the options -vh 'localhost' -vp 9333 to the training command given in the example above (the full command is shown after this list).
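Putting both together, the best-model training command with Visdom monitoring enabled becomes (assuming the server from the previous step is running on localhost:9333):
python -m train_patch2pix --gpu 0 \
--epochs 25 --batch 4 \
--save_step 1 --plot_counts 20 --data_root 'data' \
--change_stride --panc 8 --ptmax 400 \
--pretrain 'pretrained/ncn_ivd_5ep.pth' \
-lr 0.0005 -lrd 'multistep' 0.2 5 \
--cls_dthres 50 5 --epi_dthres 50 5 \
-o 'output/patch2pix' \
-vh 'localhost' -vp 9333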
BibTeX
If you use our method or code in your project, please cite our paper:
@inproceedings{ZhouCVPRpatch2pix,
author = "Zhou, Qunjie and Sattler, Torsten and Leal-Taixe, Laura",
title = "Patch2Pix: Epipolar-Guided Pixel-Level Correspondences",
booktitle = "CVPR",
year = 2021,
}