Patch2Pix: Epipolar-Guided Pixel-Level Correspondences [CVPR2021]

Overview

Patch2Pix for Accurate Image Correspondence Estimation

This repository contains the PyTorch implementation of our paper accepted at CVPR 2021: Patch2Pix: Epipolar-Guided Pixel-Level Correspondences. [Paper] [Video].

To use our code, first download the repository:

git clone [email protected]:GrumpyZhou/patch2pix.git

Setup Running Environment

The code has been tested on Ubuntu (16.04 and 18.04) with Python 3.7 + PyTorch 1.7.0 + CUDA 10.2.
We recommend using Anaconda to manage packages and reproduce the paper results. Run the following lines to automatically set up a ready environment for our code.

conda env create -f environment.yml
conda activate patch2pix

Download Pretrained Models

In order to run our examples, one first needs to download our pretrained Patch2Pix model. To further train a Patch2Pix model, one also needs to download the pretrained NCNet. We provide the download links in pretrained/download.sh. To download both, one can run

cd pretrained
bash download.sh

Evaluation

❗️ NOTICE ❗️ : In this repository, we only provide examples to estimate correspondences using our Patch2Pix implementation.

To reproduce our evaluations on the HPatches, Aachen and InLoc benchmarks, we refer you to our toolbox for image matching: image-matching-toolbox. There you can also find implementations to reproduce the results of the other state-of-the-art methods that we compared to in our paper.

Matching Examples

In our notebook examples/visualize_matches.ipynb, we show how to obtain matches for a given pair of images using both Patch2Pix (our pretrained model) and NCNet (our adapted version). The example image pairs are borrowed from D2-Net; you can easily replace them with your own examples.
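For orientation, the gist of such a matching call can be sketched as follows. This is only a sketch, not the notebook itself: the config layout and wrapper lookup follow the image-matching-toolbox conventions referenced under Evaluation (treat those names as assumptions), and the match_pairs signature is the one visible in the issue traces further below.

# Minimal matching sketch (assumed toolbox-style config and wrapper;
# not a verbatim copy of examples/visualize_matches.ipynb).
import yaml
import immatch  # provided by image-matching-toolbox

with open('configs/patch2pix.yml') as f:
    args = yaml.safe_load(f)['example']        # assumed config section name
model = immatch.__dict__[args['class']](args)  # assumed wrapper lookup
# matches: assumed to be an N x 4 array of (x1, y1, x2, y2) correspondences
matches, _, _, _ = model.match_pairs('im1.jpg', 'im2.jpg')  # your own images
print(f'{len(matches)} matches found')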

Training

Note that the following steps are necessary only if you want to train a model yourself.

Data preparation

We use the MegaDepth dataset for training. To keep more data for training, we didn't split a validation set from MegaDepth; instead, we use the validation splits of PhotoTourism. The following steps describe how to prepare the same training and validation data that we used.

Prepare Training Data

  1. We preprocess the MegaDepth dataset following the preprocessing steps proposed by D2-Net. For details, please check out the "Downloading and preprocessing the MegaDepth dataset" section in their GitHub documentation.

  2. Then place the processed MegaDepth dataset under the data/ folder and name it MegaDepth_undistort (or create a symbolic link to it).

  3. One can directly download our pre-computed training pairs using our download script.

cd data_pairs
bash download.sh

In case one wants to generate pairs with different settings, we provide notebooks to generate pairs from scratch. Once you finish steps 1 and 2, the training pairs can be generated using our notebook data_pairs/prep_megadepth_training_pairs.ipynb.
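As a quick sanity check after downloading, the pair file can be inspected with NumPy. This is only an inspection sketch; the internal layout is defined by the pair-generation notebook, so inspect it before relying on any particular field order.

# Sketch: peek at the downloaded training-pair file (layout not guaranteed).
import numpy as np

data = np.load('data_pairs/megadepth_pairs.ov0.35_imrat1.5.pair500.excl_test.npy',
               allow_pickle=True)
print(type(data), data.shape)  # then inspect entries to see their fields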

Prepare Validation Data

  1. Use our script to download and extract the subset of train and val sequences from the PhotoTourism dataset.
cd data
bash prepare_immatch_val_data.sh
  2. Precompute pairwise image overlaps for fast loading of validation pairs.
# Under the root folder: patch2pix/
python -m data_pairs.precompute_immatch_val_ovs \
		--data_root data/immatch_benchmark/val_dense

Training Examples

To train our best model:

python -m train_patch2pix --gpu 0 \
    --epochs 25 --batch 4 \
    --save_step 1 --plot_counts 20 --data_root 'data' \
    --change_stride --panc 8 --ptmax 400 \
    --pretrain 'pretrained/ncn_ivd_5ep.pth' \
    -lr 0.0005 -lrd 'multistep' 0.2 5 \
    --cls_dthres 50 5 --epi_dthres 50 5  \
    -o 'output/patch2pix' 

The above command saves the log file and checkpoints to the output folder specified by -o. Our best model was trained on a 48 GB GPU. To train on a smaller GPU, e.g., one with 12 GB, one can either set --batch 1 or lower --ptmax, e.g., --ptmax 250, which defines the maximum number of match proposals to be refined for each image pair. However, in our experience those changes may also decrease training performance. Note that during testing, our network requires only a 12 GB GPU.
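To make the role of --ptmax concrete, here is a conceptual sketch (hypothetical names, not the actual Patch2Pix code) of one plausible reading: cap the set of match proposals at the ptmax highest-scoring ones before running pixel-level refinement.

# Conceptual sketch of the --ptmax cap (hypothetical, not the repo's code):
# keep at most ptmax of the highest-scoring proposals before refinement.
import torch

def cap_proposals(proposals, scores, ptmax=400):
    if scores.numel() <= ptmax:
        return proposals, scores
    keep = torch.topk(scores, ptmax).indices
    return proposals[keep], scores[keep]

# Example: 1000 random proposals capped to 400.
proposals, scores = cap_proposals(torch.rand(1000, 4), torch.rand(1000))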

Usage of Visdom Server

Our training script monitors the training process using Visdom. To enable monitoring, one needs to:

  1. Run a Visdom server on your localhost, for example:
# Feel free to change the port
python -m visdom.server -port 9333 \
-env_path ~/.visdom/patch2pix
  2. Append the options -vh 'localhost' -vp 9333 to the training command in the example above.
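For reference, the kind of curve logging the script performs can be reproduced with a few lines of the Visdom Python API; the window and environment names below are illustrative, not the ones used by train_patch2pix.py.

# Minimal Visdom logging sketch (illustrative names, not the training script).
import numpy as np
from visdom import Visdom

viz = Visdom(server='http://localhost', port=9333, env='patch2pix')
for step in range(100):
    loss = 1.0 / (step + 1)  # dummy value standing in for a training loss
    viz.line(X=np.array([step]), Y=np.array([loss]), win='loss',
             update='append' if step > 0 else None,
             opts=dict(title='training loss'))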

BibTeX

If you use our method or code in your project, please cite our paper:

@inproceedings{ZhouCVPRpatch2pix,
        author       = "Zhou, Qunjie and Sattler, Torsten and Leal-Taixe, Laura",
        title        = "Patch2Pix: Epipolar-Guided Pixel-Level Correspondences",
        booktitle    = "CVPR",
        year         = 2021,
}
Comments
  • Creating custom dataset

    I want to create my own dataset to train with Patch2Pix. If I have the fundamental matrix between source and destination images, would it be sufficient as supervision to train Patch2Pix?

    opened by omeryasar 7
  • Question regarding the loss definitions

    Hi,

    Looking here: https://github.com/GrumpyZhou/patch2pix/blob/main/train_patch2pix.py#L138-L200

    cdist is used for the medium-level prediction, and mdist is used for the fine level. I'm not quite sure I follow the logic here. Why are mdist and fdist not used for the medium/fine levels?

    opened by Parskatt 3
  • Can the pre-trained model be used both indoor and outdoor ?

    Thank you for sharing your brilliant work.

    I have a little question.

    Can the pre-trained patch2pix_pretrained.pth model be used both indoors and outdoors?

    opened by noone-code 2
  • Question about the loss

    Thank you for your outstanding work. My question is: what values do the four losses (cls_mid/cls_fine/epi_mid/epi_fine) finally converge to? And how long does training take?

    opened by handsomelcj 1
  • Access to data_pairs/prep_megadepth_training_pairs.ipynb

    Hi,

    This is really a nice work.

    I am trying to reproduce the results in your paper.

    However, I couldn't find data_pairs/prep_megadepth_training_pairs.ipynb and also don't have access to the link.

    Could you upload the ipynb to the repo please?

    Thanks.

    opened by cv4aec 1
  • Quantization Code

    Dear Author, you've mentioned that to evaluate this method on the Aachen Day-Night benchmark, you quantized the matches. Will you release the code for the quantization part?

    opened by QsingHuan 1
  • How can I use the 'NCNet' only when I run the 'visualize_matches.py'?

    Hi, could you give me any advice? I want to output results like those illustrated in your paper (Figure 7). How can I modify the code in 'visualize_matches.py' to get an outcome like that?

    opened by BruceWANGDi 1
  • What's the meaning of R and t in megadepth_pairs.ov0.35_imrat1.5.pair500.excl_test.npy when computing F matrix?

    Notice that R and t are inputs when computing the F matrix, which is used to generate the geometric loss. But to my knowledge, each image has an R matrix and a t vector extracted from its absolute pose. Are the R and t here elements of the relative pose?

    opened by HUSTNO1WXY 1
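    For readers with the same question: the textbook construction, assuming R and t denote the relative pose mapping points from image 1 to image 2 (x2 = R x1 + t) and K1, K2 are the camera intrinsics, is F = K2^{-T} [t]_x R K1^{-1}. A small NumPy sketch of that formula (not necessarily the exact code in this repo):

# Fundamental matrix from a relative pose (standard formula, for reference).
import numpy as np

def skew(t):
    # 3x3 skew-symmetric matrix [t]_x, so that skew(t) @ v == np.cross(t, v)
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def fundamental_from_relative_pose(K1, K2, R, t):
    E = skew(t) @ R                                    # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)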
  • questions on cls_fine

    Hi,

    Thank you for your excellent work! I'd like to ask for some advice for training this model.

    When I adapted this model to my own datasets, I encountered a problem: it converges to a point where the cls_fine loss is 0, but the metric, the specificity of cls_fine, is about 0.01, which means it has a weak ability to distinguish false matches. May I get some advice for improving the performance further?

    opened by feiyu12138 2
  • Segfault at import statement

    Hello Authors,

    I am getting a segfault when I try to run the visualize_matches.ipynb code. I debugged this error and it happens specifically at

    from utils.common.plotting import plot_matches

    I think there is some issue with module importing. I am running the code on Linux.

    Has anyone faced this issue?

    opened by abhaydoke09 1
  • Crash when no matches are found

    Hi Qunjie,

    Thanks for releasing the code!

    I observed a test-time crash that happens when there are no matches found. The trace is

      File "../localize.py", line 385, in main
        matches, _, _, _ = matcher(img1_name, img2_name)
      File "../localize.py", line 314, in <lambda>
        matcher = lambda im1, im2: model.match_pairs(im1, im2)
      File "/local/datasets/aachen_day_night/temp/image-matching-toolbox/immatch/modules/patch2pix.py", line 86, in match_pairs
        io_thres=self.match_threshold)        
      File "/local/datasets/aachen_day_night/temp/image-matching-toolbox/immatch/modules/../../third_party/patch2pix/networks/patch2pix.py", line 296, in refine_matches
        regressor=self.regress_mid)
      File "/local/datasets/aachen_day_night/temp/image-matching-toolbox/immatch/modules/../../third_party/patch2pix/networks/patch2pix.py", line 215, in forward_fine_match
        psize, ptype, regressor)
      File "/local/datasets/aachen_day_night/temp/image-matching-toolbox/immatch/modules/../../third_party/patch2pix/networks/patch2pix.py", line 177, in forward_fine_match_mini_batch
        f1s = f1s.view(-1, N, psize, psize).permute(1, 0, 2, 3)
    RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0, 16, 16] because the unspecified dimension size -1 can be any value and is ambiguous
    

    Here, I am using your image matching toolbox to match two images with patch2pix on top of SuperGlue features. Do you know what the best fix would be?

    opened by tsattler 4
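    A pragmatic interim workaround (an assumption, not an official fix) is to guard the matcher call so that an empty proposal set yields an empty match array instead of hitting the 0-element reshape shown in the trace:

# Defensive wrapper sketch around match_pairs (hypothetical workaround).
import numpy as np

def safe_match_pairs(model, im1, im2):
    try:
        matches, _, _, _ = model.match_pairs(im1, im2)
    except RuntimeError:   # e.g. the 0-element reshape in the trace above
        matches = np.empty((0, 4))
    return matches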