Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

Computer Vision and Geometry Lab

Last update: Jan 5, 2023


Overview

Back to the Feature with PixLoc

We introduce PixLoc, a neural network for end-to-end learning of camera localization from an image and a 3D model via direct feature alignment. It is presented in our paper:

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
to appear at CVPR 2021
Authors: Paul-Edouard Sarlin*, Ajaykumar Unagar*, Måns Larsson, Hugo Germain, Carl Toft, Victor Larsson, Marc Pollefeys, Vincent Lepetit, Lars Hammarstrand, Fredrik Kahl, and Torsten Sattler

This repository will host the training and inference code. Please subscribe to this issue if you wish to be notified of the code release.

Abstract

Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms. Many regress precise geometric quantities, like poses or 3D points, from an input image. This either fails to generalize to new viewpoints or ties the model parameters to a specific scene. In this paper, we go Back to the Feature: we argue that deep networks should focus on learning robust and invariant visual features, while the geometric estimation should be left to principled algorithms. We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. Our approach is based on the direct alignment of multiscale deep features, casting camera localization as metric learning. PixLoc learns strong data priors by end-to-end training from pixels to pose and exhibits exceptional generalization to new scenes by separating model parameters and scene geometry. The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching by jointly refining keypoints and poses with little overhead.

BibTex Citation

Please consider citing our work if you use any of the ideas presented in the paper or code from this repo:

@inproceedings{sarlin21pixloc,
  author    = {Paul-Edouard Sarlin and
               Ajaykumar Unagar and
               Måns Larsson and
               Hugo Germain and
               Carl Toft and
               Victor Larsson and
               Marc Pollefeys and
               Vincent Lepetit and
               Lars Hammarstrand and
               Fredrik Kahl and
               Torsten Sattler},
  title     = {{Back to the Feature}: Learning Robust Camera Localization from Pixels to Pose},
  booktitle = {CVPR},
  year      = {2021},
  url       = {https://arxiv.org/abs/2103.09213}
}

Comments

Reproducing Results on the https://www.visuallocalization.net/details/17831/ Benchmark

Hi @Skydes

Thank you for sharing your implementation and the tools surrounding this localization framework!

I have been trying to reproduce the results of hloc + pixloc on the visual localization benchmark for the Extended CMU dataset. However, I haven't been able to get results close to the values seen on the linked benchmark. The values I'm currently getting are:

Locally, I've downloaded the pixloc_cmu pre-trained weights hosted here, and I'm running the following command:

python -m pixloc.run_CMU --from_poses

After hours of running, it terminates with the message (truncated):

[11/25/2021 00:32:21 pixloc INFO] Finished evaluating all slices, you can now submit the file /home/frank/github/pixloc/outputs/results/pixloc_CMU_slice2-3-4-5-6-13-14-15-16-17-18-19-20-21.txt to https://www.visuallocalization.net/submission/

I'm assuming that --from_poses runs the evaluation using hloc poses as a start; is this correct? Also, do you have any pointers on what I might be doing wrong?

opened by fulkast 20
ImportError: cannot import name 'set_logging_debug'

Hi, thanks a lot for releasing your project. This is amazing work. I ran into a problem when trying to run run_CMU.py: it fails with ImportError: cannot import name 'set_logging_debug'. I also couldn't find a logger module in the directory that contains run_CMU.py. Could you help me fix this? Thanks a lot.

opened by wwtx9 6
how to get the extrinsics between the cameras

Thanks for your great work. I have downloaded the RobotCar dataset from the URL, run "run_RobotCar.py", and got results like this:

left/1418235223450115.jpg 0.11704475375741745 0.03295656318232015 -0.6387011805495836 -0.759786280822208 -125.37236785888672 -23.192873001098633 -7.491024494171143
rear/1418235392061935.jpg 0.7276782815863018 -0.5911740329291926 -0.23252374914543117 -0.2587088853928167 -100.58859252929688 -41.86554718017578 207.1205291748047
right/1418236138953803.jpg 0.6205097424969388 -0.6552427198975653 -0.20077227393455488 -0.3812022186540521 24.56517219543457 7.398570537567139 -221.69622802734375

I would like to know how to get the extrinsics between the cameras.
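One way to approach this question is to compute the relative transform between two estimated poses. The sketch below assumes each result line follows the layout `name qw qx qy qz tx ty tz` with a world-to-camera rotation and translation (the layout used for submissions to visuallocalization.net); that reading of the output is an assumption, and the helper names are illustrative rather than part of pixloc. The result is only a rig extrinsic if both images were captured at the same instant, which is not the case for the three sample lines above (their timestamps differ).

```python
import math

def quat_to_rot(qw, qx, qy, qz):
    """3x3 rotation matrix (nested lists) from a quaternion, normalized first."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]

def parse_pose_line(line):
    """Split 'name qw qx qy qz tx ty tz' into (name, R, t). Layout is assumed."""
    name, *vals = line.split()
    qw, qx, qy, qz, tx, ty, tz = map(float, vals)
    return name, quat_to_rot(qw, qx, qy, qz), [tx, ty, tz]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def relative_pose(R_a, t_a, R_b, t_b):
    """Transform taking points from camera a's frame to camera b's frame.

    With x_a = R_a X + t_a and x_b = R_b X + t_b, eliminating X gives
    x_b = (R_b R_a^T) x_a + (t_b - R_b R_a^T t_a).
    """
    R_ba = matmul(R_b, transpose(R_a))
    rotated = matvec(R_ba, t_a)
    t_ba = [t_b[i] - rotated[i] for i in range(3)]
    return R_ba, t_ba
```

To use it, parse two lines that share a timestamp with `parse_pose_line` and pass the resulting rotations and translations to `relative_pose`. Since individual estimates are noisy, averaging over many simultaneous frame pairs would likely be more reliable than a single pair.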

opened by angiend 5
Question: TypeError

Hi, thanks for your amazing work. I encountered the following error when I tried to run the code on a weakly textured indoor scene dataset.

In the following case, is the value of this weight related to the quality of the dataset, such as the texture of the scene?

opened by CuiYan27 4
How are the intrinsic values calculated for the Cambridge and 7Scenes datasets?

Intrinsic parameters for query images in the Cambridge and 7Scenes datasets are not available, but PixLoc requires calibrated query images. How are these parameters obtained for evaluation on these datasets?

Thanks!

opened by patelajaychh 4
'CambridgeLandmarks_Colmap_Retriangulated_1024px.zip' this file can't be downloaded

Hi, I'm trying to download 'CambridgeLandmarks_Colmap_Retriangulated_1024px.zip', but the download fails. Do you know why?

opened by juju904 3
Using sfm model on 7scenes without depth

Hi, thanks for sharing your work. I noticed that the demo on 7Scenes needs the reference SfM model "sfm_superpoint+superglue+depth". But I cannot download the rendered 7Scenes depth images and only have an SfM model built from RGB images. Can I use such a reference SfM model directly with the pre-trained model?

opened by MisEty 2
Time profiling for ground truth pose data of Cambridge and 7Scenes?

Following up on https://github.com/cvg/pixloc/issues/19#issuecomment-980494604: I'm running SfM reconstruction of the Cambridge KingsCollege scene using the command:

colmap mapper --database_path /data/hloc/outputs_KingsCollege/sfm_superpoint+superglue/database.db --image_path /data/datasets/Cambridge/KingsCollege/images_all --output_path /data/hloc/outputs_KingsCollege/sfm_superpoint+superglue/models --Mapper.num_threads 16

Cambridge: https://drive.google.com/file/d/1esqzZ1zEQlzZVic-H32V6kkZvc4NeS15/view
7Scenes: https://drive.google.com/file/d/1cu6KUR7WHO7G4EO49Qi3HEKU6n_yYDjb/view

There are 1565 images in total. It has already been running for 8 hours and has not finished. Is this usual? Also, the process is not using the GPU. Does the mapper in COLMAP not support GPU?

How long did it take to create the ground truths shared above? Would it be possible to share a time profile for each scene?

opened by patelajaychh 2
What is the oracle referred to in the PixLoc paper?

PixLoc, trained on MegaDepth, is initialized with image retrieval obtained with either DenseVLAD [88] or an oracle, which returns the reference image containing the largest number of inlier matches found by hloc. This oracle shows the benefits of better image retrieval using a more complex pipeline without ground truth information.
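Read literally, the quoted passage describes a selection rule: among the candidate reference images, keep the one with the most inlier matches. A minimal sketch of that rule follows, assuming the inlier counts have already been computed elsewhere (for example by a separate hloc localization run); the function name and the input dictionary are hypothetical and not taken from the pixloc or hloc code.

```python
def oracle_retrieval(inliers_per_reference):
    """Return the reference image name with the largest inlier count.

    `inliers_per_reference` maps a reference image name to the number of
    inlier matches found for the query (a hypothetical precomputed input).
    """
    if not inliers_per_reference:
        raise ValueError("no reference images to choose from")
    return max(inliers_per_reference, key=inliers_per_reference.get)
```

For instance, `oracle_retrieval({"db/001.jpg": 35, "db/002.jpg": 212})` picks `"db/002.jpg"`. No ground-truth pose is involved; the selection depends only on match statistics.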

opened by patelajaychh 2
got an ffmpeg error

I got an error when running demo.ipynb:

Error: ffmpeg error (see stderr output for detail)

I don't know whether I'm using the wrong version of ffmpeg, since no version is specified. Could you share the version you use? Or do you have any advice about this error?

opened by Ma-yiwei 2

Error in the demo notebook: add_prefixes() missing 1 required positional argument: 'eval_dir'

When trying to run the demo notebook, the following error appears:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_37/2660258349.py in <module>
      8 
      9 print(f'default paths:\n{pformat(default_paths.asdict())}')
---> 10 paths = default_paths.add_prefixes(DATA_PATH/dataset, LOC_PATH/dataset)
     11 
     12 conf = default_confs['from_retrieval']

TypeError: add_prefixes() missing 1 required positional argument: 'eval_dir'

Indeed, the call to add_prefixes seems to be missing a parameter:

paths = default_paths.add_prefixes(DATA_PATH/dataset, LOC_PATH/dataset)

opened by bperel 2

CVE-2007-4559 Patch

Patching CVE-2007-4559

Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.
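The check described above can be sketched roughly as follows. This is an illustration of the general technique, not the contents of the actual pull request: the names `is_within_directory` and `safe_extractall` are chosen here for illustration, and the sketch only guards against path traversal through member names, not against symlink or hard-link members.

```python
import os
import tarfile

def is_within_directory(directory, target):
    """True if `target` resolves to a path inside `directory`."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory

def safe_extractall(tar, path="."):
    """Extract all members, refusing archives that would write outside `path`."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"unsafe path in tar archive: {member.name!r}")
    tar.extractall(path)
```

A caller would replace `tar.extractall(dest)` with `safe_extractall(tar, dest)`. On Python 3.12 and later, passing `filter="data"` to `extractall` is another option, since the standard library then rejects such members itself.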

If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

opened by TrellixVulnTeam 0
Some questions about training

Hello, first of all, I think your work is excellent. I have recently been trying to reproduce it by training on the MegaDepth dataset on a server, but I always fail to match the results in your training logs. The only change I made was from batch_size=6, num_workers=8 to batch_size=12, num_workers=12. This confuses me, and I hope you can help. Best regards.

opened by Geelooo 0
can not change the encoder of UNet

Hi, thanks for your amazing work. When retraining the model, I changed the encoder of the UNet from VGG to ResNet, and the following error occurred. What is the reason, and how can I solve it?

opened by juju904 2
With custom data
Hi, thanks for your great work! I want to use pixloc for a localization task.

I followed these steps:

1. I created my own dataset, captured with my own camera.
2. I used NetVLAD to compute global descriptors.
3. I used SuperPoint to compute local descriptors.
4. I used retrieval to create image pairs.
5. I matched the pairs with SuperGlue.
6. I ran the reconstruction to create a model (sfm_superpoint+superglue).
7. Finally, I used the model to localize query images.

Unfortunately, I can't get precise results with pixloc. The SfM model looks fine when visualized in the COLMAP GUI, and the estimated camera poses are roughly correct. What should I do to improve the accuracy?
opened by CuiYan27 2
Support for Colmap camera model FULL_OPENCV

Dear pixloc authors and developers,

Many thanks for providing such a high-quality package! Well done.

For our application, we are using a FULL_OPENCV camera model, which isn't supported by pixloc at the moment.

The FULL_OPENCV model is registered in utils/colmap.py, but it's missing from pixlib/geometry/wrappers.py, as well as from the actual undistortion in pixlib/geometry/utils.py and its Jacobian. The FULL_OPENCV calibration model is an extension of the OPENCV model, with a higher-order polynomial: k3, k4, k5 and k6 follow p1 and p2.

It seems that adding support for higher degree polynomials in the undistortion code would be simple, but I'm not sure about the boundary condition - what would this become?
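For reference, the forward distortion of COLMAP's FULL_OPENCV model can be sketched as below, with parameters in the order k1, k2, p1, p2, k3, k4, k5, k6. As far as COLMAP's definition goes, the radial factor is a ratio of two polynomials rather than a single higher-degree polynomial, which matters when deriving the Jacobian. This is a standalone illustration on normalized image coordinates, not pixloc code, and it does not address the boundary condition raised above.

```python
def distort_full_opencv(x, y, k1, k2, p1, p2, k3, k4, k5, k6):
    """Apply FULL_OPENCV distortion to a normalized image point (x, y)."""
    r2 = x * x + y * y
    r4 = r2 * r2
    r6 = r4 * r2
    # Rational radial factor: k1..k3 in the numerator, k4..k6 in the denominator.
    radial = (1 + k1 * r2 + k2 * r4 + k3 * r6) / (1 + k4 * r2 + k5 * r4 + k6 * r6)
    # Tangential terms are the same as in the OPENCV model.
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + 2 * p2 * x * y + p1 * (r2 + 2 * y * y)
    return xd, yd
```

Setting k3 = k4 = k5 = k6 = 0 reduces this to the OPENCV model, which gives a simple consistency check for any extension of the existing code.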

opened by jackokaiser 2

