Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

Overview

Back to the Feature with PixLoc

We introduce PixLoc, a neural network for end-to-end learning of camera localization from an image and a 3D model via direct feature alignment. It is presented in our CVPR 2021 paper, Back to the Feature: Learning Robust Camera Localization from Pixels to Pose.

This repository will host the training and inference code. Please subscribe to this issue if you wish to be notified of the code release.

Abstract

Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms. Many regress precise geometric quantities, like poses or 3D points, from an input image. This either fails to generalize to new viewpoints or ties the model parameters to a specific scene. In this paper, we go Back to the Feature: we argue that deep networks should focus on learning robust and invariant visual features, while the geometric estimation should be left to principled algorithms. We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. Our approach is based on the direct alignment of multiscale deep features, casting camera localization as metric learning. PixLoc learns strong data priors by end-to-end training from pixels to pose and exhibits exceptional generalization to new scenes by separating model parameters and scene geometry. The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching by jointly refining keypoints and poses with little overhead.

BibTex Citation

Please consider citing our work if you use any of the ideas presented in the paper or code from this repository:

@inproceedings{sarlin21pixloc,
  author    = {Paul-Edouard Sarlin and
               Ajaykumar Unagar and
               Måns Larsson and
               Hugo Germain and
               Carl Toft and
               Victor Larsson and
               Marc Pollefeys and
               Vincent Lepetit and
               Lars Hammarstrand and
               Fredrik Kahl and
               Torsten Sattler},
  title     = {{Back to the Feature}: Learning Robust Camera Localization from Pixels to Pose},
  booktitle = {CVPR},
  year      = {2021},
  url       = {https://arxiv.org/abs/2103.09213}
}
Comments
  • Reproducing Results on the https://www.visuallocalization.net/details/17831/ Benchmark

    Hi @Skydes

    Thank you for sharing your implementation and the tools surrounding this localization framework!

    I have been trying to reproduce the results of hloc + pixloc on the visual localization benchmark for the Extended CMU dataset. However, I haven't been able to get results close to the values shown on the linked benchmark. My current numbers are in the attached screenshot.

    Locally, I've downloaded the pixloc_cmu pre-trained weights hosted here, and I'm running the following command:

    python -m pixloc.run_CMU --from_poses

    After hours of running, it terminates with the message (truncated):

    [11/25/2021 00:32:21 pixloc INFO] Finished evaluating all slices, you can now submit the file /home/frank/github/pixloc/outputs/results/pixloc_CMU_slice2-3-4-5-6-13-14-15-16-17-18-19-20-21.txt to https://www.visuallocalization.net/submission/

    I'm assuming that --from_poses runs the evaluation using hloc poses as a starting point; is this correct? Also, do you have any pointers on what I might be doing wrong?

    opened by fulkast 20
  • ImportError: cannot import name 'set_logging_debug'

    Hi, thanks a lot for releasing your project; this is amazing work. I ran into a problem when trying to run run_CMU.py: it raises ImportError: cannot import name 'set_logging_debug'. I also couldn't find a logger module in the directory containing run_CMU.py. Could you help me figure out how to fix this? Thanks a lot.

    opened by wwtx9 6
  • how to get the extrinsics between the cameras

    Thanks for your great work. I have downloaded the RobotCar dataset from the URL, run run_RobotCar.py, and obtained results like this:

    left/1418235223450115.jpg 0.11704475375741745 0.03295656318232015 -0.6387011805495836 -0.759786280822208 -125.37236785888672 -23.192873001098633 -7.491024494171143
    rear/1418235392061935.jpg 0.7276782815863018 -0.5911740329291926 -0.23252374914543117 -0.2587088853928167 -100.58859252929688 -41.86554718017578 207.1205291748047
    right/1418236138953803.jpg 0.6205097424969388 -0.6552427198975653 -0.20077227393455488 -0.3812022186540521 24.56517219543457 7.398570537567139 -221.69622802734375

    How can I obtain the extrinsics between the cameras?
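For reference, if the output above follows the visuallocalization.net convention (world-to-camera pose written as qw qx qy qz tx ty tz), the relative extrinsics between two cameras can be derived from their absolute poses. A minimal sketch under that assumption (plain numpy, not the pixloc API):

```python
import numpy as np

def quat_to_rot(q):
    # Unit quaternion (qw, qx, qy, qz) -> 3x3 rotation matrix.
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def relative_extrinsics(q_a, t_a, q_b, t_b):
    # Both poses map world -> camera: x_cam = R @ x_world + t.
    # Returns (R_ab, t_ab) such that x_b = R_ab @ x_a + t_ab.
    R_a, R_b = quat_to_rot(q_a), quat_to_rot(q_b)
    R_ab = R_b @ R_a.T
    t_ab = np.asarray(t_b) - R_ab @ np.asarray(t_a)
    return R_ab, t_ab
```

Note that on a moving rig such as RobotCar this is only a fixed camera-to-camera extrinsic when both images share the same timestamp; the three filenames in the example above come from different timestamps.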

    opened by angiend 5
  • Question: TypeError

    Hi, thanks for your amazing work. I encountered the following error when I tried to run the code on a weakly textured indoor scene dataset.

    In the following case, is the value of this weight related to the quality of the dataset, such as the texture of the scene?

    [error screenshots attached]

    opened by CuiYan27 4
  • How the intrinsic values are calculated for Cambridge and 7 Scenes dataset?

    Intrinsic parameters for the query images of the Cambridge and 7Scenes datasets are not provided, but PixLoc requires calibrated query images. How were these parameters obtained for evaluating on these datasets?

    Thanks!
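For context (not an authoritative answer): query intrinsics for such datasets are often taken from an SfM reconstruction, since COLMAP estimates per-image focal lengths; a common fallback when nothing else is available is a pinhole model with the principal point at the image center. A sketch of that fallback (illustrative helper, not pixloc code):

```python
import numpy as np

def pinhole_K(focal, width, height):
    # Simple pinhole intrinsics: square pixels, principal point at
    # the image center. This matches the parameterization of
    # COLMAP's SIMPLE_PINHOLE model (f, cx, cy).
    return np.array([[focal, 0.0, width / 2.0],
                     [0.0, focal, height / 2.0],
                     [0.0, 0.0, 1.0]])
```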

    opened by patelajaychh 4
  • 'CambridgeLandmarks_Colmap_Retriangulated_1024px.zip' this file can't be downloaded

    Hi, I'm trying to download 'CambridgeLandmarks_Colmap_Retriangulated_1024px.zip', but the file can't be downloaded and I don't know why. Do you know what might be wrong?

    opened by juju904 3
  • Using sfm model on 7scenes without depth

    Hi, thanks for sharing your work. I noticed that the demo on 7Scenes needs the reference SfM model "sfm_superpoint+superglue+depth". However, I cannot download the rendered 7Scenes depth images and only have the SfM model built from RGB images. Can I use such a reference SfM model directly with the pre-trained model?

    opened by MisEty 2
  • Time profiling for ground truth pose data of Cambridge and 7Scenes?

    Following https://github.com/cvg/pixloc/issues/19#issuecomment-980494604, I'm running SfM reconstruction of the Cambridge KingsCollege scene using the command:

    colmap mapper --database_path /data/hloc/outputs_KingsCollege/sfm_superpoint+superglue/database.db --image_path /data/datasets/Cambridge/KingsCollege/images_all --output_path /data/hloc/outputs_KingsCollege/sfm_superpoint+superglue/models --Mapper.num_threads 16

    Cambridge: https://drive.google.com/file/d/1esqzZ1zEQlzZVic-H32V6kkZvc4NeS15/view 7Scenes: https://drive.google.com/file/d/1cu6KUR7WHO7G4EO49Qi3HEKU6n_yYDjb/view

    There are 1,565 images in total. It has already been 8 hours and it's still running; is this usual? Also, the process is not using the GPU. Does COLMAP's mapper not support GPU?

    I'm wondering how much time it took to create the ground truths shared above. Would it be possible to share a time profile for each scene?

    opened by patelajaychh 2
  • What is Oracle referred in Pixloc paper?

    PixLoc, trained on MegaDepth, is initialized with image retrieval obtained with either DenseVLAD [88] or an oracle, which returns the reference image containing the largest number of inlier matches found by hloc. This oracle shows the benefits of better image retrieval using a more complex pipeline without ground truth information.

    opened by patelajaychh 2
  • got an ffmpeg error

    I got an error when running the demo.ipynb:

    Error: ffmpeg error (see stderr output for detail)

    I don't know whether I'm using the wrong version of ffmpeg, since there is no detailed version information available. Could you share which version you use, or do you have any advice on this error?

    opened by Ma-yiwei 2
  • Error in the demo notebook: add_prefixes() missing 1 required positional argument: 'eval_dir'

    When trying to run the demo notebook, the following error appears:

    TypeError                                 Traceback (most recent call last)
    /tmp/ipykernel_37/2660258349.py in <module>
          8 
          9 print(f'default paths:\n{pformat(default_paths.asdict())}')
    ---> 10 paths = default_paths.add_prefixes(DATA_PATH/dataset, LOC_PATH/dataset)
         11 
         12 conf = default_confs['from_retrieval']
    
    TypeError: add_prefixes() missing 1 required positional argument: 'eval_dir'
    

    Indeed, the call to add_prefixes seems to be missing a parameter:

    paths = default_paths.add_prefixes(DATA_PATH/dataset, LOC_PATH/dataset)
    
    opened by bperel 2
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • Some questions about training

    Hello, first of all, I think your work is excellent. I have recently been reproducing it by training on the MegaDepth dataset on a server, but I have not been able to match the results in your training logs. The only change I made was from batch_size=6, num_workers=8 to batch_size=12, num_workers=12. This confuses me, and I hope to get your answer. With best regards.

    opened by Geelooo 0
  • can not change the encoder of UNet

    Hi, thanks for your amazing work. When retraining the model, I changed the encoder of the UNet from VGG to ResNet, and the following error occurred. What is the reason, and how can I solve it? [error screenshot attached]

    opened by juju904 2
  • With custom data

    Hi, thanks for your great work! I want to use pixloc to implement a localization task.

    I followed these steps:

    1. I created my own dataset, captured with my own camera.
    2. I used NetVLAD to create global descriptors.
    3. I used SuperPoint to create local descriptors.
    4. I used retrieval to create pairs.
    5. I matched the pairs using SuperGlue.
    6. I followed the reconstruction steps to create a model: sfm_superpoint+superglue.
    7. Finally, I used the model to localize query images.

    Unfortunately, I can't achieve a precise result with pixloc. The SfM model looks good when visualized in the COLMAP GUI, and the estimated camera poses are roughly correct. What should I do to improve it?
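When localization with a visually good SfM model is still imprecise, one quick sanity check is to reproject known 3D points with the estimated pose and look at the pixel error; consistently large errors point to wrong intrinsics or a pose-convention mismatch rather than a bad map. A self-contained sketch (plain numpy, hypothetical inputs, not the pixloc API):

```python
import numpy as np

def reprojection_errors(pts3d, pts2d, K, R, t):
    # pts3d: (N, 3) world points; pts2d: (N, 2) observed pixels.
    # R, t: world-to-camera pose; K: 3x3 intrinsics.
    cam = pts3d @ R.T + t                # world -> camera frame
    proj = cam @ K.T                     # apply intrinsics
    proj = proj[:, :2] / proj[:, 2:3]    # perspective division
    return np.linalg.norm(proj - pts2d, axis=1)
```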

    opened by CuiYan27 2
  • Support for Colmap camera model FULL_OPENCV

    Dear pixloc authors and developers,

    Many thanks for providing such a high-quality package! Well done.

    For our application, we are using a FULL_OPENCV camera model, which isn't supported by pixloc at the moment.

    The FULL_OPENCV model is registered in utils/colmap.py, but it's missing from pixlib/geometry/wrappers.py, as well as from the actual undistortion in pixlib/geometry/utils.py and its Jacobian. The FULL_OPENCV calibration model is an extension of the OPENCV model with a higher-order polynomial: k3, k4, k5, and k6 follow p1 and p2.

    It seems that adding support for higher-degree polynomials in the undistortion code would be simple, but I'm not sure about the boundary condition: what would it become?
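For context, the radial term of FULL_OPENCV is a rational polynomial rather than a plain one. The forward distortion can be sketched as below (my reading of the model in normalized coordinates, with the parameter order fx, fy, cx, cy, k1, k2, p1, p2, k3, k4, k5, k6 as in COLMAP; this is not pixloc code):

```python
import numpy as np

def distort_full_opencv(pts, k1, k2, p1, p2, k3, k4, k5, k6):
    # pts: (N, 2) normalized image coordinates (x/z, y/z).
    x, y = pts[:, 0], pts[:, 1]
    r2 = x * x + y * y
    # Rational radial term:
    # (1 + k1 r^2 + k2 r^4 + k3 r^6) / (1 + k4 r^2 + k5 r^4 + k6 r^6)
    radial = (1 + r2 * (k1 + r2 * (k2 + r2 * k3))) \
           / (1 + r2 * (k4 + r2 * (k5 + r2 * k6)))
    # Tangential terms, identical to the plain OPENCV model.
    dx = 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    dy = p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return np.stack([x * radial + dx, y * radial + dy], axis=-1)
```

With k3 = k4 = k5 = k6 = 0 this reduces to the OPENCV model, which is why extending the existing undistortion (and its Jacobian) looks straightforward apart from the boundary question above.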

    opened by jackokaiser 2
Owner

Computer Vision and Geometry Lab

Script that receives an Image (original) and a set of images to be used as "pixels" in reconstruction of the Original image using the set of images as "pixels"

RodrigoCMoraes 1 Oct 24, 2021

SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

Ran Cheng 4 Dec 15, 2022

The implementation of the paper "A Deep Feature Aggregation Network for Accurate Indoor Camera Localization".

null 9 Dec 9, 2022

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors, CVPR 2021

Aymen Mir 66 Dec 21, 2022

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Hugo Germain 78 Dec 1, 2022

Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

Gyeongsik Moon 29 Sep 24, 2022

Code for our CVPR 2021 Paper "Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes".

CompVis Heidelberg 153 Jan 4, 2023

SSL_SLAM2: Lightweight 3-D Localization and Mapping for Solid-State LiDAR (mapping and localization separated) ICRA 2021

Wang Han 王晗 1.3k Jan 8, 2023

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking (CVPR 2021)

NingWang 236 Dec 22, 2022

Python and C++ implementation of "MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation". Accepted at LXCV @ CVPR 2021.

Jhacson Meza 47 Nov 18, 2022

Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

Pyjcsx 328 Dec 17, 2022

Code for the CVPR 2021 paper: Understanding Failures of Deep Networks via Robust Feature Extraction

Sahil Singla 33 Dec 5, 2022

Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

Ibai Gorordo 15 Oct 14, 2022

(CVPR 2021) Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

null 86 Oct 5, 2022

DSAC* for Visual Camera Re-Localization (RGB or RGB-D)

Visual Learning Lab 143 Dec 22, 2022

Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

Tencent YouTu Research 146 Dec 24, 2022

camera-caps - Examine the camera capabilities for V4l2 cameras

Jetsonhacks 25 Dec 26, 2022

Delving into Localization Errors for Monocular 3D Object Detection, CVPR'2021

XINZHU.MA 124 Jan 4, 2023

Code for BMVC2021 "MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation"

null 104 Dec 8, 2022