Learning Camera Localization via Dense Scene Matching, CVPR2021

tangshitao

Last update: Dec 1, 2022

Related tags

Computer Vision Dense-Scene-Matching

Overview

This repository contains the code for our CVPR 2021 paper, "Learning Camera Localization via Dense Scene Matching", by Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu and Ping Tan.

This paper presents a new method for scene-agnostic camera localization using dense scene matching (DSM), where a cost volume is constructed between a query image and a scene. The cost volume and the corresponding coordinates are processed by a CNN to predict dense coordinates. The camera pose is then solved with a PnP algorithm.
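
As a rough illustration of the matching step described above, the following NumPy sketch builds a cost volume as the cosine similarity between per-pixel query features and scene point features, then keeps the k best-matching scene coordinates for each pixel. It is not the repository's implementation: the function name `build_cost_volume`, the array shapes, and the choice of cosine similarity with top-k selection are assumptions made for illustration, and the CNN that regresses the final coordinates is omitted.

```python
import numpy as np


def build_cost_volume(query_feat, scene_feat, scene_xyz, k=4):
    """Hypothetical helper: match query pixels against scene points.

    query_feat: (P, C) features of P query pixels
    scene_feat: (N, C) features of N scene points
    scene_xyz:  (N, 3) 3D coordinates of the scene points
    Returns the top-k similarities (P, k) and their coordinates (P, k, 3).
    """
    # L2-normalise so the dot product becomes a cosine similarity.
    q = query_feat / (np.linalg.norm(query_feat, axis=1, keepdims=True) + 1e-8)
    s = scene_feat / (np.linalg.norm(scene_feat, axis=1, keepdims=True) + 1e-8)
    cost = q @ s.T  # (P, N) cost volume

    # Sort each row in descending order and keep the k best scene points.
    idx = np.argsort(-cost, axis=1)[:, :k]
    topk_cost = np.take_along_axis(cost, idx, axis=1)
    topk_xyz = scene_xyz[idx]
    return topk_cost, topk_xyz
```

In a full pipeline, the sorted similarities and their coordinates would be fed to a network that predicts one 3D coordinate per pixel, and the resulting 2D-3D correspondences would go to a PnP solver.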

If you find this project useful, please cite:

@inproceedings{Tang2021Learning,
  title={Learning Camera Localization via Dense Scene Matching},
  author={Shitao Tang and Chengzhou Tang and Rui Huang and Siyu Zhu and Ping Tan},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Usage

Environment

The code is tested with
- pytorch=1.4.0
- lmdb (optional)
- yaml
- skimage
- opencv
- numpy=1.17
- tensorboard
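
To compare a local environment against the list above, a small script along these lines may help. It is only a sketch: the pip distribution names (`torch`, `PyYAML`, `scikit-image`, `opencv-python`) are assumptions and may differ for conda or headless installs, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata  # Python 3.8+

# Versions the README says the code was tested with.
TESTED = {"torch": "1.4.0", "numpy": "1.17"}
# Assumed pip distribution names for the packages listed without a version.
UNPINNED = ["lmdb", "PyYAML", "scikit-image", "opencv-python", "tensorboard"]


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_tested(found, wanted):
    """True if `found` equals `wanted` or is a patch release of it."""
    if found is None:
        return False
    base = found.split("+")[0]  # drop local tags such as "+cu100"
    return base == wanted or base.startswith(wanted + ".")


def report():
    """Build one status line per package from the list above."""
    lines = []
    for name, wanted in TESTED.items():
        found = installed_version(name)
        status = "ok" if matches_tested(found, wanted) else "differs"
        lines.append(f"{name}: installed={found} tested={wanted} ({status})")
    for name in UNPINNED:
        lines.append(f"{name}: installed={installed_version(name)}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```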

Installation

Build PyTorch operations

  cd libs/model/ops
  python setup.py install
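
After the build, it can be worth confirming that the compiled extensions are importable before running the test scripts. The sketch below uses the module names `correlation_cuda` and `correlation_proj`, which are taken from the error logs quoted in the Comments section of this page; whether these are the complete set of extensions is an assumption, and `missing_extensions` is a hypothetical helper name.

```python
import importlib.util


def missing_extensions(names=("correlation_cuda", "correlation_proj")):
    """Return the top-level modules from `names` that Python cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_extensions()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All checked extensions are importable.")
```

If a module is reported as not importable, the build step above most likely failed or installed into a different Python environment.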

Build PnP algorithm

  cd libs/utils/lm_pnp
  mkdir build
  cd build
  cmake ..
  make all

Train and Test

Download

You can download the trained models and label files for 7scenes, Cambridge, and Scannet.

For 7scenes, you can use the prepared data below.

Chess Fire Heads Office Pumpkin Kitchen Stairs

For Cambridge landmarks, you can download image files here, and depths here.

Test

Please refer to configs/7scenes.yaml for a detailed explanation of how to set the label file path and image file path.
- 7scenes
```
python tools/video_test.py --config configs/7scenes.yaml
```
- Cambridge
```
python tools/video_test.py --config configs/cambridge.yaml
```

Train

We use a pretrained ResNet-FPN model.
```
  python tools/train_net.py
```

Comments

The link of image files of Cambridge landmarks is invalid

Hi, thanks for your interesting work.

However, the link to the Cambridge landmarks image files seems to be invalid. Could you share the image files you downloaded?

opened by zhuixunforever 5

atomicAdd error

I got an error when I run "python setup.py install". Any idea about the solution?

/home/msg/.local/lib/python3.6/site-packages/torch/include/ATen/core/TensorBody.h:303:30: note: declared here
   DeprecatedTypeProperties & type() const {
                              ^~~~
/usr/local/cuda/bin/nvcc -I/home/msg/.local/lib/python3.6/site-packages/torch/include -I/home/msg/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/msg/.local/lib/python3.6/site-packages/torch/include/TH -I/home/msg/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/local/include/python3.6m -c correlation/src/corr_proj_kernel.cu -o build/temp.linux-x86_64-3.6/correlation/src/corr_proj_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O2 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=correlation_proj -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_52,code=sm_52 -std=c++14
correlation/src/corr_proj_kernel.cu(190): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (double *, double)
          detected during instantiation of "void CorrelateDataBackward(at::PackedTensorAccessor32<scalar_t, 4UL, at::RestrictPtrTraits>, at::PackedTensorAccessor32<scalar_t, 4UL, at::RestrictPtrTraits>, at::PackedTensorAccessor32<scalar_t, 5UL, at::RestrictPtrTraits>, at::PackedTensorAccessor32<scalar_t, 4UL, at::RestrictPtrTraits>, at::PackedTensorAccessor32<scalar_t, 4UL, at::RestrictPtrTraits>, at::PackedTensorAccessor32<scalar_t, 4UL, at::RestrictPtrTraits>, at::PackedTensorAccessor32<scalar_t, 5UL, at::RestrictPtrTraits>, int, int, int, int) [with scalar_t=double]" 
(235): here

1 error detected in the compilation of "correlation/src/corr_proj_kernel.cu".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

Thanks for sharing your work.

opened by mhmtsarigul 3

No module named 'correlation_cuda'

Hello,

I tried to run "python tools/video_test.py --config configs/7scenes.yaml" in Colab and got the following error. Which CUDA version did you use? I could not find a solution.

Traceback (most recent call last):
  File "tools/video_test.py", line 14, in <module>
    from engine.launcher import *
  File "libs/engine/launcher.py", line 6, in <module>
    from model.arch.DSMNet import dsm_net
  File "libs/model/arch/DSMNet.py", line 5, in <module>
    from ..ops.correlation.modules.corr import (
  File "libs/model/ops/correlation/modules/corr.py", line 4, in <module>
    from ..functions.corr import correlation_op, correlation_proj_op
  File "libs/model/ops/correlation/functions/corr.py", line 3, in <module>
    import correlation_cuda
ModuleNotFoundError: No module named 'correlation_cuda'

Thanks for sharing your great work.

opened by mhmtsarigul 2

About the training dataset scannet

Thanks for sharing your work. I am trying to reproduce your work, but I have some questions about the training process:

I wonder whether you use the whole scannet dataset.

Since you offer the label file of scannet but the image file is not supplied, I want to confirm whether your training data is downloaded directly from the scannet's author without post-processing.

Thanks a lot.

opened by tdd233 3

About the "topk" attribute in the .bin file

Thanks for sharing your work. I am trying to use DSM on my own dataset. I have RGB images, depth images, and the corresponding camera poses. However, I noticed that the program loads the "deep_retrieve_full.bin" file and gets a "topk" attribute for every frame. I have two questions about that.

Does "topk" represent the sorted correlation values described in the "Cost volume" paragraph of Section 3.3.1 in the paper?

How can I generate the "topk" information for my own dataset?

Thanks a lot.

opened by Wajov 3

Loaded camera extrinsic is different from raw .pose.txt file

Thanks for sharing your work. I am trying to use DSM on the 7scenes dataset, but I noticed that the camera extrinsic is loaded from "deep_retrieve_full.bin" and its translation vector differs from the one in the corresponding frame-xxxxxx.pose.txt. Why is there this difference?

opened by MisEty 1

Request for Cambridge Landmarks Dataset

I am sorry to ask you for help. For urgent reasons I need the Cambridge Landmarks dataset, but the official website seems to be blocked and I could not find any other data source. I would much appreciate it if you could point me to another source for the dataset!

opened by HaroYoxy 1

There are too many cycles

https://github.com/Tangshitao/Dense-Scene-Matching/blob/4957fa3f41419c31a60ffc82e234f23b8050583f/libs/engine/launcher.py#L27 Should "self.cfg.TRAIN.train_iters" be changed to "int(self.cfg.TRAIN.iters/self.cfg.TRAIN.model_save_iters)"? Otherwise, there are too many cycles. Will we save 1600000 models?

opened by csm-coder 10

Owner

tangshitao
