Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Andy Zeng

Last update: Jan 3, 2023

Overview

Volumetric TSDF Fusion of RGB-D Images in Python

This is a lightweight python script that fuses multiple registered color and depth images into a projective truncated signed distance function (TSDF) volume, which can then be used to create high quality 3D surface meshes and point clouds. Tested on Ubuntu 16.04.
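To make the term concrete, here is a minimal sketch of a projective TSDF value for voxels along one camera ray. The function name `projective_tsdf` and the sample numbers are illustrative assumptions, not taken from this repository. Clamping both sides to [-1, 1] is a simplification; implementations commonly skip voxels far behind the observed surface rather than clamping them.

```python
import numpy as np

def projective_tsdf(surface_depth, voxel_depth, trunc_margin):
    """Truncated signed distance of voxels along a camera ray.

    Positive in front of the observed surface, negative behind it,
    scaled by the truncation margin and clamped to [-1, 1].
    """
    sdf = surface_depth - voxel_depth  # metres, measured along the viewing ray
    return np.clip(sdf / trunc_margin, -1.0, 1.0)

# Hypothetical example: a surface observed at 1.0 m, voxels sampled around it.
voxel_depths = np.array([0.90, 0.98, 1.00, 1.02, 1.10])
tsdf = projective_tsdf(1.00, voxel_depths, trunc_margin=0.10)
```

The zero crossing of these values marks the surface, which is what a mesh extraction step later looks for.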

An older CUDA/C++ version can be found here.

Requirements

Python 2.7+ with NumPy, OpenCV, scikit-image and Numba. These can be quickly installed/updated by running the following:
```
pip install --user numpy opencv-python scikit-image numba
```
[Optional] GPU acceleration requires an NVIDIA GPU with CUDA and PyCUDA:
```
pip install --user pycuda
```

Demo

This demo fuses 1000 RGB-D images from the 7-scenes dataset into a 405 x 264 x 289 projective TSDF voxel volume with 2cm resolution at about 30 FPS in GPU mode (0.4 FPS in CPU mode), and outputs a 3D mesh mesh.ply which can be visualized with a 3D viewer like Meshlab.
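As a rough worked example of what those numbers imply, the sketch below derives the physical extent and voxel count from the stated grid size and resolution. The 4 bytes per voxel figure assumes one float32 value per voxel, and `voxel_count_at` is a hypothetical helper; neither is taken from the repository.

```python
dims = (405, 264, 289)   # voxels per axis, as stated in the demo description
voxel_size = 0.02        # metres (2 cm)

# Physical extent covered by the volume, in metres (about 8.1 x 5.28 x 5.78).
extent_m = tuple(n * voxel_size for n in dims)

# Total voxel count and the memory of one float32 grid (assumed layout).
n_voxels = dims[0] * dims[1] * dims[2]
bytes_per_float32_grid = n_voxels * 4

def voxel_count_at(new_voxel_size):
    """Approximate voxel count if the same extent is resampled."""
    scale = voxel_size / new_voxel_size
    return round(n_voxels * scale ** 3)
```

Halving the voxel size to 1 cm multiplies the voxel count by eight, so finer resolution comes at a steep memory cost.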

Note: color images are saved as 24-bit PNG RGB, depth images are saved as 16-bit PNG in millimeters.

```
python demo.py
```
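Given the depth encoding noted above (16-bit values in millimeters), a loader typically converts to meters before fusion. The sketch below shows one way to do that. The function name is hypothetical, and treating 65535 as an invalid-depth marker is an assumption about the 7-scenes data rather than something stated here.

```python
import numpy as np

INVALID_DEPTH_RAW = 65535  # assumed marker for missing depth in the raw PNG

def depth_mm_to_metres(depth_raw):
    """Convert a uint16 depth image in millimetres to float32 metres."""
    depth_raw = np.asarray(depth_raw)
    depth_m = depth_raw.astype(np.float32) / 1000.0
    depth_m[depth_raw == INVALID_DEPTH_RAW] = 0.0  # 0 means "no measurement"
    return depth_m

# Hypothetical 1x3 image: no reading, a 1.5 m reading, and an invalid pixel.
sample = np.array([[0, 1500, 65535]], dtype=np.uint16)
sample_m = depth_mm_to_metres(sample)
```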

Citing

This repository is a part of 3DMatch Toolbox. If you find this code useful in your work, please consider citing:

```
@inproceedings{zeng20163dmatch,
    title={3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions},
    author={Zeng, Andy and Song, Shuran and Nie{\ss}ner, Matthias and Fisher, Matthew and Xiao, Jianxiong and Funkhouser, Thomas},
    booktitle={CVPR},
    year={2017}
}
```

Comments

how to get a single layer mesh?

Hi, thanks for the implementation. It seems the generated point cloud has double layers for every surface. There is some kind of thickness related to _trunc_margin. Setting self._trunc_margin = self._voxel_size leads to many holes. How can I get a single layer mesh without thickness?

Marching cubes algorithm generates fine but irregular faces. In order to get smooth and regular point faces maybe some more post-processing work needs to be done. I will try poisson surface reconstruction later.

opened by plutoyuxie 3

demo.py is throwing an error after initial pull

```
python demo.py
Traceback (most recent call last):
  File "demo.py", line 9, in
    import fusion
  File "tsdf-fusion-python/fusion.py", line 346
    xyz_t_h = (transform @ xyz_h.T).T
                         ^
SyntaxError: invalid syntax
```
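The `@` matrix multiplication operator was introduced in Python 3.5, so older interpreters reject that line as a syntax error. A minimal sketch of an equivalent that also parses on older versions uses `np.dot`. The function below is an illustrative stand-in, not the repository's code.

```python
import numpy as np

def rigid_transform(xyz, transform):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    ones = np.ones((len(xyz), 1), dtype=np.float64)
    xyz_h = np.hstack([xyz, ones])            # homogeneous coordinates, (N, 4)
    xyz_t_h = np.dot(transform, xyz_h.T).T    # same result as (transform @ xyz_h.T).T
    return xyz_t_h[:, :3]

# Hypothetical check: a pure translation by (1, 2, 3).
shift = np.eye(4)
shift[:3, 3] = [1.0, 2.0, 3.0]
moved = rigid_transform(np.zeros((2, 3)), shift)
```

Running the script with a Python 3.5 or newer interpreter avoids the error without any code change.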

opened by Codeguyross 3

False Alignment / Camera Pose / ROBI Dataset

I am trying to apply this TSDF fusion to the ROBI dataset, but I have trouble with the alignment and integration of the frames. The plane alignment seems to be right, but I still see some offset in the plane.

It looks like the camera pose is not right, but I already used the same dataset/camera pose with TSDF from Open3D which worked flawlessly.

Does anyone have an idea why it's not working correctly?

TSDF Volume from Open3D:

TSDF Volume/Mesh from this repo:

opened by Bastiiiiii 2

How to interpret the depth map here?

I visualized the depth map and it showed something like below:

Does anybody know why it is different from common depth maps, such as those from Kinect, with all these waves at the top of the view? Thanks.

opened by HaFred 2

fix weight bug

Previous code only works with the default obs_weight=1!

Before pull request (I only run fusion on the first 100 images):

obs_weight=1 (Default):

obs_weight=0.1:

After pull request:

obs_weight=0.1: same as the first figure.
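For context, TSDF fusion is usually formulated as a weighted running average per voxel. The sketch below shows that standard update under the assumption that every observation carries the same `obs_weight`; the names are illustrative and this is not the patched code itself. With this formulation, a uniform weight of 0.1 and a uniform weight of 1 produce the same fused values, which matches the behavior described after the pull request.

```python
import numpy as np

def integrate_tsdf(tsdf_old, w_old, dist, obs_weight):
    """Weighted running average of truncated distances for each voxel."""
    w_new = w_old + obs_weight
    tsdf_new = (w_old * tsdf_old + obs_weight * dist) / w_new
    return tsdf_new, w_new

def fuse(observations, obs_weight):
    """Fuse a list of per-voxel distance arrays, starting from zero weight."""
    tsdf = np.ones_like(observations[0])
    weight = np.zeros_like(observations[0])
    for dist in observations:
        tsdf, weight = integrate_tsdf(tsdf, weight, dist, obs_weight)
    return tsdf

# Hypothetical observations of three voxels over two frames.
frames = [np.array([0.2, -0.4, 1.0]), np.array([0.4, -0.2, 0.0])]
fused_w1 = fuse(frames, obs_weight=1.0)
fused_w01 = fuse(frames, obs_weight=0.1)
```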
opened by yenchenlin 2

Fix voxels behind camera

Previously valid_pix did not check to make sure that the voxel was in front of the camera. This caused problems in the integration (it did not manifest in the provided example, but does in general).
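The reason this matters is that a pinhole projection can map a point behind the camera to pixel coordinates that still fall inside the image. The sketch below illustrates this with a depth check added to the bounds test. The function names, intrinsics, and sample points are assumptions for illustration, not the repository's code.

```python
import numpy as np

def project_to_pixels(cam_pts, fx, fy, cx, cy):
    """Pinhole projection of (N, 3) camera-frame points to pixel coordinates."""
    pix_z = cam_pts[:, 2]
    pix_x = np.round(cam_pts[:, 0] * fx / pix_z + cx).astype(int)
    pix_y = np.round(cam_pts[:, 1] * fy / pix_z + cy).astype(int)
    return pix_x, pix_y, pix_z

def in_image(pix_x, pix_y, im_w, im_h):
    """Bounds check only: does the projected pixel land inside the image?"""
    return (pix_x >= 0) & (pix_x < im_w) & (pix_y >= 0) & (pix_y < im_h)

# One point in front of the camera (z = +1) and its mirror behind it (z = -1).
pts = np.array([[0.1, 0.0, 1.0], [0.1, 0.0, -1.0]])
px, py, pz = project_to_pixels(pts, fx=500.0, fy=500.0, cx=320.0, cy=240.0)

bounds_only = in_image(px, py, im_w=640, im_h=480)
valid_pix = bounds_only & (pz > 0)  # additionally require positive depth
```

Here both points pass the bounds check, but only the first one survives once positive depth is required.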

opened by zmurez 1

No Licence

Hey @andyzeng,

First off thanks for putting this code up here, it's been a really handy reference for getting my head around TSDFs.

I was hoping to use this code in a project I'm working on, but I noticed that you haven't specified a license for this repo. If you didn't intend for this to be open source then no problem, that's your prerogative, but if you did, any chance you could add a Licence.md? I'm sure you've been through this kind of thing before, but just in case, I found this to be a handy reference when I was choosing a license for a few of my repos: https://choosealicense.com/

Anyway, thanks again!

opened by thomascent 1

Correctly works on gpu and cpu now

Hi Andy,

I forgot to un-indent the color constant self._b_const which meant it threw an error when using a gpu. I've now fixed it and tested that it runs correctly both on the cpu and gpu.

I've also added a method for extracting a point cloud from the voxel volume and saving it to disk as a ply file.

Cheers :)

opened by kevinzakka 0

The reconstruction results of each input cannot overlap

I used Blender to obtain RGB-D images along with the camera intrinsic and extrinsic parameters, but the reconstruction results were incorrect. The reconstructions from the individual images were separated and did not overlap properly. The following figure shows the reconstruction results for three groups of data.

What causes this? I hope you can answer it. Thank you.
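One possible cause, offered as an assumption since the data is not shown, is a camera axis convention mismatch. Blender cameras look along their local -Z axis with +Y up, whereas the OpenCV-style pinhole convention has the camera looking along +Z with +Y down. A minimal sketch of the conversion for a camera-to-world pose follows; the function name is hypothetical.

```python
import numpy as np

# Flips the camera's local Y and Z axes (Blender convention to OpenCV convention).
BLENDER_TO_OPENCV = np.diag([1.0, -1.0, -1.0, 1.0])

def blender_pose_to_opencv(cam_to_world_blender):
    """Convert a 4x4 camera-to-world pose from Blender to OpenCV axes."""
    return np.dot(cam_to_world_blender, BLENDER_TO_OPENCV)

# A Blender camera with identity pose looks down the world -Z axis.
pose_cv = blender_pose_to_opencv(np.eye(4))
viewing_dir_world = pose_cv[:3, 2]  # camera +Z axis expressed in world frame
```

If the poses are world-to-camera rather than camera-to-world, they would need to be inverted first, and depth rendered as distance along the ray rather than along the optical axis would need a separate correction.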

opened by jly0810 1

ValueError: too many values to unpack (expected 2)

When I run my own dataset, I get an error that I don't understand. What should I do?

```
Initializing voxel volume...
Voxel volume size: 62 x 39 x 54 - # points: 130,572
Fusing frame 1/228
Traceback (most recent call last):
  File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/wang/tsdf-fusion-python-master/demo.py", line 55, in <module>
    tsdf_vol.integrate(color_image, depth_im, cam_intr, cam_pose, obs_weight=1.)
  File "/home/wang/tsdf-fusion-python-master/fusion.py", line 218, in integrate
    im_h, im_w = depth_im.shape
ValueError: too many values to unpack (expected 2)
```
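The failing line unpacks `depth_im.shape` into two values, so the depth array here most likely has three dimensions, for example (H, W, 3). One common way this happens is loading the depth file as a color image: OpenCV's `cv2.imread` returns a 3-channel 8-bit image by default unless a flag such as `cv2.IMREAD_UNCHANGED` is passed. The sketch below is a hypothetical guard, not part of the repository, that accepts a multi-channel array only when all channels are identical.

```python
import numpy as np

def as_depth_map(depth_im):
    """Return a 2-D (H, W) depth map, or raise with a clearer message."""
    depth_im = np.asarray(depth_im)
    if depth_im.ndim == 2:
        return depth_im
    if depth_im.ndim == 3 and np.all(depth_im == depth_im[..., :1]):
        # All channels identical: a single-channel image expanded to several.
        return depth_im[..., 0]
    raise ValueError("expected an (H, W) depth map, got shape %s" % (depth_im.shape,))

# Hypothetical depth image that was accidentally expanded to three channels.
depth_2d = np.array([[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]])
depth_3ch = np.dstack([depth_2d, depth_2d, depth_2d])
recovered = as_depth_map(depth_3ch)
```

Dropping channels only repairs the shape. If the loader also converted 16-bit depth to 8 bits, the original precision is lost and the file should be reloaded unchanged.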

opened by HquGhy 0

Surface level must be within volume data range.

Hello Andy, thank you for your wonderful contribution.

I tried a different dataset and changed voxel_size to 0.1 to avoid memory problems, but I get this error:

ValueError: Surface level must be within volume data range.

from the function

measure.marching_cubes_lewiner(tsdf_vol, level=0)

at

tsdf_vol.get_point_cloud()

thank you for your time
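scikit-image raises this error when the requested level lies outside the minimum and maximum of the volume, which for `level=0` means the TSDF never changes sign, so no surface was captured. A small pre-check, sketched below with a hypothetical helper name, makes that condition explicit before calling marching cubes. Typical causes to investigate (assumptions, since the data is not shown) are depth units not in meters, camera poses that place the volume away from the observed surfaces, or volume bounds that miss the scene.

```python
import numpy as np

def has_surface(tsdf_vol, level=0.0):
    """True if the iso-level lies within the value range of the volume."""
    return bool(tsdf_vol.min() <= level <= tsdf_vol.max())

# A volume that was never updated (all ones) has no zero crossing.
empty_vol = np.ones((4, 4, 4), dtype=np.float32)

# A volume with values on both sides of zero does.
fused_vol = empty_vol.copy()
fused_vol[2, :, :] = -0.5
```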

opened by Alihamdy2496 3

About 2cm resolution?

Thank you very much for your nice repo. How can I increase the resolution? According to your documentation: "from the 7-scenes dataset into a 405 x 264 x 289 projective TSDF voxel volume with 2cm resolution at about 30 FPS in GPU mode"

Thanks in advance

opened by AI-ML-Enthusiast 0

About rigid transformation matrix?

Thanks for your good repo. Could you describe the notation of your 4x4 rigid transformation matrix? It would make it easier to understand what its entries are. Or is there a link that explains it?
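As general background, a 4x4 rigid transformation in homogeneous coordinates holds a 3x3 rotation R in its upper-left block, a translation t in its last column, and a bottom row of (0, 0, 0, 1). The sketch below builds and inverts such a matrix. The helper names are hypothetical, and whether a given pose file stores camera-to-world or world-to-camera is an assumption that should be checked against the dataset.

```python
import numpy as np

def make_pose(R, t):
    """Assemble a 4x4 rigid transform [[R, t], [0, 0, 0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_pose(T):
    """Inverse of a rigid transform: rotation R^T, translation -R^T t."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -np.dot(R.T, t)
    return T_inv

# Hypothetical pose: 90 degree rotation about Z, then a shift of 1 m along X.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
pose = make_pose(R_z90, [1.0, 0.0, 0.0])

# Map the point (1, 0, 0) through the pose using homogeneous coordinates.
mapped = np.dot(pose, np.array([1.0, 0.0, 0.0, 1.0]))[:3]
```

If the matrix is camera-to-world, its last column is the camera position in world coordinates and its inverse maps world points into the camera frame.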

opened by AI-ML-Enthusiast 1

Owner

Andy Zeng

Research Scientist in Robotics at Google Brain working on AI

GitHub http://andyzeng.github.io/
