Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Overview

Volumetric TSDF Fusion of RGB-D Images in Python

This is a lightweight Python script that fuses multiple registered color and depth images into a projective truncated signed distance function (TSDF) volume, which can then be used to create high-quality 3D surface meshes and point clouds. Tested on Ubuntu 16.04.
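
To make "projective truncated signed distance" concrete, the sketch below computes the value that a single depth observation would assign to voxels along one camera ray. It is a minimal illustration and not the script's own code: the function name `truncated_sdf` and the parameter `trunc_margin` are assumptions chosen for this example.

```python
import numpy as np

def truncated_sdf(depth_obs, voxel_z, trunc_margin):
    """Projective TSDF values for voxels at camera-space depths voxel_z
    along a ray whose observed surface depth is depth_obs (same units)."""
    voxel_z = np.asarray(voxel_z, dtype=float)
    # Signed distance along the ray: positive in front of the surface.
    diff = depth_obs - voxel_z
    # Normalize by the truncation margin and clamp to [-1, 1].
    tsdf = np.clip(diff / trunc_margin, -1.0, 1.0)
    # Voxels more than one margin behind the surface are unobserved.
    valid = diff >= -trunc_margin
    return tsdf, valid

# Hypothetical ray: surface observed at 1.0 m, truncation margin of 10 cm.
tsdf, valid = truncated_sdf(1.0, [0.5, 0.95, 1.0, 1.05, 1.5], trunc_margin=0.1)
```

The surface sits where the value crosses zero. Voxels well in front of it saturate at 1, and voxels far behind it are skipped rather than updated.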

An older CUDA/C++ version can be found here.

Requirements

  • Python 2.7+ with NumPy, OpenCV, Scikit-image and Numba. These can be quickly installed/updated by running the following:
    pip install --user numpy opencv-python scikit-image numba
  • [Optional] GPU acceleration requires an NVIDIA GPU with CUDA and PyCUDA:
    pip install --user pycuda

Demo

This demo fuses 1000 RGB-D images from the 7-scenes dataset into a 405 x 264 x 289 projective TSDF voxel volume with 2cm resolution at about 30 FPS in GPU mode (0.4 FPS in CPU mode), and outputs a 3D mesh mesh.ply which can be visualized with a 3D viewer like Meshlab.
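
The numbers in the demo description imply the physical size and memory footprint of the volume. The arithmetic below is a rough sketch: the float32 storage per voxel is an assumption, and `dims_at` is a hypothetical helper that shows how the grid would scale at a different voxel size.

```python
dims = (405, 264, 289)   # voxels per axis, from the demo description
voxel_size = 0.02        # meters (2 cm)

# Physical extent covered by the volume, in meters.
extent_m = tuple(round(d * voxel_size, 2) for d in dims)

# Total voxel count and an estimate of the TSDF grid size.
num_voxels = dims[0] * dims[1] * dims[2]
tsdf_mb = num_voxels * 4 / 1e6   # assumption: one float32 per voxel

def dims_at(voxel_size_new):
    """Approximate voxels per axis if the same extent is sampled at a new voxel size."""
    return tuple(int(round(e / voxel_size_new)) for e in extent_m)

finer = dims_at(0.01)
```

Halving the voxel size doubles the count along each axis, so the total number of voxels (and the memory needed) grows by a factor of eight.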

Note: color images are saved as 24-bit PNG RGB, depth images are saved as 16-bit PNG in millimeters.

python demo.py
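
The note above on image formats means that raw depth values have to be converted to metric units before fusion. A minimal sketch of that conversion follows. The helper name `depth_png_to_meters` is hypothetical, and treating 65535 as "no measurement" is an assumption about the dataset, so check it against your own data.

```python
import numpy as np

def depth_png_to_meters(depth_raw, invalid_value=65535):
    """Convert a 16-bit depth image in millimeters to float32 meters."""
    depth_raw = np.asarray(depth_raw)
    depth_m = depth_raw.astype(np.float32) / 1000.0
    # Assumption: invalid_value marks missing depth; store it as 0.
    depth_m[depth_raw == invalid_value] = 0.0
    return depth_m

# Stand-in for an image read with cv2.imread(path, cv2.IMREAD_UNCHANGED).
raw = np.array([[0, 1500], [65535, 2000]], dtype=np.uint16)
depth_m = depth_png_to_meters(raw)
```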

Citing

This repository is a part of 3DMatch Toolbox. If you find this code useful in your work, please consider citing:

@inproceedings{zeng20163dmatch,
    title={3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions},
    author={Zeng, Andy and Song, Shuran and Nie{\ss}ner, Matthias and Fisher, Matthew and Xiao, Jianxiong and Funkhouser, Thomas},
    booktitle={CVPR},
    year={2017}
}
Comments
  • how to get a single layer mesh?

    Hi, thanks for the implementation. It seems the generated point cloud has two layers for every surface, with a thickness related to _trunc_margin. Setting self._trunc_margin = self._voxel_size leads to many holes. How can I get a single-layer mesh without thickness?

    The marching cubes algorithm generates fine but irregular faces. Getting smooth, regular faces may need some more post-processing. I will try Poisson surface reconstruction later.

    opened by plutoyuxie 3
  • demo.py is throwing an error after initial pull

    python demo.py
    Traceback (most recent call last):
      File "demo.py", line 9, in <module>
        import fusion
      File "tsdf-fusion-python/fusion.py", line 346
        xyz_t_h = (transform @ xyz_h.T).T
                             ^
    SyntaxError: invalid syntax
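
The `@` matrix-multiplication operator was added in Python 3.5, so older interpreters such as Python 2.7 reject this line with exactly this SyntaxError. Where upgrading is not an option, `np.dot` gives the same result for 2D arrays. The sketch below reuses the variable names from the traceback, but the data is made up for illustration.

```python
import numpy as np

# Made-up data: a 4x4 transform with translation only, and two homogeneous points.
transform = np.eye(4)
transform[:3, 3] = [1.0, 2.0, 3.0]
xyz_h = np.array([[0.0, 0.0, 0.0, 1.0],
                  [1.0, 1.0, 1.0, 1.0]])

# Equivalent to (transform @ xyz_h.T).T, but also valid syntax before Python 3.5.
xyz_t_h = np.dot(transform, xyz_h.T).T
```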
    
    opened by Codeguyross 3
  • False Alignment / Camera Pose / ROBI Dataset

    I am trying to apply this TSDF fusion to the ROBI dataset, but I have trouble with the alignment of the frames during integration. The plane alignment seems to be right, but there is still some offset in the plane.

    It looks like the camera pose is not right, but I already used the same dataset and camera poses with the TSDF integration from Open3D, which worked flawlessly.

    Does anyone have an idea why it's not working correctly?

    TSDF Volume from Open3D: Screenshot from 2021-11-03 13-37-28

    TSDF Volume/Mesh from this repo: Screenshot from 2021-11-04 15-15-25

    opened by Bastiiiiii 2
  • How to interpret the depth map here?

    I visualized the depth map and it looked like the image below: (image)

    Does anybody know why it is different from common depth maps, such as those from a Kinect, with all these waves at the top of the view? (image) Thanks.

    opened by HaFred 2
  • fix weight bug

    The previous code only works with the default obs_weight=1!

    Before the pull request (I only ran fusion on the first 100 images):

    • obs_weight=1 (Default):
    Screen Shot 2020-01-06 at 2 21 15 AM
    • obs_weight=0.1:
    Screen Shot 2020-01-06 at 2 23 03 AM

    After pull request:

    • obs_weight=0.1: same as the first figure.
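
The pull request's diff is not shown here. For context, the sketch below is the standard weighted running average used in TSDF fusion, written so that `obs_weight` scales both the new observation and the accumulated weight. The function name and signature are assumptions for this example and may differ from the actual fix.

```python
import numpy as np

def integrate_tsdf(tsdf_old, w_old, dist, obs_weight=1.0):
    """Fuse one truncated distance observation into a running weighted average."""
    w_new = w_old + obs_weight
    tsdf_new = (w_old * tsdf_old + obs_weight * dist) / w_new
    return tsdf_new, w_new

def fuse_all(observations, obs_weight):
    # Start from an 'empty' voxel: value 1 with zero accumulated weight.
    tsdf, w = 1.0, 0.0
    for d in observations:
        tsdf, w = integrate_tsdf(tsdf, w, d, obs_weight)
    return tsdf

obs = [0.2, 0.4, 0.6]
fused_w1 = fuse_all(obs, obs_weight=1.0)
fused_w01 = fuse_all(obs, obs_weight=0.1)
```

With a uniform weight per frame, the fused value is the plain mean of the observations whatever the weight's magnitude, which is consistent with the report that obs_weight=0.1 gives the same result as the default after the fix.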
    opened by yenchenlin 2
  • Fix voxels behind camera

    Previously valid_pix did not check to make sure that the voxel was in front of the camera. This caused problems in the integration (it did not manifest in the provided example, but does in general).
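
A sketch of the kind of check described above: project camera-space points with a pinhole model and only accept pixels whose point has positive depth. The name `valid_pix` comes from the description; the function `project_and_mask`, the intrinsics, and the sample points are hypothetical.

```python
import numpy as np

def project_and_mask(cam_pts, cam_intr, im_h, im_w):
    """Project Nx3 camera-space points to pixels and flag the usable ones."""
    cam_pts = np.asarray(cam_pts, dtype=float)
    fx, fy = cam_intr[0, 0], cam_intr[1, 1]
    cx, cy = cam_intr[0, 2], cam_intr[1, 2]
    z = cam_pts[:, 2]
    in_front = z > 0
    # Use a dummy depth for points behind the camera to avoid bad divisions.
    safe_z = np.where(in_front, z, 1.0)
    pix_x = np.round(cam_pts[:, 0] * fx / safe_z + cx).astype(int)
    pix_y = np.round(cam_pts[:, 1] * fy / safe_z + cy).astype(int)
    valid_pix = (in_front
                 & (pix_x >= 0) & (pix_x < im_w)
                 & (pix_y >= 0) & (pix_y < im_h))
    return pix_x, pix_y, valid_pix

# Hypothetical intrinsics for a 640 x 480 image.
cam_intr = np.array([[500.0, 0.0, 320.0],
                     [0.0, 500.0, 240.0],
                     [0.0, 0.0, 1.0]])
pts = [[0.0, 0.0, 1.0],    # in front of the camera, projects to the image center
       [0.0, 0.0, -1.0],   # behind the camera
       [10.0, 0.0, 1.0]]   # in front, but outside the image
pix_x, pix_y, valid_pix = project_and_mask(pts, cam_intr, 480, 640)
```

Without the depth test, the second point would also project to the image center and be integrated as if it had been observed.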

    opened by zmurez 1
  • No Licence

    Hey @andyzeng,

    First off thanks for putting this code up here, it's been a really handy reference for getting my head around TSDFs.

    I was hoping to use this code in a project I'm working on but I noticed that you haven't specified a license for this repo. If you didn't intend for this to be open source then no problem, that's your prerogative, but if not, any chance you could add a Licence.md? I'm sure you've been through this kind of thing before, but just in case, I found this to be a handy reference when I was choosing a license for a few of my repos: https://choosealicense.com/

    Anyway, thanks again!

    opened by thomascent 1
  • Correctly works on gpu and cpu now

    Hi Andy,

    I forgot to un-indent the color constant self._b_const which meant it threw an error when using a gpu. I've now fixed it and tested that it runs correctly both on the cpu and gpu.

    I've also added a method for extracting a point cloud from the voxel volume and saving it to disk as a ply file.

    Cheers :)
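
For readers unfamiliar with the format, a colored point cloud can be stored as an ASCII PLY file with a short header followed by one line per point. The writer below is a generic sketch of that format. It is not the method added in this pull request, and the function name `write_ply_points` is hypothetical.

```python
import os
import tempfile

import numpy as np

def write_ply_points(path, xyz, rgb):
    """Write Nx3 points and Nx3 uint8 colors as an ASCII PLY point cloud."""
    xyz = np.asarray(xyz, dtype=float)
    rgb = np.asarray(rgb, dtype=np.uint8)
    header = "\n".join([
        "ply",
        "format ascii 1.0",
        "element vertex %d" % xyz.shape[0],
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ])
    with open(path, "w") as f:
        f.write(header + "\n")
        for (x, y, z), (r, g, b) in zip(xyz, rgb):
            f.write("%f %f %f %d %d %d\n" % (x, y, z, r, g, b))

# Write two made-up points to a temporary file and read the file back.
path = os.path.join(tempfile.mkdtemp(), "points.ply")
write_ply_points(path, [[0, 0, 0], [0.1, 0.2, 0.3]], [[255, 0, 0], [0, 255, 0]])
with open(path) as f:
    lines = f.read().splitlines()
```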

    opened by kevinzakka 0
  • The reconstruction results of each input cannot overlap

    I used Blender to obtain RGB-D images and the camera intrinsic and extrinsic parameters, but the reconstruction results are incorrect. The reconstructions from the individual images are separated and do not overlap well. The following figure shows the reconstruction results for three groups of data.

    What causes this? I hope you can help. Thank you. (image)

    opened by jly0810 1
  • ValueError: too many values to unpack (expected 2)

    When I run the code on my own dataset, I get an error that I don't understand. What should I do?

    Initializing voxel volume...
    Voxel volume size: 62 x 39 x 54 - # points: 130,572
    Fusing frame 1/228
    Traceback (most recent call last):
      File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/pydevd.py", line 1496, in _exec
        pydev_imports.execfile(file, globals, locals)  # execute the script
      File "/snap/pycharm-professional/306/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "/home/wang/tsdf-fusion-python-master/demo.py", line 55, in <module>
        tsdf_vol.integrate(color_image, depth_im, cam_intr, cam_pose, obs_weight=1.)
      File "/home/wang/tsdf-fusion-python-master/fusion.py", line 218, in integrate
        im_h, im_w = depth_im.shape
    ValueError: too many values to unpack (expected 2)
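
The failing line unpacks `depth_im.shape` into two values, so it expects a 2D array of shape (height, width). One possible cause, which is an assumption since the dataset is not shown, is a depth image that was loaded with a channel axis. A sketch of a guard for that case, using a hypothetical helper name:

```python
import numpy as np

def to_single_channel(depth_im):
    """Return a 2D depth array, dropping a redundant channel axis if present."""
    depth_im = np.asarray(depth_im)
    if depth_im.ndim == 2:
        return depth_im
    if depth_im.ndim == 3:
        # Assumption: all channels carry the same depth value; keep the first.
        return depth_im[:, :, 0]
    raise ValueError("unexpected depth image shape: %s" % (depth_im.shape,))

# Stand-in for a depth image that was read as three channels.
depth_3ch = np.zeros((480, 640, 3), dtype=np.uint16)
im_h, im_w = to_single_channel(depth_3ch).shape
```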

    opened by HquGhy 0
  • Surface level must be within volume data range.

    Hello Andy, thank you for your wonderful contribution.

    I tried a different dataset and changed voxel_size to 0.1 to avoid memory problems, but I get this error:

    ValueError: Surface level must be within volume data range.

    from the function

    measure.marching_cubes_lewiner(tsdf_vol, level=0)

    at

    tsdf_vol.get_point_cloud()

    thank you for your time
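
This error is raised when the requested level lies outside the range of values in the volume. For a TSDF, that typically means no voxel ever received a negative (behind-surface) value. The check below is a hedged diagnostic sketch: the helper name is hypothetical, and initializing the empty volume to 1 is an assumption.

```python
import numpy as np

def crosses_level(tsdf_vol, level=0.0):
    """True if the volume's value range contains the requested iso-level."""
    tsdf_vol = np.asarray(tsdf_vol)
    return bool(tsdf_vol.min() <= level <= tsdf_vol.max())

# Assumption: an untouched volume is initialized to 1 everywhere.
empty = np.ones((4, 4, 4), dtype=np.float32)
with_surface = empty.copy()
with_surface[:, :, 2:] = -1.0   # half of the voxels lie behind a surface

empty_ok = crosses_level(empty)
surface_ok = crosses_level(with_surface)
```

If the check fails on real data, it is worth verifying that the depth units, camera poses, and volume bounds actually place observed surfaces inside the volume.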

    opened by Alihamdy2496 3
  • About 2cm resolution?

    Thank you very much for your nice repo. How can I increase the resolution? According to your documentation: "from the 7-scenes dataset into a 405 x 264 x 289 projective TSDF voxel volume with 2cm resolution at about 30 FPS in GPU mode"

    Thanks in advance

    opened by AI-ML-Enthusiast 0
  • About rigid transformation matrix?

    Thanks for your good repo. Could you provide the notation for your 4x4 rigid transformation matrix? That would make it easier to understand what the entries are. Or is there a link that explains it?
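
For reference, a 4x4 rigid transformation in homogeneous coordinates is conventionally laid out as a 3x3 rotation R in the upper-left block, a translation t in the last column, and a bottom row of (0, 0, 0, 1). Whether the poses in this repo map camera coordinates to world coordinates or the reverse is not stated in this thread, so the direction in the sketch below is an assumption, and the helper names are hypothetical.

```python
import numpy as np

def make_pose(R, t):
    """Assemble a 4x4 rigid transform from a 3x3 rotation and a translation."""
    pose = np.eye(4)
    pose[:3, :3] = R
    pose[:3, 3] = t
    return pose

def invert_pose(pose):
    """Invert a rigid transform using R^T and -R^T t."""
    R, t = pose[:3, :3], pose[:3, 3]
    inv = np.eye(4)
    inv[:3, :3] = R.T
    inv[:3, 3] = -np.dot(R.T, t)
    return inv

# 90-degree rotation about the z axis, followed by a shift of 1 along x.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
pose = make_pose(R, [1.0, 0.0, 0.0])

p_cam = np.array([1.0, 0.0, 0.0, 1.0])    # homogeneous point
p_world = np.dot(pose, p_cam)             # assumed camera-to-world direction
p_back = np.dot(invert_pose(pose), p_world)
```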

    opened by AI-ML-Enthusiast 1
Owner
Andy Zeng
Research Scientist in Robotics at Google Brain working on AI