Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Overview

RfD-Net [Project Page] [Paper] [Video]

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner
In CVPR, 2021.

[Figure: points.png | pred.png]

From an incomplete point cloud of a 3D scene (left), our method learns to jointly understand the 3D objects and reconstruct instance meshes as the output (right).


Install

  1. This implementation uses Python 3.6, PyTorch 1.7.1 and cudatoolkit 11.0. We recommend using conda to deploy the environment.

    • Install with conda:
    conda env create -f environment.yml
    conda activate rfdnet
    
    • Install with pip:
    pip install -r requirements.txt
    
  2. Next, compile the external libraries by

    python setup.py build_ext --inplace
    
  3. Install PointNet++ by

    cd external/pointnet2_ops_lib
    pip install .
    

Demo

The pretrained model can be downloaded here. Put the pretrained model at the path below.

out/pretrained_models/pretrained_weight.pth

Run the demo below to see how our method works.

cd RfDNet
python main.py --config configs/config_files/ISCNet_test.yaml --mode demo --demo_path demo/inputs/scene0549_00.off

VTK is used here to visualize the 3D scenes. The outputs will be saved under 'demo/outputs'. You can also feed your own scenes to this script.

If everything goes smoothly, a GUI window will pop up and you can interact with the scene as below.

[Screenshot: screenshot_demo.png]

If you run it on a machine without an X display server, you can use the off-screen mode by setting offline=True in demo.py. The rendered image will be saved in demo/outputs/some_scene_id/pred.png.
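
For reference, off-screen rendering in VTK looks roughly like the sketch below. This is an assumption about what offline=True enables, not a copy of demo.py:

import vtk

# Minimal VTK off-screen rendering sketch (an assumption, not demo.py itself).
renderer = vtk.vtkRenderer()
render_window = vtk.vtkRenderWindow()
render_window.SetOffScreenRendering(1)  # render without an X display
render_window.AddRenderer(renderer)
render_window.Render()

# Grab the framebuffer and save it as a PNG.
to_image = vtk.vtkWindowToImageFilter()
to_image.SetInput(render_window)
to_image.Update()

writer = vtk.vtkPNGWriter()
writer.SetFileName('demo/outputs/some_scene_id/pred.png')
writer.SetInputConnection(to_image.GetOutputPort())
writer.Write()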


Prepare Data

In our paper, we use the input point cloud from the ScanNet dataset and the annotated instance CAD models from the Scan2CAD dataset. Scan2CAD aligns object CAD models from ShapeNetCore.v2 to each object in ScanNet, and we use these aligned CAD models as the ground truth.

Preprocess ScanNet and Scan2CAD data

You can either directly download the processed samples [link] to the directory below (recommended)

datasets/scannet/processed_data/

or

  1. Ask for the ScanNet dataset and download it to
    datasets/scannet/scans
    
  2. Ask for the Scan2CAD dataset and download it to
    datasets/scannet/scan2cad_download_link
    
  3. Preprocess the ScanNet and Scan2CAD dataset for training by
    cd RfDNet
    python utils/scannet/gen_scannet_w_orientation.py
    
Preprocess ShapeNet data

You can either directly download the processed data [link] and extract them to datasets/ShapeNetv2_data/ as below

datasets/ShapeNetv2_data/point
datasets/ShapeNetv2_data/pointcloud
datasets/ShapeNetv2_data/voxel
datasets/ShapeNetv2_data/watertight_scaled_simplified

or

  1. Download ShapeNetCore.v2 to the path below

    datasets/ShapeNetCore.v2
    
  2. Process ShapeNet models into watertight meshes by

    python utils/shapenet/1_fuse_shapenetv2.py
    
  3. Sample points on ShapeNet models for training (similar to Occupancy Networks; a hedged sketch of this step appears after this list).

    python utils/shapenet/2_sample_mesh.py --resize --packbits --float16
    
  4. There are usually 100K+ points per object mesh. We simplify them to speed up our testing and visualization by

    python utils/shapenet/3_simplify_fusion.py --in_dir datasets/ShapeNetv2_data/watertight_scaled --out_dir datasets/ShapeNetv2_data/watertight_scaled_simplified
    
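The sampling step above (see 3.) roughly follows Occupancy Networks: query points are drawn around each watertight mesh and labeled by inside/outside tests. Below is a minimal Python sketch of that idea, assuming trimesh; the file path is a placeholder and the repo's script may differ in details:

import numpy as np
import trimesh

# Placeholder path; any watertight ShapeNet mesh works here.
mesh = trimesh.load('datasets/ShapeNetv2_data/watertight_scaled/model.off')

# Draw query points in a slightly padded unit cube around the mesh.
n_points = 100000
points = (np.random.rand(n_points, 3) - 0.5) * 1.1

# Occupancy label: True if the point lies inside the watertight mesh.
occupancies = mesh.contains(points)

# --float16 halves the storage of the coordinates; --packbits stores each
# binary occupancy as one bit instead of one byte.
np.savez('points.npz',
         points=points.astype(np.float16),
         occupancies=np.packbits(occupancies))
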
Verify preprocessed data

After preprocessing the data, you can run the visualization script below to check whether it was generated correctly.

  • Visualize ScanNet+Scan2CAD+ShapeNet samples by

    python utils/scannet/visualization/vis_gt.py
    

    A VTK window will pop up like the one below.

    [Screenshot: verify.png]

Training, Generating and Evaluation

We use the configuration file (see 'configs/config_files/****.yaml') to fully control the training/testing/generating process. You can check a template at configs/config_files/ISCNet.yaml.
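
Since every run is driven by such a YAML file, it can help to inspect one before launching. A minimal sketch with PyYAML; apart from weight (which appears later in this README), the key names are assumptions:

import yaml

# Peek at the template config; key names other than 'weight' are assumptions.
with open('configs/config_files/ISCNet.yaml') as f:
    cfg = yaml.safe_load(f)
print(sorted(cfg.keys()))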

Training

We first pretrain the detection and completion modules, followed by joint refinement. You can follow the process below.

  1. Pretrain the detection module by

    python main.py --config configs/config_files/ISCNet_detection.yaml --mode train
    

    It will save the detection module weight at out/iscnet/a_folder_with_detection_module/model_best.pth

  2. Copy the weight path of detection module (see 1.) into configs/config_files/ISCNet_completion.yaml as

    weight: ['out/iscnet/a_folder_with_detection_module/model_best.pth']
    

    Then pretrain the completion module by

    python main.py --config configs/config_files/ISCNet_completion.yaml --mode train
    

    It will save the completion module weight at out/iscnet/a_folder_with_completion_module/model_best.pth

  3. Copy the weight path of completion module (see 2.) into configs/config_files/ISCNet.yaml as

    weight: ['out/iscnet/a_folder_with_completion_module/model_best.pth']
    

    Then jointly finetune RfD-Net by

    python main.py --config configs/config_files/ISCNet.yaml --mode train
    

    It will save the trained model weight at out/iscnet/a_folder_with_RfD-Net/model_best.pth
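
Under the hood, steps 2. and 3. initialize a larger model from a sub-module checkpoint. Below is a minimal sketch of that mechanism, assuming a standard PyTorch state dict; the model stand-in and the checkpoint layout are assumptions, not the repo's exact loader:

import torch
from torch import nn

net = nn.Module()  # stand-in; the real model is built from the config

checkpoint = torch.load('out/iscnet/a_folder_with_completion_module/model_best.pth',
                        map_location='cpu')
state_dict = checkpoint.get('model', checkpoint)  # unwrap if nested (an assumption)
# strict=False tolerates parameters present in only one of the two models.
missing, unexpected = net.load_state_dict(state_dict, strict=False)
print('missing keys:', len(missing), '| unexpected keys:', len(unexpected))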

Generating

Copy the weight path of RfD-Net (see 3. above) into configs/config_files/ISCNet_test.yaml as

weight: ['out/iscnet/a_folder_with_RfD-Net/model_best.pth']

Run the command below to generate outputs for all scenes in the test set.

python main.py --config configs/config_files/ISCNet_test.yaml --mode test

The 3D scenes for visualization are saved in out/iscnet/a_folder_with_generated_scenes/visualization. You can visualize a triplet of (input, pred, gt) with the demo below.

python utils/scannet/visualization/vis_for_comparison.py 

If everything goes smoothly, three windows (corresponding to input, pred, gt) will pop up in sequence:

[Screenshots: Input | Prediction | Ground-truth]

Evaluation

You can choose either of the following ways for evaluation.

  1. You can export all scenes above and calculate the evaluation metrics with any external library (for researchers who would like to unify the benchmark). Lower the dump_threshold in ISCNet_test.yaml during generation to dump more object proposals for the mAP calculation (e.g. dump_threshold=0.05).

  2. In our evaluation, we voxelize the 3D scenes to keep the resolution consistent with the baseline methods. To enable this,

    1. make sure the binvox executable is downloaded and exported as an environment variable (e.g. add its path to ~/.bashrc on Ubuntu). It will be invoked by Trimesh (see the sketch at the end of this section).

    2. Change the ISCNet_test.yaml as below for evaluation.

       test:
         evaluate_mesh_mAP: True
       generation:
         dump_results: False
    

    Run the command below to report the evaluation results.

    python main.py --config configs/config_files/ISCNet_test.yaml --mode test
    

    The log file will be saved in out/iscnet/a_folder_named_with_script_time/log.txt
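
For intuition, voxelizing a generated mesh through Trimesh's binvox backend looks roughly like the sketch below; the file name and pitch are placeholders, and the repo's evaluation code may call it differently:

import trimesh

mesh = trimesh.load('some_generated_scene.ply')  # placeholder path
# trimesh shells out to the binvox executable found on your PATH.
voxel_grid = mesh.voxelized(pitch=0.05, method='binvox')
print(voxel_grid.matrix.shape)  # boolean occupancy grid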


Differences to the paper

  1. The original paper was implemented with PyTorch 1.1.0; we have reconfigured our code to work with PyTorch 1.7.1.
  2. A post-processing step that aligns the reconstructed shapes to the input scan is supported. We have verified that it improves the evaluation performance by a small margin. You can switch it on/off following demo.py.
  3. A different learning rate scheduler is adopted: the learning rate decays to 0.1x if there is no gain within 20 steps, which is much more efficient.
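
The scheduler described in 3. matches PyTorch's ReduceLROnPlateau; a minimal sketch with placeholder parameters and a dummy validation metric:

import torch

params = [torch.nn.Parameter(torch.zeros(1))]  # placeholder parameters
optimizer = torch.optim.Adam(params, lr=1e-3)
# Decay the learning rate to 0.1x after 20 steps without improvement.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=20)

for step in range(100):
    val_loss = 1.0 / (step + 1)  # placeholder; use your validation metric
    scheduler.step(val_loss)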

Citation

If you find our work helpful, please consider citing

@inproceedings{Nie_2021_CVPR,
    title={RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction},
    author={Nie, Yinyu and Hou, Ji and Han, Xiaoguang and Nie{\ss}ner, Matthias},
    booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year={2021}
}


License

RfD-Net is released under the MIT License. See the LICENSE file for more details.

Comments
  • Different per category AP scores from the paper & potential bug in the evaluation

    Hello, thanks for the amazing work!

    I'm trying to reproduce the results with the pre-trained model, but I got quite different per category AP scores from the paper:

    |           | display | bathtub | trashbin | sofa  | chair | table | cabinet | bookshelf | mAP   |
    | --------- | ------- | ------- | -------- | ----- | ----- | ----- | ------- | --------- | ----- |
    | paper     | 26.67   | 27.57   | 23.34    | 15.71 | 12.23 | 1.92  | 14.48   | 13.39     | 16.90 |
    | reproduce | 23.13   | 15.89   | 18.00    | 41.61 | 10.13 | 0.95  | 26.35   | 9.10      | 18.14 |
    

    Besides, there seem to be a lot of false positives at conf_thresh = 0.05:

    ----------iou_thresh: 0.500000----------
    [eval mesh] table
    [eval mesh] prec = 0.0037091005431182937 (28.0/7549.0 | rec = 0.05063291139240506(28.0/553) | ap = 0.00946969696969697
    [eval mesh] chair
    [eval mesh] prec = 0.01814809908597165 (137.0/7549.0 | rec = 0.1253430924062214(137.0/1093) | ap = 0.10131491817235834
    [eval mesh] bookshelf
    [eval mesh] prec = 0.002119486024639025 (16.0/7549.0 | rec = 0.07547169811320754(16.0/212) | ap = 0.09090909090909091
    [eval mesh] sofa
    [eval mesh] prec = 0.007948072592396344 (60.0/7549.0 | rec = 0.5309734513274337(60.0/113) | ap = 0.416168487597059
    [eval mesh] trash_bin
    [eval mesh] prec = 0.010994833752814943 (83.0/7549.0 | rec = 0.3577586206896552(83.0/232) | ap = 0.18000805806512327
    [eval mesh] cabinet
    [eval mesh] prec = 0.017618227579811897 (133.0/7549.0 | rec = 0.5115384615384615(133.0/260) | ap = 0.26358882912551806
    [eval mesh] display
    [eval mesh] prec = 0.008610411975096039 (65.0/7549.0 | rec = 0.3403141361256545(65.0/191) | ap = 0.23137496193523358
    [eval mesh] bathtub
    [eval mesh] prec = 0.005961054444297258 (45.0/7549.0 | rec = 0.375(45.0/120) | ap = 0.15889753331566212
    

    Is this expected? Or should I use a higher confidence threshold?

    opened by ashawkey 7
  • demo executed fail

    Begin to finetune from the existing weight. Loading checkpoint from out/pretrained_models/pretrained_weight.pth. set() subnet missed. Weights for finetuning loaded. Loading data.

    Traceback (most recent call last):
      File "main.py", line 38, in <module>
        demo.run(cfg)
      File "/home/**/code/RfDNet-main/demo.py", line 409, in run
        our_data = generate(cfg, net.module, input_data, post_processing=False)
      File "/home/**/code/RfDNet-main/demo.py", line 223, in generate
        eval_dict, parsed_predictions = parse_predictions(end_points, data, cfg.eval_config)
      File "/home/**/code/RfDNet-main/net_utils/ap_helper.py", line 257, in parse_predictions
        assert (len(pick) > 0)
    AssertionError

    opened by lonsathing 2
  • distutils.errors.DistutilsPlatformError: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools"

    Hi, I've been trying this code on Windows with Python 3.7 and PyTorch 1.7.1, and I'm getting this:

    subprocess.CalledProcessError: Command 'cmd /u /c "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" x86_amd64 && set' returned non-zero exit status 255.

    The above exception was the direct cause of the following exception: distutils.errors.DistutilsPlatformError: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

    I've installed the build tools from the given link, configured my environment variables, and tried all possible solutions from the Internet. But the issue stayed.

    Any advice?

    Many thanks!

    opened by YuQiao0303 2
  • ShapeNetv2_data/watertight_scaled preprocessed ScanNet data not provided

    In your preprocessed ScanNet data, ShapeNetv2_data/watertight_scaled is not provided (only watertight_scaled_simplified is provided). Please provide ShapeNetv2_data/watertight_scaled, thanks.

    opened by KangchengLiu 2
  • Missing file external.pointnet2.pytorch_utils

    When I run the command python main.py --config configs/config_files/ISCNet_detection.yaml --mode train, I get the following error.

    Traceback (most recent call last):
      File "main.py", line 31, in <module>
        import train
      File "/home/Projects/RfDNet/train.py", line 4, in <module>
        from models.optimizers import load_optimizer, load_scheduler, load_bnm_scheduler
      File "/home/Projects/RfDNet/models/optimizers.py", line 5, in <module>
        from external.pointnet2.pytorch_utils import BNMomentumScheduler
    ModuleNotFoundError: No module named 'external.pointnet2'
    
    opened by CurryYuan 2
  • Project not building

    Hi, thank you for making your codebase public. Unfortunately I cannot build the project, as the environment.yml gives an error for PointNet. I tried to install it separately but it still does not work. Are you sure the project works with PyTorch 1.7.1?

    opened by gchal 2
  • Same data for validation set and testing set

    By chance, I found that the data split files for validation and testing are identical. I wonder whether the same data is used for validation and testing. (See: test data, validation data.)

    opened by Co1lin 1
  • Demo is not executing successfully

    The demo file is not running successfully at all. It shows no errors or logs. I tried to decipher the code and found that in main.py it cannot import the "net_utils.utils" module; the code gets stuck on line 22. I printed a string before and after the import, but only the one before the import statement is printed. A screenshot is attached for reference. I have followed every step mentioned in the Readme.md file. Please help.

    opened by shifaezainab 0
  • ChamferDistance

    Dear @yinyunie, when running main.py (after finishing all the installations above), I met an error about the ChamferDistance file. I also ran chamfer_distance.py alone but still got the same error:

    nvcc fatal   : Unknown option '-generate-dependencies-with-compile'
    ninja: build stopped: subcommand failed.

    opened by trungpham2606 2