Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Overview

RfD-Net [Project Page] [Paper] [Video]

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner
In CVPR, 2021.

[Figure: points.png | pred.png]

From an incomplete point cloud of a 3D scene (left), our method learns to jointly understand the 3D objects and reconstruct instance meshes as the output (right).


Install

  1. This implementation uses Python 3.6, PyTorch 1.7.1 and cudatoolkit 11.0. We recommend using conda to deploy the environment.

    • Install with conda:
    conda env create -f environment.yml
    conda activate rfdnet
    
    • Install with pip:
    pip install -r requirements.txt
    
  2. Next, compile the external libraries by

    python setup.py build_ext --inplace
    
  3. Install PointNet++ by

    cd external/pointnet2_ops_lib
    pip install .
    

Demo

The pretrained model can be downloaded here. Put the pretrained model at the path below.

out/pretrained_models/pretrained_weight.pth

Run the demo below to see how our method works.

cd RfDNet
python main.py --config configs/config_files/ISCNet_test.yaml --mode demo --demo_path demo/inputs/scene0549_00.off

VTK is used here to visualize the 3D scenes. The outputs will be saved under 'demo/outputs'. You can also feed your own scenes to this script.

If everything goes smoothly, a GUI window will pop up and you can interact with the scene as below.

[Screenshot: screenshot_demo.png]

If you run it on a machine without an X display server, you can use the off-screen mode by setting offline=True in demo.py. The rendered image will be saved in demo/outputs/some_scene_id/pred.png.
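
For reference, off-screen rendering in VTK looks roughly like the sketch below. This is an assumption about what offline=True enables, not a copy of demo.py:

import vtk

# Minimal VTK off-screen rendering sketch (an assumption, not demo.py itself).
renderer = vtk.vtkRenderer()
render_window = vtk.vtkRenderWindow()
render_window.SetOffScreenRendering(1)  # render without an X display
render_window.AddRenderer(renderer)
render_window.Render()

# Grab the framebuffer and save it as a PNG.
to_image = vtk.vtkWindowToImageFilter()
to_image.SetInput(render_window)
to_image.Update()

writer = vtk.vtkPNGWriter()
writer.SetFileName('demo/outputs/some_scene_id/pred.png')
writer.SetInputConnection(to_image.GetOutputPort())
writer.Write()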


Prepare Data

In our paper, we use the input point cloud from the ScanNet dataset and the annotated instance CAD models from the Scan2CAD dataset. Scan2CAD aligns object CAD models from ShapeNetCore.v2 to each object in ScanNet, and we use these aligned CAD models as the ground truth.

Preprocess ScanNet and Scan2CAD data

You can either directly download the processed samples [link] to the directory below (recommended)

datasets/scannet/processed_data/

or

  1. Ask for the ScanNet dataset and download it to
    datasets/scannet/scans
    
  2. Ask for the Scan2CAD dataset and download it to
    datasets/scannet/scan2cad_download_link
    
  3. Preprocess the ScanNet and Scan2CAD dataset for training by
    cd RfDNet
    python utils/scannet/gen_scannet_w_orientation.py
    
Preprocess ShapeNet data

You can either directly download the processed data [link] and extract them to datasets/ShapeNetv2_data/ as below

datasets/ShapeNetv2_data/point
datasets/ShapeNetv2_data/pointcloud
datasets/ShapeNetv2_data/voxel
datasets/ShapeNetv2_data/watertight_scaled_simplified

or

  1. Download ShapeNetCore.v2 to the path below

    datasets/ShapeNetCore.v2
    
  2. Process ShapeNet models into watertight meshes by

    python utils/shapenet/1_fuse_shapenetv2.py
    
  3. Sample points on ShapeNet models for training (similar to Occupancy Networks; a hedged sketch of this step appears after this list).

    python utils/shapenet/2_sample_mesh.py --resize --packbits --float16
    
  4. There are usually 100K+ points per object mesh. We simplify them to speed up our testing and visualization by

    python utils/shapenet/3_simplify_fusion.py --in_dir datasets/ShapeNetv2_data/watertight_scaled --out_dir datasets/ShapeNetv2_data/watertight_scaled_simplified
    
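The sampling step above (see 3.) roughly follows Occupancy Networks: query points are drawn around each watertight mesh and labeled by inside/outside tests. Below is a minimal Python sketch of that idea, assuming trimesh; the file path is a placeholder and the repo's script may differ in details:

import numpy as np
import trimesh

# Placeholder path; any watertight ShapeNet mesh works here.
mesh = trimesh.load('datasets/ShapeNetv2_data/watertight_scaled/model.off')

# Draw query points in a slightly padded unit cube around the mesh.
n_points = 100000
points = (np.random.rand(n_points, 3) - 0.5) * 1.1

# Occupancy label: True if the point lies inside the watertight mesh.
occupancies = mesh.contains(points)

# --float16 halves the storage of the coordinates; --packbits stores each
# binary occupancy as one bit instead of one byte.
np.savez('points.npz',
         points=points.astype(np.float16),
         occupancies=np.packbits(occupancies))
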
Verify preprocessed data

After preprocessing the data, you can run the visualization script below to check whether it was generated correctly.

  • Visualize ScanNet+Scan2CAD+ShapeNet samples by

    python utils/scannet/visualization/vis_gt.py
    

    A VTK window will pop up like the one below.

    [Screenshot: verify.png]

Training, Generating and Evaluation

We use the configuration file (see 'configs/config_files/****.yaml') to fully control the training/testing/generating process. You can check a template at configs/config_files/ISCNet.yaml.
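
Since every run is driven by such a YAML file, it can help to inspect one before launching. A minimal sketch with PyYAML; apart from weight (which appears later in this README), the key names are assumptions:

import yaml

# Peek at the template config; key names other than 'weight' are assumptions.
with open('configs/config_files/ISCNet.yaml') as f:
    cfg = yaml.safe_load(f)
print(sorted(cfg.keys()))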

Training

We first pretrain the detection and completion modules, followed by joint refinement. You can follow the process below.

  1. Pretrain the detection module by

    python main.py --config configs/config_files/ISCNet_detection.yaml --mode train
    

    It will save the detection module weight at out/iscnet/a_folder_with_detection_module/model_best.pth

  2. Copy the weight path of detection module (see 1.) into configs/config_files/ISCNet_completion.yaml as

    weight: ['out/iscnet/a_folder_with_detection_module/model_best.pth']
    

    Then pretrain the completion module by

    python main.py --config configs/config_files/ISCNet_completion.yaml --mode train
    

    It will save the completion module weight at out/iscnet/a_folder_with_completion_module/model_best.pth

  3. Copy the weight path of completion module (see 2.) into configs/config_files/ISCNet.yaml as

    weight: ['out/iscnet/a_folder_with_completion_module/model_best.pth']
    

    Then jointly finetune RfD-Net by

    python main.py --config configs/config_files/ISCNet.yaml --mode train
    

    It will save the trained model weight at out/iscnet/a_folder_with_RfD-Net/model_best.pth
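
Under the hood, steps 2. and 3. initialize a larger model from a sub-module checkpoint. Below is a minimal sketch of that mechanism, assuming a standard PyTorch state dict; the model stand-in and the checkpoint layout are assumptions, not the repo's exact loader:

import torch
from torch import nn

net = nn.Module()  # stand-in; the real model is built from the config

checkpoint = torch.load('out/iscnet/a_folder_with_completion_module/model_best.pth',
                        map_location='cpu')
state_dict = checkpoint.get('model', checkpoint)  # unwrap if nested (an assumption)
# strict=False tolerates parameters present in only one of the two models.
missing, unexpected = net.load_state_dict(state_dict, strict=False)
print('missing keys:', len(missing), '| unexpected keys:', len(unexpected))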

Generating

Copy the weight path of RfD-Net (see 3. above) into configs/config_files/ISCNet_test.yaml as

weight: ['out/iscnet/a_folder_with_RfD-Net/model_best.pth']

Run the command below to generate outputs for all scenes in the test set.

python main.py --config configs/config_files/ISCNet_test.yaml --mode test

The 3D scenes for visualization are saved in out/iscnet/a_folder_with_generated_scenes/visualization. You can visualize a triplet of (input, pred, gt) with the demo below.

python utils/scannet/visualization/vis_for_comparison.py 

If everything goes smoothly, three windows (corresponding to input, pred, gt) will pop up in sequence:

[Screenshots: Input | Prediction | Ground-truth]

Evaluation

You can choose either of the following ways for evaluation.

  1. You can export all scenes above and calculate the evaluation metrics with any external library (for researchers who would like to unify the benchmark). Lower the dump_threshold in ISCNet_test.yaml during generation to dump more object proposals for the mAP calculation (e.g. dump_threshold=0.05).

  2. In our evaluation, we voxelize the 3D scenes to keep the resolution consistent with the baseline methods. To enable this,

    1. make sure the binvox executable is downloaded and exported as an environment variable (e.g. add its path to ~/.bashrc on Ubuntu). It will be invoked by Trimesh (see the sketch at the end of this section).

    2. Change the ISCNet_test.yaml as below for evaluation.

       test:
         evaluate_mesh_mAP: True
       generation:
         dump_results: False
    

    Run the command below to report the evaluation results.

    python main.py --config configs/config_files/ISCNet_test.yaml --mode test
    

    The log file will be saved in out/iscnet/a_folder_named_with_script_time/log.txt
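
For intuition, voxelizing a generated mesh through Trimesh's binvox backend looks roughly like the sketch below; the file name and pitch are placeholders, and the repo's evaluation code may call it differently:

import trimesh

mesh = trimesh.load('some_generated_scene.ply')  # placeholder path
# trimesh shells out to the binvox executable found on your PATH.
voxel_grid = mesh.voxelized(pitch=0.05, method='binvox')
print(voxel_grid.matrix.shape)  # boolean occupancy grid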


Differences to the paper

  1. The original paper was implemented with PyTorch 1.1.0; we have reconfigured our code to work with PyTorch 1.7.1.
  2. A post-processing step that aligns the reconstructed shapes to the input scan is supported. We have verified that it improves the evaluation performance by a small margin. You can switch it on/off following demo.py.
  3. A different learning rate scheduler is adopted: the learning rate decays to 0.1x if there is no gain within 20 steps, which is much more efficient.
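
The scheduler described in 3. matches PyTorch's ReduceLROnPlateau; a minimal sketch with placeholder parameters and a dummy validation metric:

import torch

params = [torch.nn.Parameter(torch.zeros(1))]  # placeholder parameters
optimizer = torch.optim.Adam(params, lr=1e-3)
# Decay the learning rate to 0.1x after 20 steps without improvement.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.1, patience=20)

for step in range(100):
    val_loss = 1.0 / (step + 1)  # placeholder; use your validation metric
    scheduler.step(val_loss)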

Citation

If you find our work helpful, please consider citing

@inproceedings{Nie_2021_CVPR,
    title={RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction},
    author={Nie, Yinyu and Hou, Ji and Han, Xiaoguang and Nie{\ss}ner, Matthias},
    booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year={2021}
}


License

RfD-Net is released under the MIT License. See the LICENSE file for more details.

Comments
  • Different per category AP scores from the paper & potential bug in the evaluation

    Hello, thanks for the amazing work!

    I'm trying to reproduce the results with the pre-trained model, but I got quite different per category AP scores from the paper:

    |           | display | bathtub | trashbin | sofa  | chair | table | cabinet | bookshelf | mAP   |
    | --------- | ------- | ------- | -------- | ----- | ----- | ----- | ------- | --------- | ----- |
    | paper     | 26.67   | 27.57   | 23.34    | 15.71 | 12.23 | 1.92  | 14.48   | 13.39     | 16.90 |
    | reproduce | 23.13   | 15.89   | 18.00    | 41.61 | 10.13 | 0.95  | 26.35   | 9.10      | 18.14 |
    

    Besides, there seem to be a lot of false positives at conf_thresh = 0.05:

    ----------iou_thresh: 0.500000----------
    [eval mesh] table
    [eval mesh] prec = 0.0037091005431182937 (28.0/7549.0 | rec = 0.05063291139240506(28.0/553) | ap = 0.00946969696969697
    [eval mesh] chair
    [eval mesh] prec = 0.01814809908597165 (137.0/7549.0 | rec = 0.1253430924062214(137.0/1093) | ap = 0.10131491817235834
    [eval mesh] bookshelf
    [eval mesh] prec = 0.002119486024639025 (16.0/7549.0 | rec = 0.07547169811320754(16.0/212) | ap = 0.09090909090909091
    [eval mesh] sofa
    [eval mesh] prec = 0.007948072592396344 (60.0/7549.0 | rec = 0.5309734513274337(60.0/113) | ap = 0.416168487597059
    [eval mesh] trash_bin
    [eval mesh] prec = 0.010994833752814943 (83.0/7549.0 | rec = 0.3577586206896552(83.0/232) | ap = 0.18000805806512327
    [eval mesh] cabinet
    [eval mesh] prec = 0.017618227579811897 (133.0/7549.0 | rec = 0.5115384615384615(133.0/260) | ap = 0.26358882912551806
    [eval mesh] display
    [eval mesh] prec = 0.008610411975096039 (65.0/7549.0 | rec = 0.3403141361256545(65.0/191) | ap = 0.23137496193523358
    [eval mesh] bathtub
    [eval mesh] prec = 0.005961054444297258 (45.0/7549.0 | rec = 0.375(45.0/120) | ap = 0.15889753331566212
    

    Is this expected? Or should I use a higher confidence threshold?

    opened by ashawkey 7
  • demo executed fail

    Begin to finetune from the existing weight. Loading checkpoint from out/pretrained_models/pretrained_weight.pth. set() subnet missed. Weights for finetuning loaded. Loading data.

    Traceback (most recent call last):
      File "main.py", line 38, in <module>
        demo.run(cfg)
      File "/home/**/code/RfDNet-main/demo.py", line 409, in run
        our_data = generate(cfg, net.module, input_data, post_processing=False)
      File "/home/**/code/RfDNet-main/demo.py", line 223, in generate
        eval_dict, parsed_predictions = parse_predictions(end_points, data, cfg.eval_config)
      File "/home/**/code/RfDNet-main/net_utils/ap_helper.py", line 257, in parse_predictions
        assert (len(pick) > 0)
    AssertionError

    opened by lonsathing 2
  • distutils.errors.DistutilsPlatformError: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools"

    Hi, I've been trying this code on Windows with Python 3.7 and PyTorch 1.7.1, and I'm getting this:

    subprocess.CalledProcessError: Command 'cmd /u /c "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" x86_amd64 && set' returned non-zero exit status 255.

    The above exception was the direct cause of the following exception: distutils.errors.DistutilsPlatformError: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

    I've installed the build tools from the given link, configured my environment variables, and tried all possible solutions from the Internet. But the issue stayed.

    Any advice?

    Many thanks!

    opened by YuQiao0303 2
  • ShapeNetv2_data/watertight_scaled preprocessed ScanNet data not provided

    In your preprocessed ScanNet data, ShapeNetv2_data/watertight_scaled is not provided (only watertight_scaled_simplified is provided). Please provide ShapeNetv2_data/watertight_scaled, thanks.

    opened by KangchengLiu 2
  • Missing file external.pointnet2.pytorch_utils

    When I run the command python main.py --config configs/config_files/ISCNet_detection.yaml --mode train, I get the following error.

    Traceback (most recent call last):
      File "main.py", line 31, in <module>
        import train
      File "/home/Projects/RfDNet/train.py", line 4, in <module>
        from models.optimizers import load_optimizer, load_scheduler, load_bnm_scheduler
      File "/home/Projects/RfDNet/models/optimizers.py", line 5, in <module>
        from external.pointnet2.pytorch_utils import BNMomentumScheduler
    ModuleNotFoundError: No module named 'external.pointnet2'
    
    opened by CurryYuan 2
  • Project not building

    Hi, thank you for making your codebase public. Unfortunately I cannot build the project, as the environment.yml gives an error for PointNet. I tried to install it separately but it still does not work. Are you sure the project works with PyTorch 1.7.1?

    opened by gchal 2
  • Same data for validation set and testing set

    By chance, I found that the data split files for validation and testing are identical. I wonder whether the same data is used for validation and testing. (See: test data, validation data.)

    opened by Co1lin 1
  • Demo is not executing successfully

    The demo file is not running successfully at all. It shows no errors or logs. I tried to decipher the code and found that in main.py it cannot import the "net_utils.utils" module; the code gets stuck on line 22. I printed a string before and after the import, but only the one before the import statement is printed. A screenshot is attached for reference. I have followed every step mentioned in the Readme.md file. Please help.

    opened by shifaezainab 0
  • ChamferDistance

    Dear @yinyunie, when running main.py (after finishing all the installations above), I met an error about the ChamferDistance file. I also ran chamfer_distance.py alone but still got the same error:

    nvcc fatal   : Unknown option '-generate-dependencies-with-compile'
    ninja: build stopped: subcommand failed.

    opened by trungpham2606 2