Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation

HU Zeyu

Last update: Dec 27, 2022

VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation

Created by Zeyu HU

Introduction

This work is based on our paper VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation, which appears at the IEEE International Conference on Computer Vision (ICCV) 2021.

In recent years, sparse voxel-based methods have become the state of the art for 3D semantic segmentation of indoor scenes, thanks to the powerful 3D CNNs. Nevertheless, being oblivious to the underlying geometry, voxel-based methods suffer from ambiguous features on spatially close objects and struggle with handling complex and irregular geometries due to the lack of geodesic information. In view of this, we present Voxel-Mesh Network (VMNet), a novel 3D deep architecture that operates on the voxel and mesh representations leveraging both the Euclidean and geodesic information. Intuitively, the Euclidean information extracted from voxels can offer contextual cues representing interactions between nearby objects, while the geodesic information extracted from meshes can help separate objects that are spatially close but have disconnected surfaces. To incorporate such information from the two domains, we design an intra-domain attentive module for effective feature aggregation and an inter-domain attentive module for adaptive feature fusion. Experimental results validate the effectiveness of VMNet: specifically, on the challenging ScanNet dataset for large-scale segmentation of indoor scenes, it outperforms the state-of-the-art SparseConvNet and MinkowskiNet (74.6% vs 72.5% and 73.6% in mIoU) with a simpler network structure (17M vs 30M and 38M parameters).

Citation

If you find our work useful in your research, please consider citing:

@misc{hu2021vmnet,
      title={VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation}, 
      author={Zeyu Hu and Xuyang Bai and Jiaxiang Shang and Runze Zhang and Jiayu Dong and Xin Wang and Guangyuan Sun and Hongbo Fu and Chiew-Lan Tai},
      year={2021},
      eprint={2107.13824},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Installation

Our code is based on PyTorch. Please make sure CUDA and cuDNN are installed. One configuration has been tested:
- Python 3.7
- PyTorch 1.4.0
- torchvision 0.5.0
- CUDA 10.0
- cudatoolkit 10.0.130
- cuDNN 7.6.5
VMNet depends on the torch-geometric and torchsparse libraries. Please follow their installation instructions. One configuration has been tested; higher versions should work as well:
- torch-geometric 1.6.3
- torchsparse 1.1.0
We adapted VCGlib to generate pooling trace maps for vertex clustering and quadric error metrics.
```
git clone https://github.com/cnr-isti-vclab/vcglib

# QUADRIC ERROR METRICS
cd vcglib/apps/tridecimator/
qmake
make

# VERTEX CLUSTERING
cd ../sample/trimesh_clustering
qmake
make
```
Please add vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering to your environment path variable.
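Before running the preprocessing, it can help to confirm that both binaries are actually reachable. The check below is only a sketch: the helper name `find_missing_tools` is hypothetical, and the executable names `tridecimator` and `trimesh_clustering` are assumed to match the VCGlib app directories built above.

```python
import shutil

# Executable names assumed from the VCGlib app directories built above.
REQUIRED_TOOLS = ["tridecimator", "trimesh_clustering"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All VCGlib tools found.")
```

An empty result only shows that the executables can be located; it does not verify that they were built from the adapted sources.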
Other dependencies. One configuration has been tested:
- open3d 0.9.0
- plyfile 0.7.3
- scikit-learn 0.24.0
- scipy 1.6.0
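
To compare an existing environment against the tested configuration, a small script along these lines may be useful. It is a sketch under stated assumptions: the pip distribution names in `TESTED_VERSIONS` are assumed, the helper names are hypothetical, and `importlib.metadata` needs Python 3.8+ (on Python 3.7 the `importlib_metadata` backport would have to be installed).

```python
# Tested versions copied from the lists above (pip distribution names assumed).
TESTED_VERSIONS = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "torch-geometric": "1.6.3",
    "torchsparse": "1.1.0",
    "open3d": "0.9.0",
    "plyfile": "0.7.3",
    "scikit-learn": "0.24.0",
    "scipy": "1.6.0",
}


def installed_versions(names):
    """Look up installed versions; None marks a package that is not installed."""
    try:
        from importlib import metadata  # Python 3.8+
    except ImportError:
        import importlib_metadata as metadata  # backport, needed on Python 3.7
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(installed, tested=TESTED_VERSIONS):
    """Return {package: (installed, tested)} for every package that differs."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in tested.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    report = version_mismatches(installed_versions(TESTED_VERSIONS))
    for name, (have, want) in report.items():
        print(f"{name}: installed {have}, tested {want}")
```

A reported mismatch is informational only. The comparison is an exact string match, so a build-tagged version string would also be listed even when the release number is the same.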

Data Preparation

Please refer to https://github.com/ScanNet/ScanNet and https://github.com/niessner/Matterport to get access to the ScanNet and Matterport datasets. Our method relies on the .ply as well as the .labels.ply files. The following instructions use the ScanNet dataset as the example.
Create directories to store processed data.
- 'path/to/processed_data/train/'
- 'path/to/processed_data/val/'
- 'path/to/processed_data/test/'
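
The three directories can also be created from Python. This is a minimal sketch; the helper name `make_split_dirs` is hypothetical and the root path is a placeholder to be replaced.

```python
from pathlib import Path


def make_split_dirs(root, splits=("train", "val", "test")):
    """Create root/<split>/ for each split and return the created paths."""
    paths = [Path(root) / split for split in splits]
    for path in paths:
        # parents=True also creates the root; exist_ok=True makes reruns harmless.
        path.mkdir(parents=True, exist_ok=True)
    return paths


if __name__ == "__main__":
    for created in make_split_dirs("path/to/processed_data"):
        print(created)
```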

Prepare train data.

python prepare_data.py --considered_rooms_path dataset/data_split/scannetv2_train.txt --in_path path/to/ScanNet/scans --out_path path/to/processed_data/train/

Prepare val data.

python prepare_data.py --considered_rooms_path dataset/data_split/scannetv2_val.txt --in_path path/to/ScanNet/scans --out_path path/to/processed_data/val/

Prepare test data.

python prepare_data.py --test_split --considered_rooms_path dataset/data_split/scannetv2_test.txt --in_path path/to/ScanNet/scans_test --out_path path/to/processed_data/test/
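
The three invocations above differ only in the split name, the input folder and the `--test_split` flag, so they can be generated in a loop. The sketch below only uses the flags shown above; the helper name `build_prepare_commands` is hypothetical, and the script is assumed to be launched from the repository root.

```python
import subprocess
import sys


def build_prepare_commands(scannet_root, out_root, split_dir="dataset/data_split"):
    """Return one prepare_data.py argument list per split (train, val, test)."""
    commands = []
    for split in ("train", "val", "test"):
        cmd = [sys.executable, "prepare_data.py"]
        if split == "test":
            cmd.append("--test_split")
        # Test scenes live in scans_test, train and val scenes in scans.
        scans = "scans_test" if split == "test" else "scans"
        cmd += [
            "--considered_rooms_path", f"{split_dir}/scannetv2_{split}.txt",
            "--in_path", f"{scannet_root}/{scans}",
            "--out_path", f"{out_root}/{split}/",
        ]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in build_prepare_commands("path/to/ScanNet", "path/to/processed_data"):
        print(" ".join(cmd))
        # Uncomment to actually run the preprocessing from the repository root:
        # subprocess.run(cmd, check=True)
```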

Train

On train/val/test setting.

CUDA_VISIBLE_DEVICES=0 python run.py --train --exp_name name_you_want --data_path path/to/processed_data

On train+val/test setting (for ScanNet benchmark).

CUDA_VISIBLE_DEVICES=0 python run.py --train_benchmark --exp_name name_you_want --data_path path/to/processed_data

Inference

Validation. Download the pretrained model (73.3% mIoU on ScanNet Val) and put it into the directory check_points/val_split.
```
CUDA_VISIBLE_DEVICES=0 python run.py --val --exp_name val_split --data_path path/to/processed_data
```
Test. Download the pretrained model (74.6% mIoU on ScanNet Test) and put it into the directory check_points/test_split. TXT files for benchmark submission will be saved in the directory test_results/.
```
CUDA_VISIBLE_DEVICES=0 python run.py --test --exp_name test_split --data_path path/to/processed_data
```

Acknowledgements

Our code is built upon torch-geometric, torchsparse and dcm-net.

License

Our code is released under MIT License (see LICENSE file for details).

Comments

The pretrained model has wrong state_dict keys

Hi, I ran into an issue when loading the pretrained model.

I tested the script with the pretrained model:

python run.py --test --exp_name test_split --data_path path/to/processed_data

The output logs are:

use_cuda: True
exp_name: test_split
#parameters 17463870
Traceback (most recent call last):
  File "run.py", line 279, in <module>
    test(exp_name, test_files)
  File "run.py", line 139, in test
    model.load_state_dict(checkpoint['model_state_dict'])
  File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VMNet:
	Unexpected key(s) in state_dict: "Geo_branch.mid_geo.geo_0.lin_edge.weight", "Geo_branch.mid_geo.geo_0.lin_edge.bias", "Geo_branch.mid_geo.geo_1.lin_edge.weight", "Geo_branch.mid_geo.geo_1.lin_edge.bias", "Geo_branch.cd_5.geo_0.lin_edge.weight", "Geo_branch.cd_5.geo_0.lin_edge.bias", "Geo_branch.de5_geo.geo_0.lin_edge.weight", "Geo_branch.de5_geo.geo_0.lin_edge.bias", "Geo_branch.de5_geo.geo_1.lin_edge.weight", "Geo_branch.de5_geo.geo_1.lin_edge.bias", "Geo_branch.cd_4.geo_0.lin_edge.weight", "Geo_branch.cd_4.geo_0.lin_edge.bias", "Geo_branch.de4_geo.geo_0.lin_edge.weight", "Geo_branch.de4_geo.geo_0.lin_edge.bias", "Geo_branch.de4_geo.geo_1.lin_edge.weight", "Geo_branch.de4_geo.geo_1.lin_edge.bias", "Geo_branch.cd_3.geo_0.lin_edge.weight", "Geo_branch.cd_3.geo_0.lin_edge.bias", "Geo_branch.de3_geo.geo_0.lin_edge.weight", "Geo_branch.de3_geo.geo_0.lin_edge.bias", "Geo_branch.de3_geo.geo_1.lin_edge.weight", "Geo_branch.de3_geo.geo_1.lin_edge.bias", "Geo_branch.cd_2.geo_0.lin_edge.weight", "Geo_branch.cd_2.geo_0.lin_edge.bias", "Geo_branch.de2_geo.geo_0.lin_edge.weight", "Geo_branch.de2_geo.geo_0.lin_edge.bias", "Geo_branch.de2_geo.geo_1.lin_edge.weight", "Geo_branch.de2_geo.geo_1.lin_edge.bias", "Geo_branch.cd_1.geo_0.lin_edge.weight", "Geo_branch.cd_1.geo_0.lin_edge.bias", "Geo_branch.de1_geo.geo_0.lin_edge.weight", "Geo_branch.de1_geo.geo_0.lin_edge.bias", "Geo_branch.de1_geo.geo_1.lin_edge.weight", "Geo_branch.de1_geo.geo_1.lin_edge.bias", "Geo_branch.cd_0.geo_0.lin_edge.weight", "Geo_branch.cd_0.geo_0.lin_edge.bias", "Geo_branch.de0_geo.geo_0.lin_edge.weight", "Geo_branch.de0_geo.geo_0.lin_edge.bias", "Geo_branch.de0_geo.geo_1.lin_edge.weight", "Geo_branch.de0_geo.geo_1.lin_edge.bias".

What are the unexpected key(s) in state_dict? Is VMNet not defined correctly?
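
For context, the error above means that the checkpoint contains parameters (here all `lin_edge` weights and biases) that the instantiated model does not define. The sketch below shows one way to list such differences; it is not the repository's code, the helper name `diff_state_dict_keys` is hypothetical, and the `lin_key` entry is a made-up key used only for illustration. One possible cause, which is an assumption and not confirmed in this thread, is an installed torch-geometric version that differs from the tested 1.6.3.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring PyTorch's terminology.

    missing:    keys the model defines but the checkpoint lacks
    unexpected: keys the checkpoint contains but the model does not define
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Toy example; with PyTorch one would pass model.state_dict().keys() and
# checkpoint['model_state_dict'].keys() instead.
model_keys = ["Geo_branch.mid_geo.geo_0.lin_key.weight"]  # hypothetical key
checkpoint_keys = model_keys + ["Geo_branch.mid_geo.geo_0.lin_edge.weight"]
missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

PyTorch's `load_state_dict` also accepts `strict=False`, which ignores non-matching keys. In this case that would silently drop the `lin_edge` weights, so the loaded model may not behave like the published one.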

Before preprocessing the data, I built VCGlib's vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering and added them to the environment path with:

export PATH=$PATH:/path/to/vcglib/apps/tridecimator:/path/to/vcglib/apps/sample/trimesh_clustering
# create links
sudo ln -s /path/to/vcglib/apps/tridecimator/tridecimator /usr/local/bin
sudo ln -s /path/to/vcglib/apps/sample/trimesh_clustering/trimesh_clustering /usr/local/bin

But when I run the preprocessing, it aborts with a core dump:

in_path:../scannet/VMtest
out_path:../scannet/VMNet_data/train/
[0.02, 0.04, 30, 30, 30, 30, 30]
Processing ../scannet/VMtest/scene0000_00/scene0000_00_vh_clean_2.ply
curr_dir: ../scannet/VMNet_data/train/scene0000_00
trimesh_clustering: ../../../vcg/simplex/vertex/component.h:75: vcg::vertex::EmptyCore<TT>::ColorType& vcg::vertex::EmptyCore<TT>::C() [with TT = MyUsedTypes; vcg::vertex::EmptyCore<TT>::ColorType = vcg::Color4<unsigned char>]: Assertion `0' failed.
Aborted (core dumped)
trimesh_clustering: ../../../vcg/simplex/vertex/component.h:75: vcg::vertex::EmptyCore<TT>::ColorType& vcg::vertex::EmptyCore<TT>::C() [with TT = MyUsedTypes; vcg::vertex::EmptyCore<TT>::ColorType = vcg::Color4<unsigned char>]: Assertion `0' failed.
Aborted (core dumped)
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "prepare_data.py", line 226, in process_frame
    old_vertices=vertices[-1])
  File "prepare_data.py", line 173, in quadric_error_metric
    '.ply', '.csv'), old_vertices=old_vertices, new_vertices=vertices_l)
  File "prepare_data.py", line 78, in csv2npy
    with open(in_file_path, 'r') as csvfile:
FileNotFoundError: [Errno 2] No such file or directory: '../scannet/VMNet_data/train/scene0000_00/curr_mesh.csv'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "prepare_data.py", line 295, in <module>
    pf_pool.map(process_frame_p, file_paths)
  File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
FileNotFoundError: [Errno 2] No such file or directory: '../scannet/VMNet_data/train/scene0000_00/curr_mesh.csv'

Where should I set the correct environment path for vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering?

opened by cia1099 13

Issue of 'input voxels are not valid'

Hi, authors, thanks for sharing your work.

When I tried to train VMNet with my own training data, valid_idxs did not satisfy the assert condition at https://github.com/hzykent/VMNet/blob/61816831ca006862c3b0e6fcaa83e74e690dba0a/dataset/scannetv2.py#L158

I think the problem is related to my own data, but the specific reason for such an issue is unclear. Do you have any suggestions about that?

PS: I use the prepare_data.py script to preprocess my own data, following the same preprocessing procedure as for the ScanNet dataset.

opened by Hao-HUST 6

about downloading ScanNet

Hi, I haven't used ScanNet before, so I'm not familiar with it and need your help. Because my storage space is limited, I want to download only the parts of ScanNet that are needed. Since you say:

Our method relies on the .ply as well as the .labels.ply files.

does it mean that I can download ScanNet using only these two commands?

download-scannet.py -o [directory in which to download] --type _vh_clean_2.ply
download-scannet.py -o [directory in which to download] --type _vh_clean_2.labels.ply

In addition, how much disk space is needed to store the processed data?
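
Whether these two file types are sufficient is not confirmed in this thread. If they are, a check such as the following could verify that every downloaded scene contains both files. It is a sketch that assumes the layout `scans/<scene_id>/<scene_id>_vh_clean_2.ply` seen in the preprocessing log earlier on this page; the helper name `find_incomplete_scenes` is hypothetical.

```python
from pathlib import Path

# File suffixes taken from the download commands above.
REQUIRED_SUFFIXES = ("_vh_clean_2.ply", "_vh_clean_2.labels.ply")


def find_incomplete_scenes(scans_dir, suffixes=REQUIRED_SUFFIXES):
    """Map each scene directory name to the required files it is missing."""
    incomplete = {}
    for scene_dir in sorted(Path(scans_dir).iterdir()):
        if not scene_dir.is_dir():
            continue
        missing = [
            scene_dir.name + suffix
            for suffix in suffixes
            if not (scene_dir / (scene_dir.name + suffix)).is_file()
        ]
        if missing:
            incomplete[scene_dir.name] = missing
    return incomplete
```

Calling `find_incomplete_scenes("path/to/ScanNet/scans")` returns an empty dictionary when every scene folder holds both files.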

opened by cjyiiiing 2

Quadric Error Metrics: contraction of non-connected vertices possible?

Hi there,

Thanks for your amazing work!

I would be glad if you could please answer the following question about the usage of QEM in VMNet.
The publication seems to indicate that for vertex contraction only vertices connected by edges are considered. However, in the original QEM publication, the authors also propose selecting vertex pairs for contraction based on their Euclidean distance. They use a threshold value t for the Euclidean distance.

Using only vertices connected by edges would imply a threshold of t = 0. In prepare_data.py, Tridecimator from VCGlib is called. If I understand the call correctly, the optional argument -e, which specifies the threshold, is not passed. In the Tridecimator application, the threshold t then defaults to inf, meaning all pairs of vertices would be eligible for contraction.

Therefore I would like to know: Is contraction of non-connected vertices possible in VMNet?
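
To make the role of the threshold concrete, the pair-selection rule described above (a pair is valid if it is an edge or if its Euclidean distance is below t) can be sketched as follows. This illustrates the rule only and is not the Tridecimator or VMNet implementation; the function name `valid_pairs` and the toy mesh are hypothetical, and `math.dist` requires Python 3.8+.

```python
import math
from itertools import combinations


def valid_pairs(vertices, edges, t):
    """Return index pairs (i, j), i < j, eligible for contraction.

    A pair is eligible if it is a mesh edge or if the Euclidean distance
    between the two vertices is below the threshold t.
    """
    edge_set = {tuple(sorted(e)) for e in edges}
    pairs = set(edge_set)
    for i, j in combinations(range(len(vertices)), 2):
        if math.dist(vertices[i], vertices[j]) < t:
            pairs.add((i, j))
    return sorted(pairs)


# Two disconnected segments lying close to each other.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.1, 0.0), (1.0, 0.1, 0.0)]
edges = [(0, 1), (2, 3)]

print(valid_pairs(vertices, edges, t=0.0))       # edges only
print(valid_pairs(vertices, edges, t=0.5))       # also the close, unconnected pairs
print(valid_pairs(vertices, edges, t=math.inf))  # every pair
```

With t = 0 only the two edges qualify, whereas any positive threshold above 0.1 also admits the pairs that bridge the two disconnected segments.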

Thanks a lot for your time, Benjamin

opened by BenjaminEpple 1

about Matterport3D results

Hi~

Thanks for the great work. I see that your work uses the Matterport3D dataset and obtains the good results shown in the paper.

However, I can't find any code related to Matterport3D in this repository. Could you share the Matterport3D-related code and briefly describe how to reproduce the results on Matterport3D?

Looking forward to your reply and this will help a lot.

opened by lcysonya 1

CoSMA: Convolutional Semi-Regular Mesh Autoencoder. From Paper "Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes"

Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes Implementation of CoSMA: Convolutional Semi-Regular Mesh Autoencoder arXiv p

10 Oct 11, 2022

Blender scripts for computing geodesic distance

GeoDoodle Geodesic distance computation for Blender meshes Table of Contents Overivew Usage Implementation Overview This addon provides an operator fo

20 Jun 8, 2022

A repo that contains all the mesh keys needed for mesh backend, along with a code example of how to use them in python

Mesh-Keys A repo that contains all the mesh keys needed for mesh backend, along with a code example of how to use them in python Have been seeing alot

53 Dec 13, 2022

Mesh Graphormer is a new transformer-based method for human pose and mesh reconstruction from an input image

MeshGraphormer ✨ ✨ This is our research code of Mesh Graphormer. Mesh Graphormer is a new transformer-based method for human pose and mesh reconsructi

251 Jan 8, 2023

Given a 2D triangle mesh, we could randomly generate cloud points that fill in the triangle mesh

generate_cloud_points Given a 2D triangle mesh, we could randomly generate cloud points that fill in the triangle mesh. Run python disp_mesh.py Or you

2 Dec 24, 2021

AI Face Mesh: This is a simple face mesh detection program based on Artificial intelligence.

AI Face Mesh: This is a simple face mesh detection program based on Artificial Intelligence which made with Python. It's able to detect 468 different

1 Jan 13, 2022

Pytorch Implementation for NeurIPS (oral) paper: Pixel Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation

Pixel-Level Cycle Association This is the Pytorch implementation of our NeurIPS 2020 Oral paper Pixel-Level Cycle Association: A New Perspective for D

87 Oct 19, 2022

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

32 Sep 21, 2022

This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021.

PyTorch implementation of DAQ This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021. For more informatio

36 Nov 4, 2022

Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

A Unified Objective for Novel Class Discovery This is the official repository for the paper: A Unified Objective for Novel Class Discovery Enrico Fini

118 Dec 26, 2022

Official code for "Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021".

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021. Introduction We proposed a novel model training paradi

103 Dec 14, 2022

Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation （ICCV2021）

Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation This is a pytorch project for the paper Dynamic Divide-and-Conquer Ad

29 Nov 21, 2022

ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into ss

55 Nov 9, 2022

Perturbed Self-Distillation: Weakly Supervised Large-Scale Point Cloud Semantic Segmentation (ICCV2021)

Perturbed Self-Distillation: Weakly Supervised Large-Scale Point Cloud Semantic Segmentation (ICCV2021) This is the implementation of PSD (ICCV 2021),

12 Dec 12, 2022

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

107 Dec 20, 2022

This Repo is the official CUDA implementation of ICCV 2019 Oral paper for CARAFE: Content-Aware ReAssembly of FEatures

Introduction This Repo is the official CUDA implementation of ICCV 2019 Oral paper for CARAFE: Content-Aware ReAssembly of FEatures. @inproceedings{Wa

42 Jan 7, 2023

TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two tasks.

264 Jan 9, 2023

ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Sign-Agnostic Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page This repository contains the implementation

63 Nov 18, 2022

ICCV2021 Oral SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks

Sign-Agnostic Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page This repository contains the implementation

64 Jan 5, 2023
