Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation


VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation

Framework Fig

Created by Zeyu HU

Introduction

This work is based on our paper VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation, which appeared at the IEEE/CVF International Conference on Computer Vision (ICCV) 2021.

In recent years, sparse voxel-based methods have become the state of the art for 3D semantic segmentation of indoor scenes, thanks to the powerful 3D CNNs. Nevertheless, being oblivious to the underlying geometry, voxel-based methods suffer from ambiguous features on spatially close objects and struggle with handling complex and irregular geometries due to the lack of geodesic information. In view of this, we present Voxel-Mesh Network (VMNet), a novel 3D deep architecture that operates on the voxel and mesh representations leveraging both the Euclidean and geodesic information. Intuitively, the Euclidean information extracted from voxels can offer contextual cues representing interactions between nearby objects, while the geodesic information extracted from meshes can help separate objects that are spatially close but have disconnected surfaces. To incorporate such information from the two domains, we design an intra-domain attentive module for effective feature aggregation and an inter-domain attentive module for adaptive feature fusion. Experimental results validate the effectiveness of VMNet: specifically, on the challenging ScanNet dataset for large-scale segmentation of indoor scenes, it outperforms the state-of-the-art SparseConvNet and MinkowskiNet (74.6% vs 72.5% and 73.6% in mIoU) with a simpler network structure (17M vs 30M and 38M parameters).
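
The difference between Euclidean and geodesic proximity can be illustrated with a toy example. The sketch below is not part of VMNet; it only assumes a mesh given as a vertex list and an edge list (the names `vertices`, `edges`, `euclidean` and `geodesic` are made up for illustration) and approximates geodesic distance by the shortest path along mesh edges. Two vertices on different surfaces can be very close in space while no surface path connects them.

```python
import heapq
import math


def euclidean(a, b):
    """Straight-line distance between two 3D points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def geodesic(vertices, edges, src, dst):
    """Shortest path length along mesh edges (Dijkstra).

    This is an edge-path approximation of the geodesic distance.
    Returns math.inf when src and dst lie on disconnected surfaces.
    """
    adjacency = {i: [] for i in range(len(vertices))}
    for u, v in edges:
        w = euclidean(vertices[u], vertices[v])
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))

    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


# Hypothetical toy scene: vertices 0-1 belong to one surface,
# vertices 2-3 to another surface that almost touches it.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.05), (2.0, 0.0, 0.05)]
edges = [(0, 1), (2, 3)]

print("Euclidean 1-2:", euclidean(vertices[1], vertices[2]))
print("Geodesic  1-2:", geodesic(vertices, edges, 1, 2))
```

In this toy scene vertices 1 and 2 are only 0.05 apart in space, yet their edge-path distance is infinite, which is the kind of cue the mesh branch can exploit to keep such objects apart.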

Citation

If you find our work useful in your research, please consider citing:

@misc{hu2021vmnet,
      title={VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation}, 
      author={Zeyu Hu and Xuyang Bai and Jiaxiang Shang and Runze Zhang and Jiayu Dong and Xin Wang and Guangyuan Sun and Hongbo Fu and Chiew-Lan Tai},
      year={2021},
      eprint={2107.13824},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Installation

  • Our code is based on PyTorch. Please make sure CUDA and cuDNN are installed. One configuration has been tested:

    • Python 3.7
    • PyTorch 1.4.0
    • torchvision 0.5.0
    • CUDA 10.0
    • cudatoolkit 10.0.130
    • cuDNN 7.6.5
  • VMNet depends on the torch-geometric and torchsparse libraries. Please follow their installation instructions. One configuration has been tested; higher versions should work as well:

    • torch-geometric 1.6.3
    • torchsparse 1.1.0
  • We adapted VCGlib to generate pooling trace maps for vertex clustering and quadric error metrics.

    git clone https://github.com/cnr-isti-vclab/vcglib
    
    # QUADRIC ERROR METRICS
    cd vcglib/apps/tridecimator/
    qmake
    make
    
    # VERTEX CLUSTERING
    cd ../sample/trimesh_clustering
    qmake
    make
    

    Please add vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering to your PATH environment variable.

  • Other dependencies. One configuration has been tested:

    • open3d 0.9.0
    • plyfile 0.7.3
    • scikit-learn 0.24.0
    • scipy 1.6.0
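
Before preparing data, it can help to confirm that the environment matches the tested configuration. The following is a minimal sketch, not part of the repository. It assumes that the packages are registered under the distribution names used in `TESTED_VERSIONS` and that the two VCGlib builds produce executables named `tridecimator` and `trimesh_clustering`; both are assumptions to adjust for your setup. `importlib.metadata` requires Python 3.8+, so on Python 3.7 the sketch falls back to the `importlib_metadata` backport.

```python
import shutil

try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7: needs the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# Versions listed in the tested configuration above.
# The distribution names are assumptions and may differ in your environment.
TESTED_VERSIONS = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "torch-geometric": "1.6.3",
    "torchsparse": "1.1.0",
    "open3d": "0.9.0",
    "plyfile": "0.7.3",
    "scikit-learn": "0.24.0",
    "scipy": "1.6.0",
}

# Assumed executable names produced by the two VCGlib builds.
REQUIRED_TOOLS = ("tridecimator", "trimesh_clustering")


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_environment(lookup=installed_version, which=shutil.which):
    """Return a list of human-readable deviations from the tested setup."""
    report = []
    for name, tested in TESTED_VERSIONS.items():
        found = lookup(name)
        if found is None:
            report.append(f"MISSING  {name} (tested with {tested})")
        elif found != tested:
            report.append(f"DIFFERS  {name} {found} (tested with {tested})")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            report.append(f"MISSING  executable {tool} (not on PATH)")
    return report


if __name__ == "__main__":
    for line in check_environment():
        print(line)
```

An empty report means every listed package matches the tested version and both tools were found on PATH. A DIFFERS line is informational only, since other versions may work as well.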

Data Preparation

  • Please refer to https://github.com/ScanNet/ScanNet and https://github.com/niessner/Matterport to get access to the ScanNet and Matterport datasets. Our method relies on the .ply as well as the .labels.ply files. The following instructions use the ScanNet dataset as an example.

  • Create directories to store processed data.

    • 'path/to/processed_data/train/'
    • 'path/to/processed_data/val/'
    • 'path/to/processed_data/test/'
  • Prepare train data.

    python prepare_data.py --considered_rooms_path dataset/data_split/scannetv2_train.txt --in_path path/to/ScanNet/scans --out_path path/to/processed_data/train/
    
  • Prepare val data.

    python prepare_data.py --considered_rooms_path dataset/data_split/scannetv2_val.txt --in_path path/to/ScanNet/scans --out_path path/to/processed_data/val/
    
  • Prepare test data.

    python prepare_data.py --test_split --considered_rooms_path dataset/data_split/scannetv2_test.txt --in_path path/to/ScanNet/scans_test --out_path path/to/processed_data/test/
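
The three preparation steps above can also be driven from one script. This is a sketch under the assumption that it is run from the repository root, next to prepare_data.py; the helper names `build_commands` and `prepare_all` are made up for illustration, while the flags and split files are the ones shown in the commands above.

```python
import os
import subprocess
import sys

# split name -> (room list, ScanNet sub-directory, extra flags)
SPLITS = {
    "train": ("dataset/data_split/scannetv2_train.txt", "scans", []),
    "val": ("dataset/data_split/scannetv2_val.txt", "scans", []),
    "test": ("dataset/data_split/scannetv2_test.txt", "scans_test", ["--test_split"]),
}


def build_commands(scannet_root, out_root):
    """Return (output directory, argv list) pairs, one per split."""
    commands = []
    for split, (rooms, scan_dir, extra) in SPLITS.items():
        out_path = os.path.join(out_root, split) + os.sep
        cmd = [
            sys.executable, "prepare_data.py", *extra,
            "--considered_rooms_path", rooms,
            "--in_path", os.path.join(scannet_root, scan_dir),
            "--out_path", out_path,
        ]
        commands.append((out_path, cmd))
    return commands


def prepare_all(scannet_root, out_root, dry_run=True):
    """Create the output directories and run prepare_data.py for each split.

    With dry_run=True the commands are only printed, nothing is executed.
    """
    for out_path, cmd in build_commands(scannet_root, out_root):
        if dry_run:
            print(" ".join(cmd))
            continue
        os.makedirs(out_path, exist_ok=True)
        subprocess.run(cmd, check=True)  # stop at the first failing split


if __name__ == "__main__":
    prepare_all("path/to/ScanNet", "path/to/processed_data", dry_run=True)
```

Running with `dry_run=True` first makes it easy to compare the generated commands against the ones listed above before any data is processed.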
    

Train

  • On train/val/test setting.

    CUDA_VISIBLE_DEVICES=0 python run.py --train --exp_name name_you_want --data_path path/to/processed_data
    
  • On train+val/test setting (for ScanNet benchmark).

    CUDA_VISIBLE_DEVICES=0 python run.py --train_benchmark --exp_name name_you_want --data_path path/to/processed_data
    

Inference

  • Validation. Download the pretrained model (73.3% mIoU on the ScanNet validation set) and place it in the directory check_points/val_split.

    CUDA_VISIBLE_DEVICES=0 python run.py --val --exp_name val_split --data_path path/to/processed_data
    
  • Test. Download the pretrained model (74.6% mIoU on the ScanNet test set) and place it in the directory check_points/test_split. Text files for benchmark submission will be saved in the directory test_results/.

    CUDA_VISIBLE_DEVICES=0 python run.py --test --exp_name test_split --data_path path/to/processed_data
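
A small launcher can catch a missing checkpoint directory before run.py starts. This is only a sketch: the helper `build_run_command` is hypothetical, and it assumes that checkpoints for `--val` and `--test` live in check_points/<exp_name>, as the instructions above suggest. The mode flags are the ones used in the Train and Inference commands.

```python
import os
import sys

MODES = ("--train", "--train_benchmark", "--val", "--test")


def build_run_command(mode, exp_name, data_path, gpu="0", ckpt_root="check_points"):
    """Return (argv, env) for one run.py invocation.

    For --val and --test, the directory ckpt_root/exp_name must exist and be
    non-empty; this layout is an assumption based on the README text.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}, expected one of {MODES}")
    if mode in ("--val", "--test"):
        ckpt_dir = os.path.join(ckpt_root, exp_name)
        if not os.path.isdir(ckpt_dir) or not os.listdir(ckpt_dir):
            raise FileNotFoundError(f"no pretrained model found in {ckpt_dir}")
    # Same effect as prefixing the shell command with CUDA_VISIBLE_DEVICES=<gpu>.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    cmd = [sys.executable, "run.py", mode,
           "--exp_name", exp_name, "--data_path", data_path]
    return cmd, env


if __name__ == "__main__":
    import subprocess

    cmd, env = build_run_command("--val", "val_split", "path/to/processed_data")
    subprocess.run(cmd, env=env, check=True)
```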
    

Acknowledgements

Our code is built upon torch-geometric, torchsparse and dcm-net.

License

Our code is released under the MIT License (see LICENSE file for details).

Comments
  • The pretrain model has wrong state_dict keys

    Hi, I ran into two issues.

    1. I tested the script with the pretrained model:
    python run.py --test --exp_name test_split --data_path path/to/processed_data
    

    The output logs are:

    use_cuda: True
    exp_name: test_split
    #parameters 17463870
    Traceback (most recent call last):
      File "run.py", line 279, in <module>
        test(exp_name, test_files)
      File "run.py", line 139, in test
        model.load_state_dict(checkpoint['model_state_dict'])
      File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for VMNet:
    	Unexpected key(s) in state_dict: "Geo_branch.mid_geo.geo_0.lin_edge.weight", "Geo_branch.mid_geo.geo_0.lin_edge.bias", "Geo_branch.mid_geo.geo_1.lin_edge.weight", "Geo_branch.mid_geo.geo_1.lin_edge.bias", "Geo_branch.cd_5.geo_0.lin_edge.weight", "Geo_branch.cd_5.geo_0.lin_edge.bias", "Geo_branch.de5_geo.geo_0.lin_edge.weight", "Geo_branch.de5_geo.geo_0.lin_edge.bias", "Geo_branch.de5_geo.geo_1.lin_edge.weight", "Geo_branch.de5_geo.geo_1.lin_edge.bias", "Geo_branch.cd_4.geo_0.lin_edge.weight", "Geo_branch.cd_4.geo_0.lin_edge.bias", "Geo_branch.de4_geo.geo_0.lin_edge.weight", "Geo_branch.de4_geo.geo_0.lin_edge.bias", "Geo_branch.de4_geo.geo_1.lin_edge.weight", "Geo_branch.de4_geo.geo_1.lin_edge.bias", "Geo_branch.cd_3.geo_0.lin_edge.weight", "Geo_branch.cd_3.geo_0.lin_edge.bias", "Geo_branch.de3_geo.geo_0.lin_edge.weight", "Geo_branch.de3_geo.geo_0.lin_edge.bias", "Geo_branch.de3_geo.geo_1.lin_edge.weight", "Geo_branch.de3_geo.geo_1.lin_edge.bias", "Geo_branch.cd_2.geo_0.lin_edge.weight", "Geo_branch.cd_2.geo_0.lin_edge.bias", "Geo_branch.de2_geo.geo_0.lin_edge.weight", "Geo_branch.de2_geo.geo_0.lin_edge.bias", "Geo_branch.de2_geo.geo_1.lin_edge.weight", "Geo_branch.de2_geo.geo_1.lin_edge.bias", "Geo_branch.cd_1.geo_0.lin_edge.weight", "Geo_branch.cd_1.geo_0.lin_edge.bias", "Geo_branch.de1_geo.geo_0.lin_edge.weight", "Geo_branch.de1_geo.geo_0.lin_edge.bias", "Geo_branch.de1_geo.geo_1.lin_edge.weight", "Geo_branch.de1_geo.geo_1.lin_edge.bias", "Geo_branch.cd_0.geo_0.lin_edge.weight", "Geo_branch.cd_0.geo_0.lin_edge.bias", "Geo_branch.de0_geo.geo_0.lin_edge.weight", "Geo_branch.de0_geo.geo_0.lin_edge.bias", "Geo_branch.de0_geo.geo_1.lin_edge.weight", "Geo_branch.de0_geo.geo_1.lin_edge.bias".
    

    What are the unexpected key(s) in state_dict? Is VMNet not defined correctly?

    2. To preprocess the data, I built VCGlib in vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering, and added both to the environment path as follows:
    export PATH=$PATH:/path/to/vcglib/apps/tridecimator:/path/to/vcglib/apps/sample/trimesh_clustering
    # create links
    sudo ln -s /path/to/vcglib/apps/tridecimator/tridecimator /usr/local/bin
    sudo ln -s /path/to/vcglib/apps/sample/trimesh_clustering/trimesh_clustering /usr/local/bin
    

    But when I run the preprocessing, it aborts with a core dump:

    in_path:../scannet/VMtest
    out_path:../scannet/VMNet_data/train/
    [0.02, 0.04, 30, 30, 30, 30, 30]
    Processing ../scannet/VMtest/scene0000_00/scene0000_00_vh_clean_2.ply
    curr_dir: ../scannet/VMNet_data/train/scene0000_00
    trimesh_clustering: ../../../vcg/simplex/vertex/component.h:75: vcg::vertex::EmptyCore<TT>::ColorType& vcg::vertex::EmptyCore<TT>::C() [with TT = MyUsedTypes; vcg::vertex::EmptyCore<TT>::ColorType = vcg::Color4<unsigned char>]: Assertion `0' failed.
    Aborted (core dumped)
    trimesh_clustering: ../../../vcg/simplex/vertex/component.h:75: vcg::vertex::EmptyCore<TT>::ColorType& vcg::vertex::EmptyCore<TT>::C() [with TT = MyUsedTypes; vcg::vertex::EmptyCore<TT>::ColorType = vcg::Color4<unsigned char>]: Assertion `0' failed.
    Aborted (core dumped)
    multiprocessing.pool.RemoteTraceback: 
    """
    Traceback (most recent call last):
      File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 121, in worker
        result = (True, func(*args, **kwds))
      File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
        return list(map(*args))
      File "prepare_data.py", line 226, in process_frame
        old_vertices=vertices[-1])
      File "prepare_data.py", line 173, in quadric_error_metric
        '.ply', '.csv'), old_vertices=old_vertices, new_vertices=vertices_l)
      File "prepare_data.py", line 78, in csv2npy
        with open(in_file_path, 'r') as csvfile:
    FileNotFoundError: [Errno 2] No such file or directory: '../scannet/VMNet_data/train/scene0000_00/curr_mesh.csv'
    """
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "prepare_data.py", line 295, in <module>
        pf_pool.map(process_frame_p, file_paths)
      File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 268, in map
        return self._map_async(func, iterable, mapstar, chunksize).get()
      File "/home/keroro/Program_Files/miniconda3/envs/tt/lib/python3.7/multiprocessing/pool.py", line 657, in get
        raise self._value
    FileNotFoundError: [Errno 2] No such file or directory: '../scannet/VMNet_data/train/scene0000_00/curr_mesh.csv'
    
    

    Where should I set the correct environment path for vcglib/apps/tridecimator and vcglib/apps/sample/trimesh_clustering?

    opened by cia1099 13
  • Issue of 'input voxels are not valid'

    Hi, authors, thanks for sharing your work.

    When I tried to train VMNet with my own training data, valid_idxs did not satisfy the assert condition at https://github.com/hzykent/VMNet/blob/61816831ca006862c3b0e6fcaa83e74e690dba0a/dataset/scannetv2.py#L158

    I think the problem is related to my own data, but the specific reason for such an issue is unclear. Do you have any suggestions about that?

    PS: I used the prepare_data.py script to preprocess my own data, following the same preprocessing procedure as for the ScanNet dataset.

    opened by Hao-HUST 6
  • about downloading ScanNet

    Hi, I haven't used ScanNet before, so I'm not familiar with it and need your help. Due to my limited memory space, I want to download only the needed parts of ScanNet. Since you say:

    Our method relies on the .ply as well as the .labels.ply files.

    does this mean that I can download ScanNet using only these two commands?

    download-scannet.py -o [directory in which to download] --type _vh_clean_2.ply
    download-scannet.py -o [directory in which to download] --type _vh_clean_2.labels.ply
    

    In addition, how much memory space is needed to store the processed data?

    opened by cjyiiiing 2
  • Quadric Error Metrics: contraction of non-connected vertices possible?

    Hi there,

    Thanks for your amazing work!

    I would be glad if you could please answer the following question about the usage of QEM in VMNet.
    The publication seems to indicate that for vertex contraction only vertices connected by edges are considered. However, in the original QEM publication, the authors also propose selecting vertex pairs for contraction based on their Euclidean distance. They use a threshold value t for the Euclidean distance.

    Using only vertices connected by edges would imply a threshold of t = 0. In prepare_data.py, Tridecimator from VCGlib is called. If I understand the call correctly, the optional argument -e is not passed which specifies the threshold. In the Tridecimator application, the threshold t then defaults to inf, meaning all pairs of vertices would be eligible for contraction.

    Therefore I would like to know: Is contraction of non-connected vertices possible in VMNet?

    Thanks a lot for your time, Benjamin

    opened by BenjaminEpple 1
  • about Matterport3D results

    Hi~

    Thanks for the great work. I see that your work uses the Matterport3D dataset and achieves pretty good results, as shown in the paper.

    But I don’t find any code related to Matterport3D in this repository. I wonder if you can share the Matterport3D related code and give a brief description of how to reproduce the results on Matterport3D.

    Looking forward to your reply and this will help a lot.

    opened by lcysonya 1