3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans.

3DMV

3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans. This work is based on our ECCV'18 paper, 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation.

Code

Installation:

Training is implemented with PyTorch. This code was developed under PyTorch 0.2 and recently upgraded to PyTorch 0.4.
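
Before training, it may help to confirm which PyTorch version is installed. The following is a minimal sketch, not part of the repository: the helper names `major_minor` and `is_tested` are hypothetical, and the set of versions is taken only from the note above (0.2 and 0.4).

```python
def major_minor(version):
    """Extract (major, minor) from a version string such as '0.4.1' or '1.13.0+cu117'."""
    parts = version.split("+")[0].split(".")
    return int(parts[0]), int(parts[1])


# Versions named in the installation note; nothing else is claimed to work or fail.
TESTED_VERSIONS = {(0, 2), (0, 4)}


def is_tested(version):
    """True if the version string matches one of the versions named in the note."""
    return major_minor(version) in TESTED_VERSIONS


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        installed = str(torch.__version__)
        status = "matches" if is_tested(installed) else "differs from"
        print("PyTorch " + installed + " " + status + " the versions this code was developed under")
```

A mismatch does not necessarily mean the code will fail; it only signals that the installed version is not one the authors mention.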

Training:

See python train.py --help for all train options. Example train call:

python train.py --gpu 0 --train_data_list [path to list of train files] --data_path_2d [path to 2d image data] --class_weight_file [path to txt file of train histogram] --num_nearest_images 5 --model2d_path [path to pretrained 2d model]
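
When the train call is launched from a script, building it as an argument list avoids shell-quoting problems with paths that contain spaces. The sketch below uses only the flags shown in the example call above; the function name `build_train_command` and all paths are placeholders, not files or helpers shipped with the repository.

```python
import shlex


def build_train_command(train_data_list, data_path_2d, class_weight_file,
                        model2d_path, gpu=0, num_nearest_images=5):
    """Return the train.py invocation as an argument list (no shell involved)."""
    return [
        "python", "train.py",
        "--gpu", str(gpu),
        "--train_data_list", train_data_list,
        "--data_path_2d", data_path_2d,
        "--class_weight_file", class_weight_file,
        "--num_nearest_images", str(num_nearest_images),
        "--model2d_path", model2d_path,
    ]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    cmd = build_train_command(
        train_data_list="data/train_files.txt",
        data_path_2d="data/2d images",
        class_weight_file="data/train_histogram.txt",
        model2d_path="models/model2d.pth",
    )
    # Print a copy-pasteable command line; quoting is added only where needed.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To actually start training: subprocess.run(cmd, check=True)
```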

Trained models: models.zip

Testing:

See python test.py --help for all test options. Example test call:

python test.py --gpu 0 --scene_list [path to list of test scenes] --model_path [path to trained model.pth] --data_path_2d [path to 2d image data] --data_path_3d [path to test scene data] --num_nearest_images 5 --model2d_orig_path [path to pretrained 2d model]

Data:

The data below has been precomputed from the ScanNet (v2) dataset.

Train data for ScanNet v2: 3dmv_scannet_v2_train.zip (6.2G)

2D training images can be generated from the ScanNet dataset using the 2D data preparation script in prepare_data.
Expected file structure for 2D data:

scene0000_00/
|--color/
   |--[framenum].jpg
       ⋮
|--depth/
   |--[framenum].png   (16-bit pngs)
       ⋮
|--pose/
   |--[framenum].txt   (4x4 rigid transform as txt file)
       ⋮
|--label/    (if applicable)
   |--[framenum].png   (8-bit pngs)
       ⋮
scene0000_01/
⋮
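
The layout above can be sanity-checked before training. This is a minimal sketch, not part of the repository, and it rests on two assumptions that the source does not state: that every frame number should appear in each folder that is present, and that a pose file holds four whitespace-separated rows of four numbers. The names `check_scene` and `read_pose` are hypothetical.

```python
from pathlib import Path

# Folder name -> file extension, as listed in the expected file structure.
REQUIRED = {"color": ".jpg", "depth": ".png", "pose": ".txt"}


def frame_ids(folder, suffix):
    """Collect frame identifiers (file names without extension) in a folder."""
    return {p.stem for p in folder.glob("*" + suffix)}


def check_scene(scene_dir):
    """Return a list of problems found; an empty list means the layout looks consistent."""
    scene_dir = Path(scene_dir)
    problems = []
    frames = {}
    for name, suffix in REQUIRED.items():
        folder = scene_dir / name
        if not folder.is_dir():
            problems.append("missing folder: " + name + "/")
            continue
        frames[name] = frame_ids(folder, suffix)
    # label/ is optional ("if applicable"), so it is only checked when present.
    label = scene_dir / "label"
    if label.is_dir():
        frames["label"] = frame_ids(label, ".png")
    if frames:
        all_ids = set.union(*frames.values())
        for name, ids in sorted(frames.items()):
            missing = sorted(all_ids - ids)
            if missing:
                problems.append(name + "/ lacks frames: " + str(missing))
    return problems


def read_pose(path):
    """Parse a pose file into a 4x4 nested list (assumed format: 4 rows of 4 numbers)."""
    lines = Path(path).read_text().splitlines()
    rows = [[float(v) for v in line.split()] for line in lines if line.strip()]
    if len(rows) != 4 or any(len(row) != 4 for row in rows):
        raise ValueError("expected a 4x4 matrix in " + str(path))
    return rows
```

Calling `check_scene("scene0000_00")` on a scene folder returns the list of detected problems, which can be printed or logged before a long training run.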

Test scenes for ScanNet v2: 3dmv_scannet_v2_test_scenes.zip (110M)

Citation:

If you find our work useful in your research, please consider citing:

@inproceedings{dai20183dmv,
 author = {Dai, Angela and Nie{\ss}ner, Matthias},
 booktitle = {Proceedings of the European Conference on Computer Vision ({ECCV})},
 title = {3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation},
 year = {2018}
}

Contact:

If you have any questions, please email Angela Dai at [email protected].

Owner

Владислав Молодцов
