MonoScene: Monocular 3D Semantic Scene Completion

Overview

[arXiv + supp] | [Project page]
Anh-Quan Cao, Raoul de Charette
Inria, Paris, France

If you find this work useful, please cite our paper:

@misc{cao2021monoscene,
      title={MonoScene: Monocular 3D Semantic Scene Completion}, 
      author={Anh-Quan Cao and Raoul de Charette},
      year={2021},
      eprint={2112.00726},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Code and models will be released soon. Please watch this repo for updates.

Demo

SemanticKITTI | KITTI-360 (trained on SemanticKITTI)

NYUv2

Comments
  • TypeError: 'int' object is not subscriptable

    (monoscene) ruidong@ubuntu-X299-UD4-Pro:~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=2 batch_size=2
    exp_kitti_1_FrusSize_8_nRelations4_WD0.0001_lr0.0001_CEssc_geoScalLoss_semScalLoss_fpLoss_CERel_3DCRP_Proj_2_4_8
    n_relations (32, 32, 4)
    Traceback (most recent call last):
      File "monoscene/scripts/train_monoscene.py", line 118, in main
        class_weights=class_weights,
      File "/home/ruidong/workplace/MonoScene/monoscene/models/monoscene.py", line 80, in __init__
        context_prior=context_prior,
      File "/home/ruidong/workplace/MonoScene/monoscene/models/unet3d_kitti.py", line 62, in __init__
        self.feature * 4, self.feature * 4, size_l3, bn_momentum=bn_momentum
      File "/home/ruidong/workplace/MonoScene/monoscene/models/CRP3D.py", line 15, in __init__
        self.flatten_size = size[0] * size[1] * size[2]
    TypeError: 'int' object is not subscriptable

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
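    A minimal sketch of what the failing line expects, assuming CRP3D's `size` argument should be a 3-element spatial size (the log above prints a tuple like (32, 32, 4)) while a bare int reached it; the guard below is illustrative, not the repo's actual fix:

        # hypothetical reading of CRP3D.py line 15: `size` must be indexable
        def flatten_size(size):
            if isinstance(size, int):
                # a bare int cannot be indexed with size[0]; assume a cubic volume
                size = (size, size, size)
            return size[0] * size[1] * size[2]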

    opened by DipDipPotatoChips 21
  • Questions about cross-entropy loss

    Dear authors, thanks for your great work! In your paper, you say that "the losses are computed only where y is defined". Does this mean you add no supervision on non-occupied voxels and use the multi-class classification loss only on occupied voxels? If so, how can the model identify which voxels are occupied?
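    A minimal sketch of the masking that quote describes, assuming PyTorch-style logits and an ignore label (255 here, an assumption) for voxels where y is undefined. Note that "empty" is itself a class, so free space is still supervised; only unknown voxels are excluded, which is how the model can still learn occupancy:

        import torch
        import torch.nn.functional as F

        # logits: (B, C, X, Y, Z) with class 0 = "empty"; target: (B, X, Y, Z),
        # where 255 marks voxels whose ground truth y is undefined
        def ssc_loss(logits, target, ignore_index=255):
            # ignored voxels contribute nothing; empty voxels (class 0)
            # are still supervised, so occupancy itself is learned
            return F.cross_entropy(logits, target, ignore_index=ignore_index)

        logits = torch.randn(1, 20, 8, 8, 4)
        target = torch.randint(0, 20, (1, 8, 8, 4))
        target[..., 0] = 255  # pretend one slab has no ground truth
        print(ssc_loss(logits, target))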

    opened by weiyithu 13
  • about test

    FileNotFoundError: [Errno 2] No such file or directory: '/home/ruidong/workplace/MonoScene/trained_models/monoscene_kitti.ckpt'

    The last line printed during training was: Epoch 29: 100%|████████| 2325/2325 [1:06:52<00:00, 1.73s/it, loss=3.89, v_num=]
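    A hedged sketch for locating the checkpoint that training actually wrote, assuming pytorch-lightning saved it under the logdir passed as kitti_logdir (the path below is a placeholder); the eval scripts expect it at trained_models/monoscene_kitti.ckpt, so copy or symlink one there:

        import glob

        # placeholder path: wherever $KITTI_LOG pointed during training
        ckpts = glob.glob("/path/to/kitti_logdir/**/*.ckpt", recursive=True)
        print(ckpts)  # e.g. copy one to trained_models/monoscene_kitti.ckpt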

    opened by DipDipPotatoChips 13
  • Cuda out of memory

    Dear author, you said to use a smaller 2D backbone by changing basemodel_name and num_features, and that EfficientNet-B5 can reduce memory. Where can I find the B5 weights, and what value should num_features take?
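    A hedged sketch of that swap, assuming the rwightman/gen-efficientnet-pytorch hub that the training logs show the repo loading from; the exact B5 variant name is an assumption, and 2048 is EfficientNet-B5's final feature width (B7, the default, uses 2560):

        import torch

        basemodel_name = "tf_efficientnet_b5_ns"  # hypothetical B5 variant
        num_features = 2048  # EfficientNet-B5 final feature channels (B7: 2560)

        # loaded the same way as the default B7 backbone, via torch.hub
        backbone = torch.hub.load(
            "rwightman/gen-efficientnet-pytorch", basemodel_name, pretrained=True
        )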

    opened by lulianLiu 12
  • Pretrained models on other dataset: NuScenes

    Hi @anhquancao,

    Thanks so much for your paper and your implementation. Do you have a pretrained model for NuScenes? If so, could you share it? I want to build upon your work on the NuScenes dataset, but there is a large domain gap between SemanticKITTI and NuScenes, so the model pretrained on SemanticKITTI does not work well on NuScenes.

    Thanks!

    opened by ducminhkhoi 11
  • failed to run test

    When I try to run the script below, it crashes without printing any information:

        python monoscene/scripts/generate_output.py +output_path=$MONOSCENE_OUTPUT dataset=kitti_360 +kitti_360_root=$KITTI_360_ROOT +kitti_360_sequence=2013_05_28_drive_0028_sync n_gpus=1 batch_size=1

    Any suggestion will be much appreciated.

    opened by ChiyuanFeng 9
  • cannot find calib

    PS F:\Studying\CY-Workspace\MonoScene-master> python monoscene/scripts/eval_monoscene.py dataset=kitti kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS n_gpus=1 batch_size=1
    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    n_relations 4
    Using cache found in C:\Users\DELL/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master
    Loading base model ()...Done.
    Removing last two layers (global_pool & classifier).
    Building Encoder-Decoder model..Done.
    Traceback (most recent call last):
      File "monoscene/scripts/eval_monoscene.py", line 71, in main
        data_module.setup()
      File "F:\anaconda\envs\monoscene\lib\site-packages\pytorch_lightning\core\datamodule.py", line 440, in wrapped_fn
        fn(*args, **kwargs)
      File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dm.py", line 34, in setup
        color_jitter=(0.4, 0.4, 0.4),
      File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 60, in __init__
        os.path.join(self.root, "dataset", "sequences", sequence, "calib.txt")
      File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 193, in read_calib
        with open(calib_path, "r") as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'dataset\sequences\00\calib.txt'
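    A hedged note on the likely cause: the error path is relative ('dataset\sequences\00\calib.txt'), which suggests kitti_root expanded to an empty string; PowerShell does not expand bash-style $KITTI_ROOT, so pass the value explicitly (or use $env:KITTI_ROOT). A minimal check of the layout the dataloader expects (paths illustrative):

        import os

        kitti_root = os.environ.get("KITTI_ROOT", "")  # empty if the variable is unset
        calib = os.path.join(kitti_root, "dataset", "sequences", "00", "calib.txt")
        print(repr(calib), os.path.isfile(calib))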

    opened by cyaccpect 9
  • about visualization

    (monoscene) ruidong@ubuntu-X299-UD4-Pro:~/workplace/MonoScene$ python monoscene/scripts/visualization/kitti_vis_pred.py +file=/home/ruidong/workplace/MonoScene/outputs/kitti/08/000000.pkl +dataset=kitt
    monoscene/scripts/visualization/kitti_vis_pred.py:23: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      coords_grid = coords_grid.astype(np.float)
    Traceback (most recent call last):
      File "monoscene/scripts/visualization/kitti_vis_pred.py", line 196, in main
        d=7,
      File "monoscene/scripts/visualization/kitti_vis_pred.py", line 75, in draw
        grid_coords = np.vstack([grid_coords.T, voxels.reshape(-1)]).T
    AttributeError: 'tuple' object has no attribute 'T'

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
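    A minimal sketch of the failing pattern, assuming grid_coords reached draw() as a tuple (e.g. straight from np.meshgrid) rather than an ndarray; stacking it into an array first restores .T (shapes here are illustrative, not the script's real grid):

        import numpy as np

        xx, yy, zz = np.meshgrid(np.arange(4), np.arange(4), np.arange(2), indexing="ij")
        grid_coords = np.array([xx.ravel(), yy.ravel(), zz.ravel()]).T  # (N, 3) array, not a tuple
        voxels = np.zeros((4, 4, 2))
        grid_coords = np.vstack([grid_coords.T, voxels.reshape(-1)]).T  # .T now works
        print(grid_coords.shape)  # (32, 4)

    The DeprecationWarning is separate: replacing np.float with plain float silences it without changing behavior.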

    opened by DipDipPotatoChips 9
  • Porting the work of this paper to a new dataset

    Hello author, first of all thank you for your great work. Is it possible to apply your work directly to the nuScenes dataset? Does nuScenes need point cloud data to assist in generating the voxel data?

    opened by yukaizhou 8
  • Can you help me in another paper?

    Hello! Last year, when you reproduced the code of SISC (https://github.com/OPEN-AIR-SUN/SISC), you found a bug and solved it. Now I have run into the same problem. Can you tell me how you solved it? Thank you very much!

    opened by WkangLiu 8
  • ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    Something went wrong with my machine and I reinstalled Ubuntu. I re-cloned the code and kept only the data, but when I follow the README installation steps it prints:

    (monoscene) potato@ubuntu-X299-UD4-Pro:~/workplace/MonoScene$ pip install -e ./
    Obtaining file:///home/potato/workplace/MonoScene
    Installing collected packages: monoscene
      Running setup.py develop for monoscene
    Successfully installed monoscene-0.0.0
    (monoscene) potato@ubuntu-X299-UD4-Pro:~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=1 batch_size=1 sem_scal_loss=False
    Traceback (most recent call last):
      File "monoscene/scripts/train_monoscene.py", line 1, in <module>
        from monoscene.data.semantic_kitti.kitti_dm import KittiDataModule
      File "/home/potato/workplace/MonoScene/monoscene/data/semantic_kitti/kitti_dm.py", line 3, in <module>
        import pytorch_lightning as pl
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
        from pytorch_lightning import metrics  # noqa: E402
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
        from pytorch_lightning.metrics.classification import (  # noqa: F401
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
        from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
        from pytorch_lightning.metrics.utils import deprecated_metrics, void
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
        from torchmetrics.utilities.data import get_num_classes as _get_num_classes
    ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/torchmetrics/utilities/data.py)
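    This looks like a pytorch-lightning / torchmetrics version mismatch: newer torchmetrics releases removed get_num_classes, which this pytorch_lightning build still imports. A quick diagnostic (downgrading torchmetrics to the version pinned in the repo's requirements is the usual remedy):

        import torchmetrics

        print(torchmetrics.__version__)
        # get_num_classes was removed in newer torchmetrics releases; if this
        # import raises, the installed torchmetrics is too new for the installed
        # pytorch_lightning, and pinning the older version from the repo's
        # requirements restores it
        from torchmetrics.utilities.data import get_num_classes  # noqa: F401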

    opened by DipDipPotatoChips 7
Releases (v0.1)
Owner
Code from the Computer Vision group of the RITS team, Inria

Related projects
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

Facebook Research 296 Dec 29, 2022
Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)

Self-Supervised Multi-Frame Monocular Scene Flow 3D visualization of estimated depth and scene flow (overlayed with input image) from temporally conse

Visual Inference Lab @TU Darmstadt 85 Dec 22, 2022
[CVPR'21] Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation

Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation Weixiang Yang, Qi Li, Wenxi Liu, Yuanlong Yu, Y

null 118 Dec 26, 2022
Commonsense Knowledge Base Completion with Structural and Semantic Context (AAAI 2020)

Commonsense Knowledge Base Completion with Structural and Semantic Context Code for the paper Commonsense Knowledge Base Completion with Structural an

AI2 96 Nov 5, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion

Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion Code for Multi-Temporal Scene Classification and Scene Ch

Lixiang Ru 33 Dec 12, 2022
Neural Scene Graphs for Dynamic Scene (CVPR 2021)

Implementation of Neural Scene Graphs, that optimizes multiple radiance fields to represent different objects and a static scene background. Learned representations can be rendered with novel object compositions and views.

null 151 Dec 26, 2022
A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''

README.md shall be finished soon. WSSGG 0 Overview 1 Installation 1.1 Faster-RCNN 1.2 Language Parser 1.3 GloVe Embeddings 2 Settings 2.1 VG-GT-Graph

Keren Ye 35 Nov 20, 2022
Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

DeepPanoContext (DPC) [Project Page (with interactive results)][Paper] DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context G

Cheng Zhang 66 Nov 16, 2022
Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch

Automatic Number Plate Recognition Automatic Number Plate Recognition (ANPR) is the process of reading the characters on the plate with various optica

Meftun AKARSU 52 Dec 22, 2022
Pytorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors

Make-A-Scene - PyTorch Pytorch implementation (unofficial) of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors (https://arxiv.org/

Casual GAN Papers 259 Dec 28, 2022
Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

RfD-Net [Project Page] [Paper] [Video] RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction Yinyu Nie, Ji Hou, Xiaoguang Han, Matthi

Yinyu Nie 162 Jan 6, 2023
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 8, 2023
ICRA 2021 "Towards Precise and Efficient Image Guided Depth Completion"

PENet: Precise and Efficient Depth Completion This repo is the PyTorch implementation of our paper to appear in ICRA2021 on "Towards Precise and Effic

null 232 Dec 25, 2022
[ICLR 2021, Spotlight] Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks, ICLR 2021 (Spotlight) Demo | Paper [NEW!] Time to play with our interac

Shengyu Zhao 373 Jan 2, 2023
Style-based Point Generator with Adversarial Rendering for Point Cloud Completion (CVPR 2021)

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion (CVPR 2021) An efficient PyTorch library for Point Cloud Completion.

Microsoft 119 Jan 2, 2023
[CVPR 2021 Oral] Variational Relational Point Completion Network

VRCNet: Variational Relational Point Completion Network This repository contains the PyTorch implementation of the paper: Variational Relational Point

PL 121 Dec 12, 2022
[CVPR 2021] Unsupervised 3D Shape Completion through GAN Inversion

ShapeInversion Paper Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy "Unsupervised 3D

null 100 Dec 22, 2022