Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Overview

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

PointContrast

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) are notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; and remarkably, on ScanNet, even using 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.

[CVPR 2021 Paper] [Video] [Project Page] [ScanNet Data-Efficient Benchmark]

Environment

This codebase was tested with the following environment configurations.

  • Ubuntu 20.04
  • CUDA 10.2
  • GCC 7.3.0
  • Python 3.7.7
  • PyTorch 1.5.1
  • MinkowskiEngine v0.4.3

Installation

We use conda for the installation process:

# Install virtual env and PyTorch
conda create -n sparseconv043 python=3.7
conda activate sparseconv043
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch

# Compile and install MinkowskiEngine 0.4.3.
conda install mkl mkl-include -c intel
wget https://github.com/NVIDIA/MinkowskiEngine/archive/refs/tags/v0.4.3.zip
unzip v0.4.3.zip
cd MinkowskiEngine-0.4.3
python setup.py install
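
To quickly verify the installation, the following optional Python snippet (not part of the official instructions) checks that PyTorch and MinkowskiEngine import correctly and that CUDA is visible:

# Optional sanity check for the environment above.
import torch
import MinkowskiEngine as ME

print("PyTorch:", torch.__version__)            # expected: 1.5.1
print("MinkowskiEngine:", ME.__version__)       # expected: 0.4.3
print("CUDA available:", torch.cuda.is_available())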

Next, clone the Contrastive Scene Contexts git repository and install the requirements from the root directory.

git clone https://github.com/facebookresearch/ContrastiveSceneContexts.git
cd ContrastiveSceneContexts
pip install -r requirements.txt

Our code also depends on PointGroup and PointNet++.

# Install ops from PointGroup:
conda install -c bioconda google-sparsehash
cd downstream/semseg/lib/bfs/ops
python setup.py build_ext --include-dirs=YOUR_ENV_PATH/include
python setup.py install

# Install PointNet++
cd downstream/votenet/models/backbone/pointnet2
python setup.py install
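
If you are unsure what YOUR_ENV_PATH refers to above, the prefix of the currently active Python environment can be printed with the small helper below (a convenience snippet, not part of the official instructions); the required headers live under its include/ subdirectory.

# Print the active environment's prefix; use <prefix>/include for
# YOUR_ENV_PATH/include in the PointGroup build command above.
import sys
print(sys.prefix)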

Pre-training on ScanNet

Data Pre-processing

For pre-training, one can generate the ScanNet pair data with the following code (change TARGET and SCANNET_DIR accordingly in the script).

cd pretrain/scannet_pair
./preprocess.sh

This script first extracts point clouds from partial frames and then, for each scene, computes a file list of overlapping partial frames. Next, generate a combined txt file called overlap30.txt from the per-scene file lists by running:

cd pretrain/scannet_pair
python generate_list.py --target_dir TARGET

The resulting overlap30.txt should be placed in the folder TARGET/splits.
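
For reference, this combining step amounts to concatenating the per-scene overlap file lists into a single TARGET/splits/overlap30.txt. The sketch below only illustrates the idea; the glob pattern for the per-scene files is an assumption, and generate_list.py remains the authoritative implementation.

# Illustrative sketch of the combining step (assumed per-scene file layout).
from pathlib import Path

target = Path("TARGET")                   # same TARGET as used in preprocess.sh
splits = target / "splits"
splits.mkdir(parents=True, exist_ok=True)

chunks = []
for scene_list in sorted(target.glob("*/*overlap*.txt")):
    if "splits" in scene_list.parts:      # skip any previously combined output
        continue
    chunks.append(scene_list.read_text())
(splits / "overlap30.txt").write_text("".join(chunks))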

Pre-training

Our codebase supports multi-GPU training via PyTorch's DistributedDataParallel (DDP) module. To pre-train with 8 GPUs (batch_size=32, 4 per GPU) on a single server:

cd pretrain/contrastive_scene_contexts
# Pretrain with SparseConv backbone
OUT_DIR=./output DATASET=ROOT_PATH_OF_DATA scripts/pretrain_sparseconv.sh
# Pretrain with PointNet++ backbone
OUT_DIR=./output DATASET=ROOT_PATH_OF_DATA scripts/pretrain_pointnet2.sh

ScanNet Downstream Tasks

Data Pre-Processing

We provide the code for pre-processing the data for the ScanNet downstream tasks. Run the following code to generate the training data for semantic segmentation and instance segmentation.

# Edit path variables, SCANNET_OUT_PATH
cd downstream/semseg/lib/datasets/preprocessing
python scannet.py

For ScanNet detection data generation, please refer to VoteNet ScanNet Data. Then run the following command to soft-link the generated detection data (located at PATH_DET_DATA) to the expected location:

# soft link detection data
cd downstream/det/
ln -s PATH_DET_DATA datasets/scannet/scannet_train_detection_data
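
As an optional sanity check, the linked folder should contain the per-scene .npy files produced by the VoteNet ScanNet preprocessing (file naming follows VoteNet and may differ in other versions):

# Optional check that the soft-linked detection data is visible.
from pathlib import Path

data_dir = Path("datasets/scannet/scannet_train_detection_data")
verts = sorted(data_dir.glob("*_vert.npy"))   # per-scene vertex arrays (VoteNet naming)
print(f"Found {len(verts)} scenes; first entries: {[p.name for p in verts[:3]]}")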

For Data-Efficient Learning, download the scene_list and points_list, as well as the bbox_list, from the ScanNet Data-Efficient Benchmark. To perform Active Selection for points_list, run the following code:

# Get features per point
cd downstream/semseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/inference_features.sh
# run k-means on feature space
cd lib
python sampling_points.py --point_data SCANNET_OUT_PATH --feat_data PATH_CHECKPOINT
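
Conceptually, Active Selection clusters the per-point features of a scene with k-means and keeps one representative point per cluster as the annotation budget. The snippet below is only an illustrative sketch under assumed array shapes; lib/sampling_points.py is the actual implementation invoked above.

# Illustrative sketch of k-means-based active point selection (assumed shapes).
import numpy as np
from sklearn.cluster import KMeans

def select_annotation_points(point_features, budget=20):
    """Return `budget` point indices, one per k-means cluster in feature space."""
    kmeans = KMeans(n_clusters=budget, n_init=10).fit(point_features)
    selected = []
    for c in range(budget):
        # distance of every point to this cluster's centroid
        dists = np.linalg.norm(point_features - kmeans.cluster_centers_[c], axis=1)
        dists[kmeans.labels_ != c] = np.inf      # restrict to members of cluster c
        selected.append(int(np.argmin(dists)))   # member closest to the centroid
    return selected

# Example with random placeholder features (N points x D feature dimensions)
feats = np.random.randn(10000, 96).astype(np.float32)
print(select_annotation_points(feats, budget=20))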

Semantic Segmentation

We provide code for the semantic segmentation experiments conducted in our paper. Our code supports multi-GPU training. To train with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Points Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_INDS=PATH_SCENE_LIST ./scripts/data_efficient/by_points.sh

Model Zoo

We also provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running:

# PATH_CHECKPOINT points to downloaded pre-trained model path:
cd downstream/semseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

Training Data | mIoU (val) | Initialization | Pre-trained Model | Logs | Tensorboard
1% scenes | 29.3 | download | download | link | link
5% scenes | 45.4 | download | download | link | link
10% scenes | 59.5 | download | download | link | link
20% scenes | 64.1 | download | download | link | link
100% scenes | 73.8 | download | download | link | link
20 points | 53.8 | download | download | link | link
50 points | 62.9 | download | download | link | link
100 points | 66.9 | download | download | link | link
200 points | 69.0 | download | download | link | link

Instance Segmentation

We provide code for the instance segmentation experiments conducted in our paper. Our code supports multi-GPU training. To train with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Points Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_INDS=PATH_POINTS_LIST ./scripts/data_efficient/by_points.sh

For the ScanNet Benchmark, run the following code (trains on train + val and evaluates on val):

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=SCANNET_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet_benchmark.sh

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

To submit to the ScanNet Benchmark with our pre-trained model, run the following command (the submission file is located in output/benchmark_instance):

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet_benchmark.sh

Training Data | mAP@0.5 (val) | Initialization | Pre-trained Model | Logs | Curves
1% scenes | 12.3 | download | download | link | link
5% scenes | 33.9 | download | download | link | link
10% scenes | 45.3 | download | download | link | link
20% scenes | 49.8 | download | download | link | link
100% scenes | 59.4 | download | download | link | link
20 points | 27.2 | download | download | link | link
50 points | 35.7 | download | download | link | link
100 points | 43.6 | download | download | link | link
200 points | 50.4 | download | download | link | link
train + val | 76.5 (64.8 on test) | download | download | link | link

3D Object Detection

We provide the code for the 3D object detection downstream task. The code is adapted directly from VoteNet. Additionally, we provide two backbones, namely PointNet++ and SparseConv. To fine-tune on the downstream task, run the following command:

cd downstream/votenet/
# train sparseconv backbone
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet.sh
# train pointnet++ backbone
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_scannet_pointnet.sh

For Limited Scene Reconstruction, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT TRAIN_FILE=PATH_SCENE_LIST ./scripts/data_efficient/by_scenes.sh

For Limited Bbox Annotation, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
DATAPATH=SCANNET_DATA LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT SAMPLED_BBOX=PATH_BBOX_LIST ./scripts/data_efficient/by_bboxes.sh

For submitting to the ScanNet Data-Efficient Benchmark, set "test.write_to_bencmark=True" in "downstream/votenet/scripts/test_scannet.sh" or "downstream/votenet/scripts/test_scannet_pointnet.sh".

Model Zoo

We provide our pre-trained checkpoints (and log files) for reference. You can evaluate our pre-trained models by running the following code.

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_scannet.sh

Training Data | mAP@0.5 (val) | mAP@0.25 (val) | Initialization | Pre-trained Model | Logs | Curves
10% scenes | 9.9 | 24.7 | download | download | link | link
20% scenes | 21.4 | 41.4 | download | download | link | link
40% scenes | 29.5 | 52.0 | download | download | link | link
80% scenes | 36.3 | 56.3 | download | download | link | link
100% scenes | 39.3 | 59.1 | download | download | link | link
100% scenes (PointNet++) | 39.2 | 62.5 | download | download | link | link
1 bbox | 30.3 | 54.5 | download | download | link | link
2 bboxes | 32.4 | 55.3 | download | download | link | link
4 bboxes | 34.6 | 58.9 | download | download | link | link
7 bboxes | 35.9 | 59.7 | download | download | link | link

Stanford 3D (S3DIS) Fine-tuning

Data Pre-Processing

We provide the code for pre-processing the data for the Stanford 3D (S3DIS) downstream tasks. Run the following code to generate the training data for semantic segmentation and instance segmentation.

# Edit path variables, STANFORD_3D_OUT_PATH
cd downstream/semseg/lib/datasets/preprocessing
python stanford.py

Semantic Segmentation

We provide code for the semantic segmentation experiments conducted in our paper. Our code supports multi-GPU training. To fine-tune with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/semseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_stanford3d.sh

Model Zoo

We provide our pre-trained model and log file for reference. You can evaluate our pre-trained model by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/semseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_stanford3d.sh

Training Data | mIoU (val) | Initialization | Pre-trained Model | Logs | Tensorboard
100% scenes | 72.2 | download | download | link | link

Instance Segmentation

We provide code for the instance segmentation experiments conducted in our paper. Our code supports multi-GPU training. To fine-tune with 8 GPUs on a single server:

# Edit relevant path variables and then run:
cd downstream/insseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_stanford3d.sh

Model Zoo

We provide our pre-trained model and log file for reference. You can evaluate our pre-trained model by running:

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/insseg/
DATAPATH=STANFORD_3D_OUT_PATH LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_stanford3d.sh

Training Data | mAP@0.5 (val) | Initialization | Pre-trained Model | Logs | Tensorboard
100% scenes | 63.4 | download | download | link | link

SUN-RGBD Fine-tuning

Data Pre-Processing

For SUN-RGBD detection data generation, please refer to VoteNet SUN-RGBD Data. To soft-link the generated SUN-RGBD detection data (located at SUN_RGBD_DATA_PATH) to the expected location, run:

cd downstream/det/datasets/sunrgbd
# soft link
ln -s SUN_RGBD_DATA_PATH/sunrgbd_pc_bbox_votes_50k_v1_train sunrgbd_pc_bbox_votes_50k_v1_train
ln -s SUN_RGBD_DATA_PATH/sunrgbd_pc_bbox_votes_50k_v1_val sunrgbd_pc_bbox_votes_50k_v1_val

3D Object Detection

We provide the code for the 3D object detection downstream task. The code is adapted directly from VoteNet. To fine-tune on the downstream task, run the following code:

# Edit relevant path variables and then run:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/train_sunrgbd.sh

Model Zoo

We provide our pre-trained checkpoint (and log file) for reference. You can load our pre-trained model by setting PATH_CHECKPOINT to the downloaded model path.

# PATH_CHECKPOINT points to pre-trained model path:
cd downstream/votenet/
LOG_DIR=./output PRETRAIN=PATH_CHECKPOINT ./scripts/test_sunrgbd.sh

Training Data | mAP@0.5 (val) | mAP@0.25 (val) | Initialization | Pre-trained Model | Logs | Curves
100% scenes | 36.4 | 58.9 | download | download | link | link

Citing our paper

@article{hou2020exploring,
  title={Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts},
  author={Hou, Ji and Graham, Benjamin and Nie{\ss}ner, Matthias and Xie, Saining},
  journal={arXiv preprint arXiv:2012.09165},
  year={2020}
}

License

Contrastive Scene Contexts is released under the MIT License. See the LICENSE file for more details.

Comments
  • Docker Image and domain specific information

    Docker Image and domain specific information

    Hi guys,

    Thank you for making the codebase available and explaining all the minute details both in the paper and in the GitHub repo. Your work inspires me to pre-train with my own data (around 700,000 scans) of assemblies of different automobile structures and fine-tune it for downstream tasks (semantic segmentation and object detection). This work seems like the best match for the kind of information I want to capture from a cluttered LiDAR-scanned scene.

    I need some guidance on whether it is worth pre-training with my dataset instead of using the ScanNet pre-trained weights, as I sense a clear domain mismatch between the indoor scans and the type of scans I am describing. Any input in this regard will be highly appreciated.

    I know it is some additional effort from your side, but could you please make a Docker image or Dockerfile available for setting up the environment for the project, to reduce the cycle time of the project I am working on?

    I really appreciate your valuable time and wish you a happy Christmas in advance.

    @likethesky @Celebio @colesbury @pdollar @minqi

    opened by Shreyas-Gururaj 7
  • downstream task semseg

    downstream task semseg

    Hello,

    I have a question about the semseg downstream task on the Stanford dataset.

    Thanks for providing all the log files and pre-trained models. Although dir or norm losses seem to be used for the semseg downstream task on the Stanford dataset, as shown in your log file, there is no code that produces dir or norm losses in 'downstream/semseg/lib/ddp_trainer.py', lines 270-272. To reproduce your work on the Stanford dataset, should we modify dataset.py and ddp_trainer.py to include those loss terms? (Checking the ScanNet semseg log file, I found that the dir and norm loss terms are not used there, unlike in the Stanford semseg task.)

    Thanks in advance.

    opened by JunhyeopLee 6
  • S3DIS Semantic Segmentation Training from Scratch

    S3DIS Semantic Segmentation Training from Scratch

    In both the PointContrast and ContrastiveSceneContexts papers, the semantic segmentation result on S3DIS is stated as 68.2 mIoU. But in MinkowskiNet's GitHub repository (https://github.com/chrischoy/SpatioTemporalSegmentation), they achieve 66.3 mIoU using Mink16UNet34. You are also using Res16UNet34C with a 5cm voxel size. When I train the model using my own repository with Res16UNet34C, I also get around 66.4 mIoU. Is there anything I am missing? Can you explain how you get +2 mIoU compared to the original Minkowski model? Is it due to data augmentation, the optimizer, etc.?

    opened by YilmazKadir 5
  • question about NMS in instance segmentation

    question about NMS in instance segmentation

    Hi, thanks for your code release. I have a question about the NMS code here. It seems that it only removes some proposals predicted for classes [10, 12, 16], unlike a regular NMS where the proposal score is used for ranking and the IoU overlap between proposals is computed to remove redundant proposals?

    opened by Dingry 4
  • multiprocessing and spawn

    multiprocessing and spawn

    Hi,

    Thank you for open-sourcing your work! It is really neat!

    However, I have trouble launching your jobs.

    I have to set the start method to "spawn" (torch.multiprocessing.set_start_method('spawn')) in order to run launch.sh. Otherwise I get this error:

    RuntimeError: cuda runtime error (3) : initialization error at /opt/conda/conda-bld/pytorch_1591914855613/work/aten/src/THC/THCGeneral.cpp:47
    

    or

    RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
    

    However, if I do this, I get an error in multiprocess_utils about the pickle function:

    _pickle.PicklingError: Can't pickle <function single_proc_run at 0x7fcd513d9170>: attribute lookup single_proc_run on __main__ failed
    

    I checked, and this suggests I should use 'fork' instead of 'spawn'.

    I am using pytorch 1.5.1 (py3.7_cuda10.2.89_cudnn7.6.5_0), and hydra version:

    hydra-colorlog            1.0.0                    pypi_0    pypi
    hydra-core                1.0.0                    pypi_0    pypi
    hydra-submitit-launcher   1.1.0                    pypi_0    pypi
    

    I wonder if you have any idea on how to correctly launch your job?

    Thank you!

    opened by JudyYe 4
  • Issue on preparing scannet downstream data

    Issue on preparing scannet downstream data

    Sorry, but I cannot find the path 'downstream/semseg/lib/datasets/preprocessing/scannet' in this repo. I found scannet.py to preprocess data in the PointContrast repo, but I get the error "'PosixPath' object has no attribute 'write'". I do not know what is wrong.

    opened by sjtuchenye 4
  • Semseg on S3DIS with single GPU

    Semseg on S3DIS with single GPU

    I tried to run './scripts/train_stanford3d.sh' and modified the config to a single GPU, without changing other code. I do not know why I get the following error:

    Traceback (most recent call last):
      File "ddp_main.py", line 232, in cli_main
        main(config)
      File "ddp_main.py", line 187, in main
        train(model, train_data_loader, val_data_loader, config)
      File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/train.py", line 107, in train
        coords, input, target = data_iter.next()
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
        data = self._next_data()
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
        return self._process_data(data)
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
        data.reraise()
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    AttributeError: Caught AttributeError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
        data = fetcher.fetch(index)
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/dataset.py", line 277, in __getitem__
        coords, feats, labels, center = self.load_ply(index)
      File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/dataset.py", line 71, in wrapper
        results = func(self, *args, **kwargs)
      File "/data1/ljx/CY/PointContrast/downstream/semseg/lib/datasets/stanford.py", line 158, in load_ply
        plydata = PlyData.read(filepath)
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/plyfile.py", line 287, in read
        data = PlyData._parse_header(stream)
      File "/data1/ljx/anaconda3/envs/PointContrast/lib/python3.7/site-packages/plyfile.py", line 229, in _parse_header
        line = stream.readline().decode('ascii').strip()
    AttributeError: 'PosixPath' object has no attribute 'readline'

    opened by sjtuchenye 4
  • Pre-trained models

    Pre-trained models

    Hi, thank you very much for your great codes and detailed explanation! I have some questions about pre-trained models. I want to use the pre-trained model obtained from unsupervised learning and fine-tune the pre-trained model on my own dataset for semantic segmentation. You provide 'Initialization' and 'Pre-trained Model' for all your experiments. From my understanding, your limited annotation training starts from a pre-trained network, and that means the 'Initialization' should be the pre-trained model. If so, what are 'Pre-trained Model'? Which model should I use for my fine-tuning? I'm looking forward to your reply.

    opened by yaping222 3
  • Train split for limited scene reconstructions

    Train split for limited scene reconstructions

    Do you use the first 1%, 5%, 10%, 20% of the data when you fine-tune the network on ScanNet semantic segmentation for limited reconstruction case, or do you randomly shuffle the data first? If so, do you avoid splitting subscenes that are generated in the same scene to train and validation?

    opened by YilmazKadir 2
  • Preprocess S3DIS data

    Preprocess S3DIS data

    Hi, I downloaded the S3DIS data and unzipped the data, but it seems like the file structure of the dataset does not match your preprocessing code. The structure is as follows:

    ├── ReadMe.txt
    ├── Stanford3dDataset_v1.2_Aligned_Version.mat
    ├── Stanford3dDataset_v1.2_Aligned_Version.zip
    ├── Stanford3dDataset_v1.2.mat
    └── Stanford3dDataset_v1.2.zip
    

    Can you share your file structure and give me some suggestions? Thank you very much!

    opened by Hiusam 2
  • Training data split

    Training data split

    Thank you for the interesting work. Would it be possible for you to share the training data split filenames, or the script for generating them, for ScanNet and ShapeNet?

    Thanks in advance,

    opened by ahme0307 2
  • I want to ask about running my own pointcloud in S3DIS Fine-tuning

    I want to ask about running my own pointcloud in S3DIS Fine-tuning

    Hello, excuse me. I am new to 3D semantic segmentation, and it is nice to see your work. I would like to ask you a few questions. ① I would like to run my own point cloud in the semantic segmentation task of the S3DIS fine-tuning. Could I convert my point cloud into the same file structure and format as the Stanford3dDataset_v1.2_Aligned_Version dataset and then use it as input? In addition, is the network weight (PRETRAIN=PATH_CHECKPOINT) necessary in ./scripts/train_stanford3d.sh? ② When I ran the code following the README, I encountered problems in ./scripts/train_stanford3d.sh; it reported the error shown in the attached screenshot. I wonder if you know how to solve it. Looking forward to your reply!

    opened by Freedomcls 2
  • cuda memory

    cuda memory

    Hi, thanks for this great work. I have a question, since I want to run the training code on my desktop with a single 3090 GPU. I saw that you use 8 GPUs and set the batch size to 32. If I want to run the pre-training code on my GPU, what batch size should I set? I set it to 2 but still run out of CUDA memory.

    Thanks, zihui

    opened by zhangzihui247 7
  • Available Dockerfile

    Available Dockerfile

    Hi,

    First of all, thank you very much for sharing this project. I am glad to learn all the details of your paper from this repo.

    To speed up reproduction, I would like to have a Docker image so that this work can be reproduced quickly, without the burden of installing all those packages.

    I have made some progress:

    FROM nvidia/cuda:10.2-devel-ubuntu18.04 AS build
    
    RUN apt-get update && apt-get install -y --no-install-recommends \
            lsof wget ca-certificates \
            g++-7 && \
        rm -rf /var/lib/apt/lists/*
    
    RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-py38_4.9.2-Linux-x86_64.sh -O ~/miniconda.sh && \
        /bin/bash ~/miniconda.sh -b -p /opt/conda && \
        rm ~/miniconda.sh && \
        /opt/conda/bin/conda clean -tipsy && \
        ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
        echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc && \
        echo "conda activate base" >> ~/.bashrc
    
    ENV PATH=/opt/conda/bin:$PATH \
        LANG=C.UTF-8 \
        CXX=g++-7
    
    RUN conda install -c conda-forge conda-pack && \
        conda create -n mink -c pytorch-lts -c conda-forge -c anaconda \
            python=3.8 \
            openblas-devel \
            pytorch torchvision cudatoolkit=10.2 && \
        conda install -c bioconda google-sparsehash && \
        conda clean -ya
    
    RUN apt-get update && apt-get install -y --no-install-recommends \
            git && \
        rm -rf /var/lib/apt/lists/*
    
    ENV TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0+PTX"
    RUN /opt/conda/envs/mink/bin/pip install --no-deps --no-cache -U git+https://github.com/NVIDIA/MinkowskiEngine -v \
                    --install-option="--blas_include_dirs=/opt/conda/envs/mink/include" \
                    --install-option="--blas=openblas" \
                    --install-option="--force_cuda" \
                    --install-option="--cuda_home=/usr/local/cuda"
    
    RUN /opt/conda/envs/mink/bin/pip install hydra-core==1.0.0 \
                                             tensorboardX==2.0 \
                                             scipy==1.5.4 \
                                             scikit-learn==0.23.1 \
                                             plyfile==0.4 \
                                             pandas==1.0.5 \
                                             trimesh==3.7.5 \
                                             imageio==2.8.0 \
                                             hydra-colorlog==1.0.0 \
                                             hydra-submitit-launcher==1.1.0 \
                                             matplotlib==3.2.2 \
                                             opencv-python==4.5.1.48 
    WORKDIR /mink
    RUN conda-pack -n mink -o /tmp/mink.tar && \
        tar xf /tmp/mink.tar && rm /tmp/mink.tar
    
    RUN /mink/bin/conda-unpack
    
    FROM nvidia/cuda:10.2-devel-ubuntu18.04
    
    ENV CONDA_PREFIX=/mink
    ENV PATH=$CONDA_PREFIX/bin:$PATH \
        LANG=C.UTF-8
    
    COPY --from=build $CONDA_PREFIX $CONDA_PREFIX
    
    SHELL ["/bin/bash", "-c"]
    RUN source /$CONDA_PREFIX/bin/activate
    

    I am able to import MinkowskiEngine and PyTorch, but I could not find a way to install PointGroup and PointNet++ inside the Docker image. It would be very nice if you could release a Docker image to reproduce your project.

    Or, if there is someone who would like to develop this Docker image together, feel free to contact me and we can build it together, because I really want to learn this state-of-the-art work.

    Best regards, zshyang

    opened by zshyang 1