Focal Sparse Convolutional Networks for 3D Object Detection (CVPR 2022, Oral)

Overview

This is the official implementation of Focals Conv (CVPR 2022), a new sparse convolution design for 3D object detection that works in both LiDAR-only and multi-modal settings. For more details, please refer to:

Focal Sparse Convolutional Networks for 3D Object Detection [Paper]
Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia

Experimental results

KITTI dataset

Model                                          Car@R11  Car@R40  Download
PV-RCNN + Focals Conv                          83.91    85.20    Google | Baidu (key: m15b)
PV-RCNN + Focals Conv (multi-modal)            84.58    85.34    Google | Baidu (key: ie6n)
Voxel R-CNN (Car) + Focals Conv (multi-modal)  85.68    86.00    Google | Baidu (key: tnw9)

nuScenes dataset

Model                                              mAP    NDS    Download
CenterPoint + Focals Conv (multi-modal)            63.86  69.41  Google | Baidu (key: 01jh)
CenterPoint + Focals Conv (multi-modal), 1/4 data  62.15  67.45  Google | Baidu (key: 6qsc)

Visualization of the voxel distribution of Focals Conv on the KITTI val set:

Getting Started

Installation

a. Clone this repository

git clone https://github.com/dvlab-research/FocalsConv && cd FocalsConv

b. Install the environment

Follow the installation documents of the OpenPCDet and CenterPoint codebases respectively, depending on which codebase you prefer.

*spconv 2.x is highly recommended over spconv 1.x.
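
For reference, spconv 2.x is distributed as pre-built wheels on PyPI named by CUDA version. A minimal sketch (the wheel name below is only an example; match it to your local CUDA toolkit):

# Example only: choose the spconv 2.x wheel that matches your CUDA version
# (e.g. spconv-cu111, spconv-cu113, spconv-cu114, ...).
pip install spconv-cu113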

c. Prepare the datasets.

Download and organize the official KITTI and Waymo datasets following the documentation in OpenPCDet, and the nuScenes dataset following the CenterPoint codebase.
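
For reference, a minimal sketch of the expected KITTI layout and the info-file generation step, assuming the standard OpenPCDet getting-started setup (run from the OpenPCDet root; adjust paths to your machine):

# Assumed layout (OpenPCDet convention):
#   OpenPCDet/data/kitti/ImageSets
#   OpenPCDet/data/kitti/training   (calib, image_2, label_2, velodyne, planes)
#   OpenPCDet/data/kitti/testing    (calib, image_2, velodyne)
# Generate the KITTI info files and gt database:
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml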

*Note that for the nuScenes dataset, we use image-level GT-sampling (copy-paste) in multi-modal training. Please download this dbinfos_train_10sweeps_withvelo.pkl to replace the original one. (Google | Baidu (key: b466))

*Note that for the nuScenes dataset, we conduct ablation studies on a 1/4 training split. Please download infos_train_mini_1_4_10sweeps_withvelo_filter_True.pkl if you need it for training. (Google | Baidu (key: 769e))
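
A rough sketch of where these downloaded files go, assuming the usual CenterPoint nuScenes data root (the path below is an assumption; adjust it to your own layout):

# Place the downloaded pickle files under the nuScenes data root used by CenterPoint.
NUSC_ROOT=CenterPoint/data/nuScenes
cp dbinfos_train_10sweeps_withvelo.pkl ${NUSC_ROOT}/                       # replaces the original gt-sampling infos
cp infos_train_mini_1_4_10sweeps_withvelo_filter_True.pkl ${NUSC_ROOT}/    # optional 1/4 training split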

d. Download pre-trained models.

If you want to directly evaluate the trained models we provide, please download them first.

If you want to train by yourself in the multi-modal settings, please first download the ResNet pre-trained model, torchvision-res50-deeplabv3.

Evaluation

We provide trained weight files so you can run evaluation with them directly. You can also use a model you trained yourself.

For models in OpenPCDet,

NUM_GPUS=8
cd tools 
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/voxel_rcnn_car_focal_multimodal.yaml --ckpt path/to/voxelrcnn_focal_multimodal.pth

bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_multimodal.yaml --ckpt path/to/pvrcnn_focal_multimodal.pth

bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_lidar.yaml --ckpt path/to/pvrcnn_focal_lidar.pth
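
If you only have a single GPU, the non-distributed OpenPCDet entry point can be used instead (a sketch assuming the standard tools/test.py script; the batch size is only an example value):

python test.py --cfg_file cfgs/kitti_models/pv_rcnn_focal_lidar.yaml --ckpt path/to/pvrcnn_focal_lidar.pth --batch_size 4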

For models in CenterPoint,

CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/dist_test.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG --checkpoint centerpoint_focal_multimodal.pth

Training

For configs in OpenPCDet,

bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/CONFIG.yaml
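
For example, to train the multi-modal PV-RCNN + Focals Conv model on KITTI with 8 GPUs (the config name matches the one used above for evaluation):

NUM_GPUS=8
cd tools
bash scripts/dist_train.sh ${NUM_GPUS} --cfg_file cfgs/kitti_models/pv_rcnn_focal_multimodal.yaml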

For configs in CenterPoint,

python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/train.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG
  • Note that we use 8 GPUs to train OpenPCdet models and 4 GPUs to train CenterPoint models.
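
For example, a concrete instantiation of the CenterPoint training command for the multi-modal nuScenes model with 4 GPUs (the config name is the one used in the Evaluation section):

NUM_GPUS=4
CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/train.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG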

TODO List

    • Config files and trained models on the full Waymo dataset.
    • Config files and scripts for test-time augmentation (double-flip and rotation) for the nuScenes test submission.
    • Results and models of Focals Conv networks on 3D segmentation datasets.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{focalsconv-chen,
  title={Focal Sparse Convolutional Networks for 3D Object Detection},
  author={Chen, Yukang and Li, Yanwei and Zhang, Xiangyu and Sun, Jian and Jia, Jiaya},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Acknowledgement

  • This work is built upon OpenPCDet and CenterPoint. Please refer to their official GitHub repositories for more information.

  • This README follows the style of IA-SSD.

License

This project is released under the Apache 2.0 license.

Related Repos

  1. spconv
  2. Deformable Conv
  3. Submanifold Sparse Conv
Comments
  • Question about the results on nuscenes

    Thanks for your inspiring work.

    I have some questions concerning the results on nuscene.

    When do you impose the focal loss [loss_box_of_pts] on nuScenes? I notice that you only employ it with modality fusion but disable it in the LiDAR-only setting. Is there any reason behind this?

    opened by Solacex 14
  • I can't reproduce the multi-modal accuracy (KITTI val split, AP3D(R11)) you mentioned in the paper

    My val result (90 epochs), 3D AP: 89.3103, 85.0520, 79.2089

    Your val result in Table 8 (Focals Conv-F, 3D): 89.82, 85.22, 85.19

    I wonder why the accuracy does not improve on the hard split, and how many epochs did you train?

    opened by qimingx 11
  • Backward error when training on nuScenes

    Thanks for contributing this wonderful work.

    Previously, when I ran Focals Conv on KITTI, everything was OK. However, when I try to train on nuScenes using nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal, I get an error:

    File "det3d/torchie/apis/train.py", line 337, in train_detector
        trainer.run(data_loaders, cfg.workflow, cfg.total_epochs, local_rank=cfg.local_rank)
      File "det3d/torchie/trainer/trainer.py", line 553, in run
        epoch_runner(data_loaders[i], self.epoch, **kwargs)
      File "det3d/torchie/trainer/trainer.py", line 428, in train
        self.call_hook("after_train_iter")
      File "det3d/torchie/trainer/trainer.py", line 335, in call_hook
        getattr(hook, fn_name)(self)
      File "det3d/core/utils/dist_utils.py", line 54, in after_train_iter
        runner.outputs["loss"].backward()
      File "torch/_tensor.py", line 484, in backward
        torch.autograd.backward(
      File "torch/autograd/__init__.py", line 191, in backward
        Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [148285, 16]], which is output 0 of ReluBackward0, is at version 12; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
    

    I also tried running the normal CenterPoint VoxelNet config nusc_centerpoint_voxelnet_0075voxel_fix_bn_z.py in this repo, and it trains smoothly, so I guess the problem occurs in the Focals Conv layer. Any idea about this problem? Any hint or suggestion about where to look would also help. Thanks a lot.

    opened by klightz 8
  • lidar-only focal centerpoint config is missing

    I see CenterPoint with multi-modal Focals Conv, but cannot find a CenterPoint with LiDAR-only (unimodal) Focals Conv version. So I made a new config myself, changing the CenterPoint config from voxelnet to voxelfocal and from spmiddleresnetfhd to spmiddleresnetfhdfocal, with use_img = False. Is this the right way to do it?

    opened by konyul 7
  • Issues about fusing multimodal features

    Dear authors: in def construct_multimodal_features, there are two ways to fuse multi-modal features, and fuse_sum = False means concatenation is used. But when I set fuse_sum = False in "out = out.replace_feature(self.construct_multimodal_features(out, x_rgb, batch_dict, False))", the code reports the error "RuntimeError: running_mean should contain 32 elements not 16". I wonder how to amend the code to solve this problem. Thanks a lot!

    opened by Eaton2022 5
  • Cannot reproduce the result by loading the provided checkpoint on my machines

    Hello, sorry for another question. Recently I tried to train the model by myself and got a result 4-5 points lower than what the README reports, so I first tried loading the checkpoint you provide as a sanity check.

    However, the checkpoint I loaded also gives lower performance. I also set up the environment independently on another machine and ran the checkpoint test directly, and still got a worse result, the same numbers as before. I suspect it may be caused by some package version mismatch or some API behavior. Any idea about this?

    I think I followed the guidelines for the dataset and codebase setup exactly. For the evaluation command, I followed the README and ran:

    CONFIG="nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal"
    python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} ./tools/dist_test.py configs/nusc/voxelnet/$CONFIG.py --work_dir ./work_dirs/$CONFIG --checkpoint centerpoint_focal_multimodal.pth
    

    Here are my key packages version:

    torch==1.8.2
    opencv-python==4.4.0.46
    kornia==0.6.6
    spconv==2.1.22
    

    The output of the prediction:

    Loading NuScenes tables for version v1.0-trainval...
    23 category,
    8 attribute,
    4 visibility,
    64386 instance,
    12 sensor,
    10200 calibrated_sensor,
    2631083 ego_pose,
    68 log,
    850 scene,
    34149 sample,
    2631083 sample_data,
    1166187 sample_annotation,
    4 map,
    Done loading in 39.821 seconds.
    ======
    Reverse indexing ...
    Done reverse indexing in 9.6 seconds.
    ======
    Finish generate predictions for testset, save to work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal/infos_val_10sweeps_withvelo_filter_True.json
    Initializing nuScenes detection evaluation
    Loaded results from work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal/infos_val_10sweeps_withvelo_filter_True.json. Found detections for 6019 samples.
    Loading annotations for val split from nuScenes version: v1.0-trainval
    100%|█████████████████████████████████████████████████████████████████████████████████████████████| 6019/6019 [00:17<00:00, 350.05it/s]
    Loaded ground truth annotations for 6019 samples.
    Filtering predictions
    => Original number of boxes: 497297
    => After distance based filtering: 357680
    => After LIDAR and RADAR points based filtering: 357680
    => After bike rack filtering: 357308
    Filtering ground truth annotations
    => Original number of boxes: 187528
    => After distance based filtering: 134565
    => After LIDAR and RADAR points based filtering: 121871
    => After bike rack filtering: 121861
    Rendering sample token 5376e3a2874542d8b440faa899e52b97
    Rendering sample token 14f665de1fa34d0a9d12838a5b77d687
    Rendering sample token c428be7e072c4c2489b90a6dcefcae4c
    Rendering sample token d6d3eac48860468aa0eba1ae2896b5ea
    Rendering sample token 67aad7ad948f44f8af668ea8389bdd52
    Rendering sample token 9c9f22a58fdc45f2b8a119cda3554f1f
    Rendering sample token e30f071748cc49eb85babe49265a4eda
    Rendering sample token f4550267cd0240e1a1ceb844e33e97d4
    Rendering sample token 93fdce35d7db4764ad5f822f57ab49e2
    Rendering sample token 22186f4894ab46b481a9e1ee31d7734e
    Accumulating metric data...
    Calculating metrics...
    Rendering PR and TP curves
    Saving metrics to: ./work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_focal_multimodal
    mAP: 0.6030
    mATE: 0.2830
    mASE: 0.2549
    mAOE: 0.2796
    mAVE: 0.2552
    mAAE: 0.1885
    NDS: 0.6754
    Eval time: 110.2s
    Per-class results:
    Object Class    AP      ATE     ASE     AOE     AVE     AAE
    car     0.854   0.178   0.155   0.108   0.267   0.193
    truck   0.563   0.311   0.179   0.067   0.243   0.235
    bus     0.699   0.314   0.179   0.049   0.420   0.274
    trailer 0.396   0.513   0.209   0.442   0.191   0.177
    construction_vehicle    0.221   0.666   0.425   0.865   0.120   0.284
    pedestrian      0.853   0.140   0.275   0.377   0.212   0.093
    motorcycle      0.618   0.200   0.244   0.229   0.387   0.240
    bicycle 0.458   0.166   0.266   0.300   0.202   0.011
    traffic_cone    0.686   0.140   0.333   nan     nan     nan
    barrier 0.683   0.201   0.284   0.080   nan     nan
    Evaluation nusc: Nusc v1.0-trainval Evaluation
    car Nusc dist AP@0.5, 1.0, 2.0, 4.0
    76.30, 85.90, 89.02, 90.20 mean AP: 0.8535605962619657
    truck Nusc dist AP@0.5, 1.0, 2.0, 4.0
    38.66, 55.67, 63.59, 67.33 mean AP: 0.563122813202771
    construction_vehicle Nusc dist AP@0.5, 1.0, 2.0, 4.0
    3.60, 14.35, 29.32, 40.98 mean AP: 0.22061467261075263
    bus Nusc dist AP@0.5, 1.0, 2.0, 4.0
    45.98, 70.61, 80.23, 82.79 mean AP: 0.6990066750667856
    trailer Nusc dist AP@0.5, 1.0, 2.0, 4.0
    10.95, 34.49, 51.42, 61.45 mean AP: 0.3957810641400781
    barrier Nusc dist AP@0.5, 1.0, 2.0, 4.0
    58.83, 68.34, 72.09, 73.76 mean AP: 0.682558135475819
    motorcycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
    53.71, 63.50, 64.63, 65.32 mean AP: 0.6179004111724922
    bicycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
    43.90, 45.99, 46.47, 46.84 mean AP: 0.45797520318286145
    pedestrian Nusc dist AP@0.5, 1.0, 2.0, 4.0
    83.08, 84.88, 85.97, 87.29 mean AP: 0.8530652142070791
    traffic_cone Nusc dist AP@0.5, 1.0, 2.0, 4.0
    65.87, 67.16, 69.11, 72.33 mean AP: 0.6861473045618407
    

    So you can see I get 60.3 mAP on the full dataset, and 56 mAP on the 1/4 dataset.

    BTW, I also checked the dataset correctness by observing the behavior of the CenterPoint baseline. I loaded the checkpoint for the centerpoint_voxel_1440 setup and got exactly the 59.6 mAP they report, so I think something goes wrong in the image-fusion part. Any idea about this issue? Really, thanks a lot!

    opened by klightz 5
  • Voxel R-CNN usage

    Hi, thanks for your excellent work! I want to check the 3-class eval results on Voxel R-CNN, so how can I train it? In the kitti_models directory, I can't see a Voxel R-CNN with Focals Conv configuration file for 3 classes. Can you release it? Thanks again.

    opened by vehxianfish 5
  • How to generate the 1/4 nuScenes data split in OpenPCDet

    Hello, you have provided the 1/4 nuScenes data as a CenterPoint-based pkl, infos_train_mini_1_4_10sweeps_withvelo_filter_True.pkl, so I want to know how to generate the 1/4 data split for OpenPCDet. Thank you very much!

    opened by wenchened 4
  • Why split foreground and background features based on importance, but then combine them for the next submanifold conv?

    Hi, these lines of code confuse me:

    Foreground and background features are first split based on the predicted importance:

    https://github.com/dvlab-research/FocalsConv/blob/875e7b0c931ad7d1b2577c4c2201f228c7314a57/OpenPCDet/pcdet/models/backbones_3d/focal_sparse_conv/focal_sparse_conv.py#L212

    but then the foreground and background features are combined and fed into the next submanifold conv:

    https://github.com/dvlab-research/FocalsConv/blob/875e7b0c931ad7d1b2577c4c2201f228c7314a57/OpenPCDet/pcdet/models/backbones_3d/focal_sparse_conv/focal_sparse_conv.py#L216

    Why do background features participate in the next conv?

    opened by FlyingQianMM 3
  • How to compute the params and runtime (inference time)?

    Dear authors, first of all, thanks for your great work. After reading your paper, I would really like to know how to calculate the parameters and runtime of adding Focals Conv to Voxel R-CNN, as mentioned in your experiments. I want to try it, but I don't know how to do it; there is little information on the Internet, and after searching on Google I got confused. Could you share the code to accomplish this, if you have it? I would really appreciate your help. Thank you in advance.

    opened by Jane-QinJ 3
  • UnboundLocalError: local variable 'mv_height' referenced before assignment

    Hello, thank you for your awesome work! When I try to train with pv_rcnn_focal_lidar.yaml, an error occurs: UnboundLocalError: local variable 'mv_height' referenced before assignment

    The full error is:

    Traceback (most recent call last):
      File "train.py", line 201, in <module>
        main()
      File "train.py", line 153, in main
        train_model(
      File "/data1/qinjia/FocalsConv/OpenPCDet/tools/train_utils/train_utils.py", line 111, in train_model
        accumulated_iter = train_one_epoch(
      File "/data1/qinjia/FocalsConv/OpenPCDet/tools/train_utils/train_utils.py", line 25, in train_one_epoch
        batch = next(dataloader_iter)
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
        data = self._next_data()
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
        return self._process_data(data)
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
        data.reraise()
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/_utils.py", line 395, in reraise
        raise self.exc_type(msg)
    UnboundLocalError: Caught UnboundLocalError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
        data = fetcher.fetch(index)
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/qinjia/Software/anaconda3/envs/FocalConv/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/data1/qinjia/FocalsConv/OpenPCDet/tools/../pcdet/datasets/kitti/kitti_dataset.py", line 425, in __getitem__
        data_dict = self.prepare_data(data_dict=input_dict)
      File "/data1/qinjia/FocalsConv/OpenPCDet/tools/../pcdet/datasets/dataset.py", line 130, in prepare_data
        data_dict = self.data_augmentor.forward(
      File "/data1/qinjia/FocalsConv/OpenPCDet/tools/../pcdet/datasets/augmentor/data_augmentor.py", line 243, in forward
        data_dict = cur_augmentor(data_dict=data_dict)
      File "/data1/qinjia/FocalsConv/OpenPCDet/tools/../pcdet/datasets/augmentor/database_sampler.py", line 403, in __call__
        mv_height = mv_height[valid_mask]
    UnboundLocalError: local variable 'mv_height' referenced before assignment

    I set USE_ROAD_PLANE: False in pv_rcnn_focal_lidar.yaml because my custom dataset does not have road-plane data, but when I then tried the KITTI data, the same error still occurred.

    I tried to google this error; it seems the variable is not defined before it is referenced. I checked mv_height in /FocalsConv/OpenPCDet/tools/../pcdet/datasets/augmentor/database_sampler.py, line 403, in __call__: mv_height = mv_height[valid_mask]

    But I don't understand what is wrong here. If you can help me, I'd really appreciate it. Thanks!

    opened by Jane-QinJ 3
  • Add a web demo/model to Hugging Face

    Hi, would you be interested in adding FocalsConv to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models, datasets, and Spaces (web demos) can be added to a user account or organization, similar to GitHub.

    Example from other organizations:
    Keras: https://huggingface.co/keras-io
    Microsoft: https://huggingface.co/microsoft
    Facebook: https://huggingface.co/facebook

    Example Spaces with repos:
    github: https://github.com/salesforce/BLIP
    Spaces: https://huggingface.co/spaces/salesforce/BLIP

    github: https://github.com/facebookresearch/omnivore
    Spaces: https://huggingface.co/spaces/akhaliq/omnivore

    And here are guides for adding Spaces/models/datasets to your org:
    How to add a Space: https://huggingface.co/blog/gradio-spaces
    How to add models: https://huggingface.co/docs/hub/adding-a-model
    Uploading a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

    Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

    opened by AK391 2