PyTorch implementation of the ICCV'21 paper "MVTN: Multi-View Transformation Network for 3D Shape Recognition"

Abdullah Hamdi

Last update: Jan 3, 2023

Related tags

Deep Learning deep-learning point-cloud pytorch classification 3d 3d-models iccv2021

Overview

MVTN: Multi-View Transformation Network for 3D Shape Recognition (ICCV 2021)

By Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

Paper | Video | Tutorial

The official PyTorch code for the ICCV 2021 paper MVTN: Multi-View Transformation Network for 3D Shape Recognition. MVTN learns to transform the rendering parameters of a 3D object so that multi-view networks see better viewpoints for recognition. Without extra supervision or additional losses, MVTN improves performance in 3D classification and shape retrieval. MVTN achieves state-of-the-art performance on ModelNet40, ShapeNet Core55, and the most recent and realistic ScanObjectNN dataset (up to 6% improvement).

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Hamdi_2021_ICCV,
    author    = {Hamdi, Abdullah and Giancola, Silvio and Ghanem, Bernard},
    title     = {MVTN: Multi-View Transformation Network for 3D Shape Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1-11}
}

Requirements

This code is tested with Python 3.7 and PyTorch >= 1.5.

Install PyTorch3D as follows:

conda create -y -n MVTN python=3.7
conda activate MVTN
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=10.2
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler nvidiacub
conda install pytorch3d -c pytorch3d

Install the other helper libraries:

conda install pandas
conda install -c conda-forge trimesh
pip install einops imageio scipy matplotlib tensorboard h5py metric-learn
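
After installation, a quick check can confirm that the environment is complete. The following is a minimal sketch, not part of the repository: it only tests whether each module can be located, and the import names listed are assumptions derived from the package names above (for example, the pip package metric-learn is imported as metric_learn).

```python
import importlib.util

# Import names assumed for the packages installed above
# (e.g. the pip package "metric-learn" is imported as "metric_learn").
REQUIRED_MODULES = [
    "torch", "torchvision", "pytorch3d", "fvcore", "iopath",
    "pandas", "trimesh", "einops", "imageio", "scipy",
    "matplotlib", "tensorboard", "h5py", "metric_learn",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated MVTN conda environment; any name it reports can be installed with the matching command above.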

Usage: 3D Classification & Retrieval

The main Python script is run_mvtn.py, located in the root directory.

First, download the datasets and unzip them inside the data/ directory as follows:

ModelNet40: this link (the ModelNet object meshes are simplified to fit on the GPU and to allow for backpropagation).
ShapeNet Core55 v2: this link (you need to create an account).
ScanObjectNN: this link (ScanObjectNN has three main variants [obj_only, with_bg, hardest], selected with the --dset_variant option).

Then you can run MVTN with:

python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --nb_views 8 --views_config learned_spherical

--data_dir: the data directory. The dataloader is picked automatically from custom_dataset.py based on which of "ModelNet40", "ShapeNetCore.v2", or "ScanObjectNN" is chosen.
--run_mode: the run mode. Choices: "train" (train for classification), "test_cls" (test classification after training), "test_retr" (test retrieval after training), "test_rot" (test rotation robustness after training), "test_occ" (test occlusion robustness after training).
--mvnetwork: the multi-view network used in the pipeline. Choices: "mvcnn", "rotnet", "viewgcn".
--views_config: one of six view selection methods, which are either learned or heuristic. Choices: "circular", "random", "spherical", "learned_circular", "learned_spherical", "learned_direct". Only the learned ones are MVTN variants.
--resume: a flag to continue training from the last checkpoint.
--pc_rendering: a flag to use point clouds instead of mesh data, and point cloud rendering instead of mesh rendering. This should be the default when only point cloud data is available (as in the ScanObjectNN dataset).
--object_color: the uniform color of the rendered mesh or object. default="white", choices=["white", "random", "black", "red", "green", "blue", "custom"]
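
When launching several runs, it can help to assemble the command line programmatically and reject invalid choices early. The sketch below is illustrative only: build_mvtn_command is a hypothetical helper, not part of the repository, and it assumes that --resume and --pc_rendering are value-less flags, as the descriptions above suggest. The choice lists are copied from the option descriptions.

```python
# Choices copied from the option descriptions above.
RUN_MODES = ("train", "test_cls", "test_retr", "test_rot", "test_occ")
MV_NETWORKS = ("mvcnn", "rotnet", "viewgcn")
VIEWS_CONFIGS = ("circular", "random", "spherical",
                 "learned_circular", "learned_spherical", "learned_direct")


def build_mvtn_command(data_dir, run_mode="train", mvnetwork="mvcnn",
                       nb_views=8, views_config="learned_spherical",
                       resume=False, pc_rendering=False):
    """Build the argument list for run_mvtn.py (hypothetical helper)."""
    if run_mode not in RUN_MODES:
        raise ValueError(f"unknown run_mode: {run_mode!r}")
    if mvnetwork not in MV_NETWORKS:
        raise ValueError(f"unknown mvnetwork: {mvnetwork!r}")
    if views_config not in VIEWS_CONFIGS:
        raise ValueError(f"unknown views_config: {views_config!r}")
    cmd = ["python", "run_mvtn.py",
           "--data_dir", data_dir,
           "--run_mode", run_mode,
           "--mvnetwork", mvnetwork,
           "--nb_views", str(nb_views),
           "--views_config", views_config]
    # Assumption: both options are plain flags that take no value.
    if resume:
        cmd.append("--resume")
    if pc_rendering:
        cmd.append("--pc_rendering")
    return cmd


if __name__ == "__main__":
    # Prints the same command as the example above; pass the list to
    # subprocess.run(cmd, check=True) from the repository root to launch it.
    print(" ".join(build_mvtn_command("data/ModelNet40/")))
```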

Other parameters can be found in the config.yaml configuration file or by running python run_mvtn.py -h. The default parameters are the ones used in the paper.

The results will be saved in the results/00/0001/ folder, which contains the camera viewpoints and the renderings of some examples, as well as the checkpoints and the logs.

Note: For best performance on point cloud tasks, please set canonical_distance : 1.0 in the config.yaml file. For mesh tasks, keep as is.
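
The canonical_distance change can be made by hand, but for scripted experiments it may be convenient to patch the file. The following is a minimal sketch that assumes the key appears on its own line in the form canonical_distance: <value>; the actual layout of config.yaml is not shown here, and set_canonical_distance is a hypothetical helper.

```python
import re

# Assumption: the key sits on its own line as "canonical_distance: <value>",
# optionally followed by a trailing comment.
_PATTERN = re.compile(r"^([ \t]*canonical_distance[ \t]*:[ \t]*)\S+", re.MULTILINE)


def set_canonical_distance(config_text, value=1.0):
    """Return config_text with the canonical_distance value replaced."""
    new_text, count = _PATTERN.subn(lambda m: m.group(1) + str(value), config_text)
    if count == 0:
        raise KeyError("canonical_distance not found in the configuration text")
    return new_text


if __name__ == "__main__":
    # Hypothetical excerpt, for illustration only.
    sample = "canonical_distance: 2.2  # camera distance\n"
    print(set_canonical_distance(sample), end="")
```

To apply it, read config.yaml as text, pass the contents through set_canonical_distance, and write the result back. Keep a copy of the original file for mesh tasks.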

Other files

models/renderer.py contains the main PyTorch3D differentiable renderer class that can render multi-view images for point clouds and meshes adaptively.
models/mvtn.py contains a standalone class for MVTN that can be used with any other pipeline.
custom_dataset.py includes all the PyTorch dataloaders for 3D datasets: ModelNet40, ShapeNet Core55, ScanObjectNN, and ShapeNet Parts.
blender_simplify.py is the Blender code used to simplify the meshes with the simplify_mesh function from util.py, as follows:

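import os
from util import simplify_mesh  # helper defined in util.py

data_dir = "data"  # assumed dataset root; adjust to your setup
simplify_ratio = 0.05  # the ratio of faces to be maintained after simplification
input_mesh_file = os.path.join(data_dir, "ModelNet40/plant/train/plant_0014.off")
mymesh, reduced_mesh = simplify_mesh(input_mesh_file, simplify_ratio=simplify_ratio)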

The simplified output mesh will be saved in the same directory as the original mesh, with "SMPLER" appended to the name.

Misc

Please open an issue or contact Abdullah Hamdi ([email protected]) if you have any questions.

Acknowledgements

This paper and repo borrow code and ideas from several great GitHub repos: MVCNN pytorch, view GCN, RotationNet, and most importantly the great PyTorch3D library.

License

The code is released under MIT License (see LICENSE file for details).

Comments

ScanObjectNN (paper table 2)

Hi, thank you so much for the code release.

Can you please give the exact training and evaluation commands used for the ScanObjectNN dataset, so that the results of Table 2 in the paper can be reproduced?

Thanks in advance. Much appreciated.

opened by sheshap 9

Remedy did not work

Hello, thanks for your nice work. There is a bug:

I did not change any parameters, and ran:

CUDA_VISIBLE_DEVICES='2' python run_mvtn.py \
        --data_dir data/ModelNet40/ \
        --run_mode train \
        --mvnetwork mvcnn \
        --nb_views 8 \
        --views_config learned_spherical

Output:

...
Evaluation:
        train acc: 82.79 - train Loss: 0.6117
        Val Acc: 76.54 - val Loss: 0.8504
        Current best val acc: 76.90

-----------------------------------
Epoch: [45/100]
        Iter [50/492] Loss: 0.4384
Remedy did not work

opened by JerkyT 1

I can't get the accuracy in the paper

In the paper, MVTN using ResNet-18 reaches 90% accuracy on the ModelNet40 test set. I can only get 78% accuracy. My command is python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --nb_views 8 --views_config learned_spherical

opened by liuyuanqing 1
Is the ModelNet40 dataset aligned?

Thank you for your last reply, I'm sorry I have another question.

Because it is not stated in the paper, I think that the ModelNet40 dataset used is unaligned.

However, in the supplementary material, it states that when testing rotation robustness,

A common practice in the literature in 3D shape classification is to test the robustness of models trained on the aligned dataset by injecting perturbations during test time [ 20 ]. We follow the same setup as [ 20 ] by introducing random rotations during test time around the Y-axis (gravity-axis).

Does this mean that you are using an aligned ModelNet40 dataset only for testing rotation robustness? Or is the ModelNet40 dataset linked in README.md already aligned?
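
For reference, the test-time perturbation described in the quoted passage (a random rotation around the Y-axis) can be sketched as below. This is an illustration in plain Python, not the repository's implementation; the helper names and the rotation range max_angle_deg are assumptions, since the quoted text does not state them.

```python
import math
import random


def rotate_about_y(points, angle_deg):
    """Rotate a list of (x, y, z) points around the Y (gravity) axis."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Standard rotation about Y: the y coordinate is left unchanged.
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def random_y_rotation(points, max_angle_deg=180.0, rng=random):
    """Apply one random Y-axis rotation; max_angle_deg is an assumed range."""
    angle = rng.uniform(-max_angle_deg, max_angle_deg)
    return rotate_about_y(points, angle), angle


if __name__ == "__main__":
    cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    rotated, angle = random_y_rotation(cloud, rng=random.Random(0))
    print(f"rotated by {angle:.1f} degrees:", rotated)
```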

opened by Kumoi0728 1
Input points for Classification (1k or 2k) ?

Can you please confirm if the classification results are using 2048 input points instead of 1024 points?

Not clear from the tables (mainly because results of previous models are based on 1024 input points).

Kindly share the results of classification with 1024 points or provide exact commands and configurations to recreate the results.

Thanks in advance

opened by sheshap 1
How to find nb_points for custom dataset?

Hey! I am new to this kind of approach and wanted to apply it to a custom dataset of mine. How do I find the correct nb_points value for my dataset? Thanks!

opened by wlcosta 1
checkpoints

Thank you very much for your reply. Sorry I have encountered another problem. What is the use of the weights under results/checkpoints/modelnet/? Could you provide the model weights for the best results?

opened by whu-lee 1
Update blender_simplify to Blender 2.8 API
Using the current code on Blender 2.8 yields the following error:

bpy.data.objects['Camera'].select = True AttributeError: 'Object' object has no attribute 'select'

According to API changes in Blender 2.8:

In 2.7x, you could directly (de)select an Object from its select property. This has been removed in 2.8x, in favor of some get/set functions.

Therefore, we need to change this piece of code to allow for direct usage on Blender 2.8, which is now the default for installation.
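
One possible shape for the fix is a small compatibility helper that works on both API generations. This is a sketch, not the change actually merged into blender_simplify.py; set_selected is a hypothetical name, while select_set is the setter that Blender 2.8x provides in place of the removed select property.

```python
def set_selected(obj, state=True):
    """Select or deselect a Blender object on both the 2.7x and 2.8x APIs."""
    if hasattr(obj, "select_set"):
        # Blender 2.8x and later: selection goes through a setter method.
        obj.select_set(state)
    else:
        # Blender 2.7x: selection is a plain writable property.
        obj.select = state
```

Inside Blender, the failing line would then become set_selected(bpy.data.objects['Camera'], True).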
good first issue
opened by wlcosta 0
Dataset Link for ModelNet40 and ScanObjectNN

In China, we cannot access files on Google Drive. Can you share other download links for ModelNet40 and ScanObjectNN? For example, Baidu NetDisk offers functionality similar to Google Drive and is widely used in China and many other countries.

It would be appreciated if you can provide such links. Thanks.

opened by auniquesun 1
Worse accuracy while continuing training due to a possible mistake in initializing `setup`
I started training an MVTN model for 100 epochs with the following command, and stopped training after 57 epochs.

python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --epochs 100 --nb_views 1 --views_config learned_circular

And the output of the 57th epoch is like this,

Epoch: [57/100]
        Iter [50/492] Loss: 0.7633
        Iter [100/492] Loss: 0.7892
        Iter [150/492] Loss: 0.3939
        Iter [200/492] Loss: 0.1820
        Iter [250/492] Loss: 0.2282
        Iter [300/492] Loss: 0.6939
        Iter [350/492] Loss: 0.4468
        Iter [400/492] Loss: 0.2383
        Iter [450/492] Loss: 0.5454
Evaluation:
        train acc: 82.03 - train Loss: 0.6457
        Val Acc: 71.31 - val Loss: 1.0960
        Current best val acc: 72.61

When I load the trained model to continue training, although it started training from the 58th epoch correctly, the accuracies got lower,

Epoch: [58/100]
        Iter [50/492] Loss: 1.2060
        Iter [100/492] Loss: 0.6699
        Iter [150/492] Loss: 0.5014
        Iter [200/492] Loss: 0.4189
        Iter [250/492] Loss: 0.2721
        Iter [300/492] Loss: 0.3099
        Iter [350/492] Loss: 1.0518
        Iter [400/492] Loss: 1.1512
        Iter [450/492] Loss: 0.2506
Evaluation:
        train acc: 55.48 - train Loss: 1.6519
        Val Acc: 60.13 - val Loss: 1.4470
        Current best val acc: 72.61

I found that in ops.py, lines 260-264, the trained MVTN model is loaded only when is_learning_views = True:

if setup["is_learning_views"]:
    models_bag["mvtn"].load_state_dict(checkpoint['mvtn'])
    models_bag["mvtn_optimizer"].load_state_dict(checkpoint['mvtn_optimizer'])

and in line 55-56, is_learning_views in setup is initialized like this,

setup["is_learning_views"] = setup["views_config"] in ["learned_offset", "learned_direct", "learned_spherical", "learned_random", "learned_transfer"]

Should learned_offset in line 55 be replaced by learned_circular? The learned choices of views_config must be learned_circular, learned_spherical, learned_direct, learned_random, or learned_transfer.
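
The effect of the quoted membership test can be reproduced in isolation. The sketch below copies the list from the snippet above; is_learning_views_by_prefix is a hypothetical alternative shown only for comparison, not the repository's fix.

```python
# List copied from the quoted initialization of setup["is_learning_views"].
LEARNED_AS_QUOTED = ["learned_offset", "learned_direct", "learned_spherical",
                     "learned_random", "learned_transfer"]


def is_learning_views(views_config, learned_names=LEARNED_AS_QUOTED):
    """Mirror the quoted check: True only if the name is in the list."""
    return views_config in learned_names


def is_learning_views_by_prefix(views_config):
    """Hypothetical alternative: treat every "learned_*" config as learned."""
    return views_config.startswith("learned_")


print(is_learning_views("learned_circular"))            # False
print(is_learning_views("learned_spherical"))           # True
print(is_learning_views_by_prefix("learned_circular"))  # True
```

With the quoted list, learned_circular evaluates to False, so the MVTN weights and optimizer state would be skipped when resuming, which is consistent with the accuracy drop reported above.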

I am sorry if the reason is not here. I would appreciate it if you could tell me the correct way. :) @ajhamdi
opened by Kumoi0728 3
