PyTorch implementation of the ICCV'21 paper "MVTN: Multi-View Transformation Network for 3D Shape Recognition"

Overview

MVTN: Multi-View Transformation Network for 3D Shape Recognition (ICCV 2021)

By Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

Paper | Video | Tutorial

MVTN pipeline

The official PyTorch code of the ICCV 2021 paper MVTN: Multi-View Transformation Network for 3D Shape Recognition. MVTN learns to transform the rendering parameters of a 3D object to improve the perspectives for better recognition by multi-view networks. Without extra supervision or added losses, MVTN improves performance in 3D classification and shape retrieval. MVTN achieves state-of-the-art performance on ModelNet40, ShapeNet Core55, and the most recent and realistic ScanObjectNN dataset (up to 6% improvement).

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Hamdi_2021_ICCV,
    author    = {Hamdi, Abdullah and Giancola, Silvio and Ghanem, Bernard},
    title     = {MVTN: Multi-View Transformation Network for 3D Shape Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1-11}
}

Requirements

This code is tested with Python 3.7 and PyTorch >= 1.5:

conda create -y -n MVTN python=3.7
conda activate MVTN
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=10.2
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler nvidiacub
conda install pytorch3d -c pytorch3d
  • Install other helper libraries:
conda install pandas
conda install -c conda-forge trimesh
pip install einops imageio scipy matplotlib tensorboard h5py metric-learn
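
After installation, you can sanity-check the environment with a short Python snippet (a minimal check that only verifies the core libraries import and that CUDA is visible):

import torch
import pytorch3d

# Both should import cleanly if the conda/pip steps above succeeded.
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch3d:", pytorch3d.__version__)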

Usage: 3D Classification & Retrieval

The main Python script is run_mvtn.py in the root directory.

First download the datasets and unzip them inside the data/ directory as follows:

  • ModelNet40: this link (the ModelNet object meshes are simplified to fit in GPU memory and allow for backpropagation).

  • ShapeNet Core55 v2: this link (you need to create an account).

  • ScanObjectNN: this link (ScanObjectNN comes in three main variants [obj_only, with_bg, hardest], controlled by the --dset_variant option).

Then you can run MVTN with

python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --nb_views 8 --views_config learned_spherical  
  • --data_dir: the data directory. The dataloader is picked adaptively from custom_dataset.py based on the choice between "ModelNet40", "ShapeNetCore.v2", and "ScanObjectNN".
  • --run_mode: the run mode. Choices: "train" (train for classification), "test_cls" (test classification after training), "test_retr" (test retrieval after training), "test_rot" (test rotation robustness after training), "test_occ" (test occlusion robustness after training).
  • --mvnetwork: the multi-view network used in the pipeline. Choices: "mvcnn", "rotnet", "viewgcn".
  • --views_config: one of six view-selection methods that are either learned or heuristic. Choices: "circular", "random", "spherical", "learned_circular", "learned_spherical", "learned_direct". Only the learned ones are MVTN variants.
  • --resume: a flag to continue training from the last checkpoint.
  • --pc_rendering: a flag to use point clouds instead of mesh data and point cloud rendering instead of mesh rendering. This should be the default when only point cloud data is available (as in the ScanObjectNN dataset).
  • --object_color: the uniform color of the rendered mesh or object. default="white", choices=["white", "random", "black", "red", "green", "blue", "custom"]. A combined example follows this list.
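
For example, a point-cloud retrieval test on the hardest ScanObjectNN variant could combine several of these flags (an illustrative command only, assuming a trained checkpoint already exists in the default results folder):

python run_mvtn.py --data_dir data/ScanObjectNN/ --run_mode test_retr --mvnetwork mvcnn --nb_views 8 --views_config learned_spherical --pc_rendering --dset_variant hardest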

Other parameters can be found in the config.yaml configuration file, or by running python run_mvtn.py -h. The default parameters are the ones used in the paper.

The results will be saved in the results/00/0001/ folder, which contains the camera viewpoints and the renderings of some examples, as well as the checkpoints and the logs.

Note: for best performance on point cloud tasks, please set canonical_distance: 1.0 in the config.yaml file. For mesh tasks, keep it as is.

Other files

  • models/renderer.py contains the main PyTorch3D differentiable renderer class that can render multi-view images for point clouds and meshes adaptively.
  • models/mvtn.py contains a standalone class for MVTN that can be used with any other pipeline.
  • custom_dataset.py includes all the PyTorch dataloaders for 3D datasets: ModelNet40, ShapeNet Core55, ScanObjectNN, and ShapeNet Parts.
  • blender_simplify.py is the Blender code used to simplify the meshes with the simplify_mesh function from util.py, as follows:
import os
from util import simplify_mesh  # wraps the Blender simplification in blender_simplify.py

data_dir = "data"  # root data directory
simplify_ratio = 0.05  # the ratio of faces to be maintained after simplification
input_mesh_file = os.path.join(data_dir, "ModelNet40/plant/train/plant_0014.off")
mymesh, reduced_mesh = simplify_mesh(input_mesh_file, simplify_ratio=simplify_ratio)

The output simplified mesh will be saved in the same directory as the original mesh, with "SMPLER" appended to its name.
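
To simplify an entire dataset before training, one option is to loop over the meshes (a minimal sketch, assuming the directory layout above and that simplify_mesh writes its output next to the input as described):

import glob
import os

from util import simplify_mesh  # same helper as above

# Simplify every ModelNet40 mesh, keeping 5% of the faces; skip outputs
# of previous runs, which carry "SMPLER" in their file names.
for mesh_file in glob.glob(os.path.join("data", "ModelNet40", "*", "*", "*.off")):
    if "SMPLER" in mesh_file:
        continue
    simplify_mesh(mesh_file, simplify_ratio=0.05)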

Misc

  • Please open an issue or contact Abdullah Hamdi ([email protected]) if you have any questions.

Acknowledgements

This paper and repo borrow code and ideas from several great GitHub repos: MVCNN pytorch, view GCN, RotationNet, and most importantly the great PyTorch3D library.

License

The code is released under MIT License (see LICENSE file for details).

Comments
  • ScanObjectNN (paper table 2)

    Hi, thank you so much for the code release.

    Can you please give the exact training and evaluation commands used for training and testing on the ScanObjectNN dataset, to recreate the results of Table 2 in the paper?

    Thanks in advance. Much appreciated.

    opened by sheshap 9
  • Remedy did not work

    Hello, thanks for your nice work. There is a bug:

    I did not change any parameters, and ran:

    CUDA_VISIBLE_DEVICES='2' python run_mvtn.py \
            --data_dir data/ModelNet40/ \
            --run_mode train \
            --mvnetwork mvcnn \
            --nb_views 8 \
            --views_config learned_spherical
    

    output:

    ...
    Evaluation:
            train acc: 82.79 - train Loss: 0.6117
            Val Acc: 76.54 - val Loss: 0.8504
            Current best val acc: 76.90
    
    -----------------------------------
    Epoch: [45/100]
            Iter [50/492] Loss: 0.4384
    Remedy did not work
    
    opened by JerkyT 1
  • I can't get the accuracy in the paper

    In the paper, MVTN using ResNet18 can reach an accuracy of 90% on the ModelNet40 test set, but I can only get 78% accuracy. My command is python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --nb_views 8 --views_config learned_spherical

    opened by liuyuanqing 1
  • Is the ModelNet40 dataset aligned?

    Thank you for your last reply, I'm sorry I have another question.

    Because it is not stated in the paper, I think that the ModelNet40 dataset used is unaligned.

    However, the supplementary material states that when testing rotation robustness:

    A common practice in the literature in 3D shape classification is to test the robustness of models trained on the aligned dataset by injecting perturbations during test time [ 20 ]. We follow the same setup as [ 20 ] by introducing random rotations during test time around the Y-axis (gravity-axis).

    Does this mean that you are using an aligned ModelNet40 dataset only for testing rotation robustness? Or is the ModelNet40 you provided in README.md already an aligned dataset?

    opened by Kumoi0728 1
  • Input points for Classification (1k or 2k) ?

    Can you please confirm whether the classification results use 2048 input points instead of 1024?

    This is not clear from the tables (mainly because the results of previous models are based on 1024 input points).

    Kindly share the results of classification with 1024 points, or provide the exact commands and configurations to recreate the results.

    Thanks in advance

    opened by sheshap 1
  • How to find nb_points for custom dataset?

    Hey! I am new to this kind of approach and wanted to apply it to a custom dataset of mine. How do I find the correct nb_points value for my dataset? Thanks!

    opened by wlcosta 1
  • checkpoints

    Thank you very much for your reply. Sorry, I have encountered another problem. What is the use of the weights under results/checkpoints/modelnet/? Could you provide the model weights for the best results?

    opened by whu-lee 1
  • Update blender_simplify to Blender 2.8 API

    Using the current code on Blender 2.8 yields the following error:

    bpy.data.objects['Camera'].select = True
    AttributeError: 'Object' object has no attribute 'select'
    

    According to the API changes in Blender 2.8:

    In 2.7x, you could directly (de)select an Object from its select property. This has been removed in 2.8x, in favor of some get/set functions.

    Therefore, we need to change this piece of code to allow for direct usage on Blender 2.8, which is now the default for installation.
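
    For reference, the Blender 2.8-style replacement for that line would be (a sketch following the API change quoted above, not a tested patch):

    import bpy

    # 2.7x (removed in 2.8): bpy.data.objects['Camera'].select = True
    # 2.8x uses the new get/set functions instead:
    bpy.data.objects['Camera'].select_set(True)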

    good first issue 
    opened by wlcosta 0
  • Dataset Link for ModelNet40 and ScanObjectNN

    In China we cannot access files on Google Drive. Can you share other download links for ModelNet40 and ScanObjectNN? For example, Baidu NetDisk serves a similar function to Google Drive and is also widely used in China and many other countries.

    It would be appreciated if you can provide such links. Thanks.

    opened by auniquesun 1
  • Worse accuracy while continuing training due to a possible mistake in initializing `setup`

    I trained an MVTN model for 100 epochs with the following command, and stopped training after 57 epochs.

    python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --epochs 100 --nb_views 1 --views_config learned_circular
    

    The output of the 57th epoch looks like this:

    Epoch: [57/100]
            Iter [50/492] Loss: 0.7633
            Iter [100/492] Loss: 0.7892
            Iter [150/492] Loss: 0.3939
            Iter [200/492] Loss: 0.1820
            Iter [250/492] Loss: 0.2282
            Iter [300/492] Loss: 0.6939
            Iter [350/492] Loss: 0.4468
            Iter [400/492] Loss: 0.2383
            Iter [450/492] Loss: 0.5454
    Evaluation:
            train acc: 82.03 - train Loss: 0.6457
            Val Acc: 71.31 - val Loss: 1.0960
            Current best val acc: 72.61

    When I loaded the trained model to continue training, it correctly started from the 58th epoch, but the accuracies got lower:

    Epoch: [58/100]
            Iter [50/492] Loss: 1.2060
            Iter [100/492] Loss: 0.6699
            Iter [150/492] Loss: 0.5014
            Iter [200/492] Loss: 0.4189
            Iter [250/492] Loss: 0.2721
            Iter [300/492] Loss: 0.3099
            Iter [350/492] Loss: 1.0518
            Iter [400/492] Loss: 1.1512
            Iter [450/492] Loss: 0.2506
    Evaluation:
            train acc: 55.48 - train Loss: 1.6519
            Val Acc: 60.13 - val Loss: 1.4470
            Current best val acc: 72.61

    I found that in ops.py, lines 260-264, the trained MVTN model is loaded only when is_learning_views = True:

    if setup["is_learning_views"]:
            models_bag["mvtn"].load_state_dict(
                checkpoint['mvtn'])
            models_bag["mvtn_optimizer"].load_state_dict(
                checkpoint['mvtn_optimizer'])
    

    and in lines 55-56, is_learning_views in setup is initialized like this:

    setup["is_learning_views"] = setup["views_config"] in ["learned_offset",
                                                           "learned_direct", "learned_spherical", "learned_random", "learned_transfer"]
    

    Should the learned_offset in line 55 be replaced by learned_circular? Because the choices of a learned views_config must be learned_circular, learned_spherical, learned_direct, learned_random, or learned_transfer.
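
    In other words, the proposed check would read (the commenter's suggestion, not a confirmed fix):

    setup["is_learning_views"] = setup["views_config"] in [
        "learned_circular", "learned_direct", "learned_spherical",
        "learned_random", "learned_transfer"]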

    I am sorry if the reason is not here. I would appreciate it if you could tell me the correct way. :) @ajhamdi

    opened by Kumoi0728 3