Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

Overview


Demo | Project Page | Video | Paper

Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi, Visual Geometry Group, University of Oxford. In CVPR 2020 (Best Paper Award).

We propose a method to learn weakly symmetric deformable 3D object categories from raw single-view images, without ground-truth 3D, multiple views, 2D/3D keypoints, prior shape models or any other supervision.

Setup (with Anaconda)

1. Install dependencies:

conda env create -f environment.yml

OR manually:

conda install -c conda-forge scikit-image matplotlib opencv moviepy pyyaml tensorboardX

2. Install PyTorch:

conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=9.2 -c pytorch

Note: The code is tested with PyTorch 1.2.0 and CUDA 9.2 on CentOS 7. A GPU is required for training and testing, since the neural_renderer package only has a GPU implementation. You can still run the demo without a GPU.
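Before proceeding, a quick sanity check (our suggestion, not part of the original instructions) confirms the PyTorch version, CUDA version, and GPU availability:

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"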

3. Install neural_renderer:

This package is required for training and testing, and optional for the demo. It requires a GPU device and GPU-enabled PyTorch.

pip install neural_renderer_pytorch

Note: Installation may fail if your GCC version is below 5. If you do not want to upgrade your GCC, one alternative is to use conda's GCC and compile the package from source. For example:

conda install gxx_linux-64=7.3
git clone https://github.com/daniilidis-group/neural_renderer.git
cd neural_renderer
python setup.py install
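Either way, you can verify that the package built correctly with a quick import (a simple sanity check, assuming a GPU-enabled PyTorch install):

python -c "import torch, neural_renderer"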

4. (For demo only) Install facenet-pytorch:

This package is optional for the demo. It enables automatic human face detection.

pip install facenet-pytorch
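For illustration, MTCNN from facenet-pytorch detects face bounding boxes as in the sketch below (the image path is hypothetical, and the demo's actual detection and cropping logic may differ):

     from PIL import Image
     from facenet_pytorch import MTCNN

     mtcnn = MTCNN()                                       # MTCNN face detector
     img = Image.open('demo/images/human_face/face.jpg')   # hypothetical input image
     boxes, probs = mtcnn.detect(img)                      # bounding boxes and confidences
     print(boxes, probs)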

Datasets

  1. CelebA face dataset. Please download the original images (img_celeba.7z) from their website and run celeba_crop.py in data/ to crop the images.
  2. Synthetic face dataset generated using Basel Face Model. This can be downloaded using the script download_synface.sh provided in data/.
  3. Cat face dataset composed of Cat Head Dataset and Oxford-IIIT Pet Dataset (license). This can be downloaded using the script download_cat.sh provided in data/.
  4. Synthetic car dataset generated from ShapeNet cars. The images are rendered with random viewpoints from the top, where the cars are primarily oriented vertically. This can be downloaded using the script download_syncar.sh provided in data/.

Please remember to cite the corresponding papers if you use these datasets.

Pretrained Models

Download the pretrained models using the scripts provided in pretrained/, e.g.:

cd pretrained && sh download_pretrained_celeba.sh

Demo

python -m demo.demo --input demo/images/human_face --result demo/results/human_face --checkpoint pretrained/pretrained_celeba/checkpoint030.pth

Options:

  • --gpu: enable GPU
  • --detect_human_face: enable automatic human face detection and cropping using MTCNN provided in facenet-pytorch. This only works on human face images. You will need to manually crop the images for other objects.
  • --render_video: render 3D animations using neural_renderer (GPU is required)
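
For example, to run the demo on a GPU with face detection and video rendering enabled (simply combining the options above with the command shown earlier):

python -m demo.demo --input demo/images/human_face --result demo/results/human_face --checkpoint pretrained/pretrained_celeba/checkpoint030.pth --gpu --detect_human_face --render_video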

Training and Testing

Check the configuration files in experiments/ and run experiments, e.g.:

python run.py --config experiments/train_celeba.yml --gpu 0 --num_workers 4
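
To evaluate a trained model, point run.py at the corresponding test configuration (also referenced in the comments below):

python run.py --config experiments/test_celeba.yml --gpu 0 --num_workers 4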

Citation

@InProceedings{Wu_2020_CVPR,
  author = {Shangzhe Wu and Christian Rupprecht and Andrea Vedaldi},
  title = {Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild},
  booktitle = {CVPR},
  year = {2020}
}
Comments
  • Train a model based on synface dataset and result is not good as paper

    Hi! Thank you very much for your excellent work!

    I used the provided script python run.py --config experiments/train_celeba.yml --gpu 0 --num_workers 4 to train a model for the synface dataset.

    Then I used python run.py --config experiments/test_celeba.yml --gpu 0 --num_workers 4 to test the model.

    Finally, I got 0.0092±0.002 SIDE and 17.77±1.92 MAD, which is not as good as in the paper (SIDE 0.793±0.140 and MAD 16.51±1.56 in Table 2).

    Could there be a problem with my procedure?

    Thank you!

    opened by zhouwy19 6
  • How to train a car

    Hello, thank you very much for making this open source. I have a problem: I only see configurations for faces and cats in your code, but your paper also covers cars, so I want to ask how to train on cars. Thank you very much! @elliottwu

    opened by Weipeilang 4
  • replace neural mesh renderer with pytorch3d

    Hi, wu!

    I'm sorry to disturb you again. Inspired by your repo, I replaced the neural renderer with PyTorch3D. Specifically, I chose MeshRenderer in pytorch3d.

    I first eliminated the inconsistency between the NMR and PyTorch3D coordinate systems by overriding the PerspectiveCameras class in pytorch3d. However, I found that the shape noise is so large that the network cannot learn a reasonable face shape. Note that I trained the model for 100 epochs. The figures below compare NMR and PyTorch3D.

    [comparison renderings: NMR vs. PyTorch3D]

    The RasterizationSettings are shown below:

     # hard rasterization
     raster_settings = RasterizationSettings(
         image_size=self.image_size,
         blur_radius=0.0,
         faces_per_pixel=1,
         bin_size=0,
     )
    

    I noticed you mentioned in #9 that one can add a smoothing loss on the depth map to alleviate the noisy-depth problem. Could you provide the specific form of such a smoothing loss? I really don't have a clue about this.

    Thank you very much!

    opened by YunjieYu 3
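
    For context on the question above, a common edge-aware depth smoothness loss from the monocular-depth literature looks like the following sketch (an assumption on our part; not necessarily the exact loss the authors meant in #9). It penalizes first-order depth gradients, with the penalty down-weighted where the image itself has strong edges:

         import torch

         def depth_smoothness_loss(depth, image):
             # depth: (B, H, W) predicted depth; image: (B, C, H, W) input image
             d_dx = torch.abs(depth[:, :, 1:] - depth[:, :, :-1])
             d_dy = torch.abs(depth[:, 1:, :] - depth[:, :-1, :])
             # image gradients, averaged over channels, act as edge indicators
             i_dx = torch.mean(torch.abs(image[:, :, :, 1:] - image[:, :, :, :-1]), dim=1)
             i_dy = torch.mean(torch.abs(image[:, :, 1:, :] - image[:, :, :-1, :]), dim=1)
             # penalize depth gradients, but less so across strong image edges
             return (d_dx * torch.exp(-i_dx)).mean() + (d_dy * torch.exp(-i_dy)).mean()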
  • Render is too slow in 512×512 resolution

    Hi! Thank you very much for such great work!

    I'm trying to train a network at 512×512 resolution. However, a single forward pass (batch size = 8) takes too long (~60.12 s), and I find that the time is mainly spent rendering the reconstructed depth map (~59.98 s). I want to know what causes the rendering to be so slow.

    Again thank you for your work and code, looking forward to your reply :>.

    opened by YunjieYu 2
  • Forbidden Error for Syncar dataset.

    Hello, thank you for sharing your amazing work. We are students trying to replicate this on the car dataset. Unfortunately, there seem to be some permission issues with downloading the data.

    curl -o syncar.zip "https://www.robots.ox.ac.uk/~vgg/research/unsup3d/data/syncar.zip"

    fails with a 403 (Forbidden) error. Would it be possible to grant access, as the other datasets seem to download fine?

    opened by terrybelinda 2
  • How to test unsup3d on Google Colab?

    Hi, when I tried to test unsup3d on Google Colab, I ran into problems such as:

    /usr/local/bin/python: Error while finding module specification for 'demo.demo' (AttributeError: module 'demo' has no attribute 'path')

    How can I write a Colab notebook that successfully tests unsup3d? For example, something like: https://blog.csdn.net/yrwang_xd/article/details/103150691

    Thanks!

    opened by Salix-y 2
  • Questions about image size

    Hi @elliottwu, sorry to bother you again, but I have two questions about the image size setting:

    1. Will increasing the input image_size improve the reconstruction quality? I trained the unsup3d model on another dataset but did not get satisfactory reconstruction results, so I wonder whether increasing the input image_size would fix the problem.

    2. I tried increasing the input image_size by setting image_size in the data loader to 128 (twice the original image_size=64), but I encountered the following error:

    RuntimeError: The size of tensor a (128) must match the size of tensor b (4224) at non-singleton dimension 0
    

    After checking, I found that the two tensors are canon_normal and canon_light_d.view(-1,1,1,3) in the forward pass; an element-wise multiplication is applied to them, but their first dimensions do not match:

    torch.Size([128, 128, 128, 3])
    torch.Size([4224, 1, 1, 3])
    

    Have you encountered this kind of error, and if so, how did you solve it? Thank you very much; looking forward to your response.

    opened by YokkaBear 2
  • Download Pretrained Models

    Congratulations! I am very interested in your project. When I run the .sh file to download the pretrained models, I always hit a network problem. It would be very kind of you to send all the pretrained models to my e-mail. Best wishes! Have a nice day! Here is my address: [email protected].

    opened by Nancyhhh 2
  • Output form of the model

    Hi, thanks for your work, very impressive!

    I have a question about the output form of the model. To my understanding, the output (reconstruction) and the canonical view are both 2D images, but with depth values that could be used to reconstruct 3D volumes. Is that right? Or is the canonical view (reconstruction image) already a 3D volume?

    Again thank you for your work and code, looking forward to your reply :>.

    opened by VoiceBeer 2
  • What's the difference and relation between "view_after" and "yaw_rotations" when rendering?

    Congratulations on the best paper award! The 3D reconstruction results are really impressive given that only single-view images are used! I'm new to 3D CV tasks. When I run demo.py, I'm confused by "view_after" and "yaw_rotations": since yaw_rotations already rotates the view, why do we need view_after here?

    opened by vegetable09 1
  • Any plan to release the car dataset?

    Hi Elliott,

    Really nice work!

    Do you plan to release the car dataset used in the paper for further research? Also, how did you choose the lighting parameters when rendering the images? Thank you very much.

    opened by bennyguo 1
  • Depth prediction - pose changes

    Hello, I am trying to predict the depth directly from a face image, but the final depth is always predicted in a different camera pose. How can I retain the camera pose and face angle of my initial input image? Thank you.

    opened by cantonioupao 0
  • After a period of training, the result will collapse

    I used the CelebA and WebFace datasets and trained with your code and settings. After the first epoch, the rendered results look reasonable compared with the source images, but at the second epoch the rendered image becomes extremely poor: it seems a large part of the face has been filtered out. [rendered/source example images]

    opened by LuoYi99 1
  • Can this algorithm generate 3d model with different num. of Vertices and faces?

    Can this algorithm generate 3D models with different numbers of vertices and faces, and how do I configure that? As you know, 3D face models are used in different scenarios. Thanks.

    opened by bin1guo 0
  • Training with front-view car images results not good.

    Hi, thanks for the paper; I appreciate the awesome work. We are students trying to re-train the Unsup3D model on front-view car images, but the training results were not good. We trained for 100 epochs and the validation loss at the final epoch was 0.70303. We are sure we must be missing something.

    Could you please advise on tweaking any parameters to reconstruct a good 3D model from front-view car images?

    Thanks

    opened by raagapranitha 1
  • pip install neural_renderer_pytorch

    @elliottwu I'm using a Windows environment with an RTX 3080. When installing neural_renderer_pytorch, the build fails with:

         error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvcc.exe' failed with exit status 2
         ERROR: Command errored out with exit status 1: ... Check the logs for full command output.

    Is the version of the graphics driver too high?

    opened by GitHubmalajava 2