Code release for BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images

Last update: May 18, 2022

Overview

BlockGAN

BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
Thu Nguyen-Phuoc, Christian Richardt, Long Mai, Yong-Liang Yang, Niloy Mitra

Dataset

Please contact Thu Nguyen-Phuoc for datasets.

Training

To train BlockGAN, run one of the following commands, depending on the config file and dataset:

python main.py ./config_synthetic.json --dataset Chair --input_fname_pattern ".png" 

python main.py ./config_real.json --dataset Car --input_fname_pattern ".jpg"

Help with config.json

image_path:
			Full path to the dataset directory.
gpu:
			Index number of the GPU to use. Default is 0.
batch_size:
			Batch size. Default is 32.
max_epochs:
			Number of epochs to train. Default is 50.
epoch_step:
			Number of epochs to train before starting to decrease the learning rate. Default is 25.
z_dim:
			Dimension of the noise vector. Default is 90.
z_dim2:
			Dimension of the noise vector. Default is 30.
d_eta:
			Learning rate of the discriminator. Default is 0.0001
g_eta:
			Learning rate of the generator. Default is 0.0001
reduce_eta:
			Reduce learning rate during training. Default is False
D_update:
			Number of updates for the Discriminator for every training step. Default is 1.
G_update:
			Number of updates for the Generator for every training step. Default is 2.
beta1:
			Beta 1 for the Adam optimiser. Default is 0.5
beta2:
			Beta 2 for the Adam optimiser. Default is 0.999
discriminator:
			Name of the discriminator to use. 
generator:
			Name of the generator to use. 
view_func:
			Name of the view sampling function to use.
skew_func:
			Name of the perspective skew function to use.
train_func:
			Name of the train function to use.
build_func:
			Name of the build function to use.
style_disc:
			Use the style discriminator. Useful when training on images at resolution 128.
sample_z:
			Distribution to sample the noise vector from. Default is "uniform".
add_D_noise:
			Add noise to the input of the discriminator. Default is "false".
DStyle_lambda:
			Lambda for the style discriminator loss. Default is 1.0
ele_low:
			Default is 70.
ele_high:
			Default is 110.
azi_low:
			Default is 0.
azi_high:
			Default is 360.
scale_low:
			Default is 1.0
scale_high:
			Default is 1.0
x_low:
			Default is 0.
x_high:
			Default is 0.
y_low:
			Default is 0.
y_high:
			Default is 0.
z_low:
			Default is 0.
z_high:
			Default is 0.
with_translation:
			To use translation in 3D transformation. Default is "true".
with_scale:
			To use scaling in 3D transformation. Default is "true".
focal_length:
			Camera parameter. Default is 35.
sensor_size:
			Camera parameter. Default is 32.
camera_dist:
			Camera distance. Default is 11.
new_size:
			Voxel grid size. Default is 16.	
size:
			Voxel grid size. Default is 16.	
output_dir: 
			Full path to the output directory.
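
As a rough illustration of how the parameters above fit together, the following sketch merges a user config with the documented defaults and checks that each `*_low`/`*_high` pair is ordered. It is not the repository's loader: `DOCUMENTED_DEFAULTS` and `load_config` are hypothetical names, the example paths are placeholders, and treating the `*_low`/`*_high` keys as range bounds is an assumption. Defaults that the list writes in quotes are kept as strings; check the shipped config files for the exact representation.

```python
import json

# Defaults copied from the parameter list above. Keys without a documented
# default (image_path, output_dir, discriminator, generator, ...) are omitted
# and must be supplied by the user's config file.
DOCUMENTED_DEFAULTS = {
    "gpu": 0,
    "batch_size": 32,
    "max_epochs": 50,
    "epoch_step": 25,
    "z_dim": 90,
    "z_dim2": 30,
    "d_eta": 0.0001,
    "g_eta": 0.0001,
    "reduce_eta": False,
    "D_update": 1,
    "G_update": 2,
    "beta1": 0.5,
    "beta2": 0.999,
    "sample_z": "uniform",
    "add_D_noise": "false",       # quoted in the list, so kept as a string
    "DStyle_lambda": 1.0,
    "ele_low": 70, "ele_high": 110,
    "azi_low": 0, "azi_high": 360,
    "scale_low": 1.0, "scale_high": 1.0,
    "x_low": 0, "x_high": 0,
    "y_low": 0, "y_high": 0,
    "z_low": 0, "z_high": 0,
    "with_translation": "true",   # quoted in the list, so kept as a string
    "with_scale": "true",
    "focal_length": 35,
    "sensor_size": 32,
    "camera_dist": 11,
    "new_size": 16,
    "size": 16,
}


def load_config(text):
    """Parse JSON text and fill in documented defaults for missing keys."""
    user = json.loads(text)
    cfg = {**DOCUMENTED_DEFAULTS, **user}  # user values win over defaults
    # Assumption: each *_low/*_high pair bounds a range, so low must not exceed high.
    for prefix in ("ele", "azi", "scale", "x", "y", "z"):
        low, high = cfg[prefix + "_low"], cfg[prefix + "_high"]
        if low > high:
            raise ValueError(f"{prefix}_low ({low}) exceeds {prefix}_high ({high})")
    return cfg


# Hypothetical config: only the paths and one override are given.
example = '{"image_path": "/data/cars", "output_dir": "/tmp/out", "batch_size": 16}'
cfg = load_config(example)
print(cfg["batch_size"], cfg["max_epochs"])
```

Merging in this order keeps the config file short: only values that differ from the documented defaults need to be written down.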

Citation

If you use this code for your research, please cite our paper:

@inproceedings{BlockGAN2020,
  title = {BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images},
  author = {Nguyen-Phuoc, Thu and Richardt, Christian and Mai, Long and Yang, Yong-Liang and Mitra, Niloy},
  booktitle = {Advances in Neural Information Processing Systems 33},
  month = {Nov},
  year = {2020}
}

Comments

Issue report about the learning rate in AdamOptimizer

The "reduce_eta" setting in config_real.json is "false", but I found that even when I set it to "true", the linear decay of the learning rate did not take effect.

The likely reason: in model_BlockGAN.py, lines 253 and 254, cfg['d_eta'] and cfg['g_eta'] are passed to AdamOptimizer as the learning rate. However, the linear decay is only applied to self.d_lr_in and self.g_lr_in (lines 360 and 361). So self.d_lr_in and self.g_lr_in should be passed to AdamOptimizer instead of cfg['d_eta'] and cfg['g_eta'].

opened by JingWang-IRC
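
The effect described in this report can be reproduced in plain Python, without TensorFlow. The sketch below is not the repository's code: `decayed_lr` and `ToyOptimizer` are hypothetical names, and the schedule (constant until `epoch_step`, then linear down to zero at `max_epochs`) is an assumption about what "linear decay" means here. It contrasts an optimizer built with a constant rate against one that reads the fed value at every step, which is the role `self.d_lr_in` would play.

```python
def decayed_lr(base_lr, epoch, epoch_step, max_epochs, reduce_eta=True):
    """Assumed schedule: constant until epoch_step, then linear to 0 at max_epochs."""
    if not reduce_eta or epoch < epoch_step or max_epochs <= epoch_step:
        return base_lr
    remaining = max(max_epochs - epoch, 0)
    return base_lr * remaining / (max_epochs - epoch_step)


class ToyOptimizer:
    """Stand-in for an optimizer; lr is either a float or a zero-argument callable."""

    def __init__(self, lr):
        self._lr = lr

    def current_lr(self):
        # A callable is re-evaluated at every step; a float never changes.
        return self._lr() if callable(self._lr) else self._lr


# reduce_eta is a Python bool here for simplicity; the issue quotes it as "true".
cfg = {"d_eta": 0.0001, "epoch_step": 25, "max_epochs": 50, "reduce_eta": True}
state = {"d_lr": cfg["d_eta"]}  # plays the role of the value fed to self.d_lr_in

frozen = ToyOptimizer(cfg["d_eta"])        # constant baked in: the decay is ignored
fed = ToyOptimizer(lambda: state["d_lr"])  # reads the fed value at every step

for epoch in range(cfg["max_epochs"]):
    state["d_lr"] = decayed_lr(
        cfg["d_eta"], epoch, cfg["epoch_step"], cfg["max_epochs"], cfg["reduce_eta"]
    )
    if epoch in (0, 30, 49):
        print(epoch, frozen.current_lr(), fed.current_lr())
```

In this toy setup the `frozen` optimizer keeps the initial rate for the whole run, while `fed` follows the schedule, which matches the behaviour the report describes.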
