Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Haitao Yang

Last update: Dec 30, 2022


0. Environment

Environment: Python 3.6 and CUDA 10.0 on Ubuntu 18.04

PyTorch 1.4.0
TensorFlow 1.14.0 (for TensorBoard)

1. Dataset

├──dataset_3dfront/
    ├──data
        ├── bedroom
            ├── 0_abs.npy
            ├── 0_rel.pkl
            ├── ...
        ├── living
            ├── 0_abs.npy
            ├── 0_rel.pkl
            ├── ...
        ├── train_bedroom.txt
        ├── train_living.txt
        ├── val_bedroom.txt
        └── val_living.txt

See 3D-FRONT Dataset for dataset generation.
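
As a quick sanity check on the layout above, the following sketch lists the scene ids for which both an `_abs.npy` and a `_rel.pkl` file exist. The helper name `list_scene_ids` is hypothetical and not part of the repository; it assumes only the file naming shown in the tree and does not read the file contents.

```python
from pathlib import Path


def list_scene_ids(data_root, room_type):
    """Return scene ids that have both <id>_abs.npy and <id>_rel.pkl.

    Hypothetical helper: only the file naming from the tree above is assumed.
    """
    room_dir = Path(data_root) / room_type
    abs_ids = {p.name[: -len("_abs.npy")] for p in room_dir.glob("*_abs.npy")}
    rel_ids = {p.name[: -len("_rel.pkl")] for p in room_dir.glob("*_rel.pkl")}
    # Keep only scenes for which both attribute files are present,
    # ordered so that "2" comes before "10".
    return sorted(abs_ids & rel_ids, key=lambda s: (len(s), s))
```

For example, `list_scene_ids("./dataset_3dfront/data", "bedroom")` would return the bedroom scenes that are complete on disk.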

2. VAE

2.1 Generate scenes from random noise

Download the pretrained model from https://drive.google.com/file/d/1VKNlEdUj1RBUOjBaBxE5xQvfsZodVjam/view?usp=sharing and place it under log/ as follows:

Sync2Gen
└── log
    └── 3dfront
        ├── bedroom
        │   └── vaef_lr0001_w00001_B64
        │       ├── checkpoint_eval799.tar
        │       └── pairs
        └── living
            └── vaef_lr0001_w00001_B64
                ├── checkpoint_eval799.tar
                └── pairs

type='bedroom'; # or living
CUDA_VISIBLE_DEVICES=0 python ./test_sparse.py  --type $type  --log_dir ./log/3dfront/$type/vaef_lr0001_w00001_B64 --model_dict=model_scene_forward --max_parts=80 --num_class=20 --num_each_class=4 --batch_size=32 --variational --latent_dim 20 --abs_dim 16  --weight_kld 0.0001  --learning_rate 0.001 --use_dumped_pairs --dump_results --gen_from_noise --num_gen_from_noise 100

The predictions are dumped in ./dump/$type/vaef_lr0001_w00001_B64
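
To illustrate what generating from noise usually means for a VAE, the sketch below draws latent codes from a standard normal prior. It is an assumption that `--gen_from_noise` follows this recipe; the repository's own sampling code may differ. The function `sample_latent_codes` is hypothetical, and its defaults simply mirror `--num_gen_from_noise 100` and `--latent_dim 20` from the command above.

```python
import random


def sample_latent_codes(num_samples=100, latent_dim=20, seed=0):
    """Draw latent codes from a standard normal distribution.

    Hypothetical illustration, not the repository's sampling code.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
            for _ in range(num_samples)]


codes = sample_latent_codes()
print(len(codes), len(codes[0]))  # 100 20
```

In the actual model, each such code would be passed to the decoder to produce one scene layout.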

2.2 Training

To train the network:

type='bedroom'; # or living
CUDA_VISIBLE_DEVICES=0 python ./train_sparse.py --data_path ./dataset_3dfront/data  --type $type  --log_dir ./log/3dfront/$type/vaef_lr0001_w00001_B64  --model_dict=model_scene_forward --max_parts=80 --num_class=20 --num_each_class=4 --batch_size=64 --variational --latent_dim 20 --abs_dim 16  --weight_kld 0.0001  --learning_rate 0.001
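
The `--weight_kld 0.0001` flag suggests a KL term that is scaled before being added to the reconstruction loss. The sketch below shows the standard VAE form of that objective for a diagonal Gaussian posterior. It is an assumption that the training script uses this exact form; `gaussian_kl` and `vae_loss` are hypothetical names used for illustration only.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ) for one diagonal Gaussian."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def vae_loss(recon_loss, mu, logvar, weight_kld=0.0001):
    """Reconstruction loss plus a KL term scaled by weight_kld."""
    return recon_loss + weight_kld * gaussian_kl(mu, logvar)
```

With such a small weight, the KL term acts as a light regularizer on the latent space while the reconstruction loss dominates.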

3. Bayesian optimization

cd optimization

3.1 Prior generation

See Prior generation.

3.2 Optimization

type=bedroom # or living
bash opt.sh $type vaef_lr0001_w00001_B64  EXP_NAME

We use PyTorch-LBFGS for optimization.

3.3 Visualization

There is a simple visualization tool:

type=bedroom # or living
bash vis.sh $type vaef_lr0001_w00001_B64 EXP_NAME

The visualizations are saved in ./vis. {i:04d}_2d_pred.png and {i:04d}_3d_pred.png show the initial prediction from the VAE; {i:04d}_2d_sync.png and {i:04d}_3d_sync.png show the optimized layout after synchronization.
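
To compare the prediction and the synchronized result for one scene, the file names can be built from the pattern described above. This is a minimal sketch: `vis_pairs` is a hypothetical helper, and it assumes the images sit directly in the output directory.

```python
from pathlib import Path


def vis_pairs(index, vis_dir="./vis"):
    """Return {view: (pred_path, sync_path)} for one scene index.

    Hypothetical helper built only from the file naming described above.
    """
    vis_dir = Path(vis_dir)
    return {
        view: (vis_dir / f"{index:04d}_{view}_pred.png",
               vis_dir / f"{index:04d}_{view}_sync.png")
        for view in ("2d", "3d")
    }
```

For example, `vis_pairs(7)["2d"]` gives the paths of `0007_2d_pred.png` and `0007_2d_sync.png`.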

Acknowledgements

The repo is built based on:

We thank the authors for their great work.

Contact

If you have any questions, you can contact Haitao Yang (yanghtr [AT] outlook [DOT] com).


Comments

How to replace Box with Shape?

Hi, very promising results and interesting work! When I run your code, I find that the released code only produces box layouts for a scene. In your paper (Sec. 5.1), the method uses a pre-trained model [49] to get the latent code for each shape and retrieves a new shape with three PCA components at test time. However, [49] is a GAN-based generative model, so how do you obtain the latent code for each object's shape? It would be great if you could provide more details or code on how to replace the generated boxes with shapes.

Thanks!

opened by tommaoer 1

Maybe the scene model doesn't concatenate the absolute attributes and the relative attributes?

def forward(self, x_abs, x_rel=None, latent_code=None):
    if latent_code is not None:
        pred_abs = self.decoder_abs(latent_code)
        pred_rel = self.decoder_rel(pred_abs)
        return pred_abs, pred_rel

    z, kld = self.encoder_abs(x_abs)  # (B, latent_dim)
    kldiv_loss = -kld.sum() / x_abs.shape[0]

    pred_abs = self.decoder_abs(z)
    pred_rel = self.decoder_rel(pred_abs.detach())

    return pred_abs, pred_rel, kldiv_loss

opened by JackW987 0

Obtain the latent code for each object's shape

Hello! When I run your code, I find that the released code only produces box layouts for a scene. In your paper (Sec. 5.1), the method uses a pre-trained model [49] to get the latent code for each shape and retrieves a new shape with three PCA components at test time. However, [49] is a GAN-based generative model, so how do you obtain the latent code for each object's shape? Could you provide more details or code on how to replace the generated boxes with shapes? Thanks a lot!

opened by YeolYao 0
