Code for ICCV 2021 paper Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs

Overview

Graph-to-3D

This is the official implementation of the paper Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs | arXiv
Helisa Dhamo*, Fabian Manhardt*, Nassir Navab, Federico Tombari
ICCV 2021

We address the novel problem of fully-learned 3D scene generation and manipulation from scene graphs, in which a user can specify in the nodes or edges of a semantic graph what they wish to see in the 3D scene.

If you find this code useful in your research, please cite

@inproceedings{graph2scene2021,
  title={Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes using Scene Graphs},
  author={Dhamo, Helisa and Manhardt, Fabian and Navab, Nassir and Tombari, Federico},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2021}
}

Setup

We have tested the code on Ubuntu 16.04 with Python 3.7 and PyTorch 1.2.0.

Code

# clone this repository and move there
git clone https://github.com/he-dhamo/graphto3d.git
cd graphto3d
# create a conda environment and install the requirements
conda create --name g2s_env python=3.7 --file requirements.txt 
conda activate g2s_env          # activate virtual environment
# install pytorch and cuda version as tested in our work
conda install pytorch==1.2.0 cudatoolkit=10.0 -c pytorch
# more pip installations
pip install tensorboardx graphviz plyfile open3d==0.9.0.0 open3d-python==0.7.0.0 
# Set python path to current project
export PYTHONPATH="$PWD"

To evaluate shape diversity, you will need to set up the Chamfer distance extension. Download the extension folder from the AtlasNetv2 repo and install it following their instructions:

cd ./extension
python setup.py install
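
For reference, here is a naive NumPy sketch of the symmetric Chamfer distance the extension computes (illustration only; the extension provides a fast CUDA implementation):

import numpy as np

def chamfer_distance(a, b):
    # Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3):
    # mean closest-point squared distance, accumulated in both directions.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()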

To download the checkpoints for our trained models and the AtlasNet weights used to obtain shape features:

cd ./experiments
chmod +x ./download_checkpoints.sh && ./download_checkpoints.sh

Dataset

Download the 3RScan dataset from their official site. You will need to download the following files using their script:

python download.py -o /path/to/3RScan/ --type semseg.v2.json
python download.py -o /path/to/3RScan/ --type labels.instances.annotated.v2.ply

Additionally, download the metadata for 3RScan:

cd ./GT
chmod +x ./download_metadata_3rscan.sh && ./download_metadata_3rscan.sh

Download the 3DSSG data files to the ./GT folder:

chmod +x ./download_3dssg.sh && ./download_3dssg.sh

We use the scene splits with up to 9 objects per scene from the 3DSSG paper. The relationships here are preprocessed to remove the two-sided annotation of spatial relationships, as these can lead to paradoxes in the manipulation task. Finally, you will need our directed aligned 3D bounding boxes introduced on our project page. The following script downloads this data:

chmod +x ./download_postproc_3dssg.sh && ./download_postproc_3dssg.sh
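
To illustrate the one-sided preprocessing idea (a hypothetical sketch, not the actual preprocessing code), a spatial relationship could be dropped whenever its inverse is already kept:

# Hypothetical sketch: keep only one side of symmetric spatial relationships.
INVERSES = {'right of': 'left of', 'behind': 'front of'}  # assumed inverse pairs

def drop_two_sided(triples):
    seen, kept = set(), []
    for subj, obj, pred in triples:  # (subject id, object id, predicate name)
        inv = INVERSES.get(pred)
        if inv and (obj, subj, inv) in seen:
            continue  # the inverse is already present; avoid the two-sided pair
        seen.add((subj, obj, pred))
        kept.append((subj, obj, pred))
    return kept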

Run the transform_ply.py script from this repo to obtain 3RScan scans in the correct alignment:

cd ..
python scripts/transform_ply.py --data_path /path/to/3RScan

Training

To train our main model with shared shape and layout embedding, run:

python scripts/train_vaegan.py --network_type shared --exp ./experiments/shared_model --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual True

To run the variant with separate (disentangled) layout and shape features:

python scripts/train_vaegan.py --network_type dis --exp ./experiments/separate_baseline --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual True

For the 3D-SLN baseline, run:

python scripts/train_vaegan.py --network_type sln --exp ./experiments/sln_baseline --dataset_3RScan ../3RScan_v2/data/ --path2atlas ./experiments/atlasnet/model_70.pth --residual False --with_manipulator False --with_changes False --weight_D_box 0 --with_shape_disc False

One relevant parameter is --with_feats. If set to true, the data loader reads precomputed shape features directly, instead of reading point clouds and feeding them through AtlasNet to obtain the features. If the features are not found on disk, they are generated during the first epoch and then read instead of the point clouds in subsequent epochs. This saves a lot of time during training.
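
The caching pattern looks roughly like this (a sketch with hypothetical file layout and function names, not the exact repository code):

import os
import numpy as np

def get_shape_feature(obj_id, points, encode_with_atlasnet, cache_dir='feats'):
    # Reuse a stored feature when available; otherwise run AtlasNet once
    # and cache the result for the following epochs.
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, '{}.npy'.format(obj_id))
    if os.path.exists(path):
        return np.load(path)             # fast path: skip the encoder
    feat = encode_with_atlasnet(points)  # slow path: encode the point cloud
    np.save(path, feat)
    return feat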

Each training experiment generates an args.json configuration file that can be used to read the right parameters during evaluation.
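
For example, an evaluation script can restore the training configuration like this (a sketch; the exact field names depend on the experiment):

import json
import os

exp_dir = './experiments/shared_model'  # folder passed to --exp during training
with open(os.path.join(exp_dir, 'args.json')) as f:
    train_args = json.load(f)
print(train_args.get('network_type'))   # e.g. 'shared', 'dis' or 'sln'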

Evaluation

To evaluate the models, run:

python scripts/evaluate_vaegan.py --dataset_3RScan ../3RScan_v2/data/ --exp ./experiments/final_checkpoints/shared --with_points False --with_feats True --epoch 100 --path2atlas ./experiments/atlasnet/model_70.pth --evaluate_diversity False

Set --evaluate_diversity to True if you want to compute diversity. This takes a while, so it's disabled by default. To evaluate the 3D-SLN baseline or the variant with separate layout and shape features, simply provide the corresponding experiment folder in --exp.

Acknowledgements

This repository contains code parts that are based on 3D-SLN and AtlasNet. We thank the authors for making their code available.

Comments
  • Empty objects

    Hello,

    I was trying to train the network and the first step is the precomputation of the AtlasNet features.

    I followed the README (and fixed a few lines).

    self.rel_json_file = os.path.join(self.root, '3DSSG_processed_files', '{}.json'.format(splits_fname))
    self.box_json_file = os.path.join(self.root, 'GT 3D boxes 3DSSG', 'obj_boxes_train_refined.json')
    

    But I often get empty objects with zero points, and an error at this line:

    choice2 = np.random.choice(len(obj_pointset), self.npoints - choice.shape[0], replace=True)
    

    How do you resolve this?
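
    A guard along these lines avoids the crash when an instance has no points (a sketch, not the repository's fix, since empty instances may also indicate a data issue):

    import numpy as np

    def sample_points(obj_pointset, npoints):
        # Hypothetical guard: return a zero pointset (or skip the instance)
        # instead of sampling indices from an empty array.
        if len(obj_pointset) == 0:
            return np.zeros((npoints, 3), dtype=np.float32)
        idx = np.random.choice(len(obj_pointset), npoints, replace=True)
        return np.asarray(obj_pointset)[idx]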

    opened by wamiq-reyaz 8
  • 404 not Found

    Hi, when I want to download the checkpoints for the trained models and the AtlasNet weights, I run chmod +x ./download_checkpoints.sh && ./download_checkpoints.sh and "404 not found" appears. How did you solve it?

    opened by SunWeiLin-Lynne 3
  • The number of objects

    Hello, I hope to combine 3DSSG with the oriented bounding boxes you annotated in the paper.

    But the number of objects for the same scan doesn't match.

    For example, 3DSSG has 30 objects but yours has 25 objects. And I don't understand why the ids are not in order (like 35->37->50).

    Could you explain this?

    opened by kimkj38 3
  • 404 Not Found nginx/1.18.0 (Ubuntu)

    Execute chmod +x ./download_metadata_3rscan.sh && ./download_metadata_3rscan.sh, or chmod +x ./download_3dssg.sh && ./download_3dssg.sh, and "404 Not Found" appears.

    opened by 2WangZhen3 3
  • Provided checkpoint incompatible with provided dataset

    I have a small problem with your provided checkpoints.

    I downloaded the 3DSSG dataset from here: https://campar.in.tum.de/public_datasets/3DSSG/3DSSG.zip including the classes.txt which has 528 classes. I downloaded the checkpoints from here: https://drive.google.com/drive/folders/1RbNUpHLQiY1zskd0buAgvl-yFwnD9N8M?usp=sharing

    When I try to evaluate the model using:

    python scripts/evaluate_vaegan.py --dataset_3RScan ../3RScan_v2/data/ --exp ./experiments/final_checkpoints/shared --with_points False --with_feats True --epoch 100 --path2atlas ./experiments/atlasnet/model_70.pth --evaluate_diversity False
    

    I get the following error:

    RuntimeError: Error(s) in loading state_dict for Sg2ScVAEModel:
            size mismatch for obj_embeddings_ec_box.weight: copying a param with shape torch.Size([162, 128]) from checkpoint, the shape in current model is torch.Size([530, 128]).
            size mismatch for obj_embeddings_ec_shape.weight: copying a param with shape torch.Size([162, 128]) from checkpoint, the shape in current model is torch.Size([530, 128]).
            size mismatch for pred_embeddings_ec_box.weight: copying a param with shape torch.Size([27, 256]) from checkpoint, the shape in current model is torch.Size([41, 256]).
            size mismatch for pred_embeddings_ec_shape.weight: copying a param with shape torch.Size([27, 256]) from checkpoint, the shape in current model is torch.Size([41, 256]).
            size mismatch for obj_embeddings_dc_box.weight: copying a param with shape torch.Size([162, 256]) from checkpoint, the shape in current model is torch.Size([530, 256]).
            size mismatch for obj_embeddings_dc_man.weight: copying a param with shape torch.Size([162, 256]) from checkpoint, the shape in current model is torch.Size([530, 256]).
            size mismatch for obj_embeddings_dc_shape.weight: copying a param with shape torch.Size([162, 256]) from checkpoint, the shape in current model is torch.Size([530, 256]).
            size mismatch for pred_embeddings_dc_box.weight: copying a param with shape torch.Size([27, 512]) from checkpoint, the shape in current model is torch.Size([41, 512]).
            size mismatch for pred_embeddings_dc_shape.weight: copying a param with shape torch.Size([27, 512]) from checkpoint, the shape in current model is torch.Size([41, 512]).
            size mismatch for pred_embeddings_dc.weight: copying a param with shape torch.Size([27, 256]) from checkpoint, the shape in current model is torch.Size([41, 256]).
            size mismatch for pred_embeddings_man_dc.weight: copying a param with shape torch.Size([27, 768]) from checkpoint, the shape in current model is torch.Size([41, 768]).
    

    I think this error occurs because self.obj_embeddings_ec_box = nn.Embedding(num_objs + 1, obj_embedding_dim) in Sg2ScVAEModel in VAEGAN_SHARED.py takes the number of classes as the number of input features, and the number of classes is defined by the length of vocab['object_idx_to_name'], which is loaded from classes.txt: 528 classes plus the first row, scene. The same applies to the relationships and self.pred_embeddings_ec_box = nn.Embedding(num_preds, 2 * embedding_dim).

    Seems like you trained with only 161 classes and 27 relationships. Could you provide me with the correct classes.txt and relationships.json files to replicate your results?
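
    For reference, the vocabulary sizes a checkpoint was trained with can be read off the weights directly (a sketch; the path is hypothetical and the dict may be nested differently):

    import torch

    state = torch.load('model.pth', map_location='cpu')  # hypothetical checkpoint path
    state = state.get('state_dict', state)               # unwrap in case it is nested
    print(state['obj_embeddings_ec_box.weight'].shape)   # (162, 128) -> 161 classes + 1
    print(state['pred_embeddings_ec_box.weight'].shape)  # (27, 256)  -> 27 relationships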

    opened by kochsebastian 2
  • point cloud object generation

    Hi, thank you for your amazing work. I trained your main model with shared shape and layout embedding, and the accuracy matches the paper, but when I visualize the generated point cloud objects, the quality is not good. I followed the steps in the README; am I missing any details?

    opened by SharryRay 1
  • How did you create the scene graph splits?

    With your project you provide the relationships_train_clean.json and relationships_validation_clean.json files, which contain the scene graphs. These scene graphs are different from the original scene graphs from 3DSSG, as they are split up into splits which do not cover the entire scene. The splits make sense to me as a way to make the scene graphs less complex for scene generation, but the construction of the splits is not really obvious to me.

    I visualized the scene and scene graph from e61b0e04-bada-2f31-82d6-72831a602ba7 split 6; corresponding objects are highlighted in the same color.

    This scene graph is especially weird to me, because I would have assumed that all the chairs around the dining table were part of this split of the scene graph; instead, objects very far away from the dining table were selected, like the door or the wall. I couldn't find any documentation on how you created these splits, and as far as I know they are not part of the original 3DSSG dataset. So it would be awesome if you could provide some information about this, and maybe even a script showing how you created the splits, which could also be changed to create more local splits.

    opened by kochsebastian 1
  • Error while downloading checkpoints

    Hello,

    When downloading the checkpoints using the bash file download_checkpoints.sh, I get the following error.

    Error:

    ERROR: cannot verify campar.in.tum.de's certificate, issued by ‘CN=DFN-Verein Global Issuing CA,OU=DFN-PKI,O=Verein zur Foerderung eines Deutschen Forschungsnetzes e. V.,C=DE’:
      Unable to locally verify the issuer's authority.
    To connect to campar.in.tum.de insecurely, use `--no-check-certificate'.
    unzip:  cannot find or open final_checkpoints.zip, final_checkpoints.zip.zip or final_checkpoints.zip.ZIP.
    mkdir: atlasnet: File exists
    mv: rename model_70.pth to ./atlasnet/model_70.pth: No such file or directory
    

    Steps to reproduce:

    • conda create --name g2s_env python=3.7 --file requirements.txt
    • conda activate g2s_env # activate virtual environment
    • conda install pytorch==1.2.0 cudatoolkit=10.0 -c pytorch
    • pip install tensorboardx graphviz plyfile open3d==0.9.0.0 open3d-python==0.7.0.0
    • export PYTHONPATH="$PWD"

    Then download the checkpoints using the following command to get the error:

    chmod +x ./download_checkpoints.sh && ./download_checkpoints.sh

    Environment:

    • macOS 12.0.1
    • Python 3.7.13
    opened by zohairhadi 1
  • Questions about the paper

    The title of the paper says end-to-end.

    Is it true that you train the DeepSDF model together with the GCNs?

    If so, where can I find the architecture of the shape encoder? What is the input modality for the shape: a point cloud, an SDF, a voxel grid? Is that encoder pretrained? Otherwise, training with this loss makes no sense.

    I can't figure out where in the paper this is discussed.

    Another thing that is unclear to me is the labelling of the latent graph: is the latent generated per class and per node? I am guessing that only the object nodes get a latent, while the edge/relationship nodes are processed by the GCN to learn their meaning, have no KL term, and are just propagated as they are? Eqs. 1, 2 and 3 are a bit unclear to me.

    Thanks!

    opened by wamiq-reyaz 1
  • Question about Evaluation

    Hello, while analyzing your code and paper, I have a question about them (especially in evaluation).

    You mentioned the evaluation metrics for boxes and shapes in the paper.

    I understood that in your evaluate_vaegan.py code the metrics for boxes are printed as 'acc & ... & ...', but I can't find any metrics for shapes, which you mention in your paper as the 'cycle-consistency experiment'. (There aren't any recall values printed in the evaluation results.)

    Is that experiment run with different code?

    Thank you!

    opened by JJukE 0
  • Q: comparison of edge features with bounding boxes data.

    Some possible errors in the edge annotations have been detected.

    For example, in the scene graph ID e61b0e04-bada-2f31-82d6-72831a602ba7 the objects 7 and 42 have the following relationship : [7, 42, 8, 'bigger than'].

    However, based on the volume of their bounding boxes, the relationship seems to be incorrect.

    "7": {"param7": [0.7991325537319209, 0.17943594268441287, 0.5799999926239252, ...]
    "42": {"param7": [0.4229016144056281, 0.43046053349812485, 0.7599999904632568, ...],
    
    vol_7  = 0.08316799874
    vol_42 = 0.13835226372
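
    Those volumes follow directly from the first three param7 entries, assuming these are the box extents:

    import numpy as np

    # Assuming the first three param7 values are the box extents.
    vol_7 = np.prod([0.7991325537319209, 0.17943594268441287, 0.5799999926239252])
    vol_42 = np.prod([0.4229016144056281, 0.43046053349812485, 0.7599999904632568])
    print(vol_7, vol_42)  # ~0.0832 vs ~0.1384, so object 42 is the bigger one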
    

    For a screenshot of the scene and the full list of possible conflicts please refer to this link : https://drive.google.com/drive/folders/1B_1hb8wH3TleII2iKEy_FOv5t0eXIHvG?usp=sharing

    Pietro.

    opened by pbonazzi 0
  • Download links

    Hi,

    The download links in download_metadata_3rscan.sh are not working.

    If you can let us know what the different files are, we can also create the text files from the json files.

    opened by thearkamitra 1
  • Question for visualize_scene.py

    Hello, I tried to run evaluate_vaegan.py to visualize the 3D scenes in "meshes", and there is a bug. What do I need to modify in the args or the code? Looking forward to your reply!

    opened by Mr-Faceless 2
  • Add == instead of =

    Hey, I was getting the following error on Colab with Python 3.7.12:

    ERROR: Invalid requirement: 'cloudpickle=0.5.3' (from line 3 of requirements.txt)
    Hint: = is not a valid operator. Did you mean == ?
    

    Changing the = to == for each requirement might fix it.
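
    A one-off rewrite along these lines would do it (a sketch, assuming plain name=version lines in requirements.txt):

    import re
    from pathlib import Path

    path = Path('requirements.txt')
    # Replace a lone '=' with '==' while leaving '==', '>=', '<=' and '!=' intact.
    path.write_text(re.sub(r'(?<![=<>!])=(?!=)', '==', path.read_text()))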

    opened by SaadBazaz 0