Implementation of the method proposed in the paper "Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation"

Overview

Neural Descriptor Fields (NDF)

PyTorch implementation for training continuous 3D neural fields that represent dense correspondence across objects, and for using these descriptor fields to mimic demonstrations of a pick-and-place task on a robotic system.
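
As a rough mental model, an NDF is a function that maps a 3D query coordinate, conditioned on an object point cloud, to a descriptor vector. The snippet below is an illustrative sketch, not the repo's exact API: the dict-style forward and the 'point_cloud'/'coords'/'features' keys follow the pattern in src/ndf_robot/model/vnn_occupancy_net_pointnet_dgcnn.py but should be treated as assumptions.

    import torch

    def ndf_descriptors(model, query_points, object_pcd):
        """Illustrative sketch of the descriptor-field interface.

        query_points: (B, M, 3) coordinates at which to evaluate the field
        object_pcd:   (B, N, 3) point cloud the field is conditioned on
        Mean-centering both inputs is how the paper obtains translation
        invariance; rotation equivariance comes from the Vector Neuron layers.
        """
        center = object_pcd.mean(dim=1, keepdim=True)  # (B, 1, 3)
        model_input = {
            'point_cloud': object_pcd - center,
            'coords': query_points - center,
        }
        # Assumed output dict: 'occ' (occupancy) and 'features' (descriptors).
        return model(model_input)['features']          # (B, M, D)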



This is the reference implementation for our paper:

Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation


PDF | Video

Anthony Simeonov*, Yilun Du*, Andrea Tagliasacchi, Joshua B. Tenenbaum, Alberto Rodriguez, Pulkit Agrawal**, Vincent Sitzmann** (*Equal contribution, order determined by coin flip. **Equal advising)


Google Colab

If you want a quickstart demo of NDF without installing anything locally, we have written a Colab notebook. It runs the same demo as the Quickstart Demo section below: a local coordinate frame near one object is sampled, and the corresponding local frame near a new object (with a different shape and pose) is recovered via our energy optimization procedure.


Setup

Clone this repo

git clone --recursive https://github.com/anthonysimeonov/ndf_robot.git
cd ndf_robot

Install dependencies (using a virtual environment is highly recommended):

pip install -e .

Set up additional tools (Franka Panda inverse kinematics -- unnecessary if you are not using the simulated robot for evaluation):

cd pybullet-planning/pybullet_tools/ikfast/franka_panda
python setup.py

Set up environment variables (this script must be sourced in each new terminal where code from this repository is run):

source ndf_env.sh

Quickstart Demo

Download pretrained weights

./scripts/download_demo_weights.sh

Download data assets

./scripts/download_demo_data.sh

Run example script

cd src/ndf_robot/eval
python ndf_demo.py

The NDFAlignmentCheck class in src/ndf_robot/eval/ndf_alignment.py contains a minimal implementation of our SE(3)-pose energy optimization procedure; this is what the Quickstart Demo above uses. For a similar implementation that is integrated with our pick-and-place-from-demonstrations pipeline, see src/ndf_robot/opt/optimizer.py.
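
For intuition, the following is a minimal, self-contained sketch of that energy optimization, assuming a descriptor_fn like the one sketched in the Overview: record the descriptors of query points near the demo object, then optimize a rigid transform so the transformed points produce matching descriptors on the new object. The actual optimizer.py additionally restarts from many random poses and uses other details omitted here.

    import torch

    def skew(k):
        """3-vector -> 3x3 skew-symmetric matrix (differentiable)."""
        K = torch.zeros(3, 3, dtype=k.dtype)
        K[0, 1], K[0, 2] = -k[2], k[1]
        K[1, 0], K[1, 2] = k[2], -k[0]
        K[2, 0], K[2, 1] = -k[1], k[0]
        return K

    def axis_angle_to_matrix(w):
        """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix."""
        theta = w.norm() + 1e-8
        K = skew(w / theta)
        return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

    def recover_pose(descriptor_fn, query_pts, target_desc, iters=500, lr=1e-2):
        """Minimize mean || f(R x + t) - z_target || over R, t by gradient descent.

        descriptor_fn: maps (M, 3) points to (M, D) descriptors, conditioned
                       on the NEW object's point cloud.
        query_pts:     (M, 3) points sampled near the demo object.
        target_desc:   (M, D) descriptors of those points on the demo object.
        """
        # Small random init avoids the axis-angle singularity at zero; the
        # repo's optimizer goes further and restarts from many random poses.
        w = (0.01 * torch.randn(3)).requires_grad_()
        t = torch.zeros(3, requires_grad=True)
        opt = torch.optim.Adam([w, t], lr=lr)
        for _ in range(iters):
            opt.zero_grad()
            R = axis_angle_to_matrix(w)
            loss = (descriptor_fn(query_pts @ R.T + t) - target_desc).norm(dim=-1).mean()
            loss.backward()
            opt.step()
        return axis_angle_to_matrix(w).detach(), t.detach()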

Training

Download all data assets

If you want the full dataset (~150GB for 3 object classes):

./scripts/download_training_data.sh 

If you want just the mug dataset (~50 GB -- data for the other object classes can be downloaded with the corresponding scripts):

./scripts/download_mug_training_data.sh 

If you want to generate your own dataset, see the Data Generation section below.

Run training

cd src/ndf_robot/training
python train_vnn_occupancy_net.py --obj_class all --experiment_name ndf_training_exp

More information on training can be found here.
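
Conceptually, training follows the standard occupancy-network recipe: given a partial point cloud and a batch of query coordinates with ground-truth inside/outside labels, the network is trained with binary cross-entropy (the descriptors used at test time are built from the activations of this decoder). Below is a hedged sketch of one loss computation, with the dict keys assumed from the repo's model code rather than taken as its exact API.

    import torch.nn.functional as F

    def occupancy_loss(model, point_cloud, coords, gt_occ):
        """Sketch of the per-batch training objective.

        point_cloud: (B, N, 3) partial point cloud observation
        coords:      (B, M, 3) query coordinates
        gt_occ:      (B, M)    ground-truth occupancy labels in {0, 1}
        """
        out = model({'point_cloud': point_cloud, 'coords': coords})
        # If the model emits logits rather than probabilities, use
        # F.binary_cross_entropy_with_logits instead.
        return F.binary_cross_entropy(out['occ'], gt_occ)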

Evaluation with simulated robot

Make sure you have set up the additional inverse kinematics tools (see the Setup section).

Download all the object data assets

./scripts/download_obj_data.sh

Download pretrained weights

./scripts/download_demo_weights.sh

Download demonstrations

./scripts/download_demo_demonstrations.sh

Run evaluation

If you are running this command on a remote machine, be sure to remove the --pybullet_viz flag!

cd src/ndf_robot/eval
CUDA_VISIBLE_DEVICES=0 python evaluate_ndf.py \
        --demo_exp grasp_rim_hang_handle_gaussian_precise_w_shelf \
        --object_class mug \
        --opt_iterations 500 \
        --only_test_ids \
        --rand_mesh_scale \
        --model_path multi_category_weights \
        --save_vis_per_model \
        --config eval_mug_gen \
        --exp test_mug_eval \
        --pybullet_viz

More information on experimental evaluation can be found here.

Data Generation

Download all the object data assets

./scripts/download_obj_data.sh

Run data generation

cd src/ndf_robot/data_gen
python shapenet_pcd_gen.py \
    --total_samples 100 \
    --object_class mug \
    --save_dir test_mug \
    --rand_scale \
    --num_workers 2

More information on dataset generation can be found here.
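
The script above produces point clouds. If you also need ground-truth occupancy labels (see the related question in the Comments below), the standard occupancy-network recipe is to sample query points around the watertight mesh and test containment. The following is a sketch using trimesh; the repo's own labeling procedure may differ in its sampling details.

    import numpy as np
    import trimesh

    def occupancy_labels(mesh_path, num_points=100000, padding=0.1):
        """Sample query points around a mesh, label inside (1) / outside (0)."""
        mesh = trimesh.load(mesh_path, force='mesh')
        lo, hi = mesh.bounds                     # axis-aligned bounding box
        span = hi - lo
        pts = np.random.uniform(lo - padding * span, hi + padding * span,
                                size=(num_points, 3))
        # contains() does a ray-based inside test; the mesh must be watertight.
        occ = mesh.contains(pts).astype(np.float32)
        return pts, occ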

Collect new demonstrations with teleoperated robot in PyBullet

Make sure you have downloaded all the object data assets (see the Data Generation section).

Run teleoperation pipeline

cd src/ndf_robot/demonstrations
python label_demos.py --exp test_bottle --object_class bottle --with_shelf

More information on collecting robot demonstrations can be found here.

Citing

If you find our paper or this code useful in your work, please cite our paper:

@article{simeonovdu2021ndf,
  title={Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation},
  author={Simeonov, Anthony and Du, Yilun and Tagliasacchi, Andrea and Tenenbaum, Joshua B. and Rodriguez, Alberto and Agrawal, Pulkit and Sitzmann, Vincent},
  journal={arXiv preprint arXiv:2112.05124},
  year={2021}
}

Acknowledgements

Parts of this code were built upon the implementations found in the occupancy networks repo and the vector neurons repo. Check out their projects as well!

Comments
  • Generating dataset on new categories

    Hi there!

    NDF works impressively well on mugs, bowls, and bottles, so I'm curious how it would perform on categories with more complex shapes, such as airplanes or chairs.

    According to your docs, what I need for training an NDF is generated point clouds and ground-truth occupancy values. I've successfully produced the point clouds via https://github.com/anthonysimeonov/ndf_robot/blob/master/src/ndf_robot/data_gen/shapenet_pcd_gen.py. However, it seems that the code to produce ground-truth occupancy values is not yet available. Could you provide that code, or describe the procedure used to produce the occupancy values?

    Thanks a lot!!!

    opened by Steven-xzr 4
  • Shelves intersect with bowls and bottles in the demo data

    I visualized the demo data and found that the shelves intersect with the bowls and bottles. Are they supposed to be like this?

    Is the demonstration in the teaser available? I think the rack in the teaser is more complicated than the rack in the simulation demos.

    opened by zzilch 2
  • Question about the paper: the translation equivariance coupled with rotation equivariance

    Hi, thanks for sharing this great and interesting work!

    I'm a bit curious how partial the input observed point cloud is. As I understand from the paper, translation equivariance is achieved by subtracting the center of mass, but this really depends on how complete the point cloud is. If it is too partial, the center of mass will shift far from the actual object center. Since the translation and rotation equivariance are always coupled, from the vector-neuron perspective the network is effectively learning the representation with center-shift augmentation, which might lead to rotation-equivariance error.

    Thanks!

    opened by ray8828 2
  • Reconstruction performance of the pretrained weights.

    I estimated the occupancy of the object point cloud in the demo data and got very small values.

    I also estimated with the full object points and with dense grid coordinates (instead of the 1500 n_pts used above), and the results were weird too. I just forwarded the model with the centered object points. Did I miss any steps? I am using multi_category_weights.pth, and I am curious about the reconstruction performance of the pretrained nets.

    opened by zzilch 1
  • Where is "pybullet-planning/pybullet_tools/ikfast/franka_panda"?

    I did not find "pybullet-planning/pybullet_tools/ikfast/franka_panda". Where is it? Thank you!

    opened by wuguangbin1230 1
  • For real-world experiments, what dataset are the networks trained on?

    The paper mentions training on a PyBullet-rendered dataset for the simulation experiments. But what about the real world? It seems to require a non-trivial amount of training data (100,000 objects for the simulation experiments). How was such a dataset collected in the real world? Or is it sim-to-real (and is there any quantitative analysis of the influence of the sim-to-real gap)?

    opened by Guptajakala 1
  • What is T in DecoderInner

    Hello @anthonysimeonov,

    I am sorry to disturb you, but at line 241 of src/ndf_robot/model/vnn_occupancy_net_pointnet_dgcnn.py I do not understand what the T dimension of p corresponds to. To my understanding, p is a batch of 3D points, so we have B = batch size, T = ?, D = 3. Thank you, Julien.

    opened by Julien-Gustin 0
  • Using NDF for assembly task

    Hi @anthonysimeonov , Congrats for your work, I really appreciate the idea.

    I was wondering whether the descriptors could be used for a different but related task. I am working on 3D assembly (see the breaking-bad dataset for a visual explanation), which means trying to assemble two (or more) broken parts of an object. The idea behind using NDF is that grasping is similar to assembly, because we have two complementary parts (the robot's grasp, or the broken part of the object). It is also equivariant to rotation, which is great for assembly (if you rotate all the pieces, the solution is still valid).

    So I was playing around with the code to do some experiments. I can get the latent vector and forward it to get the descriptors, but I am unsure how exactly to use them. Instead of sampling one random 'batch' (many points close to each other outside the surface of the mesh), I am sampling around the whole object and creating the descriptors, and now I want to match the descriptors from one broken part with the descriptors from the other broken part of the object.

    Intuitively, I would expect the descriptors to be complementary (though I am not sure how to mathematically define complementary in this case), but I see in Section II (Method) of the paper that Equation 11 says minimizing the difference between descriptors is the way to get the transformation. Trying to find correspondences in a "standard" way (I was using the example from TEASER++ registration) did not work.

    In order to understand this better, I was trying to investigate the concept of energy landscapes, which looks very promising. Can you point me to some code (even part of this repo) to look at to get a better understanding of it?

    So, a couple of questions:

    • If I have a descriptor on a point cloud (first broken part) and on a second point cloud (second broken part, same object), and they belong to the same point, should the descriptors be the same or complementary?
    • How did you manage to create the visualizations of the energy landscapes? Where can I look to create one of my own?
    • Do you think there is a way to use NDF for assembly?

    Thanks a lot in advance

    opened by freerafiki 0
  • Reproduce experiment results on DON as shown in the paper

    Hi, thanks for releasing the source code of this excellent work!

    I am really curious why your baseline of 2D correspondences from DON performs so poorly on the placing task, as described in your paper.

    Do you have any plan to release your baseline code with DON?

    Thanks a million!

    opened by SgtVincent 0
  • Obtaining a point cloud for the subject object

    Hi!

    In simulation you used PyBullet's functionality to acquire segmentation; how did you acquire it on the real hardware? Please let me know how to acquire only the point cloud of the target object with four cameras on a real robot. It would be great if you could publish your method or code. Best regards.

    opened by kirby516 2
  • Sampling query points

    Hello!

    I want to use NDF, but instead of using the demonstrations to sample query points, I want to do it by manually specifying a point on the object. How should I approach this? Which file do I need to look at? Any guidance will be appreciated.

    opened by kaushikbalasundar 0