Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations
Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yuke Zhu
Introduction
GIGA (Grasp detection via Implicit Geometry and Affordance) is a network that jointly detects 6-DoF grasp poses and reconstructs the 3D scene. GIGA takes advantage of deep implicit functions, a continuous and memory-efficient representation, to enable differentiable training of both tasks. GIGA takes as input a Truncated Signed Distance Function (TSDF) representation of the scene and predicts local implicit functions for grasp affordance and 3D occupancy. By querying the affordance implicit functions with grasp center candidates, we obtain the grasp quality, grasp orientation, and gripper width at these centers. GIGA is trained on a synthetic grasping dataset generated with physics simulation.
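To make the querying mechanism concrete, the sketch below shows the general local-implicit-function pattern in PyTorch: encode the TSDF grid into a feature volume, trilinearly interpolate features at continuous grasp-center queries, and decode per-point grasp quality, orientation, and gripper width with small MLP heads. The module, layer sizes, and heads are illustrative only and do not reproduce GIGA's actual architecture.

```python
# Illustrative sketch of querying local implicit functions at continuous grasp
# centers (toy layer sizes and heads; NOT GIGA's actual architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyImplicitGrasp(nn.Module):
    def __init__(self, feat_dim=32):
        super().__init__()
        # 3D conv encoder: TSDF grid (1, D, H, W) -> feature volume (feat_dim, D, H, W)
        self.encoder = nn.Sequential(
            nn.Conv3d(1, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv3d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
        )
        # Per-point heads conditioned on interpolated local features + query coordinates
        self.quality_head = nn.Sequential(nn.Linear(feat_dim + 3, 64), nn.ReLU(), nn.Linear(64, 1))
        self.rotation_head = nn.Sequential(nn.Linear(feat_dim + 3, 64), nn.ReLU(), nn.Linear(64, 4))
        self.width_head = nn.Sequential(nn.Linear(feat_dim + 3, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, tsdf, points):
        # tsdf:   (B, 1, D, H, W) TSDF voxel grid
        # points: (B, N, 3) grasp-center queries, normalized to [-1, 1]
        feats = self.encoder(tsdf)                                  # (B, C, D, H, W)
        grid = points.view(points.size(0), 1, 1, -1, 3)             # (B, 1, 1, N, 3)
        local = F.grid_sample(feats, grid, align_corners=True)      # (B, C, 1, 1, N)
        local = local.flatten(2).permute(0, 2, 1)                   # (B, N, C)
        x = torch.cat([local, points], dim=-1)                      # (B, N, C + 3)
        quality = torch.sigmoid(self.quality_head(x)).squeeze(-1)   # (B, N) grasp quality
        rotation = F.normalize(self.rotation_head(x), dim=-1)       # (B, N, 4) unit quaternions
        width = self.width_head(x).squeeze(-1)                      # (B, N) gripper width
        return quality, rotation, width

model = ToyImplicitGrasp()
tsdf = torch.zeros(1, 1, 40, 40, 40)       # dummy 40^3 TSDF grid
centers = torch.rand(1, 128, 3) * 2 - 1    # 128 candidate grasp centers
q, rot, w = model(tsdf, centers)
```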
Installation
- Create a conda environment.
- Install the packages listed in `requirements.txt`. Then install `torch-scatter` following its installation instructions, matching your `pytorch` and `cuda` versions (a quick import check is sketched right after this list).
- Go to the root directory and install the project locally using `pip`:

  ```
  pip install -e .
  ```

- Build the ConvONets dependents by running `python scripts/convonet_setup.py build_ext --inplace`.
- Download the data, then unzip and place the data folder under the repo's root. Pretrained models of GIGA, GIGA-Aff, and VGN are in `data/models`.
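As a quick, optional sanity check (not part of the repo's own instructions), you can confirm that `torch` and `torch-scatter` import correctly and see your GPU before moving on:

```python
# Optional sanity check that the dependencies installed above import cleanly
# and that CUDA is visible (not required by the repo; just a convenience).
import torch
import torch_scatter

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torch_scatter", torch_scatter.__version__)
```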
Self-supervised Data Generation
Raw synthetic grasping trials
Pile scenario:
```
python scripts/generate_data_parallel.py --scene pile --object-set pile/train --num-grasps 4000000 --num-proc 40 --save-scene ./data/pile/data_pile_train_random_raw_4M
```
Packed scenario:
```
python scripts/generate_data_parallel.py --scene packed --object-set packed/train --num-grasps 4000000 --num-proc 40 --save-scene ./data/pile/data_packed_train_random_raw_4M
```
Please run `python scripts/generate_data_parallel.py -h` to print all options.
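Before cleaning, it can help to look at how imbalanced the raw grasp labels are. The sketch below assumes a VGN-style raw layout with a `grasps.csv` containing a binary `label` column; the file name, path, and column are assumptions about the generated output, so adjust them to what your run actually produces.

```python
# Hedged sketch: inspect positive/negative balance of the raw grasp trials.
# Assumes a VGN-style `grasps.csv` with a binary `label` column (adjust as needed).
import pandas as pd

df = pd.read_csv("data/pile/data_pile_train_random_raw_4M/grasps.csv")
counts = df["label"].value_counts()
print(counts)
print("positive rate: %.3f" % (counts.get(1, 0) / len(df)))
```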
Data cleaning and processing
First clean and balance the data using:
```
python scripts/clean_balance_data.py /path/to/raw/data
```
Then construct the dataset (add noise):
```
python scripts/construct_dataset_parallel.py --num-proc 40 --single-view --add-noise dex /path/to/raw/data /path/to/new/data
```
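The `--add-noise dex` option corrupts the rendered depth before the TSDF is fused, so training data resembles real sensor input. The snippet below is a deliberately simplified stand-in for that idea (additive Gaussian noise plus random missing pixels); it is not the dex noise model used by the script.

```python
# Simplified stand-in for corrupting synthetic depth images (NOT the actual
# dex noise model used by --add-noise dex; for illustration only).
import numpy as np

def add_depth_noise(depth, sigma=0.001, dropout=0.005, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    noisy = depth + rng.normal(0.0, sigma, size=depth.shape)  # additive Gaussian noise
    mask = rng.random(depth.shape) < dropout                  # randomly drop pixels
    noisy[mask] = 0.0
    return noisy

depth = np.full((480, 640), 0.5, dtype=np.float32)  # dummy 0.5 m depth image
noisy_depth = add_depth_noise(depth)
```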
Save occupancy data
Sampling occupancy data on the fly can be very slow and blocks training, so we sample and store the occupancy data in files beforehand:
```
python scripts/save_occ_data_parallel.py /path/to/raw/data 100000 2 --num-proc 40
```
Please run `python scripts/save_occ_data_parallel.py -h` to print all options.
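For intuition, the sketch below shows the general pattern of precomputing occupancy supervision offline: sample random 3D points, label them inside/outside a mesh, and save the result to disk. It uses a toy `trimesh` box; the repo's meshes, point counts, and file layout differ.

```python
# Minimal sketch of precomputing occupancy labels offline with trimesh.
# Toy box mesh and file layout for illustration; the repo's format differs.
import numpy as np
import trimesh

mesh = trimesh.creation.box(extents=[0.1, 0.1, 0.1])       # toy object mesh
points = np.random.uniform(-0.15, 0.15, size=(100000, 3))  # random query points
occ = mesh.contains(points)                                # boolean inside/outside labels
np.savez_compressed("occupancy_000000.npz",
                    points=points.astype(np.float32), occ=occ)
```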
Training
Train GIGA
Run:
```
# GIGA
python scripts/train_giga.py --dataset /path/to/new/data --dataset_raw /path/to/raw/data
```
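Training jointly supervises the grasp affordance heads and the occupancy head. The function below is only a schematic of such a multi-task objective (loss choices, masking, and weighting are simplified and do not reproduce the repo's exact losses); in practice, rotation and width terms are usually restricted to positive grasps.

```python
# Schematic multi-task objective combining grasp affordance and occupancy terms
# (simplified; not the exact losses or weighting used by the repo).
import torch
import torch.nn.functional as F

def joint_loss(pred, target, w_occ=1.0):
    # pred/target: dicts of tensors; "quality" (B, N) and "occ" (B, M) are
    # probabilities in [0, 1], "rotation" (B, N, 4) unit quaternions, "width" (B, N).
    l_quality = F.binary_cross_entropy(pred["quality"], target["quality"])
    # Quaternion distance that treats q and -q as the same rotation.
    l_rot = (1.0 - (pred["rotation"] * target["rotation"]).sum(-1).abs()).mean()
    l_width = F.mse_loss(pred["width"], target["width"])
    l_occ = F.binary_cross_entropy(pred["occ"], target["occ"])
    return l_quality + l_rot + l_width + w_occ * l_occ
```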
Simulated grasping
Run:
```
python scripts/sim_grasp_multiple.py --num-view 1 --object-set (packed/test | pile/test) --scene (packed | pile) --num-rounds 100 --sideview --add-noise dex --force --best --model /path/to/model --type (vgn | giga | giga_aff) --result-path /path/to/result
```
This command will run the experiment with each seed specified in the arguments.
Run `python scripts/sim_grasp_multiple.py -h` to print a complete list of optional arguments.
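If you want to summarize the saved outcomes yourself, the sketch below assumes the results include a CSV with one row per grasp attempt and a binary `success` column; both the file name and the column are assumptions about the output format, so check what is actually written under `--result-path`.

```python
# Hedged sketch of summarizing grasp outcomes; the file name and `success`
# column are assumptions about the result format, not guaranteed by the repo.
import pandas as pd

df = pd.read_csv("/path/to/result/grasps.csv")
print("grasp success rate: %.1f%% over %d attempts"
      % (100.0 * df["success"].mean(), len(df)))
```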