Overview

GraspNet Baseline

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020).

[paper] [dataset] [API] [doc]


[Teaser figure: top 50 grasps detected by our baseline model.]

Requirements

  • Python 3
  • PyTorch 1.6
  • Open3D >= 0.8
  • TensorBoard 2.3
  • NumPy
  • SciPy
  • Pillow
  • tqdm

Installation

Get the code.

git clone https://github.com/graspnet/graspnet-baseline.git
cd graspnet-baseline

Install packages via Pip.

pip install -r requirements.txt

Compile and install pointnet2 operators (code adapted from votenet).

cd pointnet2
python setup.py install

Compile and install knn operator (code adapted from pytorch_knn_cuda).

cd knn
python setup.py install

Install graspnetAPI for evaluation.

git clone https://github.com/graspnet/graspnetAPI.git
cd graspnetAPI
pip install .
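
If the installation succeeded, the package should be importable. A minimal check is shown below; the class names GraspGroup and GraspNet are assumed from graspnetAPI's public interface, so adjust the import if your installed version differs.

# Hedged import check for graspnetAPI (class names assumed from its public interface).
from graspnetAPI import GraspGroup, GraspNet
print('graspnetAPI imported:', GraspGroup.__name__, GraspNet.__name__)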

Tolerance Label Generation

Tolerance labels are not included in the original dataset and need to be generated separately. Make sure you have downloaded the original dataset from GraspNet. The generation code is in dataset/generate_tolerance_label.py. You can generate the tolerance labels by running the script (--dataset_root and --num_workers should be specified according to your settings):

cd dataset
sh command_generate_tolerance_label.sh

Or you can download the tolerance labels from Google Drive/Baidu Pan and run:

mv tolerance.tar dataset/
cd dataset
tar -xvf tolerance.tar
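
After extraction, a quick sanity check can confirm the labels are readable. This is a minimal sketch: the tolerance/ folder name and the .npy extension are assumptions about the archive layout, so adjust them to whatever the tar command actually produced.

# Hedged sanity check: list the extracted tolerance files and load one of them.
import glob
import numpy as np

files = sorted(glob.glob('tolerance/*.npy'))
print(len(files), 'tolerance label files found')
if files:
    print(files[0], np.load(files[0]).shape)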

Training and Testing

Training examples are shown in command_train.sh. --dataset_root, --camera and --log_dir should be specified according to your settings. You can use TensorBoard to visualize the training process.

Testing examples are shown in command_test.sh, which contains inference and result evaluation. --dataset_root, --camera, --checkpoint_path and --dump_dir should be specified according to your settings. Set --collision_thresh to -1 for fast inference.

The pretrained weights can be downloaded from:

checkpoint-rs.tar and checkpoint-kn.tar are trained using RealSense data and Kinect data respectively.
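
Before running the test script, it can be worth verifying that a downloaded checkpoint loads. A minimal sketch with plain PyTorch follows; the checkpoint's internal dictionary keys are not documented here, so the snippet only prints whatever keys are present (check test.py / demo.py for the exact key used when restoring weights).

# Hedged checkpoint inspection: the .tar files are assumed to be regular torch.save archives.
import torch

ckpt = torch.load('checkpoint-rs.tar', map_location='cpu')
print('checkpoint keys:', list(ckpt.keys()))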

Demo

A demo program is provided for grasp detection and visualization on RGB-D images. You can refer to command_demo.sh to run it. --checkpoint_path should be specified according to your settings (make sure you have downloaded the pretrained weights). The visualized output should look similar to the teaser image above.

Try your own data by modifying get_and_process_data() in demo.py. Refer to doc/example_data/ for data preparation. RGB-D images and camera intrinsics are required for inference. factor_depth is the scale factor that converts raw depth values into meters. You can also add a workspace mask for denser output.
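
As a starting point, here is a minimal sketch of such a data-preparation routine. It is not the repo's exact code: the file names (color.png, depth.png), the intrinsics values and factor_depth below are placeholders for your own capture. It back-projects the depth image into a colored point cloud like the one demo.py feeds to the network, and visualizes it with Open3D as a sanity check.

# Hedged sketch of custom RGB-D data preparation (all paths/values are placeholders).
import numpy as np
import open3d as o3d
from PIL import Image

color = np.array(Image.open('color.png'), dtype=np.float32) / 255.0   # (H, W, 3) in [0, 1]
depth = np.array(Image.open('depth.png'))                             # raw depth, e.g. uint16
fx, fy, cx, cy = 631.5, 631.2, 638.4, 366.5                           # your camera intrinsics
factor_depth = 1000.0                                                 # raw depth units per meter

# Back-project every pixel with valid depth into the camera frame.
h, w = depth.shape
u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth / factor_depth
x = (u - cx) * z / fx
y = (v - cy) * z / fy
mask = z > 0
points = np.stack([x, y, z], axis=-1)[mask]
colors = color[mask]

# Visual sanity check of the cloud that would be sampled and fed to the network.
cloud = o3d.geometry.PointCloud()
cloud.points = o3d.utility.Vector3dVector(points)
cloud.colors = o3d.utility.Vector3dVector(colors)
o3d.visualization.draw_geometries([cloud])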

Results

The "In repo" results report model performance with single-view collision detection applied as post-processing. In evaluation we set --collision_thresh to 0.01.
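
For reference, this collision post-processing can be applied to the predicted grasps roughly as follows. This is a hedged sketch: the ModelFreeCollisionDetector class and its argument names are recalled from utils/collision_detector.py in this repo and may not match your checkout exactly, so treat it as an illustration of the idea (voxelize the scene cloud, flag grasps whose gripper volume intersects it, drop them) rather than the definitive call.

# Hedged sketch of single-view collision post-processing on predicted grasps.
# `gg` is a graspnetAPI GraspGroup decoded from the network output and
# `cloud_points` is the (N, 3) scene point cloud used for inference.
from utils.collision_detector import ModelFreeCollisionDetector  # assumed module path

def filter_colliding_grasps(gg, cloud_points, voxel_size=0.01, collision_thresh=0.01):
    detector = ModelFreeCollisionDetector(cloud_points, voxel_size=voxel_size)
    collision_mask = detector.detect(gg, approach_dist=0.05, collision_thresh=collision_thresh)
    return gg[~collision_mask]  # keep only collision-free grasps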

Evaluation results on RealSense camera:

             Seen                     Similar                  Novel
             AP     AP0.8  AP0.4      AP     AP0.8  AP0.4      AP     AP0.8  AP0.4
In paper     27.56  33.43  16.95      26.11  34.18  14.23      10.55  11.25   3.98
In repo      47.47  55.90  41.33      42.27  51.01  35.40      16.61  20.84   8.30

Evaluation results on Kinect camera:

             Seen                     Similar                  Novel
             AP     AP0.8  AP0.4      AP     AP0.8  AP0.4      AP     AP0.8  AP0.4
In paper     29.88  36.19  19.31      27.84  33.19  16.62      11.51  12.92   3.56
In repo      42.02  49.91  35.34      37.35  44.82  30.40      12.17  15.17   5.51

Citation

Please cite our paper in your publications if it helps your research:

@inproceedings{fang2020graspnet,
  title={GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping},
  author={Fang, Hao-Shu and Wang, Chenxi and Gou, Minghao and Lu, Cewu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={11444--11453},
  year={2020}
}

License

All data, labels, code and models belong to the GraspNet team, MVIG, SJTU. They are freely available for non-commercial use and may be redistributed under these conditions. For commercial queries, please drop an email at fhaoshu at gmail_dot_com and cc lucewu at sjtu.edu.cn.

Comments
  • In the last step, is it necessary to artificially refine the gt grasp data set for generation?

    If so, what tools are used to quickly mark up positive and negative samples?

    opened by gzchenjiajun 12
  • What do grasp_height and grasp_depth mean?

    I read the source code models/graspnet.py:79, but I don't understand what grasp_height and grasp_depth mean.

    I understand other parameters. For example, grasp_width is the gripper's gripping width.

    thanks

    opened by gzchenjiajun 11
  • Grasp Score greater than 1

    Hi everyone. I am using your work to build a vision-based grasping system for my master's thesis. When running demo.py and observing the grasp scores, some grasps have a score greater than 1, but from your paper I would have expected a maximum grasp score of 1. So, why are there grasp scores greater than 1? Thanks.

    opened by FrancescoRosa3 9
  • Multi-GPU training issue

    Hello, when trying to train with multiple GPUs I set the following: os.environ["CUDA_VISIBLE_DEVICES"] = '0,3' and net = nn.DataParallel(net.cuda()). But the following error occurs when the model loads data: RuntimeError: Caught RuntimeError in replica 0 on device 0. Did the authors ever encounter this problem when training with multiple GPUs?

    opened by xiaozheng-liu 8
  • CUDA kernel failed : no kernel image is available for execution on the device

    Hello, my machine has two GPUs, a 2080 Ti and a 1080, with driver NVIDIA-SMI 450.66 and CUDA 10.1.

    I want to run the test on the 1080, so I changed CUDA_VISIBLE_DEVICES=0 to CUDA_VISIBLE_DEVICES=1 in command_test.sh.

    But when I run sh command_test.sh, the following error is reported:

    CUDA kernel failed : no kernel image is available for execution on the device void furthest_point_sampling_kernel_wrapper(int, int, int, const float*, float*, int*) at L:233 in /home/agent/grasp/graspnet-baseline/pointnet2/_ext_src/src/sampling_gpu.cu

    The error occurs roughly at this line in inference():

    for batch_idx, batch_data in enumerate(TEST_DATALOADER)

    What could be causing this?

    opened by wozxfdha 8
  • AttributeError: 'ColorVisuals' object has no attribute 'crc'

    I tried to use sh command_test.sh to test a trained model, but it reported an error when loading the dexnet model, as shown below. I would like to ask if you have any ideas on how to solve this problem. Thanks very much.

    Loading data path and collision labels...: 100%|████████████████████████| 90/90 [00:00<00:00, 219.96it/s]
    23040
    23040
    -> loaded checkpoint logs/log_rs/checkpoint.tar (epoch: 12)
    Loading data path...: 100%|█████████████████████████████████████████████| 90/90 [00:00<00:00, 205.90it/s]
    multiprocessing.pool.RemoteTraceback: 
    """
    Traceback (most recent call last):
      File "/home/amax/miniconda3/envs/py37/lib/python3.7/multiprocessing/pool.py", line 121, in worker
        result = (True, func(*args, **kwds))
      File "/home/amax/zzhaoao/Grasp/graspnetAPI/graspnetAPI/graspnet_eval.py", line 121, in eval_scene
        model_list, dexmodel_list, _ = self.get_scene_models(scene_id, ann_id=0)
      File "/home/amax/zzhaoao/Grasp/graspnetAPI/graspnetAPI/graspnet_eval.py", line 52, in get_scene_models
        dexmodel = pickle.load(f)
    AttributeError: 'ColorVisuals' object has no attribute 'crc'
    """
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "test.py", line 119, in <module>
        evaluate()
      File "test.py", line 113, in evaluate
        res, ap = ge.eval_all(cfgs.dump_dir, proc=cfgs.num_workers)
      File "/home/amax/zzhaoao/Grasp/graspnetAPI/graspnetAPI/graspnet_eval.py", line 305, in eval_all
        res = np.array(self.parallel_eval_scenes(scene_ids = list(range(100, 190)), dump_folder = dump_folder, proc = proc))
      File "/home/amax/zzhaoao/Grasp/graspnetAPI/graspnetAPI/graspnet_eval.py", line 231, in parallel_eval_scenes
        scene_acc_list.append(res.get())
      File "/home/amax/miniconda3/envs/py37/lib/python3.7/multiprocessing/pool.py", line 657, in get
        raise self._value
    AttributeError: 'ColorVisuals' object has no attribute 'crc'
    
    opened by zhaozj89 4
  • Grasps not on object + what is grasp height and depth?

    @Fang-Haoshu @chenxi-wang Thanks for this publication as well as the open source code.

    I have a question regarding the resulting grasps. Using the pretrained weights and the demo code, the network will sometimes not return any grasps on the object (see the attached image). Do you have any idea how to make it work in this scenario?

    Additionally, I do not understand how to use the grasp depth and height. I looked at this issue, but still do not get what this information represents.

    Using a parallel gripper, shouldn't the translation and rotation be enough to determine where to grasp the object? I have implemented your code on a Universal Robots manipulator mounted with a RealSense D435 and an OnRobot parallel gripper, and when sending the TCP to the grasp translation and rotation, it always seems to be a few cm short of gripping the object correctly. Could this be related to not using the depth and height information?

    Best regards, Emil Holst

    opened by holst456 4
  • RealSense D435 point cloud quality issue

    Great article! I managed to get the code up and running and I am considering using it for my bachelor thesis in robotics. My problem is that I cannot get nearly as good point clouds as you have in your demo data. Did you use any filters, fuse multi-view point clouds, etc.? (See the attached before/after images.)

    opened by holst456 4
  • Is GraspNet more suitable for a top-down view?

    Hi, thanks for sharing your work. It really facilitates robotic grasping. I have been using your code recently in Gazebo. I found that the generated grasp points are on the edge of the table or insert into the body of the can, as in the attached figure. The elevation of the camera is set to 45 degrees in my case. Could I ask whether GraspNet is more suitable for a top-down view (nearly 90 degrees, perpendicular to the table) or top-down grasping?
    opened by hetolin 4
  • About collision_label/scene_0110/collision_labels.npz in Dataset

    Hi, I encountered an error as follows:

    import os
    import numpy as np
    data_root = '/media/wind/Share/DataSet/GraspNet1Billion'
    collision_label = np.load(os.path.join(data_root, 'collision_label/scene_0110/collision_labels.npz'))
    print(collision_label.files)
    print(collision_label['arr_8'].shape)
    

    Output:

    ['arr_0', 'arr_4', 'arr_5', 'arr_2', 'arr_3', 'arr_7', 'arr_1', 'arr_8', 'arr_6']

    OSError                                   Traceback (most recent call last)
    <ipython-input> in <module>()
          3 collision_label = np.load(os.path.join(data_root, 'collision_label/scene_0110/collision_labels.npz'))
          4 print(collision_label.files)
    ----> 5 print(collision_label['arr_8'].shape)

    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/site-packages/numpy/lib/npyio.py in __getitem__(self, key)
    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/site-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/site-packages/numpy/lib/format.py in _read_bytes(fp, size, error_template)
    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/zipfile.py in read(self, n)
    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/zipfile.py in _read1(self, n)
    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/zipfile.py in _read2(self, n)
    /home/wind/anaconda3/envs/pytorch160/lib/python3.7/zipfile.py in read(self, n)

    OSError: [Errno 5] Input/output error

    opened by yanjh97 4
  • Meaning of the values in meta.mat

    Hello @Fang-Haoshu.

    Would you mind giving a quick explanation of the meaning of each field inside your meta.mat file?

    {'__header__': b'MATLAB 5.0 MAT-file Platform: posix, Created on: Tue Mar 17 14:51:33 2020',
     '__version__': '1.0',
     '__globals__': [],
     'poses': array([[[-0.9756758 ,  0.20997444,  0.71809477, -0.10539012,
               0.28258908,  0.9570219 , -0.6063459 ,  0.6341392 ,
              -0.95235777],
             [-0.05635871,  0.7769375 , -0.6932231 , -0.9939491 ,
              -0.95306253, -0.1432863 , -0.11288764,  0.7661504 ,
              -0.22686528],
             [ 0.21185002, -0.59353083, -0.06149539, -0.0309534 ,
              -0.10869765, -0.25214702,  0.7871474 ,  0.10431238,
              -0.20383038],
             [ 0.1117    , -0.0714    , -0.0837    , -0.1092    ,
               0.0027    , -0.2043    , -0.0069    , -0.1581    ,
               0.0391    ]],
     
            [[ 0.21893798,  0.14432675, -0.64941216,  0.9167016 ,
               0.8435303 , -0.24861187,  0.7650126 , -0.66297054,
              -0.28011078],
             [-0.2993843 , -0.6250445 , -0.6992277 , -0.08504111,
               0.30085424,  0.04236081, -0.3529656 ,  0.6081769 ,
               0.915004  ],
             [ 0.9286739 , -0.76713043,  0.2989055 , -0.39041796,
              -0.44490823, -0.96767646,  0.53867525, -0.4365672 ,
               0.29035428],
             [ 0.0326    ,  0.0548    , -0.0858    ,  0.135     ,
              -0.0408    ,  0.0057    , -0.1437    , -0.0485    ,
               0.0349    ]],
     
            [[ 0.0110857 , -0.9669956 , -0.25020745,  0.38542327,
               0.45672753,  0.14933594,  0.21702617, -0.39791653,
               0.1206343 ],
             [ 0.95246667,  0.07541542, -0.17470661, -0.06952123,
               0.03403645,  0.9887743 ,  0.9288012 ,  0.20768833,
               0.33361623],
             [ 0.30444106, -0.24337623, -0.95229924,  0.9201172 ,
               0.8889553 ,  0.00491755,  0.30037972,  0.8936039 ,
              -0.9349586 ],
             [ 0.4197    ,  0.473     ,  0.4442    ,  0.5089    ,
               0.4402    ,  0.471     ,  0.3932    ,  0.4459    ,
               0.4803    ]]], dtype=float32),
     'cls_indexes': array([[ 1,  3,  6,  9, 23, 27, 52, 62, 39]]),
     'intrinsic_matrix': array([[631.54864502,   0.        , 638.43517329],
            [  0.        , 631.20751953, 366.49904066],
            [  0.        ,   0.        ,   1.        ]]),
     'factor_depth': array([[1000.]])}
    

    Thanks

    opened by gachiemchiep 4
  • Annotate grasp pose

    Thanks for your great job. Your annotation method is brilliant.

    I noticed your annotation method includes two main parts:

    1. 6D pose annotation
    2. Grasp pose annotation

    Could you please provide the tools used for annotating the second part (grasp pose)? We are stuck on this part of the annotation. Thanks very much.

    opened by qingweihk 2
Owner
GraspNet
GraspNet-1Billion official organization. Make general grasping great!