O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning (CoRL 2021)

Kaichun Mo

Last update: Nov 4, 2022

Related tags

Deep Learning o2oafford

Overview

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning

Object-object Interaction Affordance Learning. For a given object-object interaction task (e.g., fitting), our method takes as inputs a 3D acting object point cloud (the bucket) and a partial 3D scan of the scene/object (the cabinet), and outputs an affordance prediction heatmap that estimates the likelihood of the acting object successfully accomplishing the task at every pixel. At the test time, one may easily sample a position from the heatmap to perform the action.

Introduction

Contrary to the vast literature in modeling, perceiving, and understanding agent-object (e.g., human-object, hand-object, robot-object) interaction in computer vision and robotics, very few past works have studied the task of object-object interaction, which also plays an important role in robotic manipulation and planning tasks. There is a rich space of object-object interaction scenarios in our daily life, such as placing an object on a messy tabletop, fitting an object inside a drawer, pushing an object using a tool, etc. In this paper, we propose a unified affordance learning framework to learn object-object interaction for various tasks. By constructing four object-object interaction task environments using physical simulation (SAPIEN) and thousands of ShapeNet models with rich geometric diversity, we are able to conduct large-scale object-object affordance learning without the need for human annotations or demonstrations. At the core of technical contribution, we propose an object-kernel point convolution network to reason about detailed interaction between two objects. Experiments on large-scale synthetic data and real-world data prove the effectiveness of the proposed approach.

About the paper

O2O-Afford is accepted to CoRL 2021!

Our team: Kaichun Mo, Yuzhe Qin, Fanbo Xiang, Hao Su, Leonidas J. Guibas from Stanford University and UC San Diego.

ArXiv Version: https://arxiv.org/abs/2106.15087

Project Page: https://cs.stanford.edu/~kaichun/o2oafford/

Citations

@inProceedings{mo2021o2oafford,
    title={{O2O-Afford}: Annotation-Free Large-Scale Object-Object Affordance Learning},
    author={Mo, Kaichun and Qin, Yuzhe and Xiang, Fanbo and Su, Hao and Guibas, Leonidas},
    year={2021},
    booktitle={Conference on Robot Learning (CoRL)}
}

Code

Please go to exps folder and refer to the README there.

Questions

Please post issues for questions and more helps on this Github repo page. We encourage using Github issues instead of sending us emails since your questions may benefit others.

License

MIT Licence

Updates

[Nov 8, 2021] Preliminary vesion of Data and Code released.

Comments

How to deploy the project on a remote Linux server?
Hi, I want to train a model using GPUs on my remote Linux server, which does not have GUI. But when every dependency (except for Blender) was prepared, I faced some problems with the SAPIEN environment. When I run this:python env_placement/collect_data.py 40147 StorageFurniture, there was no OpenGL window and I got an error:

X Error of failed request: GLXBadContextTag Major opcode of failed request: 146 (GLX) Minor opcode of failed request: 5 (X_GLXMakeCurrent) Serial number of failed request: 146 Current serial number in output stream: 146

and bash scripts/gen_data-placement.sh got the same error.

I guess it's a remote display problem. I would like to ask you how to solve this kind of problem on a remote server.

Thanks a lot.
opened by HanXuan-Li 0

ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge (ManiSkill Challenge), a large-scale learning-from-demonstrations benchmark for object manipulation.

ManiSkill-Learn ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge, a large-scale learning-from-dem

48 Dec 30, 2022

DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021)

DeepLM DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021) Run Please install th

130 Dec 2, 2022

Official Implement of CVPR 2021 paper “Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting”

RGBT Crowd Counting Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin. "Cross-Modal Collaborative Representation Learning and a L

37 Dec 8, 2022

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)

GraspNet Baseline Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020). [paper] [dataset] [API] [do

209 Dec 29, 2022

FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery (TGRS)

FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery by Ailong Ma, Junjue Wang*, Yanfei Zhon

43 Jan 5, 2023

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training By Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, Xiangyang Xue. This

290 Dec 29, 2022

[ICLR 2021, Spotlight] Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks, ICLR 2021 (Spotlight) Demo | Paper [NEW!] Time to play with our interac

373 Jan 2, 2023

This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.

Skeleton Aware Multi-modal Sign Language Recognition By Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li and Yun Fu. Smile Lab @ Northeastern

128 Dec 8, 2022

Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021

Portrait Photo Retouching with PPR10K Paper | Supplementary Material PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask an

184 Dec 11, 2022

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning (CoRL 2021)

Related tags

Overview

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning

Introduction

About the paper

Citations

Code

Questions

License

Updates

You might also like...

ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge (ManiSkill Challenge), a large-scale learning-from-demonstrations benchmark for object manipulation.

DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021)

Official Implement of CVPR 2021 paper “Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting”

Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)

FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery (TGRS)

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

[ICLR 2021, Spotlight] Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.

Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021

Comments

How to deploy the project on a remote Linux server?

Owner

Kaichun Mo

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

This project aim to create multi-label classification annotation tool to boost annotation speed and make it more easier.

Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation (CoRL 2021)

Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Pytorch implementation of One-Shot Affordance Detection

Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing

Official implementation of "Accelerating Reinforcement Learning with Learned Skill Priors", Pertsch et al., CoRL 2020

Public implementation of "Learning from Suboptimal Demonstration via Self-Supervised Reward Regression" from CoRL'21

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)

LIVECell - A large-scale dataset for label-free live cell segmentation