code of paper „Unknown Object Segmentation from Stereo Images“, IROS 2021

Overview

Unknown Object Segmentation from Stereo Images

Unknown Object Segmentation from Stereo Images Maximilian Durner*, Wout Boerdijk*, Martin Sundermeyer, Werner Friedl, Zoltan-Csaba Marton, and Rudolph Triebel. Accepted at IROS2021. paper, dataset

Overview

Abstract

Although instance-aware perception is a key prerequisite for many autonomous robotic applications, most of the methods only partially solve the problem by focusing solely on known object categories. However, for robots interacting in dynamic and cluttered environments, this is not realistic and severely limits the range of potential applications. Therefore, we propose a novel object instance segmentation approach that does not require any semantic or geometric information of the objects beforehand. In contrast to existing works, we do not explicitly use depth data as input, but rely on the insight that slight viewpoint changes, which for example are provided by stereo image pairs, are often sufficient to determine object boundaries and thus to segment objects. Focusing on the versatility of stereo sensors, we employ a transformer-based architecture that maps directly from the pair of input images to the object instances. This has the major advantage that instead of a noisy, and potentially incomplete depth map as an input, on which the segmentation is computed, we use the original image pair to infer the object instances and a dense depth map. In experiments in several different application domains, we show that our Instance Stereo Transformer (INSTR) algorithm outperforms current state-of-the-art methods that are based on depth maps. Training code and pretrained models are available at https://github.com/DLR-RM/instr.

Citation

If you find our work useful, please cite us:

@article{durner2021unknown,
    title={Unknown Object Segmentation from Stereo Images}, 
    author={Maximilian Durner and Wout Boerdijk and Martin Sundermeyer and Werner Friedl and Zoltan-Csaba Marton and Rudolph Triebel},
    year={2021},
    eprint={2103.06796},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Content Description

This repository contains code and a pre-trained model for the Instance Stereo Transformer (INSTR), and everything necessary to reproduce the main results presented in our paper. It also contains scripts to generate synthetic data, and a demo script.

Requirements: Hardware

For Training

Nvidia GPU with >= 11GB memory (or adjust the batch size). We trained on a Nvidia RTX 2080 Ti.

For Testing

Nvidia GPU with >= 3GB memory

Requirements: Software

It is reommended that you use a conda environment for installation. Run conda env create -f environment.yaml to install a conda environment named instr. Then, run conda activate instr to activate it.

Getting Started

Pre-trained Model

We provide a pre-trained model (link). With it you can

  • reproduce the main results from the paper
  • directly use it for a demo
  • use it as an initial starting point for fine-tuning

Read the following for more information.

Train / Fine-tune INSTR

To train INSTR from scratch following the procedure in the paper first create the synthetic dataset (see the README). Otherwise, adapt the dataloader accordingly, if you want to fine-tune / train on another dataset. Adapt the paths in the config, expecially lines 9, 13 and 17. The config format follows YACS, so you can also easily modify options via the command line (see command below).

To start a training, run python train.py --config-file configs/config.yaml OPTIONAL_ARGS, where OPTIONAL_ARGS overwrite config settings. For example, python train.py --config-file configs/config.yaml MODEL.AUX_DECODER_LOSS False AXIAL_ATTENTION False will train without disparity loss and axial attention.

Running with the default config will store config, models and tensorboard visualization in ./output/instr_{time_stamp}.

For fine-tuning it is suggested to reduce the learning rate by a factor of 10, e.g. python train.py --config-file configs/config.yaml OPTIMIZER.LR 0.00001.

Evaluate INSTR on STIOS

STIOS is a table-top object dataset recorded by two stereo sensors with manual annotated instances for each frame (website, code utilities). To evaluate and reproduce the experiments in the paper (Tab. 2), download STIOS and the pretrained model. Extract the pretrained model in the project's root directory. Then, run python predict_stios.py --state-dict pretrained_instr/models/pretrained_model.pth --root /path/to/stios {--rcvisard, --zed}. This will generate mIoU and F1 scores for every scene.

Demo

Download the pretrained model and extract the contents here. Overwrite the camera class so that it returns a pair of stereo images (RGB, np.array, uint8) from your stereo camera. Then, run python demo.py for the default demo.

Run python demo.py --help or have a look at the predictor class for further information.

You might also like...
 RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching

RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching This repository contains the source code for our paper: RAFT-Stereo: Multilevel

Differentiable Factor Graph Optimization for Learning Smoothers @ IROS 2021
Differentiable Factor Graph Optimization for Learning Smoothers @ IROS 2021

Differentiable Factor Graph Optimization for Learning Smoothers Overview Status Setup Datasets Training Evaluation Acknowledgements Overview Code rele

[IROS'21] SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning
[IROS'21] SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning

SurRoL IROS 2021 SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning Features dVRK compati

SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)
SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)

SuMa++: Efficient LiDAR-based Semantic SLAM This repository contains the implementation of SuMa++, which generates semantic maps only using three-dime

Official page of Struct-MDC (RA-L'22 with IROS'22 option); Depth completion from Visual-SLAM using point & line features
Official page of Struct-MDC (RA-L'22 with IROS'22 option); Depth completion from Visual-SLAM using point & line features

Struct-MDC (click the above buttons for redirection!) Official page of "Struct-MDC: Mesh-Refined Unsupervised Depth Completion Leveraging Structural R

Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".

GNN_PPI Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction". Lear

An image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testingAn image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testing
An image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testingAn image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testing

SVM Données Une base d’images contient 490 images pour l’apprentissage (400 voitures et 90 bateaux), et encore 21 images pour fait des tests. Prétrait

Unknown Horizons official code repository
Unknown Horizons official code repository

Unknown-Horizons based on Fifengine is no longer in development. We are porting it to Godot Engine. Please dont report any new bugs. Only bugfixes wil

the code for our CVPR 2021 paper Bilateral Grid Learning for Stereo Matching Network [BGNet]
the code for our CVPR 2021 paper Bilateral Grid Learning for Stereo Matching Network [BGNet]

BGNet This repository contains the code for our CVPR 2021 paper Bilateral Grid Learning for Stereo Matching Network [BGNet] Environment Python 3.6.* C

Python library for serializing any arbitrary object graph into JSON. It can take almost any Python object and turn the object into JSON. Additionally, it can reconstitute the object back into Python.

jsonpickle jsonpickle is a library for the two-way conversion of complex Python objects and JSON. jsonpickle builds upon the existing JSON encoders, s

Neural Factorization of Shape and Reflectance Under An Unknown Illumination
Neural Factorization of Shape and Reflectance Under An Unknown Illumination

NeRFactor [Paper] [Video] [Project] This is the authors' code release for: NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown I

Towards Debiasing NLU Models from Unknown Biases

Towards Debiasing NLU Models from Unknown Biases Abstract: NLU models often exploit biased features to achieve high dataset-specific performance witho

CUP-DNN is a deep neural network model used to predict tissues of origin for cancers of unknown of primary.

CUP-DNN CUP-DNN is a deep neural network model used to predict tissues of origin for cancers of unknown of primary. The model was trained on the expre

KalmanFilterExercise - A Kalman Filter is a algorithmic filter that is used to estimate the state of an unknown variable
KalmanFilterExercise - A Kalman Filter is a algorithmic filter that is used to estimate the state of an unknown variable

Kalman Filter Exercise What are Kalman Filters? A Kalman Filter is a algorithmic

Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment
Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment

Voronoi Multi_Robot Collaborate Exploration Introduction In the unknown environment, the cooperative exploration of multiple robots is completed by Vo

Video Object Segmentation(VOS) From Zero to HeroVideo Object Segmentation(VOS) From Zero to Hero

Video Object Segmentation(VOS) From Zero to Hero! Goal 1:train a two layers cnn model for vos. Finish! see model.py FFNet for more diteal.(2021.9.30)

Honours project, on creating a depth estimation map from two stereo images of featureless regions

image-processing This module generates depth maps for shape-blocked-out images Install If working with anaconda, then from the root directory: conda e

Multi View Stereo on Internet Images
Multi View Stereo on Internet Images

Evaluating MVS in a CPC Scenario This repository contains the set of artficats used for the ENGN8601/8602 research project. The thesis emphasizes on t

Pytorch codes for
Pytorch codes for "Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation"

Self-Supervised-MVS This repository is the official PyTorch implementation of our AAAI 2021 paper: "Self-supervised Multi-view Stereo via Effective Co

Comments
  • Is there a compatibility issue with Python 3.8?

    Is there a compatibility issue with Python 3.8?

    I'm running into issues when running the bproc.init() in the quickstart example. I've tried this on two different environments: On my virtual environment I get ModuleNotFoundError: No module named 'pip._vendor.retrying' and on my global environment I get ModuleNotFoundError: No module named 'pip._vendor.retrying'.

    From this SO thread I'm wondering if I'm facing a compatibility issue.

    opened by alexander-soare 1
Owner
DLR-RM
German Aerospace Center (DLR) - Institute of Robotics and Mechatronics (RM) - open source projects
DLR-RM
Codes of CVPR2022 paper: Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction

Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction Figure 1. Teaser. Introduction This paper studies the problem

Yining Hong 32 Dec 29, 2022
Find dead Python code

Vulture - Find dead code Vulture finds unused code in Python programs. This is useful for cleaning up and finding errors in large code bases. If you r

Jendrik Seipp 2.4k Dec 27, 2022
Safe code refactoring for modern Python.

Safe code refactoring for modern Python projects. Overview Bowler is a refactoring tool for manipulating Python at the syntax tree level. It enables s

Facebook Incubator 1.4k Jan 4, 2023
Removes commented-out code from Python files

eradicate eradicate removes commented-out code from Python files. Introduction With modern revision control available, there is no reason to save comm

Steven Myint 146 Dec 13, 2022
A library that modifies python source code to conform to pep8.

Pep8ify: Clean your code with ease Pep8ify is a library that modifies python source code to conform to pep8. Installation This library currently works

Steve Pulec 117 Jan 3, 2023
Simple, hassle-free, dependency-free, AST based source code refactoring toolkit.

refactor is an end-to-end refactoring framework that is built on top of the 'simple but effective refactorings' assumption. It is much easier to write a simple script with it rather than trying to figure out what sort of a regex you need in order to replace a pattern (if it is even matchable with regexes).

Batuhan Taskaya 385 Jan 6, 2023
Turn your C++/Java code into a Python-like format for extra style points and to make everyone hates you

Turn your C++/Java code into a Python-like format for extra style points and to make everyone hates you

Tô Đức (Watson) 4 Feb 7, 2022
Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

null 68 Dec 14, 2022
The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for LiDAR-Based Place Recognition.

OverlapTransformer The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for

HAOMO.AI 136 Jan 3, 2023
Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch .

PyTorch-High-Res-Stereo-Depth-Estimation Python scripts form performing stereo depth estimation using the high res stereo model in PyTorch. Stereo dep

Ibai Gorordo 26 Nov 24, 2022