Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Adaptive Task-Relational Context (ATRC)

This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense Prediction. The code is organized using PyTorch Lightning.

Overview

ATRC is an attention-driven module that refines task-specific dense predictions by capturing cross-task contexts. Through Neural Architecture Search (NAS), ATRC selects the context type used for multi-modal distillation separately for each source-target task pair, based on the relation between the two tasks. We investigate four context types: global, local, t-label and s-label (as well as the option to sever the cross-task connection). In the overview figure of the paper, each CP block handles one source-target task connection.
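
For illustration, the outcome of the search can be thought of as one context choice per source-target task pair. The sketch below shows one way such a configuration could be represented; the task names, enum values and example choices are purely illustrative and do not mirror the repository's internal encoding.

from enum import Enum

class ContextType(Enum):
    # The five options ATRC chooses from for each source-target pair.
    GLOBAL = "global"
    LOCAL = "local"
    T_LABEL = "t-label"
    S_LABEL = "s-label"
    NONE = "none"  # sever the cross-task connection

# Hypothetical NYUD-v2 task set and a possible search outcome:
# genotype[target][source] = context used to distill information from source into target.
tasks = ["semseg", "depth", "normals", "edge"]
genotype = {
    "semseg": {"depth": ContextType.GLOBAL, "normals": ContextType.S_LABEL, "edge": ContextType.NONE},
    "depth": {"semseg": ContextType.T_LABEL, "normals": ContextType.LOCAL, "edge": ContextType.NONE},
}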

We provide code for searching ATRC configurations and training various multi-modal distillation networks on the NYUD-v2 and PASCAL-Context benchmarks, based on HRNet backbones.

Usage

Requirements

The code is run in a conda environment with Python 3.8.11:

conda install pytorch==1.7.0 torchvision==0.8.1 cudatoolkit=10.1 -c pytorch
conda install pytorch-lightning==1.1.8 -c conda-forge
conda install opencv==4.4.0 -c conda-forge
conda install scikit-image==0.17.2
pip install jsonargparse[signatures]==3.17.0

NOTE: PyTorch Lightning is still under heavy development, so make sure to use version 1.1.8 with this code to avoid compatibility issues.
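
As a quick sanity check (not part of the repository), the pinned versions can be verified from Python before launching any experiments:

# Check that the environment matches the versions listed above.
import cv2
import pytorch_lightning
import skimage
import torch
import torchvision

print("torch:", torch.__version__)                           # expect 1.7.0
print("torchvision:", torchvision.__version__)               # expect 0.8.1
print("pytorch-lightning:", pytorch_lightning.__version__)   # expect 1.1.8
print("opencv:", cv2.__version__)                            # expect 4.4.0
print("scikit-image:", skimage.__version__)                  # expect 0.17.2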

Download the Data

Before running the code, download and extract the datasets to any directory $DATA_DIR:

wget https://data.vision.ee.ethz.ch/brdavid/atrc/NYUDv2.tar.gz -P $DATA_DIR
wget https://data.vision.ee.ethz.ch/brdavid/atrc/PASCALContext.tar.gz -P $DATA_DIR
tar xfvz $DATA_DIR/NYUDv2.tar.gz -C $DATA_DIR && rm $DATA_DIR/NYUDv2.tar.gz
tar xfvz $DATA_DIR/PASCALContext.tar.gz -C $DATA_DIR && rm $DATA_DIR/PASCALContext.tar.gz
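
If wget is not available, the same download and extraction can be scripted in Python. The sketch below uses only the standard library and mirrors the shell commands above:

import os
import tarfile
import urllib.request

data_dir = os.environ.get("DATA_DIR", "./data")  # same directory as $DATA_DIR above
urls = [
    "https://data.vision.ee.ethz.ch/brdavid/atrc/NYUDv2.tar.gz",
    "https://data.vision.ee.ethz.ch/brdavid/atrc/PASCALContext.tar.gz",
]

os.makedirs(data_dir, exist_ok=True)
for url in urls:
    archive = os.path.join(data_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive)   # download the tarball
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(data_dir)               # extract into the data directory
    os.remove(archive)                         # mirror the rm in the commands above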

ATRC Search

To start an ATRC search on NYUD-v2 with an HRNetV2-W18-small backbone, run, for example:

python ./src/main_search.py --cfg ./config/nyud/hrnet18/atrc_search.yaml --datamodule.data_dir $DATA_DIR --trainer.gpus 2 --trainer.accelerator ddp

The path to the data directory $DATA_DIR needs to be provided. At every validation epoch, the current ATRC configuration is saved as an atrc_genotype.json file in the log directory.
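
The exact JSON schema is defined by the code; assuming it stores a nested target-to-source mapping of context choices, a saved configuration can be inspected along these lines (the file path is a placeholder):

import json

# Hypothetical path: point this at the atrc_genotype.json written to your log directory.
with open("path/to/log_dir/atrc_genotype.json") as f:
    genotype = json.load(f)

# Assuming a nested target -> source mapping, list the selected context per task pair.
for target, sources in genotype.items():
    for source, context in sources.items():
        print(f"{source} -> {target}: {context}")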

Multi-Modal Distillation Network Training

To train an ATRC distillation network, supply the path to the corresponding atrc_genotype.json file, located in, e.g., $GENOTYPE_DIR:

python ./src/main.py --cfg ./config/nyud/hrnet18/atrc.yaml --model.atrc_genotype_path $GENOTYPE_DIR/atrc_genotype.json --datamodule.data_dir $DATA_DIR --trainer.gpus 1

Some genotype files can be found under genotypes/.

Baselines can be run by selecting the corresponding config file, e.g., for the multi-task learning baseline:

python ./src/main.py --cfg ./config/nyud/hrnet18/baselinemt.yaml --datamodule.data_dir $DATA_DIR --trainer.gpus 1

Evaluation of boundary detection is disabled, since the MATLAB-based SEISM repository was used to obtain the optimal-dataset F-measure scores reported in the paper. Instead, this code simply saves the boundary predictions to disk.
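
To sanity-check the saved predictions before handing them to SEISM, something along the following lines can be used; the output directory name and the single-channel PNG format are assumptions and need to be adapted to the actual log directory layout:

import glob
import cv2

# Hypothetical output location and format: adapt the glob pattern to your log directory.
for path in sorted(glob.glob("path/to/log_dir/edge_predictions/*.png"))[:5]:
    pred = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # boundary map saved as an 8-bit image
    print(path, pred.shape, pred.min(), pred.max())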

Citation

If you find this code useful in your research, please consider citing the paper:

@InProceedings{bruggemann2020exploring,
  Title     = {Exploring Relational Context for Multi-Task Dense Prediction},
  Author    = {Bruggemann, David and Kanakis, Menelaos and Obukhov, Anton and Georgoulis, Stamatios and Van Gool, Luc},
  Booktitle = {ICCV},
  Year      = {2021}
}

Credit

The pretrained backbone weights and code are from MMSegmentation. The distilled surface normal and saliency labels for PASCAL-Context are from ASTMT. Local attention CUDA kernels are from this repo.

Contact

For questions about the code or paper, feel free to contact me by email.

Comments
  • Why single-level optimization?

    Hello and congrats on your great work! I would like to ask whether you have, by any chance, attempted to use bi-level optimization for your proposed "learning-to-attend" scheme. From my understanding (please correct me if I am wrong), you converge to different types of attention in each run and select the optimal inter-task attention type by majority voting across five runs (i.e. 3x global, 1x local, 1x S-label). Would bi-level instead of single-level optimization remedy this issue? Do you have any insights on that yourself?

    opened by KatsarosEf 2
  • Potential bugs of the "relationalcontext" head

    Hi @brdav ,

    It seems that the current version of the relationalcontext head may have a potential bug when handling t-label or s-label contexts: it might need a bi-directional auxiliary-task calculation.

    https://github.com/brdav/atrc/blob/36a2f11b15f1e8630331506b7136a8299088506d/src/model/heads/relationalcontext.py#L157

    might need to be changed to

    # Collect every task that takes part in a t-label or s-label connection,
    # either as the target or as the source of that context.
    for task in self.tasks:
        if any(self.atrc_genotype[task][source] in (3, 4) for source in self.tasks):
            aux_tasks.append(task)
            continue
        if any(self.atrc_genotype[target][task] in (3, 4) for target in self.tasks):
            aux_tasks.append(task)
    

    Thanks

    opened by HarborYuan 2
  • Issues about the NYUD dataset

    Hi,

    Thanks for your work on multi-task learning.

    Currently, I am working on my own project with ATRC, and I tried to use the multi-task baseline as a starting point. However, using the baselinemt.yaml config file (batch size 32, 80k iterations), I only got 0.6755 RMSE for depth, 22.2 MAE for normals and 33.96 mIoU for semseg, compared to 0.6248, 21.02 and 36.35 in the paper. Smaller batch sizes and fewer iterations lead to even worse results.

    So I want to ask: did I miss something, or is there something I can do to build my own multi-task baseline in this codebase?

    Best, Harbor

    opened by HarborYuan 2
  • fix NYUD dataset

    Hi,

    Thanks for your work.

    I am trying to work with your project, but it seems that the NYUD data loader has something abnormal. I have tried to figure it out; could you please check whether it is correct?

    Thanks again.

    opened by HarborYuan 1