AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS 2020)

Introduction

AdaShare is a novel and differentiable approach for efficient multi-task learning that learns the feature sharing pattern to achieve the best recognition accuracy, while restricting the memory footprint as much as possible. Our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network. In other words, we aim to obtain a single network for multi-task learning that supports separate execution paths for different tasks.
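The idea of separate execution paths through one shared network can be illustrated with a toy sketch. This is not the repository's implementation: the blocks are hypothetical scalar functions standing in for network layers, the residual form (input plus block output) is an assumption made for illustration, and the policies are fixed by hand rather than learned.

```python
from typing import Callable, Dict, List, Sequence

Block = Callable[[float], float]


def run_task_path(x: float, blocks: Sequence[Block], policy: Sequence[int]) -> float:
    """Pass x through the shared blocks; policy[l] == 1 executes block l, 0 skips it."""
    if len(blocks) != len(policy):
        raise ValueError("policy must have one decision per block")
    for block, execute in zip(blocks, policy):
        if execute:
            # Executed block: shortcut plus block output (assumed residual form).
            x = x + block(x)
        # Skipped block: x is carried forward unchanged.
    return x


# Hypothetical stand-ins for three shared layers.
shared_blocks: List[Block] = [lambda v: 1.0, lambda v: 2.0 * v, lambda v: -0.5 * v]

# Hand-written per-task policies (in AdaShare these are learned).
policies: Dict[str, List[int]] = {
    "task_a": [1, 1, 0],  # shares block 0 with task_b, uses block 1 alone
    "task_b": [1, 0, 1],  # shares block 0 with task_a, uses block 2 alone
}

outputs = {name: run_task_path(1.0, shared_blocks, p) for name, p in policies.items()}
print(outputs)
```

Both tasks use the same set of block parameters; only the select-or-skip decisions differ, which is what keeps the memory footprint at that of a single network.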

Here is the link for our arXiv version.

Please cite our work if you find it helpful to your research.

@article{sun2020adashare,
  title={Adashare: Learning what to share for efficient deep multi-task learning},
  author={Sun, Ximeng and Panda, Rameswar and Feris, Rogerio and Saenko, Kate},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

Experiment Environment

Our implementation is in PyTorch. We train and test our model on 1 Tesla V100 GPU for NYU v2 2-task and CityScapes 2-task, and on 2 Tesla V100 GPUs for NYU v2 3-task and Tiny-Taskonomy 5-task.

We use Python 3.6. Please refer to this link to create a Python 3.6 conda environment.

Install the listed packages in the virtual environment:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install matplotlib
conda install -c menpo opencv
conda install pillow
conda install -c conda-forge tqdm
conda install -c anaconda pyyaml
conda install scikit-learn
conda install -c anaconda scipy
pip install tensorboardX
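After installing, a quick check that every package is importable can save a failed training launch. The sketch below is not part of the repository; it only looks up the import names that these packages are normally installed under (for example, opencv is imported as cv2 and pyyaml as yaml).

```python
import importlib.util

# Install name -> module name used in "import ...".
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "matplotlib": "matplotlib",
    "opencv": "cv2",
    "pillow": "PIL",
    "tqdm": "tqdm",
    "pyyaml": "yaml",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "tensorboardX": "tensorboardX",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the install names whose module cannot be found in this environment."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated conda environment; any name it prints still needs to be installed.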

Datasets

Please download the formatted datasets for NYU v2 here.

The formatted CityScapes can be found here.

Download Tiny-Taskonomy as instructed by its GitHub.

The formatted DomainNet can be found here.

Remember to change dataroot to your local dataset path in every yaml file under ./yamls/.
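Editing every config by hand is error-prone, so the change can be scripted. The helper below is a hypothetical sketch, not something shipped with the repository. It assumes each config writes the key as a single line of the form "dataroot: <path>" (at any indentation) and rewrites only the value on that line; anything after the key on the same line, including a trailing comment, is replaced.

```python
import re
from pathlib import Path

# Matches a whole "dataroot: ..." line and captures everything up to the value.
DATAROOT_LINE = re.compile(r"^([ \t]*dataroot[ \t]*:[ \t]*).*$", re.MULTILINE)


def set_dataroot(yaml_dir, new_root):
    """Rewrite the dataroot value in every .yml/.yaml file under yaml_dir.

    Returns the list of files that were modified.
    """
    root = Path(yaml_dir)
    changed = []
    for path in sorted(root.rglob("*.yml")) + sorted(root.rglob("*.yaml")):
        text = path.read_text()
        # A function replacement avoids backslash handling in new_root.
        new_text = DATAROOT_LINE.sub(lambda m: m.group(1) + new_root, text)
        if new_text != text:
            path.write_text(new_text)
            changed.append(path)
    return changed


if __name__ == "__main__":
    # "/path/to/your/datasets" is a placeholder; substitute your local path.
    for path in set_dataroot("./yamls", "/path/to/your/datasets"):
        print("updated", path)
```

Since each dataset lives in its own directory, it may be more practical to call the function once per dataset on the matching subset of config files.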

Training

Policy Learning Phase

Please execute train.py for policy learning, using the command

python train.py --config <yaml_file_name> --gpus <gpu ids>

For example, python train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0.

Sample yaml files are under yamls/adashare.

Note: use domainnet branch for experiments on DomainNet, i.e. python train_domainnet.py --config <yaml_file_name> --gpus <gpu ids>

Retrain Phase

After the Policy Learning Phase, we sample 8 different architectures and execute re-train.py to retrain each of them.

python re-train.py --config <yaml_file_name> --gpus <gpu ids> --exp_ids <random seed id>

where we use different --exp_ids to specify different random seeds and generate different architectures. The best performance of all 8 runs is reported in the paper.

For example, python re-train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0 --exp_ids 0.
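Launching the 8 retraining runs one after another can be scripted. The sketch below only assembles the re-train.py command shown above for several seeds; using the ids 0 through 7 is an assumption, since the README shows only --exp_ids 0, and the helper names are hypothetical.

```python
import subprocess
import sys


def retrain_commands(config, gpus="0", num_runs=8):
    """Build one re-train.py command per seed id (assumed to be 0..num_runs-1)."""
    return [
        [sys.executable, "re-train.py",
         "--config", config,
         "--gpus", gpus,
         "--exp_ids", str(seed)]
        for seed in range(num_runs)
    ]


def run_all(commands):
    """Run the commands sequentially, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    commands = retrain_commands("yamls/adashare/nyu_v2_2task.yml")
    for cmd in commands:
        print(" ".join(cmd))
    # Uncomment to launch the runs from the repository root:
    # run_all(commands)
```

The same pattern applies to test.py, which takes the same --exp_ids argument.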

Note: use domainnet branch for experiments on DomainNet, i.e. python re-train_domainnet.py --config <yaml_file_name> --gpus <gpu ids>

Test/Inference

After the Retrain Phase, execute test.py to get the quantitative results on the test set.

python test.py --config <yaml_file_name> --gpus <gpu ids> --exp_ids <random seed id>

For example, python test.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0 --exp_ids 0.

We provide our trained checkpoints as follows:

Please download our model in NYU v2 2-Task Learning
Please download our model in CityScapes 2-Task Learning
Please download our model in NYU v2 3-Task Learning

To use the provided checkpoints, download them to ../experiments/checkpoints/ and uncompress them there. Use the following commands to test:

python test.py --config yamls/adashare/nyu_v2_2task_test.yml --gpus 0 --exp_ids 0
python test.py --config yamls/adashare/cityscapes_2task_test.yml --gpus 0 --exp_ids 0
python test.py --config yamls/adashare/nyu_v2_3task_test.yml --gpus 0 --exp_ids 0

Test with our pre-trained checkpoints

We also provide some sample images for quickly testing our model on NYU v2 3-task.

Please download our model in NYU v2 3-Task Learning

Execute test_sample.py to test on sample images in ./nyu_v2_samples, using the command

python test_sample.py --config yamls/adashare/nyu_v2_3task_test.yml --gpus 0

It will print the average quantitative results of sample images.

Note

If any link is invalid or you have any questions, please email [email protected]

Comments

Training stuck at 28%

Thanks for your open source code. When I train with the following commands:

python train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0
python re-train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0 --exp_ids 0

My terminal always outputs the following result and it ends.

28%|████████████████████████████████████████████████████▌                                                                                                                                          | 11/40 [01:16<03:21,  6.96s/it]
 28%|██████████████████████████████████████████████████▎                                                                                                                                    | 11/40 [13:15:37<34:57:33, 4339.76s/it]
 28%|██████████████████████████████████████████████████▎                                                                                                                                    | 11/40 [15:05:14<39:46:32, 4937.67s/it]
 28%|███████████████████████████████████████████▍

Is it normal?

opened by e96031413 6

About the padding_policy
Thanks for releasing the code!

When reading the code in deeplab_resnet.py:

    skip_layer = sum(self.layers) - num_train_layers
    if cuda_device != -1:
        padding = torch.ones(skip_layer, 2).to(cuda_device)
    else:
        padding = torch.ones(skip_layer, 2)
    padding[:, 1] = 0
    padding_policys = []
    feats = []
    for t_id in range(self.num_tasks):
        padding_policy = torch.cat((padding.float(), self.policys[t_id][-num_train_layers:].float()), dim=0)
        padding_policys.append(padding_policy)
        feats.append(self.backbone(img, padding_policy))
else:
    feats = [self.backbone(img)] * self.num_tasks

I wonder whether the padding policy is meant to implement the curriculum strategy. Looking forward to your reply. Thank you!
opened by J-Wu97 2
Confusion about the results of NYU v2 2-task

Hi Ximeng, I am reading your NeurIPS 2020 publication "AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning" and I am also confused about the reported results of NYU v2 2-task. I find that the performance of NDDR-CNN is lower than what was reported in the original paper. Is this caused by a difference in backbone or in other settings? Would you mind providing more details? Thank you very much!

opened by slyviacassell 2

sklearn FutureWarning on 28% problem

After switching to another machine to train on the nyu_v2 dataset, my terminal printed the following message. It seems that the 28% problem is related to keyword args in sklearn. Do you have any idea about this?

/opt/conda/lib/python3.6/site-packages/sklearn/utils/validation.py:70: FutureWarning: Pass labels=[ 0  1  2  3  4  5  6  7  8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39] as keyword args. From version 0.25 passing these as positional arguments will result in an error
  FutureWarning)
 28%|████████████████████████▍                                                                | 11/40 [01:04<02:49,  5.83s/it]

opened by e96031413 2

can you share your code?

Hi, I recently found your NeurIPS 2020 work, AdaShare. It is very interesting and gives me some novel ideas. Could you share the code to reproduce it? Thank you.

opened by heany 2
Problem with Reproducing the Single-Task Results on Cityscapes-Depth

Dear Author,

I'm trying to reproduce the single-task results on Cityscapes-Depth, which are presented in Table 2 (Abs.: 0.017, Rel.: 0.33, δ < 1.25, 1.25², 1.25³: 70.3, 86.3, 93.3), but I cannot get similar results. Here are some details of my experimental setting:

About model: 1. Use Deeplab_ResNet_Backbone (with parameters BasicBlock, [3, 4, 6, 3] for Deeplab-ResNet34) in your code to produce the features. 2. Then use the ASPP architecture as the head to get the output.

About data: 3. Use the CityScapes dataloader in your code.

About metric: 4. Use the depth_loss and depth_err in your code, and follow the logic of the eval() function in train.py.

About hyperparameters: 5. SGD with lr=0.01, batch size=16, iterations=6,000; then lr=0.0001, iterations=6,000.

Is there anything I missed? Could you please give me any suggestions for getting single-task results similar to those in your experiment? Thank you very much!

Best Regards, Lijun

opened by zhanglijun95 1
Line 50 should be False instead of True

If we use True, it is treated as the re-learning stage. However, when training with the yaml for the first time, it raises the error shown below:

ValueError: snapshot ./experiments/checkpoints/cityscapes_2task/warmup_model.pth.tar does not exist

opened by e96031413 0
Results on DomainNet

Dear Author,

Could you share your results on DomainNet? Like the domain accuracy of the single task model, naive multi-task model, and your method? Thank you!

opened by zhanglijun95 0
Calculation of number of parameters

Hi,

The paper reports that the parameter count of AdaShare is 1 and that of MTAN is 3.11 on NYU v2 3-task. Could you let me know how this number is calculated and what the unit is?

When running the code and using the "count_params" function, I get a result of 90.71 parameters. This number is different from the one in the paper.

Could you clarify this discrepancy?

Thank you for your time.

opened by NareshGuru77 0
ASPP module as heads

Hi,

Thank you for sharing the code of your project.

The paper mentions the use of an ASPP module as the head for each task, but the code seems to use the "Classification_Module" with different dilation rates. Is this the ASPP module used?

Also, is this the same module used for the single-task baseline?

Thank you.

opened by NareshGuru77 0
Guide on creating the 3 training datasets in policy learning phase

Hi,

I am interested in applying AdaShare to my own dataset, but I am having trouble formatting the 3 data loaders used in train.py since not much information is provided on how it was done.

Could you please explain how you break a given dataset into the three train, train1 and train2 sets?

Thanks

opened by rashindrie 0
Shape mismatch between task output and task _pred output

After running train.py and retrain.py, I ran test.py and was met with shape mismatch errors.

I am trying to get the precision_score using the sklearn.metrics package, but the task output from the GT is of shape (,) while the task _pred output is of shape (1). To fix this issue, I have to make a change in the task's error method in base_env.py to use torch.unsqueeze.

Any idea why there would be a shape mismatch all of a sudden after having worked for both train.py and retrain.py?

opened by atapley 0
