AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Overview

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS 2020)

Introduction

AdaShare is a novel and differentiable approach for efficient multi-task learning that learns the feature sharing pattern to achieve the best recognition accuracy, while restricting the memory footprint as much as possible. Our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network. In other words, we aim to obtain a single network for multi-task learning that supports separate execution paths for different tasks.
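
To make the idea of separate execution paths concrete, here is a toy sketch in plain Python. It is not the implementation in this repository: `SHARED_LAYERS`, `POLICIES`, `run_task` and `shared_layers` are hypothetical names, each "layer" is a simple function rather than a network block, and the policies are fixed by hand rather than learned. A policy entry of 1 executes the layer for that task; 0 skips it and passes the input through unchanged.

```python
# Toy illustration of task-specific select-or-skip policies.
# All names here are hypothetical; this is not the AdaShare code.
from typing import Callable, Dict, List

Layer = Callable[[float], float]

# One stack of layers that every task draws from.
SHARED_LAYERS: List[Layer] = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x - 3.0,
    lambda x: x * x,
]

# One binary decision per layer and per task: 1 = execute, 0 = skip.
POLICIES: Dict[str, List[int]] = {
    "segmentation": [1, 1, 0, 1],
    "depth": [1, 0, 1, 1],
}


def run_task(x: float, task: str) -> float:
    """Run the input through only the layers selected for the given task."""
    for layer, execute in zip(SHARED_LAYERS, POLICIES[task]):
        if execute:
            x = layer(x)
        # A skipped layer acts as an identity mapping.
    return x


def shared_layers(task_a: str, task_b: str) -> List[int]:
    """Indices of layers that both tasks execute, i.e. the shared part."""
    pairs = zip(POLICIES[task_a], POLICIES[task_b])
    return [i for i, (a, b) in enumerate(pairs) if a and b]
```

In this toy setting, `shared_layers("segmentation", "depth")` lists the layers both tasks use, while the remaining layers are effectively task-specific.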

Here is the link to our arXiv version.

Please cite our work if you find it helpful to your research.

@article{sun2020adashare,
  title={Adashare: Learning what to share for efficient deep multi-task learning},
  author={Sun, Ximeng and Panda, Rameswar and Feris, Rogerio and Saenko, Kate},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}

Experiment Environment

Our implementation is in PyTorch. We train and test our model on one Tesla V100 GPU for NYU v2 2-task and CityScapes 2-task, and on two Tesla V100 GPUs for NYU v2 3-task and Tiny-Taskonomy 5-task.

We use Python 3.6. Please refer to this link to create a Python 3.6 conda environment.

Install the listed packages in the virtual environment:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install matplotlib
conda install -c menpo opencv
conda install pillow
conda install -c conda-forge tqdm
conda install -c anaconda pyyaml
conda install scikit-learn
conda install -c anaconda scipy
pip install tensorboardX
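
After installing, a quick check that every package is importable can save a failed training run. The sketch below is an optional helper, not part of the repository; `REQUIRED_MODULES` and `missing_modules` are hypothetical names. It relies on the usual import names for these packages (opencv as `cv2`, pillow as `PIL`, pyyaml as `yaml`, scikit-learn as `sklearn`).

```python
# Optional environment check (hypothetical helper, not shipped with the repo).
import importlib.util

# Import names for the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "matplotlib", "cv2", "PIL",
    "tqdm", "yaml", "sklearn", "scipy", "tensorboardX",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All listed packages were found.")
```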

Datasets

Please download the formatted datasets for NYU v2 here.

The formatted CityScapes can be found here.

Download Tiny-Taskonomy as instructed by its GitHub.

The formatted DomainNet can be found here.

Remember to change the dataroot to your local dataset path in all yaml files under ./yamls/.
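
Editing every yaml file by hand is error-prone, so the dataroot change can be scripted. The sketch below is a hypothetical helper (`set_dataroot` is not part of the repository). It assumes that each config stores the path on a single line of the form `dataroot: <path>`, possibly indented, and that the files use the `.yml` extension as in the examples in this README. It rewrites only that line and leaves the rest of the file untouched.

```python
# Hypothetical helper: point every config under a directory at a local dataset path.
# Assumes a single-line "dataroot: <path>" entry in each .yml file.
import re
from pathlib import Path

# Captures the indentation and the "dataroot:" key so only the value is replaced.
DATAROOT_LINE = re.compile(r"^([ \t]*dataroot[ \t]*:[ \t]*).*$", re.MULTILINE)


def set_dataroot(yaml_dir, new_root):
    """Rewrite the dataroot value in all .yml files below yaml_dir.

    Returns the list of files that were actually modified.
    """
    changed = []
    for path in sorted(Path(yaml_dir).rglob("*.yml")):
        text = path.read_text()
        # A function replacement avoids backslash escapes being interpreted in new_root.
        new_text, count = DATAROOT_LINE.subn(lambda m: m.group(1) + new_root, text)
        if count and new_text != text:
            path.write_text(new_text)
            changed.append(path)
    return changed
```

For example, `set_dataroot("./yamls", "/path/to/your/datasets")` would update every config in one pass. Check one file afterwards, since each dataset usually needs its own path.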

Training

Policy Learning Phase

Please execute train.py for policy learning, using the command

python train.py --config <yaml_file_name> --gpus <gpu ids>

For example, python train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0.

Sample yaml files are under yamls/adashare

Note: use the domainnet branch for experiments on DomainNet, i.e. python train_domainnet.py --config <yaml_file_name> --gpus <gpu ids>

Retrain Phase

After the policy learning phase, we sample 8 different architectures and execute re-train.py for retraining:

python re-train.py --config <yaml_file_name> --gpus <gpu ids> --exp_ids <random seed id>

where we use different --exp_ids to specify different random seeds and generate different architectures. The best performance of all 8 runs is reported in the paper.

For example, python re-train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0 --exp_ids 0.
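
Since retraining is repeated for 8 architectures, the runs can be launched from a small driver script. The sketch below is a hypothetical convenience wrapper, not part of the repository. It assumes the 8 seeds are the ids 0 through 7 (the README only shows `--exp_ids 0`), and it simply calls `re-train.py` with the flags documented above, one run after another.

```python
# Hypothetical driver for the retrain phase; assumes seed ids 0..7.
import subprocess
import sys


def retrain_commands(config, gpus="0", num_runs=8):
    """Build one re-train.py command line per seed id."""
    return [
        [sys.executable, "re-train.py", "--config", config,
         "--gpus", gpus, "--exp_ids", str(exp_id)]
        for exp_id in range(num_runs)
    ]


def run_all(config, gpus="0", num_runs=8):
    """Run the retraining jobs sequentially, stopping at the first failure."""
    for cmd in retrain_commands(config, gpus, num_runs):
        subprocess.run(cmd, check=True)
```

Calling `run_all("yamls/adashare/nyu_v2_2task.yml")` from the repository root would then retrain all 8 sampled architectures in sequence.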

Note: use the domainnet branch for experiments on DomainNet, i.e. python re-train_domainnet.py --config <yaml_file_name> --gpus <gpu ids>

Test/Inference

After the retrain phase, execute test.py to get the quantitative results on the test set:

python test.py --config <yaml_file_name> --gpus <gpu ids> --exp_ids <random seed id>

For example, python test.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0 --exp_ids 0.

We provide our trained checkpoints as follows:

  1. Please download our model in NYU v2 2-Task Learning
  2. Please download our model in CityScapes 2-Task Learning
  3. Please download our model in NYU v2 3-Task Learning

To use the provided checkpoints, download them to ../experiments/checkpoints/ and uncompress them there. Use the following commands to test:

python test.py --config yamls/adashare/nyu_v2_2task_test.yml --gpus 0 --exp_ids 0
python test.py --config yamls/adashare/cityscapes_2task_test.yml --gpus 0 --exp_ids 0
python test.py --config yamls/adashare/nyu_v2_3task_test.yml --gpus 0 --exp_ids 0
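
The download-and-uncompress step can also be done from Python. The sketch below is a hypothetical helper (`unpack_checkpoint` is not part of the repository), and the archive file name in the usage note is a placeholder, since the README does not state the name or format of the downloaded files. It uses `shutil.unpack_archive`, which infers the format from the file extension (for example `.zip` or `.tar.gz`).

```python
# Hypothetical helper: extract a downloaded checkpoint archive into the expected folder.
import shutil
from pathlib import Path


def unpack_checkpoint(archive, checkpoint_dir="../experiments/checkpoints"):
    """Extract an archive into checkpoint_dir and list what the folder now contains."""
    target = Path(checkpoint_dir)
    target.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension.
    shutil.unpack_archive(str(archive), str(target))
    return sorted(p.name for p in target.iterdir())
```

For example, `unpack_checkpoint("downloaded_checkpoint.zip")` (placeholder file name) would extract into `../experiments/checkpoints/`, after which the test commands above can be run.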

Test with our pre-trained checkpoints

We also provide some sample images to easily test our model on NYU v2 3-task.

Please download our model in NYU v2 3-Task Learning

Execute test_sample.py to test on sample images in ./nyu_v2_samples, using the command

python test_sample.py --config yamls/adashare/nyu_v2_3task_test.yml --gpus 0

It will print the average quantitative results of sample images.

Note

If any link is broken or you have any questions, please email [email protected]

Comments
  • Training stuck at 28%

    Thanks for open-sourcing the code. When I train with the following commands:

    python train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0
    python re-train.py --config yamls/adashare/nyu_v2_2task.yml --gpus 0 --exp_ids 0
    

    My terminal always outputs the following result and it ends.

    28%|██████▌                 | 11/40 [01:16<03:21,  6.96s/it]
    28%|██████▎                 | 11/40 [13:15:37<34:57:33, 4339.76s/it]
    28%|██████▎                 | 11/40 [15:05:14<39:46:32, 4937.67s/it]
    28%|██████▍
    

    Is it normal?

    opened by e96031413 6
  • About the padding_policy

    Thanks for releasing the code!

    When reading the code in deeplab_resnet.py:

        skip_layer = sum(self.layers) - num_train_layers
        if cuda_device != -1:
            padding = torch.ones(skip_layer, 2).to(cuda_device)
        else:
            padding = torch.ones(skip_layer, 2)
        padding[:, 1] = 0
        padding_policys = []
        feats = []
        for t_id in range(self.num_tasks):
            padding_policy = torch.cat((padding.float(), self.policys[t_id][-num_train_layers:].float()), dim=0)
            padding_policys.append(padding_policy)
            feats.append(self.backbone(img, padding_policy))
    else:
        feats = [self.backbone(img)] * self.num_tasks
    

    I wonder whether the padding policy is there to implement the curriculum strategy. Looking forward to your reply. Thank you!

    opened by J-Wu97 2
  • Confusion about the results of NYU v2 2-task

    Hi Ximeng, I am reading your NeurIPS 2020 publication "AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning" and I am confused about the reported results for NYU v2 2-task. I find that the performance of NDDR-CNN is lower than what was reported in the original paper. Is this caused by a difference in backbone or other settings? Would you mind providing more details? Thank you very much!

    opened by slyviacassell 2
  • sklearn FutureWarning on 28% problem

    After switching to another machine to train on the nyu_v2 dataset, my terminal prints the following output. It seems that the 28% problem is related to the keyword args from sklearn. Do you have any idea about this?

    /opt/conda/lib/python3.6/site-packages/sklearn/utils/validation.py:70: FutureWarning: Pass labels=[ 0  1  2  3  4  5  6  7  8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
     24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39] as keyword args. From version 0.25 passing these as positional arguments will result in an error
      FutureWarning)
     28%|████████████████████████▍                                                                | 11/40 [01:04<02:49,  5.83s/it]
    
    opened by e96031413 2
  • can you share your code?

    Hi, I recently came across your work, AdaShare, at NeurIPS 2020. It is very interesting and gives me some new ideas. Could you share the code to reproduce it? Thank you.

    opened by heany 2
  • Problem with Reproducing the Single-Task Results on Cityscapes-Depth

    Dear Author,

    I'm trying to reproduce the single-task results on Cityscapes-Depth presented in Table 2 (Abs.: 0.017, Rel.: 0.33, δ < 1.25, 1.25², 1.25³: 70.3, 86.3, 93.3), but I cannot get similar results. Here are some details of my experimental setting:

    About the model:
    1. Use Deeplab_ResNet_Backbone (with parameters BasicBlock, [3, 4, 6, 3] for Deeplab-ResNet34) in your code to produce the features.
    2. Then use the ASPP architecture as the head to get the output.

    About the data:
    3. Use the CityScapes dataloader in your code.

    About the metric:
    4. Use the depth_loss and depth_err in your code, and follow the logic of the eval() function in train.py.

    About the hyperparameters:
    5. SGD with lr=0.01, batch size=16, iterations=6,000; then lr=0.0001, iterations=6,000.

    Is there anything I missed? Could you please give me any suggestions for getting single-task results similar to those in your experiment? Thank you very much!

    Best Regards, Lijun

    opened by zhanglijun95 1
  • Line 50 should be False instead of True

    If we use True, it is the re-learning stage. However, when training with the yaml for the first time, it raises the error shown below:

    ValueError: snapshot ./experiments/checkpoints/cityscapes_2task/warmup_model.pth.tar does not exist

    opened by e96031413 0
  • Results on DomainNet

    Dear Author,

    Could you share your results on DomainNet? Like the domain accuracy of the single task model, naive multi-task model, and your method? Thank you!

    opened by zhanglijun95 0
  • Calculation of number of parameters

    Hi,

    The paper reports that the parameter count of AdaShare is 1 and that of MTAN is 3.11 on NYU v2 3-task. Could you let me know how this number is calculated and what the unit is?

    When running the code and using the "count_params" function, I get a result of 90.71 parameters. This number is different from the one in the paper.

    Could you clarify this discrepancy?

    Thank you for your time.

    opened by NareshGuru77 0
  • ASPP module as heads

    Hi,

    Thank you for sharing the code of your project.

    The paper mentions the use of an ASPP module as the head for each task, but the code seems to use the "Classification_Module" with different dilation rates. Is this the ASPP module used?

    Also, is this the same module used for the single-task baseline?

    Thank you.

    opened by NareshGuru77 0
  • Guide on creating the 3 training datasets in policy learning phase

    Hi,

    I am interested in applying Adashare on my own dataset but having trouble trying to format the 3 data loaders used in train.py since not much information is provided on how it was done.

    Could you please explain how you break a given dataset into the three train, train1 and train2 sets?

    Thanks

    opened by rashindrie 0
  • Shape mismatch between task output and task _pred output

    After running train.py and retrain.py, I run test.py and am met with some errors in a shape mismatch.

    I am trying to get the precision_score using the sklearn.metrics package, but the task output from the GT is of shape (,) while the task _pred output is of shape (1). To fix this issue, I have to make a change in the task's error method in base_env.py to use torch.unsqueeze.

    Any idea why there would be a shape mismatch all of a sudden after having worked for both train.py and retrain.py?

    opened by atapley 0