[IEEE TPAMI21] MobileSal: Extremely Efficient RGB-D Salient Object Detection [PyTorch & Jittor]

Yu-Huan Wu

Last update: Jan 6, 2023

Related tags

Deep Learning MobileSal

Overview

MobileSal

IEEE TPAMI 2021: MobileSal: Extremely Efficient RGB-D Salient Object Detection

This repository contains full training & testing code, and pretrained saliency maps. We have achieved competitive performance on the RGB-D salient object detection task with a speed of 450fps.

If you run into any problems or feel any difficulties to run this code, do not hesitate to leave issues in this repository.

My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

[PDF]

Requirements

PyTorch

Python 3.6+
PyTorch >=0.4.1, OpenCV-Python
Tested on PyTorch 1.7.1

Jittor

Python 3.7+
Jittor, OpenCV-Python
Tested on Jittor 1.3.1

For Jittor users, we create a branch jittor. So please run the following command first:

git checkout jittor

Installing

Please prepare the required packages.

pip install -r envs/requirements.txt

Data Preparing

Before training/testing our network, please download the training data:

Preprocessed data of 6 datasets: [Google Drive], [Baidu Pan, 9nxi]

Note: if you are blocked by Google and Baidu services, you can contact me via e-mail and I will send you a copy of data and model weights.

We have processed the data to json format so you can use them without any preprocessing steps. After completion of downloading, extract the data and put them to ./data/ folder. Then, the ./datasets/ folder should contain six folders: NJU2K/, NLPR/, STERE/, SSD/, SIP/, DUT-RGBD/, representing NJU2K, NLPR, STEREO, SSD, SIP, DUTLF-D datasets, respectively.

Train

It is very simple to train our network. We have prepared a script to run the training step:

bash ./tools/train.sh

Pretrained Models

As in our paper, we train our model on the NJU2K_NLPR training set, and test our model on NJU2K_test, NLPR_test, STEREO, SIP, and SSD datasets. For DUTLF-D, we train our model on DUTLF-D training set and evaluate on its testing test.

(Default) Trained on NJU2K_NLPR training set:

Single-scale Training: [Google Drive], [Baidu Pan, 9nxi]
Multi-scale Training: [Google Drive], [Baidu Pan, 9nxi]

(Custom) Training on DUTLF-D training set:

Multi-scale Training: [Google Drive], [Baidu Pan, 9nxi]

Download them and put them into the pretrained/ folder.

Test / Evaluation / Results

After preparing the pretrained models, it is also very simple to test our network:

bash ./tools/test.sh

The scripts will automatically generate saliency maps on the maps/ directory.

Pretrained Saliency maps

For covenience, we provide the pretrained saliency maps on several datasets as below:

Single-scale Training: [Google Drive], [Baidu Pan, 9nxi]
Multi-scale Training: [Google Drive], [Baidu Pan, 9nxi]

TODO

Release the pretrained models and saliency maps on COME15K dataset.
Release the ONNX model for real-world applications.
Add results with the P2T transformer backbone.

Other Tips

I encourage everyone to contact me via my e-mail. My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

License

The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for NonCommercial use only.

Citations

If you are using the code/model/data provided here in a publication, please consider citing our work:

@ARTICLE{wu2021mobilesal,
  author={Wu, Yu-Huan and Liu, Yun and Xu, Jun and Bian, Jia-Wang and Gu, Yu-Chao and Cheng, Ming-Ming},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={MobileSal: Extremely Efficient RGB-D Salient Object Detection}, 
  year={2021},
  doi={10.1109/TPAMI.2021.3134684}
}

Acknowlogdement

This repository is built under the help of the following five projects for academic use only:

You might also like...

Official repository for MixFaceNets: Extremely Efficient Face Recognition Networks

MixFaceNets This is the official repository of the paper: MixFaceNets: Extremely Efficient Face Recognition Networks. (Accepted in IJCB2021) https://i

51 Dec 13, 2022

Jittor implementation of PCT:Point Cloud Transformer

PCT: Point Cloud Transformer This is a Jittor implementation of PCT: Point Cloud Transformer.

547 Jan 3, 2023

Jittor Medical Segmentation Lib -- The assignment of Pattern Recognition course (2021 Spring) in Tsinghua University

THU模式识别2021春 -- Jittor 医学图像分割模型列表本仓库收录了课程作业中同学们采用jittor框架实现的如下模型： UNet SegNet DeepLab V2 DANet EANet HarDNet及其改动HarDNet_alter PSPNet OCNet OCRNet DL

48 Dec 26, 2022

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

Jittor: a Just-in-time(JIT) deep learning framework Quickstart | Install | Tutorial | Chinese Jittor is a high-performance deep learning framework bas

2.7k Jan 3, 2023

GANSketchingJittor - Implementation of Sketch Your Own GAN in Jittor

GANSketching in Jittor Implementation of (Sketch Your Own GAN) in Jittor(计图). Or

10 Jul 2, 2022

Jittor 64*64 implementation of StyleGAN

StyleGanJittor (Tsinghua university computer graphics course) Overview Jittor 64

3 Jan 20, 2022

"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (CVPRW 2022) Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Z

274 Jan 5, 2023

Learning from Synthetic Shadows for Shadow Detection and Removal [Inoue+, IEEE TCSVT 2020].

Learning from Synthetic Shadows for Shadow Detection and Removal (IEEE TCSVT 2020) Overview This repo is for the paper "Learning from Synthetic Shadow

67 Dec 28, 2022

Official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Recognition" in AAAI2022.

AimCLR This is an official PyTorch implementation of "Contrastive Learning from Extremely Augmented Skeleton Sequences for Self-supervised Action Reco

44 Dec 17, 2022

Comments

FPS

Thanks for your great work! We are very interested in the FPS mentioned in the paper. However, after running the code in speed_test.py, we have the following confusion.

(1) The paper mentions using a single 2080Ti GPU to test and get a 450 fps speed, but actually running the self-defined tensor with a batch of 20 in speed_test.py on a single 2080Ti will out of memory. So, how can we get the 450fps on a single 2080Ti GPU?

(2) We found that a self-defined tensor is used for inference speed testing in speed_test.py, which is different from the process in test.py that uses an actual image. The comparison is perhaps unfair. So we are very curious what the exact FPS result would be in test.py with a batch of 1. And what about including the ‘’torch.cuda.synchronize()‘’?

opened by gbliao 12