DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Alibaba Cloud

Last update: Nov 27, 2022

Overview

This project hosts the code for implementing the DCT-Mask algorithm for instance segmentation.

[DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation] Xing Shen*, Jirui Yang*, Chunbo Wei, Bing Deng, Jianqiang Huang, Xiansheng Hua, Xiaoliang Cheng, Kewei Liang

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2021)

arXiv preprint (arXiv:2011.09876)

Contributions

We propose a high-quality, low-complexity mask representation for instance segmentation, which encodes a high-resolution binary mask into a compact vector using the discrete cosine transform (DCT).
With slight modifications, DCT-Mask can be integrated into most pixel-based frameworks, and it achieves significant and consistent improvements across different datasets, backbones, and training schedules. In particular, the gains are larger with more complex backbones and higher-quality annotations.
DCT-Mask does not require extra pre-processing or pre-training. It achieves high-resolution mask prediction at a speed similar to that of low-resolution prediction.
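
The encoding described in the first contribution can be illustrated with a short sketch. This is not the code from projects/DCT_Mask/; it is a minimal NumPy illustration under stated assumptions: the 128x128 mask resolution and the 300-dimensional vector are illustrative defaults, the helper names (dct_matrix, low_frequency_order, encode_mask, decode_mask) are hypothetical, and coefficients are selected by anti-diagonal order as a simplified stand-in for a zigzag scan.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that C @ x is the DCT of x."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def low_frequency_order(size):
    """Row/column indices of a size x size grid, sorted low to high frequency.

    Assumption: ordering by anti-diagonal (row + col) approximates a zigzag scan.
    """
    rows, cols = np.indices((size, size))
    rows, cols = rows.ravel(), cols.ravel()
    order = np.lexsort((rows, rows + cols))  # primary key: rows + cols
    return rows[order], cols[order]


def encode_mask(mask, vec_dim=300):
    """Keep the first vec_dim low-frequency 2D DCT coefficients of a square mask."""
    size = mask.shape[0]
    c = dct_matrix(size)
    coeffs = c @ mask.astype(np.float64) @ c.T  # separable 2D DCT-II
    rows, cols = low_frequency_order(size)
    return coeffs[rows[:vec_dim], cols[:vec_dim]]


def decode_mask(vector, size=128, threshold=0.5):
    """Zero-fill the missing coefficients, invert the DCT, and binarize."""
    rows, cols = low_frequency_order(size)
    coeffs = np.zeros((size, size))
    coeffs[rows[:len(vector)], cols[:len(vector)]] = vector
    c = dct_matrix(size)
    recon = c.T @ coeffs @ c  # inverse of the orthonormal 2D DCT
    return recon >= threshold


# Usage: encode a synthetic disk mask and measure the reconstruction IoU.
yy, xx = np.mgrid[:128, :128]
mask = (yy - 64) ** 2 + (xx - 64) ** 2 <= 40 ** 2
vector = encode_mask(mask, vec_dim=300)
recon = decode_mask(vector, size=128)
iou = np.logical_and(mask, recon).sum() / np.logical_or(mask, recon).sum()
print(vector.shape, round(float(iou), 3))
```

Because most of the energy of a binary mask sits in the low-frequency coefficients, a short vector can reconstruct the shape closely, which is why the representation stays compact even at high mask resolution.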

Installation

Requirements

PyTorch ≥ 1.5 and fvcore == 0.1.1.post20200716

This implementation is based on detectron2. Please refer to INSTALL.md for installation and dataset preparation.

Usage

The code for this project is in projects/DCT_Mask/

Train with multiple GPUs

cd ./projects/DCT_Mask/
./train1.sh

Testing

cd ./projects/DCT_Mask/
./test1.sh

Model ZOO

Trained models on COCO

Model	Backbone	Schedule	Multi-scale training	Inference time (s/im)	AP (minival)	Link
DCT-Mask R-CNN	R50	1x	Yes	0.0465	36.5	download(Fetch code: xpdm)
DCT-Mask R-CNN	R101	3x	Yes	0.0595	39.9	download(Fetch code: 7q6x)
DCT-Mask R-CNN	RX101	3x	Yes	0.1049	41.2	download(Fetch code: ufw2)
Cascade DCT-Mask R-CNN	R50	1x	Yes	0.0630	37.5	download(Fetch code: yqxp)
Cascade DCT-Mask R-CNN	R101	3x	Yes	0.0750	40.8	download(Fetch code: r8xv)
Cascade DCT-Mask R-CNN	RX101	3x	Yes	0.1195	42.0	download(Fetch code: pdej)

Trained models on Cityscapes

Model	Data	Backbone	Schedule	Multi-scale training	AP (val)	Link
DCT-Mask R-CNN	Fine-Only	R50	1x	Yes	37.0	download(Fetch code: dn7i)
DCT-Mask R-CNN	COCO-Pretrain + Fine	R50	1x	Yes	39.6	download(Fetch code: ntqf)

Notes

We observe about 0.2 AP noise on COCO.
We observe high variance on Cityscapes when training on fine annotations only. The paper reports the median AP of 5 runs (35.6), while this repo reports the best result (37.0).
Initializing from COCO pre-training reduces the variance on Cityscapes and increases mask AP.
Inference time is measured on a single GPU with batch size 1. All GPUs are NVIDIA V100.
LVIS 0.5 is used for evaluation.
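
For readers who want to reproduce a seconds-per-image figure like those in the tables, the sketch below shows one way to time single-image inference. It is an assumption about methodology, not the script used for the reported numbers: seconds_per_image, run_inference, and the sync hook are hypothetical names. On a GPU, sync would typically be a device synchronization call such as torch.cuda.synchronize, so that queued kernels finish before the clock is read.

```python
import time


def seconds_per_image(run_inference, images, warmup=2, sync=None):
    """Average wall-clock seconds per image, calling the model on one image at a time."""
    if sync is None:
        sync = lambda: None  # no-op on CPU; pass a device sync function on GPU
    # Warm-up calls are excluded from timing (lazy initialization, caches).
    for image in images[:warmup]:
        run_inference(image)
    sync()
    start = time.perf_counter()
    for image in images:
        run_inference(image)  # batch size 1: a single image per call
    sync()
    return (time.perf_counter() - start) / len(images)


# Usage with a stand-in "model" that just sums a list of numbers.
fake_images = [[1, 2, 3]] * 10
elapsed = seconds_per_image(sum, fake_images)
print(f"{elapsed:.6f} s/im")
```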

Contributing to the project

Any pull requests or issues are welcome.

If there is any problem with this project, please contact Xing Shen.

Citations

Please consider citing our papers in your publications if the project helps your research.

License

MIT License.

Comments

DCT decoding error

Hello Xingbaji,

I ran into a strange error when I tried to integrate the DCT-Mask loss into my project. The predicted instance masks all look like the image below. Do you have any suggestions for fixing it?

Best regards, Jiahua

opened by usherbob

Inf/NaN. Training has diverged.

FloatingPointError: Predicted boxes or scores contain Inf/NaN. Training has diverged.

opened by maralzar

instances dct mask head

I initialized training on my own dataset, but the check below is always true, and the instances output contains no ground truth:

for instances_per_image in instances:
    if len(instances_per_image) == 0:
        continue

# OUTPUT INSTANCES:
instances [Instances(num_instances=0, image_height=768, image_width=1151, fields=[proposal_boxes: Boxes(tensor([], device='cuda:0', size=(0, 4))), objectness_logits: tensor([], device='cuda:0'), gt_classes: tensor([], device='cuda:0', dtype=torch.int64), gt_boxes: Boxes(tensor([], device='cuda:0', size=(0, 4))), gt_masks: PolygonMasks(num_instances=0)]),
Instances(num_instances=0, image_height=768, image_width=1159, fields=[proposal_boxes: Boxes(tensor([], device='cuda:0', size=(0, 4))), objectness_logits: tensor([], device='cuda:0'), gt_classes: tensor([], device='cuda:0', dtype=torch.int64), gt_boxes: Boxes(tensor([], device='cuda:0', size=(0, 4))), gt_masks: PolygonMasks(num_instances=0)])]

opened by maralzar


