SeMask: Semantically Masked Transformers for Semantic Segmentation.

Picsart AI Research (PAIR)

Last update: Dec 30, 2022

Related tags

Deep Learning pytorch semantic-segmentation cityscapes ade20k coco-stuff-10k semask

Overview

SeMask: Semantically Masked Transformers

Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation.

Results
Setup Instructions
Citing SeMask

1. Results

Note: † denotes backbones pretrained on ImageNet-22k at 384x384 resolution.

ADE20K

Method	Backbone	Crop Size	mIoU	mIoU (ms+flip)	#params	config	Checkpoint
SeMask-T FPN	SeMask Swin-T	512x512	42.11	43.16	35M	config	TBD
SeMask-S FPN	SeMask Swin-S	512x512	45.92	47.63	56M	config	TBD
SeMask-B FPN	SeMask Swin-B^†	512x512	49.35	50.98	96M	config	TBD
SeMask-L FPN	SeMask Swin-L^†	640x640	51.89	53.52	211M	config	TBD
SeMask-L MaskFormer	SeMask Swin-L^†	640x640	54.75	56.15	219M	config	TBD
SeMask-L Mask2Former	SeMask Swin-L^†	640x640	56.41	57.52	222M	config	TBD
SeMask-L Mask2Former FAPN	SeMask Swin-L^†	640x640	56.68	58.00	227M	config	TBD
SeMask-L Mask2Former MSFAPN	SeMask Swin-L^†	640x640	56.54	58.22	224M	config	TBD

Cityscapes

Method	Backbone	Crop Size	mIoU	mIoU (ms+flip)	#params	config	Checkpoint
SeMask-T FPN	SeMask Swin-T	768x768	74.92	76.56	34M	config	TBD
SeMask-S FPN	SeMask Swin-S	768x768	77.13	79.14	56M	config	TBD
SeMask-B FPN	SeMask Swin-B^†	768x768	77.70	79.73	96M	config	TBD
SeMask-L FPN	SeMask Swin-L^†	768x768	78.53	80.39	211M	config	TBD
SeMask-L Mask2Former	SeMask Swin-L^†	512x1024	83.97	84.98	222M	config	TBD

COCO-Stuff 10k

Method	Backbone	Crop Size	mIoU	mIoU (ms+flip)	#params	config	Checkpoint
SeMask-T FPN	SeMask Swin-T	512x512	37.53	38.88	35M	config	TBD
SeMask-S FPN	SeMask Swin-S	512x512	40.72	42.27	56M	config	TBD
SeMask-B FPN	SeMask Swin-B^†	512x512	44.63	46.30	96M	config	TBD
SeMask-L FPN	SeMask Swin-L^†	640x640	47.47	48.54	211M	config	TBD

2. Setup Instructions

We provide the codebase with SeMask incorporated into various models. Please check the setup instructions inside the corresponding folders:

SeMask-FPN: Setup Instructions
SeMask-MaskFormer: Setup Instructions
SeMask-Mask2Former: Setup Instructions
SeMask-FAPN: Setup Instructions

3. Citing SeMask

@article{jain2022semask,
  title={SeMask: Semantically Masked Transformers for Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv preprint arXiv:...},
  year={2022}
}

Acknowledgements

Code is based heavily on the following repositories: Swin-Transformer-Semantic-Segmentation, Mask2Former, MaskFormer and FaPN-full.

Comments

Possibility to test on one single image

Hi, I would like to test the instance segmentation model on a single image. Is that possible? In test.py I don't see any input argument for testing on our own images. Here is the command I ran:

python tools/test.py configs/semask_swin/coco_stuff10k/semfpn_semask_swin_large_patch4_window12_640x640_80k_coco10k.py semask_large_fpn_coco10k.pth --eval mIoU --show-dir visuals

Thank you

opened by an99990 23
ONNX Model

Hi, I would like to convert the model to ONNX. When I run pytorch2onnx.py from the tools directory, I get:

RuntimeError: Exporting the operator roll to ONNX opset version 11 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

Thanks.

opened by romil611 18
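
The error above comes from `torch.roll`, which Swin-style backbones use for shifted windows and which that exporter could not map to opset 11. One workaround people commonly try (an assumption here, not something the maintainers confirm) is to express the roll as slicing plus concatenation, which exports as ordinary operators. The sketch below demonstrates the identity with NumPy so it runs without PyTorch; the helper name `roll_via_concat` is hypothetical, and the same slicing pattern would be written with `torch.cat` on tensors.

```python
import numpy as np


def roll_via_concat(x, shift, axis):
    """Cyclically shift `x` along `axis` using only slicing and concatenation."""
    n = x.shape[axis]
    shift %= n  # normalise negative and oversized shifts into [0, n)
    if shift == 0:
        return x

    def _slice(start, stop):
        # Build an index that slices only along `axis`.
        index = [slice(None)] * x.ndim
        index[axis] = slice(start, stop)
        return x[tuple(index)]

    tail = _slice(n - shift, n)  # the last `shift` entries move to the front
    head = _slice(0, n - shift)  # everything else follows
    return np.concatenate((tail, head), axis=axis)


# Toy feature map with made-up values, purely for illustration.
x = np.arange(24).reshape(2, 3, 4)
shifted = roll_via_concat(x, shift=1, axis=2)
```

Whether the exported graph then runs correctly still has to be checked against the PyTorch output for the same input.
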
Error when running demo.py

Hi, I always get ImportError: cannot import name '_log_api_usage_once' from 'torchvision.utils' (/home/jinshan/anaconda3/envs/tf/lib/python3.9/site-packages/ when I run demo.py following the instructions. Could you help me with this problem? Thank you!

opened by Jinshan99 15
RecursionError: Caught RecursionError in DataLoader for training with custom dataset

Hi, thank you for sharing this great work. I encounter RecursionError: maximum recursion depth exceeded while calling a Python object when training with a custom dataset.

I tried raising the recursion limit with sys.setrecursionlimit(), but that did not help.

Thanks.

opened by erenuzyildirim 12
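
For reference, `sys.getrecursionlimit()` only reports the current limit, while `sys.setrecursionlimit()` is the call that changes it. The minimal sketch below shows the difference using only the standard library; the `depth` helper is hypothetical and exists only to exercise the limit. If raising the limit does not help, the recursion is probably unbounded (for example, a dataset method that keeps calling itself), which is a guess rather than a diagnosis of this issue.

```python
import sys


def depth(n):
    """Recurse n levels deep and return n (used only to exercise the limit)."""
    return 0 if n == 0 else 1 + depth(n - 1)


original = sys.getrecursionlimit()      # read-only query, changes nothing
sys.setrecursionlimit(original + 5000)  # this call actually raises the limit
raised = sys.getrecursionlimit()
reached = depth(original + 1000)        # deeper than the old limit allowed
sys.setrecursionlimit(original)         # restore the previous limit
```
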
Question: out of memory and mask prediction

Hi, I've been trying many of the variants here and have some questions. I am able to compile Mask2Former with Swin-Large as the backbone with 8GB RAM, but not SeMask-FPN with the Swin-Tiny backbone. The number of parameters is very different; is there an explanation?

Also, I was trying to visualise the mask output of the prediction, but I only get a black screen. The prediction output is a dict [sem_seg[X, img.width, img.length]]. Does this mean there are X masks? And is the visualised output something like a sum of all those masks?

opened by an99990 12
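
On the visualisation question: in Detectron2-style semantic segmentation outputs, `sem_seg` is normally a per-class score tensor of shape (C, H, W), so X would be the number of classes and the label map is the argmax over the first axis rather than a sum. Saving that label map directly tends to look black because class indices are small integers. The sketch below uses NumPy with a made-up three-class input and palette; `colorize_sem_seg` is a hypothetical helper, not a function from this repository.

```python
import numpy as np


def colorize_sem_seg(scores, palette):
    """Turn (C, H, W) per-class scores into an (H, W, 3) uint8 colour image."""
    labels = scores.argmax(axis=0)  # (H, W) map holding the winning class index
    return palette[labels]          # look up one RGB colour per pixel


# Toy input: 3 classes on a 2x2 image (values are invented for illustration).
scores = np.array([
    [[0.90, 0.10], [0.20, 0.10]],  # class 0
    [[0.05, 0.80], [0.10, 0.20]],  # class 1
    [[0.05, 0.10], [0.70, 0.70]],  # class 2
])
palette = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8)
colored = colorize_sem_seg(scores, palette)
```

With a PyTorch tensor, the label map would typically come from `predictions["sem_seg"].argmax(dim=0).cpu().numpy()` before the palette lookup.
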
Strange sem-seg results

Hi there! I am trying to test SeMask-FAPN by calling:

python demo.py --config-file ../configs/ade20k/semantic-segmentation/semask_swin/fapn_maskformer2_semask_swin_large_IN21k_384_bs16_160k_res640.yaml --input /workspace/SeMask-Segmentation/images/ADE_val_00001001.jpg --output /workspace/SeMask-Segmentation/output/ --opts MODEL.WEIGHTS /workspace/SeMask-Segmentation/checkpoints/semask_large_mask2former_fapn_ade20k.pth

I downloaded the pre-trained model from:

https://drive.google.com/file/d/1DQ9KltSLDj47H2jYnCtVwyBf7KPR9SM_/view

The result is here:

That doesn't make sense. Do you have any idea where I went wrong? Thank you!

opened by Shawn207 7
Some questions about training stage
Thank you very much for your pioneering work. I have a few questions about the training phase and hope to get your answers.

What models and how many graphics cards were used in training?

What is the training duration for the models of different capacity?

Because of the particular task I study, I need to set the batch size to 1, and I'm not sure whether the network can work in that case.

Thank you. I'm looking forward to your reply.
opened by LiuZhe6 7
SeMask-mask2former training error

I ran into the following problem and look forward to your help. I can run Mask2Former correctly, but SeMask-Mask2Former raises an error.

Run command: python3 train_net.py --num-gpus 2 --config-file configs/cityscapes/semantic-segmentation/semask_swin/maskformer2_semask_swin_large_IN21k_384_bs16_90k.yaml

Error:
  File "/SeMask-Segmentation/SeMask-Mask2Former/mask2former/modeling/criterion.py", line 329, in loss_cate
    gt_seg_targets = torch.cat([t["seg_maps"].unsqueeze(0) for t in targets], dim=0)
RuntimeError: Sizes of tensors must match except in dimension 2. Got 1024 and 910 (The offending index is 0)

opened by Sunting78 4
Version compatibility problem with DCN and Detectron2

Hi there! According to your setup instructions for SeMask-FaPN, DCNv2 has to be built with torch 1.7.1, and Detectron2 installed afterwards. However, the Detectron2 documentation requires torch >= 1.8. I upgraded torch to 1.8.0, but DCNv2 gives the following error:

(SeMask) xzhan2@cerlab27:~/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo$ python demo.py --config-file ../configs/ade20k/semantic-segmentation/semask_swin/fapn_maskformer2_semask_swin_large_IN21k_384_bs16_160k_res640.yaml --input ~/Pictures/test.jpeg
Traceback (most recent call last):
  File "demo.py", line 26, in <module>
    from mask2former import add_maskformer2_config
  File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo/../mask2former/__init__.py", line 3, in <module>
    from . import modeling
  File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo/../mask2former/modeling/__init__.py", line 5, in <module>
    from .pixel_decoder.fapn import PixelFANDecoder
  File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo/../mask2former/modeling/pixel_decoder/fapn.py", line 16, in <module>
    from dcn_v2 import DCN as dcn_v2
  File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/DCNv2/dcn_v2.py", line 17, in <module>
    import _ext as _backend
ImportError: /home/xzhan2/SeMask-Segmentation/SeMask-FAPN/DCNv2/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv

I also tried downgrading to torch 1.7 and installing Detectron2 with torch 1.7, but that fails. The version requirements seem to conflict. Could you please give me some suggestions? Thanks a lot!

opened by Shawn207 4
one working example

My output image is always the same as the input image. Is it possible to have one working example with the demo?

I tried these:

python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R101_bs16_50ep.yaml --input ../images/person_bike.jpg --opts MODEL.WEIGHTS ../R-101.pkl

python demo.py --config-file ../configs/ade20k/panoptic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs16_160k.yaml --input ../images/person_bike.jpg --opts MODEL.WEIGHTS ../semask_large_mask2former_ade20k\ $1$.pth

python demo.py --config-file ../configs/coco/instance-segmentation/maskformer2_R101_bs16_50ep.yaml --input ../images/person_bike.jpg --opts MODEL.WEIGHTS ../R-101.pkl

With the last command I got this:

thank you

opened by an99990 4
Fix for non-distributed training
I thought I'd document a fix for non-distributed training in case other people are trying to use this repo as well.

Even if distributed training isn't selected, you'll get a:

RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

during the forward call.

This happens in the decoder head due to the SyncBN layer. Find the config file under _base_ corresponding to your model, and change norm_cfg to use BN instead of SyncBN. For example, I'm using SeMask-FPN, so in configs/_base_/models/semfpn_semask_swin.py, I change:

norm_cfg = dict(type='SyncBN', requires_grad=True)

to

norm_cfg = dict(type='BN', requires_grad=True)

The layers seem to be totally compatible with one another, so you can load weights from models using SyncBN fine. Just a warning that you'll need to hardcode this if changing between regular and distributed training.
opened by GerardMaggiolino 3
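
To avoid hardcoding the change described above, the swap can also be applied programmatically to a nested config before the model is built. The sketch below is a minimal illustration on plain dictionaries; the helper `replace_norm_type` and the toy `model` layout are hypothetical and not part of this repository, so the traversal may need adapting to the real config object.

```python
import copy


def replace_norm_type(cfg, old="SyncBN", new="BN"):
    """Return a deep copy of a nested config with every `type == old` switched to `new`."""
    cfg = copy.deepcopy(cfg)  # leave the caller's config untouched

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == old:
                node["type"] = new
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Toy stand-in for a model config (names other than norm_cfg are made up).
model = dict(
    backbone=dict(type="ExampleBackbone"),
    decode_head=dict(
        type="ExampleHead",
        norm_cfg=dict(type="SyncBN", requires_grad=True),
    ),
)
single_gpu_model = replace_norm_type(model)
```

Calling the helper only for non-distributed runs keeps the distributed configuration on SyncBN.
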
