SeMask: Semantically Masked Transformers for Semantic Segmentation.

Overview

Framework: PyTorch

Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation.

Contents

  1. Results
  2. Setup Instructions
  3. Citing SeMask

1. Results

Note: † denotes backbones pretrained on ImageNet-22k at 384x384 resolution.

ADE20K

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SeMask-T FPN | SeMask Swin-T | 512x512 | 42.11 | 43.16 | 35M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 512x512 | 45.92 | 47.63 | 56M | config | TBD |
| SeMask-B FPN | SeMask Swin-B | 512x512 | 49.35 | 50.98 | 96M | config | TBD |
| SeMask-L FPN | SeMask Swin-L | 640x640 | 51.89 | 53.52 | 211M | config | TBD |
| SeMask-L MaskFormer | SeMask Swin-L | 640x640 | 54.75 | 56.15 | 219M | config | TBD |
| SeMask-L Mask2Former | SeMask Swin-L | 640x640 | 56.41 | 57.52 | 222M | config | TBD |
| SeMask-L Mask2Former FAPN | SeMask Swin-L | 640x640 | 56.68 | 58.00 | 227M | config | TBD |
| SeMask-L Mask2Former MSFAPN | SeMask Swin-L | 640x640 | 56.54 | 58.22 | 224M | config | TBD |

Cityscapes

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SeMask-T FPN | SeMask Swin-T | 768x768 | 74.92 | 76.56 | 34M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 768x768 | 77.13 | 79.14 | 56M | config | TBD |
| SeMask-B FPN | SeMask Swin-B | 768x768 | 77.70 | 79.73 | 96M | config | TBD |
| SeMask-L FPN | SeMask Swin-L | 768x768 | 78.53 | 80.39 | 211M | config | TBD |
| SeMask-L Mask2Former | SeMask Swin-L | 512x1024 | 83.97 | 84.98 | 222M | config | TBD |

COCO-Stuff 10k

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SeMask-T FPN | SeMask Swin-T | 512x512 | 37.53 | 38.88 | 35M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 512x512 | 40.72 | 42.27 | 56M | config | TBD |
| SeMask-B FPN | SeMask Swin-B | 512x512 | 44.63 | 46.30 | 96M | config | TBD |
| SeMask-L FPN | SeMask Swin-L | 640x640 | 47.47 | 48.54 | 211M | config | TBD |
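
The mIoU columns in the tables above report mean intersection-over-union. As a rough illustration of the metric (not the evaluation code used in this repo, which may differ in details such as how statistics are accumulated across images), here is a minimal sketch over flat lists of per-pixel class indices. The function name `mean_iou` and the `ignore_index=255` default are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of per-pixel class indices.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once for both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Classes that never appear are left out of the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    # Class 0: IoU 1/2, class 1: IoU 2/3.
    print(round(mean_iou(pred, target, num_classes=2), 4))
```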

2. Setup Instructions

We provide the codebase with SeMask incorporated into various models. Please check the setup instructions inside the corresponding folder for each model.

3. Citing SeMask

@article{jain2022semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv preprint arXiv:...},
  year={2022}
}

Acknowledgements

The code is based heavily on the following repositories: Swin-Transformer-Semantic-Segmentation, Mask2Former, MaskFormer, and FaPN-full.

Comments
  • Possibility to test on one single image

    Hi, I would like to test the instance segmentation model on a single image. Is that possible? In test.py I don't see any input argument for testing on our own images. Here is the command I ran:

    python tools/test.py configs/semask_swin/coco_stuff10k/semfpn_semask_swin_large_patch4_window12_640x640_80k_coco10k.py semask_large_fpn_coco10k.pth --eval mIoU --show-dir visuals

    Thank you

    opened by an99990 23
  • ONNX Model

    Hi, I wish to convert the model to ONNX. When I run the pytorch2onnx.py script from the tools directory, I get:

    RuntimeError: Exporting the operator roll to ONNX opset version 11 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

    Thanks.

    opened by romil611 18
  • Error when run the demo.py

    Hi, I always get ImportError: cannot import name '_log_api_usage_once' from 'torchvision.utils' (/home/jinshan/anaconda3/envs/tf/lib/python3.9/site-packages/ when I run demo.py following the instructions. Could you help me with this problem? Thank you!!

    opened by Jinshan99 15
  • RecursionError: Caught RecursionError in DataLoader for training with custom dataset

    Hi, thank you for sharing this great work. I encounter RecursionError: maximum recursion depth exceeded while calling a Python object when training with a custom dataset.

    I tried raising the recursion limit with sys.setrecursionlimit(), but it doesn't help.
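
    For reference, sys.getrecursionlimit() only reads the current limit, while sys.setrecursionlimit() is the call that changes it. A minimal sketch of the difference follows; the helper `depth` and the value 5000 are made up for this example and have nothing to do with the repo's data pipeline.

```python
import sys


def depth(n):
    """Recurse n levels deep and return n; stands in for any deeply nested call."""
    return 0 if n == 0 else 1 + depth(n - 1)


# getrecursionlimit() only reports the current limit; it does not change it.
old_limit = sys.getrecursionlimit()

# setrecursionlimit() is the call that actually raises the limit.
# 5000 is an arbitrary value chosen for this example.
sys.setrecursionlimit(max(old_limit, 5000))

# 3000 nested calls would raise RecursionError under a limit of 1000.
print(depth(3000))
```

    If the recursion is actually unbounded (for example, an object that ends up calling itself), raising the limit only delays the error rather than fixing it.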

    Thanks.

    opened by erenuzyildirim 12
  • question : out of memory and mask prediction

    Hi, I've been trying a lot of the variants here and I have some questions. I am able to compile Mask2Former with Swin-L as the backbone with 8GB RAM, but I am not able to compile SeMask-FPN with the Swin-T backbone. The number of parameters is really different; is there an explanation?

    Also, I was trying to visualise the mask output of the prediction, but I only get a black screen. The prediction output is a dict [sem_seg[X, img.width, img.length]]. Does this mean that there are X masks, and is the visualised output something like a sum of all those masks?
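
    One common way to turn such an output into a viewable image, assuming sem_seg holds one score map per class (shape [num_classes, height, width]), is to take the per-pixel argmax over the class axis and map each class index to a color. The sketch below uses plain nested lists so it runs without any dependencies; the function names and the two-color palette are hypothetical and not part of this repo.

```python
def scores_to_label_map(sem_seg):
    """Per-pixel argmax over the class axis of a [C][H][W] nested list."""
    num_classes = len(sem_seg)
    height, width = len(sem_seg[0]), len(sem_seg[0][0])
    return [
        [max(range(num_classes), key=lambda c: sem_seg[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


def colorize(label_map, palette):
    """Replace every class index with its RGB color from the palette."""
    return [[palette[label] for label in row] for row in label_map]


if __name__ == "__main__":
    # Two classes on a 2x2 image (toy scores).
    sem_seg = [
        [[0.9, 0.2], [0.1, 0.7]],  # scores for class 0
        [[0.1, 0.8], [0.9, 0.3]],  # scores for class 1
    ]
    palette = [(0, 0, 0), (255, 0, 0)]  # hypothetical colors per class
    labels = scores_to_label_map(sem_seg)
    print(labels)
    print(colorize(labels, palette))
```

    With a PyTorch tensor, the argmax step is typically written as sem_seg.argmax(dim=0).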

    opened by an99990 12
  • Strange sem-seg results

    Hi there! I am trying to test SeMask-FAPN by calling:

    python demo.py --config-file ../configs/ade20k/semantic-segmentation/semask_swin/fapn_maskformer2_semask_swin_large_IN21k_384_bs16_160k_res640.yaml --input /workspace/SeMask-Segmentation/images/ADE_val_00001001.jpg --output /workspace/SeMask-Segmentation/output/ --opts MODEL.WEIGHTS /workspace/SeMask-Segmentation/checkpoints/semask_large_mask2former_fapn_ade20k.pth

    I downloaded the pre-trained model from https://drive.google.com/file/d/1DQ9KltSLDj47H2jYnCtVwyBf7KPR9SM_/view. The result is shown in the attached image (adetest).

    That doesn't make sense. Do you have any idea about where I got things wrong? Thank you!

    opened by Shawn207 7
  • Some questions about training stage

    Thank you very much for your pioneering work. I have a few questions about the training phase and hope to get your answers.

    1. What are the models and numbers of graphics cards used in training?
    2. What is the training duration for different capacity models?
    3. Will the network still work if I set the batch size to 1? The task I study requires this.

    Thank you. I'm looking forward to your reply.

    opened by LiuZhe6 7
  • SeMask-mask2former training error

    I ran into the following problem and look forward to your help. I can run Mask2Former correctly, but SeMask-Mask2Former fails with an error.

    Run command:

    python3 train_net.py --num-gpus 2 --config-file configs/cityscapes/semantic-segmentation/semask_swin/maskformer2_semask_swin_large_IN21k_384_bs16_90k.yaml

    Error:

    File "/SeMask-Segmentation/SeMask-Mask2Former/mask2former/modeling/criterion.py", line 329, in loss_cate
      gt_seg_targets = torch.cat([t["seg_maps"].unsqueeze(0) for t in targets], dim=0)
    RuntimeError: Sizes of tensors must match except in dimension 2. Got 1024 and 910 (The offending index is 0)

    opened by Sunting78 4
  • Version compatibility problem with DCN and Detectron2

    Hi there! According to your setup instructions for SeMask-FaPN, DCNv2 has to be built with torch 1.7.1 and Detectron2 installed afterwards. However, the Detectron2 documentation requires torch >= 1.8. I upgraded torch to 1.8.0, but DCNv2 then gives the following error:

    (SeMask) xzhan2@cerlab27:~/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo$ python demo.py --config-file ../configs/ade20k/semantic-segmentation/semask_swin/fapn_maskformer2_semask_swin_large_IN21k_384_bs16_160k_res640.yaml --input ~/Pictures/test.jpeg
    Traceback (most recent call last):
      File "demo.py", line 26, in <module>
        from mask2former import add_maskformer2_config
      File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo/../mask2former/__init__.py", line 3, in <module>
        from . import modeling
      File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo/../mask2former/modeling/__init__.py", line 5, in <module>
        from .pixel_decoder.fapn import PixelFANDecoder
      File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/SeMask-Mask2Former/demo/../mask2former/modeling/pixel_decoder/fapn.py", line 16, in <module>
        from dcn_v2 import DCN as dcn_v2
      File "/home/xzhan2/SeMask-Segmentation/SeMask-FAPN/DCNv2/dcn_v2.py", line 17, in <module>
        import _ext as _backend
    ImportError: /home/xzhan2/SeMask-Segmentation/SeMask-FAPN/DCNv2/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv

    I also tried downgrading to torch 1.7 and installing Detectron2 with torch 1.7, but that fails. The version requirements seem to conflict. Could you please give me some suggestions? Thanks a lot!!

    opened by Shawn207 4
  • one working example

    My output image is always the same as the input image. Is it possible to have one working example with the demo?

    I tried these:

    python demo.py --config-file ../configs/coco/panoptic-segmentation/maskformer2_R101_bs16_50ep.yaml --input ../images/person_bike.jpg --opts MODEL.WEIGHTS ../R-101.pkl

    python demo.py --config-file ../configs/ade20k/panoptic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs16_160k.yaml --input ../images/person_bike.jpg --opts MODEL.WEIGHTS ../semask_large_mask2former_ade20k\ \(1\).pth

    python demo.py --config-file ../configs/coco/instance-segmentation/maskformer2_R101_bs16_50ep.yaml --input ../images/person_bike.jpg --opts MODEL.WEIGHTS ../R-101.pkl

    With the last command I just got the original image back.

    thank you

    opened by an99990 4
  • Fix for non-distributed training

    I thought I'd document a fix for non-distributed training in case other people are trying to use this repo as well.

    Even if distributed training isn't selected, you'll get a:

    RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
    

    during the forward call.

    This happens in the decoder head due to the SyncBN layer. Find the config file under _base_ corresponding to your model, and change the norm_cfg to use BN instead of SyncBN. For example, I'm using SeMask-FPN, so under configs/_base_/models/semfpn_semask_swin.py, I change:

    norm_cfg = dict(type='SyncBN', requires_grad=True)
    

    to

    norm_cfg = dict(type='BN', requires_grad=True)
    

    The layers seem to be totally compatible with one another, so you can load weights from models using SyncBN fine. Just a warning that you'll need to hardcode this if changing between regular and distributed training.
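
    As an alternative to hardcoding the change, the swap could be applied to a loaded config at runtime. The sketch below is only an illustration under assumptions: it works on a plain nested dict standing in for the config, and the helper name `replace_syncbn` and the example keys are hypothetical, not part of this repo or of MMSegmentation. It walks the whole structure because values such as norm_cfg are typically copied into several sub-dicts when the config file is evaluated, so changing one top-level variable afterwards would not reach them.

```python
import copy


def replace_syncbn(cfg):
    """Return a deep copy of cfg with every {'type': 'SyncBN'} changed to 'BN'."""
    cfg = copy.deepcopy(cfg)

    def _walk(node):
        if isinstance(node, dict):
            if node.get("type") == "SyncBN":
                node["type"] = "BN"
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


if __name__ == "__main__":
    # Hypothetical, heavily shortened model config for illustration only.
    model = dict(
        backbone=dict(type="ExampleBackbone"),
        decode_head=dict(
            type="ExampleHead",
            norm_cfg=dict(type="SyncBN", requires_grad=True),
        ),
    )
    single_gpu_model = replace_syncbn(model)
    print(single_gpu_model["decode_head"]["norm_cfg"])
    print(model["decode_head"]["norm_cfg"])  # the original dict is left unchanged
```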

    opened by GerardMaggiolino 3