Official implementation of "SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers"

Overview

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Figure 1: Performance of SegFormer-B0 to SegFormer-B5.

Project page | Paper | Demo (YouTube) | Demo (Bilibili)

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers.
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo.
Technical Report 2021.

This repository contains the PyTorch training/evaluation code and the pretrained models for SegFormer.

SegFormer is a simple, efficient and powerful semantic segmentation method, as shown in Figure 1.

We use MMSegmentation v0.13.0 as the codebase.

Installation

For installation and data preparation, please refer to the guidelines in MMSegmentation v0.13.0.

Other requirements: pip install timm==0.3.2
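
A typical environment setup might look like the sketch below; the exact PyTorch and mmcv-full versions are assumptions and should be checked against the MMSegmentation v0.13.0 installation guide:

# Suggested setup (versions are illustrative, not pinned by this repo)
conda create -n segformer python=3.8 -y
conda activate segformer
pip install torch torchvision        # pick builds matching your CUDA version
pip install mmcv-full                # choose a version compatible with MMSegmentation v0.13.0
pip install timm==0.3.2
pip install -e .                     # install this repo's mmseg package in editable mode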

Evaluation

Download trained weights.

Example: evaluate SegFormer-B1 on ADE20K:

# Single-gpu testing
python tools/test.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file

# Multi-gpu testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM>

# Multi-gpu, multi-scale testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM> --aug-test
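
For quick single-image inference outside the test scripts, the standard MMSegmentation v0.13.0 Python API can be used. This is a minimal sketch; the image path is a placeholder:

# Single-image inference sketch using the MMSegmentation Python API
from mmseg.apis import init_segmentor, inference_segmentor, show_result_pyplot

config_file = 'local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py'
checkpoint_file = '/path/to/checkpoint_file'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo.png')    # list with one H x W label map
show_result_pyplot(model, 'demo.png', result)      # overlay the prediction on the image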

Training

Download weights pretrained on ImageNet-1K, and put them in a folder pretrained/.

Example: train SegFormer-B1 on ADE20K:

# Single-gpu training
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py 

# Multi-gpu training
./tools/dist_train.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py <GPU_NUM>
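
tools/train.py also accepts the standard MMSegmentation options, for example --work-dir and --resume-from (these flags come from upstream MMSegmentation, not from SegFormer itself):

# Write outputs to a custom directory and resume from the latest checkpoint (optional)
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py --work-dir work_dirs/segformer.b1.ade --resume-from work_dirs/segformer.b1.ade/latest.pth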

License

Please check the LICENSE file. SegFormer may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{xie2021segformer,
  title={SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  journal={arXiv preprint arXiv:2105.15203},
  year={2021}
}
Comments
  • Mapillary Class Remapping

    Hello, I see that Mapillary uses a remapping to 19 classes,

    https://github.com/NVlabs/SegFormer/blob/3561d14362abe60675755ee00d266308e4e3015e/mmseg/datasets/pipelines/transforms.py#L1025

    Does this mean the experiments done in the paper use 19 classes for all methods on Mapillary?

    opened by serser 9
  • Simple SegFormer network class

    Hello, how are you? Thanks for contributing to this project. It is difficult for us to use this project because it contains many other scripts. Did you check https://github.com/lucidrains/segformer-pytorch, a third-party implementation of SegFormer? That project contains ONLY a simple SegFormer network class, so it is easy to use, but the number of parameters of its MiT-B0 network is 7M, while the paper reports 3.6M for MiT-B0. Could you briefly check https://github.com/lucidrains/segformer-pytorch? If that is difficult, could you provide a SegFormer network class like the one in that implementation? Thanks

    opened by rose-jinyang 6
  • train error mit_b1.pth is not a checkpoint file

    It's a great honor for me to study your research. When I download the pretrained model into the pretrained/ directory, training fails with the error in the title. I hope you can give me some advice. Thanks for your time and kindness.

    opened by peterlv666 5
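
    One way to narrow down an "is not a checkpoint file" error is to confirm that the downloaded file is an actual PyTorch checkpoint rather than an incomplete download or an HTML error page. A minimal check (the filename is assumed to match the issue title):

    # Sanity check: a valid pretrained weight file should load with torch.load
    import torch

    state = torch.load('pretrained/mit_b1.pth', map_location='cpu')
    print(type(state))
    print(list(state.keys())[:5] if isinstance(state, dict) else state)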
  • Pretraining segformer on ImageNet-22K

    The Swin Transformer released a large model pretrained on ImageNet-22K for semantic segmentation and achieved good results. I wonder if you are interested in improving SegFormer in a similar way? Thanks!

    opened by htzheng 4
  • Inference speed of the model

    Hello, how are you? Thanks for contributing to this project. Which device did you test your models on for the reported inference speed? The device specification is not stated in the paper.

    opened by rose-jinyang 4
  • Training details

    Hi, I'm trying to reproduce SegFormer on the PASCAL VOC dataset. Using the code of this repo, I can get ~77% mIoU (without multi-scale testing), but I only get ~75% mIoU with my reproduced code. Here are my training details.

    I have reproduced the training and validation data pipeline, including random scaling, random horizontal flipping, random cropping, etc. For the model, I used the code of this repo and the pretrained weights. I also used an AdamW optimizer with a warmup scheduler; the other optimizer settings are the same as in this repo.

    Therefore, I'm wondering whether there are any extra training details in SegFormer or in mmseg itself. I would appreciate your reply.

    opened by rulixiang 3
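
    For reference, the optimizer and learning-rate schedule in this repo's configs follow the pattern below. This is a sketch of the commonly used SegFormer ADE20K settings; the exact values should be verified against local_configs:

    # AdamW + poly schedule with linear warmup, as used by the SegFormer configs (verify locally)
    optimizer = dict(
        type='AdamW', lr=6e-5, betas=(0.9, 0.999), weight_decay=0.01,
        paramwise_cfg=dict(custom_keys={
            'pos_block': dict(decay_mult=0.0),
            'norm': dict(decay_mult=0.0),
            'head': dict(lr_mult=10.0)}))
    lr_config = dict(policy='poly', warmup='linear', warmup_iters=1500,
                     warmup_ratio=1e-6, power=1.0, min_lr=0.0, by_epoch=False)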
  • About the efficient attention module

    Hi,

    I would like to ask a question about the efficient attention module: I see that you use a reduction ratio R to decrease the spatial size of the input sequence. Normally this operation would produce an output sequence of spatial size N/R, but according to your Table 6 it doesn't; the output spatial size is still N. Where do you upsample the sequence from N/R back to N in the attention module after the reduced QKV multiplication?

    Thank you!

    opened by yihongXU 3
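
    On the question above: in the paper's efficient self-attention only the keys and values are reduced by the ratio R, while the queries keep length N, so softmax(QK^T / sqrt(d)) V already has N rows and no upsampling is needed. A shape-level PyTorch sketch (illustrative only, not the repo's exact module):

    # K and V are shortened to N/R; Q keeps length N, so the output also has length N.
    import torch

    B, N, C, R = 2, 1024, 64, 4
    x = torch.randn(B, N, C)

    q = torch.nn.Linear(C, C)(x)                                   # (B, N, C)
    x_r = torch.nn.Linear(R * C, C)(x.reshape(B, N // R, R * C))   # (B, N/R, C) sequence reduction
    k = torch.nn.Linear(C, C)(x_r)
    v = torch.nn.Linear(C, C)(x_r)

    attn = (q @ k.transpose(-2, -1)) / C ** 0.5                    # (B, N, N/R)
    out = attn.softmax(dim=-1) @ v                                 # (B, N, C) -- length is still N
    print(out.shape)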
  • Question on Mapillary pretrain when evaluating on cityscapes(val) dataset

    I met a problem when training on Mapillary and evaluating on Cityscapes: the class "wall" gets IoU = 0.0. Could you please provide the training and evaluation log for the Mapillary pretraining (preferably for model B2)? Thanks a lot!

    +---------------+-------+-------+
    |     Class     |  IoU  |  Acc  |
    +---------------+-------+-------+
    |      road     | 96.93 | 98.09 |
    |    sidewalk   | 76.57 | 90.46 |
    |    building   | 89.02 | 95.67 |
    |      wall     |  0.0  |  0.0  |
    |     fence     | 35.52 | 59.93 |
    |      pole     | 52.85 | 63.03 |
    | traffic light | 59.63 | 71.81 |
    |  traffic sign | 68.11 | 77.09 |
    |   vegetation  | 89.89 | 96.67 |
    |    terrain    |  26.0 |  26.5 |
    |      sky      | 90.97 | 93.58 |
    |     person    | 72.78 | 87.27 |
    |     rider     | 33.21 |  41.0 |
    |      car      | 91.25 | 97.25 |
    |     truck     |  61.8 | 64.37 |
    |      bus      | 66.93 | 71.56 |
    |     train     | 62.85 | 65.31 |
    |   motorcycle  | 47.68 | 65.62 |
    |    bicycle    | 67.57 | 74.03 |
    +---------------+-------+-------+
    2021-06-21 16:06:43,150 - mmseg - INFO - Summary:
    2021-06-21 16:06:43,150 - mmseg - INFO - 
    +-------+-------+-------+
    |  aAcc |  mIoU |  mAcc |
    +-------+-------+-------+
    | 93.66 | 62.61 | 70.49 |
    +-------+-------+-------+
    
    opened by littleSunlxy 3
  • KeyError: 'AlignedResize is not in the pipeline registry'

    Hi,

    I have a similar error to #2. I've just forked the repo to add a print statement, so fix #1 is included. When running python tools/test.py, I get the following:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
        return obj_cls(**args)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/test_time_aug.py", line 59, in __init__
        self.transforms = Compose(transforms)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/compose.py", line 22, in __init__
        transform = build_from_cfg(transform, PIPELINES)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 44, in build_from_cfg
        f'{obj_type} is not in the {registry.name} registry')
    KeyError: 'AlignedResize is not in the pipeline registry'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
        return obj_cls(**args)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/ade.py", line 91, in __init__
        **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/custom.py", line 88, in __init__
        self.pipeline = Compose(pipeline)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/compose.py", line 22, in __init__
        transform = build_from_cfg(transform, PIPELINES)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    KeyError: "MultiScaleFlipAug: 'AlignedResize is not in the pipeline registry'"
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "tools/test.py", line 170, in <module>
        main()
      File "tools/test.py", line 122, in main
        dataset = build_dataset(cfg.data.test)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/builder.py", line 73, in build_dataset
        dataset = build_from_cfg(cfg, DATASETS, default_args)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    KeyError: 'ADE20KDataset: "MultiScaleFlipAug: \'AlignedResize is not in the pipeline registry\'"'
    

    I've made a Google Colab to reproduce: https://colab.research.google.com/drive/1-t_lj5K2ZEFemxn88DSfcy9W7RTvklsz?usp=sharing

    opened by NielsRogge 3
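
    The traceback above shows pipeline classes being loaded from the pip-installed mmseg under dist-packages rather than the mmseg shipped in this repository, which is where AlignedResize is registered. A suggested (unofficial) fix is to make the repo's own package the one that gets imported:

    # Remove the PyPI package (if it was installed) and install this repo's mmseg instead
    pip uninstall -y mmsegmentation
    pip install -e .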
  • No module named 'mmseg'

    Hi, thanks for your great repo.

    I have no idea why, but it seems to import mmseg not from the local module but from an installed one, and it keeps showing me this error:

    No module named 'mmseg'

    When I run PyCharm in debugging mode, everything works fine; the error only appears when I run from the terminal or in run mode.

    Any help would be very appreciated.

    opened by ooodragon94 2
  • How to change checkpoint saving frequency

    Hi, first of all, thank you for your research and code.

    I see that during training the model is saved every 4000 iterations. Where can I change this setting so that my model is saved every, let's say, 1000 iterations?

    Thank you

    opened by gcilli 2
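
    The saving frequency comes from the checkpoint_config entry of the schedule config, which is standard MMSegmentation behavior. A sketch of the change (the exact config file to edit depends on your setup):

    # In the training config, lower the checkpoint interval (the 160k schedules default to 4000):
    checkpoint_config = dict(by_epoch=False, interval=1000)
    # Optionally adjust how often evaluation runs as well:
    evaluation = dict(interval=1000, metric='mIoU')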
  • How to convert the model to tensorrt or onnx?

    For a robotics deployment, we need an ONNX or OpenVINO version of SegFormer, but I found that SegFormer currently cannot be converted to those formats. Can anyone help us find the reason or share successful examples? Thank you!

    opened by yuchenlichuck 0
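
    If this fork still ships MMSegmentation's tools/pytorch2onnx.py, an export attempt could look like the command below. The flags are the upstream MMSegmentation ones and are assumptions here; custom ops in the model may still require extra work:

    # Hedged sketch: ONNX export via the upstream MMSegmentation tool, if present
    python tools/pytorch2onnx.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py --checkpoint /path/to/checkpoint_file --output-file segformer_b1.onnx --shape 512 512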
  • Dataset Creation

    Hi,

    I am working on a semantic segmentation task and am having trouble generating data in the required format. Can anyone recommend free tools that I can use to generate the data?

    opened by FatemaD1 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package: by using extract() or extractall() on a tarfile object without sanitizing the input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
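
    Not the exact patch from the pull request, but the general pattern it describes is to validate member paths before extraction, e.g.:

    # Refuse to extract tar members whose resolved paths escape the destination directory
    import os
    import tarfile

    def safe_extractall(tar: tarfile.TarFile, path: str = '.') -> None:
        dest = os.path.realpath(path)
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(dest, member.name))
            if target != dest and not target.startswith(dest + os.sep):
                raise Exception(f'Blocked path traversal in tar member: {member.name}')
        tar.extractall(path)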
  • What does "type = IMTRv21_5" mean?

    I find it difficult to understand these two files: #1 local_configs/base/models/segformer.py and #2 local_configs/base/models/segformer.py. I know that #1 is the base of #2, but in file 2 I find backbone = dict(type='mit_b0'), while file 1 has backbone = dict(type='IMTRv21_5').

    What does type = 'IMTRv21_5' mean? Please guide me!

    opened by Buling-Knight 0