Official implementation of "SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers"

Overview

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Figure 1: Performance of SegFormer-B0 to SegFormer-B5.

Project page | Paper | Demo (YouTube) | Demo (Bilibili)

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers.
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, and Ping Luo.
Technical Report 2021.

This repository contains the PyTorch training/evaluation code and the pretrained models for SegFormer.

SegFormer is a simple, efficient and powerful semantic segmentation method, as shown in Figure 1.

We use MMSegmentation v0.13.0 as the codebase.

Installation

For installation and data preparation, please refer to the guidelines in MMSegmentation v0.13.0.

Other requirements: pip install timm==0.3.2
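
A typical environment setup might look like the sketch below; the exact PyTorch and mmcv-full versions are assumptions and should be checked against the MMSegmentation v0.13.0 installation guide:

# Suggested setup (versions are illustrative, not pinned by this repo)
conda create -n segformer python=3.8 -y
conda activate segformer
pip install torch torchvision        # pick builds matching your CUDA version
pip install mmcv-full                # choose a version compatible with MMSegmentation v0.13.0
pip install timm==0.3.2
pip install -e .                     # install this repo's mmseg package in editable mode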

Evaluation

Download trained weights.

Example: evaluate SegFormer-B1 on ADE20K:

# Single-gpu testing
python tools/test.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file

# Multi-gpu testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM>

# Multi-gpu, multi-scale testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM> --aug-test
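
For quick single-image inference outside the test scripts, the standard MMSegmentation v0.13.0 Python API can be used. This is a minimal sketch; the image path is a placeholder:

# Single-image inference sketch using the MMSegmentation Python API
from mmseg.apis import init_segmentor, inference_segmentor, show_result_pyplot

config_file = 'local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py'
checkpoint_file = '/path/to/checkpoint_file'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo.png')    # list with one H x W label map
show_result_pyplot(model, 'demo.png', result)      # overlay the prediction on the image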

Training

Download weights pretrained on ImageNet-1K, and put them in a folder pretrained/.

Example: train SegFormer-B1 on ADE20K:

# Single-gpu training
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py 

# Multi-gpu training
./tools/dist_train.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py <GPU_NUM>
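
tools/train.py also accepts the standard MMSegmentation options, for example --work-dir and --resume-from (these flags come from upstream MMSegmentation, not from SegFormer itself):

# Write outputs to a custom directory and resume from the latest checkpoint (optional)
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py --work-dir work_dirs/segformer.b1.ade --resume-from work_dirs/segformer.b1.ade/latest.pth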

License

Please check the LICENSE file. SegFormer may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{xie2021segformer,
  title={SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers},
  author={Xie, Enze and Wang, Wenhai and Yu, Zhiding and Anandkumar, Anima and Alvarez, Jose M and Luo, Ping},
  journal={arXiv preprint arXiv:2105.15203},
  year={2021}
}
Comments
  • Mapillary Class Remapping

    Hello, I see that Mapillary uses a remapping to 19 classes,

    https://github.com/NVlabs/SegFormer/blob/3561d14362abe60675755ee00d266308e4e3015e/mmseg/datasets/pipelines/transforms.py#L1025

    Does this mean the experiments done in the paper use 19 classes for all methods on Mapillary?

    opened by serser 9
  • Simple SegFormer network class

    Hello, how are you? Thanks for contributing to this project. It is difficult for us to use this project because it contains many other scripts. Did you check https://github.com/lucidrains/segformer-pytorch, a third-party implementation of SegFormer? That project contains ONLY a simple SegFormer network class, so it is easy to use, but the number of parameters of its MiT-B0 network is 7M, while the paper reports 3.6M for MiT-B0. Could you briefly check https://github.com/lucidrains/segformer-pytorch? If that is difficult, could you provide a SegFormer network class like the one in that implementation? Thanks

    opened by rose-jinyang 6
  • train error mit_b1.pth is not a checkpoint file

    It's a great honor for me to study your research. When I download the pretrained model into the pretrained/ directory, training fails with the error in the title. I hope you can give me some advice. Thanks for your time and kindness.

    opened by peterlv666 5
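
    One way to narrow down an "is not a checkpoint file" error is to confirm that the downloaded file is an actual PyTorch checkpoint rather than an incomplete download or an HTML error page. A minimal check (the filename is assumed to match the issue title):

    # Sanity check: a valid pretrained weight file should load with torch.load
    import torch

    state = torch.load('pretrained/mit_b1.pth', map_location='cpu')
    print(type(state))
    print(list(state.keys())[:5] if isinstance(state, dict) else state)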
  • Pretraining segformer on ImageNet-22K

    The Swin Transformer released a large model pretrained on ImageNet-22K for semantic segmentation and achieved good results. I wonder if you are interested in improving SegFormer in a similar way? Thanks!

    opened by htzheng 4
  • Inference speed of the model

    Hello, how are you? Thanks for contributing to this project. Which device did you test your models on for the reported inference speed? The device specification is not stated in the paper.

    opened by rose-jinyang 4
  • Training details

    Hi, I'm trying to reproduce SegFormer on the PASCAL VOC dataset. Using the code of this repo, I can get ~77% mIoU (without multi-scale testing), but I only get ~75% mIoU with my reproduced code. Here are my training details.

    I have reproduced the training and validation data pipeline, including random scaling, random horizontal flipping, random cropping, etc. For the model, I used the code of this repo and the pretrained weights. I also used an AdamW optimizer with a warmup scheduler; the other optimizer settings are the same as in this repo.

    Therefore, I'm wondering whether there are any extra training details in SegFormer or in mmseg itself. I would appreciate your reply.

    opened by rulixiang 3
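
    For reference, the optimizer and learning-rate schedule in this repo's configs follow the pattern below. This is a sketch of the commonly used SegFormer ADE20K settings; the exact values should be verified against local_configs:

    # AdamW + poly schedule with linear warmup, as used by the SegFormer configs (verify locally)
    optimizer = dict(
        type='AdamW', lr=6e-5, betas=(0.9, 0.999), weight_decay=0.01,
        paramwise_cfg=dict(custom_keys={
            'pos_block': dict(decay_mult=0.0),
            'norm': dict(decay_mult=0.0),
            'head': dict(lr_mult=10.0)}))
    lr_config = dict(policy='poly', warmup='linear', warmup_iters=1500,
                     warmup_ratio=1e-6, power=1.0, min_lr=0.0, by_epoch=False)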
  • About the efficient attention module

    Hi,

    I would like to ask a question about the efficient attention module: I see that you use a reduction ratio R to decrease the spatial size of the input sequence. Normally this operation would produce an output sequence of spatial size N/R, but according to your Table 6 it doesn't; the output spatial size is still N. Where do you upsample the sequence from N/R back to N in the attention module after the reduced QKV multiplication?

    Thank you!

    opened by yihongXU 3
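
    On the question above: in the paper's efficient self-attention only the keys and values are reduced by the ratio R, while the queries keep length N, so softmax(QK^T / sqrt(d)) V already has N rows and no upsampling is needed. A shape-level PyTorch sketch (illustrative only, not the repo's exact module):

    # K and V are shortened to N/R; Q keeps length N, so the output also has length N.
    import torch

    B, N, C, R = 2, 1024, 64, 4
    x = torch.randn(B, N, C)

    q = torch.nn.Linear(C, C)(x)                                   # (B, N, C)
    x_r = torch.nn.Linear(R * C, C)(x.reshape(B, N // R, R * C))   # (B, N/R, C) sequence reduction
    k = torch.nn.Linear(C, C)(x_r)
    v = torch.nn.Linear(C, C)(x_r)

    attn = (q @ k.transpose(-2, -1)) / C ** 0.5                    # (B, N, N/R)
    out = attn.softmax(dim=-1) @ v                                 # (B, N, C) -- length is still N
    print(out.shape)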
  • Question on Mapillary pretrain when evaluating on cityscapes(val) dataset

    I met a problem when training on Mapillary and evaluating on Cityscapes: the class "wall" gets IoU = 0.0. Could you please provide the training and evaluation log for the Mapillary pretraining (preferably for model B2)? Thanks a lot!

    +---------------+-------+-------+
    |     Class     |  IoU  |  Acc  |
    +---------------+-------+-------+
    |      road     | 96.93 | 98.09 |
    |    sidewalk   | 76.57 | 90.46 |
    |    building   | 89.02 | 95.67 |
    |      wall     |  0.0  |  0.0  |
    |     fence     | 35.52 | 59.93 |
    |      pole     | 52.85 | 63.03 |
    | traffic light | 59.63 | 71.81 |
    |  traffic sign | 68.11 | 77.09 |
    |   vegetation  | 89.89 | 96.67 |
    |    terrain    |  26.0 |  26.5 |
    |      sky      | 90.97 | 93.58 |
    |     person    | 72.78 | 87.27 |
    |     rider     | 33.21 |  41.0 |
    |      car      | 91.25 | 97.25 |
    |     truck     |  61.8 | 64.37 |
    |      bus      | 66.93 | 71.56 |
    |     train     | 62.85 | 65.31 |
    |   motorcycle  | 47.68 | 65.62 |
    |    bicycle    | 67.57 | 74.03 |
    +---------------+-------+-------+
    2021-06-21 16:06:43,150 - mmseg - INFO - Summary:
    2021-06-21 16:06:43,150 - mmseg - INFO - 
    +-------+-------+-------+
    |  aAcc |  mIoU |  mAcc |
    +-------+-------+-------+
    | 93.66 | 62.61 | 70.49 |
    +-------+-------+-------+
    
    opened by littleSunlxy 3
  • KeyError: 'AlignedResize is not in the pipeline registry'

    Hi,

    I have a similar error to #2. I've just forked the repo to add a print statement, so fix #1 is included. When running python tools/test.py, I get the following:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
        return obj_cls(**args)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/test_time_aug.py", line 59, in __init__
        self.transforms = Compose(transforms)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/compose.py", line 22, in __init__
        transform = build_from_cfg(transform, PIPELINES)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 44, in build_from_cfg
        f'{obj_type} is not in the {registry.name} registry')
    KeyError: 'AlignedResize is not in the pipeline registry'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
        return obj_cls(**args)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/ade.py", line 91, in __init__
        **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/custom.py", line 88, in __init__
        self.pipeline = Compose(pipeline)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/pipelines/compose.py", line 22, in __init__
        transform = build_from_cfg(transform, PIPELINES)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    KeyError: "MultiScaleFlipAug: 'AlignedResize is not in the pipeline registry'"
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "tools/test.py", line 170, in <module>
        main()
      File "tools/test.py", line 122, in main
        dataset = build_dataset(cfg.data.test)
      File "/usr/local/lib/python3.7/dist-packages/mmseg/datasets/builder.py", line 73, in build_dataset
        dataset = build_from_cfg(cfg, DATASETS, default_args)
      File "/usr/local/lib/python3.7/dist-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    KeyError: 'ADE20KDataset: "MultiScaleFlipAug: \'AlignedResize is not in the pipeline registry\'"'
    

    I've made a Google Colab to reproduce: https://colab.research.google.com/drive/1-t_lj5K2ZEFemxn88DSfcy9W7RTvklsz?usp=sharing

    opened by NielsRogge 3
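
    The traceback above shows pipeline classes being loaded from the pip-installed mmseg under dist-packages rather than the mmseg shipped in this repository, which is where AlignedResize is registered. A suggested (unofficial) fix is to make the repo's own package the one that gets imported:

    # Remove the PyPI package (if it was installed) and install this repo's mmseg instead
    pip uninstall -y mmsegmentation
    pip install -e .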
  • No module named 'mmseg'

    Hi, thanks for your great repo.

    I have no idea why, but it seems to import mmseg not from the local module but from an installed one, and it keeps showing me this error:

    No module named 'mmseg'

    When I run PyCharm in debugging mode, everything works fine; the error only appears when I run from the terminal or in run mode.

    Any help would be very appreciated.

    opened by ooodragon94 2
  • How to change checkpoint saving frequency

    Hi, first of all, thank you for your research and code.

    I see that during training the model is saved every 4000 iterations. Where can I change this setting so that my model is saved every, let's say, 1000 iterations?

    Thank you

    opened by gcilli 2
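
    The saving frequency comes from the checkpoint_config entry of the schedule config, which is standard MMSegmentation behavior. A sketch of the change (the exact config file to edit depends on your setup):

    # In the training config, lower the checkpoint interval (the 160k schedules default to 4000):
    checkpoint_config = dict(by_epoch=False, interval=1000)
    # Optionally adjust how often evaluation runs as well:
    evaluation = dict(interval=1000, metric='mIoU')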
  • How to convert the model to tensorrt or onnx?

    For a robotics deployment, we need an ONNX or OpenVINO version of SegFormer, but I found that SegFormer currently cannot be converted to those formats. Can anyone help us find the reason or share successful examples? Thank you!

    opened by yuchenlichuck 0
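
    If this fork still ships MMSegmentation's tools/pytorch2onnx.py, an export attempt could look like the command below. The flags are the upstream MMSegmentation ones and are assumptions here; custom ops in the model may still require extra work:

    # Hedged sketch: ONNX export via the upstream MMSegmentation tool, if present
    python tools/pytorch2onnx.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py --checkpoint /path/to/checkpoint_file --output-file segformer_b1.onnx --shape 512 512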
  • Dataset Creation

    Hi,

    I am working on a semantic segmentation task and am having trouble generating data in the required format. Can anyone recommend free tools that I can use to generate the data?

    opened by FatemaD1 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package: by using extract() or extractall() on a tarfile object without sanitizing the input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
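
    Not the exact patch from the pull request, but the general pattern it describes is to validate member paths before extraction, e.g.:

    # Refuse to extract tar members whose resolved paths escape the destination directory
    import os
    import tarfile

    def safe_extractall(tar: tarfile.TarFile, path: str = '.') -> None:
        dest = os.path.realpath(path)
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(dest, member.name))
            if target != dest and not target.startswith(dest + os.sep):
                raise Exception(f'Blocked path traversal in tar member: {member.name}')
        tar.extractall(path)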
  • What does "type = IMTRv21_5" mean?

    I find it difficult to understand these two files: #1 local_configs/base/models/segformer.py and #2 local_configs/base/models/segformer.py. I know that #1 is the base of #2, but in file 2 I find backbone = dict(type='mit_b0'), while file 1 has backbone = dict(type='IMTRv21_5').

    What does type = 'IMTRv21_5' mean? Please guide me!

    opened by Buling-Knight 0