Reading list for research topics in Masked Image Modeling

Overview

awesome-MIM

Reading list for research topics in Masked Image Modeling (MIM).

We list the most popular MIM methods; if we missed something, please submit a request. (Note: the dates shown are those of the first arXiv version, but the paper links may point to a later revision.)

Self-supervised Vision Transformers as backbone models.

| Date | Method | Conference | Title | Code |
|------------|-----------|------------------|----------------------------------------------------------------------|--------|
| 2021-06-14 | BEiT | ICLR 2022 (Oral) | BEiT: BERT Pre-Training of Image Transformers | BEiT |
| 2021-11-11 | MAE | arXiv 2021 | Masked Autoencoders Are Scalable Vision Learners | MAE |
| 2021-11-15 | iBOT | arXiv 2021 | iBOT: Image BERT Pre-Training with Online Tokenizer | iBOT |
| 2021-11-18 | SimMIM | arXiv 2021 | SimMIM: A Simple Framework for Masked Image Modeling | SimMIM |
| 2021-12-16 | MaskFeat | arXiv 2021 | Masked Feature Prediction for Self-Supervised Visual Pre-Training | None |
| 2021-12-20 | SplitMask | arXiv 2021 | Are Large-scale Datasets Necessary for Self-Supervised Pre-training? | None |
| 2022-01-31 | ADIOS | arXiv 2022 | Adversarial Masking for Self-Supervised Learning | None |
| 2022-02-07 | CAE | arXiv 2022 | Context Autoencoder for Self-Supervised Representation Learning | None |
| 2022-02-07 | CIM | arXiv 2022 | Corrupted Image Modeling for Self-Supervised Visual Pre-Training | None |
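
The methods above share one core recipe: split an image into patches, hide a random subset, and train a model to predict the hidden content (raw pixels for MAE and SimMIM, HOG features for MaskFeat, tokenizer codes for BEiT and iBOT). As a rough sketch of that shared masking step, here is a minimal PyTorch example; the function name `random_masking`, the shapes, and the 75% ratio are illustrative assumptions and do not reproduce any specific paper's code.

```python
import torch


def random_masking(patches, mask_ratio=0.75):
    """Keep a random subset of patch embeddings; return the visible patches
    and a binary mask marking which positions were hidden.

    patches: (batch, num_patches, dim)
    """
    b, n, d = patches.shape
    num_keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n, device=patches.device)   # one random score per patch
    ids_shuffle = noise.argsort(dim=1)                 # random permutation of patch indices
    ids_keep = ids_shuffle[:, :num_keep]               # indices of the patches left visible
    visible = torch.gather(
        patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, d)
    )
    mask = torch.ones(b, n, device=patches.device)     # 1 = masked, 0 = visible
    mask.scatter_(1, ids_keep, 0)
    return visible, mask


# Toy usage: 196 patches (a 14x14 grid) of dimension 768, as in ViT-B/16 at 224x224.
patches = torch.randn(2, 196, 768)
visible, mask = random_masking(patches, mask_ratio=0.75)
print(visible.shape)    # torch.Size([2, 49, 768])
print(mask.sum(dim=1))  # tensor([147., 147.]) -- masked patches per image
```

An encoder would then process only the visible patches (or the full sequence with mask tokens, depending on the method), and a reconstruction loss would be computed on the masked positions only.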

Comments
  • Can you add ConvMAE?

    ConvMAE: Masked Convolution Meets Masked Autoencoders

    https://arxiv.org/abs/2205.03892

    https://github.com/alpha-vl/convmae

    Please consider adding ConvMAE to your repo.

    opened by gaopengpjlab 1
  • Please consider adding CIM

    Hello, thanks for the awesome repo. Please consider adding a related work: Corrupted Image Modeling for Self-Supervised Visual Pre-Training (https://arxiv.org/pdf/2202.03382.pdf)

    opened by Yuxin-CV 0
  • Please consider adding CIM

    Hello @ucasligang, thanks for the awesome repo.

    Please consider adding a related work: Corrupted Image Modeling for Self-Supervised Visual Pre-Training (arXiv: https://arxiv.org/abs/2202.03382)

    Thanks!

    opened by Yuxin-CV 0
Owner
ligang
I am a Ph.D. student at the University of Chinese Academy of Sciences (UCAS), studying computer vision, especially self-supervised learning.