Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

Overview

MLP-Mixer: An all-MLP Architecture for Vision

This repo contains a PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision.

Usage :

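import importlib

import numpy as np
import torch

# A hyphenated module name such as "mlp-mixer" cannot be used in a plain
# `import` statement, so the module is loaded by name instead. This assumes
# the file mlp-mixer.py is on the import path.
MLPMixer = importlib.import_module("mlp-mixer").MLPMixer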

img = torch.ones([1, 3, 224, 224])

model = MLPMixer(in_channels=3, image_size=224, patch_size=16, num_classes=1000,
                 dim=512, depth=8, token_dim=256, channel_dim=2048)

parameters = filter(lambda p: p.requires_grad, model.parameters())
parameters = sum([np.prod(p.size()) for p in parameters]) / 1_000_000
print('Trainable Parameters: %.3fM' % parameters)

out_img = model(img)

print("Shape of out :", out_img.shape)  # [B, num_classes]
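
As a rough sanity check on the configuration above, the sketch below works out how many patch tokens the model would see. It assumes the patching scheme described in the paper (non-overlapping square patches); `patch_grid` is a hypothetical helper written for illustration and is not part of this repo.

```python
def patch_grid(image_size: int, patch_size: int, in_channels: int = 3):
    """Return (num_patches, values_per_patch) for non-overlapping square patches.

    Hypothetical helper for illustration; not part of this repository.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size          # patches along one edge
    num_patches = per_side * per_side            # tokens fed to the mixer
    values_per_patch = in_channels * patch_size * patch_size  # raw pixels per token
    return num_patches, values_per_patch


num_patches, values_per_patch = patch_grid(image_size=224, patch_size=16)
print(num_patches, values_per_patch)  # 196 768
```

With these settings each of the 196 patches holds 768 raw values before being projected to `dim=512` channels. If the parameter names follow the paper, `token_dim` is the hidden width of the MLP that mixes across patches and `channel_dim` is the hidden width of the MLP that mixes across channels.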

Citation :

@misc{tolstikhin2021mlpmixer,
      title={MLP-Mixer: An all-MLP Architecture for Vision}, 
      author={Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
      year={2021},
      eprint={2105.01601},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Some components are borrowed from the ViT code in @lucidrains's repo: https://github.com/lucidrains/vit-pytorch

Comments

CIFAR training example

Hello, thanks for this project! I'm trying to add training code to mlp-mixer using the CIFAR dataset. I have added a transform to adapt the images:

import os

import torch
import torchvision
import torchvision.transforms as T

# Resize to 256, center-crop to 224, then normalize with ImageNet statistics
transform256 = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )
])

# training set
training_folder = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'training')

trainset = torchvision.datasets.CIFAR10(root=training_folder, train=True,
                                        download=True, transform=transform256)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)
# test set (uses the same transform as the training set)
testset = torchvision.datasets.CIFAR10(root=training_folder, train=False,
                                       download=True, transform=transform256)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)
# cifar classes
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# MLP Mixer (MLPMixer is imported as in the Usage section)
mixer_model = MLPMixer(in_channels=3,
                       image_size=224,
                       patch_size=16,
                       num_classes=1000,
                       dim=512,
                       depth=8,
                       token_dim=256,
                       channel_dim=2048)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
mixer_model.to(device)

and, to try passing images through the model:

# Run two batches through the model and print the shapes
for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    inputs, labels = inputs.to(device), labels.to(device)
    print("Input:", inputs.shape)
    outputs = mixer_model(inputs)
    print("MLPMixer out:", outputs.shape)
    if i == 1:
        break

and I get

Input: torch.Size([4, 3, 224, 224])
MLPMixer out: torch.Size([4, 1000])

This is only naive training code, and I'm not sure whether resizing the input images this way is correct for the model. Thank you.

opened by loretoparisi 2
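
One way to check the pipeline end to end is a single optimization step. The sketch below is only an illustration under stated assumptions: a small linear classifier stands in for `MLPMixer` so the block runs on its own, random tensors stand in for a CIFAR-10 batch, and the optimizer and learning rate are arbitrary choices. CIFAR-10 images are 32x32 and have 10 classes, so `Resize(256)` followed by `CenterCrop(224)` upsamples them to the 224x224 input the model above was built for, and a 10-way output is the natural fit for the labels.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 10  # CIFAR-10 has ten classes

# Stand-in classifier (assumption): not the repo's model, used so this runs alone.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # arbitrary choice

# Random tensors standing in for one batch from a DataLoader.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, num_classes, (4,))

model.train()
optimizer.zero_grad()
logits = model(inputs)            # shape [batch, num_classes]
loss = criterion(logits, labels)  # takes raw logits and integer class labels
loss.backward()
optimizer.step()

print(logits.shape, loss.item())
```

To train the real model, the stand-in would be replaced by an `MLPMixer` instance built with `num_classes=10`, and the random tensors by batches from `trainloader`.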

Owner

Rishikesh (ऋषिकेश)
