Implementation of the paper MLP-Mixer: An all-MLP Architecture for Vision

Overview

MLP Mixer

Implementation of the paper MLP-Mixer: An all-MLP Architecture for Vision. Give us a star if you like this repo.

Author:

This library belongs to our project Papers-Videos-Code, where we implement SOTA AI papers and publish all source code. Additionally, videos explaining these models will be uploaded to the ProtonX YouTube channel.


[Note] You can use your own data to train this model.

I. Set up environment

  1. Make sure you have installed Miniconda. If not, see the setup document here.

  2. cd into mlp-mixer and run conda env create -f environment.yml to set up the environment.

  3. Activate the environment with conda activate mlp-mixer.

II. Set up your dataset

Create two folders, train and validation, inside the data folder (which already exists). Then copy your images, with the corresponding class names, into these folders.

  • The train folder is used for the training process
  • The validation folder is used to validate the training result after each epoch

This library uses the image_dataset_from_directory API from TensorFlow 2 to load images. Make sure you have some understanding of how it works via its documentation; a minimal loading sketch is shown after the folder layout below.

Structure of these folders:

train/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
...class_c/
......c_image_1.jpg
......c_image_2.jpg
validation/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
...class_c/
......c_image_1.jpg
......c_image_2.jpg
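
For reference, loading this layout with image_dataset_from_directory looks roughly like the sketch below. The image size and batch size here are illustrative assumptions; train.py's actual defaults may differ.

```python
import tensorflow as tf

# Assumed image size and batch size for illustration only;
# train.py's actual defaults may differ.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "data/train",
    image_size=(150, 150),
    batch_size=32,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "data/validation",
    image_size=(150, 150),
    batch_size=32,
)

# Class indices follow the alphabetical order of the subfolder names.
print(train_ds.class_names)  # e.g. ['class_a', 'class_b', 'class_c']
```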

III. Train your model by running this command line

python train.py --epochs ${epochs} --num-classes ${num_classes}

Suppose you want to train a model for 10 epochs on a binary classification problem (2 classes).

Example:

python train.py --epochs 10 --num-classes 2

There are some important arguments for the script you should consider when running it:

  • train-folder: The folder of training images
  • valid-folder: The folder of validation images
  • model-folder: Where the trained model is saved
  • num-classes: The number of classes in your problem
  • batch-size: The batch size of the dataset
  • c: Patch projection dimension
  • dc: Token-mixing MLP hidden units, as described on page 3 of the paper
  • ds: Channel-mixing MLP hidden units, as described on page 3 of the paper
  • num-of-mlp-blocks: The number of MLP blocks
  • learning-rate: The learning rate of the Adam optimizer
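
Combining these flags, a fuller training command might look like the following. The flag names are assumed to match the list above, and the values are only illustrative:

python train.py --train-folder data/train --valid-folder data/validation --model-folder model --num-classes 2 --batch-size 32 --c 512 --dc 256 --ds 2048 --num-of-mlp-blocks 8 --learning-rate 0.0001 --epochs 10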

After training, your model will be saved to the model-folder defined above.
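
For reference, the architecture hyperparameters above map onto a single Mixer block roughly as follows. This is a minimal Keras sketch following the README's naming (dc for token-mixing hidden units, ds for channel-mixing hidden units), not the repository's exact code:

```python
import tensorflow as tf
from tensorflow.keras import layers

def mixer_block(x, num_patches, c, dc, ds):
    """One Mixer block on an input of shape (batch, num_patches, c)."""
    # Token-mixing MLP: mixes information across patches.
    y = layers.LayerNormalization()(x)
    y = layers.Permute((2, 1))(y)                # (batch, c, num_patches)
    y = layers.Dense(dc, activation="gelu")(y)   # --dc: token-mixing hidden units
    y = layers.Dense(num_patches)(y)
    y = layers.Permute((2, 1))(y)                # back to (batch, num_patches, c)
    x = layers.Add()([x, y])                     # skip connection

    # Channel-mixing MLP: mixes information across channels within each patch.
    y = layers.LayerNormalization()(x)
    y = layers.Dense(ds, activation="gelu")(y)   # --ds: channel-mixing hidden units
    y = layers.Dense(c)(y)
    return layers.Add()([x, y])                  # skip connection
```

num-of-mlp-blocks controls how many such blocks are stacked, and c is the patch projection dimension produced by the initial per-patch linear embedding.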

IV. Testing the model with a new image

We offer a script for testing the model on a new image via the command line:

python predict.py --test-file-path ${test_file_path}

where test_file_path is the path of your test image.

Example:

python predict.py --test-file-path ./data/test/cat.2000.jpg
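
Under the hood, the prediction step is roughly equivalent to the sketch below; the exact preprocessing, image size, and model path used by predict.py may differ:

```python
import numpy as np
import tensorflow as tf

# Illustrative only: the model path, image size, and scaling are assumptions.
model = tf.keras.models.load_model("model")  # the folder passed as --model-folder
img = tf.keras.preprocessing.image.load_img(
    "./data/test/cat.2000.jpg", target_size=(150, 150)
)
x = tf.keras.preprocessing.image.img_to_array(img)[None, ...] / 255.0  # add batch dim
probs = model.predict(x)

print("Output Softmax:", probs)
print("This image belongs to class:", int(np.argmax(probs, axis=-1)[0]))
```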

V. Feedback

If you encounter any issues while using this library, please let us know via the Issues tab.

