Implementation of ConvMixer in TensorFlow and Keras

Overview

ConvMixer

ConvMixer, an extremely simple model that is similar in spirit to the ViT and the even-more-basic MLP-Mixer in that it operates directly on patches as input, separates the mixing of spatial and channel dimensions, and maintains equal size and resolution throughout the network. In contrast, however, the ConvMixer uses only standard convolutions to achieve the mixing steps. Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.

Official GitHub Link: https://github.com/tmp-iclr/convmixer

Paper Link: https://openreview.net/pdf?id=TVHS5Y4dNvM

Note: Paper is under review for ICLR 2022

Open In Colab

Model Architechture

Installation

pip install -q tensorflow-addons

Note: We are using TensorFlow-Addons for using the AdamW optimizer and GeLU activation function.

Results

Unknown-2 Unknown

TensorBoard Link: https://tensorboard.dev/experiment/bkhqOz0RQ1Cv5dwrDQySMQ/

Note: Trained 25 Epochs and got a top-5-accuracy of 64.41%

Future Work

  • To train on 150 epochs
  • To train model on ImageNet dataset

Citation

@inproceedings{
anonymous2022patches,
title={Patches Are All You Need?},
author={Anonymous},
booktitle={Submitted to The Tenth International Conference on Learning Representations },
year={2022},
url={https://openreview.net/forum?id=TVHS5Y4dNvM},
note={under review}
}

License

MIT License

Copyright (c) 2021 Sayan Nath

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
You might also like...
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Face Mask Detection on Image and Video using tensorflow and keras
Face Mask Detection on Image and Video using tensorflow and keras

Face-Mask-Detection Face Mask Detection on Image and Video using tensorflow and keras Train Neural Network on face-mask dataset using tensorflow and k

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Mask R-CNN for Object Detection and Segmentation This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bound

A deep learning network built with TensorFlow and Keras to classify gender and estimate age.
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

A deep learning network built with TensorFlow and Keras to classify gender and estimate age.
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

A python-image-classification web application project, written in Python and served through the Flask Microframework. This Project implements the VGG16 covolutional neural network, through Keras and Tensorflow wrappers, to make predictions on uploaded images.
Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

Hyperparameter Optimization for TensorFlow, Keras and PyTorch
Hyperparameter Optimization for TensorFlow, Keras and PyTorch

Hyperparameter Optimization for Keras Talos • Key Features • Examples • Install • Support • Docs • Issues • License • Download Talos radically changes

Deep GPs built on top of TensorFlow/Keras and GPflow

GPflux Documentation | Tutorials | API reference | Slack What does GPflux do? GPflux is a toolbox dedicated to Deep Gaussian processes (DGP), the hier

Releases(0.0.1)
  • 0.0.1(Oct 15, 2021)

    ConvMixer, an extremely simple model that is similar in spirit to the ViT and the even-more-basic MLP-Mixer in that it operates directly on patches as input, separates the mixing of spatial and channel dimensions, and maintains equal size and resolution throughout the network. In contrast, however, the ConvMixer uses only standard convolutions to achieve the mixing steps. Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.

    Open In Colab

    View the TensorBoard here.

    Note: Trained on 25 Epochs.

    Source code(tar.gz)
    Source code(zip)
    convmixer-model.h5(6.94 MB)
    convmixer.zip(6.21 MB)
    train-logs.csv(2.94 KB)
Owner
Sayan Nath
GSoC'21 @tensorflow | Calling `model.fit()` @Codebugged-Research | Lead @DSC-KIIT
Sayan Nath
ConvMixer unofficial implementation

ConvMixer ConvMixer 非官方实现 pytorch 版本已经实现。 nets 是重构版本 ,test 是官方代码 感兴趣小伙伴可以对照看一下。 keras 已经实现 tf2.x 中 是tensorflow 2 版本 gelu 激活函数要求 tf>=2.4 否则使用入下代码代替gelu

Jian Tengfei 8 Jul 11, 2022
This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras)

Yogi-Optimizer_Keras This is an implementation of Googles Yogi-Optimizer in Keras (tf.keras) The NeurIPS-Paper can be found here: http://papers.nips.c

null 14 Sep 13, 2022
Keras udrl - Keras implementation of Upside Down Reinforcement Learning

keras_udrl Keras implementation of Upside Down Reinforcement Learning This is me

Eder Santana 7 Jan 24, 2022
Classification models 1D Zoo - Keras and TF.Keras

Classification models 1D Zoo - Keras and TF.Keras This repository contains 1D variants of popular CNN models for classification like ResNets, DenseNet

Roman Solovyev 12 Jan 6, 2023
Example-custom-ml-block-keras - Custom Keras ML block example for Edge Impulse

Custom Keras ML block example for Edge Impulse This repository is an example on

Edge Impulse 8 Nov 2, 2022
Unofficial Tensorflow-Keras implementation of Fastformer based on paper [Fastformer: Additive Attention Can Be All You Need](https://arxiv.org/abs/2108.09084).

Fastformer-Keras Unofficial Tensorflow-Keras implementation of Fastformer based on paper Fastformer: Additive Attention Can Be All You Need. Tensorflo

Yam Peleg 10 Jan 30, 2022
A Keras implementation of YOLOv4 (Tensorflow backend)

keras-yolo4 请使用更完善的版本: https://github.com/miemie2013/Keras-YOLOv4 Please visit here for more complete model: https://github.com/miemie2013/Keras-YOLOv

null 384 Nov 29, 2022
A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon.

PokeGAN A tensorflow/keras implementation of StyleGAN to generate images of new Pokemon. Dataset The model has been trained on dataset that includes 8

null 19 Jul 26, 2022
Convnext-tf - Unofficial tensorflow keras implementation of ConvNeXt

ConvNeXt Tensorflow This is unofficial tensorflow keras implementation of ConvNe

null 29 Oct 6, 2022