TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"

Overview

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?

Open In Colab


A TensorFlow implementation of TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? [1]. In this paper, an earlier version of which is presented at NeurIPS 2021 [2], the authors suggest an adaptive token learning algorithm that makes ViT computationally much more efficient (in terms of FLOPs) and also increases downstream accuracy (here classification accuracy). Experimenting with CIFAR-10 we reduce the number of pathces from 64 to 4 (number of adaptively learned tokens) and also report a boost in the accuracy. We experiment with different hyperparameters and report results which aligns with the literature.

With and Without TokenLearner

We report results training our mini ViT with and without the vanilla TokenLearner module here. You can find the vanilla Token Learner module in the TokenLearner.ipynb notebook.

TokenLearner # tokens in
TokenLearner
Top-1 Acc
(Averaged across 5 runs)
GFLOPs TensorBoard
N - 56.112% 0.0184 Link
Y 8 56.55% 0.0153 Link
N - 56.37% 0.0184 Link
Y 4 56.4980% 0.0147 Link
N - (# Transformer layers: 8) 55.36% 0.0359 Link

TokenLearner v1.1

We have also implemented the Token Learner v11 module which aligns with the official implementation. The Token Learner v11 module can be found in the TokenLearner-V1.1.ipynb notebook. The results training with this module are as follows:

# Groups # Tokens Top-1 Acc GFLOPs TensorBoard
4 4 54.638% 0.0149 Link
8 8 54.898% 0.0146 Link
4 8 55.196% 0.0149 Link

We acknowledge that the results with this new TokenLearner module are slightly off than expected and this might mitigate with hyperparameter tuning.

Note: To compute the FLOPs of our models we use this utility from this repository.

Acknowledgements

References

[1] TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?; Ryoo et al.; arXiv 2021; https://arxiv.org/abs/2106.11297

[2] TokenLearner: Adaptive Space-Time Tokenization for Videos; Ryoo et al., NeurIPS 2021; https://openreview.net/forum?id=z-l1kpDXs88

You might also like...
Unofficial Implementation of MLP-Mixer in TensorFlow
Unofficial Implementation of MLP-Mixer in TensorFlow

mlp-mixer-tf Unofficial Implementation of MLP-Mixer [abs, pdf] in TensorFlow. Note: This project may have some bugs in it. I'm still learning how to i

Tensorflow implementation for Self-supervised Graph Learning for Recommendation

If the compilation is successful, the evaluator of cpp implementation will be called automatically. Otherwise, the evaluator of python implementation will be called.

Minimal implementation of PAWS (https://arxiv.org/abs/2104.13963) in TensorFlow.
Minimal implementation of PAWS (https://arxiv.org/abs/2104.13963) in TensorFlow.

PAWS-TF 🐾 Implementation of Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples (PAWS)

Unofficial TensorFlow  implementation of the Keyword Spotting Transformer model
Unofficial TensorFlow implementation of the Keyword Spotting Transformer model

Keyword Spotting Transformer This is the unofficial TensorFlow implementation of the Keyword Spotting Transformer model. This model is used to train o

A tensorflow implementation of GCN-LPA

GCN-LPA This repository is the implementation of GCN-LPA (arXiv): Unifying Graph Convolutional Neural Networks and Label Propagation Hongwei Wang, Jur

Official Tensorflow implementation of
Official Tensorflow implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

M-LSD: Towards Light-weight and Real-time Line Segment Detection Official Tensorflow implementation of "M-LSD: Towards Light-weight and Real-time Line

Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.
Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[TensorFlow] Protein Interface Prediction using Graph Convolutional Networks Unofficial TensorFlow implementation of Protein Interface Prediction usin

Tensorflow implementation of MIRNet for Low-light image enhancement
Tensorflow implementation of MIRNet for Low-light image enhancement

MIRNet Tensorflow implementation of the MIRNet architecture as proposed by Learning Enriched Features for Real Image Restoration and Enhancement. Lanu

Tensorflow python implementation of
Tensorflow python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos"

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos This repository is the official tensorflow python implementation

Comments
  • feat: add tokenlearner v1.1

    feat: add tokenlearner v1.1

    Results:

    | # Groups | # Tokens | Top-1 Acc | TensorBoard | |:---:|:---:|:---:|:---:| | 4 | 4 | 54.638% | Link | | 8 | 8 | 54.898% | Link | | 4 | 8 | 55.196% | Link |

    opened by sayakpaul 3
  • Make everything modular

    Make everything modular

    Notes:

    • The PatchEncoder and TokenLearer were subclassed from keras.layers.Layer, now they have been turned into modules like the mlp function.
    • Extracted the transformer block and build a fucntion around it.
    • Changed a bunch of variable names to make it align with the literature.
    opened by ariG23498 0
Owner
Aritra Roy Gosthipaty
Learning with a learning rate of 1e-10. Deep Learning Associate at @pyimagesearch.
Aritra Roy Gosthipaty
Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy.

Deploy tensorflow graphs for fast evaluation and export to tensorflow-less environments running numpy. Now with tensorflow 1.0 support. Evaluation usa

Marcel R. 349 Aug 6, 2022
TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform

null 2.6k Jan 4, 2023
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Peter Lin 6.5k Jan 4, 2023
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

flow-dev 2 Aug 21, 2022
Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow

xRBM Library Implementation of Restricted Boltzmann Machine (RBM) and its variants in Tensorflow Installation Using pip: pip install xrbm Examples Tut

Omid Alemi 55 Dec 29, 2022
Functional TensorFlow Implementation of Singular Value Decomposition for paper Fast Graph Learning

tf-fsvd TensorFlow Implementation of Functional Singular Value Decomposition for paper Fast Graph Learning with Unique Optimal Solutions Cite If you f

Sami Abu-El-Haija 14 Nov 25, 2021
StyleGAN2 - Official TensorFlow Implementation

StyleGAN2 - Official TensorFlow Implementation

NVIDIA Research Projects 10.1k Dec 28, 2022
An efficient and effective learning to rank algorithm by mining information across ranking candidates. This repository contains the tensorflow implementation of SERank model. The code is developed based on TF-Ranking.

SERank An efficient and effective learning to rank algorithm by mining information across ranking candidates. This repository contains the tensorflow

Zhihu 44 Oct 20, 2022
Implementation of Perceiver, General Perception with Iterative Attention in TensorFlow

Perceiver This Python package implements Perceiver: General Perception with Iterative Attention by Andrew Jaegle in TensorFlow. This model builds on t

Rishit Dagli 84 Oct 15, 2022
Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow.

Denoised-Smoothing-TF Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow. Denoised Smoothing is

Sayak Paul 19 Dec 11, 2022