This repository contains the official implementation for the paper: TransMix: Attend to Mix for Vision Transformers.

Overview
Comments
  • Tiny ImageNet Eval Top1 100%?

    Hey, I have a question. I used this code to train on Tiny ImageNet with all configs kept at their defaults, but after dozens of epochs the top-1 accuracy on eval is 100%. Is something wrong? I can't find the problem. I also observe that in earlier epochs the test accuracy is higher than the test EMA accuracy, so why do I need to use EMA? Thanks.

    opened by PaulTHong 6
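    For context on the EMA question above: model EMA keeps a smoothed copy of the weights that is evaluated separately from the raw model, and it usually only overtakes the raw weights late in training. A minimal sketch in PyTorch (the decay value and the `Linear` stand-in are illustrative assumptions, not this repo's actual implementation):

    ```python
    import copy
    import torch

    def ema_update(ema_model, model, decay=0.9999):
        # ema_w = decay * ema_w + (1 - decay) * w
        with torch.no_grad():
            for ema_p, p in zip(ema_model.parameters(), model.parameters()):
                ema_p.mul_(decay).add_(p, alpha=1 - decay)

    model = torch.nn.Linear(2, 2)      # stand-in for the ViT being trained
    ema_model = copy.deepcopy(model)   # smoothed copy, updated after each optimizer step
    ema_update(ema_model, model)
    ```

    Because the EMA copy lags behind, its accuracy being lower in early epochs is expected; the benefit shows up at the end of training.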
  • Config file of DeiT-B/16

    Thanks for sharing your amazing work. I was trying to train DeiT-B/16 from scratch on ImageNet-1k using the hyperparameters reported in your paper. I'm pretty sure I'm missing something, but I'm unable to reach 82.4%. With the hyperparameters I use, I get around 78.6%, which is even worse than DeiT-S/16.

    Could you please share the training command line for DeiT-B/16, or the config file for the same? Thanks a lot.

    opened by shashankvkt 1
  • The description of "lam = (lam0 + lam1) / 2 # ()+(b,) ratio=0.5"

    I carefully checked the definition of lam in your paper, but I couldn't find a description of "lam = (lam0 + lam1) / 2 # ()+(b,) ratio=0.5". So, are you using half of the original lam and half of the attention lam for the final lam? Thanks.

    opened by fistyee 1
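    To illustrate the reading the questioner proposes: a hedged sketch of averaging an area-based CutMix lambda with an attention-based lambda. The tensor shapes, names, and equal weighting are assumptions drawn from the quoted comment, not the repo's exact code:

    ```python
    import torch

    def mixed_lambda(lam0, attn, mask):
        """lam0: scalar area-based CutMix lambda, shape ().
        attn: (B, N) attention from the class token to the N patches.
        mask: (N,) binary mask, 1 where a patch was pasted from the other image.
        Returns a per-sample lambda of shape (B,), hence the '()+(b,)' comment."""
        lam1 = (attn * mask).sum(dim=1) / attn.sum(dim=1)  # attention mass on pasted patches
        return (lam0 + lam1) / 2  # equal-weight average, i.e. ratio=0.5

    attn = torch.full((2, 4), 0.25)            # uniform attention over 4 patches
    mask = torch.tensor([1.0, 1.0, 0.0, 0.0])  # half the patches come from the other image
    lam = mixed_lambda(0.3, attn, mask)        # → tensor([0.4000, 0.4000])
    ```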
  • how to get the attention or heat map of swin-transformer?

    Hello, authors! Recently I have become interested in Swin Transformer, but the heat map I got is very unreasonable. I see that you have worked on Swin Transformer, so I would like to know how you draw its heat map or attention map. I would be grateful if you could give me an answer.

    opened by choresefree 1
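    Not an answer from the authors, but a common generic approach is to capture attention weights with a forward hook and visualize them. A minimal sketch on a toy `nn.MultiheadAttention` (for Swin you would register the hook on each windowed-attention block instead; the module and shapes here are illustrative assumptions):

    ```python
    import torch
    import torch.nn as nn

    attn_maps = []

    def save_attention(module, inputs, output):
        # nn.MultiheadAttention returns (attn_output, attn_weights)
        attn_maps.append(output[1].detach())

    attn = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
    attn.register_forward_hook(save_attention)

    tokens = torch.randn(1, 49, 32)  # e.g. the 49 tokens of one 7x7 window
    attn(tokens, tokens, tokens, need_weights=True)
    print(attn_maps[0].shape)  # torch.Size([1, 49, 49])
    ```

    The captured map can then be reshaped to the window grid and upsampled over the image for a heat map.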
  • Waiting for the open source and pretrained models

    Hey, I have a question. For non-classification tasks, the paper proposes that a backbone pretrained with TransMix will perform better, which requires a pretrained model. If I directly add this augmentation module to a current transformer-based segmentation task, will it work? Thanks!

    opened by PaulTHong 1
  • baseline results

    I just checked the configuration file, and it seems that some of the training strategies are quite different from the original DeiT training recipe (e.g. batch size, learning rate scheduler, model EMA, ...). So I'm wondering what the baseline result for this configuration would be.

    opened by kyle-1997 0
  • About Swin Transformer

    Hi, we have read Section 4.5 of the paper about the Swin Transformer part. Would you open-source CA-Swin, and how can TransMix be applied to it? Thanks for your attention.

    opened by Cat-L 0
Owner
Jie-Neng Chen
CS Ph.D. student @ CCVL
PyTorch implementation code for the paper MixCo: Mix-up Contrastive Learning for Visual Representation

opcrisis 46 Dec 15, 2022
Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Computer Vision and Intelligence Research (CVIR) 13 Dec 10, 2022
Official repository for "Intriguing Properties of Vision Transformers" (2021)

Muzammal Naseer 155 Dec 27, 2022
Official repository for "On Improving Adversarial Transferability of Vision Transformers" (2021)

Muzammal Naseer 47 Dec 2, 2022
Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, as a standalone package for Pytorch

Phil Wang 22 Oct 28, 2022
Lightweight mmm - Lightweight (Bayesian) Media Mix Model

Google 342 Jan 3, 2023
Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

RGF-team 364 Dec 28, 2022
This repository contains PyTorch code for Robust Vision Transformers.

null 117 Dec 7, 2022
Official repository for the paper "Going Beyond Linear Transformers with Recurrent Fast Weight Programmers"

IDSIA 36 Nov 15, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

Ce Zheng 363 Dec 28, 2022
This repository builds a basic vision transformer from scratch so that one beginner can understand the theory of vision transformer.

null 1 Dec 24, 2021
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Microsoft 408 Dec 30, 2022
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Bin Xiao 175 Jan 8, 2023
Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers.

null 73 Jan 1, 2023
Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

Microsoft 486 Dec 20, 2022
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

Wang 25 Sep 26, 2022
A hobby project which includes a hand-gesture based virtual piano using a mobile phone camera and OpenCV library functions

Abhinav Gupta 1 Nov 19, 2021
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.

Robotic AI & Learning Lab Berkeley 997 Dec 30, 2022