TransMix: Attend to Mix for Vision Transformers
This repository hosts the official project for the paper TransMix: Attend to Mix for Vision Transformers.
Code and pretrained models will be released soon.
Hey, I have a question. I used this code to train on Tiny ImageNet with all configs kept at their defaults, but after a few dozen epochs the top-1 eval accuracy reaches 100%. Is something wrong? I can't find the problem. I also notice that the test accuracy is higher than the EMA test accuracy in the early epochs, so why do I need to use EMA? Thanks.
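For context on the EMA point above, here is a minimal sketch (not the repository's code; the function name and decay value are illustrative) of a timm-style model-EMA update. Because the EMA weights move only slowly toward the raw weights, their eval accuracy is expected to be lower in early epochs and typically catches up, and often overtakes, the raw weights only later in training.

```python
import torch

@torch.no_grad()
def update_ema(ema_model, model, decay=0.9999):
    # EMA weights are a slow exponential moving average of the raw weights,
    # so early in training they lag behind and usually score lower on eval.
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)
```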
Thanks for sharing your amazing work. I was trying to train DeiT-B/16 from scratch on ImageNet-1k using the hyperparameters reported in your paper. I'm pretty sure I'm missing something, but I'm unable to reach 82.4%; with the hyperparameters I use, I get around 78.6%, which is even worse than DeiT-S/16.
Could you please share the training command line or the config file for DeiT-B/16? Thanks a lot.
I carefully checked the definition of lam in your paper, but I couldn't find a description of "lam = (lam0 + lam1) / 2 # ()+(b,) ratio=0.5". So are you using half of the original lam and half of the attention-based lam for the final lam? Thanks.
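For what it's worth, here is a minimal sketch of what averaging an area-based CutMix ratio with an attention-derived ratio at a 0.5/0.5 split could look like. The names lam_area, attn_cls, and mask are hypothetical, not the repository's; the shape note only mirrors the "()+(b,)" comment in the snippet above.

```python
import torch

def mixed_lambda(lam_area, attn_cls, mask):
    # lam_area: scalar CutMix ratio given by the pasted box area, shape ()
    # attn_cls: (B, N) attention from the class token to the N patch tokens
    # mask:     (N,) binary mask, 1 where a patch comes from the pasted image
    attn_cls = attn_cls / attn_cls.sum(dim=1, keepdim=True)  # normalise per sample
    lam_attn = (attn_cls * mask).sum(dim=1)                  # (B,) attention mass on the pasted region
    # () + (B,) broadcasts to (B,): half area-based, half attention-based
    return (lam_area + lam_attn) / 2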
Hello, authors! I have recently been interested in Swin Transformer, but the heat maps I get look unreasonable. Since you have worked with Swin Transformer, I would like to know how you draw its heat maps or attention maps. I would be grateful if you could give me an answer.
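As a general reference (not the authors' procedure), below is the usual recipe for turning a plain ViT's class-token attention into a heat map. Swin uses windowed attention and has no class token, so it needs extra handling (e.g., aggregating window attentions or Grad-CAM-style methods); the helper name and default sizes here are illustrative only.

```python
import torch
import torch.nn.functional as F

def cls_attention_heatmap(attn, image_size=224, patch_size=16):
    # attn: (num_heads, 1 + N, 1 + N) attention weights of the final block,
    # where token 0 is the class token and the remaining N are patch tokens.
    h = w = image_size // patch_size
    cls_to_patches = attn[:, 0, 1:].mean(dim=0)               # average heads -> (N,)
    heat = cls_to_patches.reshape(1, 1, h, w)
    heat = F.interpolate(heat, size=(image_size, image_size),
                         mode="bilinear", align_corners=False)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
    return heat[0, 0]                                          # (H, W) map in [0, 1]
```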
Hey, I have a question. For non-classification tasks, the paper proposes that a backbone pretrained with TransMix performs better, which requires a pretrained model. If I directly add this augmentation module to a current transformer-based segmentation task, will it work? Thanks!
I just checked the configuration file, and it seems that some of the training strategies are quite different from the original DeiT training recipe (e.g., batch size, learning-rate scheduler, model EMA, ...). So I'm wondering what the baseline result for this configuration would be.
Hi, we have read Section 4.5 of the paper about the Swin Transformer part. Would you open-source the CA-Swin model and explain how to use TransMix with it? Thanks for your attention.