Period-alternatives-of-Softmax
Experimental Demo for our paper
'Escaping the Gradient Vanishing: Periodic Alternatives of Softmax in Attention Mechanism'
We suggest replacing the exponential function in Softmax with periodic functions. Through experiments on a simple demo based on LeViT, our method is shown to alleviate the gradient vanishing problem and to yield substantial improvements over Softmax and its variants.
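As a rough illustration of the idea (not the paper's exact formulation), the sketch below contrasts standard Softmax with a periodic, sin²-based normalization of attention scores; the periodic function body here is an assumption for illustration only, and the actual definitions live in the repository code.

```python
import torch

def softmax_weights(scores: torch.Tensor) -> torch.Tensor:
    # Standard Softmax along the key dimension: exp(x_i) / sum_j exp(x_j).
    return torch.softmax(scores, dim=-1)

def periodic_weights(scores: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Illustrative periodic alternative (a sin^2-style sketch, NOT the paper's exact
    # definition): replace exp(x) with sin(x)^2, then normalize each row to sum to 1.
    # The periodic numerator is bounded, unlike the saturating exponential.
    num = torch.sin(scores) ** 2
    return num / (num.sum(dim=-1, keepdim=True) + eps)

if __name__ == "__main__":
    scores = torch.randn(2, 4, 8, 8)          # (batch, heads, queries, keys)
    print(softmax_weights(scores).sum(-1))    # each row sums to 1
    print(periodic_weights(scores).sum(-1))   # each row sums to 1 (up to eps)
```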
** Create your own 'dataset' folder; for datasets other than CIFAR-10, CIFAR-100 and Tiny-ImageNet you may also need to modify demo.py.
Functions available:
softmax , norm_softmax
sinmax, norm_sinmax
cosmax, norm_cosmax
sin_2_max, norm_sin_2_max
sin_2_max_move, norm_sin_2_max_move
sirenmax, norm_sirenmax
sin_softmax, norm_sin_softmax
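One convenient way to switch between these functions is a name-to-callable registry; the sketch below is hypothetical (the repository has its own selection mechanism) and only registers two example entries.

```python
import torch

def sin_2_max(scores, eps=1e-6):
    # Placeholder body reusing the sin^2 sketch above; the repository's definition may differ.
    num = torch.sin(scores) ** 2
    return num / (num.sum(dim=-1, keepdim=True) + eps)

# Hypothetical registry: map a function name (as listed above) to an implementation.
ATTENTION_FUNCTIONS = {
    "softmax": lambda s: torch.softmax(s, dim=-1),
    "sin_2_max": sin_2_max,
    # The remaining functions (sinmax, cosmax, norm_* variants, ...) would be added here.
}

def attention_weights(scores, name="softmax"):
    return ATTENTION_FUNCTIONS[name](scores)
```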
Modes available:
search:
Randomly searches for a suitable learning rate and weight decay pair, and records the results in
Attention_test/*functions/lr_wd_search.txt
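A minimal sketch of what such a random search could look like; the sampling ranges, the `train_and_eval` callback, and the log format are assumptions, not the repository's exact procedure.

```python
import os
import random

def random_lr_wd_search(train_and_eval, log_path, n_trials=20):
    """Sample (learning rate, weight decay) pairs at random and log the outcomes.

    `train_and_eval(lr, wd)` is a placeholder for a short training run returning a
    validation accuracy; `log_path` would point at the per-function
    lr_wd_search.txt file mentioned above.
    """
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    with open(log_path, "a") as f:
        for _ in range(n_trials):
            lr = 10 ** random.uniform(-5, -2)   # log-uniform learning rate
            wd = 10 ** random.uniform(-6, -2)   # log-uniform weight decay
            val_acc = train_and_eval(lr, wd)
            f.write(f"lr={lr:.2e} wd={wd:.2e} val_acc={val_acc:.4f}\n")
```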
run:
Trains the demo; four .npy files will be created in the root directory:
(1) 'record_val_acc.npy': validation accuracy, recorded every 100 iterations;
(2) 'record_train_acc.npy': training accuracy, recorded every batch;
(3) 'record_loss.npy': training loss, recorded every batch;
(4) 'kq_value.npy': the Q·K values, recorded *before scaling*.
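For example, the records can be inspected with NumPy/Matplotlib as below; this assumes the first three files hold 1-D series, and the layout of 'kq_value.npy' should be checked before plotting it.

```python
import numpy as np
import matplotlib.pyplot as plt

# Load the four records written by the 'run' mode (file names as listed above).
val_acc   = np.load("record_val_acc.npy")    # validation accuracy, one entry per 100 iterations
train_acc = np.load("record_train_acc.npy")  # training accuracy, one entry per batch
loss      = np.load("record_loss.npy")       # training loss, one entry per batch
kq_value  = np.load("kq_value.npy")          # Q·K values recorded before scaling

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, series, title in zip(
    axes,
    [val_acc, train_acc, loss],
    ["val acc (per 100 iter)", "train acc (per batch)", "train loss (per batch)"],
):
    ax.plot(series)
    ax.set_title(title)
plt.tight_layout()
plt.show()

print("kq_value shape:", np.asarray(kq_value).shape)  # layout depends on how the demo records it
```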
att_run:
Same as the run mode, except that:
(1) no kq_value is recorded;
(2) every 5 epochs, a test image is fed through the model and the attention score map of each head in each layer is recorded.
The maps are saved in 'Attention_test/attention_maps.npy'.
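The saved maps can then be visualized, for example as below; the array layout (recorded epoch × layer × head × query × key) is an assumption, so inspect the shape before indexing.

```python
import numpy as np
import matplotlib.pyplot as plt

# Load the attention maps saved by 'att_run'; allow_pickle covers the case where the
# array was saved as a Python object (e.g. a nested list of per-layer maps).
maps = np.load("Attention_test/attention_maps.npy", allow_pickle=True)
print("attention_maps shape:", np.asarray(maps).shape)

# Example: the score map of layer 0, head 0 at the first recorded epoch, assuming the
# (epoch, layer, head, query, key) layout described above.
att = np.asarray(maps)[0, 0, 0]
plt.imshow(att, cmap="viridis")
plt.colorbar()
plt.title("layer 0, head 0, first recorded epoch")
plt.show()
```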