Structured Self-Attentive Sentence Embeddings
Implementation of the paper A Structured Self-Attentive Sentence Embedding, published at ICLR 2017: https://arxiv.org/abs/1703.03130 .
USAGE:
For binary sentiment classification on the IMDB dataset, run: python classification.py "binary"
For multiclass classification on the Reuters dataset, run: python classification.py "multiclass"
Model parameters can be changed in the model_params.json file.
Other training parameters, such as the number of attention hops, can be configured in the config.json file.
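A minimal sketch of how those two files might be consumed; the key names below ("attention_hops", "epochs") are illustrative assumptions, not necessarily the repo's actual schema:

```python
import json

# Hypothetical sketch: load the two config files described above.
# Key names are assumptions for illustration only.
with open("config.json") as f:
    config = json.load(f)            # training parameters
with open("model_params.json") as f:
    model_params = json.load(f)      # model architecture parameters

r = config.get("attention_hops", 30)   # number of attention hops (r in the paper)
epochs = config.get("epochs", 5)
```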
To use pretrained GloVe embeddings, set the use_embeddings parameter to "True" (the default is "False"). Do not forget to download glove.6B.50d.txt and place it in the glove folder.
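A minimal sketch of building an embedding matrix from glove.6B.50d.txt, assuming a PyTorch setup; the helper name load_glove and the word2idx vocabulary mapping are assumptions, not this repo's actual code:

```python
import numpy as np
import torch

def load_glove(path, word2idx, dim=50):
    # Start from small random vectors for out-of-vocabulary words.
    weights = np.random.uniform(-0.05, 0.05, (len(word2idx), dim))
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if parts[0] in word2idx:
                weights[word2idx[parts[0]]] = np.asarray(parts[1:], dtype=np.float32)
    return torch.from_numpy(weights).float()

# e.g. embedding = torch.nn.Embedding.from_pretrained(
#          load_glove("glove/glove.6B.50d.txt", word2idx))
```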
Implemented:
- Classification using self-attention
- Regularization using Frobenius norm
- Gradient clipping
- Visualizing the attention weights
Instead of pruning, averaging over the sentence embeddings is used; a minimal sketch of the attention module follows below.
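The sketch below, assuming PyTorch, follows the attention described in the paper: A = softmax(W_s2 tanh(W_s1 H^T)) over the LSTM states H, the Frobenius-norm penalty ||A A^T - I||_F^2, and averaging over the r hops instead of pruning. The class name and the defaults d_a = 350, r = 30 (the paper's values) are illustrative, not a copy of this repo's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuredSelfAttention(nn.Module):
    def __init__(self, hidden_dim, d_a=350, r=30):
        super().__init__()
        self.ws1 = nn.Linear(hidden_dim, d_a, bias=False)  # W_s1
        self.ws2 = nn.Linear(d_a, r, bias=False)           # W_s2

    def forward(self, H):
        # H: (batch, seq_len, hidden_dim) LSTM outputs.
        A = F.softmax(self.ws2(torch.tanh(self.ws1(H))), dim=1)  # attention over tokens
        A = A.transpose(1, 2)                                    # (batch, r, seq_len)
        M = A @ H                                                # (batch, r, hidden_dim)
        return M.mean(dim=1), A   # average over hops instead of pruning

def frobenius_penalty(A):
    # ||A A^T - I||_F^2, the redundancy penalty from the paper.
    I = torch.eye(A.size(1), device=A.device).unsqueeze(0)
    return ((A @ A.transpose(1, 2) - I) ** 2).sum(dim=(1, 2)).mean()
```

Gradient clipping from the list above is the usual PyTorch one-liner, torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm), applied after backward() and before the optimizer step.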
Visualization:
After training, the model is tested on 100 test points. Attention weights for these 100 points are retrieved and rendered as heatmaps over the text. A file visualization.html is saved in the visualization/ folder after successful training. The visualization code was provided by Zhouhan Lin (@hantek). Many thanks.
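The actual rendering code is Zhouhan Lin's; purely as an illustration of the idea, a minimal sketch that shades each token by its attention weight might look like this (the function name and color scheme are assumptions):

```python
import os

def attention_to_html(tokens, weights):
    # weights: one attention value per token, assumed normalized to [0, 1].
    spans = (
        f'<span style="background-color: rgba(255,0,0,{w:.2f})">{tok}</span>'
        for tok, w in zip(tokens, weights)
    )
    return " ".join(spans)

os.makedirs("visualization", exist_ok=True)
with open("visualization/visualization.html", "w") as f:
    f.write(attention_to_html(["a", "great", "movie"], [0.05, 0.90, 0.40]))
```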
Below is a screenshot of the visualization on a few data points.
Training accuracy: 93.4%. Test accuracy (on 1000 points): 90.2%.