Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis
Requirements
- python 3.7
- pytorch-gpu 1.7
- numpy 1.19.4
- pytorch_pretrained_bert 0.6.2
- nltk 3.3
- GloVe.840B.300d
- bert-base-uncased
Environment
- OS: Ubuntu-16.04.1
- GPU: GeForce RTX 2080
- CUDA: 10.2
- cuDNN: v8.0.2
Dataset
- target datasets:
  - raw data: "./dataset/"
  - processed data: "./dataset_npy/"
  - word embedding files: "./embeddings/" (see the loading sketch below)
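As a rough illustration, the GloVe file can be turned into an embedding matrix along the following lines; the file name, vocabulary handling, and random initialization here are assumptions made for the sketch, not the repository's actual code.

```python
import numpy as np

def load_glove(path, vocab, dim=300):
    """Build an embedding matrix for `vocab` from a GloVe text file.

    `vocab` maps word -> row index; words missing from GloVe keep a
    small random initialization. Lines whose token contains spaces
    (a known quirk of glove.840B.300d) are skipped by the length check.
    """
    emb = np.random.uniform(-0.25, 0.25, (len(vocab), dim)).astype(np.float32)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in vocab and len(vec) == dim:
                emb[vocab[word]] = np.asarray(vec, dtype=np.float32)
    return emb

# e.g. emb = load_glove("./embeddings/glove.840B.300d.txt", vocab)
```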
- pretraining datasets:
  - Amazon review: the "Amazon Reviews for Sentiment Analysis" dataset on Kaggle
  - Yelp review: the "Yelp Review Sentiment Dataset" on Kaggle
  - Before the first run, please execute "python ./process_data.py" to process the pretraining datasets (remember to modify the paths inside the script); a simplified sketch of this step follows.
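For orientation, a simplified version of this preprocessing step might look like the sketch below. The input format, parsing, and output layout are assumptions; the actual logic lives in process_data.py and should be adapted to the real Kaggle file formats.

```python
import numpy as np
from nltk.tokenize import word_tokenize  # requires nltk's "punkt" data

def process_reviews(in_path, out_path, max_docs=100000):
    """Tokenize raw review lines and save them as a numpy object array.

    Assumes one review per line; adjust the parsing to the actual
    Kaggle dump (e.g. CSV columns or fastText-style "__label__" prefixes).
    """
    docs = []
    with open(in_path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= max_docs:
                break
            docs.append(word_tokenize(line.strip().lower()))
    np.save(out_path, np.array(docs, dtype=object))

# e.g. process_reviews("./dataset/amazon_reviews.txt", "./dataset_npy/amazon.npy")
```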
Training options
- ds_name: the name of the target dataset, ['14semeval_laptop', '14semeval_rest', 'Twitter'], default='14semeval_rest'
- pre_name: the name of the pretraining dataset, ['Amazon', 'Yelp'], default='Amazon'
- bs: batch size used during training, [64, 100, 200], default=64
- learning_rate: learning rate, [0.001, 0.0005, 0.00001], default=0.001
- n_epoch: number of training epochs, [5, 10], default=10
- model: the name of the model, ['ABGCN', 'GCAE', 'ATAE'], default='ABGCN'
- is_test: whether to train (0) or test (1) the model, [0, 1], default=1
- is_bert: whether to use GloVe-based (0) or BERT-based (1) embeddings, [0, 1], default=0
- alpha: the value of the parameter \alpha in the knowledge guidance loss of the paper, [0.5, 0.6, 0.7], default=0.6
- stage: the training stage to run, [1, 2, 3, 4], default=4
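For reference, the options above correspond to a plain argparse configuration; the following is a minimal sketch of how they could be declared (the real main.py may differ in details):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-ds_name", default="14semeval_rest",
                    choices=["14semeval_laptop", "14semeval_rest", "Twitter"])
parser.add_argument("-pre_name", default="Amazon", choices=["Amazon", "Yelp"])
parser.add_argument("-bs", type=int, default=64)
parser.add_argument("-learning_rate", type=float, default=0.001)
parser.add_argument("-n_epoch", type=int, default=10)
parser.add_argument("-model", default="ABGCN", choices=["ABGCN", "GCAE", "ATAE"])
parser.add_argument("-is_test", type=int, default=1, choices=[0, 1])
parser.add_argument("-is_bert", type=int, default=0, choices=[0, 1])
parser.add_argument("-alpha", type=float, default=0.6)
parser.add_argument("-stage", type=int, default=4, choices=[1, 2, 3, 4])
args = parser.parse_args()
```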
Running
- running the first stage (pretraining on the document-level dataset):
  - python ./main.py -pre_name Amazon -bs 256 -learning_rate 0.0005 -n_epoch 10 -model ABGCN -is_test 0 -is_bert 0 -stage 1
- running the second stage:
  - python ./main.py -ds_name 14semeval_laptop -bs 64 -learning_rate 0.001 -n_epoch 5 -model ABGCN -is_test 0 -is_bert 0 -alpha 0.6 -stage 2
- running the final stage:
  - python ./main.py -ds_name 14semeval_laptop -bs 64 -learning_rate 0.001 -n_epoch 10 -model ABGCN -is_test 0 -is_bert 0 -stage 3
- training from scratch:
  - python ./main.py -ds_name 14semeval_laptop -bs 64 -learning_rate 0.001 -n_epoch 10 -model ABGCN -is_test 0 -is_bert 0 -stage 4
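Conceptually, the -stage flag selects one phase of this pipeline. The dispatcher below is a hypothetical illustration of that routing; the function names are placeholders, not the repository's actual API.

```python
# Hypothetical routing for the -stage flag; function names are placeholders.
if args.stage == 1:
    pretrain_on_documents(args)          # stage 1: pretrain on Amazon/Yelp reviews
elif args.stage == 2:
    train_with_knowledge_guidance(args)  # stage 2: alignment with the alpha-weighted guidance loss
elif args.stage == 3:
    finetune_on_target(args)             # stage 3: final fine-tuning on the target dataset
else:
    train_from_scratch(args)             # stage 4: train on the target dataset only
```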
Evaluation
To allow a quick look, we have saved the best model weights trained on the target datasets in "./best_model_weight/". You can load them directly and test the performance. Due to limited file space, we only provide the weights of ABGCN on the 14semeval_laptop and 14semeval_rest datasets. You can evaluate the saved weights with:
- python ./main.py -ds_name 14semeval_laptop -bs 64 -model ABGCN -is_test 1 -is_bert 0
- python ./main.py -ds_name 14semeval_rest -bs 64 -model ABGCN -is_test 1 -is_bert 0
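Loading a saved checkpoint for evaluation follows standard PyTorch practice; the sketch below assumes a hypothetical model factory and checkpoint filename, not the repository's actual ones.

```python
import torch

model = build_model(args)  # hypothetical factory for ABGCN / GCAE / ATAE
state = torch.load("./best_model_weight/ABGCN_14semeval_laptop.pth",  # assumed filename
                   map_location="cuda" if torch.cuda.is_available() else "cpu")
model.load_state_dict(state)
model.eval()  # disable dropout etc. before testing
```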
Notes
- The target datasets and more than 50% of the code are borrowed from TNet-ATT (Tang et al., ACL 2019).
- The pretraining datasets are obtained from www.kaggle.com.