Utilizing RBERT model for KLUE Relation Extraction task

snoop2head

Last update: Nov 15, 2022

Related tags

Text Data & NLP KLUE-RBERT

Overview

RBERT for Relation Extraction task for KLUE

Project Description

The relation extraction task is one of the tasks in the Korean Language Understanding Evaluation (KLUE) benchmark.
Relation extraction can be framed as multiclass classification of the relationship between a subject entity and an object entity.
There are 30 labels in total, such as no_relation, per:employee_of, and org:founded_by.
This repo contains a custom fine-tuning method that uses monologg's R-BERT implementation.
Custom punctuation marks based on Pororo NER were added to the dataset before training the model.
For experiment notes, such as the entity punctuation method, please refer to the blog post.
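
As a rough illustration of what marking entities with punctuation can look like, here is a minimal sketch. The marker characters (`@` for the subject, `#` for the object), the bracketed type tag, and the `mark_entities` helper are all hypothetical; the scheme actually used in this repo is described in the blog post, not here. Entity types are passed in directly, whereas in the repo they would come from Pororo NER. Spans are assumed to be non-overlapping, inclusive character indices.

```python
def mark_entities(sentence, subj, obj):
    """Wrap the subject and object spans with typed punctuation markers.

    subj and obj are dicts with 'start' and 'end' (inclusive character
    indices) and 'type'. The marker scheme is a hypothetical example.
    """
    spans = [
        (subj["start"], subj["end"], "@", subj["type"]),
        (obj["start"], obj["end"], "#", obj["type"]),
    ]
    # Rewrite the rightmost span first so the earlier indices stay valid.
    for start, end, mark, ent_type in sorted(spans, reverse=True):
        word = sentence[start:end + 1]
        wrapped = f"{mark}[{ent_type}]{word}{mark}"
        sentence = sentence[:start] + wrapped + sentence[end + 1:]
    return sentence


# Example for an org:founded_by style pair (subject = ORG, object = PER).
marked = mark_entities(
    "Steve Jobs founded Apple.",
    subj={"start": 19, "end": 23, "type": "ORG"},
    obj={"start": 0, "end": 9, "type": "PER"},
)
print(marked)  # #[PER]Steve Jobs# founded @[ORG]Apple@.
```

The marked sentence is then what gets tokenized and fed to the classifier, so the model can see where each entity is and what type it has.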

Arguments Usage

Argument	type	Default	Explanation
batch_size	int	40	batch size for training and inference
num_folds	int	5	number of folds for Stratified KFold
num_train_epochs	int	5	number of epochs for training
loss	str	focalloss	loss function
gamma	float	1.0	focalloss's gamma value
optimizer	str	adamp	optimizer for training
scheduler	str	get_cosine_schedule_with_warmup	learning rate scheduler
learning_rate	float	0.00005	initial learning rate
weight_decay	float	0.01	weight decay (regularization) to reduce overfitting
warmup_step	int	500
debug	bool	false	debug with CPU device for better error representation
dropout_rate	float	0.1
save_steps	int	100	number of steps for saving the model
evaluation_steps	int	100	number of steps between evaluations
metric_for_best_model	str	eval/loss	the metric for determining which is the best model
load_best_model_at_end	bool	True
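
To make the `loss`/`gamma` and `scheduler`/`warmup_step` rows more concrete, the sketch below writes out both ideas in plain Python. It is not the code used in this repo. `focal_loss` follows the standard focal loss formula for a single example, and `cosine_warmup_factor` is the usual linear-warmup-then-cosine-decay multiplier that a schedule such as get_cosine_schedule_with_warmup applies to the initial learning rate. The function names and the `total_steps` value are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=1.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this is ordinary cross entropy; a larger gamma
    down-weights examples the model already classifies confidently.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cosine_warmup_factor(step, warmup_steps, total_steps, num_cycles=0.5):
    """Multiplier on the initial learning rate at a given step."""
    if step < warmup_steps:
        # Linear warmup from 0 to 1.
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Cosine decay from 1 down to 0 when num_cycles is 0.5.
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))


# Defaults from the table; total_steps is a hypothetical value.
learning_rate, warmup_step, total_steps = 0.00005, 500, 2000
for step in (0, 250, 500, 1250, 2000):
    factor = cosine_warmup_factor(step, warmup_step, total_steps)
    print(step, learning_rate * factor)

logits = [2.0, 0.5, 0.1]  # made-up scores for three classes
print(focal_loss(logits, target=0, gamma=0.0))  # cross entropy
print(focal_loss(logits, target=0, gamma=1.0))  # smaller: easy example is down-weighted
```

With a class distribution dominated by no_relation, down-weighting easy examples is a common motivation for choosing focal loss over plain cross entropy.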

References

Authorship

Hardware

GPU : Tesla V100 32GB

