Hello, thank you for your contribution! I am trying to fine-tune all-MiniLM-L6-v2 (https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) on my own data, but after the first batch I get a ValueError and a loss of inf.
ValueError: Expected value argument (Tensor of shape (4, 192, 1, 1)) to be within the support (Real()) of the distribution Normal(loc: torch.Size([4, 192, 1, 1]), scale: torch.Size([4, 192, 1, 1])), but found invalid values:
tensor([[[[nan]],
[[nan]],
.....
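From what I understand, the message itself just means that NaNs reach a torch.distributions.Normal.log_prob call inside the Glow prior; a tiny standalone snippet like this (my own illustration, not taken from the repo) reproduces the same ValueError:

import torch
from torch.distributions import Normal

# Same shapes as in my error message; log_prob() validates its input against the
# Real() support (assuming argument validation is on, which is the default),
# and NaN values fail that check.
loc = torch.zeros(4, 192, 1, 1)
scale = torch.ones(4, 192, 1, 1)
z = torch.full((4, 192, 1, 1), float('nan'))
Normal(loc, scale).log_prob(z)  # raises the "within the support (Real())" ValueError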
Here is my very simple script (I just replaced the data and put the training in a loop):
import pandas as pd
import numpy as np
from tflow_utils import TransformerGlow, AdamWeightDecayOptimizer
from transformers import AutoTokenizer, AutoModel
model_name_or_path = '/tmp/all-MiniLM-L6-v2'
bertflow = TransformerGlow(model_name_or_path, pooling='mean') # pooling could be 'mean', 'max', 'cls' or 'first-last-avg' (mean pooling over the first and the last layers)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
no_decay = ["bias", "LayerNorm.weight"]
optimizer_grouped_parameters = [
    {
        "params": [p for n, p in bertflow.glow.named_parameters()
                   if not any(nd in n for nd in no_decay)],  # Note: only the parameters within bertflow.glow will be updated; the Transformer will be frozen during training.
        "weight_decay": 0.01,
    },
    {
        "params": [p for n, p in bertflow.glow.named_parameters()
                   if any(nd in n for nd in no_decay)],
        "weight_decay": 0.0,
    },
]
optimizer = AdamWeightDecayOptimizer(
params=optimizer_grouped_parameters,
lr=1e-5,
eps=1e-6,
)
# Important: Remember to shuffle your training data!!! This makes a huge difference!!!
np.random.seed(0)
df = pd.read_csv("data/classification/data_small.csv")
data = df.text.to_list().copy()
np.random.shuffle(data)
bertflow.train()
batch_size = 4
nb_batch = int(np.ceil(len(data) / batch_size))
print(nb_batch)
for batch_id in range(nb_batch):
    batch = data[batch_id*batch_size:(batch_id+1)*batch_size]
    model_inputs = tokenizer(
        batch,
        add_special_tokens=True,
        return_tensors='pt',
        max_length=256,
        padding='longest',
        truncation=True
    )
    z, loss = bertflow(model_inputs['input_ids'], model_inputs['attention_mask'], return_loss=True)  # Here z is the sentence embedding
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(batch_id, loss)
Do you have any idea where this could come from? I have tried different learning rates, but that does not solve the problem.
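In case it helps, this is the kind of check I was planning to add inside the loop to see whether the NaNs show up in the embeddings z or only in the loss, together with gradient clipping before optimizer.step() (just a sketch on my side; the max_norm value is an arbitrary guess):

    # Same names as in the script above.
    z, loss = bertflow(model_inputs['input_ids'], model_inputs['attention_mask'], return_loss=True)
    if torch.isnan(z).any() or not torch.isfinite(loss):
        print(f"batch {batch_id}: NaN in z = {torch.isnan(z).any().item()}, loss = {loss.item()}")
    optimizer.zero_grad()
    loss.backward()
    # Clip only the flow parameters, since the Transformer is frozen.
    torch.nn.utils.clip_grad_norm_(bertflow.glow.parameters(), max_norm=1.0)
    optimizer.step()

Would something like that make sense here, or is the problem more likely in the data / pooling setup?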