🏖 Easy training and deployment of seq2seq models.

Headliner


Headliner is a sequence modeling library that eases the training and, in particular, the deployment of custom sequence models for both researchers and developers. You can deploy your models in just a few lines of code. It was originally built for our own research to generate headlines from Welt news articles (see Figure 1), which is why we chose the name Headliner.

Figure 1: One example from our Welt.de headline generator.

Update 21.01.2020

The library now supports fine-tuning pre-trained BERT models with custom preprocessing as in Text Summarization with Pretrained Encoders!

Check out this tutorial on Colab!

🧠 Internals

Under the hood we use a sequence-to-sequence (seq2seq) encoder-decoder framework (see Figure 2) and provide a very simple interface to train and deploy seq2seq models. Although this library was created internally to generate headlines, you can also use it for other tasks like machine translation, text summarization and many more.

Figure 2: Encoder-decoder sequence-to-sequence model.

Why Headliner?

You may ask: why another seq2seq library? There are already a couple of them out there. For example, Facebook has fairseq, Google has seq2seq and there is also OpenNMT. Although those libraries are great, they have a few drawbacks for our use case, e.g. fairseq doesn't focus much on production, whereas Google's seq2seq is not actively maintained. OpenNMT came closest to matching our requirements, i.e. it has a strong focus on production. However, we didn't like that its workflow (preparing data, training and evaluation) is mainly done via the command line. It does expose a well-defined API, but the complexity there is still too high, with too much custom code (see their minimal transformer training example).

Therefore, we built this library for us with the following goals in mind:

  • Easy-to-use API for training and deployment (only a few lines of code)
  • Uses TensorFlow 2.0 with all its new features (tf.function, tf.keras.layers etc.)
  • Modular classes: text preprocessing, modeling, evaluation
  • Extensible for different encoder-decoder models
  • Works on large text data

For more details on the library, read the documentation at: https://as-ideas.github.io/headliner/

Headliner is compatible with Python 3.6 and is distributed under the MIT license.

⚙️ Installation

⚠️ Before installing Headliner, you need to install TensorFlow, which we use as our deep learning framework. For more details on how to install it, have a look at the TensorFlow installation instructions.
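
For example, you can install the latest TensorFlow 2 release from PyPI (assuming the default build is sufficient for your setup):

pip install tensorflow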

Then you can install Headliner itself. There are two ways to do so:

  • Install Headliner from PyPI (recommended):
pip install headliner
  • Install Headliner from the GitHub source:
git clone https://github.com/as-ideas/headliner.git
cd headliner
python setup.py install

📖 Usage

Training

For training, import one of our provided models or create your own custom one. Then create the dataset as a list of input-output tuples and train the model:

from headliner.trainer import Trainer
from headliner.model.transformer_summarizer import TransformerSummarizer

data = [('You are the stars, earth and sky for me!', 'I love you.'),
        ('You are great, but I have other plans.', 'I like you.')]

summarizer = TransformerSummarizer(embedding_size=64, max_prediction_len=20)
trainer = Trainer(batch_size=2, steps_per_epoch=100)
trainer.train(summarizer, data, num_epochs=2)
summarizer.save('/tmp/summarizer')

Prediction

The prediction can be done in a few lines of code:

from headliner.model.transformer_summarizer import TransformerSummarizer

summarizer = TransformerSummarizer.load('/tmp/summarizer')
summarizer.predict('You are the stars, earth and sky for me!')

Models

Currently available models include a basic encoder-decoder, an encoder-decoder with Luong attention, the Transformer and a Transformer on top of a pre-trained BERT model:

from headliner.model.basic_summarizer import BasicSummarizer
from headliner.model.attention_summarizer import AttentionSummarizer
from headliner.model.transformer_summarizer import TransformerSummarizer
from headliner.model.bert_summarizer import BertSummarizer

basic_summarizer = BasicSummarizer()
attention_summarizer = AttentionSummarizer()
transformer_summarizer = TransformerSummarizer()
bert_summarizer = BertSummarizer()

Advanced training

Training using a validation split and model checkpointing:

from headliner.model.transformer_summarizer import TransformerSummarizer
from headliner.trainer import Trainer

train_data = [('You are the stars, earth and sky for me!', 'I love you.'),
              ('You are great, but I have other plans.', 'I like you.')]
val_data = [('You are great, but I have other plans.', 'I like you.')]

summarizer = TransformerSummarizer(num_heads=1,
                                   feed_forward_dim=512,
                                   num_layers=1,
                                   embedding_size=64,
                                   max_prediction_len=50)
trainer = Trainer(batch_size=8,
                  steps_per_epoch=50,
                  max_vocab_size_encoder=10000,
                  max_vocab_size_decoder=10000,
                  tensorboard_dir='/tmp/tensorboard',
                  model_save_path='/tmp/summarizer')

trainer.train(summarizer, train_data, val_data=val_data, num_epochs=3)

Advanced prediction

Prediction information such as attention weights and logits can be accessed via predict_vectors, which returns a dictionary:

from headliner.model.transformer_summarizer import TransformerSummarizer

summarizer = TransformerSummarizer.load('/tmp/summarizer')
summarizer.predict_vectors('You are the stars, earth and sky for me!')
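
The exact contents of the returned dictionary depend on the model, so a minimal sketch is to simply inspect what comes back:

from headliner.model.transformer_summarizer import TransformerSummarizer

summarizer = TransformerSummarizer.load('/tmp/summarizer')
prediction = summarizer.predict_vectors('You are the stars, earth and sky for me!')

# list the available entries (e.g. attention weights, logits) and their types
for key, value in prediction.items():
    print(key, type(value))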

Resume training

A previously trained summarizer can be loaded and then retrained. In this case, the data preprocessing and vectorization are loaded from the model:

from headliner.model.transformer_summarizer import TransformerSummarizer
from headliner.trainer import Trainer

train_data = [('Some new training data.', 'New data.')] * 10

summarizer_loaded = TransformerSummarizer.load('/tmp/summarizer')
trainer = Trainer(batch_size=2)
trainer.train(summarizer_loaded, train_data)
summarizer_loaded.save('/tmp/summarizer_retrained')

Use pretrained GloVe embeddings

Embeddings in GloVe format can be injected into the trainer as follows. Optionally, set the embeddings to non-trainable.

from headliner.model.transformer_summarizer import TransformerSummarizer
from headliner.trainer import Trainer

trainer = Trainer(embedding_path_encoder='/tmp/embedding_encoder.txt',
                  embedding_path_decoder='/tmp/embedding_decoder.txt')

# make sure the embedding size matches the embedding size of the files
summarizer = TransformerSummarizer(embedding_size=64,
                                   embedding_encoder_trainable=False,
                                   embedding_decoder_trainable=False)
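
The embedding files are expected in plain GloVe text format, i.e. one token per line followed by its vector components separated by spaces. A minimal sketch for writing such a toy file (the path and values are illustrative only):

# write a tiny embedding file in GloVe text format: "<token> <v1> <v2> ... <vn>"
toy_embeddings = {
    'you': [0.1] * 64,   # vector length must equal the model's embedding_size (here 64)
    'love': [0.2] * 64,
}
with open('/tmp/embedding_encoder.txt', 'w', encoding='utf-8') as f:
    for token, vector in toy_embeddings.items():
        f.write(token + ' ' + ' '.join(str(v) for v in vector) + '\n')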

Custom preprocessing

A model can be initialized with custom preprocessing and tokenization:

from headliner.model.transformer_summarizer import TransformerSummarizer
from headliner.preprocessing.preprocessor import Preprocessor
from headliner.preprocessing.vectorizer import Vectorizer
from headliner.trainer import Trainer

train_data = [('Some inputs.', 'Some outputs.')] * 10

preprocessor = Preprocessor(filter_pattern='',
                            lower_case=True,
                            hash_numbers=False)
train_prep = [preprocessor(t) for t in train_data]
inputs_prep = [t[0] for t in train_prep]
targets_prep = [t[1] for t in train_prep]

# Build tf subword tokenizers. Other custom tokenizers can be implemented
# by subclassing headliner.preprocessing.Tokenizer
from tensorflow_datasets.core.features.text import SubwordTextEncoder
tokenizer_input = SubwordTextEncoder.build_from_corpus(
    inputs_prep, target_vocab_size=2**13,
    reserved_tokens=[preprocessor.start_token, preprocessor.end_token])
tokenizer_target = SubwordTextEncoder.build_from_corpus(
    targets_prep, target_vocab_size=2**13,
    reserved_tokens=[preprocessor.start_token, preprocessor.end_token])

vectorizer = Vectorizer(tokenizer_input, tokenizer_target)
summarizer = TransformerSummarizer(embedding_size=64, max_prediction_len=50)
summarizer.init_model(preprocessor, vectorizer)

trainer = Trainer(batch_size=2)
trainer.train(summarizer, train_data, num_epochs=3)

Use pre-trained BERT embeddings

Pre-trained BERT models can be included as follows. Be aware that pre-trained BERT models are expensive to train and require custom preprocessing!

from headliner.preprocessing.bert_preprocessor import BertPreprocessor
from headliner.preprocessing.bert_vectorizer import BertVectorizer  # import path assumed analogous to BertPreprocessor
from headliner.trainer import Trainer
from spacy.lang.en import English

train_data = [('Some inputs.', 'Some outputs.')] * 10

# use BERT-specific start and end token
preprocessor = BertPreprocessor(nlp=English())
train_prep = [preprocessor(t) for t in train_data]
targets_prep = [t[1] for t in train_prep]


from tensorflow_datasets.core.features.text import SubwordTextEncoder
from transformers import BertTokenizer
from headliner.model.bert_summarizer import BertSummarizer

# Use a pre-trained BERT embedding and BERT tokenizer for the encoder 
tokenizer_input = BertTokenizer.from_pretrained('bert-base-uncased')
tokenizer_target = SubwordTextEncoder.build_from_corpus(
    targets_prep, target_vocab_size=2**13,  reserved_tokens=[preprocessor.start_token, preprocessor.end_token])

vectorizer = BertVectorizer(tokenizer_input, tokenizer_target)
summarizer = BertSummarizer(num_heads=2,
                            feed_forward_dim=512,
                            num_layers_encoder=0,
                            num_layers_decoder=4,
                            bert_embedding_encoder='bert-base-uncased',
                            embedding_size_encoder=768,
                            embedding_size_decoder=768,
                            dropout_rate=0.1,
                            max_prediction_len=50)
summarizer.init_model(preprocessor, vectorizer)

trainer = Trainer(batch_size=2)
trainer.train(summarizer, train_data, num_epochs=3)

Training on large datasets

Large datasets can be handled by using an iterator:

from headliner.model.transformer_summarizer import TransformerSummarizer
from headliner.trainer import Trainer

def read_data_iteratively():
    return (('Some inputs.', 'Some outputs.') for _ in range(1000))

class DataIterator:
    def __iter__(self):
        return read_data_iteratively()

data_iter = DataIterator()

summarizer = TransformerSummarizer(embedding_size=10, max_prediction_len=20)
trainer = Trainer(batch_size=16, steps_per_epoch=1000)
trainer.train(summarizer, data_iter, num_epochs=3)

🤝 Contribute

We welcome all kinds of contributions, such as new models, new examples and more. See the Contribution guide for more details.

📝 Cite this work

Please cite Headliner in your publications if this is useful for your research. Here is an example BibTeX entry:

@misc{axelspringerai2019headliners,
  title={Headliner},
  author={Christian Schäfer and Dat Tran},
  year={2019},
  howpublished={\url{https://github.com/as-ideas/headliner}},
}

🏗 Maintainers

© Copyright

See LICENSE for details.

References

Text Summarization with Pretrained Encoders

Effective Approaches to Attention-based Neural Machine Translation

Acknowledgements

https://www.tensorflow.org/tutorials/text/transformer

https://github.com/huggingface/transformers

https://machinetalk.org/2019/03/29/neural-machine-translation-with-attention-mechanism/

Comments
  • Bert Model prediction giving same output

    Hi @cschaefer26 and @datitran,

    Thank you so much for the Headliner library. I have been playing around with it for a while and have really enjoyed it so far.

    I was working on a summarization use case with Headliner's BERT model. I followed the README code below (with a few tweaks to the SummarizerTransformer parameters) using my own train_data:

    from headliner.preprocessing import Preprocessor
    
    train_data = [('Some inputs.', 'Some outputs.')] * 10
    
    # use BERT-specific start and end token
    preprocessor = Preprocessor(start_token='[CLS]',
                                end_token='[SEP]',
                                lower_case=True)
    train_prep = [preprocessor(t) for t in train_data]
    targets_prep = [t[1] for t in train_prep]
    
    
    from tensorflow_datasets.core.features.text import SubwordTextEncoder
    from transformers import BertTokenizer
    from headliner.model import SummarizerBert
    
    # Use a pre-trained BERT embedding and BERT tokenizer for the encoder 
    tokenizer_input = BertTokenizer.from_pretrained('bert-base-uncased')
    tokenizer_target = SubwordTextEncoder.build_from_corpus(
        targets_prep, target_vocab_size=2**13,  reserved_tokens=[preprocessor.start_token, preprocessor.end_token])
    
    vectorizer = Vectorizer(tokenizer_input, tokenizer_target)
    summarizer = SummarizerBert(num_heads=4,
                                feed_forward_dim=512,
                                num_layers_encoder=3,
                                num_layers_decoder=3,
                                bert_embedding_encoder='bert-base-uncased',
                                embedding_encoder_trainable=False,
                                embedding_size_encoder=768,
                                embedding_size_decoder=64,
                                dropout_rate=0.1,
                                max_prediction_len=400)
    summarizer.init_model(preprocessor, vectorizer)
    
    trainer = Trainer(batch_size=2)
    trainer.train(summarizer, train_data, num_epochs=200)
    

    I train the model for 200 epochs and I see in the logs that the loss keeps decreasing (from around 4 down to 0.69), so it seems like training happens just fine.

    After that, when I try to do prediction with the saved model using the following code, it always gives me the same prediction for any test_sentence:

    from headliner.model.summarizer_bert import SummarizerBert
    
    summarizer = SummarizerBert.load('/path/to/headliner_bert_model')
    summarizer.predict(test_sentence)
    

    Could anyone please advise if I am missing anything with respect to the prediction part for the BERT model?

    opened by atirpetkar 6
  • Models are not saved after training.

    I am training a transformer model on Kaggle. After training, the models are not getting saved. It was working fine before the addition of the BERT transformer.

    Here is the notebook: https://www.kaggle.com/mohitsaini235/chatbot?scriptVersionId=22931537

    I have commented out some code so that you can run it quickly and see the results.

    opened by sainimohit23 6
  • Training with longer examples leads to `InvalidArgumentError`

    When I try to train a SummarizerTransformer on longer training examples I get the following error: InvalidArgumentError: Incompatible shapes: [1,11,64] vs. [1,8,64] in train_step. It seems to depend on the length of the targets.

    Minimal example:

    from headliner.trainer import Trainer
    from headliner.model.summarizer_transformer import SummarizerTransformer
    
    data = [
            ('You are the stars, earth and sky for me!', 'I love you I love you I love you.'),
            ('You are the stars, earth and sky for me!', 'I love you.')
    ]
    
    summarizer = SummarizerTransformer(embedding_size=64, max_prediction_len=20)
    trainer = Trainer(batch_size=1, steps_per_epoch=100)
    trainer.train(summarizer, data, num_epochs=1)
    

    Leads to an InvalidArgumentError while

    from headliner.trainer import Trainer
    from headliner.model.summarizer_transformer import SummarizerTransformer
    
    data = [
            ('You are the stars, earth and sky for me!', 'I love you.'),
            ('You are the stars, earth and sky for me!', 'I love you.')
    ]
    
    summarizer = SummarizerTransformer(embedding_size=64, max_prediction_len=20)
    trainer = Trainer(batch_size=1, steps_per_epoch=100)
    trainer.train(summarizer, data, num_epochs=1)
    

    without the ('You are the stars, earth and sky for me!', 'I love you I love you I love you.') pair, works fine.

    I use:

    python==3.6
    tensorflow==2.0.0
    headliner== 0.0.22
    

    It does not depend on whether I run it on a GPU or CPU only.

    Can you reproduce this bug?

    opened by pschwllr 2
  • Question: Pre-trained BERT for MT

    Hi,

    has there been any research done on your side comparing BLEU scores to SOTA results for this pre-trained BERT for MT? Or is it merely to illustrate the flexibility of the library, e.g., attaching a decoder and re-training on a custom dataset?

    opened by Stamenov 2
  • Error while loading trained model

    I followed the documentation and got error for the following code:

    Training part

    NUM_UNITS = 1024
    BATCH_SIZE = 32
    STEPS_PER_EPOCH = len(data) // BATCH_SIZE
    STEPS_TO_LOG = 100
    MAX_OUTPUT_LENGTH = 50
    EPOCHS = 20
    EMB_SIZE = 128
    
    from headliner.trainer import Trainer
    from headliner.model.summarizer_attention import SummarizerAttention
    
    summarizer = SummarizerAttention(lstm_size=NUM_UNITS, embedding_size=EMB_SIZE)
    trainer = Trainer(batch_size=BATCH_SIZE, 
                      steps_per_epoch=STEPS_PER_EPOCH, 
                      steps_to_log=STEPS_TO_LOG, 
                      max_output_len=MAX_OUTPUT_LENGTH, 
                      model_save_path=save_path)
    trainer.train(summarizer, train, num_epochs=EPOCHS, val_data=test)
    
    

    Loading pre-trained model:

    summarizer_loaded = SummarizerAttention.load('summarizer')
    trainer = Trainer(batch_size=2)
    trainer.train(summarizer_loaded, data)
    
    ---------------------------------------------------------------------------
    UnknownError                              Traceback (most recent call last)
    <ipython-input-20-1bcb176df5c9> in <module>()
          1 summarizer_loaded = SummarizerAttention.load('summarizer')
          2 trainer = Trainer(batch_size=2)
    ----> 3 trainer.train(summarizer_loaded, data)
          4 # summarizer_loaded.save('/tmp/summarizer_retrained')
    
    C:\ProgramData\Anaconda3\lib\site-packages\headliner\trainer.py in train(self, summarizer, train_data, val_data, num_epochs, scorers, callbacks)
        203         train_step = summarizer.new_train_step(self.loss_function, self.batch_size, apply_gradients=True)
        204         while epoch_count < num_epochs:
    --> 205             for train_source_seq, train_target_seq in train_dataset.take(-1):
        206                 batch_count += 1
        207                 current_loss = train_step(train_source_seq, train_target_seq)
    
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in __next__(self)
        620 
        621   def __next__(self):  # For Python 3 compatibility
    --> 622     return self.next()
        623 
        624   def _next_internal(self):
    
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in next(self)
        664     """Returns a nested structure of `Tensor`s containing the next element."""
        665     try:
    --> 666       return self._next_internal()
        667     except errors.OutOfRangeError:
        668       raise StopIteration
    
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\iterator_ops.py in _next_internal(self)
        649             self._iterator_resource,
        650             output_types=self._flat_output_types,
    --> 651             output_shapes=self._flat_output_shapes)
        652 
        653       try:
    
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\gen_dataset_ops.py in iterator_get_next_sync(iterator, output_types, output_shapes, name)
       2670       else:
       2671         message = e.message
    -> 2672       _six.raise_from(_core._status_to_exception(e.code, message), None)
       2673   # Add nodes to the TensorFlow graph.
       2674   if not isinstance(output_types, (list, tuple)):
    
    C:\ProgramData\Anaconda3\lib\site-packages\six.py in raise_from(value, from_value)
    
    UnknownError: AttributeError: 'Vectorizer' object has no attribute 'max_input_len'
    Traceback (most recent call last):
    
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\ops\script_ops.py", line 221, in __call__
        ret = func(*args)
    
      File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 585, in generator_py_func
        values = next(generator_state.get_iterator(iterator_id))
    
      File "C:\ProgramData\Anaconda3\lib\site-packages\headliner\trainer.py", line 264, in <genexpr>
        data_vectorized = (vectorizer(d) for d in data_preprocessed)
    
      File "C:\ProgramData\Anaconda3\lib\site-packages\headliner\preprocessing\vectorizer.py", line 42, in __call__
        if self.max_input_len is not None:
    
    AttributeError: 'Vectorizer' object has no attribute 'max_input_len'
    
    
    	 [[{{node PyFunc}}]] [Op:IteratorGetNextSync]
    
    
    opened by sainimohit23 2
  • Import library

    Hi,

    Thank you so much for sharing this library. It seems to be very easy to use compared to others.

    I try to import the library to reproduce your example but I receive an error message:

    File "/home/ubuntu/.local/lib/python3.5/site-packages/headliner/model/summarizer.py", line 14 self.vectorizer: Union[Vectorizer, None] = None ^ SyntaxError: invalid syntax

    I am running Python 3.5.2.

    Am I doing something wrong?

    Thanks

    opened by YahyaL 2
  • Training with custom AmazonFoodReview Dataset for TextSummarization

    Hi, first of all thanks for bringing such an easy-to-use sequence-to-sequence NN to open source. I was planning to use Headliner, and for testing I started training a summarization model with custom AmazonFoodReviews data. But it ended up with a loss of "4.662851969401042".

    I was using TransformerSummarizer to train the custom model and the code is here below.

    summarizer = TransformerSummarizer(num_heads=1, embedding_size=64, max_prediction_len=20)
    trainer = Trainer(batch_size=2, steps_per_epoch=100)
    trainer.train(summarizer, training_data, num_epochs=100)
    summarizer.save('/tmp/summarizer')

    Training data was in form of tuple (only 10 samples to print here) : [('have bought several of the vitality canned dog food products and have found them all to be of good quality the product looks more like stew than processed meat and it smells better my labrador is finicky and she appreciates this product better than most', 'good quality dog food'), ('product arrived labeled as jumbo salted peanuts the peanuts were actually small sized unsalted not sure if this was an error or if the vendor intended to represent the product as jumbo', 'not as advertised'), ('this is confection that has been around few centuries it is light pillowy citrus gelatin with nuts in this case filberts and it is cut into tiny squares and then liberally coated with powdered sugar and it is tiny mouthful of heaven not too chewy and very flavorful highly recommend this yummy treat if you are familiar with the story of lewis the lion the witch and the wardrobe this is the treat that seduces edmund into selling out his brother and sisters to the witch', 'delight says it all'), ('if you are looking for the secret ingredient in robitussin believe have found it got this in addition to the root beer extract ordered and made some cherry soda the flavor is very medicinal', 'cough medicine'), ('great taffy at great price there was wide assortment of yummy taffy delivery was very quick if your taffy lover this is deal', 'great taffy'), ('got wild hair for taffy and ordered this five pound bag the taffy was all very enjoyable with many flavors watermelon root beer melon peppermint grape etc my only complaint is there was bit too much red black licorice flavored pieces between me my kids and my husband this lasted only two weeks would recommend this brand of taffy it was delightful treat', 'nice taffy'), ('this saltwater taffy had great flavors and was very soft and chewy each candy was individually wrapped well none of the candies were stuck together which did happen in the expensive version fralinger would highly recommend this candy served it at beach themed party and everyone loved it', 'great just as good as the expensive brands'), ('this taffy is so good it is very soft and chewy the flavors are amazing would definitely recommend you buying it very satisfying', 'wonderful tasty taffy'), ('right now am mostly just sprouting this so my cats can eat the grass they love it rotate it around with wheatgrass and rye too', 'yay barley'), ('this is very healthy dog food good for their digestion also good for small puppies my dog eats her required amount at every feeding', 'healthy dog food')]

    vocab encoder: 18122, vocab decoder: 4439

    But the prediction failed badly.

    Then I gave BertSummarizer a try on the same dataset, and it gives the error: TypeError: `generator` yielded an element that did not match the expected structure. The expected structure was (tf.int32, tf.int32, tf.int32), but the yielded element was ([3, 7293, 1725, 14131, 10785, 16089, 17337, 2220, 4703, 6185, 12287, 574, 7293, 6281, 16104, 414, 16352, 1242, 10785, 6793, 12569, 16089, 12274, 9204, 10148, 9029, 15200, 16066, 12257, 9667, 574, 8317, 14571, 1408, 10321, 8751, 8290, 5960, 574, 14182, 736, 16193, 12274, 1408, 16066, 10171, 2], [3, 1655, 3098, 1136, 1500, 2]).

    I would be very thankful if you help me out.

    Thanx.

    opened by ShoubhikBanerjee 1
  • Colab hosted notebooks for quick demo.

    It is not an issue. It is more like a suggestion.

    Is it possible to have Google Colab hosted notebooks like this in the documentation to play around with the code and run quick demos?

    opened by sainimohit23 1
  • Enable retraining with non-trainable embedding.

    Currently, if a model is retrained, the embedding is switched to trainable=True. Use an additional flag for trainable embeddings that is restored when loading the model.

    opened by cschaefer26 0
  • Can I bypass tokenizer and predict seq2seq directly?

    I am trying to use the Transformer to predict a sequence from another sequence, but I fail to see how I can bypass the tokenizer so that I can send my sequence directly to the Transformer.

    opened by zhangwangwz 0
  • does headliner support beam search during decoding?

    Hello, I want to know whether Headliner supports beam search during prediction. If it does, how can I use it? The documentation doesn't mention it, and when I checked the source code it seems you use greedy search during decoding. Many thanks.

    opened by zhangshouleibupt 2