Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Overview

This is a fork of Fairseq(-py) with implementations of the following models:

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

An NMT model with two-dimensional convolutions that jointly encode the source and target sequences.

Pervasive Attention also provides an extensive decoding grid that we leverage to efficiently train wait-k models.

See README.
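As a rough illustration of what jointly encoding the two sequences means, the sketch below builds the kind of 2D input grid described in the Pervasive Attention paper: one cell per (target position, source position) pair, holding the concatenation of the two token embeddings. The function name `build_joint_grid` and the use of plain Python lists are assumptions made for illustration; the repository works on PyTorch tensors and its actual code is not reproduced here.

```python
def build_joint_grid(src_emb, tgt_emb):
    """Return grid[t][s]: the target embedding at t concatenated with the source embedding at s."""
    return [[list(t_vec) + list(s_vec) for s_vec in src_emb] for t_vec in tgt_emb]


# Toy embeddings: 3 source tokens of size 2, 2 target tokens of size 1.
src = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
tgt = [[1.0], [2.0]]

grid = build_joint_grid(src, tgt)
# Grid dimensions: target length x source length x (target dim + source dim)
print(len(grid), len(grid[0]), len(grid[0][0]))  # prints: 2 3 3
```

In the paper, 2D convolutions are then applied over this grid, masked along the target axis so that a position cannot see later target tokens.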

Efficient Wait-k Models for Simultaneous Machine Translation

Transformer Wait-k models (Ma et al., 2019) with unidirectional encoders and with joint training of multiple wait-k paths.

See README.
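For readers new to the wait-k policy of Ma et al. (2019): the decoder first reads k source tokens, then alternates between writing one target token and reading one more source token. A minimal sketch of the resulting schedule follows; the helper name `waitk_context_sizes` is hypothetical and not part of this repository.

```python
def waitk_context_sizes(k, src_len, tgt_len):
    """Number of source tokens visible when emitting target token t (1-indexed).

    Follows the wait-k schedule g(t) = min(k + t - 1, src_len).
    """
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


# With k=3 and a 5-token source, the full source is visible from the third target token on.
print(waitk_context_sizes(3, 5, 6))  # prints: [3, 4, 5, 5, 5, 5]
```

Each value of k defines one such path through the source-target grid, which is the sense in which several wait-k paths can be trained jointly.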

Fairseq Requirements and Installation

  • PyTorch version >= 1.4.0
  • Python version >= 3.6
  • For training new models, you'll also need an NVIDIA GPU and NCCL

Installing Fairseq

git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

For Pervasive Attention, please cite:

@InProceedings{elbayad18conll,
    author ="Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob",
    title = "Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction",
    booktitle = "Proceedings of the 22nd Conference on Computational Natural Language Learning",
    year = "2018",
 }

For our wait-k models, please cite:

@article{elbayad20waitk,
    title={Efficient Wait-k Models for Simultaneous Machine Translation},
    author={Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob},
    journal={arXiv preprint arXiv:2005.08595},
    year={2020}
}

For Fairseq, please cite:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
Comments
  • The training speed of pervasive attention model


    Hello,

    I am trying to run the pervasive attention model recipes as recommended in the README: https://github.com/elbayadm/attn2d/blob/master/examples/pervasive/README.md

    In my runs the model is very slow, at roughly 30 words per second on a single 1080 GPU. Could you please give me a rough idea of what speed to expect, based on your experiments?

    Thanks for your insights, Parnia

    opened by papar22 1
  • AttributeError: 'float' object has no attribute 'data' on pytorch 0.4.1


    I get the following error at trainer.backward_step() when running the demo script:

        File "attn2d/nmt/trainer.py", line 230, in backward_step
            self.clip_norm).data.item()
        AttributeError: 'float' object has no attribute 'data'

    I fixed it by changing the line to:

        grad_norm = torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_norm)

    The script then runs OK with torch==0.4.1.

    Which version of PyTorch was the code written for?
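The error suggests that the gradient-norm call returns a plain float on this PyTorch version, while the trainer expects a tensor-like object. One version-tolerant way to read the value, sketched below, is to accept both. The helper `as_python_float` and the `FakeTensor` stand-in are hypothetical and exist only so the example runs without PyTorch; they are not part of the repository.

```python
def as_python_float(value):
    """Return value as a plain float, whether it is a number or a tensor-like object."""
    item = getattr(value, "item", None)
    if callable(item):
        # Tensor-like objects expose .item() to extract a Python scalar.
        return float(item())
    return float(value)


class FakeTensor:
    """Stand-in for a zero-dimensional tensor (assumption, for illustration only)."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


print(as_python_float(2.5))              # plain float, as returned on 0.4.1 per the report
print(as_python_float(FakeTensor(2.5)))  # tensor-like return value
```

In the trainer, the result of the clipping call could be passed through such a helper instead of calling `.data.item()` on it directly.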

    opened by adriangrepo 1
  • ModuleNotFoundError: No module named 'nmt.loader.pair_loader'


    Hi, as the title says, I got this error:

    Traceback (most recent call last):
      File "/home//repos/attn2d/train.py", line 84, in <module>
        train(params)
      File "/home//repos/attn2d/train.py", line 20, in train
        from nmt.trainer import Trainer
      File "/home//repos/attn2d/nmt/trainer.py", line 17, in <module>
        from nmt.loader.pair_loader import DataPair
    ModuleNotFoundError: No module named 'nmt.loader.pair_loader'
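When debugging an import failure like this one, it can help to check which part of the dotted path is missing. The sketch below uses only the standard library; the helper name `module_available` is hypothetical, and the module names come from the traceback above.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if dotted_name can be located on the current import path."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package in the dotted path cannot be imported.
        return False


# Walk down the path reported in the traceback to see where it breaks.
for name in ("nmt", "nmt.loader", "nmt.loader.pair_loader"):
    print(name, module_available(name))
```

Run from the repository root, the first name that prints False shows whether a whole package or only the `pair_loader` file is missing. Note that `find_spec` imports parent packages as a side effect.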

    opened by koszilard 1
  • 16 undefined names


    • Replace ‘false’ with ‘False’
    • Missing import tensorflow as tf
    • Missing import numpy as np
    • See #2

    flake8 testing of https://github.com/elbayadm/attn2d on Python 3.7.0

    $ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

    ./nmt/scheduler.py:53:44: F821 undefined name 'exp'
                                               exp(self._iter / self.speed))
                                               ^
    ./nmt/optimizer.py:202:16: F821 undefined name 'nn'
            return nn.utils.clip_grad_norm_(params, self.grad_norm_max, self.grad_norm_type)
                   ^
    ./nmt/models/pooling.py:145:15: F821 undefined name 'PositionalPooling4'
            super(PositionalPooling4, self).__init__()
                  ^
    ./nmt/models/convs2s2D.py:523:20: F821 undefined name 'trg_emb'
                    if trg_emb.size(1) > max_h:
                       ^
    ./nmt/models/convs2s2D.py:524:31: F821 undefined name 'trg_emb'
                        trg_emb = trg_emb[:, -max_h:, :, :]
                                  ^
    ./nmt/models/convs2s2D.py:624:41: F821 undefined name 'trg_labels'
                    trg_labels = torch.cat((trg_labels, trg_labels_t),
                                            ^
    ./nmt/utils/logging.py:17:23: F821 undefined name 'DETOK'
        source = " ".join(DETOK.detokenize(source.split())).encode('utf-8')
                          ^
    ./nmt/utils/logging.py:18:19: F821 undefined name 'DETOK'
        gt = " ".join(DETOK.detokenize(gt.split())).encode('utf-8')
                      ^
    ./nmt/utils/logging.py:19:21: F821 undefined name 'DETOK'
        pred = " ".join(DETOK.detokenize(pred.split())).encode('utf-8')
                        ^
    ./nmt/utils/logging.py:129:16: F821 undefined name 'tf'
        _summary = tf.summary.scalar(name=key,
                   ^
    ./nmt/utils/logging.py:130:41: F821 undefined name 'tf'
                                     tensor=tf.Variable(value),
                                            ^
    ./nmt/utils/logging.py:132:15: F821 undefined name 'tf'
        summary = tf.Summary(value=[tf.Summary.Value(tag=key, simple_value=value)])
                  ^
    ./nmt/utils/logging.py:132:33: F821 undefined name 'tf'
        summary = tf.Summary(value=[tf.Summary.Value(tag=key, simple_value=value)])
                                    ^
    ./nmt/loss/working_loss.py:565:31: F821 undefined name 'false'
                p.requires_grad = false
                                  ^
    ./nmt/loss/_loss.py:402:31: F821 undefined name 'false'
                p.requires_grad = false
                                  ^
    ./nmt/loss/samplers/ngram.py:72:52: F821 undefined name 'score'
            return preds_matrix, np.ones(batch_size) * score, stats
                                                       ^
    16    F821 undefined name 'false'
    16
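Undefined names such as these only fail when the offending line actually executes, which is why they can go unnoticed until a rarely used code path runs. The sketch below reproduces the `false` case from the report with a hypothetical `Param` class standing in for a model parameter.

```python
class Param:
    """Minimal stand-in for a model parameter (assumption, for illustration only)."""

    def __init__(self):
        self.requires_grad = True


def freeze_buggy(params):
    for p in params:
        p.requires_grad = false  # noqa: F821 (undefined name, as flagged by flake8)


def freeze_fixed(params):
    for p in params:
        p.requires_grad = False  # Python's boolean literal is capitalized


params = [Param(), Param()]

try:
    freeze_buggy(params)
except NameError as err:
    # The module imports fine; the error appears only when the function is called.
    print("buggy version fails at call time:", err)

freeze_fixed(params)
print(all(not p.requires_grad for p in params))
```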
    
    opened by cclauss 1
  • Is this a typo?


    In attn2d/nmt/models/pooling.py, line 145:

    class PositionalPooling(nn.Module):
        def __init__(self, max_length, emb_size):
            super(PositionalPooling4, self).__init__()
            self.src_embedding = nn.Embedding(max_length, emb_size)
            self.trg_embedding = nn.Embedding(max_length, emb_size)
            self.src_embedding.weight.data.fill_(1)
            self.trg_embedding.weight.data.fill_(1)
            self.src_embedding.bias.data.fill_(0)
            self.trg_embedding.bias.data.fill_(0)
    

    Should PositionalPooling4 be PositionalPooling?
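In Python 3, the zero-argument form of `super()` avoids this kind of mismatch, because the class name is never repeated inside the method. A minimal sketch follows, with a hypothetical `Module` base class standing in for `nn.Module` so that it runs without PyTorch.

```python
class Module:
    """Stand-in for a framework base class (assumption, for illustration only)."""

    def __init__(self):
        self.initialized = True


class PositionalPooling(Module):
    def __init__(self, max_length, emb_size):
        # No class name here, so renaming or copying the class cannot break this call.
        super().__init__()
        self.max_length = max_length
        self.emb_size = emb_size


pool = PositionalPooling(max_length=100, emb_size=16)
print(pool.initialized, pool.max_length, pool.emb_size)  # prints: True 100 16
```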

    opened by elect000 1
  • Training/Eval Error for waitk model


    🐛 Bug

    I am trying to run the training code by following the wait-k guide. I fixed some bugs as @ereday described in this issue, but I still get an error when I run the training command:

    RuntimeError: Output 0 of SplitBackward0 is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

    To Reproduce

    Steps to reproduce the behavior (always include the command you ran):

    k=7
    MODEL=tf_wait${k}_wmt14Ende
    CUDA_VISIBLE_DEVICES=0 python train.py $DATA_BIN -s en -t de --left-pad-source False \
        --user-dir examples/waitk --arch waitk_transformer_small \
        --save-dir $Workdir/checkpoints/$MODEL --tensorboard-logdir $Workdir/logs/$MODEL \
        --seed 1 --no-epoch-checkpoints --no-progress-bar --log-interval 10  \
        --optimizer adam --adam-betas '(0.9, 0.98)' --weight-decay 0.0001 \
        --max-tokens 4000 --update-freq 2 --max-update 50000 \
        --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr '1e-07' --lr 0.002 \
        --min-lr '1e-9' --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
        --share-decoder-input-output-embed --waitk  $k
    
    1. See error
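For reference, the learning-rate flags in the command above describe a linear warmup followed by inverse-square-root decay. The sketch below follows the behaviour documented for fairseq's `inverse_sqrt` scheduler, with defaults taken from the command; the function itself is a hypothetical re-implementation, not fairseq code, and it ignores the `--min-lr` stopping criterion.

```python
import math


def inverse_sqrt_lr(num_updates, lr=0.002, warmup_updates=4000, warmup_init_lr=1e-7):
    """Learning rate after num_updates optimizer steps."""
    if num_updates < warmup_updates:
        # Linear warmup from warmup_init_lr up to lr.
        step = (lr - warmup_init_lr) / warmup_updates
        return warmup_init_lr + num_updates * step
    # After warmup, decay proportionally to 1 / sqrt(num_updates).
    return lr * math.sqrt(warmup_updates / num_updates)


for n in (0, 2000, 4000, 16000, 50000):
    print(n, inverse_sqrt_lr(n))
```

With these settings the rate peaks at 0.002 after 4000 updates and has halved by update 16000.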


    Additional context

    This repo seems out of date, and the issue raised half a year ago has still not received a reply.

    bug 
    opened by EricLina 0
  • ModuleNotFoundError: No module named 'examples.simultaneous'


    🐛 Bug

    Hi,

    I was trying to evaluate the pre-trained models under "Efficient Wait-k Models for Simultaneous Machine Translation", following the instructions given in the README. Specifically, I did the following:

    After downloading the model and data and placing them under pre_saved:

    cd ~/attn2d/pre_saved
    tar xzf iwslt14_de_en.tar.gz
    tar xzf tf_waitk_model.tar.gz
    
    
    k=5 # Evaluation time k
    output=wait$k.log
    CUDA_VISIBLE_DEVICES=0 python generate.py pre_saved/iwslt14_deen_bpe10k_binaries/ -s de -t en --gen-subset test --path pre_saved/tf_waitk_model.tar.gz --task waitk_translation --eval-waitk $k --model-overrides "{'max_source_positions': 1024, 'max_target_positions': 1024}" --left-pad-source False --user-dir examples/waitk --no-progress-bar --max-tokens 8000 --remove-bpe --beam 1 2>&1 | tee -a $output
    

    This produces the following error message:

    Traceback (most recent call last):
      File "generate.py", line 11, in <module>
        cli_main()
      File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
        parser = options.get_generation_parser()
      File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
        parser = get_parser("Generation", default_task)
      File "/home/attn2d/fairseq/options.py", line 197, in get_parser
        utils.import_user_module(usr_args)
      File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
        importlib.import_module(module_name)
      File "/home/anaconda3/envs/py37/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
        from . import models, tasks
      File "/home/attn2d/examples/waitk/models/__init__.py", line 7, in <module>
        importlib.import_module('examples.simultaneous.models.' + model_name)
      File "/home/anaconda3/envs/py37/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
    ModuleNotFoundError: No module named 'examples.simultaneous'
    

    EDIT

    Okay, here is more detail about this:

    I believe this line is responsible for the error message shared above.

    I changed importlib.import_module('examples.simultaneous.models.' + model_name) to importlib.import_module('examples.waitk.models.' + model_name).
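A hard-coded package prefix like this breaks whenever the directory is renamed. One possible alternative, sketched below, is to build the dotted path from the package's own name. The helper `import_submodules` is hypothetical, and the standard-library `json` package is used only so the example runs anywhere.

```python
import importlib


def import_submodules(package_name, submodule_names):
    """Import each named submodule of package_name and return the module objects."""
    return [
        importlib.import_module(package_name + "." + name)
        for name in submodule_names
    ]


# Demonstration with a standard-library package.
modules = import_submodules("json", ["decoder", "encoder"])
print([m.__name__ for m in modules])  # prints: ['json.decoder', 'json.encoder']
```

Inside a package's `__init__.py`, passing `__name__` as `package_name` would keep the import path in sync with the directory name; whether that fits this repository's layout is an assumption.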

    Then, I got another error:

      File "generate.py", line 11, in <module>
        cli_main()
      File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
        parser = options.get_generation_parser()
      File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
        parser = get_parser("Generation", default_task)
      File "/home/attn2d/fairseq/options.py", line 197, in get_parser
        utils.import_user_module(usr_args)
      File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
        importlib.import_module(module_name)
      File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
        from . import models, tasks
      File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
        importlib.import_module('examples.waitk.models.' + model_name)
      File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
        from . import models, tasks
      File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
        importlib.import_module('examples.waitk.models.' + model_name)
      File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "/home/attn2d/examples/waitk/models/waitk_transformer.py", line 24, in <module>
        from examples.simultaneous.modules import TransformerEncoderLayer, TransformerDecoderLayer
    
    

    So I changed this line to from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer as well. When I tried once more, I got the following error:

      File "generate.py", line 11, in <module>
        cli_main()
      File "/home/attn2d/fairseq_cli/generate.py", line 276, in cli_main
        parser = options.get_generation_parser()
      File "/home/attn2d/fairseq/options.py", line 33, in get_generation_parser
        parser = get_parser("Generation", default_task)
      File "/home/attn2d/fairseq/options.py", line 197, in get_parser
        utils.import_user_module(usr_args)
      File "/home/attn2d/fairseq/utils.py", line 350, in import_user_module
        importlib.import_module(module_name)
      File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
        from . import models, tasks
      File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
        importlib.import_module('examples.waitk.models.' + model_name)
      File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "/home/attn2d/examples/waitk/__init__.py", line 1, in <module>
        from . import models, tasks
      File "/home/attn2d/examples/waitk/models/__init__.py", line 8, in <module>
        importlib.import_module('examples.waitk.models.' + model_name)
      File "/home/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "/home/attn2d/examples/waitk/models/waitk_transformer.py", line 25, in <module>
        from examples.waitk.modules import TransformerEncoderLayer, TransformerDecoderLayer
      File "/home/attn2d/examples/waitk/modules/__init__.py", line 2, in <module>
        from .controller import Controller
    

    To fix it, I commented out the following lines in examples/waitk/modules/__init__.py:

    from .controller import Controller
    from .branch_controller import BranchController
    from .oracle import SimulTransOracleDP, SimulTransOracleDP1
    

    Next, I tried the generation command given in the README once more:

    CUDA_VISIBLE_DEVICES=0 python generate.py pretrained-sources/iwslt14_deen_bpe10k_binaries/ -s de -t en --gen-subset test --path pretrained-sources/model.pt --task waitk_translation --eval-waitk $k --model-overrides "{'max_source_positions': 1024, 'max_target_positions': 1024}" --left-pad-source False --user-dir examples/waitk --no-progress-bar --max-tokens 8000 --remove-bpe --beam 1 2>&1 | tee -a $output

    I got this error:

    
    2021-09-20 20:29:46 | INFO | fairseq_cli.generate | Namespace(all_gather_list_size=16384, beam=1, bpe=None, checkpoint_suffix='', cpu=False, criterion='cross_entropy', data='pretrained-sources/iwslt14_deen_bpe10k_binaries/', data_buffer_size=0, dataset_impl=None, decoding_format=None, diverse_beam_groups=-1, diverse_beam_strength=0.5, diversity_rate=-1.0, empty_cache_freq=0, eval_bleu=False, eval_bleu_args=None, eval_bleu_detok='space', eval_bleu_detok_args=None, eval_bleu_print_samples=False, eval_bleu_remove_bpe=None, eval_tokenized_bleu=False, eval_waitk=5, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_no_flatten_grads=False, fp16_scale_tolerance=0.0, fp16_scale_window=None, gen_subset='test', iter_decode_eos_penalty=0.0, iter_decode_force_max_iter=False, iter_decode_max_iter=10, iter_decode_with_beam=1, iter_decode_with_external_reranker=False, left_pad_source='False', left_pad_target='False', lenpen=1, load_alignments=False, log_format=None, log_interval=100, lr_scheduler='fixed', lr_shrink=0.1, match_source_len=False, max_len_a=0, max_len_b=200, max_sentences=None, max_source_positions=1024, max_target_positions=1024, max_tokens=8000, memory_efficient_fp16=False, min_len=1, min_loss_scale=0.0001, model_overrides="{'max_source_positions': 1024, 'max_target_positions': 1024}", model_parallel_size=1, momentum=0.99, nbest=1, no_beamable_mm=False, no_early_stop=False, no_progress_bar=True, no_repeat_ngram_size=0, num_shards=1, num_workers=1, optimizer='nag', path='pretrained-sources/model.pt', prefix_size=0, print_alignment=False, print_step=False, quantization_config_path=None, quiet=False, remove_bpe='@@ ', replace_unk=None, required_batch_size_multiple=8, results_path=None, retain_iter_history=False, sacrebleu=False, sampling=False, sampling_topk=-1, sampling_topp=-1.0, score_reference=False, seed=1, shard_id=0, skip_invalid_size_inputs_valid_test=False, source_lang='de', target_lang='en', task='waitk_translation', temperature=1.0, 
tensorboard_logdir='', threshold_loss_scale=None, tokenizer=None, truncate_source=False, unkpen=0, unnormalized=False, upsample_primary=1, user_dir='examples/waitk', warmup_updates=0, weight_decay=0.0)
    2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | [de] dictionary: 8848 types
    2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | [en] dictionary: 6632 types
    2021-09-20 20:29:46 | INFO | fairseq.data.data_utils | loaded 6750 examples from: pretrained-sources/iwslt14_deen_bpe10k_binaries/test.de-en.de
    2021-09-20 20:29:46 | INFO | fairseq.data.data_utils | loaded 6750 examples from: pretrained-sources/iwslt14_deen_bpe10k_binaries/test.de-en.en
    2021-09-20 20:29:46 | INFO | fairseq.tasks.translation | pretrained-sources/iwslt14_deen_bpe10k_binaries/ test de-en 6750 examples
    2021-09-20 20:29:46 | INFO | fairseq_cli.generate | loading model(s) from pretrained-sources/model.pt
    Traceback (most recent call last):
      File "generate.py", line 11, in <module>
        cli_main()
      File "/home/attn2d/fairseq_cli/generate.py", line 278, in cli_main
        main(args)
      File "/home/attn2d/fairseq_cli/generate.py", line 36, in main
        return _main(args, sys.stdout)
      File "/home/attn2d/fairseq_cli/generate.py", line 103, in _main
        num_workers=args.num_workers,
      File "/home/attn2d/fairseq/tasks/fairseq_task.py", line 181, in get_batch_iterator
        required_batch_size_multiple=required_batch_size_multiple,
      File "/home/attn2d/fairseq/data/data_utils.py", line 220, in batch_by_size
        from fairseq.data.data_utils_fast import batch_by_size_fast
      File "fairseq/data/data_utils_fast.pyx", line 1, in init fairseq.data.data_utils_fast
        # cython: language_level=3
    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
    ##########
    

    I gave up after that. @elbayadm, I hope you can help me with this.

    Code sample

    Environment

    I followed the instructions in the README to set up my environment:

    git clone https://github.com/elbayadm/attn2d
    cd attn2d
    pip install --editable .
    
    

    As a result, I have the following libraries in my environment:

    Python 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) 
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    >>> torch.__version__
    '1.9.0+cu102'
    >>> import fairseq
    >>> fairseq.__version__
    '0.9.0'
    >>> 
    $ python --version
    Python 3.7.10
    

    Operating system: Linux

    bug 
    opened by ereday 2
Owner
Maha
PhD student (LIG & INRIA)