Open Source Neural Machine Translation in PyTorch

Overview

OpenNMT-py: Open-Source Neural Machine Translation

OpenNMT-py is the PyTorch version of the OpenNMT project, an open-source (MIT) neural machine translation framework. It is designed to be research-friendly, making it easy to try out new ideas in translation, summarization, morphology, and many other domains. Some companies have proven the code to be production-ready.

We love contributions! Please look at issues marked with the contributions welcome tag.

Before raising an issue, make sure you read the requirements and the documentation examples.

Unless there is a bug, please use the forum or Gitter to ask questions.


Announcement - OpenNMT-py 2.0

We're happy to announce the upcoming release v2.0 of OpenNMT-py.

The major idea behind this release is the almost complete makeover of the data loading pipeline. A new 'dynamic' paradigm is introduced, in which transforms are applied to the data on the fly.

This has a few advantages, including:

  • removing or drastically reducing the preprocessing required to train a model;
  • increasing the possibilities of data augmentation and manipulation through on-the-fly transforms.

These transforms can be specific tokenization methods, filters, noising, or any custom transform users may want to implement. Custom transform implementation is quite straightforward thanks to the existing base class and example implementations.
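
A custom transform along these lines might look like the following minimal, stand-alone sketch. It does not import OpenNMT-py: the `Transform` base class, the `LengthFilter` and `Lowercase` classes, and the `apply_pipeline` helper are hypothetical names invented for illustration, and the library's real base class may have a different interface.

```python
from typing import Dict, Iterable, Iterator, List, Optional

Example = Dict[str, List[str]]


class Transform:
    """Hypothetical stand-in for a transform base class (not the OpenNMT-py one)."""

    def apply(self, example: Example, is_train: bool = False) -> Optional[Example]:
        raise NotImplementedError


class LengthFilter(Transform):
    """Discard examples whose source or target has more than max_len tokens."""

    def __init__(self, max_len: int) -> None:
        self.max_len = max_len

    def apply(self, example: Example, is_train: bool = False) -> Optional[Example]:
        if len(example["src"]) > self.max_len or len(example["tgt"]) > self.max_len:
            return None  # None signals "drop this example"
        return example


class Lowercase(Transform):
    """Lowercase every token on both sides."""

    def apply(self, example: Example, is_train: bool = False) -> Optional[Example]:
        return {side: [tok.lower() for tok in toks] for side, toks in example.items()}


def apply_pipeline(
    transforms: List[Transform], examples: Iterable[Example], is_train: bool = True
) -> Iterator[Example]:
    """Apply the transforms in order, lazily, skipping filtered examples."""
    for example in examples:
        for transform in transforms:
            example = transform.apply(example, is_train=is_train)
            if example is None:
                break
        else:
            yield example
```

Because `apply_pipeline` is a generator, examples are transformed only when the consumer asks for them, which is the sense in which the work happens on the fly rather than in a separate preprocessing step.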

You can check out how to use this new data loading pipeline in the updated docs.

All the readily available transforms are described here.

Performance

Given CPU resources in proportion to the GPU computing power, most of the transforms should not slow the training down. (Note: for now, one producer process is spawned per GPU, meaning you would ideally need 2N CPU threads for N GPUs.)

Breaking changes

For now, the new data loading paradigm drops a few features:

  • audio, image and video inputs;
  • source word features.

For users who still need these features, the previous codebase will be retained as legacy in a separate branch. It will no longer receive extensive development from the core team, but PRs may still be accepted.

Feel free to check it out and let us know what you think of the new paradigm!


Setup

OpenNMT-py requires:

  • Python >= 3.6
  • PyTorch == 1.6.0

Install OpenNMT-py from pip:

pip install OpenNMT-py

or from the sources:

git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py
pip install -e .

Note: if you encounter a MemoryError during installation, try to use pip with --no-cache-dir.

(Optional) Some advanced features (e.g. working pretrained models or specific transforms) require extra packages, which you can install with:

pip install -r requirements.opt.txt

Quickstart

Step 1: Prepare the data

To get started, we suggest downloading a toy English-German dataset for machine translation containing 10k tokenized sentences:

wget https://s3.amazonaws.com/opennmt-trainingdata/toy-ende.tar.gz
tar xf toy-ende.tar.gz
# Stay in the current directory: the paths used below are relative to it (toy-ende/...).

The data consists of parallel source (src) and target (tgt) data containing one sentence per line with tokens separated by a space:

  • src-train.txt
  • tgt-train.txt
  • src-val.txt
  • tgt-val.txt
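
As an illustration of this format, the sketch below reads a source/target pair lazily and checks that both files have the same number of lines. The helper names (`read_parallel`, `count_lines`, `check_parallel`) are hypothetical and not part of OpenNMT-py.

```python
from typing import Iterator, List, Tuple


def read_parallel(src_path: str, tgt_path: str) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (source_tokens, target_tokens) pairs, one pair per line."""
    with open(src_path, encoding="utf-8") as src, open(tgt_path, encoding="utf-8") as tgt:
        for src_line, tgt_line in zip(src, tgt):
            # Tokens are separated by whitespace, so a plain split is enough.
            yield src_line.split(), tgt_line.split()


def count_lines(path: str) -> int:
    """Count lines without loading the whole file into memory."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_parallel(src_path: str, tgt_path: str) -> int:
    """Return the number of sentence pairs, or raise if the files are misaligned."""
    n_src, n_tgt = count_lines(src_path), count_lines(tgt_path)
    if n_src != n_tgt:
        raise ValueError(f"{src_path} has {n_src} lines but {tgt_path} has {n_tgt}")
    return n_src
```

For example, `check_parallel("toy-ende/src-train.txt", "toy-ende/tgt-train.txt")` would return the number of training pairs if the two files line up.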

Validation files are used to evaluate the convergence of the training. They usually contain no more than 5k sentences.

$ head -n 3 toy-ende/src-train.txt
It is not acceptable that , with the help of the national bureaucracies , Parliament 's legislative prerogative should be made null and void by means of implementing provisions whose content , purpose and extent are not laid down in advance .
Federal Master Trainer and Senior Instructor of the Italian Federation of Aerobic Fitness , Group Fitness , Postural Gym , Stretching and Pilates; from 2004 , he has been collaborating with Antiche Terme as personal Trainer and Instructor of Stretching , Pilates and Postural Gym .
" Two soldiers came up to me and told me that if I refuse to sleep with them , they will kill me . They beat me and ripped my clothes .

We need to build a YAML configuration file to specify the data that will be used:

# toy_en_de.yaml

## Where the samples will be written
save_data: toy-ende/run/example
## Where the vocab(s) will be written
src_vocab: toy-ende/run/example.vocab.src
tgt_vocab: toy-ende/run/example.vocab.tgt
# Prevent overwriting existing files in the folder
overwrite: False

# Corpus opts:
data:
    corpus_1:
        path_src: toy-ende/src-train.txt
        path_tgt: toy-ende/tgt-train.txt
    valid:
        path_src: toy-ende/src-val.txt
        path_tgt: toy-ende/tgt-val.txt
...

From this configuration, we can build the vocab(s) that will be necessary to train the model:

onmt_build_vocab -config toy_en_de.yaml -n_sample 10000

Notes:

  • -n_sample is required here -- it represents the number of lines sampled from each corpus to build the vocab.
  • This configuration is the simplest possible, without any tokenization or other transforms. See other example configurations for more complex pipelines.
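
Conceptually, building a vocabulary from a sample amounts to counting tokens over a limited number of lines. The sketch below is an illustration only, not the `onmt_build_vocab` implementation: `count_tokens` is a hypothetical helper, and taking the first `n_sample` lines (rather than some other sampling scheme) is an assumption.

```python
from collections import Counter
from itertools import islice
from typing import Iterable


def count_tokens(lines: Iterable[str], n_sample: int) -> Counter:
    """Count whitespace-separated tokens over at most n_sample lines."""
    counter: Counter = Counter()
    for line in islice(lines, n_sample):
        counter.update(line.split())
    return counter
```

Passing an open file object as `lines` keeps memory use low, since only one line is held at a time; `counter.most_common()` then gives the tokens ordered by frequency.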

Step 2: Train the model

To train a model, we need to add the following to the YAML configuration file:

  • the vocabulary path(s) to use, which can be those generated by onmt_build_vocab;
  • training-specific parameters.

# toy_en_de.yaml

...

# Vocabulary files that were just created
src_vocab: toy-ende/run/example.vocab.src
tgt_vocab: toy-ende/run/example.vocab.tgt

# Train on a single GPU
world_size: 1
gpu_ranks: [0]

# Where to save the checkpoints
save_model: toy-ende/run/model
save_checkpoint_steps: 500
train_steps: 1000
valid_steps: 500

Then you can simply run:

onmt_train -config toy_en_de.yaml

This configuration will run the default model, which consists of a 2-layer LSTM with 500 hidden units on both the encoder and decoder. It will run on a single GPU (world_size 1 & gpu_ranks [0]).
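
With the settings above, a checkpoint is requested every 500 steps over 1000 training steps. The hypothetical helper below lists the files one would expect, assuming the `<save_model>_step_<N>.pt` naming that the translation command in Step 3 uses; it is a sketch of the arithmetic, not code from OpenNMT-py.

```python
from typing import List


def expected_checkpoints(save_model: str, train_steps: int, save_checkpoint_steps: int) -> List[str]:
    """List checkpoint paths, assuming one file every save_checkpoint_steps steps."""
    steps = range(save_checkpoint_steps, train_steps + 1, save_checkpoint_steps)
    return [f"{save_model}_step_{step}.pt" for step in steps]
```

Calling `expected_checkpoints("toy-ende/run/model", 1000, 500)` gives the two paths ending in `model_step_500.pt` and `model_step_1000.pt`.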

Before the training process actually starts, the *.vocab.pt and *.transforms.pt files will be dumped to -save_data, using the configuration specified in the -config YAML file. Transformed samples are also generated to simplify any visual inspection that may be required. The number of sample lines to dump per corpus is set with the -n_sample flag.

For more advanced models and parameters, see other example configurations or the FAQ.

Step 3: Translate

onmt_translate -model toy-ende/run/model_step_1000.pt -src toy-ende/src-test.txt -output toy-ende/pred_1000.txt -gpu 0 -verbose

Now you have a model which you can use to predict on new data. We do this by running beam search. This will output predictions into toy-ende/pred_1000.txt.
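
For readers unfamiliar with beam search, here is a toy sketch of the idea. It is not OpenNMT-py's decoder: `beam_search` and `toy_model` are hypothetical, the "model" is a hand-written table of next-token probabilities, and refinements such as length normalization are left out.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"
Hypothesis = Tuple[Tuple[str, ...], float]


def beam_search(
    next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    beam_size: int,
    max_len: int,
) -> List[Hypothesis]:
    """Return finished hypotheses sorted by total log-probability, best first."""
    beams: List[Hypothesis] = [((), 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = [
            (prefix + (token,), score + log_p)
            for prefix, score in beams
            for token, log_p in next_log_probs(prefix).items()
        ]
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        # Keep only the beam_size best candidates; finished ones leave the beam.
        for hyp, score in candidates[:beam_size]:
            (finished if hyp[-1] == EOS else beams).append((hyp, score))
        if not beams:
            break
    return sorted(finished, key=lambda item: item[1], reverse=True)


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hand-written next-token log-probabilities, for illustration only."""
    table = {
        (): {"a": math.log(0.6), "b": math.log(0.4)},
        ("a",): {"x": math.log(0.5), "y": math.log(0.5)},
        ("b",): {EOS: math.log(0.9), "x": math.log(0.1)},
    }
    return table.get(prefix, {EOS: 0.0})
```

With this table, a beam of size 1 (greedy search) commits to "a" first and ends with "a x </s>" at probability 0.30, whereas a beam of size 2 also keeps "b" alive and finds "b </s>" at probability 0.36.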

Note:

The predictions are going to be quite terrible, as the demo dataset is small. Try running on some larger datasets! For example you can download millions of parallel sentences for translation or summarization.

(Optional) Step 4: Release

When you are satisfied with your trained model, you can release it for inference. The release process will remove training-only parameters from the checkpoint:

onmt_release_model -model toy-ende/run/model_step_1000.pt -output toy-ende/run/model_step_1000_release.pt

The release script can also export checkpoints to CTranslate2, a fast inference engine for Transformer models. See the -format command line option.
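
The idea of removing training-only parameters can be illustrated with a plain dictionary. This is a sketch under assumptions, not the `onmt_release_model` code: the checkpoint is modelled as a dict, and the key name `optim` is a hypothetical example of training-only state.

```python
from typing import Any, Dict, Tuple

# Hypothetical key names; a real checkpoint layout may differ.
TRAINING_ONLY_KEYS: Tuple[str, ...] = ("optim",)


def strip_training_state(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of the checkpoint without training-only entries."""
    return {key: value for key, value in checkpoint.items() if key not in TRAINING_ONLY_KEYS}
```

The original dictionary is left untouched, so the full checkpoint can still be used to resume training.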

Alternative: Run on FloydHub

Click this button to open a Workspace on FloydHub for training/testing your code.

Pretrained embeddings (e.g. GloVe)

Please see the FAQ: How to use GloVe pre-trained embeddings in OpenNMT-py

Pretrained models

Several pretrained models can be downloaded and used with onmt_translate:

http://opennmt.net/Models-py/

Acknowledgements

OpenNMT-py is run as a collaborative open-source project. The original code was written by Adam Lerer (NYC) to reproduce OpenNMT-Lua using PyTorch.

Major contributors are:

OpenNMT-py is part of the OpenNMT project.

Citation

If you are using OpenNMT-py for academic work, please cite the initial system demonstration paper published in ACL 2017:

@inproceedings{klein-etal-2017-opennmt,
    title = "{O}pen{NMT}: Open-Source Toolkit for Neural Machine Translation",
    author = "Klein, Guillaume  and
      Kim, Yoon  and
      Deng, Yuntian  and
      Senellart, Jean  and
      Rush, Alexander",
    booktitle = "Proceedings of {ACL} 2017, System Demonstrations",
    month = jul,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P17-4012",
    pages = "67--72",
}
Comments
  • Abstractive Summarization Results

    Hey guys, looking at recent pull requests and issues, it looks like a common interest of contributors (on top of NMT, obviously) is abstractive summarization.

    Any suggestions on how to train a model that gets results close to those of recent papers on the CNN/Daily Mail dataset? Any additional preprocessing?

    Thanks!

    type:performance type:question 
    opened by mataney 82
  • How to reproduce your results on WMT'14 ENDE datasets of "Attention is All You Need"?

    Hi, I want to reproduce the results on WMT'14 ENDE datasets of "Attention is All You Need" paper. I have read OpenNMT-FAQ and I want to know the exact details about your experiments:

    1. Did you use BPE or WordPiece?
    2. What's the exact BLEU score of your experiments on the WMT'14 ENDE dataset?
    3. Are there any other differences between the README's steps and the Transformer experiments? If so, please provide a complete tutorial to help us reproduce your results.

    Thank you very much! @srush

    opened by SkyAndCloud 47
  • Pytorch 0.4 support: python2/3 issue with detach()

    This occurs on systems with either cuda8/cudnn6 or cuda9/cudnn7, Ubuntu 16.04. Pytorch built from source.

    python3 train.py -data data/demo -save_model demo-model -gpuid 1

        main()
      File "train.py", line 299, in main
        train_model(model, train, valid, fields, optim)
      File "train.py", line 159, in train_model
        train_stats = trainer.train(epoch, report_func)
      File "/home/levinth/OpenNMT-py/onmt/Trainer.py", line 133, in train
        dec_state.detach()
      File "/home/levinth/OpenNMT-py/onmt/Models.py", line 444, in detach
        h.detach_()
    RuntimeError: Can't detach views in-place. Use detach() instead
    
    type:bug 
    opened by David-Levinthal 44
  • Report for Chinese Abstractive summarization performance

    This is the report for Chinese abstractive summarization performance. Discussion is welcome.

    Result

    LCSTS dataset
    ROUGE-1 / ROUGE-2 / ROUGE-L
    34.8 / 22.5 / 32.3

    Gigaword Chinese dataset
    ROUGE-1 / ROUGE-2 / ROUGE-L
    51.92 / 38.39 / 49.12
    

    Preprocessing script

    python3 preprocess.py \
          -train_src $ORIGIN_DIR/train.source \
          -train_tgt $ORIGIN_DIR/train.target \
          -valid_src $ORIGIN_DIR/valid.source \
          -valid_tgt $ORIGIN_DIR/valid.target \
          -src_vocab_size 8000 \
          -tgt_vocab_size 8000 \
          -src_seq_length 400 \
          -tgt_seq_length 30 \
          -src_seq_length_trunc 400 \
          -tgt_seq_length_trunc 100 \
          -max_shard_size 20000000 \
          -save_data $DATA_DIR/processed
    

    Training script

     python3 train.py \
          -data $DATA_DIR/processed \
          -word_vec_size 500 \
          -encoder_type brnn \
          -epochs 30 \
          -enc_layers 1 \
          -dec_layers 1 \
          -rnn_size 300 \
          -gpuid 0 \
          -save_model $MODEL_DIR/ \
          > $MODEL_DIR/log.txt
    

    Generating script

     python3 translate.py \
          -model $MODEL_DIR/$BEST_MODEL \
          -beam_size 5 \
          -verbose \
          -batch_size 1 \
          -tgt $GOLD \
          -output $MODEL_DIR/$PRED \
          -src $TEST
    
    opened by playma 34
  • Update checkpoint vocabulary

    Draft PR to add new vocabulary to existing checkpoint by reusing checkpoint's embeddings

    This is what the new training procedure would be:

    1. Run build_vocab as usual. This would generate the vocabulary files for the new corpora.
    2. Then in the training procedure:
    • Load the checkpoint (fields would be populated with the checkpoint’s vocab)
    • Build fields for the new vocabulary as usual from the vocabulary files generated in step 1.
    • Extend the checkpoint fields' vocabulary with the new vocabulary fields (only appending new words at the end, as stated in the Torch docs)
    • Assign the extended vocabulary to the new model’s fields
    • In build_base_model, after loading the checkpoint’s state_dicts, replace the new model’s embeddings from 0 to len(checkpoint.vocab_size) with the checkpoint’s embeddings. The order matches because, in the extended vocabulary, new words were appended at the end.
    • Remove the embeddings parameters from the checkpoint, as the new embeddings have already been added to the model
    • Continue as usual
    opened by anderleich 33
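
The core of the procedure described in this PR (append new words at the end, then reuse the checkpoint's embedding rows) can be sketched with plain Python lists. This is an illustration rather than the PR's code: `extend_vocab` and `extend_embeddings` are hypothetical names, and initialising the new rows with a constant is an assumption.

```python
from typing import List


def extend_vocab(old_vocab: List[str], new_tokens: List[str]) -> List[str]:
    """Append unseen tokens at the end so that existing indices stay unchanged."""
    seen = set(old_vocab)
    extended = list(old_vocab)
    for token in new_tokens:
        if token not in seen:
            seen.add(token)
            extended.append(token)
    return extended


def extend_embeddings(
    old_rows: List[List[float]], extended_size: int, dim: int, init: float = 0.0
) -> List[List[float]]:
    """Copy checkpoint rows into positions [0, len(old_rows)) and initialise the rest."""
    new_rows = [list(row) for row in old_rows]
    new_rows.extend([init] * dim for _ in range(extended_size - len(old_rows)))
    return new_rows
```

Because new words are only ever appended, row i of the old embedding matrix still belongs to word i of the extended vocabulary.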
  • Option -gpuid not working as it should

    When I use -gpuid option in current master of OpenNMT, regardless of what gpu I choose, the training script always chooses gpu 0. If I use multigpu, for example, -gpuid 2 3, the training goes to gpus 0 and 1.

    Is this a known issue?

    type:bug contributions welcome 
    opened by goncalomcorreia 27
  • Upgrade to torchtext 0.3

    pytorch: 0.4 torchtext: 0.3

    Hello, I got an error when running the first preprocessing step from here.

    Traceback (most recent call last):
      File "preprocess.py", line 204, in <module>
        main()
      File "preprocess.py", line 191, in main
        fields = onmt.io.get_fields(opt.data_type, src_nfeats, tgt_nfeats)
      File "/home/lr/yukun/OpenNMT-py/onmt/io/IO.py", line 44, in get_fields
        return TextDataset.get_fields(n_src_features, n_tgt_features)
      File "/home/lr/yukun/OpenNMT-py/onmt/io/TextDataset.py", line 229, in get_fields
        postprocessing=make_src, sequential=False)
    TypeError: __init__() got an unexpected keyword argument 'tensor_type'
    

    It seems that torchtext 0.3 has changed the parameters of torchtext.data.Field().

    With torchtext 0.2.3 it works well, except for the following warning:

    xxx/torchtext/data/field.py:321: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
    

    It seems the new version of torchtext has fixed this from here. I am new to PyTorch and OpenNMT, and I don't know whether this warning matters.

    type:feature 
    opened by yukunfeng 27
  • Update transformer based on opennmt-tf

    1. forward() in Transformer is not compatible with Translator.py (line 153), which requires context_lengths.
    2. Transformer actually doesn't converge using default options. (PPL is just ridiculous)
    3. Too many parameters are reported.

    Some of these are already reported in other threads. It can hardly be used in its current state.

    type:bug 
    opened by helson73 27
  • Massive memory use because entire file is slurped in IO.py

    Is there a strong reason for slurping the entire source file in translate.py? On a source file of 2M single-character-per-token sentences, memory usage exceeds 256 GB. I figure that since evaluation is sentence-by-sentence (or batch-by-batch), this can be avoided.

    Lab-mates and I have written an itertools.islice-based workaround. I'm happy to convert it into a PR if I have more information about the design goals and decisions, so I don't step on toes.
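
A workaround of the kind described could look like the sketch below. It is not the authors' actual code, which is not shown in the issue; `read_batches` is a hypothetical helper that uses `itertools.islice` to pull a fixed number of lines at a time from any iterable of lines, such as an open file.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_batches(lines: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield lists of up to batch_size lines without loading the whole input."""
    iterator = iter(lines)
    while True:
        # islice consumes at most batch_size items from the shared iterator.
        batch = [line.rstrip("\n") for line in islice(iterator, batch_size)]
        if not batch:
            return
        yield batch
```

Memory use is then bounded by the batch size rather than by the size of the source file.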

    type:bug 
    opened by aryamccarthy 27
  • Fixed embeddings

    This pull request adds support for fixed word embeddings (that is, embeddings that are not updated during training) on the source or target side. To make this go more smoothly, I also refactored Embeddings: the constructor now takes the parameters it needs explicitly, instead of just taking the relevant dictionaries and opt. If opt were passed, the model would somehow need to decide whether -fix_word_vecs_enc or -fix_word_vecs_dec is relevant. This refactoring also allows pretrained embeddings to be loaded when the object is instantiated, instead of later on in train.py. Future work could also allow pretrained and/or fixed feature vectors.

    opened by bpopeters 27
  • Experimental FP16 training

    It uses the optimizer wrapper from https://github.com/NVIDIA/apex/ and simply defaults to an automatic loss scaling.

    State of this branch: it runs, meaning the training runs in FP16 without producing NaN values (currently focusing on standard Transformer training).

    However, the gains appeared to be very small on a V100 (about 5% faster), which is why I'm opening this branch for testing and help.

    Thanks.

    opened by guillaumekln 25
  • Incorrect normalization

    It does not have an impact on regular training but the math is incorrect.

    In _gradient_accumulation we normalize with "normalization", which is calculated from the total number of tokens in the "global batch" (accum × minibatch). The loss of each minibatch is therefore normalized by the global batch's token count instead of by the number of tokens in that minibatch. We can correct this in two ways:

    1. the accurate one: we need to calculate the number of tokens in each minibatch inside the gradient_accum loop
    2. we can approximate by multiplying the loss by the "accum" number.

    Which one shall we do?

    NB: it has an impact when combining the loss with another term (lm_prior_loss for instance).

    opened by vince62s 0
  • Translation outputs differ with different batch sizes

    Hi,

    I've recently noticed that I get slightly different translation results when translating with different batch sizes. I guess this is not expected...

    For example,

    Batch size 100 --> PRED SCORE: -6.0490, PRED PPL: 423.68 NB SENTENCES: 5000
    Batch size 150 --> PRED SCORE: -5.2232, PRED PPL: 185.54 NB SENTENCES: 5000
    

    I'm using the latest version of OpenNMT-py

    Thanks

    opened by anderleich 4
  • To remove a deprecated method _maybe_gather_stats() from Trainer

    - The deprecated method has not been pulling its weight since we started to use IterOnDevice (#1849)
    - The only statement calling it has been removed (#2198, #2205)
    
    opened by aeceou 0
  • [WIP] Refactor Loss Compute for clarity

    Instantiate CommonLossCompute with a from_opt classmethod.

    Make choice between copyattn / no copyattn in the trainer for clarity.

    remove NMT/LMLossCompute

    opened by vince62s 0
  • [WIP] LM Prior - Language Model Distillation during MT model training.

    Cf: https://arxiv.org/pdf/2004.14928.pdf Initially designed for low-resource NMT, but it also gave good results in an EN-DE experiment. Output perplexity in the target language is lower than without the LM distillation.

    Issue: very slow, and requires a GPU with substantial RAM since it loads both models (fine on an RTX 3090).

    It might be much faster if we can use CTranslate2 for the LM forward. cf: https://github.com/OpenNMT/CTranslate2/issues/876

    Pending: Train the MT model on one GPU while using the LM model on another GPU. Ideas are welcome on how to do this without changing the code too much. lm_prior_loss requires the MT and LM vocabs to be aligned (currently not the case); cf: https://github.com/OpenNMT/OpenNMT-py/issues/2170 to make the code cleaner.

    opened by vince62s 0
Releases(2.3.0)
  • 2.3.0(Sep 14, 2022)

    New features

    • BLEU/TER (& custom) scoring during training and validation (#2198)
    • LM related tools (#2197)
    • Allow encoder/decoder freezing (#2176)
    • Dynamic data loading for inference (#2145)
    • Sentence-level scores at inference (#2196)
    • MBR and oracle reranking scoring tools (#2196)

    Fixes and improvements

    • Updated beam exit condition (#2190)
    • Improve scores reporting (#2191)
    • Fix dropout scheduling (#2194)
    • Better catch CUDA ooms when training (#2195)
    • Fix source features support in inference and REST server (#2109)
    • Make REST server more flexible with dictionaries (#2104)
    • Fix target prefixing in LM decoding (#2099)
  • 2.2.0(Sep 14, 2021)

    New features

    • Support source features (thanks @anderleich !)

    Fixes and improvements

    • Adaptations to relax torch version
    • Customizable transform statistics (#2059)
    • Adapt release code for ctranslate2 2.0
  • 2.1.2(Apr 30, 2021)

  • 2.1.1(Apr 30, 2021)

  • 2.1.0(Apr 16, 2021)

    New features

    • Allow vocab update when training from a checkpoint (cec3cc8, 2f70dfc)

    Fixes and improvements

    • Various transforms related bug fixes
    • Fix beam warning and buffers reuse
    • Handle invalid lines in vocab file gracefully
  • 2.0.1(Jan 27, 2021)

  • 2.0.0(Jan 20, 2021)

    First official release for OpenNMT-py major upgrade to 2.0!

    New features

    • Language Model (GPT-2 style) training and inference
    • Nucleus (top-p) sampling decoding

    Fixes and improvements

    • Fix some BART default values
  • 2.0.0rc2(Nov 10, 2020)

    Fixes and improvements

    • Parallelize onmt_build_vocab (422d824)
    • Some fixes to the on-the-fly transforms
    • Some CTranslate2 related updates
    • Some fixes to the docs

    This will be the first release to be automatically deployed via GitHub Actions.

  • 2.0.0rc1(Sep 25, 2020)

    This is the first release candidate for OpenNMT-py major upgrade to 2.0.0!

    The major idea behind this release is the almost complete makeover of the data loading pipeline. A new 'dynamic' paradigm is introduced, in which transforms are applied to the data on the fly.

    This has a few advantages, including:

    • removing or drastically reducing the preprocessing required to train a model;
    • increasing and simplifying the possibilities of data augmentation and manipulation through on-the-fly transforms.

    These transforms can be specific tokenization methods, filters, noising, or any custom transform users may want to implement. Custom transform implementation is quite straightforward thanks to the existing base class and example implementations.

    You can check out how to use this new data loading pipeline in the updated docs and examples.

    All the readily available transforms are described here.

    Performance

    Given sufficient CPU resources according to GPU computing power, most of the transforms should not slow the training down. (Note: for now, one producer process per GPU is spawned -- meaning you would ideally need 2N CPU threads for N GPUs).

    Breaking changes

    A few features are dropped, at least for now:

    • audio, image and video inputs;
    • source word features.

    Some very old checkpoints with previous fields and vocab structure are also incompatible with this new version.

    For users who still need some of these features, the previous codebase will be retained as legacy in a separate branch. It will no longer receive extensive development from the core team, but PRs may still be accepted.

  • 1.2.0(Aug 17, 2020)

    Fixes and improvements

    • Support pytorch 1.6 (e813f4d, eaaae6a)
    • Support official torch 1.6 AMP for mixed precision training (2ac1ed0)
    • Flag to override batch_size_multiple in FP16 mode, useful in some memory constrained setups (23e5018)
    • Pass a dict and allow custom options in preprocess/postprocess functions of REST server (41f0c02, 8ec54d2)
    • Allow different tokenization for source and target in REST server (bb2d045, 4659170)
    • Various bug fixes

    New features

    • Gated Graph Sequence Neural Networks encoder (11e8d0), thanks @SteveKommrusch
    • Decoding with a target prefix (95aeefb, 0e143ff, 91ab592), thanks @Zenglinxiao
  • 1.1.1(Mar 20, 2020)

  • 1.1.0(Mar 19, 2020)

    New features

    • Support CTranslate2 models in REST server (91d5d57)
    • Extend support for custom preprocessing/postprocessing function in REST server by using return dictionaries (d14613d, 9619ac3, 92a7ba5)
    • Experimental: BART-like source noising (5940dcf)

    Fixes and improvements

    • Add options to CTranslate2 release (e442f3f)
    • Fix dataset shard order (458fc48)
    • Rotate only the server logs, not training (189583a)
    • Fix alignment error with empty prediction (91287eb)
  • 1.0.2(Mar 5, 2020)

    Fixes and improvements

    • Enable CTranslate2 conversion of Transformers with relative position (db11135)
    • Adapt -replace_unk to use with learned alignments if they exist (7625b53)
  • 1.0.1(Feb 17, 2020)

    Fixes and improvements

    • Ctranslate2 conversion handled in release script (1b50e0c)
    • Use attention_dropout properly in MHA (f5c9cd4)
    • Update apex FP16_Optimizer path (d3e2268)
    • Some REST server optimizations
    • Fix and add some docs
  • 1.0.0(Dec 13, 2019)

    New features

    • Implementation of "Jointly Learning to Align & Translate with Transformer" (@Zenglinxiao)

    Fixes and improvements

    • Add nbest support to REST server (@Zenglinxiao)
    • Merge greedy and beam search codepaths (@Zenglinxiao)
    • Fix "block ngram repeats" (@KaijuML, @pltrdy)
    • Small fixes, some more docs
  • 1.0.0.rc1(Oct 1, 2019)

    We have now reached some good stability of the code base.

    This is the 1.0.0 release candidate.

    • Fix Apex / FP16 training (Apex new API is buggy)
    • Multithread preprocessing way faster (Thanks François Hernandez)
    • Pip Installation v1.0.0.rc1 (thanks Paul Tardy)

    Enjoy and feel free to report issues.

  • 0.9.2(Sep 5, 2019)

    • Switch to Pytorch 1.2
    • Pre/post processing on the translation server (useful for Chinese) Thanks @Zenglinxiao
    • option to remove the FFN layer in AAN + AAN optimization (faster)
    • Coverage loss (per Abisee paper 2017) implementation Thanks @pltrdy
    • Video Captioning task: Thanks @flauted !
    • Token batch at inference
    • Small fixes and add-ons
  • 0.9.1(Jun 13, 2019)

    • New mechanism for MultiGPU training "1 batch producer / multi batch consumers" resulting in big memory saving when handling huge datasets thanks @pltrdy @francoishernandez

    • New APEX AMP (mixed precision) API thanks @francoishernandez NB: you need to reinstall Nvidia/Apex

    • Option to overwrite shards when preprocessing

    • Small fixes and add-ons

  • 0.9.0(May 16, 2019)

    Updated Travis to Pytorch 1.1

    • Faster vocab building when processing shards (no reloading) thanks @francoishernandez

    • New dataweighting feature thanks @francoishernandez see the FAQ doc for more information

    • New dropout scheduler. Same logic as accum_count / accum_steps see opts.py

    • fix Gold Scores

    • small fixes and add-ons.

    Unrelated, but new website online ! thanks @guillaumekln

    Enjoy !

  • 0.8.2(Feb 17, 2019)

    • Update documentation and Library example (thanks @flauted @elisemicho )
    • Revamp args
    • Bug fixes, save moving average in FP32 (thanks @francoishernandez )
    • Allow FP32 inference for FP16 models
  • 0.8.1(Feb 12, 2019)

  • 0.8.0(Feb 9, 2019)

    Many fixes and code cleaning thanks @flauted, @guillaumekln

    Datasets code refactor (thanks @flauted) you need to re-preprocess datasets

    New features

    • FP16 Support: Experimental, using Apex, Checkpoints may break in future version.
    • Continuous exponential moving average (thanks @francoishernandez, and Marian)
    • Relative positions encoding (thanks @francoishernandez, and Google T2T)
    • Deprecate the old beam search, fast batched beam search supports all options

  • 0.7.2(Jan 31, 2019)

    Multi level text fields for better handling of embeddings. thanks @flauted

    code cleaning and bug fixing thanks @bpopeters @guillaumekln @pltrdy

    NB: you cannot train on 0.7.2 with preprocessed data on a prior version, you need to re-preprocess.

  • 0.7.1(Jan 24, 2019)

    Many fixes and code refactoring thanks @bpopeters, @flauted, @guillaumekln

    New features

    • Random sampling thanks @daphnei
    • Enable sharding for huge files at translation

  • 0.7.0(Jan 2, 2019)

  • 0.6.0(Nov 28, 2018)

  • 0.5.0(Oct 24, 2018)

    Ability to reset the optimizer when using -train_from

    -reset_optim = ['none', 'all', 'states', 'keep_states']

    • none: default behavior as before
    • all: reset the optimizer !! steps start at zero again.
    • states: reset only states, keep all other parameters from checkpoint
    • keep_states: keep current states from checkpoint, but allow to change parameters (learning_rate for instance)

    Bug fixes. Tested with Pytorch 1.0RC works fine.

  • 0.4.1(Oct 11, 2018)

  • 0.4.0(Oct 8, 2018)

    Fixed Speech2Text training (thanks Yuntian)

    Removed -max_shard_size, replaced by -shard_size = number of examples in a shard.

    Default value = 1M which works fine in most Text dataset cases. (will avoid Ram OOM in most cases)

  • 0.3.0(Sep 27, 2018)

    Now requires Pytorch 0.4.1

    Multi-node Multi-GPU with Torch Distributed

    New options are:

    • -master_ip: IP address of the master node
    • -master_port: port number of the master node
    • -world_size: total number of processes to be run (total GPUs across all nodes)
    • -gpu_ranks: list of indices of processes across all nodes

    -gpuid is deprecated

    See examples in https://github.com/OpenNMT/OpenNMT-py/blob/master/docs/source/FAQ.md
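
To make the relationship between -world_size and -gpu_ranks concrete, the hypothetical helper below computes both values for one node. It is a sketch under assumptions (every node has the same number of GPUs, and ranks are assigned contiguously node by node), not code from OpenNMT-py.

```python
from typing import Dict, List, Union


def node_config(n_nodes: int, gpus_per_node: int, node_index: int) -> Dict[str, Union[int, List[int]]]:
    """Return world_size and the gpu_ranks handled by the given node."""
    if not 0 <= node_index < n_nodes:
        raise ValueError("node_index must be in [0, n_nodes)")
    first_rank = node_index * gpus_per_node
    return {
        "world_size": n_nodes * gpus_per_node,  # total GPUs across all nodes
        "gpu_ranks": list(range(first_rank, first_rank + gpus_per_node)),
    }
```

For two nodes with four GPUs each, the second node (`node_index=1`) would get `world_size` 8 and `gpu_ranks` 4 to 7 under these assumptions.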

    Fixes to img2text now working

    New sharding based on number of examples

    Fixes to avoid 0.4.1 deprecated functions.

Owner
OpenNMT
Open source ecosystem for neural machine translation and neural sequence learning