PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Overview

Non-Autoregressive Transformer

Code release for "Non-Autoregressive Neural Machine Translation" by Jiatao Gu, James Bradbury, Caiming Xiong, Victor O.K. Li, and Richard Socher.

Requires PyTorch 0.3, torchtext 0.2.1, and spaCy.

The pipeline for training a NAT model for a given language pair includes:

  1. run_alignment_wmt_LANG.sh (runs fast_align for alignment supervision)
  2. run_LANG.sh (trains an autoregressive model)
  3. run_LANG_decode.sh (produces the distillation corpus for training the NAT)
  4. run_LANG_fast.sh (trains the NAT model)
  5. run_LANG_fine.sh (fine-tunes the NAT model)
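The five stages above can be strung together with a small driver; the sketch below is hypothetical glue (the `run_pipeline` function and `STAGES` list are not part of the repo), but the .sh script names come from the list above, with LANG standing in for the language pair.

```python
import subprocess

# Stage templates, in the order the README prescribes; "{lang}" replaces
# the LANG placeholder in the script names above.
STAGES = [
    "run_alignment_wmt_{lang}.sh",  # 1. fast_align alignment supervision
    "run_{lang}.sh",                # 2. autoregressive teacher model
    "run_{lang}_decode.sh",         # 3. distillation corpus for the NAT
    "run_{lang}_fast.sh",           # 4. NAT training
    "run_{lang}_fine.sh",           # 5. NAT fine-tuning
]

def run_pipeline(lang: str) -> None:
    """Run all five stages for one language pair, stopping on failure."""
    for stage in STAGES:
        script = stage.format(lang=lang)
        print(f"==> {script}")
        subprocess.run(["bash", script], check=True)
```

Each stage depends on the previous one's outputs, so `check=True` aborts the pipeline as soon as any script fails.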

Comments
  • import copy in train.py for line 36

    flake8 testing of https://github.com/salesforce/nonauto-nmt on Python 3.6.3

    $ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

    ./model.py:92:75: F821 undefined name 'the_mask'
            return targets[input_mask], out[out_mask].view(-1, out.size(-1)), the_mask
                                                                              ^
    ./self_learn.py:626:48: F821 undefined name 'align_index'
                    decoding1 = unsorted(decoding, align_index)
                                                   ^
    ./self_learn.py:905:45: F821 undefined name 'loss_alter'
                loss_alter, loss_worse = export(loss_alter), export(loss_worse)
                                                ^
    ./self_learn.py:905:65: F821 undefined name 'loss_worse'
                loss_alter, loss_worse = export(loss_alter), export(loss_worse)
                                                                    ^
    ./self_learn.py:1342:78: F821 undefined name 'fertility_mode'
        names = ['dev.src.b{}={}.{}'.format(args.beam_size, args.load_from, args,fertility_mode),
                                                                                 ^
    ./self_learn.py:1343:77: F821 undefined name 'fertility_mode'
                'dev.trg.b{}={}.{}'.format(args.beam_size, args.load_from, args,fertility_mode),
                                                                                ^
    ./self_learn.py:1344:77: F821 undefined name 'fertility_mode'
                'dev.dec.b{}={}.{}'.format(args.beam_size, args.load_from, args,fertility_mode)]
                                                                                ^
    ./train.py:35:17: F821 undefined name 'copy'
        new_batch = copy.copy(batch)
                    ^
    8     F821 undefined name 'the_mask'
    8
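    The last error above (undefined name 'copy' in train.py) is fixed by adding "import copy" at the top of the file. A minimal illustration of what the shallow copy on that line does, using a plain dict as a stand-in for the real batch object:

    ```python
    import copy

    # Stand-in for a torchtext Batch; the real object is more complex.
    batch = {"src": [1, 2, 3], "trg": [4, 5]}

    new_batch = copy.copy(batch)      # shallow copy: new container, same field objects
    new_batch["src"] = [9]            # rebinding a field on the copy...
    assert batch["src"] == [1, 2, 3]  # ...leaves the original untouched
    assert new_batch["trg"] is batch["trg"]  # unrebound fields stay shared
    ```

    A shallow copy is enough here because the code only rebinds fields on the copy rather than mutating them in place.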
    
    opened by cclauss 2
  • enumerate function in for loop

    Thank you for sharing your code. When I run it there is no error, but in debug mode the "for" loop at line 142 of train.py never executes: its iterable is built with "enumerate" over the "train" variable, and the loop condition is never satisfied. How can I fix this? It seems enumerate cannot step through the "train" variable.


    I would appreciate any help.
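    For context (a toy sketch, not the repo's actual iterator): enumerate only pairs an index with each item, so if the loop body never runs, the underlying iterator is empty or already exhausted, not a failure of enumerate to "convert" the data.

    ```python
    def batches(dataset, batch_size):
        # Toy stand-in for a torchtext training iterator: yields slices lazily.
        for start in range(0, len(dataset), batch_size):
            yield dataset[start:start + batch_size]

    train = list(range(10))
    seen = []
    for step, batch in enumerate(batches(train, 4)):
        seen.append(step)
    # seen is non-empty here; if it were empty, the thing to inspect is
    # the iterator passed to enumerate, e.g. whether it was consumed earlier.
    ```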

    opened by VahidChahkandi 0
  • Transformer Architecture and other issues

    In this implementation, decoder layer i attends to the output of encoder layer i, which differs from Google's original Transformer, where every decoder layer attends to the last encoder layer's output.

    https://github.com/salesforce/nonauto-nmt/blob/efcbe4f2329b140ac3ce06abb6409457cebc8e49/model.py#L601
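    The difference can be shown with a toy module (an illustration, not the repo's actual code); swapping enc_states[i] for enc_states[-1] in the loop recovers the original Transformer wiring:

    ```python
    import torch
    import torch.nn as nn

    class LayerwiseCrossAttention(nn.Module):
        """Decoder stack where layer i cross-attends to encoder layer i's
        output, as described in this issue. Dimensions are illustrative."""

        def __init__(self, d_model=8, n_layers=2):
            super().__init__()
            self.attns = nn.ModuleList(
                nn.MultiheadAttention(d_model, num_heads=2, batch_first=True)
                for _ in range(n_layers)
            )

        def forward(self, x, enc_states):
            # enc_states: list of per-layer encoder outputs, one per decoder layer
            for i, attn in enumerate(self.attns):
                mem = enc_states[i]   # layer-wise variant used in this repo
                # mem = enc_states[-1]  # original Transformer: always the last layer
                x, _ = attn(x, mem, mem)
            return x
    ```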

    Also, the provided scripts seem to contain options not supported by run.py, and train_fast.sh relies on "-load_from old_fast_model", which further makes the paper's results hard to reproduce.

    opened by da03 1
  • Dataset

    Thank you for sharing your code. I have a question about preprocessing the data: for the IWSLT En-De dataset, the script run_alignment_iwslt.sh uses a file named train.tags.en-de.bpe.dev.en2, which does not seem to be part of the original dataset. Where does it come from?

    opened by Maggione 4
Owner
Salesforce