A Fast Sequence Transducer Implementation with PyTorch Bindings

Overview

transducer

A Fast Sequence Transducer Implementation with PyTorch Bindings. The corresponding publication is Sequence Transduction with Recurrent Neural Networks.

Tested with Python 3.7 and PyTorch 1.3

Install and Test

First install PyTorch, then from the top level of the repo run

python setup.py install

And test with

python test.py
Comments
  • always get nothing trying to use viterbi decode interface

    Hi, awni! Thanks for your great repo. I have a problem with how to use the viterbi decode interface. I have tried code like the following:

        B, T, *_ = scores.size()
        logit_lengths = torch.full((B,), T, dtype=torch.int, device=scores.device)
        y = torch.full((B, 1), 0, dtype=torch.int32, device=scores.device)
        cur_len = 0
        for i in range(T):
            old_y = y
            preds, _ = self.pred_net(old_y)
            label_lengths = torch.full((B,), cur_len, dtype=torch.int, device=scores.device)
            y = self.criterion.viterbi(scores, preds, logit_lengths, label_lengths)
            b, new_len = y.shape
            if new_len < 1:
                break
            print("shape of y is:", y.shape)
            cur_len = new_len

    but I always hit the break at the first step.

    opened by xiongjun19 4
  • setup error: ‘isfinite’ was not declared in this scope

    For others' reference, I encountered an "‘isfinite’ was not declared in this scope" error when running the setup.py script. I was able to resolve it based on the fix outlined here: https://github.com/erincatto/Box2D/issues/509. I added "std::" in front of isfinite, so "isfinite" became "std::isfinite" in the two occurrences in transducer.cpp.

    For reference, here is my setup information: Ubuntu 16.04; Python 3.6.5 :: Anaconda, Inc.; PyTorch 1.3.1; gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609.

    The error output is below:

        (transducer) dzubke@phoneme-1:~/awni_speech/transducer$ python setup.py install
        running install
        running bdist_egg
        running egg_info
        creating transducer_cpp.egg-info
        writing transducer_cpp.egg-info/PKG-INFO
        writing dependency_links to transducer_cpp.egg-info/dependency_links.txt
        writing top-level names to transducer_cpp.egg-info/top_level.txt
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        reading manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        installing library code to build/bdist.linux-x86_64/egg
        running install_lib
        running build_ext
        building 'transducer_cpp' extension
        creating build
        creating build/temp.linux-x86_64-3.7
        gcc -pthread -B /home/dzubke/miniconda3/envs/transducer/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/TH -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/THC -I/home/dzubke/miniconda3/envs/transducer/include/python3.7m -c transducer.cpp -o build/temp.linux-x86_64-3.7/transducer.o -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=transducer_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        transducer.cpp: In function ‘float log_sum_exp(float, float)’:
        transducer.cpp:9:20: error: ‘isfinite’ was not declared in this scope
         if (!isfinite(a)) return b;
                          ^
        transducer.cpp:9:20: note: suggested alternative:
        In file included from /usr/include/c++/5/random:38:0,
                         [long chain of torch includes elided]
                         from transducer.cpp:6:
        /usr/include/c++/5/cmath:601:5: note: ‘std::isfinite’
         isfinite(_Tp __x)
         ^
        transducer.cpp:10:20: error: ‘isfinite’ was not declared in this scope
         if (!isfinite(b)) return a;
                          ^
        transducer.cpp:10:20: note: suggested alternative:
        In file included from /usr/include/c++/5/random:38:0,
                         [long chain of torch includes elided]
                         from transducer.cpp:6:
        /usr/include/c++/5/cmath:601:5: note: ‘std::isfinite’
         isfinite(_Tp __x)
         ^
        error: command 'gcc' failed with exit status 1

    opened by dzubke 3
  • Support complex joiner networks

    I find that the current implementation supports only joiner networks containing an adder: https://github.com/awni/transducer/blob/b517f1f60177b6be2e3928a11c02784de8977672/torch_test.py#L187-L188

    The paper Speech Recognition with Deep Recurrent Neural Networks, a successor of Sequence Transduction with Recurrent Neural Networks, mentions another kind of joiner network:

    In the original formulation Pr(k|t, u) was defined by taking an ‘acoustic’ distribution Pr(k|t) from the CTC network, a ‘linguistic’ distribution Pr(k|u) from the prediction network, then multiplying the two together and renormalising. An improvement introduced in this paper is to instead feed the hidden activations of both networks into a separate feedforward output network, whose outputs are then normalised with a softmax function to yield Pr(k|t, u). This allows a richer set of possibilities for combining linguistic and acoustic information, and appears to lead to better generalisation. In particular we have found that the number of deletion errors encountered during decoding is reduced.

    I am wondering whether it will support feedforward joiner networks that contain nn.Linear() and nn.Tanh() layers.
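
    For illustration, here is a minimal sketch of such a feedforward joiner (an editor's example, not code from this repo; all names are hypothetical). Note that it materializes a (B, T, U, V) tensor, which ties memory use to the vocabulary size:

        import torch
        import torch.nn as nn

        class FeedForwardJoiner(nn.Module):
            # Combines the 'acoustic' (encoder) and 'linguistic' (prediction
            # network) activations with a small MLP, as the quoted paragraph
            # describes, instead of simply adding them.
            def __init__(self, enc_dim, pred_dim, hidden_dim, vocab_size):
                super().__init__()
                self.proj = nn.Linear(enc_dim + pred_dim, hidden_dim)
                self.act = nn.Tanh()
                self.out = nn.Linear(hidden_dim, vocab_size)

            def forward(self, enc, pred):
                # enc:  (B, T, enc_dim)   encoder activations
                # pred: (B, U, pred_dim)  prediction-network activations
                T, U = enc.size(1), pred.size(1)
                enc = enc.unsqueeze(2).expand(-1, -1, U, -1)    # (B, T, U, enc_dim)
                pred = pred.unsqueeze(1).expand(-1, T, -1, -1)  # (B, T, U, pred_dim)
                joint = self.act(self.proj(torch.cat([enc, pred], dim=-1)))
                return torch.log_softmax(self.out(joint), dim=-1)  # (B, T, U, V)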


    Also, the README.md says:

    The memory of this implementation scales with the product B * T * U and does not increase with the token set size

    But the memory occupied by the encoder and decoder is proportional to the vocabulary size. The memory consumed by a vocab size of 10k is certainly more than that of a size of 1k, I believe.

    opened by csukuangfj 2
  • Question about the gradient computation

    Thanks for the great work.

    https://github.com/awni/transducer/blob/5a1c2c776f1667d78c79310616fd2d48bff4e4f5/ref_transduce.py#L92

    I couldn't find the log_probs term anywhere in the paper, specifically in equation (20). Could you point me to the equation in the paper that refers to it?
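
    (An editor's note, not a reply from the repo author: a plausible resolution is that equation (20) differentiates the likelihood with respect to the probabilities Pr(k|t, u), while the reference implementation works with log-probabilities, so the chain rule introduces an extra factor of Pr(k|t, u) = exp(log_probs[t, u, k]):

        \frac{\partial \Pr(\mathbf{y}^*|\mathbf{x})}{\partial \log \Pr(k|t,u)}
        = \Pr(k|t,u)\,\frac{\partial \Pr(\mathbf{y}^*|\mathbf{x})}{\partial \Pr(k|t,u)}
        = \Pr(k|t,u)\,\alpha(t,u)\,
          \begin{cases}
            \beta(t,u+1) & \text{if } k = y_{u+1} \\
            \beta(t+1,u) & \text{if } k = \varnothing \\
            0 & \text{otherwise}
          \end{cases}

    which would explain why log_probs appears in the code but not in equation (20) itself.)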

    opened by digital10111 2
  • Questions about the reasoning in forward-backward algorithm

    I'm looking for an explanation of the following three points:

    • Why should the forward and backward passes return the same (or very similar) likelihoods?

    • The forward likelihood is the sum of the last forward variable and the log-probability of emitting the blank token in the last step, alphas[T-1, U-1] + log_probs[T-1, U-1, blank], whereas the backward likelihood is just the first backward variable, betas[0, 0]. Why are they computed differently?

    • In line 48 of https://github.com/awni/transducer/blob/master/ref_transduce.py, we have alphas[t, 0] = alphas[t-1, 0] + log_probs[t-1, 0, blank]. Why shouldn't we use log_probs[t-1, 0, labels[-1]] instead? Isn't it assuming that we expect the last frame to emit a blank token, which is not true?

    Could anyone here help me to understand these problems?
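
    (An editor's sketch, not an authoritative answer, following the conventions of ref_transduce.py: the forward and backward likelihoods agree because both sum the probabilities of exactly the same set of alignment paths, just from opposite ends; the final blank emission is folded into the initialization of betas[T-1, U-1] in the backward pass, so betas[0, 0] alone is already the full log-likelihood, while the forward pass has to add that final blank explicitly; and the first column u = 0 represents states where no labels have been emitted yet, so moving along t can only emit blanks, which is why log_probs[t-1, 0, blank] is the correct term. A minimal NumPy sketch of the alpha recursion:

        import numpy as np

        def forward_loglike(log_probs, labels, blank=0):
            # log_probs: (T, U, V) joint log-probabilities; labels: length U-1.
            # Illustrative reimplementation, not the repo's code.
            T, U, _ = log_probs.shape
            alphas = np.zeros((T, U))
            # u == 0: no labels emitted yet, so every step along t emits blank.
            for t in range(1, T):
                alphas[t, 0] = alphas[t - 1, 0] + log_probs[t - 1, 0, blank]
            # t == 0: labels emitted before consuming any further frames.
            for u in range(1, U):
                alphas[0, u] = alphas[0, u - 1] + log_probs[0, u - 1, labels[u - 1]]
            for t in range(1, T):
                for u in range(1, U):
                    no_emit = alphas[t - 1, u] + log_probs[t - 1, u, blank]
                    emit = alphas[t, u - 1] + log_probs[t, u - 1, labels[u - 1]]
                    alphas[t, u] = np.logaddexp(emit, no_emit)
            # Every complete path ends by emitting a final blank from (T-1, U-1).
            return alphas[T - 1, U - 1] + log_probs[T - 1, U - 1, blank]
    )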

    opened by smolendawid 2
  • Transducer update not backward compatible with speech repo running pytorch 0.4.1

    Hello Awni,

    I wanted to let you know that I think your recent update to the transducer repo is not backward compatible with pytorch 0.4.1 when using your speech repo: https://github.com/awni/speech. I used pytorch 0.4.1 instead of pytorch 1.X when running your speech repo because I encountered an import error with ffi "ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead." when I tried using a more recent version of pytorch. As outlined here: https://github.com/pytorch/pytorch/issues/15645, the recommendation was to use an earlier version of pytorch, so I used 0.4.1.

    As a smaller issue, the Makefile in your speech repo calls a build.py script in libs/transducer, which is no longer present in the transducer repo. Separately, when I ran the "python setup.py install" command, I got the output below.

    Here are the details of my setup: OS: ubuntu-1604-xenial-v20200108; Python: 3.6.5 :: Anaconda, Inc.; PyTorch: 0.4.1; CUDA: 10.0.

        (awni_env36) dzubke@phoneme-1:~/awni_speech/speech/libs/transducer$ python setup.py install
        running install
        running bdist_egg
        running egg_info
        writing transducer_cpp.egg-info/PKG-INFO
        writing dependency_links to transducer_cpp.egg-info/dependency_links.txt
        writing top-level names to transducer_cpp.egg-info/top_level.txt
        reading manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        installing library code to build/bdist.linux-x86_64/egg
        running install_lib
        running build_ext
        building 'transducer_cpp' extension
        gcc -pthread -B /home/dzubke/miniconda3/envs/awni_env36/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include/THC -I/home/dzubke/miniconda3/envs/awni_env36/include/python3.6m -c transducer.cpp -o build/temp.linux-x86_64-3.6/transducer.o -fopenmp -DTORCH_EXTENSION_NAME=transducer_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        transducer.cpp:6:29: fatal error: torch/extension.h: No such file or directory
        compilation terminated.
        error: command 'gcc' failed with exit status 1

    This isn't a critical issue for me, as I am using a CTC model for phoneme recognition, so I don't need the transducer module. To get around it, I commented out the line "from speech.models.transducer_model import Transducer" in speech/speech/models/__init__.py, and that seems to work for me. For reference on my setup, the warp-ctc module did build properly.

    I just wanted to bring this to your attention. I am likely going to modify your speech repo to run on a more recent version of pytorch, as I think that will make it easier to convert my CTC model to Core ML using ONNX, so I will let you know if this issue is resolved when using a more recent version of pytorch.

    Thanks for publishing your code! It has been very helpful to me in my project. Let me know if you want more details on my setup or this issue.

    opened by dzubke 1
  • Does not work for Pytorch 1.1

    When I run python build.py, I get this error:

    Traceback (most recent call last):
      File "build.py", line 4, in <module>
        from torch.utils.ffi import create_extension
      File "/home/vespar/miniconda3/envs/ariyan/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
        raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.
    

    I am using PyTorch 1.1, but the script only supports PyTorch 0.4.1, so I changed the code to use the new PyTorch extension library. The code is below:

    import os
    import sys
    import torch
    #from torch.utils.ffi import create_extension
    from torch.utils.cpp_extension import BuildExtension, CppExtension
    from setuptools import setup
    import setuptools
    
    this_file = os.path.abspath(__file__)
    
    sources = ['src/transducer.c']
    headers = ['src/transducer.h']
    
    args = ["-std=c99"]
    if sys.platform == "darwin":
        args += ["-DAPPLE"]
    else:
        args += ["-fopenmp"]
    
    #ffi = create_extension(
    #    '_ext.transducer',
    #    headers=headers,
    #    sources=sources,
    #    relative_to=__file__,
    #    extra_compile_args=args
    #)
    
    
    setup(
        name='_ext.transducer',
        ext_modules=[
            CppExtension(
                name='_ext.transducer',
                sources=['src/transducer.c'],  # headers are #included, not compiled
                extra_compile_args=args),
        ],
        cmdclass={
            'build_ext': BuildExtension
        })
    

    After doing this, I got this error:

    usage: build.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
       or: build.py --help [cmd1 cmd2 ...]
       or: build.py --help-commands
       or: build.py cmd --help
    
    error: no commands supplied
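
    (An editor's note, not part of the original issue: setuptools-based scripts require a command argument, so "error: no commands supplied" is expected when the script is run bare. Supplying one, for example

        python build.py build_ext --inplace

    builds the extension in place and should get past that particular error, assuming the rewritten script is otherwise correct.)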
    
    opened by mdhasanai 1
  • pytorch 1.1 support

    getting this error when using torch 1.1:

        from .._ext import transducer
          File "/home/tanish/ds3/libs/transducer/_ext/transducer/__init__.py", line 2, in <module>
            from torch.utils.ffi import _wrap_function
          File "/home/tanish/environments/ds3_torch_1.1/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
            raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
        ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.

    Please add PyTorch 1.1 support to the code. Any help on this?
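
    (An editor's sketch for anyone migrating: a minimal cpp-extension-based setup script, illustrative only and inferred from the extension name and compiler flags in the build logs above; the repo's current setup.py is the authoritative version.)

        import sys
        from setuptools import setup
        from torch.utils.cpp_extension import BuildExtension, CppExtension

        # OpenMP everywhere except macOS, mirroring the original build script.
        args = ["-DAPPLE"] if sys.platform == "darwin" else ["-fopenmp"]

        setup(
            name='transducer_cpp',
            ext_modules=[
                CppExtension(
                    name='transducer_cpp',
                    sources=['transducer.cpp'],
                    extra_compile_args=args),
            ],
            cmdclass={'build_ext': BuildExtension})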

    opened by rajeevbaalwan 1
  • Import Error,  _transducer can not be imported.

    Following the README, I got an import error suggesting that there may be a circular import.

    Finally I found a solution: I had to manually copy the _transducer.cpython-xxxx.so file in order to run torch_test. I'm wondering whether there is a wiser solution.
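
    (An editor's suggestion, untested against this repo: building the extension in place, e.g.

        python setup.py build_ext --inplace

    puts the compiled .so next to the sources and may avoid the manual copy.)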

    opened by xiongjun19 1
Owner
Awni Hannun
Research Scientist at Facebook AI Research