A Fast Sequence Transducer Implementation with PyTorch Bindings

Overview

transducer

A fast sequence transducer implementation with PyTorch bindings. The corresponding publication is Sequence Transduction with Recurrent Neural Networks (Graves, 2012).

Tested with Python 3.7 and PyTorch 1.3

Install and Test

First install PyTorch, then from the top level of the repo run:

python setup.py install

And test with

python test.py
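
If the install succeeds, the compiled C++ extension should be importable from Python. As a quick smoke test (assuming the extension module is named transducer_cpp, the name that appears in the build logs quoted in the comments below):

python -c "import torch; import transducer_cpp"
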
Comments
  • always get nothing trying to use viterbi decode interface

    always get nothing trying to use viterbi decode interface

    Hi, awni! Thanks for your great repo. I have a problem with how to use the decode interface. I have tried code like the following:

        B, T, *_ = scores.size()
        logit_lengths = torch.full((B, ), T, dtype=torch.int, device=scores.device)
        y = torch.full([B, 1], 0, dtype=torch.int32, device=scores.device)
        cur_len = 0

        for i in range(T):
            old_y = y
            preds, _ = self.pred_net(old_y)
            label_lengths = torch.full((B, ), cur_len, dtype=torch.int, device=scores.device)
            y = self.criterion.viterbi(scores, preds, logit_lengths, label_lengths)
            b, new_len = y.shape
            if new_len < 1:
                break
            print("shape of y is: ", y.shape)
            cur_len = new_len

    but it always breaks at the first step.

    opened by xiongjun19 4
  • setup error: ‘isfinite’ was not declared in this scope

    setup error: ‘isfinite’ was not declared in this scope

    For others' reference, I encountered an "‘isfinite’ was not declared in this scope" error when running the setup.py script. I was able to resolve it based on the fix outlined here: https://github.com/erincatto/Box2D/issues/509. I added the string "std::" in front of isfinite, so "isfinite" became "std::isfinite" in the two occurrences in transducer.cpp.

    For reference, here is my setup information: Ubuntu 16.04 Python 3.6.5 :: Anaconda, Inc. Pytorch 1.3.1 gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

    The error output is below:

        (transducer) dzubke@phoneme-1:~/awni_speech/transducer$ python setup.py install
        running install
        running bdist_egg
        running egg_info
        creating transducer_cpp.egg-info
        writing transducer_cpp.egg-info/PKG-INFO
        writing dependency_links to transducer_cpp.egg-info/dependency_links.txt
        writing top-level names to transducer_cpp.egg-info/top_level.txt
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        reading manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        installing library code to build/bdist.linux-x86_64/egg
        running install_lib
        running build_ext
        building 'transducer_cpp' extension
        creating build
        creating build/temp.linux-x86_64-3.7
        gcc -pthread -B /home/dzubke/miniconda3/envs/transducer/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/TH -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/THC -I/home/dzubke/miniconda3/envs/transducer/include/python3.7m -c transducer.cpp -o build/temp.linux-x86_64-3.7/transducer.o -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=transducer_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        transducer.cpp: In function ‘float log_sum_exp(float, float)’:
        transducer.cpp:9:20: error: ‘isfinite’ was not declared in this scope
            if (!isfinite(a)) return b;
                           ^
        transducer.cpp:9:20: note: suggested alternative:
        In file included from /usr/include/c++/5/random:38:0, from /usr/include/c++/5/bits/stl_algo.h:66, from /usr/include/c++/5/algorithm:62, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/SmallVector.h:26, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/ArrayRef.h:18, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/core/MemoryFormat.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:11, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Context.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/ATen.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/extension.h:4, from transducer.cpp:6:
        /usr/include/c++/5/cmath:601:5: note: ‘std::isfinite’
            isfinite(_Tp __x)
            ^
        transducer.cpp:10:20: error: ‘isfinite’ was not declared in this scope
            if (!isfinite(b)) return a;
                           ^
        transducer.cpp:10:20: note: suggested alternative:
        In file included from /usr/include/c++/5/random:38:0, from /usr/include/c++/5/bits/stl_algo.h:66, from /usr/include/c++/5/algorithm:62, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/SmallVector.h:26, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/ArrayRef.h:18, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/core/MemoryFormat.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:11, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Context.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/ATen.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/extension.h:4, from transducer.cpp:6:
        /usr/include/c++/5/cmath:601:5: note: ‘std::isfinite’
            isfinite(_Tp __x)
            ^
        error: command 'gcc' failed with exit status 1

    opened by dzubke 3
  • Support complex joiner networks

    Support complex joiner networks

    I find that the current implementation supports only joiner networks containing an adder: https://github.com/awni/transducer/blob/b517f1f60177b6be2e3928a11c02784de8977672/torch_test.py#L187-L188

    In the paper Speech Recognition with Deep Recurrent Neural Networks, which is a successor of Sequence Transduction with Recurrent Neural Networks, it mentions another kind of joiner network:

    In the original formulation Pr(k|t, u) was defined by taking an ‘acoustic’ distribution Pr(k|t) from the CTC network, a ‘linguistic’ distribution Pr(k|u) from the prediction network, then multiplying the two together and renormalising. An improvement introduced in this paper is to instead feed the hidden activations of both networks into a separate feedforward output network, whose outputs are then normalised with a softmax function to yield Pr(k|t, u). This allows a richer set of possibilities for combining linguistic and acoustic information, and appears to lead to better generalisation. In particular we have found that the number of deletion errors encountered during decoding is reduced.

    I am wondering whether it will support feedforward joiner networks that contain nn.Linear() and nn.Tanh() layers.
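
    For illustration, a minimal feedforward joiner of the kind the paper describes might look like the sketch below; the class and tensor names here are assumptions for this example, not part of this repo. It combines encoder and prediction-network activations with Linear and Tanh layers:

        import torch
        import torch.nn as nn

        class FeedForwardJoiner(nn.Module):
            """Hypothetical joiner: encoder output (B, T, H) and prediction-network
            output (B, U, H) are combined into logits of shape (B, T, U, V)."""

            def __init__(self, hidden_dim, vocab_size):
                super().__init__()
                self.fc = nn.Linear(2 * hidden_dim, hidden_dim)
                self.act = nn.Tanh()
                self.out = nn.Linear(hidden_dim, vocab_size)

            def forward(self, enc_out, pred_out):
                # Broadcast both inputs to (B, T, U, H) and concatenate features.
                t = enc_out.unsqueeze(2).expand(-1, -1, pred_out.size(1), -1)
                u = pred_out.unsqueeze(1).expand(-1, enc_out.size(1), -1, -1)
                joint = torch.cat([t, u], dim=-1)
                return self.out(self.act(self.fc(joint)))

    Note that materialising a dense (B, T, U, V) logit tensor like the one this sketch produces is exactly what makes memory grow with the vocabulary size, which is relevant to the README claim quoted next.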


    Also, the README.md says:

    The memory of this implementation scales with the product B * T * U and does not increase with the token set size

    But the memory occupied by the encoder and decoder is proportional to the vocabulary size. The memory consumed by a vocab size of 10k is certainly more than that of a size of 1k, I believe.

    opened by csukuangfj 2
  • Question about the gradient computation

    Question about the gradient computation

    Thanks for the great work.

    https://github.com/awni/transducer/blob/5a1c2c776f1667d78c79310616fd2d48bff4e4f5/ref_transduce.py#L92

    I couldn't find the log_probs term anywhere in the paper, specifically in equation (20). Could you point me to the equation in the paper that refers to it?
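
    For context, a likely source of that term (an observation from the chain rule, not a quote from the paper): the reference code differentiates with respect to log-probabilities rather than the probabilities themselves. If equation (20) gives the gradient with respect to Pr(k | t, u), then

        \frac{\partial \ln \Pr(\mathbf{y}^* \mid \mathbf{x})}{\partial \log \Pr(k \mid t, u)}
            = \Pr(k \mid t, u) \, \frac{\partial \ln \Pr(\mathbf{y}^* \mid \mathbf{x})}{\partial \Pr(k \mid t, u)}

    so in log space the gradient picks up an additive log_probs[t, u, k] term on top of the alpha, beta, and log-likelihood terms that come from equation (20).
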

    opened by digital10111 2
  • Questions about the reasoning in forward-backward algorithm

    Questions about the reasoning in forward-backward algorithm

    I'm looking for an explanation of the following 3 cases:

    • Why should the forward and backward passes return the same (or very similar) likelihoods?

    • The forward likelihood is the sum of the last forward variable and the log-probability of emitting the blank token in the last step, alphas[T-1, U-1] + log_probs[T-1, U-1, blank], whereas the backward likelihood is just the first backward variable (the last step of its computation), betas[0, 0]. Why are they different?

    • In line 48 of https://github.com/awni/transducer/blob/master/ref_transduce.py, alphas[t, 0] = alphas[t-1, 0] + log_probs[t-1, 0, blank]; why shouldn't we use log_probs[t-1, 0, labels[-1]] instead? Isn't it assuming that we expect the last frame to emit a blank token, which is not true?

    Could anyone here help me to understand these problems?
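
    For reference, the recursions in ref_transduce.py follow the standard transducer formulation, sketched here from the paper's definitions (treat the exact indexing as an assumption):

        \alpha(t, u) = \mathrm{logsumexp}\left( \alpha(t-1, u) + \log p(\varnothing \mid t-1, u),\; \alpha(t, u-1) + \log p(y_u \mid t, u-1) \right)
        \beta(t, u)  = \mathrm{logsumexp}\left( \beta(t+1, u) + \log p(\varnothing \mid t, u),\; \beta(t, u+1) + \log p(y_{u+1} \mid t, u) \right)
        \beta(T-1, U-1) = \log p(\varnothing \mid T-1, U-1)
        \ln \Pr(\mathbf{y}^* \mid \mathbf{x}) = \alpha(T-1, U-1) + \log p(\varnothing \mid T-1, U-1) = \beta(0, 0)

    The final blank needed to leave the last frame is the base case of the backward recursion, so it is already folded into betas[0, 0]; the forward pass has to add it explicitly, which is why the two totals look different even though they compute the same quantity. Likewise, on the u = 0 row no labels have been emitted yet, so the only way to advance from frame t-1 to frame t is to emit a blank, which is why line 48 uses log_probs[t-1, 0, blank].
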

    opened by smolendawid 2
  • Transducer update not backward compatible with speech repo running pytorch 0.4.1

    Transducer update not backward compatible with speech repo running pytorch 0.4.1

    Hello Awni,

    I wanted to let you know that I think your recent update to the transducer repo is not backward compatible with pytorch 0.4.1 when using your speech repo: https://github.com/awni/speech. I used pytorch 0.4.1 instead of pytorch 1.X when running your speech repo because I encountered an import error with ffi "ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead." when I tried using a more recent version of pytorch. As outlined here: https://github.com/pytorch/pytorch/issues/15645, the recommendation was to use an earlier version of pytorch, so I used 0.4.1.

    As a smaller issue, the Makefile in your speech repo calls a build.py script in libs/transducer, which is no longer present in the transducer repo. Separately, when I ran the "python setup.py install" command, I got the output below.

    Here are the details of my setup: OS: ubuntu-1604-xenial-v20200108 Python: Python 3.6.5 :: Anaconda, Inc. Pytorch: 0.4.1 Cuda: 10.0

        (awni_env36) dzubke@phoneme-1:~/awni_speech/speech/libs/transducer$ python setup.py install
        running install
        running bdist_egg
        running egg_info
        writing transducer_cpp.egg-info/PKG-INFO
        writing dependency_links to transducer_cpp.egg-info/dependency_links.txt
        writing top-level names to transducer_cpp.egg-info/top_level.txt
        reading manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        installing library code to build/bdist.linux-x86_64/egg
        running install_lib
        running build_ext
        building 'transducer_cpp' extension
        gcc -pthread -B /home/dzubke/miniconda3/envs/awni_env36/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include/THC -I/home/dzubke/miniconda3/envs/awni_env36/include/python3.6m -c transducer.cpp -o build/temp.linux-x86_64-3.6/transducer.o -fopenmp -DTORCH_EXTENSION_NAME=transducer_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        transducer.cpp:6:29: fatal error: torch/extension.h: No such file or directory
        compilation terminated.
        error: command 'gcc' failed with exit status 1

    This isn't a critical issue for me, as I am using a CTC model for phoneme recognition, so I don't need the transducer module. To get around it, I commented out the line "from speech.models.transducer_model import Transducer" in speech/speech/models/__init__.py, and that seems to work for me. For reference on my setup, the warp-ctc module did build properly.

    I just wanted to bring this to your attention. I am likely going to modify your speech repo to run on a more recent version of PyTorch, as I think that will make it easier to convert my CTC model to Core ML using ONNX, so I will let you know whether this issue is resolved when using a more recent version of PyTorch.

    Thanks for publishing your code! It has been very helpful to me in my project. Let me know if you want more details on my setup or this issue.

    opened by dzubke 1
  • Does not work for Pytorch 1.1

    Does not work for Pytorch 1.1

    When I ran python build.py, I got this error:

    Traceback (most recent call last):
      File "build.py", line 4, in <module>
        from torch.utils.ffi import create_extension
      File "/home/vespar/miniconda3/envs/ariyan/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
        raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.
    

    I am using PyTorch 1.1 and the script only supports PyTorch 0.4.1, so I changed the code to use the new PyTorch extension library. The code is below:

    import os
    import sys
    import torch
    #from torch.utils.ffi import create_extension
    from torch.utils.cpp_extension import BuildExtension, CppExtension
    from setuptools import setup
    import setuptools
    
    this_file = os.path.abspath(__file__)
    
    sources = ['src/transducer.c']
    headers = ['src/transducer.h']
    
    args = ["-std=c99"]
    if sys.platform == "darwin":
        args += ["-DAPPLE"]
    else:
        args += ["-fopenmp"]
    
    #ffi = create_extension(
    #    '_ext.transducer',
    #    headers=headers,
    #    sources=sources,
    #    relative_to=__file__,
    #    extra_compile_args=args
    #)
    
    
    setup(
        name='_ext.transducer',
        ext_modules=[
            CppExtension(
                name='_ext.transducer',
                sources=['src/transducer.h', 'src/transducer.c'],
                extra_compile_args=args),
        ],
        cmdclass={
            'build_ext': BuildExtension
        })
    

    After doing this, I got this error:

    usage: build.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
       or: build.py --help [cmd1 cmd2 ...]
       or: build.py --help-commands
       or: build.py cmd --help
    
    error: no commands supplied
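
    That last error is likely just because setuptools' setup() expects a command on the command line; running the converted script bare with python build.py supplies none, so it probably needs to be invoked with something like:

        python build.py build_ext --inplace

    or python build.py install.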
    
    opened by mdhasanai 1
  • pytorch 1.1 support

    pytorch 1.1 support

    getting this error when using torch 1.1:

    from .._ext import transducer
      File "/home/tanish/ds3/libs/transducer/_ext/transducer/__init__.py", line 2, in <module>
        from torch.utils.ffi import _wrap_function
      File "/home/tanish/environments/ds3_torch_1.1/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
        raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.

    Please add PyTorch 1.1 support to the code. Any help on this?

    opened by rajeevbaalwan 1
  • Import Error,  _transducer can not be imported.

    Import Error, _transducer can not be imported.

    Following the README, I got an import error suggesting that there may be a circular import.

    Finally I found a workaround: I have to manually copy _transducer.cpython-xxxx.so in order to run torch_test. I'm wondering whether there is a more advisable solution.

    opened by xiongjun19 1
Owner
Awni Hannun
Distinguished Scientist at Zoom AI
Sequence-to-Sequence learning using PyTorch

Seq2Seq in PyTorch This is a complete suite for training sequence-to-sequence models in PyTorch. It consists of several models and code to both train

Elad Hoffer 514 Nov 17, 2022
Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

SETR - Pytorch Since the original paper (Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.) has no official

zhaohu xing 112 Dec 16, 2022
An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

Luke Tonin 195 Dec 17, 2022
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Segmentation Transformer Implementation of Segmentation Transformer in PyTorch, a new model to achieve SOTA in semantic segmentation while using trans

Abhay Gupta 161 Dec 8, 2022
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021) Citation Please cite as: @inproceedings{liu2020understan

Sunbow Liu 22 Nov 25, 2022
[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Fudan Zhang Vision Group 897 Jan 5, 2023
Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is a fork of Fairseq(-py) with implementations of the following models: Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Se

Maha 490 Dec 15, 2022
Sequence lineage information extracted from RKI sequence data repo

Pango lineage information for German SARS-CoV-2 sequences This repository contains a join of the metadata and pango lineage tables of all German SARS-

Cornelius Roemer 24 Oct 26, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 8, 2023
.NET bindings for the Pytorch engine

TorchSharp TorchSharp is a .NET library that provides access to the library that powers PyTorch. It is a work in progress, but already provides a .NET

Matteo Interlandi 17 Aug 30, 2021
Rust bindings for the C++ api of PyTorch.

tch-rs Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorc

Laurent Mazare 2.3k Dec 30, 2022
Super-Fast-Adversarial-Training - A PyTorch Implementation code for developing super fast adversarial training

Super-Fast-Adversarial-Training This is a PyTorch Implementation code for develo

LBK 26 Dec 2, 2022
A PyTorch Implementation of Gated Graph Sequence Neural Networks (GGNN)

A PyTorch Implementation of GGNN This is a PyTorch implementation of the Gated Graph Sequence Neural Networks (GGNN) as described in the paper Gated G

Ching-Yao Chuang 427 Dec 13, 2022
Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.

An Image Captioning codebase This is a codebase for image captioning research. It supports: Self critical training from Self-critical Sequence Trainin

Ruotian(RT) Luo 906 Jan 3, 2023
CUDA Python Low-level Bindings

CUDA Python Low-level Bindings

NVIDIA Corporation 529 Jan 3, 2023
RLBot Python bindings for the Rust crate rl_ball_sym

RLBot Python bindings for rl_ball_sym 0.6 Prerequisites: Rust & Cargo Build Tools for Visual Studio RLBot - Verify that the file %localappdata%\RLBotG

Eric Veilleux 2 Nov 25, 2022
Simple renderer for use with MuJoCo (>=2.1.2) Python Bindings.

Viewer for MuJoCo in Python Interactive renderer to use with the official Python bindings for MuJoCo. Starting with version 2.1.2, MuJoCo comes with n

Rohan P. Singh 62 Dec 30, 2022
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloš Stanojević 11 Oct 14, 2022