A Fast Sequence Transducer Implementation with PyTorch Bindings

Overview

transducer

A fast sequence transducer implementation with PyTorch bindings. The corresponding publication is Sequence Transduction with Recurrent Neural Networks (Graves, 2012).

Tested with Python 3.7 and PyTorch 1.3

Install and Test

First install PyTorch, then from the top level of the repo run:

python setup.py install

And test with

python test.py
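
If the install succeeds, the compiled C++ extension should be importable from Python. As a quick smoke test (assuming the extension module is named transducer_cpp, the name that appears in the build logs quoted in the comments below):

python -c "import torch; import transducer_cpp"
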
Comments
  • always get nothing trying to use viterbi decode interface

    always get nothing trying to use viterbi decode interface

    Hi, awni! Thanks for your great repo. I have a problem with how to use the decode interface. I have tried code like the following:

        B, T, *_ = scores.size()
        logit_lengths = torch.full((B, ), T, dtype=torch.int, device=scores.device)
        y = torch.full([B, 1], 0, dtype=torch.int32, device=scores.device)
        cur_len = 0

        for i in range(T):
            old_y = y
            preds, _ = self.pred_net(old_y)
            label_lengths = torch.full((B, ), cur_len, dtype=torch.int, device=scores.device)
            y = self.criterion.viterbi(scores, preds, logit_lengths, label_lengths)
            b, new_len = y.shape
            if new_len < 1:
                break
            print("shape of y is: ", y.shape)
            cur_len = new_len

    but it always breaks at the first step.

    opened by xiongjun19 4
  • setup error: ‘isfinite’ was not declared in this scope

    setup error: ‘isfinite’ was not declared in this scope

    For others' reference, I encountered an "‘isfinite’ was not declared in this scope" error when running the setup.py script. I was able to resolve it based on the fix outlined here: https://github.com/erincatto/Box2D/issues/509. I added the string "std::" in front of isfinite, so "isfinite" became "std::isfinite" in the two occurrences in transducer.cpp.

    For reference, here is my setup information: Ubuntu 16.04 Python 3.6.5 :: Anaconda, Inc. Pytorch 1.3.1 gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

    The error output is below:

        (transducer) dzubke@phoneme-1:~/awni_speech/transducer$ python setup.py install
        running install
        running bdist_egg
        running egg_info
        creating transducer_cpp.egg-info
        writing transducer_cpp.egg-info/PKG-INFO
        writing dependency_links to transducer_cpp.egg-info/dependency_links.txt
        writing top-level names to transducer_cpp.egg-info/top_level.txt
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        reading manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        installing library code to build/bdist.linux-x86_64/egg
        running install_lib
        running build_ext
        building 'transducer_cpp' extension
        creating build
        creating build/temp.linux-x86_64-3.7
        gcc -pthread -B /home/dzubke/miniconda3/envs/transducer/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/TH -I/home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/THC -I/home/dzubke/miniconda3/envs/transducer/include/python3.7m -c transducer.cpp -o build/temp.linux-x86_64-3.7/transducer.o -fopenmp -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=transducer_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        transducer.cpp: In function ‘float log_sum_exp(float, float)’:
        transducer.cpp:9:20: error: ‘isfinite’ was not declared in this scope
            if (!isfinite(a)) return b;
                           ^
        transducer.cpp:9:20: note: suggested alternative:
        In file included from /usr/include/c++/5/random:38:0, from /usr/include/c++/5/bits/stl_algo.h:66, from /usr/include/c++/5/algorithm:62, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/SmallVector.h:26, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/ArrayRef.h:18, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/core/MemoryFormat.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:11, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Context.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/ATen.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/extension.h:4, from transducer.cpp:6:
        /usr/include/c++/5/cmath:601:5: note: ‘std::isfinite’
            isfinite(_Tp __x)
            ^
        transducer.cpp:10:20: error: ‘isfinite’ was not declared in this scope
            if (!isfinite(b)) return a;
                           ^
        transducer.cpp:10:20: note: suggested alternative:
        In file included from /usr/include/c++/5/random:38:0, from /usr/include/c++/5/bits/stl_algo.h:66, from /usr/include/c++/5/algorithm:62, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/SmallVector.h:26, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/util/ArrayRef.h:18, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/c10/core/MemoryFormat.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:11, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/Context.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/ATen/ATen.h:5, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4, from /home/dzubke/miniconda3/envs/transducer/lib/python3.7/site-packages/torch/include/torch/extension.h:4, from transducer.cpp:6:
        /usr/include/c++/5/cmath:601:5: note: ‘std::isfinite’
            isfinite(_Tp __x)
            ^
        error: command 'gcc' failed with exit status 1

    opened by dzubke 3
  • Support complex joiner networks

    Support complex joiner networks

    I find that the current implementation supports only joiner networks containing an adder: https://github.com/awni/transducer/blob/b517f1f60177b6be2e3928a11c02784de8977672/torch_test.py#L187-L188

    In the paper Speech Recognition with Deep Recurrent Neural Networks, which is a successor of Sequence Transduction with Recurrent Neural Networks, it mentions another kind of joiner network:

    In the original formulation Pr(k|t, u) was defined by taking an ‘acoustic’ distribution Pr(k|t) from the CTC network, a ‘linguistic’ distribution Pr(k|u) from the prediction network, then multiplying the two together and renormalising. An improvement introduced in this paper is to instead feed the hidden activations of both networks into a separate feedforward output network, whose outputs are then normalised with a softmax function to yield Pr(k|t, u). This allows a richer set of possibilities for combining linguistic and acoustic information, and appears to lead to better generalisation. In particular we have found that the number of deletion errors encountered during decoding is reduced.

    I am wondering whether it will support feedforward joiner networks that contain nn.Linear() and nn.Tanh() layers.
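
    For illustration, a minimal feedforward joiner of the kind the paper describes might look like the sketch below; the class and tensor names here are assumptions for this example, not part of this repo. It combines encoder and prediction-network activations with Linear and Tanh layers:

        import torch
        import torch.nn as nn

        class FeedForwardJoiner(nn.Module):
            """Hypothetical joiner: encoder output (B, T, H) and prediction-network
            output (B, U, H) are combined into logits of shape (B, T, U, V)."""

            def __init__(self, hidden_dim, vocab_size):
                super().__init__()
                self.fc = nn.Linear(2 * hidden_dim, hidden_dim)
                self.act = nn.Tanh()
                self.out = nn.Linear(hidden_dim, vocab_size)

            def forward(self, enc_out, pred_out):
                # Broadcast both inputs to (B, T, U, H) and concatenate features.
                t = enc_out.unsqueeze(2).expand(-1, -1, pred_out.size(1), -1)
                u = pred_out.unsqueeze(1).expand(-1, enc_out.size(1), -1, -1)
                joint = torch.cat([t, u], dim=-1)
                return self.out(self.act(self.fc(joint)))

    Note that materialising a dense (B, T, U, V) logit tensor like the one this sketch produces is exactly what makes memory grow with the vocabulary size, which is relevant to the README claim quoted next.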


    Also, the README.md says:

    The memory of this implementation scales with the product B * T * U and does not increase with the token set size

    But the memory occupied by the encoder and decoder is proportional to the vocabulary size. The memory consumed by a vocab size of 10k is certainly more than that of a size of 1k, I believe.

    opened by csukuangfj 2
  • Question about the gradient computation

    Question about the gradient computation

    Thanks for the great work.

    https://github.com/awni/transducer/blob/5a1c2c776f1667d78c79310616fd2d48bff4e4f5/ref_transduce.py#L92

    I couldn't find the log_probs term anywhere in the paper, specifically in equation (20). Could you point me to the equation in the paper that refers to it?
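
    For context, a likely source of that term (an observation from the chain rule, not a quote from the paper): the reference code differentiates with respect to log-probabilities rather than the probabilities themselves. If equation (20) gives the gradient with respect to Pr(k | t, u), then

        \frac{\partial \ln \Pr(\mathbf{y}^* \mid \mathbf{x})}{\partial \log \Pr(k \mid t, u)}
            = \Pr(k \mid t, u) \, \frac{\partial \ln \Pr(\mathbf{y}^* \mid \mathbf{x})}{\partial \Pr(k \mid t, u)}

    so in log space the gradient picks up an additive log_probs[t, u, k] term on top of the alpha, beta, and log-likelihood terms that come from equation (20).
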

    opened by digital10111 2
  • Questions about the reasoning in forward-backward algorithm

    Questions about the reasoning in forward-backward algorithm

    I'm looking for an explanation of the following 3 cases:

    • Why should the forward and backward passes return the same (or very similar) likelihoods?

    • The forward likelihood is the sum of the last forward variable and the log-probability of emitting the blank token in the last step, alphas[T-1, U-1] + log_probs[T-1, U-1, blank], whereas the backward likelihood is just the first backward variable (the last step of its computation), betas[0, 0]. Why are they different?

    • In line 48 of https://github.com/awni/transducer/blob/master/ref_transduce.py, alphas[t, 0] = alphas[t-1, 0] + log_probs[t-1, 0, blank]; why shouldn't we use log_probs[t-1, 0, labels[-1]] instead? Isn't it assuming that we expect the last frame to emit a blank token, which is not true?

    Could anyone here help me to understand these problems?
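
    For reference, the recursions in ref_transduce.py follow the standard transducer formulation, sketched here from the paper's definitions (treat the exact indexing as an assumption):

        \alpha(t, u) = \mathrm{logsumexp}\left( \alpha(t-1, u) + \log p(\varnothing \mid t-1, u),\; \alpha(t, u-1) + \log p(y_u \mid t, u-1) \right)
        \beta(t, u)  = \mathrm{logsumexp}\left( \beta(t+1, u) + \log p(\varnothing \mid t, u),\; \beta(t, u+1) + \log p(y_{u+1} \mid t, u) \right)
        \beta(T-1, U-1) = \log p(\varnothing \mid T-1, U-1)
        \ln \Pr(\mathbf{y}^* \mid \mathbf{x}) = \alpha(T-1, U-1) + \log p(\varnothing \mid T-1, U-1) = \beta(0, 0)

    The final blank needed to leave the last frame is the base case of the backward recursion, so it is already folded into betas[0, 0]; the forward pass has to add it explicitly, which is why the two totals look different even though they compute the same quantity. Likewise, on the u = 0 row no labels have been emitted yet, so the only way to advance from frame t-1 to frame t is to emit a blank, which is why line 48 uses log_probs[t-1, 0, blank].
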

    opened by smolendawid 2
  • Transducer update not backward compatible with speech repo running pytorch 0.4.1

    Transducer update not backward compatible with speech repo running pytorch 0.4.1

    Hello Awni,

    I wanted to let you know that I think your recent update to the transducer repo is not backward compatible with pytorch 0.4.1 when using your speech repo: https://github.com/awni/speech. I used pytorch 0.4.1 instead of pytorch 1.X when running your speech repo because I encountered an import error with ffi "ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead." when I tried using a more recent version of pytorch. As outlined here: https://github.com/pytorch/pytorch/issues/15645, the recommendation was to use an earlier version of pytorch, so I used 0.4.1.

    As a smaller issue, the Makefile in your speech repo calls a build.py script in libs/transducer, which is no longer present in the transducer repo. Separately, when I ran the "python setup.py install" command, I got the output below.

    Here are the details of my setup: OS: ubuntu-1604-xenial-v20200108 Python: Python 3.6.5 :: Anaconda, Inc. Pytorch: 0.4.1 Cuda: 10.0

        (awni_env36) dzubke@phoneme-1:~/awni_speech/speech/libs/transducer$ python setup.py install
        running install
        running bdist_egg
        running egg_info
        writing transducer_cpp.egg-info/PKG-INFO
        writing dependency_links to transducer_cpp.egg-info/dependency_links.txt
        writing top-level names to transducer_cpp.egg-info/top_level.txt
        reading manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        writing manifest file 'transducer_cpp.egg-info/SOURCES.txt'
        installing library code to build/bdist.linux-x86_64/egg
        running install_lib
        running build_ext
        building 'transducer_cpp' extension
        gcc -pthread -B /home/dzubke/miniconda3/envs/awni_env36/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include/TH -I/home/dzubke/miniconda3/envs/awni_env36/lib/python3.6/site-packages/torch/lib/include/THC -I/home/dzubke/miniconda3/envs/awni_env36/include/python3.6m -c transducer.cpp -o build/temp.linux-x86_64-3.6/transducer.o -fopenmp -DTORCH_EXTENSION_NAME=transducer_cpp -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
        cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
        transducer.cpp:6:29: fatal error: torch/extension.h: No such file or directory
        compilation terminated.
        error: command 'gcc' failed with exit status 1

    This isn't a critical issue for me, as I am using a CTC model for phoneme recognition, so I don't need the transducer module. To get around it, I commented out the line "from speech.models.transducer_model import Transducer" in speech/speech/models/__init__.py, and that seems to work for me. For reference on my setup, the warp-ctc module did build properly.

    I just wanted to bring this to your attention. I am likely going to modify your speech repo to run on a more recent version of PyTorch, as I think that will make it easier to convert my CTC model to Core ML using ONNX, so I will let you know whether this issue is resolved when using a more recent version of PyTorch.

    Thanks for publishing your code! It has been very helpful to me in my project. Let me know if you want more details on my setup or this issue.

    opened by dzubke 1
  • Does not work for Pytorch 1.1

    Does not work for Pytorch 1.1

    When I ran python build.py, I got this error:

    Traceback (most recent call last):
      File "build.py", line 4, in <module>
        from torch.utils.ffi import create_extension
      File "/home/vespar/miniconda3/envs/ariyan/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
        raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.
    

    I am using PyTorch 1.1 and the script only supports PyTorch 0.4.1, so I changed the code to use the new PyTorch extension library. The code is below:

    import os
    import sys
    import torch
    #from torch.utils.ffi import create_extension
    from torch.utils.cpp_extension import BuildExtension, CppExtension
    from setuptools import setup
    import setuptools
    
    this_file = os.path.abspath(__file__)
    
    sources = ['src/transducer.c']
    headers = ['src/transducer.h']
    
    args = ["-std=c99"]
    if sys.platform == "darwin":
        args += ["-DAPPLE"]
    else:
        args += ["-fopenmp"]
    
    #ffi = create_extension(
    #    '_ext.transducer',
    #    headers=headers,
    #    sources=sources,
    #    relative_to=__file__,
    #    extra_compile_args=args
    #)
    
    
    setup(
        name='_ext.transducer',
        ext_modules=[
            CppExtension(
                name='_ext.transducer',
                sources=['src/transducer.h', 'src/transducer.c'],
                extra_compile_args=args),
        ],
        cmdclass={
            'build_ext': BuildExtension
        })
    

    After doing this, I got this error:

    usage: build.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
       or: build.py --help [cmd1 cmd2 ...]
       or: build.py --help-commands
       or: build.py cmd --help
    
    error: no commands supplied
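
    That last error is likely just because setuptools' setup() expects a command on the command line; running the converted script bare with python build.py supplies none, so it probably needs to be invoked with something like:

        python build.py build_ext --inplace

    or python build.py install.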
    
    opened by mdhasanai 1
  • pytorch 1.1 support

    pytorch 1.1 support

    getting this error when using torch 1.1:

    from .._ext import transducer
      File "/home/tanish/ds3/libs/transducer/_ext/transducer/__init__.py", line 2, in <module>
        from torch.utils.ffi import _wrap_function
      File "/home/tanish/environments/ds3_torch_1.1/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 1, in <module>
        raise ImportError("torch.utils.ffi is deprecated. Please use cpp extensions instead.")
    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.

    Please add PyTorch 1.1 support to the code. Any help on this?

    opened by rajeevbaalwan 1
  • Import Error,  _transducer can not be imported.

    Import Error, _transducer can not be imported.

    Following the README, I got an import error suggesting that there may be a circular import.

    Finally I found a workaround: I have to manually copy _transducer.cpython-xxxx.so in order to run torch_test. I'm wondering whether there is a more advisable solution.

    opened by xiongjun19 1
Owner
Awni Hannun
Distinguished Scientist at Zoom AI
Sequence-to-Sequence learning using PyTorch

Seq2Seq in PyTorch This is a complete suite for training sequence-to-sequence models in PyTorch. It consists of several models and code to both train

Elad Hoffer 514 Nov 17, 2022
Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

SETR - Pytorch Since the original paper (Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.) has no official

zhaohu xing 112 Dec 16, 2022
An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

Luke Tonin 195 Dec 17, 2022
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Segmentation Transformer Implementation of Segmentation Transformer in PyTorch, a new model to achieve SOTA in semantic segmentation while using trans

Abhay Gupta 161 Dec 8, 2022
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021) Citation Please cite as: @inproceedings{liu2020understan

Sunbow Liu 22 Nov 25, 2022
[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Fudan Zhang Vision Group 897 Jan 5, 2023
Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is a fork of Fairseq(-py) with implementations of the following models: Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Se

Maha 490 Dec 15, 2022
Sequence lineage information extracted from RKI sequence data repo

Pango lineage information for German SARS-CoV-2 sequences This repository contains a join of the metadata and pango lineage tables of all German SARS-

Cornelius Roemer 24 Oct 26, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 8, 2023
.NET bindings for the Pytorch engine

TorchSharp TorchSharp is a .NET library that provides access to the library that powers PyTorch. It is a work in progress, but already provides a .NET

Matteo Interlandi 17 Aug 30, 2021
Rust bindings for the C++ api of PyTorch.

tch-rs Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorc

Laurent Mazare 2.3k Dec 30, 2022
Super-Fast-Adversarial-Training - A PyTorch Implementation code for developing super fast adversarial training

Super-Fast-Adversarial-Training This is a PyTorch Implementation code for develo

LBK 26 Dec 2, 2022
A PyTorch Implementation of Gated Graph Sequence Neural Networks (GGNN)

A PyTorch Implementation of GGNN This is a PyTorch implementation of the Gated Graph Sequence Neural Networks (GGNN) as described in the paper Gated G

Ching-Yao Chuang 427 Dec 13, 2022
Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.

An Image Captioning codebase This is a codebase for image captioning research. It supports: Self critical training from Self-critical Sequence Trainin

Ruotian(RT) Luo 906 Jan 3, 2023
CUDA Python Low-level Bindings

CUDA Python Low-level Bindings

NVIDIA Corporation 529 Jan 3, 2023
RLBot Python bindings for the Rust crate rl_ball_sym

RLBot Python bindings for rl_ball_sym 0.6 Prerequisites: Rust & Cargo Build Tools for Visual Studio RLBot - Verify that the file %localappdata%\RLBotG

Eric Veilleux 2 Nov 25, 2022
Simple renderer for use with MuJoCo (>=2.1.2) Python Bindings.

Viewer for MuJoCo in Python Interactive renderer to use with the official Python bindings for MuJoCo. Starting with version 2.1.2, MuJoCo comes with n

Rohan P. Singh 62 Dec 30, 2022
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloš Stanojević 11 Oct 14, 2022