Applications using the GTN library and code to reproduce experiments in "Differentiable Weighted Finite-State Transducers"

Facebook Research

Last update: Dec 29, 2022

Overview

gtn_applications

An applications library using GTN. Current examples include:

Offline handwriting recognition
Automatic speech recognition

Installing

1. Build the Python bindings for the GTN library.
2. conda activate gtn_env # using the same environment from Step 1
3. conda install pytorch torchvision -c pytorch
4. pip install -r requirements.txt

Training

We give an example of how to train on the IAM off-line handwriting recognition benchmark.

First register on the IAM database website, then download the dataset:

./datasets/download/iamdb.sh <path_to_data> <email> <password>

Then update the configuration JSON configs/iamdb/tds2d.json to point to the data path used above:

  "data" : {
    "dataset" : "iamdb",
    "data_path" : "<path_to_data>",
    "num_features" : 64
  },
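
The same edit can be scripted instead of done by hand. The following is a minimal sketch, assuming the config file is plain JSON with a top-level "data" object as shown above; the helper name set_data_path and the /tmp/iamdb location are hypothetical, and the demo writes to a throwaway file rather than to configs/iamdb/tds2d.json.

```python
import json
import tempfile
from pathlib import Path


def set_data_path(config_file, data_path):
    """Rewrite the "data_path" entry of the "data" section in a JSON config."""
    config_file = Path(config_file)
    config = json.loads(config_file.read_text())
    config["data"]["data_path"] = str(data_path)
    config_file.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo on a throwaway file that mirrors the "data" section shown above.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "tds2d.json"
    demo.write_text(json.dumps({
        "data": {"dataset": "iamdb", "data_path": "<path_to_data>", "num_features": 64}
    }))
    updated = set_data_path(demo, "/tmp/iamdb")  # hypothetical data location
    print(updated["data"])
```

Only the data_path value changes; the other keys are written back untouched.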

Single GPU training can be run with:

python train.py --config configs/iamdb/tds2d.json

To run distributed training with multiple GPUs:

python train.py --config configs/iamdb/tds2d.json --world_size <NUM_GPUS>

For a list of options type:

python train.py -h

Contributing

Use Black to format Python code.

First install:

pip install black

Then run with:

black <file>.py

License

GTN is licensed under an MIT license. See LICENSE.

Comments

module 'gtn' has no attribute 'Device' in STC

I installed gtn with pip install gtn and tested STC as shown below:

from criterions.STC import STC
import random
import torch

stcloss = STC(blank_idx=0, p0=1, plast=1, thalf=1, reduction="none")

batch_size = 2
mel_len = 115
text_len = 25
num_token = 64
inputs = torch.randn(mel_len, batch_size, num_token)
targets = [[random.randint(1, text_len) for _ in range(random.randint(1, num_token))] for _ in range(batch_size)]
loss = stcloss(inputs, targets)

But when I try to run the code above, the following error occurs:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-b0ecfb513efd> in <module>
      7 inputs = torch.randn(mel_len, batch_size, num_token)
      8 targets = [[random.randint(1, text_len) for _ in range(random.randint(1, num_token))] for _ in range(batch_size)]
----> 9 loss = stcloss(inputs, targets)

~/anaconda3/envs/leecho/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/data/leecho/xi-stt/xi-stt/model/STCLoss.py in forward(self, inputs, targets)
    212             # concatenate (tokens, <star>, <star>\tokens)
    213             log_probs = torch.cat([log_probs, lse, neglse], dim=2)
--> 214         return STCLoss(log_probs, targets, prob, self.reduction)

/data/leecho/xi-stt/xi-stt/model/STCLoss.py in forward(ctx, inputs, targets, prob, reduction)
     96             emissions_graphs[b] = g_emissions
     97 
---> 98         gtn.parallel_for(process, range(B))
     99 
    100         ctx.auxiliary_data = (losses, scales, emissions_graphs, inputs.shape)

/data/leecho/xi-stt/xi-stt/model/STCLoss.py in process(b)
     71             # create emission graph
     72             g_emissions = gtn.linear_graph(
---> 73                 T, Cstar, gtn.Device(gtn.CPU), inputs.requires_grad
     74             )
     75             cpu_data = inputs[b].cpu().contiguous()

AttributeError: module 'gtn' has no attribute 'Device'

Is there any way to solve this problem?
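
One way to narrow this down is to check which names the installed gtn module actually exposes. The sketch below is only a diagnostic, not a fix: it assumes nothing about gtn beyond the names that appear in the traceback, and the helper names check_module_attrs and load_optional are made up for illustration. A version mismatch between the pip package and the STC code is one plausible cause, but that is an assumption and is not confirmed in this thread.

```python
import importlib


def check_module_attrs(module, names):
    """Return {name: bool} telling which attributes the module exposes."""
    return {name: hasattr(module, name) for name in names}


def load_optional(module_name):
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


gtn = load_optional("gtn")
if gtn is None:
    print("gtn is not installed in this environment")
else:
    # Names taken from the traceback above.
    print(check_module_attrs(gtn, ["Device", "CPU", "linear_graph", "parallel_for"]))
```

If Device is reported as missing while linear_graph is present, the installed package and the STC code were most likely written against different GTN versions.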

opened by LEECHOONGHO 7

[STC] STC loss ascends while training

Hello, I'm training an ASR model with STC loss and a letter-to-word encoder, as shown below. As training progresses, the STC loss increases and becomes 'Inf' after 12,000 steps.

Is there a mistake in my implementation? Any help would be appreciated. Thank you.

training args:

num_gpu : 4
audio_length_per_gpu : 160s
lr : 0.0001~0.001
use FullyShardedDataParallel, mix_precision=off

#   max_word_length : 10
#   n_letter_symbols : 69 (blank + pad + korean)
#   n_word_symbols : 12158 (blank + korean_morph 97% in corpus)

#   blank_idx=0, p0=0.05, plast=0.15, thalf=16000

self.criterion = STC(
    blank_idx=self.cfg.blank_idx, 
    p0=self.cfg.p0, 
    plast=self.cfg.plast, 
    thalf=self.cfg.thalf, reduction="mean"
)

#   model_output : Tensor[batch_size, max_frame_length, n_letter_symbols*max_word_length]
#   self.l2w_matrix : Tensor[n_letter_symbols*max_word_length, n_word_symbols]

word_level_output = model_output @ self.l2w_matrix

#   word_level_output : Tensor[batch_size, max_frame_length, n_word_symbols]

word_level_output =  F.log_softmax(word_level_output.transpose(1, 0), dim=-1)

loss = self.criterion(word_level_output, word_labels)
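
For reference when reading these settings, here is a rough sketch of how p0, plast and thalf could interact over training. It assumes an exponential schedule in which the value starts at p0, approaches plast, and reaches the midpoint (p0 + plast) / 2 after thalf steps. This form is an assumption made for illustration, and the function name insertion_prob is hypothetical; check the STC criterion source for the schedule it really uses.

```python
import math


def insertion_prob(step, p0, plast, thalf):
    """Assumed schedule: moves from p0 toward plast, halving the gap every thalf steps."""
    return plast + (p0 - plast) * math.exp(-step * math.log(2) / thalf)


# Values for the settings quoted above (p0=0.05, plast=0.15, thalf=16000).
for step in (0, 4000, 12000, 16000, 64000):
    print(step, round(insertion_prob(step, p0=0.05, plast=0.15, thalf=16000), 4))
```

Under this assumed schedule the value at step 12000 is still well below plast, so the point where the loss diverges does not coincide with the end of the schedule.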

opened by LEECHOONGHO 3

[STC] Question for STC loss training

Hello, I'm trying to apply STC to my ASR model training.

Before starting training, I have some questions about the STC training described in the STC paper [1]. If anyone has experimented with the cases below, please give me some advice.

1. Is STC valid for data with pDrop=0.01~0.02, or should I just use CTC?
    Data: about 99% reliable; it may contain some typos, and sometimes the business name is erased.

    1-1. If STC is valid for case 1, is it sufficient to set p_0=0.01, p_max=0.03?

2.  Is Adam not allowed for STC (WFST)?

3. For future work, I am thinking of using pseudo-labeled YouTube data for ASR training.
    In this case, the data could contain many incorrect labels.
    Does STC perform better than CTC even when training on incorrectly labeled data?

Thank you.

[1] Star Temporal Classification: Sequence Classification with Partially Labeled Data.

opened by LEECHOONGHO 2

RNN model correction

The RNN model's default parameters include strides: there are 2 convolutional layers, each with a stride of 2. So with num_features = 80, the feature dimension is reduced 80 -> 40 -> 20. Updated the code to account for this.
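
The arithmetic can be sketched as follows. This is a standalone illustration, not the code from this PR: the helper name reduced_features is hypothetical, and it assumes each convolution is padded so that its output size is ceil(input / stride).

```python
import math


def reduced_features(num_features, strides):
    """Feature size after each layer of a stack of strided convolutions.

    Assumes each layer is padded so that output = ceil(input / stride).
    """
    sizes = [num_features]
    for stride in strides:
        sizes.append(math.ceil(sizes[-1] / stride))
    return sizes


print(reduced_features(80, [2, 2]))  # [80, 40, 20]
```

The last entry is the feature size that the layers following the convolutions need to expect as input.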
CLA Signed

opened by ronitd 2
Open Source STC code; Some code cleanup
Summary

Create criterions, models directory and organize files appropriately

Make __init__.py files empty so that model training works. Otherwise, there were issues with module loading.

Will send a follow-up PR to add documentation to STC code

Test Plan:

Ran IamDB training and made sure it runs for an epoch.
CLA Signed
opened by vineelpratap 1
Dumb question for arc_sort func

When should we call arc_sort(True)? For example, in https://github.com/facebookresearch/gtn_applications/blob/eb1cb83dda3d3887f980dbd6b697c2c2b6fd1d45/transducer.py#L265 arc_sort was given the parameter True. If my understanding is correct, for an FSA target, arc_sort(True) and arc_sort(False) would give exactly the same result? How do we decide whether to set it to True or False? (When should we sort by olabel, and when by ilabel?)
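
The intuition in the question can be checked with a toy model in plain Python. This is not the GTN API: the Arc tuple and this arc_sort function are hypothetical stand-ins, and sorting by (source state, label) is an assumption about how arc sorting is organized.

```python
from collections import namedtuple

Arc = namedtuple("Arc", "src dst ilabel olabel weight")


def arc_sort(arcs, olabel=False):
    """Sort arcs by source state, then by output label if olabel else input label."""
    key = (lambda a: (a.src, a.olabel)) if olabel else (lambda a: (a.src, a.ilabel))
    return sorted(arcs, key=key)


# Acceptor (FSA): every arc has ilabel == olabel.
acceptor = [Arc(0, 1, 2, 2, 0.0), Arc(0, 1, 0, 0, 0.0), Arc(0, 2, 1, 1, 0.0)]
# Transducer (FST): input and output labels differ.
transducer = [Arc(0, 1, 2, 0, 0.0), Arc(0, 1, 0, 1, 0.0), Arc(0, 2, 1, 2, 0.0)]

print(arc_sort(acceptor) == arc_sort(acceptor, olabel=True))      # True
print(arc_sort(transducer) == arc_sort(transducer, olabel=True))  # False
```

In this toy model the flag makes no difference for an acceptor and does for a transducer. In WFST toolkits generally, composition matches the first graph's output labels against the second graph's input labels, which is why the choice matters once input and output labels differ.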

opened by yuekaizhang 1
Is it possible to get multiple recognition results instead of one?

It seems that in model.viterbi(self, outputs), only one recognition result is returned for each sample. Is it possible to return multiple alternatives from this decoding method, as with beam search? If so, what decoding algorithm would you recommend? Would it be time-consuming?

Thanks!

opened by zhwa 1
Organize dataset download scripts
Update IAM download script. Previous scripts were not working - https://stackoverflow.com/q/64715260

Create new directories dataset/download and dataset/utils and move the files accordingly

CLA Signed
opened by vineelpratap 0
Publish recipes for ngram transitions work
Consider this as V0. TODO

Test all the recipes again

Add more details to the README. Possibly share ltr, WP-based tokens, lexicon, transition graph directly via S3 ...

CLA Signed
opened by vineelpratap 0
For ASR inference, how can ASG be used on LibriSpeech or any WAV audio with PyTorch?

Hi. I am planning to try ASG as a replacement for Wav2Vec2 with an LM.

Compare the PyTorch tutorial on CTC decoding: https://pytorch.org/audio/main/tutorials/asr_inference_with_ctc_decoder_tutorial.html

I am wondering how to do the same thing with ASG in place of CTC. If an LM is used with ASG, is it better than wav2vec2 with an LM and CTC? How large is the difference?

Could you show me how to do automatic speech recognition with ASG? I hope to write the decoder in JavaScript once I understand how it works.

opened by JonathanSum 2
[STC] Question about ASR training.
Hello, I'm trying to implement the ASR model proposed in Star Temporal Classification, but I'm having trouble implementing my first word-level-output ASR model.

When I use a simple one-hot tensor for the letter-to-word encoder (E matrix), the STC loss increases. So I made several modifications to solve this problem.

(1). I view x : [T, B, A_L × l_max] as x : [T, B, l_max, A_L], apply F.log_softmax(dim=-1), and view it back to the original shape x : [T, B, A_L × l_max].

(2). Then I apply the letter-to-word encoder e_matrix : [A_L × l_max, A_W] by x = x @ e_matrix, and use F.log_softmax(torch.exp(x), dim=-1) as the STC input.

I apply softmax over A_L so that x becomes the probability of each letter appearing at each position in the word. The log -> e_matrix -> exp sequence converts the one-hot sum from the e_matrix operation into a product of probabilities.
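
The sum-to-product reasoning above can be checked on a toy case. The sketch below uses plain Python rather than PyTorch, handles a single frame, and ignores padding; the alphabet, vocabulary and scores are all hypothetical. It covers only step (1) and the e_matrix product from step (2), not the final F.log_softmax(torch.exp(x)).

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


letters = ["a", "b", "c"]   # hypothetical alphabet, A_L = 3
words = ["ab", "ba", "cc"]  # hypothetical vocabulary, l_max = 2

# Raw scores for one frame, laid out as [l_max][A_L].
frame = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]

# Step (1): normalize over letters independently at each position.
letter_logp = [log_softmax(pos) for pos in frame]

# Step (2): a one-hot letter-to-word matrix picks one letter per position,
# so the matrix product reduces to summing the selected letter log-probs.
word_score = {
    w: sum(letter_logp[i][letters.index(ch)] for i, ch in enumerate(w))
    for w in words
}

for w, s in word_score.items():
    print(w, round(math.exp(s), 4))
```

Exponentiating each word score gives the product of its per-position letter probabilities. These products sum to less than one when the vocabulary omits some letter combinations, so whether and how to renormalize over words is a separate choice that this sketch leaves open.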

1. Is this the right approach for the letter-to-word encoder?

After applying this, the STC loss starts at 3.5 and falls to 2.1~2.5 over 15 epochs. But the Viterbi-decoded output (implemented in gtn_applications/criterions/ctc.py) is always BLANK when I check WER for every predicted output.

Because CTC loss with word-level classification output gives the same result, I assume this is a problem with the properties of CTC training and word-level-output ASR.

Is this an ordinary result?

How many epochs are usually needed to get results other than blank?

How many epochs are usually needed to reach the highest performance?

I'm sorry to bother you every time.
opened by LEECHOONGHO 6
"MemoryError: std::bad_alloc" while using compose and intersect function.

I want to compose two GTN graphs: one for the lexicon and one for the grammar (LM), which is created from the lm_arpa.py file. I get "MemoryError: std::bad_alloc" even with 250 GB of RAM. I am not sure whether this is expected. Please find attached both graphs and the code for reproducibility: gu-G.txt gu-L.txt

Code: gtn.savetxt('gu-LG.txt', gtn.compose(gtn.loadtxt('gu-L.txt'), gtn.loadtxt('gu-G.txt')).arc_sort())

opened by ronitd 0

Owner

Facebook Research

GitHub

Code to reproduce the experiments in the paper "Transformer Based Multi-Source Domain Adaptation" (EMNLP 2020)

Transformer Based Multi-Source Domain Adaptation Dustin Wright and Isabelle Augenstein To appear in EMNLP 2020. Read the preprint: https://arxiv.org/a

36 Dec 5, 2022

Code to reproduce experiments in the paper "Explainability Requires Interactivity".

Explainability Requires Interactivity This repository contains the code to train all custom models used in the paper Explainability Requires Interacti

5 Apr 7, 2022

Code to reproduce the experiments from our NeurIPS 2021 paper " The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective"

Code To run: python runner.py new --save <SAVE_NAME> --data <PATH_TO_DATA_DIR> --dataset <DATASET> --model <model_name> [options] --n 1000 - train - t

5 Dec 12, 2022

Official codebase for "B-Pref: Benchmarking Preference-BasedReinforcement Learning" contains scripts to reproduce experiments.

B-Pref Official codebase for B-Pref: Benchmarking Preference-BasedReinforcement Learning contains scripts to reproduce experiments. Install conda env

48 Dec 20, 2022

This repo will contain code to reproduce and build upon understanding transfer learning

What is being transferred in transfer learning? This repo contains the code for the following paper: Behnam Neyshabur*, Hanie Sedghi*, Chiyuan Zhang*.

4 Jun 16, 2021

Code to reproduce the results for Compositional Attention: Disentangling Search and Retrieval.

Compositional-Attention This repository contains the official implementation for the paper Compositional Attention: Disentangling Search and Retrieval

17 Oct 23, 2021

Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning"

VANET Code reproduce for paper "Vehicle Re-identification with Viewpoint-aware Metric Learning" Introduction This is the implementation of article VAN

23 Dec 26, 2022

PyTorch Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

pytorch-fcn PyTorch implementation of Fully Convolutional Networks. Requirements pytorch >= 0.2.0 torchvision >= 0.1.8 fcn >= 6.1.5 Pillow scipy tqdm

1.6k Jan 7, 2023

Code to reproduce the results in the paper "Tensor Component Analysis for Interpreting the Latent Space of GANs".

Tensor Component Analysis for Interpreting the Latent Space of GANs [ paper | project page ] Code to reproduce the results in the paper "Tensor Compon

4 Jun 17, 2022

Code to reproduce the results for Statistically Robust Neural Network Classification, published in UAI 2021

1 Jun 2, 2022

Reproduce partial features of DeePMD-kit using PyTorch.

DeePMD-kit on PyTorch For better understand DeePMD-kit, we implement its partial features using PyTorch and expose interface consuing descriptors. Tec

8 Dec 17, 2022

The codes reproduce the figures and statistics in the paper, "Controlling for multiple covariates," by Mark Tygert.

The accompanying codes reproduce all figures and statistics presented in "Controlling for multiple covariates" by Mark Tygert. This repository also pr

1 Dec 2, 2021

In this repo we reproduce and extend results of Learning in High Dimension Always Amounts to Extrapolation by Balestriero et al. 2021

In this repo we reproduce and extend results of Learning in High Dimension Always Amounts to Extrapolation by Balestriero et al. 2021. Balestriero et

1 Jan 27, 2022

Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet

Reproduce ResNet-v2 using MXNet Requirements Install MXNet on a machine with CUDA GPU, and it's better also installed with cuDNN v5 Please fix the ran

531 Dec 4, 2022

The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

This repository contains the software implementation of most algorithms used or developed in my research. The LaTeX and Python code for generating the

3 Jan 3, 2023

Minimal diffusion models - Minimal code and simple experiments to play with Denoising Diffusion Probabilistic Models (DDPMs)

Minimal code and simple experiments to play with Denoising Diffusion Probabilist

16 Oct 6, 2022

Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax

Code to run experiments in SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression.

PyTorch code to run synthetic experiments.