SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

SpeechBrain

Last update: Jan 9, 2023

Overview

The SpeechBrain Toolkit

The goal of SpeechBrain is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing, and many others.

SpeechBrain is currently in beta.

Key features

SpeechBrain provides various useful tools to speed up and facilitate research on speech technologies:

Various pretrained models nicely integrated with HuggingFace in our official organization account. These models come with an interface to easily run inference, facilitating integration. If a HuggingFace model isn't available, we usually provide at least a Google Drive folder containing all the corresponding experimental results.
The Brain class, a fully-customizable tool for managing training and evaluation loops over data. The annoying details of training loops are handled for you while retaining complete flexibility to override any part of the process when needed.
A YAML-based hyperparameter specification language that describes all types of hyperparameters, from individual numbers (e.g. learning rate) to complete objects (e.g. custom models). This dramatically simplifies recipe code by distilling basic algorithmic components.
Multi-GPU training and inference with PyTorch Data-Parallel or Distributed Data-Parallel.
Mixed-precision for faster training.
A transparent and entirely customizable data input and output pipeline. SpeechBrain follows the PyTorch data loader and dataset style and enables users to customize the I/O pipelines (e.g., adding on-the-fly downsampling, BPE tokenization, sorting, thresholding, ...).
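
To make the Brain-class idea above more concrete, here is a toy, pure-Python sketch of the pattern: a fixed training loop whose individual steps can be overridden. `ToyBrain` is a hypothetical stand-in written for this illustration, not SpeechBrain's `Brain` class. The method names only echo the `compute_forward` and `compute_objectives` hooks; the simplified signatures, the single scalar weight, and the finite-difference update are all assumptions.

```python
class ToyBrain:
    """Hypothetical toy loop manager: the loop is fixed, the steps are overridable."""

    def __init__(self, weight=0.0, lr=0.05):
        self.weight = weight  # the only trainable parameter of this toy model
        self.lr = lr

    def compute_forward(self, batch):
        # Model: prediction = weight * input
        inputs, _ = batch
        return [self.weight * x for x in inputs]

    def compute_objectives(self, predictions, batch):
        # Mean squared error between predictions and targets
        _, targets = batch
        return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

    def _loss_at(self, weight, batch):
        # Evaluate the loss for a trial weight, then restore the current one
        saved, self.weight = self.weight, weight
        try:
            return self.compute_objectives(self.compute_forward(batch), batch)
        finally:
            self.weight = saved

    def fit_batch(self, batch, eps=1e-6):
        # Central finite difference stands in for automatic differentiation
        loss = self._loss_at(self.weight, batch)
        grad = (self._loss_at(self.weight + eps, batch)
                - self._loss_at(self.weight - eps, batch)) / (2 * eps)
        self.weight -= self.lr * grad
        return loss

    def fit(self, epochs, train_set):
        # The "annoying details": iterate over epochs and batches, track the loss
        history = []
        for _ in range(epochs):
            losses = [self.fit_batch(batch) for batch in train_set]
            history.append(sum(losses) / len(losses))
        return history


# One batch of (inputs, targets) generated by the rule target = 2 * input
train_set = [([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])]
brain = ToyBrain()
history = brain.fit(epochs=50, train_set=train_set)
print(f"learned weight: {brain.weight:.3f}")
```

Because the update in `fit_batch` only calls the two hooks, a subclass could swap in a different model or loss by overriding `compute_forward` or `compute_objectives` without touching the loop itself.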

Speech recognition

SpeechBrain supports state-of-the-art methods for end-to-end speech recognition:

State-of-the-art performance, or performance comparable with other existing toolkits, on several ASR benchmarks.
Easily customizable neural language models, including RNNLM and TransformerLM. We also provide a few pre-trained models to save you computation (more to come!). We support Hugging Face datasets to facilitate training on large text datasets.
Hybrid CTC/Attention end-to-end ASR:
- Many available encoders: CRDNN (VGG + {LSTM,GRU,LiGRU} + DNN), ResNet, SincNet, vanilla transformers, contextnet-based transformers or conformers. Thanks to the flexibility of SpeechBrain, any fully customized encoder can be connected to the CTC/attention decoder and trained in a few hours of work. The decoder is fully customizable as well: LSTM, GRU, LiGRU, transformer, or your own neural network!
- Optimised and fast beam search on both CPUs and GPUs.
Transducer end-to-end ASR with a custom Numba loss to accelerate training. Any encoder or decoder can be plugged into the transducer, ranging from VGG+RNN+DNN to conformers.
Pre-trained ASR models for transcribing an audio file or extracting features for a downstream task.

Feature extraction and augmentation

SpeechBrain provides efficient and GPU-friendly speech augmentation pipelines and acoustic feature extraction:

On-the-fly and fully-differentiable acoustic feature extraction: filter banks can be learned. This simplifies the training pipeline (you don't have to dump features on disk).
On-the-fly feature normalization (global, sentence, batch, or speaker level).
On-the-fly environmental corruptions based on noise, reverberation, and babble for robust model training.
On-the-fly frequency and time domain SpecAugment.
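
As a rough illustration of the masking idea behind SpecAugment, the sketch below zeroes one random span of time frames and one random band of frequency channels in a spectrogram stored as nested lists. The function name `mask_spectrogram`, its parameters, and the list-of-lists layout are assumptions made for this example; it is not SpeechBrain's augmentation code, which operates on tensors.

```python
import random


def mask_spectrogram(spec, max_time_width, max_freq_width, rng):
    """Return a masked copy of spec (frames x channels) plus the chosen spans."""
    n_frames, n_channels = len(spec), len(spec[0])

    # Draw the width and start of the time mask (width may be zero)
    t_width = rng.randint(0, min(max_time_width, n_frames))
    t_start = rng.randint(0, n_frames - t_width)

    # Draw the width and start of the frequency mask
    f_width = rng.randint(0, min(max_freq_width, n_channels))
    f_start = rng.randint(0, n_channels - f_width)

    # Build a new spectrogram; the input is left untouched
    masked = [
        [
            0.0
            if (t_start <= t < t_start + t_width or f_start <= f < f_start + f_width)
            else value
            for f, value in enumerate(frame)
        ]
        for t, frame in enumerate(spec)
    ]
    return masked, (t_start, t_width), (f_start, f_width)


rng = random.Random(0)  # seeded so the example is repeatable
spec = [[1.0] * 8 for _ in range(20)]  # 20 frames, 8 channels, all ones
masked, (t_start, t_width), (f_start, f_width) = mask_spectrogram(spec, 5, 2, rng)
zeros = sum(value == 0.0 for frame in masked for value in frame)
print(f"time mask: start={t_start}, width={t_width}; "
      f"freq mask: start={f_start}, width={f_width}; zeroed cells={zeros}")
```

Drawing a fresh mask for every batch is what makes this kind of augmentation "on-the-fly": nothing is written to disk, and the model sees a different corruption of the same utterance at each epoch.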

Speaker recognition, identification and diarization

SpeechBrain provides different models for speaker recognition, identification, and diarization on different datasets:

State-of-the-art performance on speaker recognition and diarization based on ECAPA-TDNN models.
Original Xvectors implementation (inspired by Kaldi) with PLDA.
Spectral clustering for speaker diarization (combined with speaker embeddings).
Libraries to extract speaker embeddings with a pre-trained model on your data.

Speech enhancement and separation

Recipes for spectral masking, spectral mapping, and time-domain speech enhancement.
Multiple sophisticated enhancement losses, including differentiable STOI loss, MetricGAN, and mimic loss.
State-of-the-art performance on speech separation with Conv-TasNet, DualPath RNN, and SepFormer.

Multi-microphone processing

Combining multiple microphones is a powerful approach to achieve robustness in adverse acoustic environments:

Delay-and-sum, MVDR, and GEV beamforming.
Speaker localization.
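
The simplest of these beamformers, delay-and-sum, can be sketched in a few lines: each channel is shifted by its arrival delay so that the source lines up across microphones, and the aligned channels are averaged. The function `delay_and_sum`, the integer sample delays, and the toy signals are assumptions made for this illustration; real implementations, including SpeechBrain's, estimate delays from the signals and typically work in the frequency domain.

```python
def delay_and_sum(channels, delays):
    """Average the channels after removing each channel's integer sample delay.

    channels: list of equal-rate sample lists, one per microphone.
    delays: delays[i] is how many samples channel i lags behind the source.
    """
    # Keep only the region where every shifted channel still has samples
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(n)
    ]


source = [0, 1, 2, 3, 4, 3, 2, 1, 0]
mic0 = source + [0, 0]   # the source reaches microphone 0 first
mic1 = [0, 0] + source   # microphone 1 hears it two samples later
output = delay_and_sum([mic0, mic1], delays=[0, 2])
print(output)
```

With the correct delays the source adds up coherently, whereas uncorrelated noise on the individual microphones would be attenuated by the averaging; that is where the robustness comes from.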

Performance

The recipes released with SpeechBrain implement speech processing systems with competitive or state-of-the-art performance. In the following, we report the best performance achieved on some popular benchmarks:

Dataset	Task	System	Performance
LibriSpeech	Speech Recognition	CNN + Transformer	WER=2.50% (test-clean)
TIMIT	Speech Recognition	CRDNN + distillation	PER=13.1% (test)
CommonVoice (French)	Speech Recognition	CRDNN	WER=17.7% (test)
VoxCeleb2	Speaker Verification	ECAPA-TDNN	EER=0.69% (vox1-test)
AMI	Speaker Diarization	ECAPA-TDNN	DER=2.13% (lapel-mix)
VoiceBank	Speech Enhancement	MetricGAN+	PESQ=3.08 (test)
WSJ2MIX	Speech Separation	SepFormer	SDRi=22.6 dB (test)
WSJ3MIX	Speech Separation	SepFormer	SDRi=20.0 dB (test)

For more details, take a look at the corresponding implementation in recipes/dataset/.

Documentation & Tutorials

SpeechBrain is designed to speed up research and development of speech technologies. Hence, our code is backed by three different levels of documentation:

Low-level: during the review of pull requests, we pay close attention to the level of commenting. Hence, any complex functionality or long pipeline is supported by helpful comments that enable users to easily customize the code.
Functional-level: all classes in SpeechBrain contain a detailed docstring that describes the input and output formats, the different arguments, the usage of the function, the potentially associated bibliography, and a function example that is used for integration testing during pull requests. Such examples can also be used to experiment with a class or a function to understand exactly what is happening.
Educational-level: we provide various Google Colab (i.e. interactive) tutorials describing all the building-blocks of SpeechBrain ranging from the core of the toolkit to a specific model designed for a particular task. The number of available tutorials is expected to increase over time.

Under development

We are currently working towards integrating DNN-HMM for speech recognition and machine translation.

Quick installation

SpeechBrain is constantly evolving. New features, tutorials, and documentation will appear over time. SpeechBrain can be installed via PyPI to rapidly use the standard library. Moreover, a local installation can be used by those users who want to run experiments and modify/customize the toolkit. SpeechBrain supports both CPU and GPU computations. For almost all the recipes, however, a GPU is necessary during training. Please note that CUDA must be properly installed to use GPUs.

Install via PyPI

Once you have created your Python environment (Python 3.8+) you can simply type:

pip install speechbrain

Then you can access SpeechBrain with:

import speechbrain as sb

Install with GitHub

Once you have created your Python environment (Python 3.8+) you can simply type:

git clone https://github.com/speechbrain/speechbrain.git
cd speechbrain
pip install -r requirements.txt
pip install --editable .

Then you can access SpeechBrain with:

import speechbrain as sb

Any modification made to the speechbrain package will be picked up automatically, as we installed it with the --editable flag.

Test Installation

Please run the following commands to make sure your installation is working:

pytest tests
pytest --doctest-modules speechbrain

Running an experiment

In SpeechBrain, you can run experiments in this way:

> cd recipes///
> python experiment.py params.yaml

The results will be saved in the output_folder specified in the yaml file. The folder is created by calling sb.core.create_experiment_directory() in experiment.py. Both detailed logs and experiment outputs are saved there. Furthermore, less verbose logs are output to stdout.

Learning SpeechBrain

Instead of a long and boring README, we prefer to provide you with different resources that can be used to learn how to customize SpeechBrain to adapt it to your needs:

General information can be found on the website.
We offer many tutorials; you can start with the basic ones about SpeechBrain's basic functionalities and building blocks. We also provide more advanced tutorials (e.g., SpeechBrain advanced, signal processing, ...). You can browse them via the Tutorials drop-down menu in the upper right of the SpeechBrain website.
Details on the SpeechBrain API, how to contribute, and the code are given in the documentation.

License

SpeechBrain is released under the Apache License, version 2.0. The Apache license is a popular BSD-like license. SpeechBrain can be redistributed for free, even for commercial purposes, although you cannot remove the license headers (and under some circumstances, you may have to distribute a license document). Apache is not a viral license like the GPL, which forces you to release your modifications to the source code. Also note that this project has no connection to the Apache Foundation, other than that we use the same license terms.

Comments

Add Transducer recipe
Hello @mravanelli , @TParcollet , @jjery2243542 ,

This is a work-in-progress transducer recipe. The following tasks are addressed:

[x] add transducer joint module

[x] REMOVED:add seq2seq bool in Brain class to handle the [x,y] input for the compute_forward function

[x] add embedding for the Prediction Network

[x] add greedy decoding

[x] Transducer minimal recipe

[x] add Transducer seq2seq recipe for TIMIT

[x] add comments to explain the greedy search over the transducer

[x] Add transducer recipe for Librispeech

[x] Find the good architecture with 14 % wer

enhancement refactor ready to review
opened by aheba 73
use sentencepiece lib from google
Add BPE tokenizer:

[x] add the BPE training

[x] use the BPE trained model for the token generation for Librispeech recipe

[x] Design the way of adding the BPE on the params (yaml file)

enhancement ready to review
opened by aheba 52
Switchboard Recipe
Hey everybody,

I made a recipe for the Switchboard corpus. The data preparation steps mostly follow Kaldi's s5c recipe.

The recipe includes the following models:

ASR

CTC: Wav2Vec2 Encoder + CTC Decoder (adapted from the Commonvoice recipes)

seq2seq: CRDNN encoder + GRU Decoder + Attention (adapted from the LibriSpeech recipe)

Note: Unlike the Librispeech recipe, this system does not include any LM. In fact, every LM I tried (pretrained, finetuned or trained from scratch) seemed to make the performance much worse

transformer: Transformer model + LM (adapted from the LibriSpeech recipe)

LM

There are two hparams files for finetuning existing LibriSpeech LMs on Switchboard and Fisher data, one for an RNNLM and the other for a Transformer LM

Tokenizer

Basic Sentencepiece Tokenizer training on Switchboard and Fisher data

Performance

The model performance is as follows:

| Model | Swbd WER | Callhome WER | Eval2000 WER |
|:---:|:---:|:---:|:---:|
| CTC | 21.35 | 28.32 | 24.91 |
| seq2seq | 25.37 | 36.87 | 29.33 |
| Transformer (LibriSpeech LM) | 22.00 | 30.12 | 26.14 |
| Transformer (Finetuned LM) | 21.11 | 29.43 | 25.36 |

As you can see, the performance is currently comparable to Kaldi's chain systems without i-vectors. However, the models need some refinement to be on par with the best Kaldi systems available (WER should be around 18 on the full eval2000 test set).

If you have any suggestions for improvements, I'd be happy to implement them.

I can also provide the trained models in case you are interested (I might need some help with this whole Huggingface thing though).

Best, Dominik

ps Thanks for all the great work you've done here! :)
enhancement
opened by dwgnr 50
handle the use of multigpu_{count,backend}

Hey @pplantinga , @mravanelli , here is a PR fixing issue #395. As discussed, multigpu_{count, backend} are not used in our ddp.py; currently, multigpu_{count, backend} are used in the hyperparams file only with data_parallel. This PR handles the use of multigpu_{count, backend} in DDP.py. If the user sets these params on the command line, the params in the yaml file are ignored.
help wanted work in progress ready to review

opened by aheba 50
add noise and reverberance version for BinauralWSJ0Mix

Hi there, I have created a noisy and reverberant version of the BinauralWSJ0Mix datasets and trained on it with the convtasnet-parallel structure. Here are the recipes, which do not conflict with the clean version of the datasets. Also, I have trained convtasnet-parallel.yaml again and got better results, which I could share with you via Google Drive. Thanks.

opened by huangzj421 43
Aishell1Mix

This branch adds a new task named Aishell1Mix to the recipes, which is similar to LibriMix but applied to the Mandarin AISHELL-1 dataset. Hope to receive your reply. Many thanks.
enhancement

opened by huangzj421 42
training on voxceleb1+2 is very slow?
Dear all: I noticed that when training on voxceleb1+2, a single epoch takes up to 25 hours, and even with DDP on 4 GPU cards the training speed does not improve at all. I guess the CPU is the bottleneck? Has anyone seen the same behavior? Thank you.

7%|████████▎ | 16569/241547 [1:45:07<25:09:56, 2.48it/s, train_loss=13
question
opened by dragen1860 35

Insertion problem when decoding with pre-trained ASR model.

Thanks for the clear example in the folder templates/speech_recognition/ASR/ for training an ASR model on the mini-librispeech dataset. However, when I used the LibriSpeech-pretrained models (ASR model, language model, and tokenizer) to decode some waveforms from the LibriSpeech test set, the decoding result repeats some of the words many times, causing severe insertion errors. Below are several examples:

1221-135766-0014, %WER 2436.36 [ 268 / 11, 268 ins, 0 del, 0 sub ]
PEARL ; SAW ; AND ; GAZED ; INTENTLY ; BUT ; NEVER ; SOUGHT ; TO ; MAKE ; ACQUAINTANCE ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> 
;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ;    <eps>     ; <eps> ; <eps> ; <eps> ; <eps>
  =   ;  =  ;  =  ;   =   ;    =     ;  =  ;   =   ;   =    ; =  ;  =   ;      =       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   
;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I   ;    I     ;   I   ;   I   ;   I    ;   I   ;   I   ;      I       ;   I   ;   I   ;   I   ;   I  
PEARL ; SAW ; AND ; GAZED ; INTENTLY ; BUT ; NEVER ; SOUGHT ; TO ; MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED 
; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED ; INTENTLY ;  BUT  ; NEVER ; SOUGHT ;   TO  ;  MAKE ; ACQUAINTANCE ; PEARL ;  SAW  ;  AND  ; GAZED

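For reference, the header of the example above can be reproduced with a small word-level edit-distance sketch, which also shows why WER can exceed 100% when the hypothesis is much longer than the reference. The function `count_errors` and the way the hypothesis is rebuilt here (the 11 reference words repeated and cut at 279 words, which is what 268 insertions on an 11-word reference imply) are assumptions made for this illustration; this is not the scoring code used by SpeechBrain.

```python
def count_errors(ref, hyp):
    """Return (insertions, deletions, substitutions) of a minimum-cost word alignment."""
    # Each cell holds (total errors, insertions, deletions, substitutions)
    # for aligning a prefix of ref with a prefix of hyp.
    prev = [(j, j, 0, 0) for j in range(len(hyp) + 1)]  # empty reference: all insertions
    for i in range(1, len(ref) + 1):
        cur = [(i, 0, i, 0)]  # empty hypothesis: all deletions
        for j in range(1, len(hyp) + 1):
            total, ins, dels, subs = prev[j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (total, ins, dels, subs)          # match, no cost
            else:
                diagonal = (total + 1, ins, dels, subs + 1)  # substitution
            total, ins, dels, subs = cur[j - 1]
            insertion = (total + 1, ins + 1, dels, subs)     # extra hypothesis word
            total, ins, dels, subs = prev[j]
            deletion = (total + 1, ins, dels + 1, subs)      # missing reference word
            # Tuples compare on the total first, so min() keeps a cheapest alignment
            cur.append(min(diagonal, insertion, deletion))
        prev = cur
    return prev[-1][1:]


ref = "PEARL SAW AND GAZED INTENTLY BUT NEVER SOUGHT TO MAKE ACQUAINTANCE".split()
hyp = (ref * 26)[:279]  # the reference repeated until the output holds 279 words

ins, dels, subs = count_errors(ref, hyp)
errors = ins + dels + subs
wer = 100.0 * errors / len(ref)  # errors are normalized by the reference length only
print(f"%WER {wer:.2f} [ {errors} / {len(ref)}, {ins} ins, {dels} del, {subs} sub ]")
```

Since the denominator is the number of reference words, every repeated word in the hypothesis counts as a full insertion error, so 268 insertions against 11 reference words give 268 / 11, or about 2436%.
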
121-123859-0001, %WER 869.81 [ 461 / 53, 454 ins, 0 del, 7 sub ]
O  ; TIS ; THE ; FIRST  ; TIS ; FLATTERY ; IN ; MY ; SEEING ; AND ; MY ; GREAT ; MIND ; MOST ; KINGLY ; DRINKS ; IT ; UP ; MINE ; EYE ; WELL ; KNOWS ; WHAT ; WITH ; HIS ; GUST ; IS ; GREEING ; AND ; TO ; HIS ; PALATE ; DOTH ; PREPARE ; THE ; CUP ; IF ; IT ; BE ; POISON'D ; TIS ; THE ; LESSER ; SIN ; THAT ; MINE ; EYE ; LOVES ; IT ; AND ; DOTH ; <eps>  ; <eps>  ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; FIRST ; BEGIN ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; 
<eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>  ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ; <eps>
S  ;  =  ;  =  ;   S    ;  =  ;    =     ; =  ; =  ;   =    ;  =  ; =  ;   =   ;  =   ;  =   ;   S    ;   =    ; =  ; =  ;  =   ;  =  ;  =   ;   =   ;  =   ;  =   ;  =  ;  =   ; =  ;    S    ;  =  ; =  ;  =  ;   =    ;  =   ;    =    ;  =  ;  =  ; =  ; =  ; =  ;    S     ;  =  ;  =  ;   =    ;  =  ;  =   ;  =   ;  S  ;   S   ; =  ;  =  ;  =   ;   I    ;   I    ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   =   ;   =   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I    ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   
I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I    ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I   ;   I  
OH ; TIS ; THE ; THIRST ; TIS ; FLATTERY ; IN ; MY ; SEEING ; AND ; MY ; GREAT ; MIND ; MOST ; KEENLY ; DRINKS ; IT ; UP ; MINE ; EYE ; WELL ; KNOWS ; WHAT ; WITH ; HIS ; GUST ; IS ;  GREEN  ; AND ; TO ; HIS ; PALATE ; DOTH ; PREPARE ; THE ; CUP ; IF ; IT ; BE ; POISONED ; TIS ; THE ; LESSER ; SIN ; THAT ; MINE ;  I  ;  LOVE ; IT ; AND ; DOTH ; THIRST ; BEGINS ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGINS ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  
DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGINS ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE ;   IT  ;  AND  ;  DOTH ; FIRST ; BEGIN ;  THAT ;  MINE ;   I   ;  LOVE

1284-134647-0001, %WER 707.41 [ 191 / 27, 191 ins, 0 del, 0 sub ]
THE ; EDICT ; OF ; MILAN ; THE ; GREAT ; CHARTER ; OF ; TOLERATION ; HAD ; CONFIRMED ; TO ; EACH ; INDIVIDUAL ; OF ; THE ; ROMAN ; WORLD ; THE ; PRIVILEGE ; OF ; CHOOSING ; AND ; PROFESSING ; HIS ; OWN ; RELIGION ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps> ; <eps> ;   <eps>   ; <eps> ; <eps> ;   <eps>    ; <eps> ; <eps> ; <eps> ; <eps> ; <eps> ;   <eps>   ; <eps> ;  <eps>   ; <eps> ;   <eps>    ; <eps> ; <eps> ;  <eps>   ; <eps>
 =  ;   =   ; =  ;   =   ;  =  ;   =   ;    =    ; =  ;     =      ;  =  ;     =     ; =  ;  =   ;     =      ; =  ;  =  ;   =   ;   =   ;  =  ;     =     ; =  ;    =     ;  =  ;     =      ;  =  ;  =  ;    =     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I   ;   I   ;     I     ;   I   ;   I   ;     I      ;   I   ;   I   ;   I   ;   I   ;   I   ;     I     ;   I   ;    I     ;   I   ;     I      ;   I   ;   I   ;    I     ;   I  
THE ; EDICT ; OF ; MILAN ; THE ; GREAT ; CHARTER ; OF ; TOLERATION ; HAD ; CONFIRMED ; TO ; EACH ; INDIVIDUAL ; OF ; THE ; ROMAN ; WORLD ; THE ; PRIVILEGE ; OF ; CHOOSING ; AND ; PROFESSING ; HIS ; OWN ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;   HE  ;  HAD  ; CONFIRMED ;   TO  ;  EACH ; INDIVIDUAL ;   OF  ;  THE  ; ROMAN ; WORLD ;  THE  ; PRIVILEGE ;   OF  ; CHOOSING ;  AND  ; PROFESSING ;  HIS  ;  OWN  ; RELIGION ;  THE
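
The report line for utterance 1284-134647-0001 above shows why a WER can exceed 100%: the error count includes insertions, which are not bounded by the reference length. A minimal sketch of that arithmetic follows, using only the counts printed in the report; the function name `word_error_rate` is a hypothetical helper, not part of SpeechBrain.

```python
def word_error_rate(insertions: int, deletions: int, substitutions: int, ref_words: int) -> float:
    """Return WER in percent: total edit operations divided by reference length."""
    if ref_words <= 0:
        raise ValueError("reference must contain at least one word")
    return 100.0 * (insertions + deletions + substitutions) / ref_words


# Counts copied from the report line: [ 191 / 27, 191 ins, 0 del, 0 sub ]
wer = word_error_rate(insertions=191, deletions=0, substitutions=0, ref_words=27)
print(f"{wer:.2f}")  # 707.41
```

Here all 191 errors are insertions appended after a correctly recognized 27-word reference, which matches the looping hypothesis shown in the alignment.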

The dataset I tested on is part of the LibriSpeech test-clean dataset (reader IDs beginning with 1, 2, and 3; 1074 files in total), and the average WER on this subset is 20.3%. Below are the hparams I used for searching:

test_search: !new:speechbrain.decoders.S2SRNNBeamSearchLM
    embedding: !ref <embedding>
    decoder: !ref <decoder>
    linear: !ref <seq_lin>
    ctc_linear: !ref <ctc_lin>
    language_model: !ref <lm_model>
    bos_index: 0
    eos_index: 0
    blank_index: 0
    min_decode_ratio: 0.0
    max_decode_ratio: 1.0
    beam_size: 80
    eos_threshold: 1.5
    using_max_attn_shift: true
    max_attn_shift: 240
    coverage_penalty: 1.5
    lm_weight: 0.5
    ctc_weight: 0.0
    temperature: 1.25
    temperature_lm: 1.25

I also found that if I change the testing batch_size from 8 to 1, the WER can be reduced from 20.3% to 2.8%, which I believe should be the normal result. I am thus wondering whether the padding might be the main reason for this problem.
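
To make the padding hypothesis concrete, here is a small sketch of what batching does to variable-length inputs. It assumes the common convention of passing each item's length as a fraction of the longest item in the batch; `pad_batch` is a hypothetical helper written in plain Python, not the toolkit's batching code.

```python
from typing import List, Tuple


def pad_batch(seqs: List[List[float]], pad_value: float = 0.0) -> Tuple[List[List[float]], List[float]]:
    """Right-pad every sequence to the longest one and return relative lengths."""
    max_len = max(len(s) for s in seqs)
    padded = [s + [pad_value] * (max_len - len(s)) for s in seqs]
    rel_lens = [len(s) / max_len for s in seqs]
    return padded, rel_lens


# With several items per batch, shorter items carry padded frames.
batch = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6], [0.7]]
padded, rel_lens = pad_batch(batch)
print(rel_lens)  # [1.0, 0.5, 0.25]

# With batch_size 1 there is nothing to pad, so the relative length is always 1.0.
single, single_lens = pad_batch([[0.5, 0.6]])
print(single_lens)  # [1.0]
```

If a decoder does not mask the padded frames using these lengths, shorter utterances in a batch can attend to padding, which would not happen with a batch size of 1.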

opened by Kuray107 31

LM decoder and training for TIMIT
Modifications:

Add length normalization for beam search.

Rename length penalty to length rewarding (beam search).

Integrate LM in the decoder.

Add recipe for LM and ASR with LM decoding.

work in progress ready to review
opened by jjery2243542 31
Can't train a model with multi NVIDIA RTX 3090 GPUs.

OS: Ubuntu 20.04
Python: I tested both 3.7 and 3.8
SpeechBrain: I tested 0.5.8 and 0.5.9
PyTorch: 1.7.0 for SpeechBrain 0.5.8 and 1.9.0 for SpeechBrain 0.5.9, both compiled on CUDA 11.1
Recipe: speechbrain/recipes/LibriSpeech/ASR/transformer

command: python train.py hparams/transformer.yaml --data_folder xxx --data_parallel_backend

I have eight RTX 3090 GPUs on my server, but when I watched nvidia-smi, only one process was running on one GPU; the other seven GPUs were idle. How can I fix this problem? Thank you.
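
When debugging a report like this, it can help to first confirm how many GPUs the Python process itself can see. The snippet below is a diagnostic sketch only and does not change how training is launched; `describe_visible_gpus` is a hypothetical helper, and the PyTorch import is guarded so the snippet also runs where PyTorch is not installed.

```python
import importlib.util
import os


def describe_visible_gpus() -> dict:
    """Report what this Python process can see; a diagnostic, not a fix."""
    info = {
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
        "device_count": 0,
    }
    if info["torch_installed"]:
        import torch

        info["cuda_available"] = torch.cuda.is_available()
        info["device_count"] = torch.cuda.device_count()
    return info


print(describe_visible_gpus())
```

If `device_count` is lower than the number of cards shown by nvidia-smi, the restriction comes from the environment (for example `CUDA_VISIBLE_DEVICES`) rather than from the recipe.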

opened by Xinghui-Wu 28
MultiGPU + Librispeech
Adding Multi-GPU training to the Librispeech recipe.

Change the logging to info on the libri preparation. Without that, the user has NO feedback on what is happening, and it's actually weird.

Add multi GPU with data parallel to experiment.py

Add a multigpu param to the yaml file

To do:
[x] Test the recipe on 1-2 GPU
[x] Test that the checkpointing doesn't break due to DataParallel when going from one to two and two to one
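
One reason checkpoints can break when switching between one and two GPUs is that `torch.nn.DataParallel` keeps the wrapped model under `.module`, so parameter names in its `state_dict` gain a `module.` prefix. The helpers below sketch how such key names could be normalized, assuming that is the mismatch being tested; they are hypothetical, operate on plain dictionaries, and are not the code in this pull request.

```python
from typing import Any, Dict

PREFIX = "module."  # prefix added to parameter names by torch.nn.DataParallel


def strip_data_parallel_prefix(state: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy whose keys match an unwrapped (single-GPU) model."""
    return {k[len(PREFIX):] if k.startswith(PREFIX) else k: v for k, v in state.items()}


def add_data_parallel_prefix(state: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy whose keys match a DataParallel-wrapped model."""
    return {k if k.startswith(PREFIX) else PREFIX + k: v for k, v in state.items()}


# Stand-in for a state_dict saved from a two-GPU run (values would be tensors).
saved = {"module.linear.weight": 1, "module.linear.bias": 2}
single_gpu = strip_data_parallel_prefix(saved)
print(sorted(single_gpu))  # ['linear.bias', 'linear.weight']
```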
enhancement ready to review
opened by TParcollet 27
[Bug]: Speaker Classification Inference KeyError

Issue

This is similar to issue #1049; the only difference is that #1049 was about language identification.

I trained a model using some audio from CommonVoice. Training completed, and I am now running inference.

This is my Inference code:

from speechbrain.pretrained import EncoderClassifier
import os
import sys
import torch
import torchaudio

classifier = EncoderClassifier.from_hparams(source="./content/best_model/", hparams_file='hparams_inference.yaml', savedir="./content/best_model/")

# Classification
audio_file = 'data/common_voice_de_27022043.wav'
signal, fs = torchaudio.load(audio_file)  # test_speaker: 5789
output_probs, score, index, text_lab = classifier.classify_batch(signal)
print('Target: 000fc181c938978e23ec7c066dddc246ca2b3160b50e3bfee829c02f5db753b3b8b955ad0f1b3effc954e5b10b474e6e93386f2b7925e7195abd84a164477851, Predicted: ' + text_lab[0])

# Speaker 2
audio_file = 'data/common_voice_de_18351596.wav'
signal, fs = torchaudio.load(audio_file)  # test_speaker: 460
output_probs, score, index, text_lab = classifier.classify_batch(signal)
print('Target: 0e156ff8b3bdd99355fd7f99e2259c47bb78e7dcac346a9966181b9e5e265960ddccc5f73b036948d3586a03e8b482b01d908da93026737ac60a9996d88d6881, Predicted: ' + text_lab[0])

I get the error KeyError: 0

I have added the pretrainer to the YAML. This is what that part looks like:

modules:
    compute_features: !ref <compute_features>
    embedding_model: !ref <embedding_model>
    classifier: !ref <classifier>
    mean_var_norm: !ref <mean_var_norm>

pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    loadables:
        embedding_model: !ref <embedding_model>
        classifier: !ref <classifier>
        label_encoder: !ref <label_encoder>
    paths:
        embedding_model: !ref <pretrained_path>/embedding_model.ckpt
        classifier: !ref <pretrained_path>/classifier.ckpt
        label_encoder: !ref <pretrained_path>/label_encoder.txt

I have attached my label encoder in the comments.

Could you tell me what I am missing here?
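
One way a `KeyError: 0` can arise in an index-to-label lookup is a mapping that was never populated before decoding. The sketch below illustrates only that general failure mode; it is an assumption, not a diagnosis of this issue, and `TinyLabelEncoder` is a hypothetical stand-in rather than SpeechBrain's encoder class.

```python
class TinyLabelEncoder:
    """Hypothetical stand-in for an index-to-label encoder (not SpeechBrain's class)."""

    def __init__(self):
        self.ind2lab = {}  # stays empty until labels are loaded

    def load(self, labels):
        """Populate the mapping from integer index to label string."""
        self.ind2lab = dict(enumerate(labels))

    def decode(self, index: int) -> str:
        return self.ind2lab[index]


encoder = TinyLabelEncoder()
try:
    encoder.decode(0)  # mapping is empty, so index 0 is missing
except KeyError as err:
    print("KeyError:", err)  # KeyError: 0

encoder.load(["speaker_a", "speaker_b"])
print(encoder.decode(0))  # speaker_a
```

Under this assumption, checking that the label file was actually found and loaded at inference time would be a reasonable first step.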

Expected behaviour

Successful Inference

To Reproduce

Complete YAML file:

# pretrain folders:
pretrained_path: /content/best_model/

# Model parameters
n_mels: 23
sample_rate: 16000
n_classes: 29 # In this case, we have 28 speakers
emb_dim: 512 # dimensionality of the embeddings

# Feature extraction
compute_features: !new:speechbrain.lobes.features.Fbank
    n_mels: !ref <n_mels>

# Mean and std normalization of the input features
mean_var_norm: !new:speechbrain.processing.features.InputNormalization
    norm_type: sentence
    std_norm: False

embedding_model: !new:custom_model.Xvector
    in_channels: !ref <n_mels>
    activation: !name:torch.nn.LeakyReLU
    tdnn_blocks: 5
    tdnn_channels: [512, 512, 512, 512, 1500]
    tdnn_kernel_sizes: [5, 3, 3, 1, 1]
    tdnn_dilations: [1, 2, 3, 1, 1]
    lin_neurons: !ref <emb_dim>

classifier: !new:custom_model.Classifier
    input_shape: [null, null, !ref <emb_dim>]
    activation: !name:torch.nn.LeakyReLU
    lin_blocks: 1
    lin_neurons: !ref <emb_dim>
    out_neurons: !ref <n_classes>

label_encoder: !new:speechbrain.dataio.encoder.CategoricalEncoder

modules:
    compute_features: !ref <compute_features>
    embedding_model: !ref <embedding_model>
    classifier: !ref <classifier>
    mean_var_norm: !ref <mean_var_norm>

pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    loadables:
        embedding_model: !ref <embedding_model>
        classifier: !ref <classifier>
        label_encoder: !ref <label_encoder>
    paths:
        embedding_model: !ref <pretrained_path>/embedding_model.ckpt
        classifier: !ref <pretrained_path>/classifier.ckpt
        label_encoder: !ref <pretrained_path>/label_encoder.txt

Versions

No response

Relevant log output

No response

Additional context

No response
bug

opened by praveenmathew93 1

[Bug]: M1 GPU (mps) support

Describe the bug

It looks like the Speechbrain library does not support the M1 GPU (mps backend). The error is raised when trying to use the MPS backend on a pre-trained model (at least, this is the case I found, I don't know if it happens also in other situations, but I guess it does) and in particular the error is:

{ValueError}invalid type: 'torch.mps.FloatTensor'

The error is caused by this line in dual_path.py (file in the Speechbrain library, line 1066):

        if gap > 0:
            pad = torch.Tensor(torch.zeros(B, N, gap)).type(input.type())

And it is caused by the fact that input.type() returns torch.mps.FloatTensor but such value is not a valid Tensor type.

This problem has already been reported in PyTorch (here: https://github.com/pytorch/pytorch/issues/82296), and it looks like a fix is on its way.

However, it looks like Speechbrain will need to upgrade its PyTorch dependency (from the PyTorch discussion it looks like they're gonna include the fix in Torch 2.0) or find a workaround with the datatype in the meanwhile 🤔
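
A datatype workaround of the kind mentioned above can be sketched by creating the padding directly with the input's `dtype` and `device` instead of converting through `input.type()`. This is an assumption about one possible fix, not the change SpeechBrain shipped; `pad_gap` is a hypothetical helper that reproduces only the gap-padding step quoted in this issue.

```python
import torch


def pad_gap(x: torch.Tensor, K: int):
    """Pad the last dimension as in the quoted snippet, without Tensor.type().

    Hypothetical helper: mirrors only the `gap` step shown in the issue.
    """
    B, N, L = x.shape
    P = K // 2
    gap = K - (P + L % K) % K
    if gap > 0:
        # torch.zeros accepts dtype and device directly, so no type string is needed.
        pad = torch.zeros(B, N, gap, dtype=x.dtype, device=x.device)
        x = torch.cat([x, pad], dim=2)
    return x, gap


# CPU demo; on an M1 machine the same call would receive a tensor on "mps".
mix = torch.zeros(2, 3, 10)
padded, gap = pad_gap(mix, K=4)
print(padded.shape, gap)
```

Because the padding inherits the device from the input, the same code path works for CPU, CUDA, and MPS tensors without naming a backend-specific tensor type.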

Expected behaviour

Being able to use the MPS backend on an M1 Mac to run Speechbrain models

To Reproduce

from speechbrain.pretrained.interfaces import SepformerSeparation
import torchaudio
import torch

separator = SepformerSeparation.from_hparams(source="speechbrain/sepformer-wsj02mix", savedir="./pretrained-sepformer-wsj02mix",  run_opts={"device": "mps"})

s1, fs = torchaudio.load('./my_file.wav') # Just insert here any wav file you want
resampler = torchaudio.transforms.Resample(fs, 8000)

s1 = resampler(s1)

est_sources = separator.separate_batch(s1)

Versions

0.5.13

Relevant log output

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[49], line 1
----> 1 est_sources = separator.separate_batch(s1)

File ~/Desktop/personal_git/voice-assistant/.venv/lib/python3.10/site-packages/speechbrain/pretrained/interfaces.py:1976, in SepformerSeparation.separate_batch(self, mix)
   1974 mix = mix.to(self.device)
   1975 mix_w = self.mods.encoder(mix)
-> 1976 est_mask = self.mods.masknet(mix_w)
   1977 mix_w = torch.stack([mix_w] * self.hparams.num_spks)
   1978 sep_h = mix_w * est_mask

File ~/Desktop/personal_git/voice-assistant/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/Desktop/personal_git/voice-assistant/.venv/lib/python3.10/site-packages/speechbrain/lobes/models/dual_path.py:1017, in Dual_Path_Model.forward(self, x)
   1012     x = self.pos_enc(x.transpose(1, -1)).transpose(1, -1) + x * (
   1013         x.size(1) ** 0.5
   1014     )
   1016 # [B, N, K, S]
-> 1017 x, gap = self._Segmentation(x, self.K)
   1019 # [B, N, K, S]
   1020 for i in range(self.num_layers):

File ~/Desktop/personal_git/voice-assistant/.venv/lib/python3.10/site-packages/speechbrain/lobes/models/dual_path.py:1097, in Dual_Path_Model._Segmentation(self, input, K)
   1095 B, N, L = input.shape
   1096 P = K // 2
-> 1097 input, gap = self._padding(input, K)
   1098 # [B, N, K, S]
   1099 input1 = input[:, :, :-P].contiguous().view(B, N, -1, K)

File ~/Desktop/personal_git/voice-assistant/.venv/lib/python3.10/site-packages/speechbrain/lobes/models/dual_path.py:1067, in Dual_Path_Model._padding(self, input, K)
   1065 gap = K - (P + L % K) % K
   1066 if gap > 0:
-> 1067     pad = torch.Tensor(torch.zeros(B, N, gap)).type(input.type())
   1068     input = torch.cat([input, pad], dim=2)
   1070 _pad = torch.Tensor(torch.zeros(B, N, P)).type(input.type())

ValueError: invalid type: 'torch.mps.FloatTensor'

Additional context

No response

bug

opened by mattiasu96 0

[Bug]: SLURP/direct malformed node or string

Describe the bug

While testing recipes, one of the SLURP recipes ran into an error.

ValueError: malformed node or string: <ast.Name object at 0x7ff2cbf7c4f0>

Expected behaviour

No error

To Reproduce

No response

Versions

No response

Relevant log output

File "speechbrain/recipes/SLURP/direct/train.py", line 365, in <module>
    slu_brain.evaluate(test_set, test_loader_kwargs=hparams["dataloader_opts"])
...
  File "speechbrain/recipes/SLURP/direct/train.py", line 128, in compute_objectives
    _dict = ast.literal_eval(
...
  File "python3.9/ast.py", line 66, in _raise_malformed_node
    raise ValueError(f'malformed node or string: {node!r}')
ValueError: malformed node or string: <ast.Name object at 0x7ff2cbf7c4f0>
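
For context, `ast.literal_eval` accepts only Python literals, so a string containing a bare name raises exactly this kind of `ValueError`. The sketch below reproduces the error class with made-up strings; the actual string the recipe parsed is not shown in the log, so the inputs here are assumptions.

```python
import ast

# Both strings are invented for illustration; only the second contains a bare name.
good = "{'scenario': 'music', 'action': 'play'}"
bad = "{'scenario': music, 'action': 'play'}"  # music is an identifier, not a literal

print(ast.literal_eval(good)["scenario"])  # music

error_message = None
try:
    ast.literal_eval(bad)
except ValueError as err:
    # The message names the offending AST node, here an ast.Name.
    error_message = str(err)
    print(error_message)
```

A model that emits a dictionary-like string with an unquoted token would trigger the same failure during evaluation.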

Additional context

No response

bug

opened by anautsch 0

[Bug]: LJSpeech & LibriTTS - audio_pipeline error

Describe the bug

While testing recipes, _get_spec_norms threw an error. Since I last ran the tests, part of this may already have been fixed in #1740 for LJSpeech, but the issue may still be present for LibriTTS.

Expected behaviour

No error

To Reproduce

No response

Versions

No response

Relevant log output

File "speechbrain/recipes/LibriTTS/vocoder/hifigan/train.py", line 330, in audio_pipeline
    mel = hparams["mel_spectogram"](audio=audio.squeeze(0))
...
  File "torchaudio/transforms/_transforms.py", line 108, in forward
    return F.spectrogram(
  File "torchaudio/functional/functional.py", line 114, in spectrogram
    frame_length_norm, window_norm = _get_spec_norms(normalized)
  File "torchaudio/functional/functional.py", line 239, in _get_spec_norms
    raise TypeError("Input type not supported")
TypeError: Input type not supported

Additional context

No response

bug

opened by anautsch 0

[Bug]: CommonVoice/self-supervised-learning/wav2vec2 - not implemented for NumPy arrays
Describe the bug

While testing recipes, an error occurred.

TypeError: Concatenation operation is not implemented for NumPy arrays, use np.concatenate() instead. Please do not rely on this error; it may not be given on all Python implementations.

Expected behaviour

Either a clearer restriction of when this recipe can be used, or an adjustment of dependencies (so there is no error).

To Reproduce

No response

Versions

No response

Relevant log output

File "speechbrain/recipes/CommonVoice/self-supervised-learning/wav2vec2/train_hf_wav2vec2.py", line 111, in fit_batch
    predictions = self.compute_forward(batch, sb.Stage.TRAIN)
...
  File "transformers/models/wav2vec2/modeling_wav2vec2.py", line 285, in _sample_negative_indices
    sampled_negative_indices[batch_idx] += batch_idx * sequence_length
TypeError: Concatenation operation is not implemented for NumPy arrays, use np.concatenate() instead. Please do not rely on this error; it may not be given on all Python implementations.

Additional context

No response
bug
opened by anautsch 0
[Bug]: Voicebank/*/*MetricGAN* torch.multinomial error
Describe the bug

While testing recipes, an error was thrown for all MetricGAN recipes. This report points to one log only, but the error is the same for all three MetricGAN recipes.

RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

Expected behaviour

No error

To Reproduce

No response

Versions

No response

Relevant log output

File "speechbrain/recipes/Voicebank/enhance/MetricGAN/train.py", line 371, in train_discriminator
    self.fit(
...
  File "torch/utils/data/sampler.py", line 203, in __iter__
    rand_tensor = torch.multinomial(self.weights, self.num_samples, self.replacement, generator=self.generator)
RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement
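
The error in the traceback above comes from `torch.multinomial` being asked for more samples than there are weights while sampling without replacement. The sketch below reproduces that condition with made-up sizes (3 weights, 5 samples) and shows two ways to satisfy it; it is an illustration of the constraint, not a proposed patch to the recipe.

```python
import torch

weights = torch.ones(3)  # three candidate items, equal weight
num_samples = 5          # more samples than items

message = None
try:
    torch.multinomial(weights, num_samples, replacement=False)
except RuntimeError as err:
    message = str(err)
    print(message)

# Option 1: allow replacement, so items may repeat.
with_replacement = torch.multinomial(weights, num_samples, replacement=True)
print(with_replacement.shape)  # torch.Size([5])

# Option 2: keep sampling without replacement but cap the sample count.
capped = torch.multinomial(weights, min(num_samples, weights.numel()), replacement=False)
print(sorted(capped.tolist()))  # [0, 1, 2]
```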

Additional context

No response
bug
opened by anautsch 0

Releases(v0.5.13)

v0.5.13(Aug 29, 2022)
This is a minor release with better dependency version specification. We note that SpeechBrain is compatible with PyTorch 1.12, and the updated package reflects this. See the issue linked next to each commit for more details about the corresponding changes.

Commit summary

[edb7714]: Adding no_sync and on_fit_batch_end method to core (Rudolf Arseni Braun) #1449

[07155e9]: G2P fixes (flexthink) #1473

[6602dab]: fix for #1469, minimal testing for profiling (anautsch) #1476

[abbfab9]: test clean-ups: passes linters; doctests; unit & integration tests; load-yaml on cpu (anautsch) #1487

[1a16b41]: fix ddp incorrect command (=) #1498

[0b0ec9d]: using no_sync() in fit_batch() of core.py (Rudolf Arseni Braun) #1449

[5c9b833]: Remove torch maximum compatible version (Peter Plantinga) #1504

[d0f4352]: remove limit for HF hub as it does not work with colab (Titouan) #1508

[b78f6f8]: Add revision to hub (Titouan) #1510

[2c491a4]: fix transducer loss inputs devices (Adel Moumen) #1511

[4972f76]: missing space in install command (pehonnet) #1512

[6bc72af]: Fixing shuffle argument for distributed sampler in core.py (Rudolf Arseni Braun) #1518

[df7acd9]: Added the link for example results (cem) #1523

[5bae6df]: add LinearWarmupScheduler (Ge Li) #1537

[2edd7ee]: updating scipy version in requirements.txt. (Nauman Dawalatabad) #1546

v0.5.12(Jun 26, 2022)
Release Notes - SpeechBrain v0.5.12

We worked very hard and we are very happy to announce the new version of SpeechBrain!

SpeechBrain 0.5.12 significantly expands the toolkit without introducing any major interface changes. I would like to warmly thank the many contributors that made this possible.

The main changes are the following:

A) Text-to-Speech: We developed the first TTS system of SpeechBrain. You can find it here. The system relies on Tacotron2 + HiFiGAN (as vocoder). The models coupled with an easy-inference interface are available on HuggingFace.

B) Grapheme-to-Phoneme (G2P): We developed an advanced Grapheme-to-Phoneme. You can find the code here. The current version significantly outperforms our previous model.

C) Speech Separation:

We developed a novel version of the SepFormer called Resource-Efficient SepFormer (RE-Sepformer). The code is available here and the pre-trained model (with an easy inference interface) here.

We released a recipe for Binaural speech separation with WSJMix. See the code here.

We released a new recipe with the AIShell mix dataset. You can see the code here.

D) Speech Enhancement:

We released the SepFormer model for speech enhancement. The code is here, while the pre-trained model (with easy-inference interface) is here.

We implemented the WideResNet for speech enhancement and used it for mimic-loss-based speech enhancement. The code is here and the pretrained model (with easy-inference interface) is here.

E) Feature Front-ends:

We now support LEAF filter banks. The code is here. You can find an example of a recipe using it here.

We now support SincConv multichannel (see code here).

F) Recipe Refactors:

We refactored the VoxCeleb recipe and fixed the normalization issues. See the new code here. We also made the EER computation method less memory-demanding (see here).

We refactored the IEMOCAP recipe for emotion recognition. See the new code here.

G) Models for African Languages: We now have recipes for the DVoice dataset. We currently support Darija, Swahili, Wolof, Fongbe, and Amharic. The code is available here. The pretrained model (coupled with an easy-inference interface) can be found on SpeechBrain-HuggingFace.

H) Profiler: We implemented a model profiler that helps users while developing new models with SpeechBrain. The profiler outputs a bunch of potentially useful information, such as the real-time factors and many other details. A tutorial is available here.

I) Tests: We significantly improved the tests. In particular, we introduced the following tests: HF_repo tests, docstring checks, yaml-script consistency, recipe tests, and URL checks. This will help us scale up the project.

L) Other improvements:

We now support the torchaudio RNNT loss.

We improved the relative attention mechanism of the Conformer.

We updated the transformer for LibriSpeech. This improves the WER on test-clean from 2.46% to 2.26%. See the code here.

The Environmental corruption module can now support different sampling rates.

Minor fixes.

v0.5.11(Dec 20, 2021)
Dear users, we worked very hard, and we are very happy to announce the new version of SpeechBrain. SpeechBrain 0.5.11 further expands the toolkit without introducing any major interface change.

The main changes are the following:

We implemented new recipes, such as:

VoxLingua 107 for language identification.

Sepformer for speech enhancement

MetricGAN-U for speech enhancement

SLURP with wav2vec for spoken language understanding.

REALM for speech separation with real data.

Korean Speech Recognition with KsponSpeech.

CommonVoice for German.

IEMOCAP for language emotion recognition using wav2vec.

Support for Dynamic batching with a Tutorial to help users familiarize themselves with it.

Support for wav2vec training within SpeechBrain.

Developed an interface with Orion for hyperparameter tuning with a Tutorial to help users familiarize themselves with it.

The torchaudio transducer loss is now supported. We also kept our numba implementation to help users customize the transducer loss part if needed.

Improved CTC-Segmentation

Fixed minor bugs and issues (e.g., fixed the MVDR beamformer).

Let me thank all the amazing contributors for this achievement. Please add a star to our project if you appreciate our effort for the community. Together, we are growing very fast, and we have big plans for the future.

Stay Tuned!
0.5.10(Sep 11, 2021)
This version mainly expands the functionalities of SpeechBrain without adding any backward incompatibilities.

New Recipes:

Language Identification with CommonLanguage

EEG signal processing with ERPCore

Speech translation with Fisher-Call Home

Emotion Recognition with IEMOCAP

Voice Activity Detection with LibriParty

ASR with LibriSpeech wav2vec (WER=1.9 on test-clean)

SpeechEnhancement with CoopNet

SpeechEnhancement with SEGAN

Speech Separation with LibriMix, WHAM, and WHAMR

Support for guided attention

Spoken Language Understanding with SLURP

Beyond that, we fixed some minor bugs and issues.
v0.5.9(Jun 17, 2021)
The main differences from the previous version are the following:

Added Wham/whamr/librimix for speech separation

Compatibility with PyTorch 1.9

Fixed minor bugs

Added SpeechBrain paper

v0.5.8(Jun 6, 2021)
SpeechBrain 0.5.8 improves on the previous version in the following ways:

Added wav2vec support in TIMIT, CommonVoice, AISHELL-1

Improved Fluent Speech Command Recipe

Improved SLU recipes

Recipe for UrbanSound8k

Fix small bugs

Fix typos

0.5.7(Apr 29, 2021)
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains. The current version (v0.5.7) supports:

E2E Speech Recognition

Speaker Recognition (Identification and Verification)

Spoken Language Understanding (e.g., Intent recognition)

Speaker Diarization

Speech Enhancement

Speech Separation

Multi-microphone signal processing (beamforming, localization)

Many other tasks will be supported soon. Take a look at our roadmap on Discourse. Your contribution is welcome! Please star our project to help us grow.

For more info and tutorials: https://speechbrain.github.io/