PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit.

Overview


PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit. It provides easy-to-use, low-overhead, first-class Python wrappers for the C++ code in the Kaldi and OpenFst libraries. You can use PyKaldi to write Python code for things that would otherwise require writing C++ code, such as calling low-level Kaldi functions, manipulating Kaldi and OpenFst objects in code, or implementing new Kaldi tools.

You can think of Kaldi as a large box of legos that you can mix and match to build custom speech recognition solutions. The best way to think of PyKaldi is as a supplement, a sidekick if you will, to Kaldi. In fact, PyKaldi is at its best when it is used alongside Kaldi. To that end, replicating the functionality of myriad command-line tools, utility scripts and shell-level recipes provided by Kaldi is a non-goal for the PyKaldi project.

Getting Started

Like Kaldi, PyKaldi is primarily intended for speech recognition researchers and professionals. It is jam-packed with goodies that one would need to build Python software taking advantage of the vast collection of utilities, algorithms and data structures provided by the Kaldi and OpenFst libraries.

If you are not familiar with FST-based speech recognition or have no interest in having access to the guts of Kaldi and OpenFst in Python, but only want to run a pre-trained Kaldi system as part of your Python application, do not fret. PyKaldi includes a number of high-level application oriented modules, such as asr, alignment and segmentation, that should be accessible to most Python programmers.

If you are interested in using PyKaldi for research or building advanced ASR applications, you are in luck. PyKaldi comes with everything you need to read, write, inspect, manipulate or visualize Kaldi and OpenFst objects in Python. It includes Python wrappers for most functions and methods that are part of the public APIs of Kaldi and OpenFst C++ libraries. If you want to read/write files that are produced/consumed by Kaldi tools, check out I/O and table utilities in the util package. If you want to work with Kaldi matrices and vectors, e.g. convert them to NumPy ndarrays and vice versa, check out the matrix package. If you want to use Kaldi for feature extraction and transformation, check out the feat, ivector and transform packages. If you want to work with lattices or other FST structures produced/consumed by Kaldi tools, check out the fstext, lat and kws packages. If you want low-level access to Gaussian mixture models, hidden Markov models or phonetic decision trees in Kaldi, check out the gmm, sgmm2, hmm, and tree packages. If you want low-level access to Kaldi neural network models, check out the nnet3, cudamatrix and chain packages. If you want to use the decoders and language modeling utilities in Kaldi, check out the decoder, lm, rnnlm, tfrnnlm and online2 packages.

Interested readers who would like to learn more about Kaldi and PyKaldi might find the following resources useful:

Since automatic speech recognition (ASR) in Python is undoubtedly the "killer app" for PyKaldi, we will go over a few ASR scenarios to get a feel for the PyKaldi API. We should note that PyKaldi does not provide any high-level utilities for training ASR models, so you need to train your models using Kaldi recipes or use pre-trained models available online. The reason is simply that there is no high-level ASR training API in the Kaldi C++ libraries. Kaldi ASR models are trained using complex shell-level recipes that handle everything from data preparation to the orchestration of myriad Kaldi executables used in training. This is by design and unlikely to change in the future. PyKaldi does provide wrappers for the low-level ASR training utilities in the Kaldi C++ libraries, but those are not really useful unless you want to build an ASR training pipeline in Python from basic building blocks, which is no easy task. Continuing with the lego analogy, this task is akin to building this given access to a truck full of legos you might need. If you are crazy enough to try though, please don't let this paragraph discourage you. Before we started building PyKaldi, we thought that was a mad man's task too.

Automatic Speech Recognition in Python

The PyKaldi asr module includes a number of easy-to-use, high-level classes that make it dead simple to put together ASR systems in Python. Ignoring the boilerplate code needed for setting things up, doing ASR with PyKaldi can be as simple as the following snippet of code:

asr = SomeRecognizer.from_files("final.mdl", "HCLG.fst", "words.txt", opts)

with SequentialMatrixReader("ark:feats.ark") as feats_reader:
    for key, feats in feats_reader:
        out = asr.decode(feats)
        print(key, out["text"])

In this simplified example, we first instantiate a hypothetical recognizer SomeRecognizer with the paths for the model final.mdl, the decoding graph HCLG.fst and the symbol table words.txt. The opts object contains the configuration options for the recognizer. Then, we instantiate a PyKaldi table reader SequentialMatrixReader for reading the feature matrices stored in the Kaldi archive feats.ark. Finally, we iterate over the feature matrices and decode them one by one. Here we are simply printing the best ASR hypothesis for each utterance so we are only interested in the "text" entry of the output dictionary out. Keep in mind that the output dictionary contains a bunch of other useful entries, such as the frame level alignment of the best hypothesis and a weighted lattice representing the most likely hypotheses. Admittedly, not all ASR pipelines will be as simple as this example, but they will often have the same overall structure. In the following sections, we will see how we can adapt the code given above to implement more complicated ASR pipelines.

Offline ASR using Kaldi Models

This is the most common scenario. We want to do offline ASR using pre-trained Kaldi models, such as ASpIRE chain models. Here we are using the term "models" loosely to refer to everything one would need to put together an ASR system. In this specific example, we are going to need:

Note that you can use this example code to decode with ASpIRE chain models.

from kaldi.asr import NnetLatticeFasterRecognizer
from kaldi.decoder import LatticeFasterDecoderOptions
from kaldi.nnet3 import NnetSimpleComputationOptions
from kaldi.util.table import SequentialMatrixReader, CompactLatticeWriter

# Set the paths and read/write specifiers
model_path = "models/aspire/final.mdl"
graph_path = "models/aspire/graph_pp/HCLG.fst"
symbols_path = "models/aspire/graph_pp/words.txt"
feats_rspec = ("ark:compute-mfcc-feats --config=models/aspire/conf/mfcc.conf "
               "scp:wav.scp ark:- |")
ivectors_rspec = (feats_rspec + "ivector-extract-online2 "
                  "--config=models/aspire/conf/ivector_extractor.conf "
                  "ark:spk2utt ark:- ark:- |")
lat_wspec = "ark:| gzip -c > lat.gz"

# Instantiate the recognizer
decoder_opts = LatticeFasterDecoderOptions()
decoder_opts.beam = 13
decoder_opts.max_active = 7000
decodable_opts = NnetSimpleComputationOptions()
decodable_opts.acoustic_scale = 1.0
decodable_opts.frame_subsampling_factor = 3
asr = NnetLatticeFasterRecognizer.from_files(
    model_path, graph_path, symbols_path,
    decoder_opts=decoder_opts, decodable_opts=decodable_opts)

# Extract the features, decode and write output lattices
with SequentialMatrixReader(feats_rspec) as feats_reader, \
     SequentialMatrixReader(ivectors_rspec) as ivectors_reader, \
     CompactLatticeWriter(lat_wspec) as lat_writer:
    for (fkey, feats), (ikey, ivectors) in zip(feats_reader, ivectors_reader):
        assert fkey == ikey
        out = asr.decode((feats, ivectors))
        print(fkey, out["text"])
        lat_writer[fkey] = out["lattice"]

The fundamental difference between this example and the short snippet from the previous section is that for each utterance we are reading the raw audio data from disk and computing two feature matrices on the fly instead of reading a single precomputed feature matrix from disk. The script file wav.scp contains a list of WAV files corresponding to the utterances we want to decode. The additional feature matrix we are extracting contains online i-vectors that are used by the neural network acoustic model to perform channel and speaker adaptation. The speaker-to-utterance map spk2utt is used for accumulating separate statistics for each speaker in online i-vector extraction. It can be a simple identity mapping if the speaker information is not available. We pack the MFCC features and the i-vectors into a tuple and pass this tuple to the recognizer for decoding. The neural network recognizers in PyKaldi know how to handle the additional i-vector features when they are available. The model file final.mdl contains both the transition model and the neural network acoustic model. The NnetLatticeFasterRecognizer processes feature matrices by first computing phone log-likelihoods using the neural network acoustic model, then mapping those to transition log-likelihoods using the transition model and finally decoding transition log-likelihoods into word sequences using the decoding graph HCLG.fst, which has transition IDs on its input labels and word IDs on its output labels. After decoding, we save the lattice generated by the recognizer to a Kaldi archive for future processing.
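When no speaker information is available, the identity spk2utt mapping mentioned above can be generated from the utterance IDs in wav.scp. The sketch below is only an illustration: the helper name identity_spk2utt is hypothetical (it is not part of PyKaldi), and it assumes the usual Kaldi convention that the first whitespace-separated field of each wav.scp line is the utterance ID.

```python
def identity_spk2utt(wav_scp_lines):
    """Return spk2utt lines that treat every utterance as its own speaker.

    Hypothetical helper, not part of PyKaldi. Assumes each wav.scp line
    starts with the utterance ID followed by a path or a command.
    """
    lines = []
    for line in wav_scp_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        utt_id = line.split(None, 1)[0]      # first field is the utterance ID
        lines.append(f"{utt_id} {utt_id}")   # speaker ID == utterance ID
    return lines


# Example with two made-up entries
wav_scp = ["utt1 audio/utt1.wav", "utt2 audio/utt2.wav"]
print("\n".join(identity_spk2utt(wav_scp)))
```

Writing the returned lines to a file named spk2utt would match the ark:spk2utt argument used in the i-vector read specifier above.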

This example also illustrates the powerful I/O mechanisms provided by Kaldi. Instead of implementing the feature extraction pipelines in code, we define them as Kaldi read specifiers and compute the feature matrices simply by instantiating PyKaldi table readers and iterating over them. This is not only the simplest but also the fastest way of computing features with PyKaldi since the feature extraction pipeline is run in parallel by the operating system. Similarly, we use a Kaldi write specifier to instantiate a PyKaldi table writer which writes output lattices to a compressed Kaldi archive. Note that for these to work, we need compute-mfcc-feats, ivector-extract-online2 and gzip to be on our PATH.
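Since the read specifiers are ordinary strings, they can be assembled by small helper functions when the same pipeline is reused with different configuration files. The following is a minimal sketch; the function names mfcc_rspecifier and ivector_rspecifier are hypothetical, and the command lines are copied from the example above rather than derived from any PyKaldi API.

```python
def mfcc_rspecifier(mfcc_conf, wav_scp="wav.scp"):
    """Build a read specifier that pipes MFCC features out of compute-mfcc-feats."""
    return f"ark:compute-mfcc-feats --config={mfcc_conf} scp:{wav_scp} ark:- |"


def ivector_rspecifier(feats_rspec, ivector_conf, spk2utt="spk2utt"):
    """Extend a feature read specifier with online i-vector extraction."""
    return (f"{feats_rspec}ivector-extract-online2 "
            f"--config={ivector_conf} ark:{spk2utt} ark:- ark:- |")


feats_rspec = mfcc_rspecifier("models/aspire/conf/mfcc.conf")
ivectors_rspec = ivector_rspecifier(
    feats_rspec, "models/aspire/conf/ivector_extractor.conf")
print(feats_rspec)
print(ivectors_rspec)
```

The resulting strings are the same ones passed to SequentialMatrixReader in the example; the helpers only remove the repetition.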

Offline ASR using a PyTorch Acoustic Model

This is similar to the previous scenario, but instead of a Kaldi acoustic model, we use a PyTorch acoustic model. After computing the features as before, we convert them to a PyTorch tensor, do the forward pass using a PyTorch neural network module outputting phone log-likelihoods and finally convert those log-likelihoods back into a PyKaldi matrix for decoding. The recognizer uses the transition model to automatically map phone IDs to transition IDs, the input labels on a typical Kaldi decoding graph.

from kaldi.asr import MappedLatticeFasterRecognizer
from kaldi.decoder import LatticeFasterDecoderOptions
from kaldi.matrix import Matrix
from kaldi.util.table import SequentialMatrixReader, CompactLatticeWriter
from models import AcousticModel  # Import your PyTorch model
import torch

# Set the paths and read/write specifiers
acoustic_model_path = "models/aspire/model.pt"
transition_model_path = "models/aspire/final.mdl"
graph_path = "models/aspire/graph_pp/HCLG.fst"
symbols_path = "models/aspire/graph_pp/words.txt"
feats_rspec = ("ark:compute-mfcc-feats --config=models/aspire/conf/mfcc.conf "
               "scp:wav.scp ark:- |")
lat_wspec = "ark:| gzip -c > lat.gz"

# Instantiate the recognizer
decoder_opts = LatticeFasterDecoderOptions()
decoder_opts.beam = 13
decoder_opts.max_active = 7000
asr = MappedLatticeFasterRecognizer.from_files(
    transition_model_path, graph_path, symbols_path, decoder_opts=decoder_opts)

# Instantiate the PyTorch acoustic model (subclass of torch.nn.Module)
model = AcousticModel(...)
model.load_state_dict(torch.load(acoustic_model_path))
model.eval()

# Extract the features, decode and write output lattices
with SequentialMatrixReader(feats_rspec) as feats_reader, \
     CompactLatticeWriter(lat_wspec) as lat_writer:
    for key, feats in feats_reader:
        feats = torch.from_numpy(feats.numpy())  # Convert to PyTorch tensor
        with torch.no_grad():                    # Inference only, no autograd graph
            loglikes = model(feats)              # Compute log-likelihoods
        loglikes = Matrix(loglikes.numpy())      # Convert to PyKaldi matrix
        out = asr.decode(loglikes)
        print(key, out["text"])
        lat_writer[key] = out["lattice"]

Online ASR using Kaldi Models

This section is a placeholder. Check out this script in the meantime.

Lattice Rescoring with a Kaldi RNNLM

Lattice rescoring is a standard technique for using large n-gram language models or recurrent neural network language models (RNNLMs) in ASR. In this example, we rescore lattices using a Kaldi RNNLM. We first instantiate a rescorer by providing the paths for the models. Then we use a table reader to iterate over the lattices we want to rescore, and finally we use a table writer to write the rescored lattices back to disk.

from kaldi.asr import LatticeRnnlmPrunedRescorer
from kaldi.fstext import SymbolTable
from kaldi.rnnlm import RnnlmComputeStateComputationOptions
from kaldi.util.table import SequentialCompactLatticeReader, CompactLatticeWriter

# Set the paths, extended filenames and read/write specifiers
symbols_path = "models/tedlium/config/words.txt"
old_lm_path = "models/tedlium/data/lang_nosp/G.fst"
word_feats_path = "models/tedlium/word_feats.txt"
feat_embedding_path = "models/tedlium/feat_embedding.final.mat"
word_embedding_rxfilename = ("rnnlm-get-word-embedding %s %s - |"
                             % (word_feats_path, feat_embedding_path))
rnnlm_path = "models/tedlium/final.raw"
lat_rspec = "ark:gunzip -c lat.gz |"
lat_wspec = "ark:| gzip -c > rescored_lat.gz"

# Instantiate the rescorer
symbols = SymbolTable.read_text(symbols_path)
opts = RnnlmComputeStateComputationOptions()
opts.bos_index = symbols.find_index("<s>")
opts.eos_index = symbols.find_index("</s>")
opts.brk_index = symbols.find_index("<brk>")
rescorer = LatticeRnnlmPrunedRescorer.from_files(
    old_lm_path, word_embedding_rxfilename, rnnlm_path, opts=opts)

# Read the lattices, rescore and write output lattices
with SequentialCompactLatticeReader(lat_rspec) as lat_reader, \
     CompactLatticeWriter(lat_wspec) as lat_writer:
    for key, lat in lat_reader:
        lat_writer[key] = rescorer.rescore(lat)

Notice the extended filename we used to compute the word embeddings from the word features and the feature embeddings on the fly. Also of note are the read/write specifiers we used to transparently decompress/compress the lattice archives. For these to work, we need rnnlm-get-word-embedding, gunzip and gzip to be on our PATH.
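Because a missing executable only shows up once a reader or writer tries to launch the pipeline, it can help to check PATH up front. Here is a minimal sketch using only the Python standard library; the helper name missing_executables is hypothetical and not part of PyKaldi.

```python
import shutil


def missing_executables(names):
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Tools referenced by the extended filename and the read/write specifiers above
required = ["rnnlm-get-word-embedding", "gunzip", "gzip"]
missing = missing_executables(required)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required executables were found.")
```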

About PyKaldi

PyKaldi aims to bridge the gap between Kaldi and all the nice things Python has to offer. It is more than a collection of bindings into Kaldi libraries. It is a scripting layer providing first class support for essential Kaldi and OpenFst types in Python. PyKaldi vector and matrix types are tightly integrated with NumPy. They can be seamlessly converted to NumPy arrays and vice versa without copying the underlying memory buffers. PyKaldi FST types, including Kaldi style lattices, are first class citizens in Python. The API for the user facing FST types and operations is almost entirely defined in Python mimicking the API exposed by pywrapfst, the official Python wrapper for OpenFst.

PyKaldi harnesses the power of CLIF to wrap Kaldi and OpenFst C++ libraries using simple API descriptions. The CPython extension modules generated by CLIF can be imported in Python to interact with Kaldi and OpenFst. While CLIF is great for exposing existing C++ APIs in Python, the wrappers do not always expose a "Pythonic" API that is easy to use from Python. PyKaldi addresses this by extending the raw CLIF wrappers in Python (and sometimes in C++) to provide a more "Pythonic" API. The figure below illustrates where PyKaldi fits in the Kaldi ecosystem.

Architecture

PyKaldi has a modular design which makes it easy to maintain and extend. Source files are organized in a directory tree that is a replica of the Kaldi source tree. Each directory defines a subpackage and contains only the wrapper code written for the associated Kaldi library. The wrapper code consists of:

  • CLIF C++ API descriptions defining the types and functions to be wrapped and their Python API,

  • C++ headers defining the shims for Kaldi code that is not compliant with the Google C++ style expected by CLIF,

  • Python modules grouping together related extension modules generated with CLIF and extending the raw CLIF wrappers to provide a more "Pythonic" API.

You can read more about the design and technical details of PyKaldi in our paper.

Coverage Status

The following table shows the status of each PyKaldi package (we currently do not plan to add support for nnet, nnet2 and online) along the following dimensions:

  • Wrapped?: If there are enough CLIF files to make the package usable in Python.
  • Pythonic?: If the package API has a "Pythonic" look-and-feel.
  • Documentation?: If there is documentation beyond what is automatically generated by CLIF. A single checkmark indicates that there is not much additional documentation (if any). Three checkmarks indicate that package documentation is complete (or near complete).
  • Tests?: If there are any tests for the package.

Package Wrapped? Pythonic? Documentation? Tests?
base
chain
cudamatrix
decoder
feat
fstext
gmm
hmm
ivector
kws
lat
lm
matrix
nnet3
online2
rnnlm
sgmm2
tfrnnlm
transform
tree
util

Installation

If you are using a relatively recent Linux or macOS, such as Ubuntu >= 16.04, CentOS >= 7 or macOS >= 10.13, you should be able to install PyKaldi without too much trouble. Otherwise, you will likely need to tweak the installation scripts.

Conda

To install PyKaldi with CUDA support:

conda install -c pykaldi pykaldi

To install PyKaldi without CUDA support (CPU only):

conda install -c pykaldi pykaldi-cpu

Note that the PyKaldi conda package does not provide Kaldi executables. If you would like to use Kaldi executables along with PyKaldi, e.g. as part of read/write specifiers, you need to install Kaldi separately.

Docker

If you would like to use PyKaldi inside a Docker container, follow the instructions in the docker folder.

From Source

To install PyKaldi from source, follow the steps given below.

Step 1: Clone PyKaldi Repository and Create a New Python Environment

git clone https://github.com/pykaldi/pykaldi.git
cd pykaldi

Although it is not required, we recommend installing PyKaldi and all of its Python dependencies inside a new isolated Python environment. If you do not want to create a new Python environment, you can skip the rest of this step.

You can use any tool you like for creating a new Python environment. Here we use virtualenv, but you can use another tool like conda if you prefer that. Make sure you activate the new Python environment before continuing with the rest of the installation.

virtualenv env
source env/bin/activate

Step 2: Install Dependencies

Running the commands below will install the system packages needed for building PyKaldi from source.

# Ubuntu
sudo apt-get install autoconf automake cmake curl g++ git graphviz \
    libatlas3-base libtool make pkg-config subversion unzip wget zlib1g-dev

# macOS
brew install automake cmake git graphviz libtool pkg-config wget

Running the commands below will install the Python packages needed for building PyKaldi from source.

pip install --upgrade pip
pip install --upgrade setuptools
pip install numpy pyparsing
pip install ninja  # not required but strongly recommended

In addition to the packages listed above, we also need PyKaldi compatible installations of the following software:

  • Google Protobuf, recommended v3.5.0. Both the C++ library and the Python package must be installed.

  • PyKaldi compatible fork of CLIF. To streamline PyKaldi development, we made some changes to the CLIF codebase. We are hoping to upstream these changes over time. These changes are in the pykaldi branch:

git clone -b pykaldi https://github.com/pykaldi/clif

  • PyKaldi compatible fork of Kaldi. To comply with CLIF requirements, we had to make some changes to the Kaldi codebase. We are hoping to upstream these changes over time. These changes are in the pykaldi branch:

git clone -b pykaldi https://github.com/pykaldi/kaldi

You can use the scripts in the tools directory to install or update this software locally. Make sure you check the output of these scripts. If you do not see Done installing {protobuf,CLIF,Kaldi} printed at the very end, it means that installation has failed for some reason.

cd tools
./check_dependencies.sh  # checks if system dependencies are installed
./install_protobuf.sh    # installs both the C++ library and the Python package
./install_clif.sh        # installs both the C++ library and the Python package
./install_kaldi.sh       # installs the C++ library
cd ..
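If you capture the output of an install script (for example by redirecting it to a log file), the success check described above can be automated. The sketch below is an assumption-laden illustration: the helper name install_succeeded is hypothetical, the comparison is case-insensitive because the exact capitalization of the final message is not guaranteed here, and the sample log text is made up for the example.

```python
def install_succeeded(output, name):
    """Check whether the last non-empty line of an install log reports success.

    Hypothetical helper. `name` is one of "protobuf", "CLIF" or "Kaldi";
    the match is case-insensitive.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        return False
    return lines[-1].lower().startswith(f"done installing {name.lower()}")


# Illustrative log text, not real script output
sample_log = "Checking Protobuf version...\nDone installing Protobuf.\n"
print(install_succeeded(sample_log, "protobuf"))
```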

Step 3: Install PyKaldi

If Kaldi is installed inside the tools directory and all Python dependencies (numpy, pyparsing, pyclif, protobuf) are installed in the active Python environment, you can install PyKaldi with the following command.

python setup.py install

Once installed, you can run PyKaldi tests with the following command.

python setup.py test

FAQ

How do I prevent the PyKaldi install command from exhausting the system memory?

By default, the PyKaldi install command uses all available (logical) processors to accelerate the build process. If the size of the system memory is relatively small compared to the number of processors, the parallel compilation/linking jobs might end up exhausting the system memory and result in swapping. You can limit the number of parallel jobs used for building PyKaldi as follows:

MAKE_NUM_JOBS=2 python setup.py install
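One way to choose a value for MAKE_NUM_JOBS is to cap the job count by available memory as well as by processor count. The following sketch is a rough heuristic, not something prescribed by PyKaldi: the helper name pick_num_jobs is hypothetical, and the default of 2 GB per job is an assumption you should adjust for your own machine.

```python
import os


def pick_num_jobs(num_cpus, mem_gb, gb_per_job=2.0):
    """Pick a parallel job count limited by both CPUs and memory.

    gb_per_job is an assumed memory budget per compile/link job,
    not a measured figure.
    """
    by_memory = int(mem_gb // gb_per_job)
    return max(1, min(num_cpus, by_memory))


# Example: a machine with 8 GB of memory (assumed value)
jobs = pick_num_jobs(os.cpu_count() or 1, mem_gb=8)
print(f"MAKE_NUM_JOBS={jobs} python setup.py install")
```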

How do I build PyKaldi on Windows?

We have no idea what is needed to build PyKaldi on Windows. It would probably require lots of changes to the build system.

How do I build PyKaldi using a different Kaldi installation?

At the moment, PyKaldi is not compatible with the upstream Kaldi repository. You need to build it against our Kaldi fork.

If you already have a compatible Kaldi installation on your system, you do not need to install a new one inside the pykaldi/tools directory. Instead, you can set the following environment variable before running the PyKaldi installation command.

export KALDI_DIR=<directory where Kaldi is installed, e.g. "$HOME/tools/kaldi">

How do I build PyKaldi using a different CLIF installation?

At the moment, PyKaldi is not compatible with the upstream CLIF repository. You need to build it using our CLIF fork.

If you already have a compatible CLIF installation on your system, you do not need to install a new one inside the pykaldi/tools directory. Instead, you can simply set the following environment variables before running the PyKaldi installation command.

export PYCLIF=<path to pyclif executable, e.g. "$HOME/anaconda3/envs/clif/bin/pyclif">
export CLIF_MATCHER=<path to clif-matcher executable, e.g. "$HOME/anaconda3/envs/clif/clang/bin/clif-matcher">
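Before starting a long build against external Kaldi or CLIF installations, it may be worth confirming that these variables point at paths that actually exist. The sketch below uses only the standard library; the helper name check_build_env is hypothetical, and it only checks existence, not whether the installation is compatible with PyKaldi.

```python
import os


def check_build_env(env, names=("KALDI_DIR", "PYCLIF", "CLIF_MATCHER")):
    """Report variables that are unset or point at a non-existent path.

    Hypothetical helper. An unset variable is only a problem if you
    intend to use an installation outside the pykaldi/tools directory.
    """
    problems = {}
    for name in names:
        value = env.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.exists(value):
            problems[name] = f"path does not exist: {value}"
    return problems


for name, problem in check_build_env(os.environ).items():
    print(f"{name}: {problem}")
```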

How do I update Protobuf, CLIF or Kaldi used by PyKaldi?

While the need for updating Protobuf and CLIF should not come up very often, you might want or need to update the Kaldi installation used for building PyKaldi. Rerunning the relevant install script in the tools directory should update the existing installation. If this does not work, please open an issue.

How do I build PyKaldi with TensorFlow RNNLM support?

The PyKaldi tfrnnlm package is built automatically along with the rest of PyKaldi if the kaldi-tensorflow-rnnlm library can be found among the Kaldi libraries. After building Kaldi, go to the KALDI_DIR/src/tfrnnlm/ directory and follow the instructions given in the Makefile. Make sure the symbolic link for the kaldi-tensorflow-rnnlm library is added to the KALDI_DIR/src/lib/ directory.

Citing

If you use PyKaldi for research, please cite our paper as follows:

@inproceedings{pykaldi,
  title = {PyKaldi: A Python Wrapper for Kaldi},
  author = {Dogan Can and Victor R. Martinez and Pavlos Papadopoulos and
            Shrikanth S. Narayanan},
  booktitle={Acoustics, Speech and Signal Processing (ICASSP),
             2018 IEEE International Conference on},
  year = {2018},
  organization = {IEEE}
}

Contributing

We appreciate all contributions! If you find a bug, feel free to open an issue or a pull request. If you would like to request or add a new feature please open an issue for discussion.

Comments
  • Use cmake3 for installation on CentOS

    CentOS package manager (yum) mirrors by default have cmake2.6 and use cmake3 for cmake ver >=3.0. Hence it would be useful to include a script that takes care of that, shouldn't be a problem, just replace cmake with cmake3 in the installation script for clif.

    opened by itsmesatwik 32
  • Zamia ASR pre-trained model.

    I'm trying to use one of the pre-trained ASR speech-to-text models on a test wav file in Python. What's the easiest way to do this? I just finished installing pykaldi. Could someone provide me a snippet of code on how to do this? Say I want to use the kaldi-generic-en-tdnn_f model.

    opened by Cloud299 21
  • How do you build `kaldi/lib/_clif.so`?

    I am trying to build a wheel for this package. I managed to encapsulate kaldi shared lib into the wheel using auditwheel. Unfortunately it still fails with

        from ._kaldi_error import *
    ImportError: _clif.so: cannot open shared object file: No such file or directory
    

    I don't understand how you build kaldi/lib/_clif.so from sources in pykaldi/kaldi/lib/clif/python/. Is there some code missing in the repo?

    opened by dmitriy-serdyuk 21
  • instillation error

    I followed the steps given on the README page but am getting the following errors when I run "python setup.py install":

    [9/983] Building CXX object kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap-init.cc.o FAILED: kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap-init.cc.o /usr/bin/c++ -D_kaldi_error_EXPORTS -I../kaldi/lib -I../kaldi -Ikaldi -I../tools/kaldi/src -I/usr/include/python2.7 -I/home/sgangireddy/venv_py36/lib/python3.6/site-packages/numpy/core/include -std=c++11 -I.. -isystem /home/sgangireddy/PycharmProjects/pykaldi/tools/kaldi/tools/openfst/include -O1 -Wall -Wno-sign-compare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -I/home/sgangireddy/PycharmProjects/pykaldi/tools/kaldi/tools/ATLAS_headers/include -msse -msse2 -pthread -g -fPIC -Wno-maybe-uninitialized -fPIC -MD -MT kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap-init.cc.o -MF kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap-init.cc.o.d -o kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap-init.cc.o -c kaldi/base/kaldi-error-clifwrap-init.cc kaldi/base/kaldi-error-clifwrap-init.cc: In function ‘void PyInit__kaldi_error()’: kaldi/base/kaldi-error-clifwrap-init.cc:17:60: error: return-statement with a value, in function returning 'void' [-fpermissive] if (!kaldi_base___kaldi__error_clifwrap::Ready()) return nullptr; ^ kaldi/base/kaldi-error-clifwrap-init.cc:18:51: error: return-statement with a value, in function returning 'void' [-fpermissive] return kaldi_base___kaldi__error_clifwrap::Init(); ^ [12/983] Building CXX object kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap.cc.o FAILED: kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap.cc.o /usr/bin/c++ -D_kaldi_error_EXPORTS -I../kaldi/lib -I../kaldi -Ikaldi -I../tools/kaldi/src -I/usr/include/python2.7 -I/home/sgangireddy/venv_py36/lib/python3.6/site-packages/numpy/core/include -std=c++11 -I.. 
-isystem /home/sgangireddy/PycharmProjects/pykaldi/tools/kaldi/tools/openfst/include -O1 -Wall -Wno-sign-compare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -I/home/sgangireddy/PycharmProjects/pykaldi/tools/kaldi/tools/ATLAS_headers/include -msse -msse2 -pthread -g -fPIC -Wno-maybe-uninitialized -fPIC -MD -MT kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap.cc.o -MF kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap.cc.o.d -o kaldi/base/CMakeFiles/_kaldi_error.dir/kaldi-error-clifwrap.cc.o -c kaldi/base/kaldi-error-clifwrap.cc kaldi/base/kaldi-error-clifwrap.cc:167:27: error: variable ‘kaldi_base___kaldi__error_clifwrap::PyModuleDef kaldi_base___kaldi__error_clifwrap::Module’ has initializer but incomplete type static struct PyModuleDef Module = { ^ kaldi/base/kaldi-error-clifwrap.cc:168:3: error: ‘PyModuleDef_HEAD_INIT’ was not declared in this scope PyModuleDef_HEAD_INIT, ^ kaldi/base/kaldi-error-clifwrap.cc: In function ‘PyObject* kaldi_base___kaldi__error_clifwrap::Init()’: kaldi/base/kaldi-error-clifwrap.cc:176:45: error: ‘PyModule_Create’ was not declared in this scope PyObject* module = PyModule_Create(&Module); ^ [13/983] Building CXX object kaldi/base/CMakeFiles/_kaldi_math.dir/kaldi-math-clifwrap.cc.o FAILED: kaldi/base/CMakeFiles/_kaldi_math.dir/kaldi-math-clifwrap.cc.o /usr/bin/c++ -D_kaldi_math_EXPORTS -I../kaldi/lib -I../kaldi -Ikaldi -I../tools/kaldi/src -I/usr/include/python2.7 -I/home/sgangireddy/venv_py36/lib/python3.6/site-packages/numpy/core/include -std=c++11 -I.. 
-isystem /home/sgangireddy/PycharmProjects/pykaldi/tools/kaldi/tools/openfst/include -O1 -Wall -Wno-sign-compare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -I/home/sgangireddy/PycharmProjects/pykaldi/tools/kaldi/tools/ATLAS_headers/include -msse -msse2 -pthread -g -fPIC -Wno-maybe-uninitialized -fPIC -MD -MT kaldi/base/CMakeFiles/_kaldi_math.dir/kaldi-math-clifwrap.cc.o -MF kaldi/base/CMakeFiles/_kaldi_math.dir/kaldi-math-clifwrap.cc.o.d -o kaldi/base/CMakeFiles/_kaldi_math.dir/kaldi-math-clifwrap.cc.o -c kaldi/base/kaldi-math-clifwrap.cc kaldi/base/kaldi-math-clifwrap.cc: In function ‘int kaldi_base___kaldi__math_clifwrap::pyRandomState::set_seed(PyObject*, PyObject*, void*)’: kaldi/base/kaldi-math-clifwrap.cc:44:87: error: ‘PyUnicode_AsUTF8’ was not declared in this scope PyErr_Format(PyExc_ValueError, "%s is not valid for seed:int", s? PyUnicode_AsUTF8(s): "input"); ^ kaldi/base/kaldi-math-clifwrap.cc: At global scope: kaldi/base/kaldi-math-clifwrap.cc:781:27: error: variable ‘kaldi_base___kaldi__math_clifwrap::PyModuleDef kaldi_base___kaldi__math_clifwrap::Module’ has initializer but incomplete type static struct PyModuleDef Module = { ^ kaldi/base/kaldi-math-clifwrap.cc:782:3: error: ‘PyModuleDef_HEAD_INIT’ was not declared in this scope PyModuleDef_HEAD_INIT, ^ kaldi/base/kaldi-math-clifwrap.cc: In function ‘PyObject* kaldi_base___kaldi__math_clifwrap::Init()’: kaldi/base/kaldi-math-clifwrap.cc:790:45: error: ‘PyModule_Create’ was not declared in this scope PyObject* module = PyModule_Create(&Module); ^ [14/983] Generating fstream-clifwrap.cc, fstream-clifwrap.h, fstream-clifwrap-init.cc ninja: build stopped: subcommand failed. Command '['ninja', '-j', '6']' returned non-zero exit status 1.

    opened by srgangireddy 15
  • there is a question in getting fbank_pitch feature between pykaldi and kaldi

    1. Use Kaldi to get the fbank_pitch feature as C:

    fbank_feats="ark:compute-fbank-feats $vtln_opts --verbose=2 --config=$fbank_config scp,p:$logdir/wav.JOB.scp ark:- |"
    pitch_feats="ark,s,cs:compute-kaldi-pitch-feats --verbose=2 --config=$pitch_config scp,p:$logdir/wav.JOB.scp ark:- | process-kaldi-pitch-feats $postprocess_config_opt ark:- ark:- |"

    paste-feats --length-tolerance=$paste_length_tolerance "$fbank_feats" "$pitch_feats" ark:- |
    copy-feats --compress=$compress $write_num_frames_opt ark:- \

    2. Use PyKaldi to get the fbank_pitch feature as feat.

    3. Then use result=approx_equal(C,feat,0.1). The result is False. Is C really different from feat, given that a tolerance of 0.1 is already so large?

    opened by liuchenbaidu 13
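A single True/False from approx_equal says little about where two feature matrices diverge. The sketch below is a plain-Python diagnostic, not PyKaldi code: `compare_features` and `frobenius_norm` are hypothetical helpers, the matrices are assumed to have been converted to lists of rows already, and the relative Frobenius-norm criterion is an assumption about how the tolerance is applied. It first checks the shapes, since a frame-count mismatch would be one possible reason for a False result, and then reports the relative and maximum absolute differences.

```python
import math


def frobenius_norm(rows):
    """Square root of the sum of squared entries of a matrix given as rows."""
    return math.sqrt(sum(x * x for row in rows for x in row))


def compare_features(a, b, tol=0.1):
    """Summarize how two feature matrices (lists of rows) differ.

    Assumption: 'approximately equal' means ||a - b|| <= tol * ||a||
    in the Frobenius norm. Adjust if your comparison is defined differently.
    """
    shape_a = (len(a), len(a[0]) if a else 0)
    shape_b = (len(b), len(b[0]) if b else 0)
    if shape_a != shape_b:
        # Different frame counts or dimensions: element-wise comparison is undefined.
        return {"same_shape": False, "shape_a": shape_a, "shape_b": shape_b}

    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    diff_norm = frobenius_norm(diff)
    ref_norm = frobenius_norm(a)
    return {
        "same_shape": True,
        "relative_error": diff_norm / ref_norm if ref_norm > 0 else float("inf"),
        "max_abs_diff": max((abs(x) for row in diff for x in row), default=0.0),
        "approx_equal": diff_norm <= tol * ref_norm,
    }
```

If the shapes match but the relative error is large, printing the per-column differences for the pitch and fbank dimensions separately would narrow down which half of the pipeline disagrees.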
  • Phoneme Aligner example

    Considering other (closed) issues that seemed to be making similar requests (and were denied), I'm not sure this is a legitimate request, but I'd regret not asking since I really need this.

    Would you guys be able to help me write (or even better, write yourself :stuck_out_tongue: ) an example file for phoneme forced alignment with PyKaldi?

    opened by hadware 13
  • there is a problem in building PyKaldi using a different Kaldi installation.

    I have installed Kaldi and exported KALDI_DIR=<path to Kaldi>. Running "python setup.py install" then gives this prompt:

    Kaldi installation at /wiki/victorliu/espnet03/espnet/tools/kaldi is not supported. Please update Kaldi to match https://github.com/pykaldi/kaldi/tree/pykaldi.

    opened by liuchenbaidu 12
  • there is a problem in installing from source  build PyKaldi using a different Kaldi installation

    I ran ./install_protobuf.sh in the tools directory:

    Protobuf found in PATH.
    Checking Protobuf version...
    Protobuf version: 3.6.1
    Protobuf version is compatible.
    Checking Protobuf Python package...
    Protobuf Python package version: 3.5.1
    Protobuf Python package version is compatible.
    Done installing Protobuf.

    Then I ran ./install_clif.sh, which printed:

    Package protobuf was not found in the pkg-config search path.
    Perhaps you should add the directory containing `protobuf.pc'
    to the PKG_CONFIG_PATH environment variable
    No package 'protobuf' found

    opened by liuchenbaidu 11
  • Rescoring output with grammar

    Current Kaldi source code includes a binary, lattice-lmrescore, which makes it possible to run recognition with the original HCLG.fst and then re-align the recognition results according to a new grammar G.fst. Does your package have similar functionality? If not, what would be a possible solution using your package?

    I'm aware that this is not an issue, but I don't know of any other place to ask the question :)

    opened by DinoTheDinosaur 11
  • Add wrapper for new access functions to the posteriors of the nnet

    This is my go at adding access to the NNet posteriors via pykaldi, see #113. I've done this without understanding much of CLIF, or the way pykaldi has wrapped kaldi, but at least it builds and the resulting pykaldi loads... It assumes this PR has been applied to pykaldi/kaldi.

    I need a little help at this stage, as activating this code crashes pykaldi/kaldi left, right and centre. Based on nnet-online-recognizer I activate this code by replacing the highlighted lines by

    for key, wav in SequentialWaveReader("scp:local/wav.scp"):
        feat_pipeline = OnlineNnetFeaturePipeline(feat_info)
        asr.set_input_pipeline(feat_pipeline)
        d = asr._decodable
        feat_pipeline.accept_waveform(wav.samp_freq, wav.data()[0])
        print(d.num_frames_ready())
        for i in range(d.num_frames_ready()):
            x = d.log_likelihoods(i)
            print(dir(x))
        feat_pipeline.input_finished()
        out = asr.decode()
        print(key, out["text"], flush=True)
    

    and actually after the feat_pipeline.input_finished() I would need some more d.log_likelihoods() statements.

    Currently, this segfaults, so there obviously are things I am doing wrong and/or assumptions I make that don't hold.

    I think the access to the Nnet's current_log_post_ is correct---but this results in a kaldi SubVector, and I suppose this needs proper copying or reference counting or whatever it takes to make sure the memory is still intact by the time pykaldi wants to use it.

    Then CLIF does its default thing, so the return type in python is <class 'kaldi.matrix._matrix_ext.SubVector'>. This may need an explicit constructor wrapper.

    Any help is greatly appreciated!

    opened by davidavdav 10
  • Pykaldi cannot be installed from source

    Hello!

    I installed pykaldi about 2-3 months ago in a conda Python 3.4 environment and it worked fine. Then not too long ago I had to reconfigure the environment, some things changed since then (Python 3.4 is not maintained anymore, I had to make a virtualenv, not a conda env), and my installation fails at the last step:

    (venv3.4) [ojakovenko@panda pykaldi]$ python setup.py install
    running install
    running bdist_egg
    running egg_info
    writing dependency_links to pykaldi.egg-info/dependency_links.txt
    writing top-level names to pykaldi.egg-info/top_level.txt
    writing requirements to pykaldi.egg-info/requires.txt
    writing pykaldi.egg-info/PKG-INFO
    reading manifest file 'pykaldi.egg-info/SOURCES.txt'
    writing manifest file 'pykaldi.egg-info/SOURCES.txt'
    installing library code to build/bdist.linux-x86_64/egg
    running install_lib
    running build_py
    running build_ext
    Using PYCLIF: /home/ojakovenko/telephone_robot/venv3.4/bin/pyclif
    Using CLIF_MATCHER: /home/ojakovenko/telephone_robot/venv3.4/clang/bin/clif-matcher
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/ds/DataScience/disrt/pykaldi/build
    [19/923] Generating io-funcs-inl-clifwrap.cc, io-funcs-inl-clifwrap.h, io-funcs-inl-clifwrap-init.cc
    FAILED: kaldi/base/io-funcs-inl-clifwrap.cc kaldi/base/io-funcs-inl-clifwrap.h kaldi/base/io-funcs-inl-clifwrap-init.cc
    cd /home/ds/DataScience/disrt/pykaldi/build/kaldi/base && /home/ojakovenko/telephone_robot/venv3.4/bin/pyclif --py3output --matcher_bin=/home/ojakovenko/telephone_robot/venv3.4/clang/bin/clif-matcher --ccdeps_out /home/ds/DataScience/disrt/pykaldi/build/kaldi/base/io-funcs-inl-clifwrap.cc --header_out /home/ds/DataScience/disrt/pykaldi/build/kaldi/base/io-funcs-inl-clifwrap.h --ccinit_out /home/ds/DataScience/disrt/pykaldi/build/kaldi/base/io-funcs-inl-clifwrap-init.cc --modname=_io_funcs_inl --prepend=clif/python/types.h -I/home/ds/DataScience/disrt/pykaldi/kaldi/lib -I/home/ds/DataScience/disrt/pykaldi/kaldi -I/home/ds/DataScience/disrt/pykaldi/build/kaldi -I/home/ds/DataScience/disrt/pykaldi/tools/kaldi/src "-f-I/home/ds/anaconda3/envs/chat_bot_env/include/python3.4m              -I/home/ds/DataScience/disrt/pykaldi/kaldi/lib
              -I/home/ds/DataScience/disrt/pykaldi/kaldi              -I/home/ds/DataScience/disrt/pykaldi/build/kaldi              -I/home/ds/DataScience/disrt/pykaldi/tools/kaldi/src              -I/home/ojakovenko/telephone_robot/venv3.4/clang/lib/clang/5.0.0/include               -std=c++11 -I.. -isystem /home/ds/DataScience/disrt/pykaldi/tools/kaldi/tools/openfst/include -O1 -Wall -Wno-sign-compare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -I/home/ds/DataScience/disrt/pykaldi/tools/kaldi/tools/ATLAS_headers/include -msse -msse2 -pthread -g -fPIC -DHAVE_CUDA -I/usr/local/cuda/include -Wno-maybe-uninitialized" /home/ds/DataScience/disrt/pykaldi/kaldi/base/io-funcs-inl.clif && /home/ds/DataScience/disrt/pykaldi/tools/use_namespace.sh /home/ds/DataScience/disrt/pykaldi/build/kaldi/base/io-funcs-inl-clifwrap.cc
    No suitable matches found for write_int_vector (with C++ name: kaldi::WriteIntegerVector) on line 9.
        Declaration was found, but not inside the required file.
        Clif expects it in the file base/io-funcs-inl.h but found it at /home/ds/DataScience/disrt/pykaldi/tools/kaldi/src/base/io-funcs.h:180:1
    No suitable matches found for read_int_vector (with C++ name: kaldi::ReadIntegerVector) on line 12.
        Declaration was found, but not inside the required file.
        Clif expects it in the file base/io-funcs-inl.h but found it at /home/ds/DataScience/disrt/pykaldi/tools/kaldi/src/base/io-funcs.h:184:1
    No suitable matches found for write_int_pair_vector (with C++ name: kaldi::WriteIntegerPairVector) on line 15.
        Declaration was found, but not inside the required file.
        Clif expects it in the file base/io-funcs-inl.h but found it at /home/ds/DataScience/disrt/pykaldi/tools/kaldi/src/base/io-funcs.h:188:1
    No suitable matches found for read_int_pair_vector (with C++ name: kaldi::ReadIntegerPairVector) on line 18.
        Declaration was found, but not inside the required file.
        Clif expects it in the file base/io-funcs-inl.h but found it at /home/ds/DataScience/disrt/pykaldi/tools/kaldi/src/base/io-funcs.h:193:1
    No suitable matches found for init_kaldi_output_stream (with C++ name: kaldi::InitKaldiOutputStream) on line 21.
        Declaration was found, but not inside the required file.
        Clif expects it in the file base/io-funcs-inl.h but found it at /home/ds/DataScience/disrt/pykaldi/tools/kaldi/src/base/io-funcs.h:232:1
    No suitable matches found for init_kaldi_input_stream (with C++ name: kaldi::InitKaldiInputStream) on line 24.
        Declaration was found, but not inside the required file.
        Clif expects it in the file base/io-funcs-inl.h but found it at /home/ds/DataScience/disrt/pykaldi/tools/kaldi/src/base/io-funcs.h:237:1
    _BackendError: Matcher failed with status 1
    [32/923] Building CXX object kaldi/fstext/CMakeFiles/_float_weight.dir/float-weight-clifwrap.cc.o
    ninja: build stopped: subcommand failed.
    Command '['ninja', '-j', '14']' returned non-zero exit status 1
    

    I tried installing with older versions of the packages, but to no avail. The situation is the same in other Python environments (3.6, 3.7). What could be the issue?

    Best regards, Olga Yakovenko

    opened by DinoTheDinosaur 10
  • Error install CLIF

    I am on macOS 13.1 and trying to run PyKaldi. When I run ./install_clif.sh, I get this error:

    /Library/Developer/CommandLineTools/usr/include/c++/v1/stdio.h:107:15: fatal error: 'stdio.h' file not found
    #include_next <stdio.h>
                  ^~~~~~~~~
    _BackendError: Matcher failed with status 1
    ninja: build stopped: subcommand failed.

    opened by thainq07 0
  • ImportError: libkaldi-base.so: cannot open shared object file: No such file or directory

    I installed PyKaldi by following http://ltdata1.informatik.uni-hamburg.de/pykaldi/ and when I try to import kaldi, I get this error: ImportError: libkaldi-base.so: cannot open shared object file: No such file or directory

    File path.sh:
    export KALDI_ROOT=kaldi
    export LD_LIBRARY_PATH=$KALDI_ROOT/src/lib:$KALDI_ROOT/tools/openfst-1.6.7/lib:$LD_LIBRARY_PATH
    export PATH=$KALDI_ROOT/src/lmbin/:$KALDI_ROOT/../kaldi_lm/:$PWD/utils/:$KALDI_ROOT/src/bin:$KALDI_ROOT/tools/openfst/bin:$KALDI_ROOT/src/fstbin/:$KALDI_ROOT/src/gmmbin/:$KALDI_ROOT/src/featbin/:$KALDI_ROOT/src/lm/:$KALDI_ROOT/src/sgmmbin/:$KALDI_ROOT/src/sgmm2bin/:$KALDI_ROOT/src/fgmmbin/:$KALDI_ROOT/src/latbin/:$KALDI_ROOT/src/nnetbin:$KALDI_ROOT/src/nnet2bin/:$KALDI_ROOT/src/online2bin/:$KALDI_ROOT/src/ivectorbin/:$KALDI_ROOT/src/kwsbin:$KALDI_ROOT/src/nnet3bin:$KALDI_ROOT/src/chainbin:$KALDI_ROOT/tools/sph2pipe_v2.5/:$KALDI_ROOT/src/rnnlmbin:$PWD:$PATH

    opened by aisolus02 0
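When a shared library cannot be opened at import time, a quick first check is whether any directory on LD_LIBRARY_PATH actually contains it. The sketch below is a generic diagnostic rather than part of PyKaldi; `find_shared_library` is a hypothetical helper name. Relative entries are resolved against the current working directory here, so a relative KALDI_ROOT like the one in the path.sh above may only resolve when the script runs from one particular directory.

```python
import os


def find_shared_library(name, search_path):
    """Return the directories of a colon-separated search path that contain `name`.

    Empty entries are skipped; relative entries are resolved against the
    current working directory, as os.path.isfile does.
    """
    hits = []
    for directory in search_path.split(":"):
        if directory and os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Inspect the environment of the current process.
    path = os.environ.get("LD_LIBRARY_PATH", "")
    found = find_shared_library("libkaldi-base.so", path)
    print(found if found else "libkaldi-base.so not found on LD_LIBRARY_PATH")
```

An empty result suggests either that the variable is not exported in the shell that starts Python or that the listed directories do not exist from the current working directory.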
  • Could not find Kaldi.  Please install Kaldi under the tools directory or set KALDI_DIR environment variable.

    After following the instruction steps, I installed everything in sequence, but I still got the same error at the end:

    Please install Kaldi under the tools directory or set KALDI_DIR environment variable.

    This happens even though I checked the KALDI_DIR variable and it holds the correct path to the kaldi folder.

    Any help would be appreciated

    opened by shakeel608 1
  • fstext/_symbol_table.so: undefined symbol

    1. Install PyKaldi:
    pip3 install numpy
    pip3 install pykaldi-0.2.1-cp38-cp38-linux_x86_64.whl
    
    2. Install Kaldi: ./install_kaldi.sh
    3. Decode with "nnet3-recognizer.py", which gives the following error:
    Traceback (most recent call last):
      File "nnet3-recognizer.py", line 5, in <module>
        from kaldi.asr import NnetLatticeFasterRecognizer, LatticeLmRescorer
      File "/home/ybZhang/pykaldi/lib/python3.8/site-packages/kaldi/asr.py", line 14, in <module>
        from . import decoder as _dec
      File "/home/ybZhang/pykaldi/lib/python3.8/site-packages/kaldi/decoder/__init__.py", line 1, in <module>
        from ._grammar_fst import *
    ImportError: /home/ybZhang/pykaldi/lib/python3.8/site-packages/kaldi/decoder/../fstext/../fstext/../fstext/_symbol_table.so: undefined symbol: _ZN3fst19StringToSymbolTableERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
    
    opened by v-yunbin 0
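The undefined symbol in the traceback above contains `__cxx11`, the inline namespace that libstdc++ uses for its newer `std::string` ABI. One possible cause, offered as a guess rather than a diagnosis, is that the prebuilt wheel and the locally built OpenFst library were compiled with different settings of `_GLIBCXX_USE_CXX11_ABI`; a plain OpenFst version mismatch would produce a similar error. The hypothetical helper below only checks whether a mangled symbol depends on the newer ABI.

```python
def uses_cxx11_abi(mangled_symbol):
    """Return True if a mangled C++ symbol references libstdc++'s __cxx11 namespace.

    In Itanium name mangling a namespace component is written as its length
    followed by its name, so std::__cxx11 shows up as 'St7__cxx11'.
    """
    return "7__cxx11" in mangled_symbol


if __name__ == "__main__":
    symbol = (
        "_ZN3fst19StringToSymbolTableERKNSt7__cxx11"
        "12basic_stringIcSt11char_traitsIcESaIcEEE"
    )
    print(uses_cxx11_abi(symbol))
```

Running `nm -D` on the library that is expected to provide the symbol and applying the same check to its exported names would show whether both sides agree on the ABI.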
  • Pykaldi on Conda Forge

    Hello,

    @h-vetinari and I got Kaldi building successfully on conda forge earlier this year, so I think it should be possible to get pykaldi on there as well built against the latest version of Kaldi. I can put together the initial feedstock for it, but just wanted to check for interest on the pykaldi maintainer front and who I should list as maintainers on pykaldi-feedstock.

    opened by mmcauliffe 4
  • Add missing types to alignment docs

    The type signatures of to_phone_alignment and to_word_alignment were wrong in the docs for the alignment module. If the phones and word symbol tables (respectively) are present, these methods return a string instead of an int in the first position of each tuple.

    opened by hcmturner 0
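The typing point made in the last issue can be illustrated without PyKaldi. The sketch below is a stand-in, not the library's implementation: `label_alignment` is a hypothetical function, the symbol table is modeled as a plain dict, and the (label, start, duration) tuple layout is an assumption. It shows why the first tuple element needs a `Union[int, str]` annotation.

```python
from typing import Dict, List, Optional, Tuple, Union

# First element is an integer id, or a symbol string when a table is supplied.
AlignmentEntry = Tuple[Union[int, str], int, int]


def label_alignment(
    entries: List[Tuple[int, int, int]],
    symbols: Optional[Dict[int, str]] = None,
) -> List[AlignmentEntry]:
    """Map ids to symbols when a table is given, otherwise keep the integer ids."""
    if symbols is None:
        return list(entries)
    return [(symbols[idx], start, duration) for idx, start, duration in entries]
```

Calling `label_alignment([(3, 0, 5)])` keeps the integer id, while passing `{3: "AH"}` as the second argument yields the symbol string in the first position.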
Releases(v0.2.2)
ExKaldi-RT: An Online Speech Recognition Extension Toolkit of Kaldi

ExKaldi-RT is an online ASR toolkit for Python. It reads real-time streaming audio and performs online feature extraction, probability computation, and online decoding.

Wang Yu 31 Aug 16, 2021
A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul

Snm Logic 1 Dec 20, 2021
PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing

PhoNLP is a multi-task learning model for joint part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing. Experiments on Vietnamese benchmark datasets show that PhoNLP produces state-of-the-art results, outperforming a single-task learning approach that fine-tunes the pre-trained Vietnamese language model PhoBERT for each task independently.

VinAI Research 109 Dec 2, 2022
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Espresso Espresso is an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning libra

Yiming Wang 919 Jan 3, 2023
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

OpenSpeech provides reference implementations of various ASR modeling papers and three languages recipe to perform tasks on automatic speech recogniti

Soohwan Kim 26 Dec 14, 2022
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

OpenSpeech provides reference implementations of various ASR modeling papers and three languages recipe to perform tasks on automatic speech recogniti

Soohwan Kim 86 Jun 11, 2021
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Contributing to OpenSpeech OpenSpeech provides reference implementations of various ASR modeling papers and three languages recipe to perform ta

Openspeech TEAM 513 Jan 3, 2023
Pytorch-NLU, a Chinese text classification and sequence annotation toolkit, supports multi-class and multi-label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part-of-speech tagging and word segmentation.

Pytorch-NLU is a Chinese text classification and sequence annotation toolkit. It supports multi-class and multi-label classification of long and short Chinese text, as well as sequence annotation tasks such as Chinese named entity recognition, part-of-speech tagging and word segmentation.

null 186 Dec 24, 2022
Speech Recognition for Uyghur using Speech transformer

Speech Recognition for Uyghur using Speech transformer Training: this model using CTC loss and Cross Entropy loss for training. Download pretrained mo

Uyghur 11 Nov 17, 2022
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.

SpeechBrain 5.1k Jan 9, 2023
End-to-End Speech Processing Toolkit

ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.0.1 1.1.0 1.2.0 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 ubuntu18/python3.8/pip ubuntu18

ESPnet 5.9k Jan 3, 2023
IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models. Everything is pure Python and PyTorch based to keep it as simple and beginner-friendly, yet powerful as possible.

Digital Phonetics at the University of Stuttgart 247 Jan 5, 2023
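Text-to-speech toolkit: an easy-to-use Chinese speech synthesis toolbox that includes a speech encoder, a speech synthesizer, a vocoder and a visualization module.

ttskit Text To Speech Toolkit: a speech synthesis toolbox. Installation: pip install -U ttskit. Note: you may need to install an additional dependency, torch (version requirement torch>=1.6.0,<=1.7.1); install the CUDA or CPU build of torch that suits your environment. ttskit's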

KDD 483 Jan 4, 2023
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools

HuggingSound HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools. I have no intention of building a very complex tool here.

Jonatas Grosman 247 Dec 26, 2022
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

⚠️ Checkout develop branch to see what is coming in pyannote.audio 2.0: a much smaller and cleaner codebase Python-first API (the good old pyannote-au

pyannote 2.2k Jan 9, 2023
Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple

Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple

Alexander Veysov 3.2k Dec 31, 2022
PyTorch implementation of Microsoft's text-to-speech system FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Chung-Ming Chien 1k Dec 30, 2022