GSoC'2021 | TensorFlow implementation of Wav2Vec2

Vasudev Gupta

Last update: Nov 28, 2022

Overview

This repository presents an implementation of the Wav2Vec2 model [1] in TensorFlow 2.0 as a part of Google Summer of Code.

For a quick demo, please check out this. Final report of the project can be found here.

Notebooks

The repository comes with shiny Colab Notebooks. Below you can find a list of them. Spin them up and don't forget to have fun!

Notebook	Description
	This notebook gives you a template to fine-tune a pre-trained Wav2Vec2 SavedModel
	This notebook demonstrates conversion of TF Wav2Vec2 model to ONNX and compares the latency of ONNX exported model & TF model on CPU
	This notebook demonstrates Wav2Vec2 evaluation (without any padding) on LibriSpeech data
	This notebook demonstrates Wav2Vec2 SavedModel evaluation (with constant padding up to a length of 246000) on LibriSpeech data
	This notebook shows a small demo of how to use Wav2Vec2 for inference for ASR task

Checkpoints

Below is a summary of checkpoints obtained during the project:

🤗 Hub Checkpoint	TFHub `SavedModel`	Description
`gsoc-wav2vec2`	`wav2vec2`	This checkpoint is TensorFlow's equivalent of pre-trained Wav2Vec2 by Facebook. PyTorch weights are converted into TensorFlow using `convert_torch_to_tf.py`
`gsoc-wav2vec2-960h`	`wav2vec2-960h`	This checkpoint is TensorFlow's equivalent of fine-tuned Wav2Vec2 by Facebook. PyTorch weights are converted into TensorFlow using `convert_torch_to_tf.py`
`finetuned-wav2vec2-960h`	-	This checkpoint is obtained by fine-tuning Wav2Vec2 model on 960h of LibriSpeech dataset during my GSoC tenure. You can reproduce training by running `main.py` on TPU v3-8

To know more about the process of obtaining the first two checkpoints, please check out this section and to know about the process of obtaining the last checkpoint, please check out this section.

Using this Repository

The Wav2Vec2 model from this repository can be installed with pip:

# this will install the wav2vec2 package
pip3 install git+https://github.com/vasudevgupta7/gsoc-wav2vec2@main

You can use the fine-tuned checkpoints (from 🤗 Hub) like this:

from wav2vec2 import Wav2Vec2ForCTC, Wav2Vec2Config

config = Wav2Vec2Config()
model = Wav2Vec2ForCTC(config)
# now use this model like any other TF model

# in case you are interested in an already trained model, use the `.from_pretrained` method
model_id = "finetuned-wav2vec2-960h"
model = Wav2Vec2ForCTC.from_pretrained(model_id)

Additionally, you can use the SavedModel from TFHub like this:

import tensorflow_hub as hub

model_url = "https://tfhub.dev/vasudevgupta7/wav2vec2-960h/1"
model = hub.KerasLayer(model_url)

# use this `model`, just like any other TF SavedModel

Please check out the notebooks referred to in this repository for more information on how to use the Wav2Vec2 model.

Reproducing this project

Setting Up

# install & setup TensorFlow first
pip3 install tensorflow

# install other requirements of this project using the following command:
pip3 install -qr requirements.txt
sudo apt-get install libsndfile1-dev

# switch to code directory for further steps
cd src

For using TPUs, it's important to store model weights and datasets in the GCS bucket so that TPU can access them directly from there. Hence we will create 2 GCS buckets - one for checkpointing and the other for storing LibriSpeech tfrecords.

# these bucket names will be required to run the training script later
export DATA_BUCKET_NAME="gsoc-librispeech-us"
export CKPT_BUCKET_NAME="gsoc-checkpoints-us"

# create GCS buckets
gsutil mb gs://${DATA_BUCKET_NAME}
gsutil mb gs://${CKPT_BUCKET_NAME}

Preparing dataset

Now we will download the LibriSpeech dataset from the official website & convert them into tfrecords using make_tfrecords.py. Finally, we will export all the tfrecords to the GCS bucket.

# possible values are `dev-clean`, `train-clean-100`, `train-clean-360`, `train-other-500`, `test-clean`
# you will have to follow same steps for all the configurations (specified above).
export DATA_SPLIT=dev-clean

wget https://www.openslr.org/resources/12/${DATA_SPLIT}.tar.gz
tar -xf ${DATA_SPLIT}.tar.gz

python3 make_tfrecords.py --data_dir LibriSpeech/${DATA_SPLIT} -d ${DATA_SPLIT} -n 50

# transfer tfrecords to GCS bucket
gsutil cp -r ${DATA_SPLIT} gs://${DATA_BUCKET_NAME}/${DATA_SPLIT}

Now your GCS bucket (DATA_BUCKET_NAME) should look like this:

.
|- ${DATA_SPLIT}
    |- ${DATA_SPLIT}-0.tfrecord
    |- ${DATA_SPLIT}-1.tfrecord
    .
    .

Follow the above steps for all other data splits. You just need to change the DATA_SPLIT environment variable.
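As a quick sanity check on the layout above, the expected object paths can be enumerated in Python. This is only a sketch: `expected_shard_uris` is a hypothetical helper that is not part of this repository, and it assumes that the `-n 50` argument passed to `make_tfrecords.py` is the number of shards and that shards are numbered from 0, as the listing suggests.

```python
from typing import List


def expected_shard_uris(bucket: str, split: str, num_shards: int) -> List[str]:
    """Build the GCS URIs implied by the directory listing above (assumed naming)."""
    return [f"gs://{bucket}/{split}/{split}-{i}.tfrecord" for i in range(num_shards)]


# Bucket name taken from the export commands above; 50 shards is an assumption.
uris = expected_shard_uris("gsoc-librispeech-us", "dev-clean", 50)
print(uris[0])
print(len(uris))
```

Comparing this list against the output of `gsutil ls` is one way to spot a split that was only partially uploaded.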

Model training

Now since everything is installed and GCS buckets are configured, we just need to run one command to initiate training.

Note: The following commands assume that you have already exported the DATA_BUCKET_NAME & CKPT_BUCKET_NAME environment variables.

The following command will fine-tune the wav2vec2 model on single/multiple GPUs or Colab/Kaggle TPUs:

python3 main.py

For training on Cloud TPUs, run the following command:

# export `TPU_NAME` environment variable first
# this flag will ensure that your VM connects to the specified TPUs & TPUs become visible to TensorFlow
TPU_NAME=<tpu-name> python3 main.py

Running Conversion script

Original PyTorch checkpoints (from Facebook) can be converted using the conversion script available in this repository.

# --hf_model_id: HuggingFace Hub ID of the model you want to convert
# --with_lm_head: whether to use `Wav2Vec2ForCTC` or `Wav2Vec2Model` from this repository
python3 convert_torch_to_tf.py \
--hf_model_id facebook/wav2vec2-base \
--with_lm_head

Running tests

# first install `torch` & `transformers`
pip3 install torch transformers

# run this from the root of this repository
pytest -sv tests

Acknowledgement

Sayak Paul, Morgan Roff, Jaeyoun Kim for mentoring me throughout the project.
TensorFlow team & TRC for providing access to TPUs during my GSoC tenure.

References

[1] Baevski, Alexei, et al. “Wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.” ArXiv:2006.11477 [Cs, Eess], Oct. 2020. arXiv.org, http://arxiv.org/abs/2006.11477.

End Notes

Please create an issue if you encounter any problems while using this project. Don't forget to 🌟 this repository if you liked my work.

Comments

Code review
Updated on June 23, 2021

Hi @sayakpaul, @MorganR,

Code is ready for the review. Main code resides in src/.

Here is the overview of src/ directory:

convert_torch_to_tf.py: This contains a script for converting torch pre-trained (or fine-tuned) checkpoints to TensorFlow format. Run it with python3 convert_torch_to_tf.py and the model weights will be saved in TF format.

data_utils.py: This contains the data-loading part for training & inference on LibriSpeech dataset.

wav2vec2/: This module contains all the modelling, configuration, processor (steps for encoding samples), and tokenizer (steps for decoding outputs) related code. It also hosts the CTC-loss function.

This notebook contains a small, simple demo of Wav2Vec2. I am planning to shift to a gradio app very soon :)

This notebook shows how to evaluate Wav2Vec2 on LibriSpeech dataset.

wav2vec2/tensorflow_addons: This is mostly taken from tensorflow_addons and other sources to be able to use latest TF (since tensorflow_addons doesn't work with TF>2.4 currently).

main.py is the script for training the model. For running: python3 main.py

Note:

Both notebooks are end-to-end & don't require any setup from the user's side.

This repository is now available as a pip package. Wav2Vec2 can be installed directly by running pip3 install git+https://github.com/vasudevgupta7/gsoc-wav2vec2@main

Running tests

Tests reside in tests/test_wav2vec2.py. Most of the tests require HuggingFace Transformers & PyTorch. A few tests will be skipped if they are not installed.

# first install `torch` & `transformers`
pip3 install torch transformers

# run this from the root of this repository
pytest -sv tests

Dataset

Original Dataset: http://www.openslr.org/12

TF Records: https://huggingface.co/datasets/vasudevgupta/gsoc-librispeech-tfrecords

TODO

[x] Complete & test model.fit().

[x] Implement CTC loss

[x] tokenise text labels for training.

[x] Implement spec-augmentation for training.

[x] fix distributed training on TPUs

[x] implement nice callbacks & logging

Thanks!!
opened by thevasudevgupta 30
Add evaluation notebook
This PR adds an evaluation notebook on the LibriSpeech dataset. Note: the following discussion is for the converted fine-tuned checkpoint.

@sayakpaul, when I restrict the sequence length (by padding) to a constant value (so that tf.function() can be used), we get a WER of 6%, while without any padding we get a WER of 3%. The two cases differ in performance because the original Wav2Vec2 doesn't accept any attention_mask / padding_mask.
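For readers unfamiliar with the metric, WER for a single reference/hypothesis pair is the word-level edit distance divided by the number of reference words. The function below is a minimal standard-library sketch of that definition; `word_error_rate` is a hypothetical name, and the notebooks may well compute WER differently (for example with a metrics library, or aggregated over the whole corpus).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat", "the dog sat"))
```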

Now, if we want to export to TFHub, we will have to restrict the sequence length to a constant value (as a SavedModel requires the model signature to be constant), and we will no longer get 3%, but I guess we have no other option.
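To make the fixed-length constraint concrete, here is a rough sketch of bringing every waveform to one constant length before it is fed to an exported model. The helper name, right-side zero padding, and truncation of longer inputs are all assumptions for illustration; the value 246000 is the constant length mentioned for the SavedModel evaluation notebook.

```python
from typing import List, Sequence

TARGET_LENGTH = 246000  # constant length mentioned for the SavedModel notebook


def pad_or_truncate(samples: Sequence[float], length: int = TARGET_LENGTH,
                    pad_value: float = 0.0) -> List[float]:
    """Return exactly `length` samples: cut long inputs, right-pad short ones."""
    clipped = list(samples[:length])
    return clipped + [pad_value] * (length - len(clipped))


fixed = pad_or_truncate([0.1, -0.2, 0.3])
print(len(fixed))
```

Because the padded positions are not masked out, the model treats them as real audio, which is consistent with the gap between the two WER numbers reported above.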

Should I keep this notebook specific to TFHub export or specific to getting even better results (i.e 3% WER)?

@MorganR, please give your suggestions on this after you are back from vacations.

Links to notebooks

Notebook with 3% WER: https://colab.research.google.com/github/vasudevgupta7/gsoc-wav2vec2/blob/notebook/notebooks/librispeech_evaluation_WER_3.ipynb

Notebook with 6% WER: https://colab.research.google.com/github/vasudevgupta7/gsoc-wav2vec2/blob/notebook/notebooks/librispeech_evaluation_WER_6.ipynb
opened by thevasudevgupta 13
Discussion
Hey @sayakpaul, @MorganR,

I have a few questions before I can start training the model:

The LibriSpeech dataset is available in .flac format, which can be read using tensorflow_io. But AFAIU Cloud TPUs use a special build of TensorFlow, and tensorflow_io does not work with that version. Is there any workaround for this problem?

There are multiple variants of the LibriSpeech dataset: 100h, 360h, 500h (see this). In compressed form, 100h takes 6.3 GB, 360h takes 23 GB, and 500h takes 30 GB of disk space. The best model in the paper is obtained by training on the combination of all of them (i.e. 960h). Which dataset should I target? Or should I target 960h only (the dataset will be quite large in uncompressed form)?

Thanks!
opened by thevasudevgupta 10
Questions about processor
What does this code do:

def _normalize(self, x):
    """You must call this before padding."""
    # -> (1, seqlen)
    mean = tf.reduce_mean(x, axis=-1, keepdims=True)
    var = tf.math.reduce_variance(x, axis=-1, keepdims=True)
    return tf.squeeze((x - mean) / tf.sqrt(var + 1e-5))
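Read literally, the snippet standardizes each waveform: it subtracts the mean over the last axis and divides by the standard deviation, with 1e-5 added to the variance to avoid division by zero. The following standard-library sketch mirrors that arithmetic for a single sequence; it is an illustration written for this question, not code from the repository.

```python
import math
from statistics import fmean, pvariance
from typing import List, Sequence


def normalize(samples: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Shift to zero mean and scale to (almost) unit variance."""
    mean = fmean(samples)
    var = pvariance(samples, mu=mean)  # population variance, as tf.math.reduce_variance computes
    scale = math.sqrt(var + eps)
    return [(s - mean) / scale for s in samples]


out = normalize([0.5, 1.5, 2.5, 3.5])
print(fmean(out), pvariance(out))
```

The docstring's warning about padding follows from this: if zeros were appended first, they would be included in the mean and variance and change the result.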

My other question: on what basis are numbers assigned to the vocab list? By that I mean this:

I understand the code in the picture: it basically gets all the characters from the text. My question is, when it turns the characters into a dictionary with their indices as values, does it matter which character is at which index? If yes, how does the right character end up at the right index? I was trying to test my version of your tokenizer and had trouble producing the right outputs with your vocab.json, so I took the one here, which worked fine. Also, I was using a fine-tuned model for making predictions, which was associated with this tokenizer via Hugging Face.
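On the index question: the model's output layer emits class ids, so the id-to-character mapping used for decoding has to be the one the checkpoint was trained with. The sketch below shows this with a toy vocabulary and a greedy CTC decode. The vocabulary is made up for illustration (it is not the repository's vocab.json), and treating `<pad>` as the CTC blank and `|` as the word separator is an assumption borrowed from the HuggingFace Wav2Vec2 convention.

```python
from itertools import groupby
from typing import Dict, List

# Toy vocabulary for illustration only; NOT the contents of the repository's vocab.json.
VOCAB: Dict[str, int] = {"<pad>": 0, "|": 1, "a": 2, "c": 3, "t": 4}


def greedy_ctc_decode(ids: List[int], vocab: Dict[str, int], blank: str = "<pad>",
                      word_sep: str = "|") -> str:
    """Collapse repeated ids, drop the blank token, then map ids back to characters."""
    id_to_char = {index: char for char, index in vocab.items()}
    collapsed = [index for index, _ in groupby(ids)]
    chars = [id_to_char[i] for i in collapsed if id_to_char[i] != blank]
    return "".join(chars).replace(word_sep, " ").strip()


predicted_ids = [3, 3, 0, 2, 2, 0, 4, 1]
print(greedy_ctc_decode(predicted_ids, VOCAB))

# Same ids, different index assignment: the decoded text changes.
shuffled = {"<pad>": 0, "|": 1, "t": 2, "a": 3, "c": 4}
print(greedy_ctc_decode(predicted_ids, shuffled))
```

The first call decodes to `cat` and the second to `atc`, which is the kind of mismatch one would expect when a checkpoint is paired with a vocabulary whose ordering differs from the one it was fine-tuned with.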
opened by ahmedlone127 9
Wav2Vec2 ONNX

@sayakpaul, I prepared a notebook showing how the TF Wav2Vec2 model can be exported to ONNX. Do you think we can include this notebook as well? Here is the link to the notebook.

I will add text to this notebook if you approve. The ONNX-exported model reduces latency by more than 50% compared to the tf.function()-wrapped model on CPU (I am comparing time on CPU, as deployment commonly happens on CPUs).
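A latency comparison of this kind can be sketched with a small timing helper. This is a generic illustration, not the notebook's benchmark code: `median_latency` is a hypothetical name, and the warm-up and run counts are arbitrary defaults.

```python
import time
from statistics import median
from typing import Callable


def median_latency(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Median wall-clock seconds per call, measured after a few warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up calls absorb one-time costs such as tracing or lazy initialisation
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return median(timings)


# Stand-in workload; in practice `fn` would wrap a model call on a fixed input batch.
print(median_latency(lambda: sum(range(10_000))))
```

Timing both models on the same input with the same helper, and reporting the median rather than a single run, keeps the comparison less sensitive to outliers.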

opened by thevasudevgupta 8
Port original fine-tuned checkpoint to TFHub

Hello @sayakpaul @MorganR,

1st checkpoint is up here: https://tfhub.dev/vasudevgupta7/wav2vec2/1 🎉 🎉

Now, I can transfer 2nd checkpoint to TFHub. It's the converted checkpoint which was fine-tuned on LibriSpeech dataset by Facebook (TensorFlow equivalent of this). I think, we can make changes to this notebook and link it with our 2nd checkpoint.

Please give your valuable suggestions/comments on that.

opened by thevasudevgupta 6
How to change input signature

hey there!

thanks for making this repository! This may be a huge help for me.

When I download the model from https://tfhub.dev/vasudevgupta7/wav2vec2/1, saved_model_cli says that the input signature for the model is actually (None, 50000) and not (None, 246000)... however, when using tfhub to load the model into a Keras layer (as done in this colab), it is (None, 246000)

i am confused... please help :) thanks a lot!

opened by bytosaur 5
Hi, when I train wav2vec-xlsr, I get the prompt “you should pass `attention_mask` when working with Wav2Vec2 new checkpoints”. What should I do? When do you plan to add attention_mask support for training?

opened by dengcunqin 5
Adding finished training script

This PR adds training script used for getting 5.6% WER (as discussed in #13). There were lots of bugs earlier, but now everything is fixed and working properly. Also, realised that we don't need custom training loop, hence removed that.

@sayakpaul @MorganR kindly review the PR

Thanks!!

opened by thevasudevgupta 5
Improve data loading
Major changes

Time complexity of tfrecords sharding is reduced from O(n^2) to O(n). The skip method is very slow because it makes the dataset iterate over the samples again, so I removed it.
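The idea behind the O(n) version can be illustrated with a single-pass chunking generator. This is a sketch of the general technique, not the code from this PR: selecting each shard with a skip-then-take pattern re-reads the prefix of the dataset every time, whereas the loop below touches each sample exactly once. `iter_shards` is a hypothetical name.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_shards(samples: Iterable[T], shard_size: int) -> Iterator[List[T]]:
    """Yield consecutive shards while reading `samples` exactly once."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    shard: List[T] = []
    for sample in samples:
        shard.append(sample)
        if len(shard) == shard_size:
            yield shard
            shard = []
    if shard:  # final, possibly smaller shard
        yield shard


for index, shard in enumerate(iter_shards(range(7), shard_size=3)):
    print(index, shard)
```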

Added support for loading TIMIT dataset. Introduced CommonDataLoader for keeping common code for LibriSpeech & Timit dataloaders.

@sayakpaul @MorganR
opened by thevasudevgupta 5
Fix #7
Major changes:

Added section for wrapping training with tf.keras.Model

added audio widget

some text is also fixed

Link to notebook: https://colab.research.google.com/github/vasudevgupta7/gsoc-wav2vec2/blob/export-v3/notebooks/wav2vec2_saved_model_finetuning.ipynb

@sayakpaul @MorganR
opened by thevasudevgupta 4

URL issues

When I run the following code:

from wav2vec2 import Wav2Vec2Config, Wav2Vec2Processor
tokenizer = Wav2Vec2Processor(is_tokenizer=True)

I get the following error:

Downloading `vocab.json` from https://github.com/vasudevgupta7/gsoc-wav2vec2/raw/main/data/vocab.json ... Traceback (most recent call last):
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\processor.py", line 43, in _setup_vocab
    subprocess.run(
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\processor.py", line 21, in __init__
    self._setup_vocab()
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\processor.py", line 47, in _setup_vocab
    raise ValueError(f"Couldn't download `vocab.json` from {url}")
ValueError: Couldn't download `vocab.json` from https://github.com/vasudevgupta7/gsoc-wav2vec2/raw/main/data/vocab.json

When I look up the URL, I get redirected to 'https://raw.githubusercontent.com/thevasudevgupta/gsoc-wav2vec2/main/data/vocab.json', which does have the vocab, but maybe the fact that it needs a redirect causes an issue for python?

I get a similar issue when I try to get the finetuned model:

from wav2vec2 import Wav2Vec2ForCTC, Wav2Vec2Config, Wav2Vec2Processor
model_id = "finetuned-wav2vec2-960h"
model = Wav2Vec2ForCTC.from_pretrained(model_id)

raises the following error:

Downloading model weights from https://huggingface.co/finetuned-wav2vec2-960h ... Traceback (most recent call last):
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\modeling.py", line 69, in from_pretrained
    subprocess.run(url.split(), check=True, stderr=subprocess.PIPE)
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\modeling.py", line 71, in from_pretrained
    raise ValueError(
ValueError: Couldn't download model weights from https://huggingface.co/finetuned-wav2vec2-960h

When I try to look up the URL, I get a 404 error. Maybe the URLs should be updated? If it's some issue on my end, please let me know.
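One observation on the first traceback: the `FileNotFoundError` is raised inside `subprocess` while starting a child process, before any HTTP request is made, which suggests that the external program the package shells out to is not available on this Windows machine (which program that is cannot be seen in the traceback). A portable alternative is to download with the Python standard library, which follows HTTP redirects by default. The `download` helper below is a hypothetical sketch, not the package's code, and it would not help with the second URL, which returns a 404.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, destination: str) -> Path:
    """Fetch `url` with the standard library (HTTP redirects are followed by default)."""
    target = Path(destination)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(target, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return target
```

For example, `download("https://github.com/vasudevgupta7/gsoc-wav2vec2/raw/main/data/vocab.json", "vocab.json")` would fetch the vocabulary from the URL shown in the log, assuming the file is still hosted there.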

opened by RomanAYakunin 0

Training related doubt

Hope everyone is doing well!

I am working on fine-tuning the Wav2vec model for Indian Accent speech, and the size of the data is about 1.7 TB.

What would you suggest for this task, and are there any better models to fine-tune?

You have also mentioned loading data lazily; could you please briefly explain how to use it?

Anyone with good knowledge of this, please share your comments. Thank you.

opened by pavankalyan066 0

TPU error: Input 2 to node `CTCLoss/ctc-loss/ctc_state_trans/ScatterNd_1` must be a compile-time constant

Thanks for all the work that has gone into this project. I tried to run the code from the fine-tuning colab notebook on a TPU, but I get the following error when running model.fit:

  (0) INVALID_ARGUMENT: {{function_node __inference_train_function_84983}} Input 2 to node `CTCLoss/ctc-loss/ctc_state_trans/ScatterNd_1` with op ScatterNd must be a compile-time constant.

XLA compilation requires that operator arguments that represent shapes or dimensions be evaluated to concrete values at compile time. This error means that a shape or dimension argument could not be evaluated at compile time, usually because the value of the argument depends on a parameter to the computation, on a variable, or on a stateful operation such as a random number generator.

As per the colab notebook, I am loading a pretrained wav2vec2 from tfhub:

  load_locally = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')  # required for TPU
  pretrained_layer = tfhub.KerasLayer("https://tfhub.dev/vasudevgupta7/wav2vec2/1", trainable=True, load_options=load_locally)

(I added "load_locally" to make tfhub.KerasLayer work on the TPU)

I guess the pretrained model uses the scatter_nd op in a way that is not compatible with the TPU, right? Any idea what I can do about this?

I did see your end-to-end training script, by the way, which I see has TPU support, but I was hoping to load a pretrained model.

opened by alexflint 0

Training Wav2Vec2 model on 100h & experiment-2

@sayakpaul, sorry for delay again. I have started serious experimentation now and will keep you posted with the results. I am starting with experiment-2 for now as mentioned in vasudevgupta7/compressed-wav2vec2#1. I will mention all the results in this issue by tomorrow (TPUs are running now!!)

| Experiment description | WER | Wandb |
|------------------------|------|---------|
| wav2vec2-960h (Facebook version) | 3% | - |
| wav2vec2-960h (trained during gsoc) | 5.6% | - |
| wav2vec2-100h | 7.4% | https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/lwiepmm0 |
| wav2vec2-100h (skipped stage-1) | 8.2% | https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/h0bug1zp |
| wav2vec2-100h (train conv also) | 9.1% | https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/2iro0pl0, https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/284a713r |
| distilled wav2vec2-100h | | https://wandb.ai/7vasudevgupta/wav2vec2-distillation/runs/2h82mhgc |

Evaluation script: https://colab.research.google.com/drive/1aNgochNmchx1R5TcoVH7nM0uPkmxNqE1?usp=sharing

Just wanted to ask one thing: Is it fine if I code in my gsoc repository or I should code in this private repo??

opened by thevasudevgupta 11
Ideas from the wav2vec2 repo
Initial action plans

Copying these things from the wav2vec2 repo for safe housekeeping.

An immediate step could be to convert the fine-tuned model using the TFLite APIs. Post-training quantization, specifically, might be very useful. Quantization-aware training might be even more helpful, but its support on TPUs is limited. I remember you had tried post-training quantization, but the resulting model size was around 400 MB, and I had shared some thoughts around that. It might be a good idea to revisit post-training quantization in that case.

Google Research recently published FRILL which could be relevant for us. Basically, they perform knowledge distillation with a smaller student model with careful design choices along with quantization-aware training.

Meanwhile, if you have any other ideas that you think might be worth trying out please feel free to share them. If we have anything concrete and novel we can even target a publication in that case.

Suggesting another important resource here: Knowledge distillation: A good teacher is patient and consistent. The paper introduces simple recipes to get the best possible student model. But the study is based on image classification models. So, might be a fun exercise to try to think of ways in which this could be extended here.
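As a reference point for the distillation ideas above, the classic soft-target loss (in the style of Hinton et al.) can be written in a few lines. This is a generic sketch for a single vector of logits, with a hypothetical function name and an arbitrary temperature; it is not the recipe used by FRILL or by the papers linked here.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: Sequence[float], student_logits: Sequence[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature**2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge; for a sequence model such as Wav2Vec2 it would typically be averaged over time steps.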

A baseline approach to distil Wav2Vec2: Shrinking Bigfoot: Reducing wav2vec 2.0 footprint

Other useful resources

Model Optimization

Efficient Methods and Hardware for Deep Learning by Song Han

Lecture on Quantization by Pete Warden

For non-trivial model conversions in TFLite you can refer to the following repositories

https://github.com/tulasiram58827/ocr_tflite/

https://github.com/tulasiram58827/TTS_TFLite

https://github.com/sayakpaul/Adventures-in-TensorFlow-Lite
opened by sayakpaul 17
About the README
@vasudevgupta7 the README looks superb now!

I have a few suggestions that might make it even better:

I see that you have included instructions on how to load the model from Hugging Face Hub, which is great. Do you think we could add something similar for TF Hub too? Maybe a note to guide the readers that they can do pretty much the same thing in TensorFlow by doing [...]?

While we are on the topic of ASR, do you think it might be good to mention a few other projects that are doing exceptional work in the area and even compare performances qualitatively? I understand this might be difficult for you to accommodate right away. So, feel free to either have this under future works or you can completely discard the choice (I won't mind).
opened by sayakpaul 6

Owner

Vasudev Gupta

Open Source @huggingface, @tensorflow | Interested in Speech & Text

GitHub https://vasudevgupta7.github.io/gsoc-wav2vec2/assets/final_report
