A repository for fine-tuning Transformers 🤗-based seq2seq speech models in JAX/Flax.

Overview

Seq2Seq Speech in JAX

A JAX/Flax repository for combining a pre-trained speech encoder model (e.g. Wav2Vec2, HuBERT, WavLM) with a pre-trained text decoder model (e.g. GPT2, Bart) to yield a Speech Sequence-to-Sequence (Seq2Seq) model for automatic speech recognition.

The script run_flax_speech_recognition_seq2seq.py can be used to fine-tune a Speech Seq2Seq model on one of the official speech recognition datasets or a custom dataset. It makes use of the pmap JAX operator to provide data parallelism across GPU/TPU devices.
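
Concretely, pmap compiles the train step once and runs it in parallel on every local device, each replica receiving its own shard of the batch along a leading device axis. A minimal sketch of this pattern (the function and shapes are illustrative, not the repository's actual train step):

import jax
import jax.numpy as jnp

# the same compiled step runs on every device, each on its own batch shard
@jax.pmap
def train_step(batch):
    return jnp.mean(batch ** 2)  # stand-in for the real loss computation

num_devices = jax.local_device_count()
sharded_batch = jnp.ones((num_devices, 2, 2000))  # leading axis = devices
print(train_step(sharded_batch))  # one value per device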

The modelling files are based very heavily on those from Hugging Face Transformers 🤗. This is a standalone repository to enable rapid prototyping and community involvement. The final modelling files and training script will be merged into Transformers 🤗 for use with the rest of the open-source library. The final system weights will be made publicly available at huggingface.co 🚀

Figure 1: Speech-encoder text-decoder style Seq2Seq model.

Example Usage

To instantiate a Wav2Vec2-2-Bart model with the FlaxSpeechEncoderDecoderModel framework, run the following Python script inside the cloned repo:

from transformers import AutoFeatureExtractor, AutoTokenizer
from models.modeling_flax_speech_encoder_decoder import FlaxSpeechEncoderDecoderModel
import numpy as np

# checkpoints to leverage
encoder_id = "facebook/wav2vec2-large-lv60"
decoder_id = "facebook/bart-large"

model = FlaxSpeechEncoderDecoderModel.from_encoder_decoder_pretrained(
    encoder_id, decoder_id, encoder_add_adapter=True, decoder_from_pt=True)

# set the special tokens from the decoder's config, disable the decoder
# cache (not needed for training) and record the processor class for saving
model.config.decoder_start_token_id = model.config.decoder.bos_token_id
model.config.pad_token_id = model.config.decoder.pad_token_id
model.config.eos_token_id = model.config.decoder.eos_token_id
model.config.use_cache = False
model.config.processor_class = "Wav2Vec2Processor"

# check if generation works
out = model.generate(np.ones((1, 2000)))

model.save_pretrained("./")

feature_extractor = AutoFeatureExtractor.from_pretrained(encoder_id)
feature_extractor.save_pretrained("./")
tokenizer = AutoTokenizer.from_pretrained(decoder_id)
tokenizer.save_pretrained("./")
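
The saved artefacts can then be reloaded from the output directory as a quick round-trip check (a sketch, assuming the save calls above succeeded and run in the same session):

model = FlaxSpeechEncoderDecoderModel.from_pretrained("./")
feature_extractor = AutoFeatureExtractor.from_pretrained("./")
tokenizer = AutoTokenizer.from_pretrained("./")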

To train the model on LibriSpeech ASR in default precision, run the bash script provided below:

#!/usr/bin/env bash
python run_flax_speech_recognition_seq2seq.py \
        --dataset_name="librispeech_asr" \
        --model_name_or_path="./" \
        --dataset_config_name="clean" \
        --train_split_name="train.100" \
        --eval_split_name="validation" \
        --output_dir="./" \
        --preprocessing_num_workers="16" \
        --length_column_name="input_length" \
        --overwrite_output_dir \
        --num_train_epochs="5" \
        --per_device_train_batch_size="2" \
        --per_device_eval_batch_size="2" \
        --gradient_accumulation_steps="1" \
        --logging_steps="25" \
        --max_duration_in_seconds="15" \
        --max_target_length="128" \
        --generation_max_length="40" \
        --generation_num_beams="1" \
        --learning_rate="1e-4" \
        --warmup_steps="500" \
        --text_column_name="text" \
        --save_total_limit="1" \
        --freeze_feature_encoder \
        --predict_with_generate \
        --do_lower_case \
        --do_eval \
        --do_train
Comments
  • Random `TypeError` with NumPy array with all values -100

    I'm hitting this error message now and then. It does not seem to affect training, but I only see it when training on TPU; the same dataset was used on GPU with no errors. Just posting here in case there is something else going on that I'm missing.

    Step... (75000/759160 | Eval Loss: 0.11205478757619858 | Eval wer: 0.09877239458498131 | Eval cer: 0.02933955305671511 |):  10%|████▏ | 4/40 [23:39:27<205:34:14, 20557.06s/it]
    /data/flax/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:719: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
      tensor = as_tensor(value)
    --- Logging error ---
    Traceback (most recent call last):
      File "run_flax_speech_recognition_ctc.py", line 1631, in <module>
        main()
      File "run_flax_speech_recognition_ctc.py", line 1544, in main
        state, train_metric = p_train_step(state, batch)
      File "/data/flax/lib/python3.8/site-packages/jax/_src/traceback_util.py", line 162, in reraise_with_filtered_traceback
        return fun(*args, **kwargs)
      File "/data/flax/lib/python3.8/site-packages/jax/_src/api.py", line 2158, in cache_miss
        out_tree, out_flat = f_pmapped_(*args, **kwargs)
      File "/data/flax/lib/python3.8/site-packages/jax/_src/api.py", line 2031, in pmap_f
        p = _prepare_pmap(
      File "/data/flax/lib/python3.8/site-packages/jax/_src/api.py", line 1969, in _prepare_pmap
        _check_arg(arg)
      File "/data/flax/lib/python3.8/site-packages/jax/_src/api.py", line 2994, in _check_arg
        raise TypeError(f"Argument '{arg}' of type {type(arg)} is not a valid JAX type.")
    jax._src.traceback_util.UnfilteredStackTrace: TypeError: Argument '[[-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]]' of type <class 'numpy.ndarray'> is not a valid JAX type.

    The stack trace below excludes JAX-internal frames.
    The preceding is the original exception that occurred, unmodified.

    --------------------

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "run_flax_speech_recognition_ctc.py", line 1544, in main
        state, train_metric = p_train_step(state, batch)
    TypeError: Argument '[[-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]
     [-100 -100]]' of type <class 'numpy.ndarray'> is not a valid JAX type.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/lib/python3.8/logging/__init__.py", line 1085, in emit
        msg = self.format(record)
      File "/usr/lib/python3.8/logging/__init__.py", line 929, in format
        return fmt.format(record)
      File "/usr/lib/python3.8/logging/__init__.py", line 668, in format
        record.message = record.getMessage()
      File "/usr/lib/python3.8/logging/__init__.py", line 373, in getMessage
        msg = msg % self.args
    TypeError: not all arguments converted during string formatting
    Call stack:
      File "run_flax_speech_recognition_ctc.py", line 1631, in <module>
        main()
      File "run_flax_speech_recognition_ctc.py", line 1546, in main
        logger.warning("Encountered following error: \n", e)
    Message: 'Encountered following error: \n'
    Arguments: (TypeError("Argument '[[-100 -100]\n [-100 -100]\n [-100 -100]\n [-100 -100]\n [-100 -100]\n [-100 -100]\n [-100 -100]\n [-100 -100]]' of type <class 'numpy.ndarray'> is not a valid JAX type."),)
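
    Two distinct problems are visible in this log. First, pmap rejects the labels batch: given the VisibleDeprecationWarning above, the ragged label sequences were most likely collated into an object-dtype ndarray, which is not a valid JAX type. Second, the warning call that tries to report the error is itself buggy, which is what produces the '--- Logging error ---' block: the exception is passed as a %-formatting argument for a message with no placeholder. A minimal sketch of the logging fix (the broken call is quoted from the traceback above):

    # broken: no %s placeholder, so logging raises "not all arguments converted"
    logger.warning("Encountered following error: \n", e)

    # fixed: lazy %s formatting of the exception
    logger.warning("Encountered following error: \n%s", e)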
    
    opened by versae 3
  • Negative Losses in CTC Training

    Training the baseline CTC model on the Common Voice 9 (CV9) dataset, we observe that the training loss drops below zero after ~1.5k train steps: https://wandb.ai/sanchit-gandhi/commonvoice_9_0/runs/y593pwm4?workspace=user-sanchit-gandhi. The CTC loss is a negative log-likelihood and should therefore be nonnegative.

    • CV9 tokenizer: working as expected. Tested within the training script (both tokenising and decoding), and checked that all attributes are set correctly. Furthermore, the target strings in the wandb prediction logs are identical to the transcribed text in the training data -> the tokenizer is correctly tokenising and decoding.
    • Logits test: for the randomly initialised (unscanned) model, the PT-Flax equivalence test passes on CV9 for both the logits and losses. Loss is nonnegative. https://github.com/sanchit-gandhi/seq2seq-speech/blob/main/tests/check_flax_ctc_cv9.py
    • Trained Flax model: using the 50k train steps checkpoint, the loss is negative.
    • We expect all log probabilities in the CTC loss function to be nonpositive: the probabilities should lie in the range 0 to 1, and so have a maximum value of log(1) = 0.
    • The CTC loss function can be divided into three stages:
    1. Initialisation of log prob arrays
    2. Looping over the CTC Markov chain process
    3. Extraction of per-sequence loss and CTC reduction ("mean" reduction)
    • Our attention is focused on the CTC Markov chain process: https://github.com/sanchit-gandhi/seq2seq-speech/blob/74c4e3650ce1865a61dbf5e99c199539d6815bfc/run_flax_speech_recognition_ctc.py#L638-L659
    • When we remove the lax backend in the CTC loss function and add some print statements, we see that positive values creep into the log probabilities by the 4th iteration of the CTC algorithm's loop. They first occur in the 'phi-to-emit transition' with the 'next_emit' log probabilities, and then cascade to all further log probs. https://github.com/sanchit-gandhi/seq2seq-speech/blob/main/tests/check_negative_loss_ctc.ipynb
    • Changing the value of the log_epsilon hyperparameter to a more negative value does not alter this behaviour.
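
    One way to localise the bug is to check the loss against an independent reference implementation on the same inputs. A minimal sketch using optax's CTC loss (optax is a swapped-in reference here; the shapes, labels and blank id are illustrative, not the script's actual values):

    import jax
    import jax.numpy as jnp
    import optax

    batch, frames, vocab, label_len = 2, 50, 32, 10
    logits = jax.random.normal(jax.random.PRNGKey(0), (batch, frames, vocab))
    labels = jnp.ones((batch, label_len), dtype=jnp.int32)  # dummy non-blank ids

    # per-sequence negative log-likelihoods: CTC loss must be >= 0
    losses = optax.ctc_loss(
        logits,
        jnp.zeros((batch, frames)),     # logit paddings (0 = valid frame)
        labels,
        jnp.zeros((batch, label_len)),  # label paddings (0 = valid label)
        blank_id=0,
    )
    assert (losses >= 0).all(), "reference CTC loss should be nonnegative"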
    opened by sanchit-gandhi 2
  • Eval samples filtering

    It seems that in the current implementation all of the data are filtered by the input and label lengths, including the eval samples, as can be seen in the code.

    Half of the eval samples for TEDLIUMv3 are not included in the evaluation as a result.

    opened by mutiann 1
  • Add RNN-T Model & Training Script

    Adds the RNN-T BPE model from NVIDIA NeMo and a training script to train the model with the HF Dataset & Trainer. Template training scripts detail the configuration to train a large (~144M parameter) ContextNet ASR model with Transducer loss and sub-word encoding.

    opened by sanchit-gandhi 1
  • CTC tokenizer returns <unk> tokens with `do_lower_case=True`

    This issue explores the behaviour of the example CV9 tokenizer created by the get_ctc_tokenizer.py file and saved at https://huggingface.co/patrickvonplaten/wav2vec2_ctc_cv9_tokenizer.

    With the do_lower_case argument set to True, we observe that the tokenizer returns the <unk> token for every alphabetic input character:

    from transformers import AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("patrickvonplaten/wav2vec2_ctc_cv9_tokenizer", do_lower_case=True)
    
    input_str = "The cat sat on the mat."
    input_ids = tokenizer(input_str).input_ids
    decoded_str = tokenizer.decode(input_ids)
    
    print("Input str: ", input_str)
    print("Input ids: ", input_ids)
    print("Decoded str: ", decoded_str)
    print("<unk> token id: ", tokenizer.unk_token_id)
    print("Type: ", type(tokenizer))
    

    Output:

    Input str:  The cat sat on the mat.
    Input ids:  [3, 3, 3, 25, 3, 3, 3, 25, 3, 3, 3, 25, 3, 3, 25, 3, 3, 3, 25, 3, 3, 3, 5]
    Decoded str:  <unk> <unk> <unk> <unk> <unk> <unk>.
    <unk> token id:  3
    Type:  <class 'transformers.models.wav2vec2.tokenization_wav2vec2.Wav2Vec2CTCTokenizer'>
    

    Every input character in the alphabet is incorrectly mapped to the unknown token. However, what's interesting here is that we see the punctuation (word-spaces and full-stop) correctly tokenized and decoded. These are the two inputs that are unaffected by the do_lower_case operation.

    If we now re-run this operation with the do_lower_case argument set to False, we observe that the decoded word string is correct (barring the first capital T):

    from transformers import AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("patrickvonplaten/wav2vec2_ctc_cv9_tokenizer", do_lower_case=False)
    
    input_str = "The cat sat on the mat."
    input_ids = tokenizer(input_str).input_ids
    decoded_str = tokenizer.decode(input_ids)
    
    print("Input str: ", input_str)
    print("Input ids: ", input_ids)
    print("Decoded str: ", decoded_str)
    

    Output:

    Input str:  The cat sat on the mat.
    Input ids:  [3, 9, 7, 25, 17, 35, 18, 25, 8, 35, 18, 25, 16, 6, 25, 18, 9, 7, 25, 37, 35, 18, 5]
    Decoded str:  <unk>he cat sat on the mat.
    

    Thus, the issue must lie within the do_lower_case attribute of the Wav2Vec2CTCTokenizer. Indeed, if we inspect the code for this tokenizer, we observe that setting do_lower_case to True first runs the upper method on the input string, converting it to upper-case: https://github.com/huggingface/transformers/blob/6d80c92c77593dc674052b5a46431902e6adfe88/src/transformers/models/wav2vec2/tokenization_wav2vec2.py#L238 Since the tokenizer was built on a lower-case vocabulary, each of these upper-case characters is OOV, and so is assigned to the <unk> token.

    What we should do is set the do_lower_case argument to False, so that the upper-case operation is not performed on the input string, and simply lower-case the input string before tokenising:

    from transformers import AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("patrickvonplaten/wav2vec2_ctc_cv9_tokenizer", do_lower_case=False)
    
    input_str = "The cat sat on the mat.".lower()
    input_ids = tokenizer(input_str).input_ids
    decoded_str = tokenizer.decode(input_ids)
    
    print("Input str: ", input_str)
    print("Input ids: ", input_ids)
    print("Decoded str: ", decoded_str)
    
    Input str:  the cat sat on the mat.
    Input ids:  [18, 9, 7, 25, 17, 35, 18, 25, 8, 35, 18, 25, 16, 6, 25, 18, 9, 7, 25, 37, 35, 18, 5]
    Decoded str:  the cat sat on the mat.
    

    This simply requires modifying the lines that instantiate the pretrained tokenizer in the CTC training script: https://github.com/sanchit-gandhi/seq2seq-speech/blob/de22a44b4e902a7ccb185f7f1f44053bc3ad297f/run_flax_speech_recognition_ctc.py#L809-L816

    opened by sanchit-gandhi 1
  • [ASR Examples] Only filter training samples by audio length criterion

    We should only filter training samples by our audio and transcription length criteria when fine-tuning ASR systems. The eval and test sets should not be filtered: doing so leads to partial datasets, and thus invalid results (a sketch follows below).
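
    A minimal sketch of the intended behaviour with 🤗 Datasets (the column name, bounds and variable names are illustrative, not the script's actual values):

    min_input_length = 16000 * 0.0   # lower bound in samples (0 s at 16 kHz)
    max_input_length = 16000 * 20.0  # upper bound in samples (20 s at 16 kHz)

    def is_audio_in_length_range(length):
        # keep only samples whose audio length lies inside the training bounds
        return min_input_length < length < max_input_length

    # filter the train split only; the eval and test splits stay complete
    vectorized_datasets["train"] = vectorized_datasets["train"].filter(
        is_audio_in_length_range,
        input_columns=["input_length"],
    )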

    opened by sanchit-gandhi 0
  • [prepare_dataset] Improve error correction vs normalisation

    Perform error correction of datasets on a dataset-by-dataset basis:

    LibriSpeech ASR:

    • No error correction necessary

    VoxPopuli:

    • No error correction necessary

    Common Voice 9:

    • Remove trailing quotation marks as they do not affect the transcription
    • Replace double quotation marks with single
    • Normalise quotation marks, apostrophes, hyphens
    • Normalise the text to always finish with punctuation

    TED-LIUM:

    • Remove `<unk>` tokens
    • Replace spaced apostrophes by un-spaced for the most frequent instances (it 's -> it's; see the sketch after the table below)*

    GigaSpeech:

    • Remove junk tokens (disfluencies)
    • Convert spelled-out punctuation to symbolic (`<COMMA>` -> ,)
    • Normalise the text to always finish with punctuation

    SwitchBoard:

    • Remove junk tokens (disfluencies)
    • Remove parenthesised text detailing annotations (the cat sat on the (2) mat -> the cat sat on the mat)
    • Replace anomalous words with their correct transcriptions: the cat/ca sat on the/th mat -> the cat sat on the mat

    Earnings22:

    • Remove junk tokens (disfluencies)
    • Replace malformed ellipses with spaced full-stops (… -> . . . )

    JIWER compliance (for WER/CER metrics):

    • Remove multiple spaces
    • Strip trailing spaces

    *TED-LIUM contraction statistics (total contractions in dataset: 120399)

    -----------------------------------------------------
         Cont. |       Total occ |       % of total occ |
    -----------------------------------------------------
            's |           56216 |               46.691 |
            't |           22628 |               18.794 |
           're |           16658 |               13.836 |
           've |            9464 |                7.861 |
            'm |            8360 |                6.944 |
           'll |            3636 |                 3.02 |
            'd |            3259 |                2.707 |
        'clock |              75 |                0.062 |
          'all |              15 |                0.012 |
           'am |               7 |                0.006 |
       'ivoire |               4 |                0.003 |
        'grady |               4 |                0.003 |
        'neill |               4 |                0.003 |
        'brien |               3 |                0.002 |
         'oeil |               3 |                0.002 |
            'n |               3 |                0.002 |
        'alene |               3 |                0.002 |
        'arche |               2 |                0.002 |
            'a |               2 |                0.002 |
          'ivo |               2 |                0.002 |
          'ren |               2 |                0.002 |
         'ites |               2 |                0.002 |
       'connor |               2 |                0.002 |
           'an |               2 |                0.002 |
           'or |               2 |                0.002 |
           'ts |               2 |                0.002 |
         'dour |               2 |                0.002 |
          'tat |               2 |                0.002 |
         'hara |               2 |                0.002 |
       'aquila |               2 |                0.002 |
        'oreal |               1 |                0.001 |
         'oral |               1 |                0.001 |
       'agnese |               1 |                0.001 |
          'tre |               1 |                0.001 |
          'arc |               1 |                0.001 |
         'neal |               1 |                0.001 |
         'shea |               1 |                0.001 |
           'in |               1 |                0.001 |
          'lin |               1 |                0.001 |
      'gieblyn |               1 |                0.001 |
           'ed |               1 |                0.001 |
           'on |               1 |                0.001 |
            'i |               1 |                0.001 |
     'ospedale |               1 |                0.001 |
      'addaura |               1 |                0.001 |
       'reilly |               1 |                0.001 |
      'donohue |               1 |                0.001 |
     'donoghue |               1 |                0.001 |
          'ile |               1 |                0.001 |
      'foscari |               1 |                0.001 |
    'herpiniere |               1 |                0.001 |
          'est |               1 |                0.001 |
       'conner |               1 |                0.001 |
      'oeuvres |               1 |                0.001 |
     'academie |               1 |                0.001 |
        'brian |               1 |                0.001 |
      'connell |               1 |                0.001 |
        'byrne |               1 |                0.001 |
       'angelo |               1 |                0.001 |
          'ant |               1 |                0.001 |
          'mun |               1 |                0.001 |
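
    A minimal sketch of the spaced-apostrophe fix referenced above (the helper name is hypothetical, and the list covers only the most frequent contractions from the table):

    import re

    # collapse spaced contractions, e.g. "it 's" -> "it's", "we 're" -> "we're"
    CONTRACTIONS = ["'s", "'t", "'re", "'ve", "'m", "'ll", "'d"]

    def collapse_spaced_apostrophes(text: str) -> str:
        for contraction in CONTRACTIONS:
            text = re.sub(rf" {contraction}\b", contraction, text)
        return text

    print(collapse_spaced_apostrophes("it 's a test and we 're sure"))
    # -> "it's a test and we're sure"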
    
    opened by sanchit-gandhi 0
  • Recovering from crashed run

    Hi, thanks for this collection of scripts!

    I've been trying to run your run_flax_speech_recognition_ctc.py on a single TPUv3-8, but after a few epochs I always tend to run out of memory (not sure if it's caused by a memory leak or something else). I tried to recover from the last checkpoint by skipping the number of steps the model was last saved at, and setting the learning rate appropriately. I also tried modifying MixedPrecisionTrainState.create() so that it starts at the last saved checkpoint step. Nothing worked: as soon as training restarts, it runs out of memory again. Any idea what could be happening?

    Thanks!
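
    For the checkpoint-recovery part, the generic Flax pattern is to restore the saved train state and resume the step counter from it rather than recreating the state at step 0. A minimal sketch (the directory and variable names are assumptions, and this uses the standard flax.training.checkpoints API rather than anything specific to this script):

    from flax.training import checkpoints

    # `state` is the freshly created train state, built exactly as in the script;
    # restoring overwrites its parameters, optimiser state and step counter
    state = checkpoints.restore_checkpoint(ckpt_dir="./checkpoints", target=state)
    start_step = int(state.step)  # skip the data iterator ahead to this step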

    opened by versae 3