1138 Python Speech-processing Libraries

BERT, LDA, and TFIDF based keyword extraction in Python

BERT, LDA, and TFIDF based keyword extraction in Python kwx is a toolkit for multilingual keyword extraction based on Google's BERT and Latent Dirichl

41 Dec 27, 2022

BERT-based Financial Question Answering System

BERT-based Financial Question Answering System In this example, we use Jina, PyTorch, and Hugging Face transformers to build a production-ready BERT-b

61 Sep 18, 2022

Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.

Illumination_Decomposition Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources. This code implements the

7 Nov 15, 2020

User-friendly Voice Cloning Application

Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transfering speaker-specific audio featur

19 Dec 30, 2022

Label data using HuggingFace's transformers and automatically get a prediction service

Label Studio for Hugging Face's Transformers Website • Docs • Twitter • Join Slack Community Transfer learning for NLP models by annotating your textu

135 Dec 29, 2022

Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python

Synchrosqueezing is a powerful reassignment method that focuses time-frequency representations, and allows extraction of instantaneous amplitudes and frequencies

382 Jan 6, 2023

UniSpeech - Large Scale Self-Supervised Learning for Speech

UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202

281 Dec 15, 2022

Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition

SEW (Squeezed and Efficient Wav2vec) The repo contains the code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speec

67 Dec 1, 2022

Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow.

Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow. Documentation Proper documentation is available at

151 Jan 5, 2023

Test finetuning of XLSR (multilingual wav2vec 2.0) for other speech classification tasks

wav2vec_finetune Test finetuning of XLSR (multilingual wav2vec 2.0) for other speech classification tasks Initial test: gender recognition on this dat

8 Aug 11, 2022

Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.

PART 2: CHAIN LINKING AUDIO-TO-TEXT NLP TASKS 2A: TRANSCRIBE-TRANSLATE-SENTIMENT-ANALYSIS In notebook3.0, I demo a simple workflow to: transcribe a lo

30 Jul 13, 2022

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

77.1k Dec 31, 2022

Large-scale pretraining for dialogue

A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-

1.8k Jan 7, 2023

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Introduction Funnel-Transformer is a new self-attention model that gradually compresses the sequence of hidden states to a shorter one and hence reduc

197 Dec 11, 2022

Awesome Treasure of Transformers Models Collection

💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️

577 Jan 7, 2023

PyContinual (An Easy and Extendible Framework for Continual Learning)

PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read

176 Jan 5, 2023

The King is Naked: on the Notion of Robustness for Natural Language Processing

the-king-is-naked: on the notion of robustness for natural language processing AAAI2022 DISCLAIMER:This repo will be updated soon with instructions on

1 Nov 24, 2022

A Python library for electronic structure pre/post-processing

PyProcar PyProcar is a robust, open-source Python library used for pre- and post-processing of the electronic structure data coming from DFT calculati

124 Dec 7, 2022

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener

28 May 25, 2021

Pytorch implementation NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

NORESQA: Speech Quality Assessment using Non-Matching References This is a Pytorch implementation for using NORESQA. It contains minimal code to predi

11 Dec 14, 2021

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

1.4k Jan 1, 2023

SimpleITK is an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation and registration

672 Jan 5, 2023

CYGNUS, the Cynical AI, combines snarky responses with uncanny aggression.

New & (hopefully) Improved CYGNUS with several API updates, user updates, and online/offline operations added!!!

0 Mar 28, 2022

A small Python module for BMP image processing.

micropython-microbmp A small Python module for BMP image processing. It supports BMP image of 1/2/4/8/24-bit colour depth. Loading supports compressio

4 Nov 2, 2022

A voice assistant which can be used to interact with your computer and controls your pc operations

Introduction 👨‍💻 It is a voice assistant which can be used to interact with your computer and also you have been seeing it in Iron man movies, but t

84 Dec 22, 2022

p5 is a Python package based on the core ideas of Processing.

p5 p5 is a Python library that provides high level drawing functionality to help you quickly create simulations and interactive art using Python. It c

645 Jan 4, 2023

gACSON software for visualization, processing and analysis of three-dimensional electron microscopy images

gACSON gACSON software is to visualize, segment, and analyze the morphology of neurons in three-dimensional electron microscopy images. If you use any

2 May 31, 2022

Get 2D point positions (e.g., facial landmarks) projected on 3D mesh

points2d_projection_mesh Input 2D points (e.g. facial landmarks) on an image Camera parameters (extrinsic and intrinsic) of the image Aligned 3D mesh

5 Dec 8, 2022

A Python Based Utility for Processing GST-Return JSON Files to Multiple Formats

GSTR 1/2A Utility by Shan.tk Open Source GSTR 1/GSTR 2A JSON to Excel utility based on Python. Useful for Auditors in Verifying GSTR 1 Return Invoices

1 Oct 8, 2022

A Python Library to Make Quote Images

Quote2Image A Python Library to Make Quote Images How To Use? Download The Latest Package From Releases Extract The Zip File And Place Every File In I

28 Dec 30, 2022

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!

26 Dec 13, 2022

📜Generate poetry with gcc diagnostics

gado (gcc awesome diagnostics orchestrator) is a wrapper of gcc that outputs its errors and warnings in a more poetic format.

19 Jun 25, 2022

Python supercharged for the fastai library

Welcome to fastcore Python goodies to make your coding faster, easier, and more maintainable Python is a powerful, dynamic language. Rather than bake

810 Jan 6, 2023

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

ReTiSAR Implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python [1][2]. Contents: | Requirements | Setup | Q

Division of Applied Acoustics at Chalmers University of Technology

51 Dec 17, 2022

A setup script to generate ITK Python Wheels

ITK Python Package This project provides a setup.py script to build ITK Python binary packages and infrastructure to build ITK external module Python

59 Dec 14, 2022

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

bbc-speech-segmenter: Voice Activity Detection & Speaker Diarization A complete speech segmentation system using Kaldi and x-vectors for voice activit

16 Oct 27, 2022

Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

The repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language" submitted to HA

3 Aug 4, 2022

Tools for computational pathology

A toolkit for computational pathology and machine learning. View documentation Please cite our paper Installation There are several ways to install Pa

254 Dec 12, 2022

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

Trapper (Transformers wRAPPER) Trapper is an NLP library that aims to make it easier to train transformer based models on downstream tasks. It wraps h

42 Sep 21, 2022

The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

134 Dec 19, 2022

This repository implements a brute-force spellchecker utilizing the Damerau-Levenshtein edit distance.

About spellchecker.py Implementing a highly-accurate, brute-force, and dynamically programmed spellchecking program that utilizes the Damerau-Levensht

1 Dec 11, 2021

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

922 Dec 10, 2021

An end to end ASR Transformer model training repo

END TO END ASR TRANSFORMER 本项目基于transformer 6*encoder+6*decoder的基本结构构造的端到端的语音识别系统 Model Instructions 1.数据准备: 自行下载数据，遵循文件结构如下： ├── data │ ├── train │

10 Jul 19, 2022

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks

CALVIN CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks Oier Mees, Lukas Hermann, Erick Rosete,

107 Dec 26, 2022

Hashformers is a framework for hashtag segmentation with transformers.

Hashtag segmentation is the task of automatically inserting the missing spaces between the words in a hashtag. Hashformers applies Transformer models

41 Nov 9, 2022

Meandering In Networks of Entities to Reach Verisimilar Answers

MINERVA Meandering In Networks of Entities to Reach Verisimilar Answers Code and models for the paper Go for a Walk and Arrive at the Answer - Reasoni

271 Dec 13, 2022

[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

COCO-LM This repository contains the scripts for fine-tuning COCO-LM pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: COCO-LM: Correcting an

106 Dec 12, 2022

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri

390 Dec 29, 2022

The easiest tool for extracting radiomics features and training ML models on them.

Simple pipeline for experimenting with radiomics features Installation git clone https://github.com/piotrekwoznicki/ClassyRadiomics.git cd classrad pi

17 Aug 4, 2022

A time series processing library

Timeseria Timeseria is a time series processing library which aims at making it easy to handle time series data and to build statistical and machine l

11 Aug 8, 2022

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Intro Real-time object detection and classification. Paper: version 1, version 2. Read more about YOLO (in darknet) and download weight files here. In

6.1k Dec 30, 2022

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks - Official Project Page This repository contains the code develope

1.7k Dec 18, 2022

A python script for extracting/removing exif data from images by @AbirHasan2005

Image-Exif A Python script for extracting exif metadata from images. How to use? Using this script you can extract exif data from image and save in .c

13 Dec 16, 2022

A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

MIDI Language Introduction Reference Paper: Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions: code This

3 May 25, 2022

Steerable discovery of neural audio effects

Steerable discovery of neural audio effects Christian J. Steinmetz and Joshua D. Reiss Abstract Applications of deep learning for audio effects often

182 Dec 29, 2022

This github repo is for Neurips 2021 paper, NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

NORESQA: Speech Quality Assessment using Non-Matching References This is a Pytorch implementation for using NORESQA. It contains minimal code to predi

36 Dec 8, 2022

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

76 Dec 30, 2022

Object Detection with YOLOv3

Object Detection with YOLOv3 Bu projede YOLOv3-608 modeli kullanılmıştır. Requirements Python 3.8 OpenCV Numpy Documentation Yolo ile ilgili detaylı b

0 Mar 27, 2022

HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.

HTSeq DEVS: https://github.com/htseq/htseq DOCS: https://htseq.readthedocs.io A Python library to facilitate programmatic analysis of data from high-t

57 Dec 20, 2022

Stenography encryption script

ImageCrypt Project description Installation Usage Project description Project AlexGyver on Python by TheK4n Design by Пашушка Byte packing in decimal

5 Dec 14, 2022

A package, and script, to perform imaging transcriptomics on a neuroimaging scan.

Imaging Transcriptomics Imaging transcriptomics is a methodology that allows to identify patterns of correlation between gene expression and some prop

10 Dec 27, 2022

The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete.

Feature-Engineering The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete. Business

4 Oct 17, 2021

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter

47 Dec 25, 2022

SingleVC performs any-to-one VC, which is an important component of MediumVC project.

SingleVC performs any-to-one VC, which is an important component of MediumVC project. Here is the official implementation of the paper, MediumVC.

26 Dec 28, 2022

A Chinese to English Neural Model Translation Project

ZH-EN NMT Chinese to English Neural Machine Translation This project is inspired by Stanford's CS224N NMT Project Dataset used in this project: News C

29 Nov 26, 2022

Implementation of Google Brain's WaveGrad high-fidelity vocoder

WaveGrad Implementation (PyTorch) of Google Brain's high-fidelity WaveGrad vocoder (paper). First implementation on GitHub with high-quality generatio

363 Dec 27, 2022

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

DiffWave DiffWave is a fast, high-quality neural vocoder and waveform synthesizer. It starts with Gaussian noise and converts it into speech via itera

498 Jan 3, 2023

Neural HMMs are all you need (for high-quality attention-free TTS)

Neural HMMs are all you need (for high-quality attention-free TTS) Shivam Mehta, Éva Székely, Jonas Beskow, and Gustav Eje Henter This is the official

0 Oct 28, 2022

End-2-end speech synthesis with recurrent neural networks

Introduction New: Interactive demo using Google Colaboratory can be found here TTS-Cube is an end-2-end speech synthesis system that provides a full p

214 Dec 7, 2022

Text to speech for Vietnamese, ez to use, ez to update

Chào mọi người, đây là dự án mở nhằm giúp việc đọc được trở nên dễ dàng hơn. Rất cảm ơn đội ngũ Zalo đã cung cấp hạ tầng để mình có thể tạo ra app này

32 Jul 29, 2022

Natural Language Processing Tasks and Examples.

Natural Language Processing Tasks and Examples With the advancement of A.I. technology in recent years, natural language processing technology has bee

53 Dec 20, 2022

Pipeline for training LSA models using Scikit-Learn.

Latent Semantic Analysis Pipeline for training LSA models using Scikit-Learn. Usage Instead of writing custom code for latent semantic analysis, you j

23 Sep 5, 2022

An 16kHz implementation of HiFi-GAN for soft-vc.

HiFi-GAN An 16kHz implementation of HiFi-GAN for soft-vc. Relevant links: Official HiFi-GAN repo HiFi-GAN paper Soft-VC repo Soft-VC paper Example Usa

42 Dec 27, 2022

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

English|简体中文 ERNIE是百度开创性提出的基于知识增强的持续学习语义理解框架，该框架将大数据预训练与多源丰富知识相结合，通过持续学习技术，不断吸收海量文本数据中词汇、结构、语义等方面的知识，实现模型效果不断进化。ERNIE在累积 40 余个典型 NLP 任务取得 SOTA 效果，并在 G

5.4k Jan 3, 2023

Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP 2020)

This repository contains the resources in our paper "Revisiting Pre-trained Models for Chinese Natural Language Processing", which will be published i

463 Dec 30, 2022

Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Wav2Vec2 STT Python Beta Software Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 mode

22 Dec 29, 2022

Makes google's political ad database actually useful

Making Google's political ad transparency library suck less This is a series of scripts that takes Google's political ad transparency data and makes t

7 Apr 28, 2022

Open source annotation tool for machine learning practitioners.

doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ

7.1k Jan 1, 2023

Code for Editing Factual Knowledge in Language Models

KnowledgeEditor Code for Editing Factual Knowledge in Language Models (https://arxiv.org/abs/2104.08164). @inproceedings{decao2021editing, title={Ed

86 Nov 28, 2022

A Model for Natural Language Attack on Text Classification and Inference

TextFooler A Model for Natural Language Attack on Text Classification and Inference This is the source code for the paper: Jin, Di, et al. "Is BERT Re

418 Dec 16, 2022

Code associated with the Don't Stop Pretraining ACL 2020 paper

dont-stop-pretraining Code associated with the Don't Stop Pretraining ACL 2020 paper Citation @inproceedings{dontstoppretraining2020, author = {Suchi

449 Jan 4, 2023

Training and evaluation codes for the BertGen paper (ACL-IJCNLP 2021)

BERTGEN This repository is the implementation of the paper "BERTGEN: Multi-task Generation through BERT" (https://arxiv.org/abs/2106.03484). The codeb

9 Oct 26, 2022

DANeS is an open-source E-newspaper dataset by collaboration between DATASET JSC (dataset.vn) and AIV Group (aivgroup.vn)

DANeS - Open-source E-newspaper dataset Source: Technology vector created by macrovector - www.freepik.com. DANeS is an open-source E-newspaper datase

64 Aug 17, 2022

clustimage is a python package for unsupervised clustering of images.

clustimage The aim of clustimage is to detect natural groups or clusters of images. Image recognition is a computer vision task for identifying and ve

52 Jan 2, 2023

Implementation of Neonatal Seizure Detection using EEG signals for deploying on edge devices including Raspberry Pi.

NeonatalSeizureDetection Description Link: https://arxiv.org/abs/2111.15569 Citation: @misc{nagarajan2021scalable, title={Scalable Machine Learn

11 Nov 8, 2022

PyContinual (An Easy and Extendible Framework for Continual Learning)

PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read

176 Jan 5, 2023

Code to produce syntactic representations that can be used to study syntax processing in the human brain

Can fMRI reveal the representation of syntactic structure in the brain? The code base for our paper on understanding syntactic representations in the

4 Dec 18, 2022

A paper list of pre-trained language models (PLMs).

Large-scale pre-trained language models (PLMs) such as BERT and GPT have achieved great success and become a milestone in NLP.

124 Jan 2, 2023

Various Algorithms for Short Text Mining

Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te

466 Dec 6, 2022

Memory tests solver with using OpenCV

Human Benchmark project This project is OpenCV based programs which are puzzle solvers for 7 different games for https://humanbenchmark.com/. made as

24 Dec 27, 2022

Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

Description The source code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chin

3 Jun 28, 2022

Python tools for 3D face: 3DMM, Mesh processing(transform, camera, light, render), 3D face representations.

face3d: Python tools for processing 3D face Introduction This project implements some basic functions related to 3D faces. You can use this to process

2.3k Dec 30, 2022

Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2021).

Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER. @inproceedings{tedes

40 Dec 11, 2022

Python Speech-processing Resources

Python speech-processing Libraries

BERT, LDA, and TFIDF based keyword extraction in Python

BERT-based Financial Question Answering System

Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.

User-friendly Voice Cloning Application

Label data using HuggingFace's transformers and automatically get a prediction service

Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python

UniSpeech - Large Scale Self-Supervised Learning for Speech

Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition

Malaya-Speech is a Speech-Toolkit library for bahasa Malaysia, powered by Deep Learning Tensorflow.

Test finetuning of XLSR (multilingual wav2vec 2.0) for other speech classification tasks

Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Large-scale pretraining for dialogue

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Awesome Treasure of Transformers Models Collection

PyContinual (An Easy and Extendible Framework for Continual Learning)

The King is Naked: on the Notion of Robustness for Natural Language Processing

A Python library for electronic structure pre/post-processing

The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Pytorch implementation NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

SimpleITK is an image analysis toolkit with a large number of components supporting general filtering operations, image segmentation and registration

CYGNUS, the Cynical AI, combines snarky responses with uncanny aggression.

A small Python module for BMP image processing.

A voice assistant which can be used to interact with your computer and controls your pc operations

p5 is a Python package based on the core ideas of Processing.

gACSON software for visualization, processing and analysis of three-dimensional electron microscopy images

Get 2D point positions (e.g., facial landmarks) projected on 3D mesh

A Python Based Utility for Processing GST-Return JSON Files to Multiple Formats

A Python Library to Make Quote Images

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

📜Generate poetry with gcc diagnostics

Python supercharged for the fastai library

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

A setup script to generate ITK Python Wheels

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

Tools for computational pathology

State-of-the-art NLP through transformer models in a modular design and consistent APIs.

The world's largest toxicity dataset.

This repository implements a brute-force spellchecker utilizing the Damerau-Levenshtein edit distance.

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

An end to end ASR Transformer model training repo

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks

Hashformers is a framework for hashtag segmentation with transformers.

Meandering In Networks of Entities to Reach Verisimilar Answers

[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

The easiest tool for extracting radiomics features and training ML models on them.

A time series processing library

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

A python script for extracting/removing exif data from images by @AbirHasan2005

A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

Steerable discovery of neural audio effects

This github repo is for Neurips 2021 paper, NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Object Detection with YOLOv3

HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.

Stenography encryption script

A package, and script, to perform imaging transcriptomics on a neuroimaging scan.

The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete.

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

SingleVC performs any-to-one VC, which is an important component of MediumVC project.

A Chinese to English Neural Model Translation Project

Implementation of Google Brain's WaveGrad high-fidelity vocoder

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Neural HMMs are all you need (for high-quality attention-free TTS)

End-2-end speech synthesis with recurrent neural networks

Text to speech for Vietnamese, ez to use, ez to update

Natural Language Processing Tasks and Examples.

Pipeline for training LSA models using Scikit-Learn.

An 16kHz implementation of HiFi-GAN for soft-vc.

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

Revisiting Pre-trained Models for Chinese Natural Language Processing (Findings of EMNLP 2020)

Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Makes google's political ad database actually useful

Open source annotation tool for machine learning practitioners.