941 Repositories
Python audio-classification Libraries
An end-to-end PyTorch framework for image and video classification
What's New: March 2021: Added RegNetZ models November 2020: Vision Transformers now available, with training recipes! 2020-11-20: Classy Vision v0.5 R
Sandbox for training deep learning networks
Deep learning networks This repo is used to research convolutional networks primarily for computer vision tasks. For this purpose, the repo contains (
Quickly comparing your image classification models with the state-of-the-art models (such as DenseNet, ResNet, ...)
Image Classification Project Killer in PyTorch This repo is designed for those who want to start their experiments two days before the deadline and ki
3D ResNet Video Classification accelerated by TensorRT
Activity Recognition TensorRT Perform video classification using 3D ResNets trained on Kinetics-400 dataset and accelerated with TensorRT P.S Click on
《LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification》(AAAI 2021) GitHub:
LightXML: Transformer with dynamic negative sampling for High-Performance Extreme Multi-label Text Classification
Semi-supervised Learning for Sentiment Analysis
Neural-Semi-supervised-Learning-for-Text-Classification-Under-Large-Scale-Pretraining Code, models and Datasets for《Neural Semi-supervised Learning fo
Block-wisely Supervised Neural Architecture Search with Knowledge Distillation (CVPR 2020)
DNA This repository provides the code of our paper: Blockwisely Supervised Neural Architecture Search with Knowledge Distillation. Illustration of DNA
The code release of paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization' NIPS 2020.
Domain Generalization for Medical Imaging Classification with Linear Dependency Regularization The code release of paper 'Domain Generalization for Me
Library of various Few-Shot Learning frameworks for text classification
FewShotText This repository contains code for the paper A Neural Few-Shot Text Classification Reality Check Environment setup # Create environment pyt
Pytorch port of Google Research's LEAF Audio paper
leaf-audio-pytorch Pytorch port of Google Research's LEAF Audio paper published at ICLR 2021. This port is not completely finished, but the Leaf() fro
PyTorch Implementation of CvT: Introducing Convolutions to Vision Transformers
CvT: Introducing Convolutions to Vision Transformers Pytorch implementation of CvT: Introducing Convolutions to Vision Transformers Usage: img = torch
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
STAM - Pytorch Implementation of STAM (Space Time Attention Model), yet another pure and simple SOTA attention model that bests all previous models in
A scikit-learn based module for multi-label et. al. classification
scikit-multilearn scikit-multilearn is a Python module capable of performing multi-label learning tasks. It is built on-top of various scientific Pyth
scikit-learn cross validators for iterative stratification of multilabel data
iterative-stratification iterative-stratification is a project that provides scikit-learn compatible cross validators with stratification for multilab
Large-scale linear classification, regression and ranking in Python
lightning lightning is a library for large-scale linear classification, regression and ranking in Python. Highlights: follows the scikit-learn API con
High-level batteries-included neural network training library for Pytorch
Pywick High-Level Training framework for Pytorch Pywick is a high-level Pytorch training framework that aims to get you up and running quickly with st
PyTorch extensions for fast R&D prototyping and Kaggle farming
Pytorch-toolbelt A pytorch-toolbelt is a Python library with a set of bells and whistles for PyTorch for fast R&D prototyping and Kaggle farming: What
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
NVIDIA DALI The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provi
A Python package for time series classification
pyts: a Python package for time series classification pyts is a Python package for time series classification. It aims to make time series classificat
A machine learning toolkit dedicated to time-series data
tslearn The machine learning toolkit for time series analysis in Python Section Description Installation Installing the dependencies and tslearn Getti
A unified framework for machine learning with time series
Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible
a Deep Learning Framework for Text
DeLFT DeLFT (Deep Learning Framework for Text) is a Keras and TensorFlow framework for text processing, focusing on sequence labelling (e.g. named ent
Natural language detection
Detect the language of text. What’s so cool about franc? franc can support more languages(†) than any other library franc is packaged with support for
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Quick Info this library tries to solve language detection of very short words and phrases, even shorter than tweets makes use of both statistical and
DECAF: Deep Extreme Classification with Label Features
DECAF DECAF: Deep Extreme Classification with Label Features @InProceedings{Mittal21, author = "Mittal, A. and Dahiya, K. and Agrawal, S. and Sain
About A python based Apple Quicktime protocol,you can record audio and video from real iOS devices
介绍 本应用程序使用 python 实现,可以通过 USB 连接 iOS 设备进行屏幕共享 高帧率(30〜60fps) 高画质 低延迟(200ms) 非侵入性 支持多设备并行 Mac OSX 安装 python =3.7 brew install libusb pkg-config 如需使用 g
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
Auditory Slow-Fast This repository implements the model proposed in the paper: Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen, Slow-Fa
Implement face detection, and age and gender classification, and emotion classification.
YOLO Keras Face Detection Implement Face detection, and Age and Gender Classification, and Emotion Classification. (image from wider face dataset) Ove
Dogs classification with Deep Metric Learning using some popular losses
Tsinghua Dogs classification with Deep Metric Learning 1. Introduction Tsinghua Dogs dataset Tsinghua Dogs is a fine-grained classification dataset fo
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
involution Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVP
Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification.
Easy Few-Shot Learning Ready-to-use code and tutorial notebooks to boost your way into few-shot image classification. This repository is made for you
Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125
Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc
Python audio and music signal processing library
madmom Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks. The library is i
A library for augmenting annotated audio data
muda A library for Musical Data Augmentation. muda package implements annotation-aware musical data augmentation, as described in the muda paper. The
Marsyas - Music Analysis, Retrieval and Synthesis for Audio Signals
Welcome to MARSYAS. MARSYAS is a software framework for rapid prototyping of audio applications, with flexibility and extensibility as primary concer
LibXtract is a simple, portable, lightweight library of audio feature extraction functions.
LibXtract LibXtract is a simple, portable, lightweight library of audio feature extraction functions. The purpose of the library is to provide a relat
C++ library for audio and music analysis, description and synthesis, including Python bindings
Essentia Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license.
a library for audio and music analysis
aubio aubio is a library to label music and sounds. It listens to audio signals and attempts to detect events. For instance, when a drum is hit, at wh
Auralisation of learned features in CNN (for audio)
AuralisationCNN This repo is for an example of auralisastion of CNNs that is demonstrated on ISMIR 2015. Files auralise.py: includes all required func
Python Library for Model Interpretation/Explanations
Skater Skater is a unified framework to enable Model Interpretation for all forms of model to help one build an Interpretable machine learning system
Gluon CV Toolkit
Gluon CV Toolkit | Installation | Documentation | Tutorials | GluonCV provides implementations of the state-of-the-art (SOTA) deep learning models in
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a
ThunderSVM: A Fast SVM Library on GPUs and CPUs
What's new We have recently released ThunderGBM, a fast GBDT and Random Forest library on GPUs. add scikit-learn interface, see here Overview The miss
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.
Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.
MLBox is a powerful Automated Machine Learning python library.
MLBox is a powerful Automated Machine Learning python library. It provides the following features: Fast reading and distributed data preprocessing/cle
A machine learning toolkit dedicated to time-series data
tslearn The machine learning toolkit for time series analysis in Python Section Description Installation Installing the dependencies and tslearn Getti
A scikit-learn based module for multi-label et. al. classification
scikit-multilearn scikit-multilearn is a Python module capable of performing multi-label learning tasks. It is built on-top of various scientific Pyth
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch
Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
🍊 :bar_chart: :bulb: Orange: Interactive data analysis
Orange Data Mining Orange is a data mining and visualization toolbox for novice and expert alike. To explore data with Orange, one requires no program
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
Python CD-DA ripper preferring accuracy over speed
Whipper Whipper is a Python 3 (3.6+) CD-DA ripper based on the morituri project (CDDA ripper for *nix systems aiming for accuracy over speed). It star
Supysonic is a Python implementation of the Subsonic server API.
Supysonic Supysonic is a Python implementation of the Subsonic server API. Current supported features are: browsing (by folders or tags) streaming of
Powerful, simple, audio tag editor for GNU/Linux
puddletag puddletag is an audio tag editor (primarily created) for GNU/Linux similar to the Windows program, Mp3tag. Unlike most taggers for GNU/Linux
All-In-One Digital Audio Workstation and Plugin Suite
How to install Windows Mac OS X Fedora Ubuntu How to Build Debian and Ubuntu Fedora All Other Linux Distros Mac OS X Windows What is MusiKernel? MusiK
MusicBrainz Picard
MusicBrainz Picard MusicBrainz Picard is a cross-platform (Linux/Mac OS X/Windows) application written in Python and is the official MusicBrainz tagge
Real-time audio visualizations (spectrum, spectrogram, etc.)
Friture Friture is an application to visualize and analyze live audio data in real-time. Friture displays audio data in several widgets, such as a sco
Audio book player for senior visually impaired.
PI Zero W Audio Book Motivation and requirements My dad is practically blind and at 80 years has trouble hearing and operating tiny or more complicate
A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing"
A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf 2021). Abstract In this work we propose Pathfind
Deepfake Scanner by Deepware.
Deepware Scanner (CLI) This repository contains the command-line deepfake scanner tool with the pre-trained models that are currently used at deepware
Implementation of TimeSformer, a pure attention-based solution for video classification
TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.
Audio spatialization over WebRTC and JACK Audio Connection Kit
Audio spatialization over WebRTC Spatify provides a framework for building multichannel installations using WebRTC.
Conferencing Speech Challenge
ConferencingSpeech 2021 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech challenge. For more detai
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Summarization, translation, Q&A, text generation and more at blazing speed using a T5 version implemented in ONNX. This package is still in alpha stag
Get list of common stop words in various languages in Python
Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words
Text vectorization tool to outperform TFIDF for classification tasks
WHAT: Supervised text vectorization tool Textvec is a text vectorization tool, with the aim to implement all the "classic" text vectorization NLP meth
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Kashgari Overview | Performance | Installation | Documentation | Contributing 🎉 🎉 🎉 We released the 2.0.0 version with TF2 Support. 🎉 🎉 🎉 If you
DELTA is a deep learning based natural language and speech processing platform.
DELTA - A DEep learning Language Technology plAtform What is DELTA? DELTA is a deep learning based end-to-end natural language and speech processing p
Snips Python library to extract meaning from text
Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows to extract structured information from sentences written in natur
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
fastNLP fastNLP是一款轻量级的自然语言处理(NLP)工具包,目标是快速实现NLP任务以及构建复杂模型。 fastNLP具有如下的特性: 统一的Tabular式数据容器,简化数据预处理过程; 内置多种数据集的Loader和Pipe,省去预处理代码; 各种方便的NLP工具,例如Embedd
An open source library for deep learning end-to-end dialog systems and chatbots.
DeepPavlov is an open-source conversational AI library built on TensorFlow, Keras and PyTorch. DeepPavlov is designed for development of production re
Library for fast text representation and classification.
fastText fastText is a library for efficient learning of word representations and sentence classification. Table of contents Resources Models Suppleme
💫 Industrial-strength Natural Language Processing (NLP) in Python
spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc
:sound: Play and Record Sound with Python :snake:
Play and Record Sound with Python This Python module provides bindings for the PortAudio library and a few convenience functions to play and record Nu
Audio library for modelling loudness
Loudness Loudness is a C++ library with Python bindings for modelling perceived loudness. The library consists of processing modules which can be casc
Speech recognition module for Python, supporting several engines and APIs, online and offline.
SpeechRecognition Library for performing speech recognition, with support for several engines and APIs, online and offline. Speech recognition engine/
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
Audiomentations A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio a
Audio features extraction
Yaafe Yet Another Audio Feature Extractor Build status Branch master : Branch dev : Anaconda : Install Conda Yaafe can be easily install with conda. T
python wrapper for rubberband
pyrubberband A python wrapper for rubberband. For now, this just provides lightweight wrappers for pitch-shifting and time-stretching. All processing
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack
A fast MDCT implementation using SciPy and FFTs
MDCT A fast MDCT implementation using SciPy and FFTs Installation As usual pip install mdct Dependencies NumPy SciPy STFT Usage import mdct spectrum
An audio digital processing toolbox based on a workflow/pipeline principle
AudioTK Audio ToolKit is a set of audio filters. It helps assembling workflows for specific audio processing workloads. The audio workflow is split in
Data manipulation and transformation for audio signal processing, powered by PyTorch
torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the
kapre: Keras Audio Preprocessors
Kapre Keras Audio Preprocessors - compute STFT, ISTFT, Melspectrogram, and others on GPU real-time. Tested on Python 3.6 and 3.7 Why Kapre? vs. Pre-co
Python library for handling audio datasets.
AUDIOMATE Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a gener
Python I/O for STEM audio files
stempeg = stems + ffmpeg Python package to read and write STEM audio files. Technically, stems are audio containers that combine multiple audio stream
cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
audioread Decode audio files using whichever backend is available. The library currently supports: Gstreamer via PyGObject. Core Audio on Mac OS X via
LedFx is a network based LED effect controller with support for advanced real-time audio effects
Welcome to LedFx ✨ -Making music come alive! LedFx website: https://ledfx.app/ What is LedFx? What LedFx offers is the ability to take audio input, an
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification
NFNets and Adaptive Gradient Clipping for SGD implemented in PyTorch
PyTorch implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping Paper: https://arxiv.org/abs/2102.06171.pdf Original code: htt
Real-time video and audio streams over the network, with Streamlit.
streamlit-webrtc Example You can try out the sample app using the following commands.
Efficient neural networks for analog audio effect modeling
micro-TCN Efficient neural networks for audio effect modeling
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Summarization, translation, Q&A, text generation and more at blazing speed using a T5 version implemented in ONNX. This package is still in alpha stag
Get list of common stop words in various languages in Python
Python Stop Words Table of contents Overview Available languages Installation Basic usage Python compatibility Overview Get list of common stop words
Text vectorization tool to outperform TFIDF for classification tasks
WHAT: Supervised text vectorization tool Textvec is a text vectorization tool, with the aim to implement all the "classic" text vectorization NLP meth
Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.
Kashgari Overview | Performance | Installation | Documentation | Contributing 🎉 🎉 🎉 We released the 2.0.0 version with TF2 Support. 🎉 🎉 🎉 If you
DELTA is a deep learning based natural language and speech processing platform.
DELTA - A DEep learning Language Technology plAtform What is DELTA? DELTA is a deep learning based end-to-end natural language and speech processing p
Snips Python library to extract meaning from text
Snips NLU Snips NLU (Natural Language Understanding) is a Python library that allows to extract structured information from sentences written in natur