404 Repositories
Python pulse-audio-stream Libraries
Classify bird species based on their songs using SIamese Networks and 1D dilated convolutions.
The goal is to classify different birds species based on their songs/calls. Spectrograms have been extracted from the audio samples and used as features for classification.
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Fre-GAN Vocoder Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Training: python train.py --config config.json Citation: @misc{kim2021frega
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
AudioCLIP Extending CLIP to Image, Text and Audio This repository contains implementation of the models described in the paper arXiv:2106.13043. This
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
simple_diarizer Simplified diarization pipeline using some pretrained models. Made to be a simple as possible to go from an input audio file to diariz
This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio
AudioCLIP Extending CLIP to Image, Text and Audio
AudioCLIP Extending CLIP to Image, Text and Audio This repository contains implementation of the models described in the paper arXiv:2106.13043. This
Use android as mic/speaker for ubuntu
Pulse Audio Control Panel Platforms Requirements sudo apt install ffmpeg pactl (already installed) Download Download the AppImage from release page ch
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
WaveGrad2 - PyTorch Implementation PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Status (202
A versatile token stream for handwritten parsers.
Writing recursive-descent parsers by hand can be quite elegant but it's often a bit more verbose than expected, especially when it comes to handling indentation and reporting proper syntax errors. This package provides a powerful general-purpose token stream that addresses these issues and more.
spafe: Simplified Python Audio-Features Extraction
spafe aims to simplify features extractions from mono audio files. The library can extract of the following features: BFCC, LFCC, LPC, LPCC, MFCC, IMFCC, MSRCC, NGCC, PNCC, PSRCC, PLP, RPLP, Frequency-stats etc. It also provides various filterbank modules (Mel, Bark and Gammatone filterbanks) and other spectral statistics.
In this repository, I have developed an end to end Automatic speech recognition project. I have developed the neural network model for automatic speech recognition with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
End to End Automatic Speech Recognition In this repository, I have developed an end to end Automatic speech recognition project. I have developed the
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations
AugLy is a data augmentations library that currently supports four modalities (audio, image, text & video) and over 100 augmentations. Each modality’s augmentations are contained within its own sub-library. These sub-libraries include both function-based and class-based transforms, composition operators, and have the option to provide metadata about the transform applied, including its intensity.
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.
AnimDL - Download & Stream Your Favorite Anime AnimDL is an incredibly powerful tool for downloading and streaming anime. Core features Abuses the dev
A Youtube audio player for your terminal
AudioLine A lightweight Youtube audio player for your terminal Explore the docs » View Demo · Report Bug · Request Feature · Send a Pull Request About
Audio augmentations library for PyTorch for audio in the time-domain
Audio augmentations library for PyTorch for audio in the time-domain, with support for stochastic data augmentations as used often in self-supervised / contrastive learning.
Package pyVHR is a comprehensive framework for studying methods of pulse rate estimation relying on remote photoplethysmography (rPPG)
Package pyVHR (short for Python framework for Virtual Heart Rate) is a comprehensive framework for studying methods of pulse rate estimation relying on remote photoplethysmography (rPPG)
A telegram bot that can send you high-quality audio 🎧🎧🎧
Music downloader bot Still under development Please Report issues to improve this repo.I will try to fix bugs in next update Music downloader bot is a
An audio guide for destroying oracles in Destiny's Vault of Glass raid
prophet An audio guide for destroying oracles in Destiny's Vault of Glass raid. This project allows you to make any encounter with oracles without hav
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling For Official repo of NU-Wave: A Diffusion Probabilistic Model for Neural Audio Up
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
MusCaps: Generating Captions for Music Audio Ilaria Manco1 2, Emmanouil Benetos1, Elio Quinton2, Gyorgy Fazekas1 1 Queen Mary University of London, 2
pyprobables is a pure-python library for probabilistic data structures
pyprobables is a pure-python library for probabilistic data structures. The goal is to provide the developer with a pure-python implementation of common probabilistic data-structures to use in their work.
Audio processor to map oracle notes in the VoG raid in Destiny 2 to call outs.
vog_oracles Audio processor to map oracle notes in the VoG raid in Destiny 2 to call outs. Huge thanks to mzucker on GitHub for the note detection cod
VMD Audio/Text control with natural language
This repository is a proof of principle for performing Molecular Dynamics analysis, in this case with the program VMD, via natural language commands.
Repo for "Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks"
Summary This is the code for the paper Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks by Yanxiang Wang, Xian Zh
Widevine MPD Content Downloader & Decryptor
Widevine-DL Encrypted MPD Manifest Content Downloader + Decryptor (not a Widevine Key Extractor!) Requirements ffmpeg, yt-dlp, aria2, widevine-l3-decr
Accompanying code for our paper "Point Cloud Audio Processing"
Point Cloud Audio Processing Krishna Subramani1, Paris Smaragdis1 1UIUC Paper For the necessary libraries/prerequisites, please use conda/anaconda to
Python codes for Lite Audio-Visual Speech Enhancement.
Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE
Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools
Deep-rPPG: Camera-based pulse estimation using deep learning tools Deep learning (neural network) based remote photoplethysmography: how to extract pu
Telegram bot + userbot for streaming audio in group calls.
Calls Music — Telegram bot + userbot for streaming audio in group calls ✍🏻 Requirements FFmpeg Python 3.7+ 🚀 Deployment 🛠 Configuration Copy exampl
Python function to stream unzip all the files in a ZIP archive: without loading the entire ZIP file or any of its files into memory at once
Python function to stream unzip all the files in a ZIP archive: without loading the entire ZIP file or any of its files into memory at once
Stream Framework is a Python library, which allows you to build news feed, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:
Stream Framework Activity Streams & Newsfeeds Stream Framework is a Python library which allows you to build activity streams & newsfeeds using Cassan
Hide Your Secret Message in any Wave Audio File.
HiddenWave Embedding secret messages in wave audio file What is HiddenWave Hiddenwave is a python based program for simple audio steganography. You ca
Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and convert them into audio. Here I have used Google-text-to-speech library popularly known as gTTS library to convert text file to .mp3 file. Hope you like my project!
Text to speech (using Python) Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and co
Code for paper 'Audio-Driven Emotional Video Portraits'.
Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G
Identify the emotion of multiple speakers in an Audio Segment
MevonAI - Speech Emotion Recognition
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
Word2Wave is a simple method for text-controlled GAN audio generation. You can either follow the setup instructions below and use the source code and CLI provided in this repo or you can have a play around in the Colab notebook provided. Note that, in both cases, you will need to train a WaveGAN model first
Unofficial PyTorch implementation of Google AI's VoiceFilter system
VoiceFilter Note from Seung-won (2020.10.25) Hi everyone! It's Seung-won from MINDs Lab, Inc. It's been a long time since I've released this open-sour
Data manipulation and transformation for audio signal processing, powered by PyTorch
torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the
Tweet stream in OBS browser source
OBS-Twitter-Stream OBSなどの配信ソフトのブラウザソースで特定のキーワードを含んだツイートを表示します 使い方 使い方については以下のwikiを御覧ください https://github.com/CubeZeero/OBS-Twitter-Stream/wiki ダウンロード W
Contrastive Language-Audio Pretraining
CLAP Contrastive Language-Audio Pretraining In due time this repo will be full of lovely things, I hope. Feel free to check out the Issues if you're i
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation This is a demo implementation of BYOL for Audio (BYOL-A), a self-sup
Stream Telegram files to web
Telegram File Stream Bot A Telegram bot to stream files to web Demo Bot » Report a Bug | Request Feature Table of Contents About this Bot Original Rep
Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021) Hang Zhou, Yasheng Sun, Wayne Wu, Chen Cha
Home Assistant custom component for viewing IP cameras RTSP stream in real time using WebRTC technology
WebRTC Camera Home Assistant custom component for viewing IP cameras RTSP stream in real time using WebRTC technology. Based on: Pion - pure Go implem
Pytorch port of Google Research's LEAF Audio paper
leaf-audio-pytorch Pytorch port of Google Research's LEAF Audio paper published at ICLR 2021. This port is not completely finished, but the Leaf() fro
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
NVIDIA DALI The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provi
Real-time stream processing for python
Streamz Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelin
Async client API for the Telegram Group Calls
PyTgCalls This project allow to make Telegram group call with MTProto Api using Pyrogram and WebRTC, this is possible thanks to the power of NodeJS's
About A python based Apple Quicktime protocol,you can record audio and video from real iOS devices
介绍 本应用程序使用 python 实现,可以通过 USB 连接 iOS 设备进行屏幕共享 高帧率(30〜60fps) 高画质 低延迟(200ms) 非侵入性 支持多设备并行 Mac OSX 安装 python =3.7 brew install libusb pkg-config 如需使用 g
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
Auditory Slow-Fast This repository implements the model proposed in the paper: Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen, Slow-Fa
Python audio and music signal processing library
madmom Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks. The library is i
A library for augmenting annotated audio data
muda A library for Musical Data Augmentation. muda package implements annotation-aware musical data augmentation, as described in the muda paper. The
Marsyas - Music Analysis, Retrieval and Synthesis for Audio Signals
Welcome to MARSYAS. MARSYAS is a software framework for rapid prototyping of audio applications, with flexibility and extensibility as primary concer
LibXtract is a simple, portable, lightweight library of audio feature extraction functions.
LibXtract LibXtract is a simple, portable, lightweight library of audio feature extraction functions. The purpose of the library is to provide a relat
C++ library for audio and music analysis, description and synthesis, including Python bindings
Essentia Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license.
a library for audio and music analysis
aubio aubio is a library to label music and sounds. It listens to audio signals and attempts to detect events. For instance, when a drum is hit, at wh
Auralisation of learned features in CNN (for audio)
AuralisationCNN This repo is for an example of auralisastion of CNNs that is demonstrated on ISMIR 2015. Files auralise.py: includes all required func
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
Python CD-DA ripper preferring accuracy over speed
Whipper Whipper is a Python 3 (3.6+) CD-DA ripper based on the morituri project (CDDA ripper for *nix systems aiming for accuracy over speed). It star
Supysonic is a Python implementation of the Subsonic server API.
Supysonic Supysonic is a Python implementation of the Subsonic server API. Current supported features are: browsing (by folders or tags) streaming of
Powerful, simple, audio tag editor for GNU/Linux
puddletag puddletag is an audio tag editor (primarily created) for GNU/Linux similar to the Windows program, Mp3tag. Unlike most taggers for GNU/Linux
All-In-One Digital Audio Workstation and Plugin Suite
How to install Windows Mac OS X Fedora Ubuntu How to Build Debian and Ubuntu Fedora All Other Linux Distros Mac OS X Windows What is MusiKernel? MusiK
MusicBrainz Picard
MusicBrainz Picard MusicBrainz Picard is a cross-platform (Linux/Mac OS X/Windows) application written in Python and is the official MusicBrainz tagge
Real-time audio visualizations (spectrum, spectrogram, etc.)
Friture Friture is an application to visualize and analyze live audio data in real-time. Friture displays audio data in several widgets, such as a sco
Audio book player for senior visually impaired.
PI Zero W Audio Book Motivation and requirements My dad is practically blind and at 80 years has trouble hearing and operating tiny or more complicate
Audio spatialization over WebRTC and JACK Audio Connection Kit
Audio spatialization over WebRTC Spatify provides a framework for building multichannel installations using WebRTC.
Conferencing Speech Challenge
ConferencingSpeech 2021 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech challenge. For more detai
:sound: Play and Record Sound with Python :snake:
Play and Record Sound with Python This Python module provides bindings for the PortAudio library and a few convenience functions to play and record Nu
Audio library for modelling loudness
Loudness Loudness is a C++ library with Python bindings for modelling perceived loudness. The library consists of processing modules which can be casc
Speech recognition module for Python, supporting several engines and APIs, online and offline.
SpeechRecognition Library for performing speech recognition, with support for several engines and APIs, online and offline. Speech recognition engine/
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
Audiomentations A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio a
Audio features extraction
Yaafe Yet Another Audio Feature Extractor Build status Branch master : Branch dev : Anaconda : Install Conda Yaafe can be easily install with conda. T
python wrapper for rubberband
pyrubberband A python wrapper for rubberband. For now, this just provides lightweight wrappers for pitch-shifting and time-stretching. All processing
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack
A fast MDCT implementation using SciPy and FFTs
MDCT A fast MDCT implementation using SciPy and FFTs Installation As usual pip install mdct Dependencies NumPy SciPy STFT Usage import mdct spectrum
An audio digital processing toolbox based on a workflow/pipeline principle
AudioTK Audio ToolKit is a set of audio filters. It helps assembling workflows for specific audio processing workloads. The audio workflow is split in
Data manipulation and transformation for audio signal processing, powered by PyTorch
torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the
kapre: Keras Audio Preprocessors
Kapre Keras Audio Preprocessors - compute STFT, ISTFT, Melspectrogram, and others on GPU real-time. Tested on Python 3.6 and 3.7 Why Kapre? vs. Pre-co
Python library for handling audio datasets.
AUDIOMATE Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a gener
Python I/O for STEM audio files
stempeg = stems + ffmpeg Python package to read and write STEM audio files. Technically, stems are audio containers that combine multiple audio stream
cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
audioread Decode audio files using whichever backend is available. The library currently supports: Gstreamer via PyGObject. Core Audio on Mac OS X via
LedFx is a network based LED effect controller with support for advanced real-time audio effects
Welcome to LedFx ✨ -Making music come alive! LedFx website: https://ledfx.app/ What is LedFx? What LedFx offers is the ability to take audio input, an
Real-time video and audio streams over the network, with Streamlit.
streamlit-webrtc Example You can try out the sample app using the following commands.
Efficient neural networks for analog audio effect modeling
micro-TCN Efficient neural networks for audio effect modeling
MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)
mongo-connector The mongo-connector project originated as a MongoDB mongo-labs project and is now community-maintained under the custody of YouGov, Pl
Generate generic activity streams from the actions on your site. Users can follow any actors' activities for personalized streams.
Django Activity Stream What is Django Activity Stream? Django Activity Stream is a way of creating activities generated by the actions on your site. I
When doing audio and video sentiment recognition, I found that a lot of code is duplicated, often a function in different time debugging for a long time, based on this problem, I want to manage all the previous work, organized into an open source library can be iterative. For their own use and others.
FastAudioVisual Our project is developed here. The goal finish time is March 01, 2021 What is FastAudioVisual? FastAudioVisual is a tool that allows u
A simple script that will watch a stream for you and earn the channel points.
Credits Main idea: https://github.com/gottagofaster236/Twitch-Channel-Points-Miner Bet system (Selenium): https://github.com/ClementRoyer/TwitchAutoCo
Telegram Voice-Chat Bot Written In Python Using Pyrogram.
Telegram Voice-Chat Bot Telegram Voice-Chat Bot To Play Music From Various Sources In Your Group Support All linux based os. Windows Mac Diagram Requi
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E
A machine learning package for streaming data in Python. The other ancestor of River.
scikit-multiflow is a machine learning package for streaming data in Python. creme and scikit-multiflow are merging into a new project called River. W
Python Stream Processing
Python Stream Processing Version: 1.10.4 Web: http://faust.readthedocs.io/ Download: http://pypi.org/project/faust Source: http://github.com/robinhood
Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
Latest Salt Documentation Open an issue (bug report, feature request, etc.) Salt is the world’s fastest, most intelligent and scalable automation engi
Python module for handling audio metadata
Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg The
Scalable audio processing framework written in Python with a RESTful API
TimeSide : scalable audio processing framework and server written in Python TimeSide is a python framework enabling low and high level audio analysis,
Manipulate audio with a simple and easy high level interface
Pydub Pydub lets you do stuff to audio in a way that isn't stupid. Stuff you might be looking for: Installing Pydub API Documentation Dependencies Pla
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple
?️ Open Source Audio Matching and Mastering
Matching + Mastering = ❤️ Matchering 2.0 is a novel Containerized Web Application and Python Library for audio matching and mastering. It follows a si