385 Repositories
Python audio-streaming Libraries
A collection of Python scripts for extracting and analyzing acoustics from audio files.
pyAcoustics A collection of Python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe
Utility for Google Text-To-Speech batch audio files generator. Ideal for prompt files creation with Google voices for application in offline IVRs
Google Text-To-Speech Batch Prompt File Maker Are you in the need of IVR prompts, but you have no voice actors? Let Google talk your prompts like a pr
Audio pitch-shifting & re-sampling utility, based on the EMU SP-1200
Pitcher.py Free & OS emulation of the SP-12 & SP-1200 signal chain (now with GUI) Pitch shift / bitcrush / resample audio files Written and tested in
VIsually-Pivoted Audio and(N) Text
VIP-ANT: VIsually-Pivoted Audio and(N) Text Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowled
Streaming over lightweight data transformations
Description Data augmentation library for Deep Learning, which supports images, segmentation masks, labels and keypoints. Furthermore, SOLT is fast a
A Telegram bot to transcribe audio, video and image into text.
Transcriber Bot A Telegram bot to transcribe audio, video and image into text. Deploy to Heroku Local Deploying Install the FFmpeg. Make sure you have
Control YouTube, streaming sites, media players on your computer using your phone as a remote.
Media Control Control YouTube, streaming sites, media players on your computer using your phone as a remote. Installation pip install -r requirements.
Wav2Vec for speech recognition, classification, and audio classification
Soxan (known in Persian as "Sokhan") This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your
Python script for downloading audio from YouTube songs/videos.
Python script for downloading audio from YouTube songs/videos. All you have to do is specify the path to your folder and then type the song's or video's name, and the audio will be downloaded into your folder.
Make an audio file (really) long-winded
longwind Make an audio file (really) long-winded Daily repetitions are an illusion anyway.
pyo is a Python module written in C to help digital signal processing script creation.
Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020] by Kaisiyuan Wang, Qianyi Wu, Linsen Song, Zhuoqian Yang, Wa
Code for Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019)
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019) We propose Disentangled Audio-Visual System (DAVS) to ad
AudioDVP:Photorealistic Audio-driven Video Portraits
AudioDVP This is the official implementation of Photorealistic Audio-driven Video Portraits. Major Requirements Ubuntu = 18.04 PyTorch = 1.2 GCC =
User-friendly Voice Cloning Application
Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transferring speaker-specific audio featur
Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.
PART 2: CHAIN LINKING AUDIO-TO-TEXT NLP TASKS 2A: TRANSCRIBE-TRANSLATE-SENTIMENT-ANALYSIS In notebook3.0, I demo a simple workflow to: transcribe a lo
A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats
TG-MusicPlayer A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Py
A self-hosted streaming platform with Discord authentication, auto-recording and more!
Delta TTA (Text To Audio) Software
Text-To-Audio-Windows Delta TTA (Text To Audio) Software Info You can use it to convert your text to an audio file. You just write your text and your End
Text Classification in Turkish Texts with Bert
You can watch the details of the project on my YouTube channel Project Interface Project Second Interface Goal= Correctly guessing the classification
Lets you download entire YT-playlists.
Youtube MP3 Playlist Downloader Lets you download entire YouTube playlists as mp3 files. This application is basically a script that makes it easier
A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities
MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-
A real-time financial data streaming pipeline and visualization platform using Apache Kafka, Cassandra, and Bokeh.
Realtime Financial Market Data Visualization and Analysis Introduction This repo shows my project about real-time stock data pipeline. All the code is
A very fast file streaming bot used for streaming and downloading movies
FileStreamBot GIVE A STAR AND FORK ELSE NO MORE OPENSOURCE A Telegram bot to turn all media and document files into web links. Report a Bug | Request F
The virtual calculator is overlaid on the live stream from your camera
The virtual calculator is overlaid on the live stream from my USB camera. The program first detects my hand and, in each frame, calculates the distance between two fingers; if the distance is lower than a specific length, it is detected as a click. I can write any arithmetic operation, and when I click the equals sign the result appears in the display section. I can clear the display section by pressing the C key on the keyboard.
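The click rule described above (fingertip distance below a set length counts as a click) could be sketched roughly as follows. This is not the project's code: the threshold value, the function names, and the rule that a held pinch counts only once are assumptions made for illustration, and the fingertip coordinates are assumed to come from some hand-tracking step that is not shown.

```python
import math

# Hypothetical threshold in pixels; the original project does not state its value.
CLICK_DISTANCE_PX = 40.0


def is_click(tip_a, tip_b, threshold=CLICK_DISTANCE_PX):
    """Return True when two fingertip points (x, y) are closer than the threshold."""
    dx = tip_a[0] - tip_b[0]
    dy = tip_a[1] - tip_b[1]
    return math.hypot(dx, dy) < threshold


def count_clicks(frames, threshold=CLICK_DISTANCE_PX):
    """Count clicks over a sequence of (tip_a, tip_b) pairs, one pair per frame.

    Assumption: a click is registered only on the frame where the fingers
    first come together, so holding the pinch does not repeat the click.
    """
    clicks = 0
    was_pinched = False
    for tip_a, tip_b in frames:
        pinched = is_click(tip_a, tip_b, threshold)
        if pinched and not was_pinched:
            clicks += 1
        was_pinched = pinched
    return clicks
```

In a real pipeline the two points would be the tracked fingertip positions in each camera frame, and the pixel threshold would likely need tuning to the camera resolution and the distance of the hand from the lens.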
A GUI-based audio player with support for a large variety of formats
Miza-Player A GUI-based audio player with support for a large variety of formats, able to play from web-hosted media platforms such as YouTube, includ
Streaming parser for multipart/form-data written in Python
Streaming multipart/form-data parser streaming_form_data provides a Python parser for parsing multipart/form-data input chunks (the encoding used when
MelGAN test on audio decoding
Official repository for the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis The original work URL: https://github.com
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!
Utils for streaming large files (S3, HDFS, gzip, bz2...)
smart_open — utils for streaming large files in Python What? smart_open is a Python 3 library for efficient streaming of very large files from/to stor
Real-Time Spherical Microphone Renderer for binaural reproduction in Python
ReTiSAR Implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python [1][2]. Contents: | Requirements | Setup | Q
Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.
BigGAN Audio Visualizer Description This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generat
SAMO: Streaming Architecture Mapping Optimisation
SAMO: Streaming Architecture Mapping Optimiser The SAMO framework provides a method of optimising the mapping of a Convolutional Neural Network model
Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks
Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks - Official Project Page This repository contains the code develope
Steerable discovery of neural audio effects
Steerable discovery of neural audio effects Christian J. Steinmetz and Joshua D. Reiss Abstract Applications of deep learning for audio effects often
Read streams of Twitter data, save them to Kafka, then process them with the Kafka Streams API and Spark Streaming
Using Streaming Twitter Data with Kafka and Spark Reading streams of Twitter data, publishing them to a Kafka topic, processing messages using Kafka Stream
Code for csig audio deepfake detection
FMFCC Audio Deepfake Detection Solution This repo provides a solution for the Multimedia Forgery Forensics Competition (多媒体伪造取证大赛). Our solution achieved 1st in the Audio Deepfake Detection
A qutebrowser userscript for downloading audio / video from YouTube using aria2
Yt-Downloader A qutebrowser userscript for downloading video / audio from YouTube using aria2 via hint links. Requirements Rofi youtube-dl aria2 dunst In
The official repository for Audio ALBERT
AALBERT This is also the official repository of AALBERT, a PyTorch Lightning reimplementation of the paper Audio ALBERT: A Lite Bert for Self-
Implementation of the algorithm shown in the article "Modelo de Predicción de Éxito de Canciones Basado en Descriptores de Audio"
Success Predictor Implementation of the algorithm shown in the article "Modelo de Predicción de Éxito de Canciones Basado en Descriptores de Audio". B
OpenL3: Open-source deep audio and image embeddings
OpenL3 OpenL3 is an open-source Python library for computing deep audio and image embeddings. Please refer to the documentation for detailed instructi
Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval.
FENSE The metric, Fluency ENhanced Sentence-bert Evaluation (FENSE), for audio caption evaluation, proposed in the paper "Can Audio Captions Be Evalua
Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"
Status: Archive (code is provided as-is, no updates expected) Disclaimer This code is based on "Jukebox: A Generative Model for Music" Paper We adju
A live streaming chatroom involving multiple modalities, such as voice, gesture, and facial expression
HiLive A live streaming chatroom involving multiple modalities, such as voice, gesture, and facial expression. Introduction We focus on demonstrating
A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats.
VC UserBot A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Python
CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system
Find out where all films you want to watch are streaming
Just Watch Letterboxd Find out where all films you want to watch are streaming Ever wonder what films you want to watch are already on the streaming p
Convert Video Files To Text And Audio
Video-To-Text Convert Video Files To Text And Audio Convert To Audio 1: open dvtt folder in cmd 2: run this command in cmd = main.py Audio Convert To
R interface to fast.ai
R interface to fastai The fastai package provides R wrappers to fastai. The fastai library simplifies training fast and accurate neural nets using mod
🔊 Audio and fastai v2
Fastaudio An audio module for fastai v2. We want to help you build audio machine learning applications while minimizing the need for audio domain expe
🌲 Implementation of the Robust Random Cut Forest algorithm for anomaly detection on streams
Telegram Video Chat Video Streaming bot 🇱🇰
🧪 Get SESSION_NAME from below: Pyrogram 🎭 Preview ✨ Features Music & Video stream support MultiChat support Playlist & Queue support Skip, Pause, Re
A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.
torchsynth The fastest synth in the universe. Introduction torchsynth is based upon traditional modular synthesis written in pytorch. It is GPU-option
A desktop GUI providing an audio interface for GPT3.
Jabberwocky Project Description This GUI provides an audio interface to GPT-3. My main goal was to provide a conven
This bot can stream audio or video files and urls in telegram voice chats
Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot
This is a Client-Server-System which can send audio from a microphone from the server to client and in the other direction.
Audio-Streaming-Python This is a Client-Server-System which can send audio from a microphone from the server to client and in the other direction. You
ZipFly is a zip archive generator based on zipfile.py
ZipFly is a zip archive generator based on zipfile.py. It was created by Buzon.io to generate very large ZIP archives for immediate sending out to clients, or for writing large ZIP archives without memory inflation.
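The idea of writing a large ZIP archive without holding its contents in memory can be illustrated with the standard library alone. The sketch below does not use ZipFly and does not show its API; `zip_in_chunks` and its parameters are hypothetical names, and the approach simply feeds a member to `zipfile` one chunk at a time so that only the current chunk needs to be resident.

```python
import zipfile


def zip_in_chunks(archive_target, member_name, chunks):
    """Write one archive member from an iterable of byte chunks.

    archive_target may be a path or a writable, seekable file object.
    Only the chunk being written is held in memory at any time.
    """
    with zipfile.ZipFile(archive_target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # force_zip64 allows the member to exceed 2 GiB when its final
        # size is not known in advance.
        with archive.open(member_name, "w", force_zip64=True) as member:
            for chunk in chunks:
                member.write(chunk)
```

A caller could pass a generator that reads a source file in fixed-size blocks, for example `iter(lambda: src.read(1 << 20), b"")` for an open binary file `src`.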
This is a Client-Server-System which can share the screen from the server to client and in the other direction.
Screenshare-Streaming-Python This is a Client-Server-System which can share the screen from the server to client and in the other direction. You have
Python based Telegram bot. Search and download YouTube video or audio.
Python-Telegram-Youtube-Media-Bot Python based Telegram bot. Search and download YouTube video or audio. Just change settings.py and start TelegramBot
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding
The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation, and authentication
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation Introduction Getting Started FSD50K Recipe AudioSet Recipe Label E
This is the reference implementation for "Coresets via Bilevel Optimization for Continual Learning and Streaming"
Coresets via Bilevel Optimization This is the reference implementation for "Coresets via Bilevel Optimization for Continual Learning and Streaming" ht
SomaFM Plugin for Kodi
SomaFM XBMC Plugin This description is a bit outdated. You can simply install this addon by browsing the official repositories from within Kodi. Insta
Audio/Video downloader
youtubeDownloader Audio/Video downloader • The project downloads audio/video/both after link is entered • It also shows total size of the file, time l
🌊 River is a Python library for online machine learning.
River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition is to be the go-to library for doing machine learning on streaming data.
YOLOX_AUDIO is an audio event detection model based on YOLOX
YOLOX_AUDIO is an audio event detection model based on YOLOX, an anchor-free version of YOLO. This repo is implemented in PyTorch. The main goal of YOLOX_AUDIO is to detect and classify pre-defined audio events in the multi-spectrogram domain using image object detection frameworks.
Sentiment analysis on streaming twitter data using Spark Structured Streaming & Python
Sentiment analysis on streaming twitter data using Spark Structured Streaming & Python This project is a good starting point for those who have little
A discord bot for downloading youtube video and audio files
disctube disctube is a discord bot for downloading video and audio files from youtube using python pytube. disclaimer i am not the best python program
Music Streaming Platform based on full implementation of DBSM
Symphony Music Streaming Platform based on full implementation of DBSM List of Commands Insert User (INSERT) Function to implement input in USER Get a
Microservice example with Python, Faust-Streaming and Kafka (Redpanda)
Microservices Orchestration with Python, Faust-Streaming and Kafka (Redpanda) Example project for PythonBenin meetup. It demonstrates how to use Faust
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning This repository contains the code for our ICCV 202
A script that downloads YouTube videos/audio
YouTube-Downloader A script that downloads YouTube videos/audio from youtube. Usage Download the script by executing the following in your terminal :
The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"
Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te
Generating a structured library of .wav samples with Python.
sample-library Scripts for generating a structured sample library with Python Requires Docker about Samples are written to wave files in lib/. Differe
Efficient Training of Audio Transformers with Patchout
PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa
WaveFake: A Data Set to Facilitate Audio DeepFake Detection
WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper
Carnatic Notes Predictor for audio files
Carnatic Notes Predictor for audio files Link for live application: https://share.streamlit.io/pradeepak1/carnatic-notes-predictor-for-audio-files/mai
On-device speech-to-index engine powered by deep learning.
Streamz helps you build pipelines to manage continuous streams of data
Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelines that involve branching, joining, flow control, feedback, back pressure, and so on.
A rofi-blocks script that searches youtube and plays the selected audio on mpv.
rofi-ytm A rofi-blocks script that searches youtube and plays the selected audio on mpv. To use the script, run the following command rofi -modi block
Discord Streaming Status (Bot/SelfBot)
Discord-Streaming-Status Discord Streaming Status For Both User Accounts And Bot Accounts. Open your cmd and enter the command: pip install discord BE
convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format.
convert-to-opus-cli convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format. Installation Must have installed ffmp
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
CPC_audio This code implements the Contrast Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we
Terminal-based audio-to-text converter
att Terminal-based audio-to-text converter Project description A terminal-based audio-to-text converter written in Python, enabling you to convert .wa
This is a story bot that scrapes stories from the r/stories subreddit and converts them into an audio file.
Introduction This is a story bot that scrapes stories from the r/stories subreddit and converts them into an audio file. Installation pip install -r req
digital audio workstation, instrument and effect plugins, wave editor
An 8D music player made to enjoy Halloween this year!🤘
HAPPY HALLOWEEN buddy! Split Player Hello There! Welcome to SplitPlayer... Supposed To Be A 8DPlayer.... You Decide.... It can play the ordinary audio
pyffstream - A CLI frontend for streaming over SRT and RTMP specializing in sending off files
Using Python to generate a .bat script of repetitive, slightly differing lines of code that sorts a group of audio files according to their common names
Batch Sorting Using Python to generate a .bat script of repetitive, slightly differing lines of code that sorts a group of audio files accord
Python script for extracting audio from video files and creating Mel spectrograms
video2spectrogram About This package is meant to automate the process of extracting audio files from videos and saving the plots computed from these a
A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream).
rfsoapyfile A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream). The script is threaded fo
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"
TriBERT This repository contains the code for the NeurIPS 2021 paper titled "TriBERT: Full-body Human-centric Audio-visual Representation Learning for
Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning
Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning Reference Abeßer, J. & Müller, M. Towards Audio Domain Adapt
Audio Visual Emotion Recognition using TDA
Audio Visual Emotion Recognition using TDA RAVDESS database with two datasets analyzed: Video and Audio dataset: Audio-Dataset: https://www.kaggle.com
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.
The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us
This repository contains code and data for "On the Multimodal Person Verification Using Audio-Visual-Thermal Data"
trimodal_person_verification This repository contains the code, and preprocessed dataset featured in "A Study of Multimodal Person Verification Using