355 Python Digital-audio Libraries

Let's you download entire YT-playlists.

Youtube MP3 Playlist Downloader Let's you download entire youtube playlists as mp3 files. This application is basically a script that makes it easier

11 Dec 18, 2022

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-

4 May 8, 2022

Official implementation of the article "Unsupervised JPEG Domain Adaptation For Practical Digital Forensics"

Unsupervised JPEG Domain Adaptation for Practical Digital Image Forensics @WIFS2021 (Montpellier, France) Rony Abecidan, Vincent Itier, Jeremie Boulan

6 Jan 6, 2023

A GUI-based audio player with support for a large variety of formats

Miza-Player A GUI-based audio player with support for a large variety of formats, able to play from web-hosted media platforms such as YouTube, includ

3 Dec 14, 2022

MelGAN test on audio decoding

Official repository for the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis The original work URL: https://github.com

1 Apr 29, 2022

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!

26 Dec 13, 2022

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

ReTiSAR Implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python [1][2]. Contents: | Requirements | Setup | Q

Division of Applied Acoustics at Chalmers University of Technology

51 Dec 17, 2022

Simple Digital Ocean CLI by python.

2 Jan 1, 2023

Tools for computational pathology

A toolkit for computational pathology and machine learning. View documentation Please cite our paper Installation There are several ways to install Pa

254 Dec 12, 2022

Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

BigGAN Audio Visualizer Description This visualizer explores BigGAN (Brock et al., 2018) latent space by using pitch/tempo of an audio file to generat

2 Nov 21, 2022

⚡ H2G-Net for Semantic Segmentation of Histopathological Images

H2G-Net This repository contains the code relevant for the proposed design H2G-Net, which was introduced in the manuscript "Hybrid guiding: A multi-re

8 Nov 24, 2022

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks - Official Project Page This repository contains the code develope

1.7k Dec 18, 2022

Steerable discovery of neural audio effects

Steerable discovery of neural audio effects Christian J. Steinmetz and Joshua D. Reiss Abstract Applications of deep learning for audio effects often

182 Dec 29, 2022

Code for csig audio deepfake detection

FMFCC Audio Deepfake Detection Solution This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection

9 Jun 4, 2022

Userscript qutebrowser for downloading audio / video from youtube using aria2

Yt-Downloader Userscript qutebrowser for downloading video / audio from youtube using aria2 by hint links. Requirements Rofi youtube-dl aria2 dunst In

0 Dec 11, 2021

An python based Timer and Digital Clock

Python-based-Timer- An python based Timer and Digital Clock How to contribute to this repo ❓ Step 1: Fork the this repository Step 2: Clone your fork

3 Sep 16, 2022

The official repository for Audio ALBERT

AALBERT Here is also the official repository of AALBERT, which is Pytorch lightning reimplementation of the paper, Audio ALBERT: A Lite Bert for Self-

55 Dec 11, 2022

FOSS Digital Asset Distribution Platform built on Frappe.

Digistore FOSS Digital Assets Marketplace. Distribute digital assets, like a pro. Video Demo Here Features Create, attach and list digital assets (PDF

30 Dec 8, 2022

Implementation of the algorithm shown in the article "Modelo de Predicción de Éxito de Canciones Basado en Descriptores de Audio"

Success Predictor Implementation of the algorithm shown in the article "Modelo de Predicción de Éxito de Canciones Basado en Descriptores de Audio". B

4 Mar 17, 2022

OpenL3: Open-source deep audio and image embeddings

OpenL3 OpenL3 is an open-source Python library for computing deep audio and image embeddings. Please refer to the documentation for detailed instructi

Music and Audio Research Laboratory - NYU

326 Jan 2, 2023

A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats

TG-MusicPlayer A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Py

4 Dec 14, 2022

Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval.

FENSE The metric, Fluency ENhanced Sentence-bert Evaluation (FENSE), for audio caption evaluation, proposed in the paper "Can Audio Captions Be Evalua

13 Dec 23, 2022

Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Disclaimer This code is a based on "Jukebox: A Generative Model for Music" Paper We adju

24 Dec 29, 2022

A fast, dataset-agnostic, deep visual search engine for digital art history

imgs.ai imgs.ai is a fast, dataset-agnostic, deep visual search engine for digital art history based on neural network embeddings. It utilizes modern

5 Dec 14, 2022

A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats.

VC UserBot A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Python

1 Nov 29, 2021

Convert Video Files To Text And Audio

Video-To-Text Convert Video Files To Text And Audio Convert To Audio 1: open dvtt folder in cmd 2: run this command in cmd = main.py Audio Convert To

2 Dec 5, 2021

Digital image process Basic algorithm

These are some basic algorithms that I have implemented by my hands in the process of learning digital image processing, such as mean and median filtering, sharpening algorithms, interpolation scaling algorithms, histogram equalization algorithms, etc.

2 Nov 3, 2022

R interface to fast.ai

R interface to fastai The fastai package provides R wrappers to fastai. The fastai library simplifies training fast and accurate neural nets using mod

113 Dec 20, 2022

🔊 Audio and fastai v2

Fastaudio An audio module for fastai v2. We want to help you build audio machine learning applications while minimizing the need for audio domain expe

152 Dec 28, 2022

The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

The Ludii General Game System Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). This repository h

50 Jan 4, 2023

A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.

torchsynth The fastest synth in the universe. Introduction torchsynth is based upon traditional modular synthesis written in pytorch. It is GPU-option

229 Jan 2, 2023

BookNLP, a natural language processing pipeline for books

BookNLP BookNLP is a natural language processing pipeline that scales to books and other long documents (in English), including: Part-of-speech taggin

654 Jan 2, 2023

A desktop GUI providing an audio interface for GPT3.

Jabberwocky neil_degrasse_tyson_with_audio.mp4 Project Description This GUI provides an audio interface to GPT-3. My main goal was to provide a conven

16 Nov 27, 2022

This bot can stream audio or video files and urls in telegram voice chats

Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot

4 Oct 9, 2022

This is a Client-Server-System which can send audio from a microphone from the server to client and in the other direction.

Audio-Streaming-Python This is a Client-Server-System which can send audio from a microphone from the server to client and in the other direction. You

0 Jan 5, 2023

Python based Telegram bot. Search and download YouTube video or audio.

Python-Telegram-Youtube-Media-Bot Python based Telegram bot. Search and download YouTube video or audio. Just change settings.py and start TelegramBot

2 Oct 2, 2022

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

589 Jan 2, 2023

Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files

cdvpp Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files Reads a Digital Vaccination Pass PDF file as in

1 Nov 17, 2021

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding, where the hidden data can be utilized for various management purposes, including hyper-linking, annotation, and authentication

1 Nov 17, 2021

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation Introduction Getting Started FSD50K Recipe AudioSet Recipe Label E

84 Dec 27, 2022

Low-dose Digital Mammography with Deep Learning

Impact of loss functions on the performance of a deep neural network designed to restore low-dose digital mammography ====== This repository contains

6 Dec 13, 2022

Code-free deep segmentation for computational pathology

NoCodeSeg: Deep segmentation made easy! This is the official repository for the manuscript "Code-free development and deployment of deep segmentation

26 Nov 23, 2022

SomaFM Plugin for Kodi

SomaFM XBMC Plugin This description is a bit outdated. You can simply install this addon by browsing the official repositories from within Kodi. Insta

7 Jan 21, 2022

Audio/Video downloader

youtubeDownloader Audio/Video downloader • The project downloads audio/video/both after link is entered • It also shows total size of the file, time l

1 Nov 16, 2021

YOLOX_AUDIO is an audio event detection model based on YOLOX

YOLOX_AUDIO is an audio event detection model based on YOLOX, an anchor-free version of YOLO. This repo is an implementated by PyTorch. Main goal of YOLOX_AUDIO is to detect and classify pre-defined audio events in multi-spectrogram domain using image object detection frameworks.

77 Dec 19, 2022

A discord bot for downloading youtube video and audio files

disctube disctube is a discord bot for downloading video and audio files from youtube using python pytube. disclaimer i am not the best python program

3 Feb 3, 2022

Hide secret data within a digital image using good ol' terminal

pystego Hide secret data within a digital image using good ol' terminal Installation The recommended way for installing this package is using, python

1 Jan 6, 2022

ProsePainter combines direct digital painting with real-time guided machine-learning based image optimization.

ProsePainter Create images by painting with words. ProsePainter combines direct digital painting with real-time guided machine-learning based image op

276 Dec 17, 2022

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning This repository contains the code for our ICCV 202

28 Nov 8, 2022

MATLAB codes of the book "Digital Image Processing Fourth Edition" converted to Python

Digital Image Processing Python MATLAB codes of the book "Digital Image Processing Fourth Edition" converted to Python TO-DO: Refactor scripts, curren

24 Oct 16, 2022

A script that downloads YouTube videos/audio

YouTube-Downloader A script that downloads YouTube videos/audio from youtube. Usage Download the script by executing the following in your terminal :

2 Jan 4, 2022

The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te

2 Dec 25, 2021

Generating a structured library of .wav samples with Python.

sample-library Scripts for generating a structured sample library with Python Requires Docker about Samples are written to wave files in lib/. Differe

1 Nov 11, 2021

Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

27 Dec 22, 2022

An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

OpenCV-ToothPaint3-Advanced-Digital-Image-Editor This application named ‘Tooth Paint’ version TP_2020.3 (64-bit) or version 3 was developed within a w

1 Nov 5, 2021

DICexport is a GUI (PyQt5) to export digital image correlation videos

DIC Video Exporter DICexport is a GUI (PyQt5) to export digital image correlation videos. It offers the flexibility to choose a selected range of a vi

0 Jun 23, 2022

Carnatic Notes Predictor for audio files

Carnatic Notes Predictor for audio files Link for live application: https://share.streamlit.io/pradeepak1/carnatic-notes-predictor-for-audio-files/mai

1 Nov 6, 2021

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

27 Dec 22, 2022

On-device speech-to-index engine powered by deep learning.

30 Nov 24, 2022

Digital Earth Australia notebooks and tools repository

Repository for Digital Earth Australia Jupyter Notebooks: tools and workflows for geospatial analysis with Open Data Cube and xarray

335 Dec 24, 2022

A rofi-blocks script that searches youtube and plays the selected audio on mpv.

rofi-ytm A rofi-blocks script that searches youtube and plays the selected audio on mpv. To use the script, run the following command rofi -modi block

26 Dec 21, 2022

Data science, Data manipulation and Machine learning package.

duality Data science, Data manipulation and Machine learning package. Use permitted according to the terms of use and conditions set by the attached l

3 Oct 19, 2022

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec This repo

Building and Urban Data Science (BUDS) Group

5 Dec 2, 2022

Port of Uxn to digital hardware in the Logisim simulator

Uxn-Logisim Implements the Uxn instruction set in digital hardware. Very WIP. Contents cpu.circ - The Logisim file microcode.mc - Microcode source fil

11 Mar 27, 2022

Python Digital Art Generator

Python Digital Art Generator The main goal of this repository is to generate all possible layers permutations given by the user in order to get unique

3 Mar 12, 2022

convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format.

convert-to-opus-cli convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format. Installation Must have installed ffmp

4 Dec 21, 2022

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

CPC_audio This code implements the Contrast Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we

8 Nov 14, 2022

Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach

Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach This is the implementation of traffic prediction code in DTMP based on PyTo

1 Dec 19, 2021

Terminal-based audio-to-text converter

att Terminal-based audio-to-text converter Project description A terminal-based audio-to-text converter written in python, enabling you to convert .wa

4 Dec 15, 2022

This is a story bot, that will scrape stories from r/stories subreddit and convert it into an Audio File.

Introduction This is a story bot, that will scrape stories from r/stories subreddit and convert it into an Audio File. Installation pip install -r req

11 Jun 30, 2022

digital audio workstation, instrument and effect plugins, wave editor

306 Jan 5, 2023

An 8D music player made to enjoy Halloween this year!🤘

HAPPY HALLOWEEN buddy! Split Player Hello There! Welcome to SplitPlayer... Supposed To Be A 8DPlayer.... You Decide.... It can play the ordinary audio

1 Nov 4, 2021

Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files according to their common names

Batch Sorting Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files accord

1 Oct 29, 2021

Python script for extracting audio from video files and creating Mel spectrograms

video2spectrogram About This package is meant to automate the process of extracting audio files from videos and saving the plots computed from these a

1 Oct 28, 2021

A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream).

rfsoapyfile A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream). The script is threaded fo

4 Dec 19, 2022

Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"

TriBERT This repository contains the code for the NeurIPS 2021 paper titled "TriBERT: Full-body Human-centric Audio-visual Representation Learning for

8 Aug 31, 2022

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning Reference Abeßer, J. & Müller, M. Towards Audio Domain Adapt

2 Jul 6, 2022

Audio Visual Emotion Recognition using TDA

Audio Visual Emotion Recognition using TDA RAVDESS database with two datasets analyzed: Video and Audio dataset: Audio-Dataset: https://www.kaggle.com

Combinatorial Image Analysis research group

3 May 11, 2022

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

5.1k Jan 2, 2023

Various hdas (Houdini Digital Assets)

aaTools My various assets for Houdini "ms_asset_loader" - Custom importer assets from Quixel Bridge "asset_placer" - Tool for placment sop geometry on

9 Dec 19, 2022

This repository contains code and data for "On the Multimodal Person Verification Using Audio-Visual-Thermal Data"

trimodal_person_verification This repository contains the code, and preprocessed dataset featured in "A Study of Multimodal Person Verification Using

7 Aug 31, 2022

A Django web application to receive, virus check and validate transfers of digital archival records, and allow archivists to appraise and accession those records.

Aurora Aurora is a Django web application that can receive, virus check and validate transfers of digital archival records, and allows archivists to a

20 Aug 30, 2022

Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Wav2CLIP 🚧 WIP 🚧 Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP 📄 🔗 Ho-Hsiang Wu, Prem Seetharaman

240 Dec 13, 2022

An app made in Python using the PyTube and Tkinter libraries to download videos and MP3 audio.

yt-dl (GUI Edition) An app made in Python using the PyTube and Tkinter libraries to download videos and MP3 audio. How do I download this? Windows: Fi

1 Oct 23, 2021

Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)

rethink-audio-fsl This repo contains the source code for the paper "Who calls the shots? Rethinking Few-Shot Learning for Audio." (WASPAA 2021) Table

34 Dec 24, 2022

OpenClubhouse - A third-part web application based on flask to play Clubhouse audio.

1.1k Jan 5, 2023

Script simples para baixar vídeos/áudios/playlist do YouTube

🔗 VilelaTube ▶️ Script simples para baixar vídeos/áudios/playlist do YouTube Requisitos • Como usar • Melhorias futuras ⚠️ Atenção! ⚠️ Lembre-se de a

2 Nov 3, 2021

A Telegram Userbot to play or streaming Audio and Video songs / files in Telegram Voice Chats.

Vcmusic-Userbot A Telegram Userbot to play or streaming Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram R

3 Oct 23, 2021

a dnn ai project to classify which food people are eating on audio recordings

Deep Learning - EAT Challenge About This project is part of an AI challenge of the DeepLearning course 2021 at the University of Augsburg. The objecti

1 Oct 24, 2021

A refreshed Python toolbox for building complex digital hardware

1k Jan 5, 2023

Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis

Chunked Autoregressive GAN (CARGAN) Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis [paper] [compan

150 Dec 6, 2022

PaSST: Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022

Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug · Request Feature Try the Demo Here Table

110 Dec 3, 2022

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

2 Oct 20, 2021

The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

18 Dec 24, 2022

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Taming Visually Guided Sound Generation • [Project Page] • [ArXiv] • [Poster] • • Listen for the samples on our project page. Overview We propose to t

226 Jan 3, 2023

Photini - A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS.

A free, easy to use, digital photograph metadata (Exif, IPTC, XMP) editing application for Linux, Windows and MacOS. "Metadata" is said to mea

120 Dec 20, 2022

F.R.I.D.A.Y. ----- Female Replacement Intelligent Digital Assistant Youth

F.R.I.D.A.Y. Female Replacement Intelligent Digital Assistant Youth--Jarvis-- the virtual assistant made by python Overview This is a virtual assistan

4 Feb 26, 2022

Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Y-Net Official implementation of A cappella: Audio-visual Singing VoiceSeparation, British Machine Vision Conference 2021 Project page: ipcv.github.io

12 Oct 22, 2022

Python Digital-audio Resources

Python digital-audio Libraries

Let's you download entire YT-playlists.

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

Official implementation of the article "Unsupervised JPEG Domain Adaptation For Practical Digital Forensics"

A GUI-based audio player with support for a large variety of formats

MelGAN test on audio decoding

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

Real-Time Spherical Microphone Renderer for binaural reproduction in Python

Simple Digital Ocean CLI by python.

Tools for computational pathology

Visualizer using audio and semantic analysis to explore BigGAN (Brock et al., 2018) latent space.

⚡ H2G-Net for Semantic Segmentation of Histopathological Images

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

Steerable discovery of neural audio effects

Code for csig audio deepfake detection

Userscript qutebrowser for downloading audio / video from youtube using aria2

An python based Timer and Digital Clock

The official repository for Audio ALBERT

FOSS Digital Asset Distribution Platform built on Frappe.

Implementation of the algorithm shown in the article "Modelo de Predicción de Éxito de Canciones Basado en Descriptores de Audio"

OpenL3: Open-source deep audio and image embeddings

A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats

Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval.

Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"

A fast, dataset-agnostic, deep visual search engine for digital art history

A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats.

Convert Video Files To Text And Audio

Digital image process Basic algorithm

R interface to fast.ai

🔊 Audio and fastai v2

The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

A GPU-optional modular synthesizer in pytorch, 16200x faster than realtime, for audio ML researchers.

BookNLP, a natural language processing pipeline for books

A desktop GUI providing an audio interface for GPT3.

This bot can stream audio or video files and urls in telegram voice chats

This is a Client-Server-System which can send audio from a microphone from the server to client and in the other direction.

Python based Telegram bot. Search and download YouTube video or audio.

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files

The audio-video synchronization of MKV Container Format is exploited to achieve data hiding

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

Low-dose Digital Mammography with Deep Learning

Code-free deep segmentation for computational pathology

SomaFM Plugin for Kodi

Audio/Video downloader

YOLOX_AUDIO is an audio event detection model based on YOLOX

A discord bot for downloading youtube video and audio files

Hide secret data within a digital image using good ol' terminal

ProsePainter combines direct digital painting with real-time guided machine-learning based image optimization.

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.

MATLAB codes of the book "Digital Image Processing Fourth Edition" converted to Python

A script that downloads YouTube videos/audio

The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Generating a structured library of .wav samples with Python.

Efficient Training of Audio Transformers with Patchout

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

DICexport is a GUI (PyQt5) to export digital image correlation videos

Carnatic Notes Predictor for audio files

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

On-device speech-to-index engine powered by deep learning.

Digital Earth Australia notebooks and tools repository

A rofi-blocks script that searches youtube and plays the selected audio on mpv.

Data science, Data manipulation and Machine learning package.

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec

Port of Uxn to digital hardware in the Logisim simulator

Python Digital Art Generator

convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format.

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach

Terminal-based audio-to-text converter

This is a story bot, that will scrape stories from r/stories subreddit and convert it into an Audio File.

digital audio workstation, instrument and effect plugins, wave editor

An 8D music player made to enjoy Halloween this year!🤘

Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files according to their common names

Python script for extracting audio from video files and creating Mel spectrograms

A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream).

Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning