1149 Python Audio-segmentation Libraries

A script that downloads YouTube videos/audio

YouTube-Downloader A script that downloads YouTube videos/audio from YouTube. Usage Download the script by executing the following in your terminal:

2 Jan 4, 2022

Python module providing a framework to trace individual edges in an image using Gaussian process regression.

Edge Tracing using Gaussian Process Regression Repository storing python module which implements a framework to trace individual edges in an image usi

7 Dec 27, 2022

Image Segmentation using U-Net, U-Net with skip connections and M-Net architectures

Brain-Image-Segmentation Segmentation of brain tissues in MRI image has a number of applications in diagnosis, surgical planning, and treatment of bra

8 Oct 27, 2022

A generalist algorithm for cell and nucleus segmentation.

Cellpose | A generalist algorithm for cell and nucleus segmentation. Cellpose was written by Carsen Stringer and Marius Pachitariu. To learn about Cel

733 Dec 29, 2022

Head and Neck Tumour Segmentation and Prediction of Patient Survival Project

Head-and-Neck-Tumour-Segmentation-and-Prediction-of-Patient-Survival Welcome to the Head and Neck Tumour Segmentation and Prediction of Patient Surviv

5 Oct 20, 2022

The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te

2 Dec 25, 2021

Lung Segmentation with fastapi

Lung Segmentation with fastapi This app uses FastAPI as backend. Usage for app.py First install required libraries by running: pip install -r requirem

0 Sep 20, 2022

Multi-View Radar Semantic Segmentation

Multi-View Radar Semantic Segmentation Paper Multi-View Radar Semantic Segmentation, ICCV 2021. Arthur Ouaknine, Alasdair Newson, Patrick Pérez, Flore

37 Oct 25, 2022

Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)

HAIS Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021) by Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang*. (*) Corresp

145 Jan 5, 2023

Perception-aware multi-sensor fusion for 3D LiDAR semantic segmentation (ICCV 2021)

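Perception-Aware Multi-Sensor Fusion for 3D LiDAR Semantic Segmentation (ICCV 2021) [中文|EN] Overview: This work mainly explores an efficient multi-sensor (LiDAR and camera) fusion method for point cloud semantic segmentation. Existing multi-sensor fusion methods mainly project the point cloud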

126 Dec 30, 2022

This is the code related to "Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation" (ICCV 2021).

Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation This is the code relat

39 Sep 23, 2022

Single object tracking and segmentation.

Single/Multiple Object Tracking and Segmentation Codes and comparison of recent single/multiple object tracking and segmentation. News 💥 AutoMatch is

385 Jan 2, 2023

[ICCV 2021 Oral] Mining Latent Classes for Few-shot Segmentation

Mining Latent Classes for Few-shot Segmentation Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao. This codebase contains baseline of our paper Mini

66 Nov 29, 2022

Code for Recurrent Mask Refinement for Few-Shot Medical Image Segmentation (ICCV 2021).

Recurrent Mask Refinement for Few-Shot Medical Image Segmentation Steps Install any missing packages using pip or conda Preprocess each dataset using

39 Dec 8, 2022

Crossover Learning for Fast Online Video Instance Segmentation (ICCV 2021)

TL;DR: CrossVIS (Crossover Learning for Fast Online Video Instance Segmentation) proposes a novel crossover learning paradigm to fully leverage rich c

79 Nov 25, 2022

(ICCV 2021 Oral) Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation.

DARS Code release for the paper "Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation", ICCV 2021

58 Jan 1, 2023

Multi-Anchor Active Domain Adaptation for Semantic Segmentation (ICCV 2021 Oral)

Multi-Anchor Active Domain Adaptation for Semantic Segmentation Munan Ning*, Donghuan Lu*, Dong Wei†, Cheng Bian, Chenglang Yuan, Shuang Yu, Kai Ma, Y

36 Dec 7, 2022

Code for the ICCV2021 paper "Personalized Image Semantic Segmentation"

PSS: Personalized Image Semantic Segmentation Paper PSS: Personalized Image Semantic Segmentation Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming

15 Jul 9, 2022

[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [arXiv] [Project Page] @inproceedings{ huang2021fapn, title={{FaPN}: Feature-alig

175 Dec 30, 2022

ICCV2021 Papers with Code

1.4k Jan 2, 2023

Generating a structured library of .wav samples with Python.

sample-library Scripts for generating a structured sample library with Python Requires Docker about Samples are written to wave files in lib/. Differe

1 Nov 11, 2021

Code for CMaskTrack R-CNN (proposed in Occluded Video Instance Segmentation)

CMaskTrack R-CNN for OVIS This repo serves as the official code release of the CMaskTrack R-CNN model on the Occluded Video Instance Segmentation data

61 Nov 25, 2022

Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022

Barbershop: GAN-based Image Compositing using Segmentation Masks (SIGGRAPH Asia 2021)

Barbershop: GAN-based Image Compositing using Segmentation Masks Barbershop: GAN-based Image Compositing using Segmentation Masks Peihao Zhu, Rameen A

928 Dec 30, 2022

WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

27 Dec 22, 2022

An unofficial personal implementation of UM-Adapt, specifically to tackle joint estimation of panoptic segmentation and depth prediction for autonomous driving datasets.

Semisupervised Multitask Learning This repository is an unofficial and slightly modified implementation of UM-Adapt[1] using PyTorch. This code primar

11 Nov 25, 2022

An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

OpenCV-ToothPaint3-Advanced-Digital-Image-Editor This application named ‘Tooth Paint’ version TP_2020.3 (64-bit) or version 3 was developed within a w

1 Nov 5, 2021

Carnatic Notes Predictor for audio files

Carnatic Notes Predictor for audio files Link for live application: https://share.streamlit.io/pradeepak1/carnatic-notes-predictor-for-audio-files/mai

1 Nov 6, 2021

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region

WORD: Revisiting Organs Segmentation in the Whole Abdominal Region. This repository provides the codebase and dataset for our work WORD: Revisiting Or

71 Jan 7, 2023

On-device speech-to-index engine powered by deep learning.

30 Nov 24, 2022

A learning-based data collection tool for human segmentation

FullBodyFilter A Learning-Based Data Collection Tool For Human Segmentation Contents Documentation Source Code and Scripts Overview of Project Usage O

4 Jun 24, 2022

A pytorch-based real-time segmentation model for autonomous driving

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap

342 Dec 22, 2022

A simple pytorch pipeline for semantic segmentation.

SegmentationPipeline -- Pytorch A simple pytorch pipeline for semantic segmentation. Requirements : torch=1.9.0 tqdm albumentations=1.0.3 opencv-pyt

4 Feb 22, 2022

A rofi-blocks script that searches youtube and plays the selected audio on mpv.

rofi-ytm A rofi-blocks script that searches youtube and plays the selected audio on mpv. To use the script, run the following command rofi -modi block

26 Dec 21, 2022

Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation

DistMIS Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation. DistriMIS Distributing Deep Learning Hyperparameter Tuning

2 Sep 9, 2022

VoxHRNet - Whole Brain Segmentation with Full Volume Neural Network

VoxHRNet This is the official implementation of the following paper: Whole Brain Segmentation with Full Volume Neural Network Yeshu Li, Jonathan Cui,

12 Nov 24, 2022

Video-based open-world segmentation

UVO_Challenge Team Alpes_runner Solutions This is an official repo for our UVO Challenge solutions for Image/Video-based open-world segmentation. Our

84 Dec 22, 2022

[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

K-Net: Towards Unified Image Segmentation Introduction This is an official release of the paper K-Net:Towards Unified Image Segmentation. K-Net will a

423 Jan 2, 2023

Bayesian Optimization Library for Medical Image Segmentation.

bayesmedaug: Bayesian Optimization Library for Medical Image Segmentation. bayesmedaug optimizes your data augmentation hyperparameters for medical im

7 Feb 10, 2022

An implementation of "Curved-Voxel Clustering for Accurate Segmentation of 3D LiDAR Point Clouds with Real-Time Performance"

Lidar-Segementation An implementation of "Curved-Voxel Clustering for Accurate Segmentation of 3D LiDAR Point Clouds with Real-Time Performance" from

135 Jan 6, 2023

convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format.

convert-to-opus-cli convert-to-opus-cli is a Python CLI program for converting audio files to opus audio format. Installation Must have installed ffmp

4 Dec 21, 2022

An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

CPC_audio This code implements the Contrast Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we

8 Nov 14, 2022

Terminal-based audio-to-text converter

att Terminal-based audio-to-text converter Project description A terminal-based audio-to-text converter written in python, enabling you to convert .wa

4 Dec 15, 2022

A story bot that scrapes stories from the r/stories subreddit and converts them into audio files.

Introduction This is a story bot that scrapes stories from the r/stories subreddit and converts them into audio files. Installation pip install -r req

11 Jun 30, 2022

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

Adelaide Intelligent Machines (AIM) Group

7 Sep 12, 2022

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation By Yi Li, Zhanghui Kuang, Liyang Liu, Yimin Chen, Wayne Zhang SenseTime, Tsinghua Unive

33 Oct 14, 2022

digital audio workstation, instrument and effect plugins, wave editor

306 Jan 5, 2023

An 8D music player made to enjoy Halloween this year!🤘

HAPPY HALLOWEEN buddy! Split Player Hello There! Welcome to SplitPlayer... Supposed To Be A 8DPlayer.... You Decide.... It can play the ordinary audio

1 Nov 4, 2021

BMVC 2021 Oral: code for BI-GCN: Boundary-Aware Input-Dependent Graph Convolution for Biomedical Image Segmentation

BMVC 2021 BI-GConv: Boundary-Aware Input-Dependent Graph Convolution for Biomedical Image Segmentation Necessary Dependencies: PyTorch 1.2.0 Python 3.

15 Nov 8, 2022

Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond

CRF - Conditional Random Fields A library for dense conditional random fields (CRFs). This is the official accompanying code for the paper Regularized

21 Nov 26, 2022

Official Datasets and Implementation from our Paper "Video Class Agnostic Segmentation in Autonomous Driving".

Video Class Agnostic Segmentation [Method Paper] [Benchmark Paper] [Project] [Demo] Official Datasets and Implementation from our Paper "Video Class A

26 Oct 24, 2022

Covid-19 Test AI (Deep Learning - NNs) Software. Accuracy is 96.5%, loss is 0.09 :)

Covid-19 Test AI (Deep Learning - NNs) Software I developed a segmentation algorithm to understand whether Covid-19 Test Photos are positive or negati

28 Dec 4, 2021

Fast image augmentation library and an easy-to-use wrapper around other libraries

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 9, 2023

Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files according to their common names

Batch Sorting Using python to generate a bat script of repetitive lines of code that differ in some way but can sort out a group of audio files accord

1 Oct 29, 2021

Python script for extracting audio from video files and creating Mel spectrograms

video2spectrogram About This package is meant to automate the process of extracting audio files from videos and saving the plots computed from these a

1 Oct 28, 2021

A Python 3 script for capturing and recording an SDR stream to a WAV file (or serving it to an HTTP audio stream).

rfsoapyfile A Python 3 script for capturing and recording an SDR stream to a WAV file (or serving it to an HTTP audio stream). The script is threaded fo

4 Dec 19, 2022

Space Time Recurrent Memory Network - Pytorch

Space Time Recurrent Memory Network - Pytorch (wip) Implementation of Space Time Recurrent Memory Network, recurrent network competitive with attentio

50 Nov 7, 2021

MultiMix: Sparingly Supervised, Extreme Multitask Learning From Medical Images (ISBI 2021, MELBA 2021)

MultiMix This repository contains the implementation of MultiMix. Our publications for this project are listed below: "MultiMix: Sparingly Supervised,

27 Dec 22, 2022

Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"

TriBERT This repository contains the code for the NeurIPS 2021 paper titled "TriBERT: Full-body Human-centric Audio-visual Representation Learning for

8 Aug 31, 2022

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning Reference Abeßer, J. & Müller, M. Towards Audio Domain Adapt

2 Jul 6, 2022

Audio Visual Emotion Recognition using TDA

Audio Visual Emotion Recognition using TDA RAVDESS database with two datasets analyzed: Video and Audio dataset: Audio-Dataset: https://www.kaggle.com

Combinatorial Image Analysis research group

3 May 11, 2022

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

5.1k Jan 2, 2023

Implementation of UNET architecture for Image Segmentation.

Semantic Segmentation using UNET This is the implementation of UNET on Carvana Image Masking Kaggle Challenge About the Dataset This dataset contains

4 Dec 21, 2021

An interactive interface for using OpenCV's GrabCut algorithm for image segmentation.

Interactive GrabCut An interactive interface for using OpenCV's GrabCut algorithm for image segmentation. Setup Install dependencies: pip install nump

16 Oct 10, 2022

PyTorch implementation of Memory-based semantic segmentation for off-road unstructured natural environments.

MemSeg: Memory-based semantic segmentation for off-road unstructured natural environments Introduction This repository is a PyTorch implementation of

11 Nov 28, 2022

Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

4.5k Jan 8, 2023

The PyTorch implementation of DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision The PyTorch implementation of DiscoBox: Weakly Supe

1 Oct 23, 2021

Code repository for the work "Multi-Domain Incremental Learning for Semantic Segmentation", accepted at WACV 2022

Multi-Domain Incremental Learning for Semantic Segmentation This is the Pytorch implementation of our work "Multi-Domain Incremental Learning for Sema

24 Jan 2, 2023

This repository contains code and data for "On the Multimodal Person Verification Using Audio-Visual-Thermal Data"

trimodal_person_verification This repository contains the code, and preprocessed dataset featured in "A Study of Multimodal Person Verification Using

7 Aug 31, 2022

Alleviating Over-segmentation Errors by Detecting Action Boundaries

Alleviating Over-segmentation Errors by Detecting Action Boundaries Forked from ASRF official code. This repo is an implementation of replacing orig

13 Dec 12, 2022

Video Instance Segmentation with a Propose-Reduce Paradigm (ICCV 2021)

Propose-Reduce VIS This repo contains the official implementation for the paper: Video Instance Segmentation with a Propose-Reduce Paradigm Huaijia Li

39 Nov 23, 2022

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

20 May 28, 2022

A LiDAR point cloud cluster for panoptic segmentation

Divide-and-Merge-LiDAR-Panoptic-Cluster A demo video of our method with semantic prior: More information will be coming soon! As a PhD student, I don'

65 Dec 22, 2022

Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP

Wav2CLIP 🚧 WIP 🚧 Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP 📄 🔗 Ho-Hsiang Wu, Prem Seetharaman

240 Dec 13, 2022

A simple approach to enable dense segmentation with ViT.

Vision Transformer Segmentation Network This implementation of ViT in pytorch uses a super simple and straight-forward way of generating an output of

5 Jan 3, 2023

Segmentation models with pretrained backbones. PyTorch.

Python library with Neural Networks for Image Segmentation based on PyTorch. The main features of this library are: High level API (just two lines to

6.6k Jan 6, 2023

PyTorch framework for Deep Learning research and development.

Accelerated DL & RL PyTorch framework for Deep Learning research and development. It was developed with a focus on reproducibility, fast experimentati

29 Jul 13, 2022

Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation (CVPR 2020)

Super-BPD for Fast Image Segmentation (CVPR 2020) Introduction We propose direction-based super-BPD, an alternative to superpixel, for fast generic im

189 Dec 7, 2022

Convolutional Neural Network for 3D meshes in PyTorch

MeshCNN in PyTorch SIGGRAPH 2019 [Paper] [Project Page] MeshCNN is a general-purpose deep neural network for 3D triangular meshes, which can be used f

1.4k Jan 4, 2023

An app made in Python using the PyTube and Tkinter libraries to download videos and MP3 audio.

yt-dl (GUI Edition) An app made in Python using the PyTube and Tkinter libraries to download videos and MP3 audio. How do I download this? Windows: Fi

1 Oct 23, 2021

Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021)

Video Instance Segmentation using Inter-Frame Communication Transformers (NeurIPS 2021) Paper Video Instance Segmentation using Inter-Frame Communicat

81 Dec 29, 2022

Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)

rethink-audio-fsl This repo contains the source code for the paper "Who calls the shots? Rethinking Few-Shot Learning for Audio." (WASPAA 2021) Table

34 Dec 24, 2022

OpenClubhouse - A third-party web application based on Flask to play Clubhouse audio.

1.1k Jan 5, 2023

3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans.

3DMV 3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans. This work is based on our ECCV'18 p

0 Feb 6, 2022

Simple script for downloading YouTube videos/audio/playlists

🔗 VilelaTube ▶️ Simple script for downloading YouTube videos/audio/playlists Requirements • How to use • Future improvements ⚠️ Attention! ⚠️ Remember to

2 Nov 3, 2021

Kaggle: Cell Instance Segmentation

Kaggle: Cell Instance Segmentation The goal of this challenge is to detect cells in microscope images, with a simple view of how many cells have been ann

9 Aug 12, 2022

Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].

OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich, Tim Prangemeier, Özdemir Cetin & Heinz Koeppl | Pr

23 Sep 21, 2022

Reimplementation of Dynamic Multi-scale filters for Semantic Segmentation.

Paddle implementation of Dynamic Multi-scale filters for Semantic Segmentation.

2 Nov 1, 2021

A Telegram Userbot to play or stream audio and video songs/files in Telegram Voice Chats.

Vcmusic-Userbot A Telegram Userbot to play or stream audio and video songs/files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram R

3 Oct 23, 2021

A DNN AI project to classify which food people are eating in audio recordings

Deep Learning - EAT Challenge About This project is part of an AI challenge of the DeepLearning course 2021 at the University of Augsburg. The objecti

1 Oct 24, 2021

Hippocampal segmentation using the UNet network for each axis

Hipposeg Hippocampal segmentation using the UNet network for each axis, inspired by https://github.com/MICLab-Unicamp/e2dhipseg Red: False Positive Gr

0 Sep 2, 2021

Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python.

What is Retentioneering? Retentioneering is a Python framework and library to assist product analysts and marketing analysts as it makes it easier to

581 Jan 7, 2023

Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis

Chunked Autoregressive GAN (CARGAN) Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis [paper] [compan

150 Dec 6, 2022

Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

SegSwap Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery" [PDF] [Project page] If our project

41 Dec 10, 2022

TagLab: an image segmentation tool oriented to marine data analysis

TagLab: an image segmentation tool oriented to marine data analysis TagLab was created to support the activity of annotation and extraction of statist

49 Dec 29, 2022

Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

SML (ICCV 2021, Oral) : Official Pytorch Implementation This repository provides the official PyTorch implementation of the following paper: Standardi

20 Oct 20, 2021

PaSST: Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022

Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in an Audio Segment Report Bug · Request Feature Try the Demo Here Table

110 Dec 3, 2022
