157 Python Speaker-identification Libraries

Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路图像数据预处理-标志性特征图片裁剪：首先根据开源的标注数据训练YOLOv5x6目标检测模型，将训练集

20 Nov 14, 2022

Comprehensive-E2E-TTS - PyTorch Implementation

A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS

114 Nov 13, 2022

This project contains the ClonedPerson dataset and code described in our paper "Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification".

ClonedPerson This is the official repository for the ClonedPerson project, which contains the ClonedPerson dataset and code described in our paper "Cl

55 Dec 27, 2022

Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL)

LUPerson-NL Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL) The repository is for our CVPR2022 paper Large-Scale

43 Dec 26, 2022

App for identification of various objects. Based on YOLO v4 tiny architecture

Object_detection Repository containing trained model yolo v4 tiny, which is capable of identification 80 different classes Default feed is set to be a

0 Jun 22, 2022

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python = 3.6 , Pytorch

84 Jan 3, 2023

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High

157 Jan 1, 2023

Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

TDY-CNN for Text-Independent Speaker Verification Official implementation of Temporal Dynamic Convolutional Neural Network for Text-Independent Speake

16 Oct 17, 2022

Builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques

This project builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques.

20 Dec 30, 2022

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

Introduction This repository contains several materials that supplements the Spoofing-Aware Speaker Verification (SASV) Challenge 2022 including: calc

40 Dec 28, 2022

On the adaptation of recurrent neural networks for system identification

On the adaptation of recurrent neural networks for system identification This repository contains the Python code to reproduce the results of the pape

3 Jan 13, 2022

Pytorch implementation of XRD spectral identification from COD database

XRDidentifier Pytorch implementation of XRD spectral identification from COD database. Details will be explained in the paper to be submitted to NeurI

4 Jan 7, 2023

Exadel CompreFace is a free and open-source face recognition GitHub project

Exadel CompreFace is a leading free and open-source face recognition system Exadel CompreFace is a free and open-source face recognition service that

2.6k Jan 4, 2023

Invasive Plant Species Identification

Invasive_Plant_Species_Identification Used LiDAR Odometry and Mapping (LOAM) to create a 3D point cloud map which can be used to identify invasive pla

2 May 12, 2022

TGS Salt Identification Challenge

TGS Salt Identification Challenge This is an open solution to the TGS Salt Identification Challenge. Note Unfortunately, we can no longer provide supp

123 Nov 4, 2022

Cereal box identification in store shelves using computer vision and a single train image per model.

Product Recognition on Store Shelves Description You can read the task description here. Report You can read and download our report here. Step A - Mu

1 Jan 21, 2022

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022

HAT: Hierarchical Aggregation Transformers for Person Re-identification

11 Sep 5, 2022

Accurate identification of bacteriophages from metagenomic data using Transformer

PhaMer is a python library for identifying bacteriophages from metagenomic data. PhaMer is based on a Transorfer model and rely on protein-based vocab

9 Nov 30, 2022

👄 The most accurate natural language detection library for Python, suitable for long and short text alike

1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr

334 Dec 30, 2022

To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.

1 Jan 11, 2022

🎃 Core identification module of AI powerful point reading system platform.

ppReader-Kernel Intro Core identification module of AI powerful point reading system platform. Usage 硬件： Windows10、GPU：nvdia GTX 1060 、普通RBG相机软件： con

1 Jan 11, 2022

R-package accompanying the paper "Dynamic Factor Model for Functional Time Series: Identification, Estimation, and Prediction"

dffm The goal of dffm is to provide functionality to apply the methods developed in the paper “Dynamic Factor Model for Functional Time Series: Identi

3 Dec 9, 2022

A simple program to make MSI Modern 15 speaker and microphone mute led work.

MSI Modern 15 sound led fixup for linux A simple program to fix the MSI Modern 15 speaker and microphone mute LEDs. Installation Requirements pulsectl

4 Oct 18, 2022

A lightweight face-recognition toolbox and pipeline based on tensorflow-lite

FaceIDLight 📘 Description A lightweight face-recognition toolbox and pipeline based on tensorflow-lite with MTCNN-Face-Detection and ArcFace-Face-Rec

16 Dec 7, 2022

A Python Package For System Identification Using NARMAX Models

SysIdentPy is a Python module for System Identification using NARMAX models built on top of numpy and is distributed under the 3-Clause BSD license. N

175 Dec 25, 2022

Character Grounding and Re-Identification in Story of Videos and Text Descriptions

Character in Story Identification Network (CiSIN) This project hosts the code for our paper. Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung and

8 Dec 9, 2022

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning English | 中文 ❗ Now we provide inferencing code and pre-training models

164 Jan 2, 2023

Analysis and plotting for motor/prop/ESC characterization, thrust vs RPM and torque vs thrust

esc_test This is a Python package used to plot and analyze data collected for the purpose of characterizing a particular propeller, motor, and ESC con

1 Dec 28, 2021

Faster, modernized fork of the language identification tool langid.py

py3langid py3langid is a fork of the standalone language identification tool langid.py by Marco Lui. Original license: BSD-2-Clause. Fork license: BSD

12 Nov 5, 2022

Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification We provide the codes for repr

12 Dec 12, 2022

MakeItTalk: Speaker-Aware Talking-Head Animation

MakeItTalk: Speaker-Aware Talking-Head Animation This is the code repository implementing the paper: MakeItTalk: Speaker-Aware Talking-Head Animation

285 Jan 8, 2023

User-friendly Voice Cloning Application

Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transfering speaker-specific audio featur

19 Dec 30, 2022

UniSpeech - Large Scale Self-Supervised Learning for Speech

UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202

281 Dec 15, 2022

Kivy program for identification & rotation sensing of objects on multi-touch tables.

ObjectViz ObjectViz is a multitouch object detection solution, enabling you to create physical markers out of any reliable multitouch solution. It's e

8 Apr 4, 2022

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-

4 May 8, 2022

This is a graphql api build using ariadne python that serves a graphql-endpoint at port 3002 to perform language translation and identification using deep learning in python pytorch.

Language Translation and Identification this machine/deep learning api that will be served as a graphql-api using ariadne, to perform the following ta

2 Dec 30, 2021

Official Implementation for the paper DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification

DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification Official Implementation for the pape

36 Dec 28, 2022

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

bbc-speech-segmenter: Voice Activity Detection & Speaker Diarization A complete speech segmentation system using Kaldi and x-vectors for voice activit

16 Oct 27, 2022

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri

390 Dec 29, 2022

Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".

:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

114 Dec 18, 2022

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

TensorFlow implementation of 3D Convolutional Neural Networks for Speaker Verification - Official Project Page - Pytorch Implementation This repositor

753 Dec 17, 2022

Training Structured Neural Networks Through Manifold Identification and Variance Reduction

Training Structured Neural Networks Through Manifold Identification and Variance Reduction This repository is a pytorch implementation of the Regulari

0 Dec 23, 2021

This is Official implementation for "Pose-guided Feature Disentangling for Occluded Person Re-Identification Based on Transformer" in AAAI2022

PFD：Pose-guided Feature Disentangling for Occluded Person Re-identification based on Transformer This repo is the official implementation of "Pose-gui

93 Dec 18, 2022

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter

47 Dec 25, 2022

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition"

Fine-tuning wav2vec2 for speaker recognition This is the code used to run the experiments in https://arxiv.org/abs/2109.15053. Detailed logs of each t

103 Dec 26, 2022

Joint Detection and Identification Feature Learning for Person Search

Person Search Project This repository hosts the code for our paper Joint Detection and Identification Feature Learning for Person Search. The code is

712 Dec 17, 2022

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [pdf] The official repository for Self-Supervised Pre-Training for Transfo

45 Dec 3, 2021

Open source person re-identification library in python

Open-ReID Open-ReID is a lightweight library of person re-identification for research purpose. It aims to provide a uniform interface for different da

1.3k Jan 1, 2023

This is the code repository for the paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (NeurIPS 2021).

Code Repository for the Paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (To appear in: Proceedings of NeurIPS20

1 Oct 3, 2022

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

QAConv Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting This PyTorch code is proposed in

166 Dec 28, 2022

Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon! 💡 Collated best practices from most p

4 Jun 26, 2022

This repo contains simple to use, pretrained/training-less models for speaker diarization.

PyDiar This repo contains simple to use, pretrained/training-less models for speaker diarization. Supported Models Binary Key Speaker Modeling Based o

12 Jan 20, 2022

Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

6 Nov 25, 2022

Paper: Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification

Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification T M Feroz Ali, Subhasis Chaudhuri, ICVGIP-20-21

3 Jun 17, 2022

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [pdf] The official repository for Self-Supervised Pre-Training for Transfo

116 Jan 4, 2023

VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets

VGGVox models for speaker identification and verification This directory contains code to import and evaluate the speaker identification and verificat

338 Dec 27, 2022

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.

M2MeT challenge baseline -- AliMeeting This project provides the baseline system recipes for the ICASSP 2020 Multi-channel Multi-party Meeting Transcr

81 Dec 8, 2022

Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

14 Sep 14, 2022

PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identification in Symbolic Scores.

Symbolic Melody Identification This repository is an unofficial PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identifica

3 Feb 21, 2022

Face Recognition plus identification simply and fast | Python

PyFaceDetection Face Recognition plus identification simply and fast Ubuntu Setup sudo pip3 install numpy sudo pip3 install cmake sudo pip3 install dl

16 Sep 22, 2022

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

10 May 15, 2022

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022

Exploiting Robust Unsupervised Video Person Re-identification

Exploiting Robust Unsupervised Video Person Re-identification Implementation of the proposed uPMnet. For the preprint, please refer to [Arxiv]. Gettin

1 Apr 9, 2022

Vehicle direction identification consists of three module detection , tracking and direction recognization.

Vehicle-direction-identification Vehicle direction identification consists of three module detection , tracking and direction recognization. Algorithm

5 Nov 15, 2022

The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Finnish Dialect Identification The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a te

2 Dec 25, 2021

Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION.

LiMuSE Overview Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION. LiMuSE explores group communication on a multi

Auditory Model and Cognitive Computing Lab

17 Oct 26, 2022

Wikipedia Extractive Text Summarizer + Keywords Identification (entropy-based)

Wikipedia Extractive Text Summarizer + Keywords Identification (entropy-based)Wikipedia Extractive Text Summarizer + Keywords Identification (entropy-based)

1 Nov 8, 2021

Chinese Advertisement Board Identification(Pytorch)

Chinese-Advertisement-Board-Identification. We use YoloV5 to extract the ROI of the location of the chinese word. Next, we sort the bounding box and recognize every chinese words which we extracted. The methods which we use are Yolov5, ArgMargin and Focal loss.

12 Jul 21, 2022

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification This repository is the official implementation of [Dealing With Misspeci

0 Oct 25, 2021

Corruption Invariant Learning for Re-identification

Corruption Invariant Learning for Re-identification The official repository for Benchmarks for Corruption Invariant Person Re-identification (NeurIPS

73 Dec 8, 2022

Unofficial implement with paper SpeakerGAN: Speaker identification with conditional generative adversarial network

Introduction This repository is about paper SpeakerGAN , and is unofficially implemented by Mingming Huang ([email protected]), Tiezheng Wang (wtz920729

7 Jan 3, 2023

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

5.1k Jan 2, 2023

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

277 Dec 31, 2022

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Cross-Speaker-Emotion-Transfer - PyTorch Implementation PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Conditio

114 Jan 8, 2023

[BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations"

DomainMix [BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations" [paper] [de

17 Dec 20, 2022

Koopman operator identification library in Python

pykoop pykoop is a Koopman operator identification library written in Python. It allows the user to specify Koopman lifting functions and regressors i

34 Jan 4, 2023

L-SpEx: Localized Target Speaker Extraction

L-SpEx: Localized Target Speaker Extraction The data configuration and simulation of L-SpEx. The code scripts will be released in the future. Data Gen

20 Jan 2, 2023

UniSpeech - Large Scale Self-Supervised Learning for Speech

UniSpeech The family of UniSpeech: UniSpeech (ICML 2021): Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR UniSpeech-

33 Oct 14, 2021

Learning from Synthetic Data with Fine-grained Attributes for Person Re-Identification

Less is More: Learning from Synthetic Data with Fine-grained Attributes for Person Re-Identification Suncheng Xiang Shanghai Jiao Tong University Over

68 Dec 13, 2022

CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search

CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search This repository is the official implementation of CAPITAL: Optimal Subgrou

0 Oct 19, 2021

Biomarker identification for COVID-19 Severity in BALF cells Single-cell RNA-seq data

scBALF Covid-19 dataset Analysis Here is the Github page that has the codes for the bioinformatics pipeline described in the paper COVID-Datathon: Bio

2 May 21, 2022

[ACM MM2021] MGH: Metadata Guided Hypergraph Modeling for Unsupervised Person Re-identification

Introduction This project is developed based on FastReID, which is an ongoing ReID project. Projects BUC In projects/BUC, we implement AAAI 2019 paper

7 Apr 13, 2022

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter

47 Dec 25, 2022

Block fingerprinting for the beacon chain, for client identification & client diversity metrics

blockprint This is a repository for discussion and development of tools for Ethereum block fingerprinting. The primary aim is to measure beacon chain

49 Dec 8, 2022

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

24 May 20, 2022

A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS.

A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS.

237 Jan 2, 2023

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Deepvoice3_pytorch PyTorch implementation of convolutional networks-based text-to-speech synthesis models: arXiv:1710.07654: Deep Voice 3: Scaling Tex

1.8k Jan 8, 2023

Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"

Speaker-Embeddings-Correlation-Pooling This is the original implementation of the pooling method introduced in "Speaker embeddings by modeling channel

10 Apr 30, 2022

Simple embedding based text classifier inspired by fastText, implemented in tensorflow

FastText in Tensorflow This project is based on the ideas in Facebook's FastText but implemented in Tensorflow. However, it is not an exact replica of

306 Dec 2, 2022

Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

Streaming speaker diarization Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation by Juan Manuel Coria, Hervé

187 Jan 6, 2023

Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

Streaming speaker diarization Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation by Juan Manuel Coria, Hervé

185 Jan 1, 2023

[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

90 Dec 31, 2022

[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

89 Dec 18, 2022

An official PyTorch implementation of the paper "Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences", ICCV 2021.

PyTorch implementation of Learning by Aligning (ICCV 2021) This is an official PyTorch implementation of the paper "Learning by Aligning: Visible-Infr

30 Nov 5, 2022

Classifying audio using Wavelet transform and deep learning

Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C

17 Nov 29, 2022

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

86 Dec 7, 2022

Python Speaker-identification Resources

Python speaker-identification Libraries

Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Comprehensive-E2E-TTS - PyTorch Implementation

This project contains the ClonedPerson dataset and code described in our paper "Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification".

Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL)

App for identification of various objects. Based on YOLO v4 tiny architecture

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

Builds a LoRa radio frequency fingerprint identification (RFFI) system based on deep learning techiniques

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

On the adaptation of recurrent neural networks for system identification

Pytorch implementation of XRD spectral identification from COD database

Exadel CompreFace is a free and open-source face recognition GitHub project

Invasive Plant Species Identification

TGS Salt Identification Challenge

Cereal box identification in store shelves using computer vision and a single train image per model.

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

HAT: Hierarchical Aggregation Transformers for Person Re-identification

Accurate identification of bacteriophages from metagenomic data using Transformer

👄 The most accurate natural language detection library for Python, suitable for long and short text alike

To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.

🎃 Core identification module of AI powerful point reading system platform.

R-package accompanying the paper "Dynamic Factor Model for Functional Time Series: Identification, Estimation, and Prediction"

A simple program to make MSI Modern 15 speaker and microphone mute led work.

A lightweight face-recognition toolbox and pipeline based on tensorflow-lite

A Python Package For System Identification Using NARMAX Models

Character Grounding and Re-Identification in Story of Videos and Text Descriptions

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning

Analysis and plotting for motor/prop/ESC characterization, thrust vs RPM and torque vs thrust

Faster, modernized fork of the language identification tool langid.py

Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

MakeItTalk: Speaker-Aware Talking-Head Animation

User-friendly Voice Cloning Application

UniSpeech - Large Scale Self-Supervised Learning for Speech

Kivy program for identification & rotation sensing of objects on multi-touch tables.

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

This is a graphql api build using ariadne python that serves a graphql-endpoint at port 3002 to perform language translation and identification using deep learning in python pytorch.

Official Implementation for the paper DeepFace-EMD: Re-ranking Using Patch-wise Earth Mover’s Distance Improves Out-Of-Distribution Face Identification

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

Training Structured Neural Networks Through Manifold Identification and Variance Reduction

This is Official implementation for "Pose-guided Feature Disentangling for Occluded Person Re-Identification Based on Transformer" in AAAI2022

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition"

Joint Detection and Identification Feature Learning for Person Search

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Open source person re-identification library in python

This is the code repository for the paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (NeurIPS 2021).

Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

This repo contains simple to use, pretrained/training-less models for speaker diarization.

Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification

Paper: Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.

Efficient Speech Processing Tookit for Automatic Speaker Recognition

PyTorch implementation of the paper:A Convolutional Approach to Melody Line Identification in Symbolic Scores.

Face Recognition plus identification simply and fast | Python

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

Exploiting Robust Unsupervised Video Person Re-identification

Vehicle direction identification consists of three module detection , tracking and direction recognization.

The repository for our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text"

Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION.

Wikipedia Extractive Text Summarizer + Keywords Identification (entropy-based)

Chinese Advertisement Board Identification(Pytorch)

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

Corruption Invariant Learning for Re-identification

Unofficial implement with paper SpeakerGAN: Speaker identification with conditional generative adversarial network

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

[BMVC2021] The official implementation of "DomainMix: Learning Generalizable Person Re-Identification Without Human Annotations"

Koopman operator identification library in Python

L-SpEx: Localized Target Speaker Extraction