173 Python Speaker-encoder Libraries

BankNote-Net: Open dataset and encoder model for assistive currency recognition

BankNote-Net: Open Dataset for Assistive Currency Recognition Millions of people around the world have low or no vision. Assistive software applicatio

13 Oct 28, 2022

Comprehensive-E2E-TTS - PyTorch Implementation

A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS

114 Nov 13, 2022

A Persian Image Captioning model based on Vision Encoder Decoder Models of the transformers🤗.

Persian-Image-Captioning We fine-tuning the Vision Encoder Decoder Model for the task of image captioning on the coco-flickr-farsi dataset. The implem

15 Aug 25, 2022

Official Implementation of "Third Time's the Charm? Image and Video Editing with StyleGAN3" https://arxiv.org/abs/2201.13433

Third Time's the Charm? Image and Video Editing with StyleGAN3 Yuval Alaluf*, Or Patashnik*, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, Da

531 Dec 20, 2022

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa

18 Dec 29, 2022

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python = 3.6 , Pytorch

84 Jan 3, 2023

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High

157 Jan 1, 2023

A very simple python script to encode and decode PowerShell one-liners.

PowerShell Encoder A very simple python script to encode and decode PowerShell one-liners. I used Raikia's PowerShell encoder ALOT, but one day it wen

5 Jul 29, 2022

Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion

Feature-Style Encoder for Style-Based GAN Inversion Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion. Code will

63 Jan 3, 2023

Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

TDY-CNN for Text-Independent Speaker Verification Official implementation of Temporal Dynamic Convolutional Neural Network for Text-Independent Speake

16 Oct 17, 2022

A U-Net combined with a variational auto-encoder that is able to learn conditional distributions over semantic segmentations.

Probabilistic U-Net + **Update** + An improved Model (the Hierarchical Probabilistic U-Net) + LIDC crops is now available. See below. Re-implementatio

498 Dec 26, 2022

Semantic Segmentation Suite in TensorFlow

Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!

2.5k Jan 6, 2023

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and rearranging captions and pictures. Unlike other versions of the model we use BERT for text encoder and SWIN transformer for image encoder.

ruCLIP-SB RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and re

5 Apr 13, 2022

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

Introduction This repository contains several materials that supplements the Spoofing-Aware Speaker Verification (SASV) Challenge 2022 including: calc

40 Dec 28, 2022

PyTorch implementation of an end-to-end Handwritten Text Recognition (HTR) system based on attention encoder-decoder networks

AttentionHTR PyTorch implementation of an end-to-end Handwritten Text Recognition (HTR) system based on attention encoder-decoder networks. Scene Text

31 Dec 22, 2022

CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

CvT2DistilGPT2 Improving Chest X-Ray Report Generation by Leveraging Warm-Starting This repository houses the implementation of CvT2DistilGPT2 from [1

21 Dec 28, 2022

EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

EncT5 (Unofficial) Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks About Finetune T5 model for classification & r

34 Jan 1, 2023

Custom 64 bit shellcode encoder that evades detection and removes some common badchars (\x00\x0a\x0d\x20)

x64-shellcode-encoder Custom 64 bit shellcode encoder that evades detection and removes some common badchars (\x00\x0a\x0d\x20) Usage Using a generato

2 Jan 26, 2022

Simple keras FCN Encoder/Decoder model for MS-COCO (food subset) segmentation

FCN_MSCOCO_Food_Segmentation Simple keras FCN Encoder/Decoder model for MS-COCO (food subset) segmentation Input data: [http://mscoco.org/dataset/#ove

11 Jan 8, 2019

Torch-ngp - A pytorch implementation of the hash encoder proposed in instant-ngp

HashGrid Encoder (WIP) A pytorch implementation of the HashGrid Encoder from ins

1k Jan 1, 2023

HistoSeg : Quick attention with multi-loss function for multi-structure segmentation in digital histology images

HistoSeg : Quick attention with multi-loss function for multi-structure segmentation in digital histology images Histological Image Segmentation This

11 Dec 16, 2022

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Cross-modal Retrieval using Transformer Encoder Reasoning Networks This project reimplements the idea from "Transformer Reasoning Network for Image-Te

5 Nov 5, 2022

Code and Resources for the Transformer Encoder Reasoning Network (TERN)

Transformer Encoder Reasoning Network Code for the cross-modal visual-linguistic retrieval method from "Transformer Reasoning Network for Image-Text M

53 Dec 30, 2022

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation Table of Contents: Introduction Project Structure Installation Datas

492 Dec 2, 2022

A Transformer Implementation that is easy to understand and customizable.

Simple Transformer I've written a series of articles on the transformer architecture and language models on Medium. This repository contains an implem

4 Jan 20, 2022

An executor that wraps 3D mesh models and encodes 3D content documents to d-dimension vector.

3D Mesh Encoder An Executor that receives Documents containing point sets data in its blob attribute, with shape (N, 3) and encodes it to embeddings o

11 Dec 14, 2022

A simple program to make MSI Modern 15 speaker and microphone mute led work.

MSI Modern 15 sound led fixup for linux A simple program to fix the MSI Modern 15 speaker and microphone mute LEDs. Installation Requirements pulsectl

4 Oct 18, 2022

A general-purpose encoder-decoder framework for Tensorflow

READ THE DOCUMENTATION CONTRIBUTING A general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summariz

5.5k Jan 7, 2023

A Python Script to convert Normal PNG Image to Apple iDOT PNG Image.

idot-png-encoder A Python Script to convert Normal PNG Image to Apple iDOT PNG Image (Multi-threaded Decoding PNG). Usage idotpngencoder.py -i inputf

2 Feb 17, 2022

IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation

IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation Independent Encoder for Deep

30 Nov 5, 2022

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning English | 中文 ❗ Now we provide inferencing code and pre-training models

164 Jan 2, 2023

Encode and decode text application

Text Encoder and Decoder Encode and decode text in many ways using this application! Encode in: ASCII85 Base85 Base64 Base32 Base16 Url MD5 Hash SHA-1

1 Feb 12, 2022

Pytorch implementation of Masked Auto-Encoder

Masked Auto-Encoder (MAE) Pytorch implementation of Masked Auto-Encoder: Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick

22 Dec 13, 2022

An executor that loads ONNX models and embeds documents using the ONNX runtime.

ONNXEncoder An executor that loads ONNX models and embeds documents using the ONNX runtime. Usage via Docker image (recommended) from jina import Flow

2 Mar 15, 2022

Python codecs extension featuring CLI tools for encoding/decoding anything

CodExt Encode/decode anything. This library extends the native codecs library (namely for adding new custom encodings and character mappings) and prov

210 Dec 30, 2022

A morse code encoder and decoder utility.

morsedecode A morse code encoder and decoder utility. Installation Install it via pip: pip install morsedecode Alternatively, you can use pipx to run

2 Dec 25, 2021

MakeItTalk: Speaker-Aware Talking-Head Animation

MakeItTalk: Speaker-Aware Talking-Head Animation This is the code repository implementing the paper: MakeItTalk: Speaker-Aware Talking-Head Animation

285 Jan 8, 2023

User-friendly Voice Cloning Application

Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transfering speaker-specific audio featur

19 Dec 30, 2022

UniSpeech - Large Scale Self-Supervised Learning for Speech

UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202

281 Dec 15, 2022

Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation This is the official PyTorch implementation

564 Jan 8, 2023

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis

1.2k Jan 3, 2023

A proof-of-concept implementation of a parallel-decodable PNG format

mtpng A parallelized PNG encoder in Rust by Brion Vibber [email protected] Background Compressing PNG files is a relatively slow operation at large imag

193 Dec 16, 2022

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-

4 May 8, 2022

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

1.4k Jan 1, 2023

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!

26 Dec 13, 2022

The Encoder Bot For Python

The_Encoder_Bot Configuration Add values in environment variables or add them in config.env.example and rename file to config.env. Basics API_ID - Get

8 Jan 1, 2022

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

bbc-speech-segmenter: Voice Activity Detection & Speaker Diarization A complete speech segmentation system using Kaldi and x-vectors for voice activit

16 Oct 27, 2022

Cross-Encoder-with-Bi-Encoder를 활용한 WebPage 데모

Retrieval_Streamlit_Demo Cross-Encoder-with-Bi-Encoder를 활용한

5 Dec 29, 2021

A short code in python, Enchpyter, is able to encrypt and decrypt words as you determine, of course

Enchpyter Enchpyter is a program do encrypt and decrypt any word you want (just letters). You enter how many letters jumps and write the word, so, the

2 Oct 10, 2022

A short code in python, Enchpyter, is able to encrypt and decrypt words as you determine, of course

Enchpyter Enchpyter is a program do encrypt and decrypt any word you want (just letters). You enter how many letters jumps and write the word, so, the

2 Dec 9, 2021

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

922 Dec 10, 2021

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri

390 Dec 29, 2022

Enchpyter, is able to encrypt and decrypt words as you determine, of course, according to the alphabet.

Enchpyter is a program do encrypt and decrypt any word you want (just letters). You enter how many letters jumps and write the word, so, the program encrypt for you in seconds.

2 Oct 10, 2022

Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".

:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

114 Dec 18, 2022

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

TensorFlow implementation of 3D Convolutional Neural Networks for Speaker Verification - Official Project Page - Pytorch Implementation This repositor

753 Dec 17, 2022

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter

47 Dec 25, 2022

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition"

Fine-tuning wav2vec2 for speaker recognition This is the code used to run the experiments in https://arxiv.org/abs/2109.15053. Detailed logs of each t

103 Dec 26, 2022

Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)

Transfer Learning for Text Classification with Tensorflow Tensorflow implementation of Semi-supervised Sequence Learning(https://arxiv.org/abs/1511.01

82 Oct 22, 2022

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".

LXMERT: Learning Cross-Modality Encoder Representations from Transformers Our servers break again :(. I have updated the links so that they should wor

838 Dec 19, 2022

A SIXEL encoder/decoder implementation derived from kmiya's sixel

libsixel What is this? This package provides encoder/decoder implementation for DEC SIXEL graphics, and some converter programs. (https://youtu.be/0Sa

2k Jan 9, 2023

Official Implementation of "Designing an Encoder for StyleGAN Image Manipulation"

Designing an Encoder for StyleGAN Image Manipulation (SIGGRAPH 2021) Recently, there has been a surge of diverse methods for performing image editing

749 Jan 9, 2023

Official Implementation for Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation We present a generic image-to-image translation framework, pixel2style2pixel (pSp

2.8k Dec 30, 2022

Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting (HMNet)

Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting (HMNet) Our paper: https://arxiv.org/abs/2111.13324 We will release the complet

15 Oct 17, 2022

For Tok-k passages that have passed through the Bi-Encoder Retrival, ReRank is performed using CrossEncoder.

Cross-Encoder-with-Bi-Encoder For Tok-k passages that have passed through the Bi-Encoder Retrival, ReRank is performed using CrossEncoder. Data Data u

7 Feb 9, 2022

This repo contains simple to use, pretrained/training-less models for speaker diarization.

PyDiar This repo contains simple to use, pretrained/training-less models for speaker diarization. Supported Models Binary Key Speaker Modeling Based o

12 Jan 20, 2022

A Telegram bot to convert videos into x265/x264 format via ffmpeg.

Video Encoder Bot A Telegram bot to convert videos into x265/x264 format via ffmpeg. Configuration Add values in environment variables or add them in

1 Mar 8, 2022

VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets

VGGVox models for speaker identification and verification This directory contains code to import and evaluate the speaker identification and verificat

338 Dec 27, 2022

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations Code repo for paper Trans-Encoder: Unsupervised sentence-pa

101 Dec 29, 2022

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.

M2MeT challenge baseline -- AliMeeting This project provides the baseline system recipes for the ICASSP 2020 Multi-channel Multi-party Meeting Transcr

81 Dec 8, 2022

Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

14 Sep 14, 2022

A telegram bot for compressing/encoding videos in h264 format.

Video-Encoder-Bot a telegram bot for compressing/encoding videos in h264 format. Configuration Add values in environment variables or add them in conf

61 Dec 29, 2022

Multiple implementations for abstractive text summurization , using google colab

Text Summarization models if you are able to endorse me on Arxiv, i would be more than glad https://arxiv.org/auth/endorse?x=FRBB89 thanks This repo i

463 Dec 26, 2022

The implementation of DeBERTa

DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis

1.2k Jan 6, 2023

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

AliceMind AliceMind: ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab This repository provides pre-trained encode

1.4k Jan 4, 2023

Torch implementation of various types of GAN (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN, LSGAN)

gans-collection.torch Torch implementation of various types of GANs (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN). Note that EBGAN and

53 Jan 22, 2022

An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

195 Dec 17, 2022

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022

Code for our paper "Multi-scale Guided Attention for Medical Image Segmentation"

Medical Image Segmentation with Guided Attention This repository contains the code of our paper: "'Multi-scale self-guided attention for medical image

394 Dec 28, 2022

A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Segnet is deep fully convolutional neural network architecture for semantic pixel-wise segmentation. This is implementation of http://arxiv.org/pdf/15

190 Dec 15, 2022

Implementation of SegNet: A Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-Wise Labelling

Caffe SegNet This is a modified version of Caffe which supports the SegNet architecture As described in SegNet: A Deep Convolutional Encoder-Decoder A

1.1k Jan 2, 2023

UNet model with VGG11 encoder pre-trained on Kaggle Carvana dataset

TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation By Vladimir Iglovikov and Alexey Shvets Introduction TernausNet is

1k Dec 28, 2022

Very simple encoding scheme that will encode data as a series of OwOs or UwUs.

OwO Encoder Very simple encoding scheme that will encode data as a series of OwOs or UwUs. The encoder is a simple state machine. Still needs a decode

1 Nov 15, 2021

Encode stuff with ducks!

Duckify Encoder Usage Download main.py and run it. main.py has an encoded version in encoded_main.py.txt. As A Module Download the duckify folder (or

2 Nov 15, 2021

Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION.

LiMuSE Overview Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION. LiMuSE explores group communication on a multi

Auditory Model and Cognitive Computing Lab

17 Oct 26, 2022

BaseCrack is a tool written in Python that can decode all alphanumeric base encoding schemes.

BaseCrack Decoder For Base Encoding Schemes BaseCrack is a tool written in Python that can decode all alphanumeric base encoding schemes. This tool ca

383 Dec 27, 2022

Demo of using Auto Encoder for Image Denoising

2 Aug 4, 2022

python scripts and other files to generate induction encoder PCBs in Kicad

induction_encoder python scripts and other files to generate induction encoder PCBs in Kicad Targeting the Renesas IPS2200 encoder chips.

8 Feb 16, 2022

Unofficial implement with paper SpeakerGAN: Speaker identification with conditional generative adversarial network

Introduction This repository is about paper SpeakerGAN , and is unofficially implemented by Mingming Huang ([email protected]), Tiezheng Wang (wtz920729

7 Jan 3, 2023

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us

5.1k Jan 2, 2023

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

277 Dec 31, 2022

jel - Japanese Entity Linker - is Bi-encoder based entity linker for japanese.

jel: Japanese Entity Linker jel - Japanese Entity Linker - is Bi-encoder based entity linker for japanese. Usage Currently, link and question methods

10 Jan 6, 2023

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Cross-Speaker-Emotion-Transfer - PyTorch Implementation PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Conditio

114 Jan 8, 2023

Python Speaker-encoder Resources

Python speaker-encoder Libraries

BankNote-Net: Open dataset and encoder model for assistive currency recognition

Comprehensive-E2E-TTS - PyTorch Implementation

A Persian Image Captioning model based on Vision Encoder Decoder Models of the transformers🤗.

Official Implementation of "Third Time's the Charm? Image and Video Editing with StyleGAN3" https://arxiv.org/abs/2201.13433

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

A very simple python script to encode and decode PowerShell one-liners.

Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion

Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

A U-Net combined with a variational auto-encoder that is able to learn conditional distributions over semantic segmentations.

Semantic Segmentation Suite in TensorFlow

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and rearranging captions and pictures. Unlike other versions of the model we use BERT for text encoder and SWIN transformer for image encoder.

Baseline for the Spoofing-aware Speaker Verification Challenge 2022

PyTorch implementation of an end-to-end Handwritten Text Recognition (HTR) system based on attention encoder-decoder networks

CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

Custom 64 bit shellcode encoder that evades detection and removes some common badchars (\x00\x0a\x0d\x20)

Simple keras FCN Encoder/Decoder model for MS-COCO (food subset) segmentation

Torch-ngp - A pytorch implementation of the hash encoder proposed in instant-ngp

HistoSeg : Quick attention with multi-loss function for multi-structure segmentation in digital histology images

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN). With use of Metric Learning and FAISS for fast similarity search on GPU

Code and Resources for the Transformer Encoder Reasoning Network (TERN)

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

A Transformer Implementation that is easy to understand and customizable.

An executor that wraps 3D mesh models and encodes 3D content documents to d-dimension vector.

A simple program to make MSI Modern 15 speaker and microphone mute led work.

A general-purpose encoder-decoder framework for Tensorflow

A Python Script to convert Normal PNG Image to Apple iDOT PNG Image.

IEGAN — Official PyTorch Implementation Independent Encoder for Deep Hierarchical Unsupervised Image-to-Image Translation

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning

Encode and decode text application

Pytorch implementation of Masked Auto-Encoder

An executor that loads ONNX models and embeds documents using the ONNX runtime.

Python codecs extension featuring CLI tools for encoding/decoding anything

A morse code encoder and decoder utility.

MakeItTalk: Speaker-Aware Talking-Head Animation

User-friendly Voice Cloning Application

UniSpeech - Large Scale Self-Supervised Learning for Speech

Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

A proof-of-concept implementation of a parallel-decodable PNG format

A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch

The Encoder Bot For Python

A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.

Cross-Encoder-with-Bi-Encoder를 활용한 WebPage 데모

A short code in python, Enchpyter, is able to encrypt and decrypt words as you determine, of course

A short code in python, Enchpyter, is able to encrypt and decrypt words as you determine, of course

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Enchpyter, is able to encrypt and decrypt words as you determine, of course, according to the alphabet.

Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".

Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition"

Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)

PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".

A SIXEL encoder/decoder implementation derived from kmiya's sixel

Official Implementation of "Designing an Encoder for StyleGAN Image Manipulation"

Official Implementation for Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

Hierarchical Motion Encoder-Decoder Network for Trajectory Forecasting (HMNet)

For Tok-k passages that have passed through the Bi-Encoder Retrival, ReRank is performed using CrossEncoder.

This repo contains simple to use, pretrained/training-less models for speaker diarization.

A Telegram bot to convert videos into x265/x264 format via ffmpeg.

VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.

Efficient Speech Processing Tookit for Automatic Speaker Recognition

A telegram bot for compressing/encoding videos in h264 format.

Multiple implementations for abstractive text summurization , using google colab

The implementation of DeBERTa

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Torch implementation of various types of GAN (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN, EBGAN, LSGAN)

An implementation of a sequence to sequence neural network using an encoder-decoder

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id