455 Repositories
Python single-speaker-tts Libraries
A Python module made to simplify the usage of Text To Speech and Speech Recognition.
Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in
Single Image Random Dot Stereogram for Tensorflow
TensorFlow-SIRDS Single Image Random Dot Stereogram for Tensorflow SIRDS is a means to present 3D data in a 2D image. It allows for scientific data di
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
Super Resolution Examples We run this script under TensorFlow 2.0 and the TensorLayer2.0+. For TensorLayer 1.4 version, please check release. 🚀 🚀 🚀
Arabic speech recognition, classification and text-to-speech.
klaam Arabic speech recognition, classification and text-to-speech using many advanced models like wave2vec and fastspeech2. This repository allows tr
A simple, personal chat program that runs on a single computer. No Internet, just you.
MultiChat A simple, personal chat program that runs on a single computer. No Internet, just you. Simple and Local MultiChat was created with ease of u
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data Overview Clustering analysis is widely utilized in single-cell RNA-seque
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in
Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image
Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image (Project page) Zhengqin Li, Mohammad Sha
InverseRenderNet: Learning single image inverse rendering, CVPR 2019.
InverseRenderNet: Learning single image inverse rendering !! Check out our new work InverseRenderNet++ paper and code, which improves the inverse rend
Learning 3D Part Assembly from a Single Image
Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single
High-Resolution 3D Human Digitization from A Single Image.
PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization (CVPR 2020) News: [2020/06/15] Demo with Google Colab (i
Learning View Priors for Single-view 3D Reconstruction (CVPR 2019)
Learning View Priors for Single-view 3D Reconstruction (CVPR 2019) This is code for a paper Learning View Priors for Single-view 3D Reconstruction by
Fused multiply-add (with a single rounding) for Python.
pyfma Fused multiply-add for Python. Fused multiply-add computes (x*y) + z with a single rounding. Useful for dot products, matrix multiplications, po
Code for Deep Single-image Portrait Image Relighting
Deep Single-Image Portrait Relighting [Project Page] Hao Zhou, Sunil Hadap, Kalyan Sunkavalli, David W. Jacobs. In ICCV, 2019 Overview Test script for
MakeItTalk: Speaker-Aware Talking-Head Animation
MakeItTalk: Speaker-Aware Talking-Head Animation This is the code repository implementing the paper: MakeItTalk: Speaker-Aware Talking-Head Animation
A simple, 2-person chat program that runs on a single computer. No Internet, just you
localChat A simple, 2-person chat program that runs on a single computer. No Internet, just you. Simple and Local This was created with ease of use in
User-friendly Voice Cloning Application
Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transfering speaker-specific audio featur
UniSpeech - Large Scale Self-Supervised Learning for Speech
UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202
A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities
MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-
Dynamic View Synthesis from Dynamic Monocular Video
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a
Paper and Codes for “Embracing Single Stride 3D Object Detector with Sparse Transformer”
SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer
Official implementation of the paper "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering"
Light Field Networks Project Page | Paper | Data | Pretrained Models Vincent Sitzmann*, Semon Rezchikov*, William Freeman, Joshua Tenenbaum, Frédo Dur
Embracing Single Stride 3D Object Detector with Sparse Transformer
SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer
A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution
DRSAN A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution Karam Park, Jae Woong Soh, and Nam Ik Cho Environments U
A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation.
bbc-speech-segmenter: Voice Activity Detection & Speaker Diarization A complete speech segmentation system using Kaldi and x-vectors for voice activit
Running a complete single-node all-in-one cluster instance of TIBCO ActiveMatrix™ BusinessWorks 6.8.0.
TIBCO ActiveMatrix™ BusinessWorks 6.8 Docker Image Image for running a complete single-node all-in-one cluster instance of TIBCO ActiveMatrix™ Busines
[NeurIPS 2020] This project provides a strong single-stage baseline for Long-Tailed Classification, Detection, and Instance Segmentation (LVIS).
A Strong Single-Stage Baseline for Long-Tailed Problems This project provides a strong single-stage baseline for Long-Tailed Classification (under Ima
Parallel TTS web demo based on Flask + Vue (Vuetify).
Parallel TTS web demo based on Flask + Vue (Vuetify).
An extremely simple package with a single utillity class used for gracefully handling POSIX shutdown signals.
graceful-killer An extremely simple package with a single utillity class used for gracefully handling POSIX shutdown signals. Installation Use pip to
Easy and secure implementation of Azure AD for your FastAPI APIs 🔒 Single- and multi-tenant support.
Easy and secure implementation of Azure AD for your FastAPI APIs 🔒 Single- and multi-tenant support.
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone In our recent paper we propose the YourTTS model. YourTTS bri
Pytorch code for "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks".
:speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
Functions to analyze Cell-ID single-cell cytometry data using python language.
PyCellID (building...) Functions to analyze Cell-ID single-cell cytometry data using python language. Dependecies for this project. attrs(=21.1.0) fo
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
TensorFlow implementation of 3D Convolutional Neural Networks for Speaker Verification - Official Project Page - Pytorch Implementation This repositor
Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.
Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets.
Ugly single sign-on for django projects only
django-usso Ugly single sign-on for django projects only. Do you have many django apps with different users? Do you want to use only one of those apps
The code for 'Deep Residual Fourier Transformation for Single Image Deblurring'
Deep Residual Fourier Transformation for Single Image Deblurring Xintian Mao, Yiming Liu, Wei Shen, Qingli Li and Yan Wang News 2021.12.5 Release Deep
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
PyTorch implementation of MICCAI 2018 paper "Liver Lesion Detection from Weakly-labeled Multi-phase CT Volumes with a Grouped Single Shot MultiBox Detector"
Grouped SSD (GSSD) for liver lesion detection from multi-phase CT Note: the MICCAI 2018 paper only covers the multi-phase lesion detection part of thi
Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features
MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter
Research code for the paper "Fine-tuning wav2vec2 for speaker recognition"
Fine-tuning wav2vec2 for speaker recognition This is the code used to run the experiments in https://arxiv.org/abs/2109.15053. Detailed logs of each t
Implementation of Google Brain's WaveGrad high-fidelity vocoder
WaveGrad Implementation (PyTorch) of Google Brain's high-fidelity WaveGrad vocoder (paper). First implementation on GitHub with high-quality generatio
Neural HMMs are all you need (for high-quality attention-free TTS)
Neural HMMs are all you need (for high-quality attention-free TTS) Shivam Mehta, Éva Székely, Jonas Beskow, and Gustav Eje Henter This is the official
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"
CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c
Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral)
Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral) Tianyu Wang*, Xiaowei Hu*, Chi-Wing Fu, and Pheng-Ann Hen
A single model that parses Universal Dependencies across 75 languages.
A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology tags, lemmas, and dependency trees.
Train Dense Passage Retriever (DPR) with a single GPU
Gradient Cached Dense Passage Retrieval Gradient Cached Dense Passage Retrieval (GC-DPR) - is an extension of the original DPR library. We introduce G
MonoScene: Monocular 3D Semantic Scene Completion
MonoScene: Monocular 3D Semantic Scene Completion MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page] Anh-Quan Cao, Rao
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"
CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c
Multi-path load balancing is a method used by most of the real-time network to split the packets into different paths rather than transferring it through a single path
Multipath-Load-Balancing Method of managing incoming traffic by distributing and sharing load fairly among multiple routes from source to destination
[3DV 2021] A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks
dispersion-score Official implementation of 3DV 2021 Paper A Dataset-dispersion Perspective on Reconstruction versus Recognition in Single-view 3D Rec
Code for Estimating Multi-cause Treatment Effects via Single-cause Perturbation (NeurIPS 2021)
Estimating Multi-cause Treatment Effects via Single-cause Perturbation (NeurIPS 2021) Single-cause Perturbation (SCP) is a framework to estimate the m
Python Single Object Tracking Evaluation
pysot-toolkit The purpose of this repo is to provide evaluation API of Current Single Object Tracking Dataset, including VOT2016 VOT2018 VOT2018-LT OT
Torch implementation of "Enhanced Deep Residual Networks for Single Image Super-Resolution"
NTIRE2017 Super-resolution Challenge: SNU_CVLab Introduction This is our project repository for CVPR 2017 Workshop (2nd NTIRE). We, Team SNU_CVLab, (B
Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust.
Subspace Adversarial Training Single-step adversarial training (AT) has received wide attention as it proved to be both efficient and robust. However,
This repo contains simple to use, pretrained/training-less models for speaker diarization.
PyDiar This repo contains simple to use, pretrained/training-less models for speaker diarization. Supported Models Binary Key Speaker Modeling Based o
Official pytorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"
SinGAN Project | Arxiv | CVF | Supplementary materials | Talk (ICCV`19) Official pytorch implementation of the paper: "SinGAN: Learning a Generative M
Text-to-Speech for Belarusian language
title emoji colorFrom colorTo sdk app_file pinned Belarusian TTS 🐸 green green gradio app.py false Belarusian TTS 📢 🤖 Belarusian TTS (text-to-speec
Official pytorch code for SSC-GAN: Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation(ICCV 2021)
SSC-GAN_repo Pytorch implementation for 'Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation'.PDF SSC-GAN:Sem
Automatically visualize your pandas dataframe via a single print! 📊 💡
A Python API for Intelligent Visual Discovery Lux is a Python library that facilitate fast and easy data exploration by automating the visualization a
A python gui program to generate reddit text to speech videos from the id of any post.
Reddit text to speech generator A python gui program to generate reddit text to speech videos from the id of any post. Current functionality Generate
An single python server emulator of MMORPG game WindSlayer also known as WS1.
PySlayer An single python server emulator of MMORPG game WindSlayer also known as WS1. Requirements Python = 3.7 Old windslayer client (Korea Yahoo!
This is the official repository for our paper: ''Pruning Self-attentions into Convolutional Layers in Single Path''.
Pruning Self-attentions into Convolutional Layers in Single Path This is the official repository for our paper: Pruning Self-attentions into Convoluti
The code for 'Deep Residual Fourier Transformation for Single Image Deblurring'
Deep Residual Fourier Transformation for Single Image Deblurring Xintian Mao, Yiming Liu, Wei Shen, Qingli Li and Yan Wang code will be released soon
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
English | 简体中文 Documentation: https://mmtracking.readthedocs.io/ Introduction MMTracking is an open source video perception toolbox based on PyTorch.
VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets
VGGVox models for speaker identification and verification This directory contains code to import and evaluate the speaker identification and verificat
A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run.
Minimal Hand A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run. This project provides the
This is the official repository for our paper: ''Pruning Self-attentions into Convolutional Layers in Single Path''.
Pruning Self-attentions into Convolutional Layers in Single Path This is the official repository for our paper: Pruning Self-attentions into Convoluti
The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenario.
M2MeT challenge baseline -- AliMeeting This project provides the baseline system recipes for the ICASSP 2020 Multi-channel Multi-party Meeting Transcr
Efficient Speech Processing Tookit for Automatic Speaker Recognition
Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for
A synchronous, single-threaded interface for starting processes on Linux
A synchronous, single-threaded interface for starting processes on Linux
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image [Project Page] [Paper] [Supp. Mat.] Table of Contents License Description Fittin
Single-Cell Analysis in Python. Scales to 1M cells.
Scanpy – Single-Cell Analysis in Python Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It inc
A single model for shaping, creating, accessing, storing data within a Database
'db' within pydantic - A single model for shaping, creating, accessing, storing data within a Database Key Features Integrated Redis Caching Support A
Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"
Scripts for "Current best-practices in single-cell RNA-seq: a tutorial" This repository is complementary to the publication: M.D. Luecken, F.J. Theis,
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features
CleanRL (Clean Implementation of RL Algorithms) CleanRL is a Deep Reinforcement Learning library that provides high-quality single-file implementation
TraSw for FairMOT - A Single-Target Attack example (Attack ID: 19; Screener ID: 24):
TraSw for FairMOT A Single-Target Attack example (Attack ID: 19; Screener ID: 24): Fig.1 Original Fig.2 Attacked By perturbing only two frames in this
Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA results for single-image motion deblurring, image deraining, image denoising (synthetic and real data), and dual-pixel defocus deblurring.
Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,
A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image.
Minimal Body A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image. The model file is only 51.2 MB and runs a
🔥3D-RecGAN in Tensorflow (ICCV Workshops 2017)
3D Object Reconstruction from a Single Depth View with Adversarial Learning Bo Yang, Hongkai Wen, Sen Wang, Ronald Clark, Andrew Markham, Niki Trigoni
Single/multi view image(s) to voxel reconstruction using a recurrent neural network
3D-R2N2: 3D Recurrent Reconstruction Neural Network This repository contains the source codes for the paper Choy et al., 3D-R2N2: A Unified Approach f
Torch Implementation of "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network"
Photo-Realistic-Super-Resoluton Torch Implementation of "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network" [Paper]
Subpopulation detection in high-dimensional single-cell data
PhenoGraph for Python3 PhenoGraph is a clustering method designed for high-dimensional single-cell data. It works by creating a graph ("network") repr
Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image
CenterPose Overview This repository is the official implementation of the paper "Single-stage Keypoint-based Category-level Object Pose Estimation fro
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL
Official implementation of Meta-StyleSpeech and StyleSpeech
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dongchan Min, Dong Bok Lee, Eunho Yang, and Sung Ju Hwang This is an official code
Learning Temporal Consistency for Low Light Video Enhancement from Single Images (CVPR2021)
StableLLVE This is a Pytorch implementation of "Learning Temporal Consistency for Low Light Video Enhancement from Single Images" in CVPR 2021, by Fan
A simple Python script to convert multiple images (well technically also a single image) into a pdf.
PythonImage2PDF A simple Python script to convert multiple images into a single PDF-document. Created basically for only my own needs for converting m
A relatively simple python program to generate one of those reddit text to speech videos dominating youtube.
Reddit text to speech generator A basic reddit tts video generator Current functionality Generate videos for subs based on comments,(askreddit) so rea
EZ graph is an easy to use AI solution that allows you to make and train your neural networks without a single line of code.
EZ-Graph EZ Graph is a GUI that allows users to make and train neural networks without writing a single line of code. Requirements python 3 pandas num
Models Supported: AlbUNet [18, 34, 50, 101, 152] (1D and 2D versions for Single and Multiclass Segmentation, Feature Extraction with supports for Deep Supervision and Guided Attention)
AlbUNet-1D-2D-Tensorflow-Keras This repository contains 1D and 2D Signal Segmentation Model Builder for AlbUNet and several of its variants developed
Important dataframe statistics with a single command
quick_eda Receiving dataframe statistics with one command Project description A python package for Data Scientists, Students, ML Engineers and anyone
Single machine, multiple cards training; mix-precision training; DALI data loader.
Template Script Category Description Category script comparison script train.py, loader.py for single-machine-multiple-cards training train_DP.py, tra
Reads Data from given Excel File and exports Single PDFs and a complete PDF grouped by Gateway
E-Shelter Excel2QR Reads Data from given Excel File and exports Single PDFs and a complete PDF grouped by Gateway Features Reads Excel 2021 Export Sin
Code and datasets for TPAMI 2021
SkeletonNet This repository constains the codes and ShapeNetV1-Surface-Skeleton,ShapNetV1-SkeletalVolume and 2d image datasets ShapeNetRendering. Plea
Implementation of TTS with combination of Tacotron2 and HiFi-GAN
Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t
Meta-TTS: Meta-Learning for Few-shot SpeakerAdaptive Text-to-Speech
Meta-TTS: Meta-Learning for Few-shot SpeakerAdaptive Text-to-Speech This repository is the official implementation of "Meta-TTS: Meta-Learning for Few
Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION.
LiMuSE Overview Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION. LiMuSE explores group communication on a multi
Azure Neural Speech Service TTS
Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file. Azure Neural Voices Text-To-Speech enables fluid, natural-sounding text to speech that matches the patterns and intonation of human voices.