986 Repositories
Python Apple-voice-recognition Libraries
Saliency - Framework-agnostic implementation for state-of-the-art saliency methods (XRAI, BlurIG, SmoothGrad, and more).
Saliency Methods 🔴 Now framework-agnostic! (Example core notebook) 🔴 🔗 For further explanation of the methods and more examples of the resulting ma
Captcha-tensorflow - Image Captcha Solving Using TensorFlow and CNN Model. Accuracy 90%+
Captcha Solving Using TensorFlow Introduction Solve captcha using TensorFlow. Learn CNN and TensorFlow by a practical project. Follow the steps, run t
GestureSSD CBAM - A gesture recognition web system based on SSD and CBAM, using pytorch, flask and node.js
GestureSSD_CBAM A gesture recognition web system based on SSD and CBAM, using pytorch, flask and node.js SSD implementation is based on https://github
Numbers-parser - Python module for parsing Apple Numbers .numbers files
numbers-parser numbers-parser is a Python module for parsing Apple Numbers .numbers files. It supports Numbers files generated by Numbers version 10.3
Omdena-abuja-anpd - Automatic Number Plate Detection for the security of lives and properties using Computer Vision.
Omdena-abuja-anpd - Automatic Number Plate Detection for the security of lives and properties using Computer Vision.
VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech
Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning
Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning English | 中文 ❗ Now we provide inferencing code and pre-training models
IDCARD-VERIFYING-SYSTEM - The "IDCARD VERIFYING SYSTEM" uses the Google's latest version of Tesseract OCR[Optical Character Recognition]
IDCARD VERIFYING SYSTEM The "IDCARD VERIFYING SYSTEM" uses the Google's latest v
HAR-stacked-residual-bidir-LSTMs - Deep stacked residual bidirectional LSTMs for HAR
HAR-stacked-residual-bidir-LSTM The project is based on this repository which is presented as a tutorial. It consists of Human Activity Recognition (H
SpecAugmentPyTorch - A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
SpecAugment An implementation of SpecAugment for Pytorch How to use Install pytorch, version=1.9.0 (new feature (torch.Tensor.take_along_dim) is used
Face-Recognition-based-Attendance-System - An implementation of Attendance System in python.
Face-Recognition-based-Attendance-System A real time implementation of Attendance System in python. Pre-requisites To understand the implentation of F
Face_mosaic - Mosaic blur processing is applied to multiple faces appearing in the video
動機 face_recognitionを使用して得られる顔座標は長方形であり、この座標をそのまま用いてぼかし処理を行った場合得られる画像は醜い。 それに対してモ
“Robust Lightweight Facial Expression Recognition Network with Label Distribution Training”, AAAI 2021.
EfficientFace Zengqun Zhao, Qingshan Liu, Feng Zhou. "Robust Lightweight Facial Expression Recognition Network with Label Distribution Training". AAAI
The codebase for Data-driven general-purpose voice activity detection.
Data driven GPVAD Repository for the work in TASLP 2021 Voice activity detection in the wild: A data-driven approach using teacher-student training. S
A state-of-the-art semi-supervised method for image recognition
Mean teachers are better role models Paper ---- NIPS 2017 poster ---- NIPS 2017 spotlight slides ---- Blog post By Antti Tarvainen, Harri Valpola (The
Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition (NeurIPS 2019)
MLCR This is the source code for paper Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition. Xuesong Niu, Hu Han, Shiguang
Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model
Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model Baris Gecer 1, Binod Bhattarai 1
基于百度的语音识别,用python实现,pyaudio+pyqt
Speech-recognition 基于百度的语音识别,python3.8(conda)+pyaudio+pyqt+baidu-aip 百度有面向python
This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks
This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks ...
Pytorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021
SELF-ATTENTIVE VAD: CONTEXT-AWARE DETECTION OF VOICE FROM NOISE (ICASSP 2021) Pytorch implementation of SELF-ATTENTIVE VAD | Paper | Dataset Yong Rae
Rocks vc Userbot: A Telegram Bot Project That's Allow You To Play Audio And Video Music On Telegram Voice Chat Group
⭐️ Rocks VC Userbot ⭐️ Telegram Userbot To Play Audio And Video Song On VC Chat
A simple Python app to provide RPC for iTunes and the Music app. MacOS exclusive.
Ongaku You know, ongaku. A port of Ongaku to Python. Why? I don't know. A simple application providing the now playing state from iTunes (or the Music
Get notifications in your Discord server of any software releases from Apple.
Apple Releases Get notifications in your Discord server of any software releases from Apple. Running To locally host your own instance, create a Disco
This repository for project that can Automate Number Plate Recognition (ANPR) in Morocco Licensed Vehicles. 💻 + 🚙 + 🇲🇦 = 🤖 🕵🏻♂️
MoroccoAI Data Challenge (Edition #001) This Reposotory is result of our work in the comepetiton organized by MoroccoAI in the context of the first Mo
Discord Voice Channel Automatic Online
Discord-Selfbot-voice Features: Discord Voice Channel Automatic Online FAQ Q: How can I obtain my token? A: 1. How to obtain your token in android 2.
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.
Video Representation Learning by Recognizing Temporal Transformations [Project Page] Simon Jenni, Givi Meishvili, and Paolo Favaro. In ECCV, 2020. Thi
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
AdaFocusV2 This repo contains the official code and pre-trained models for AdaFo
Pytorch implementation of the AAAI 2022 paper "Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification"
[AAAI22] Cross-Domain Empirical Risk Minimization for Unbiased Long-tailed Classification We point out the overlooked unbiasedness in long-tailed clas
Scene-Text-Detection-and-Recognition (Pytorch)
Scene-Text-Detection-and-Recognition (Pytorch) Competition URL: https://tbrain.t
Indonesian Car License Plate Character Recognition using Tensorflow, Keras and OpenCV.
Monopol Indonesian Car License Plate (Indonesia Mobil Nomor Polisi) Character Recognition using Tensorflow, Keras and OpenCV. Background This applicat
Asad Alexa VC Bot Is A Telegram Bot Project That's Allow You To Play Audio And Video Music On Telegram Voice Chat Group.
Asad Alexa VC Bot Is A Telegram Bot Project That's Allow You To Play Audio And Video Music On Telegram Voice Chat Group.
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
English | 中文 Features 🌍 Chinese supported mandarin and tested with multiple datasets: aidatatang_200zh, magicdata, aishell3, data_aishell, and etc. ?
Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition (AGRA, ACM 2020, Oral)
Cross Domain Facial Expression Recognition Benchmark Implementation of papers: Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchm
Efficient face emotion recognition in photos and videos
This repository contains code of face emotion recognition that was developed in the RSF (Russian Science Foundation) project no. 20-71-10010 (Efficien
ANKIT-OS/TG-MUSIC-PLAYER a special repository. Its Is A Telegram Bot To Play To Play Music In Voice Chat
🔥 🎶 TG MUSIC PLAYER 🎶 🔥 The owner would not be responsible for any kind of bans due to the bot. • ⚡ INSTALLING ⚡ • • 🛠️ Lᴀɴɢᴜᴀɢᴇs Aɴᴅ Tᴏᴏʟs 🔰 •
EmoTag helps you train emotion detection model for Chinese audios
emoTag emoTag helps you train emotion detection model for Chinese audios. Environment pip install -r requirement.txt Data We used Emotional Speech Dat
The ability of computer software to identify words and phrases in spoken language and convert them to human-readable text
speech-recognition-py Speech recognition is the ability of computer software to identify words and phrases in spoken language and convert them to huma
PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.
VoiceLoop PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop. VoiceLoop is a n
Tool To download 4KHDR DV SDR from AppleTV
# APPLE-TV 4K Downloader Tool To download 4K HDR DV SDR from AppleTV Hello Fellow Developers/ ! Hi! My name is WVDUMP. I am Leaking the scripts to
Vietnamese Language Detection and Recognition
Table of Content Introduction (Khôi viết) Dataset (đổi link thui thành 3k5 ảnh mình) Getting Started (An Viết) Requirements Usage Example Training & E
Randomly selects two teams based on who is in a voice channel on Discord
TeamPickerDiscordBot Randomly selects two teams based on who is in a voice channel on Discord What I Learned The ins and outs of Python as this was my
Apple iTunes In-app purchase verification tool
itunes-iap v2 Python 2 & 3 compatible! Even with :mod:`asyncio` support! Source code: https://github.com/youknowone/itunes-iap Documentation: http://i
Python wrapper around Apple App Store Api
App Store Connect Api This is a Python wrapper around the Apple App Store Api : https://developer.apple.com/documentation/appstoreconnectapi So far, i
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.
Welcome to Spokestack Python! This library is intended for developing voice interfaces in Python. This can include anything from Raspberry Pi applicat
A repository that finds a person who looks like you by using face recognition technology.
Find Your Twin Hello everyone, I've always wondered how casting agencies do the casting for a scene where a certain actor is young or old for a movie
Official repository of the AAAI'2022 paper "Contrast and Generation Make BART a Good Dialogue Emotion Recognizer"
CoG-BART Contrast and Generation Make BART a Good Dialogue Emotion Recognizer Quick Start: To run the model on test sets of four datasets, Download th
Group Activity Recognition with Clustered Spatial Temporal Transformer
GroupFormer Group Activity Recognition with Clustered Spatial-TemporalTransformer Backbone Style Action Acc Activity Acc Config Download Inv3+flow+pos
SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model
SEOVER-Master This code is the implementation of paper: SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model
Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing
Contrast and Mix (CoMix) The repository contains the codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Backgroun
Okaeri-Music is a telegram music bot project, allow you to play music on voice chat group telegram.
🗄️ PROJECT MUSIC,THIS IS MAINTAINED Okaeri-Music is a telegram bot project that's allow you to play music on telegram voice chat group Features 🔥 Th
Built for streamlining development of Google Assistant Actions
Apprentice Apprentice is a framework built for developing Google Actions via Dialogflow and Google Cloud (serverless) Functions. Includes: plug-and-pl
Okaeri-Music is a telegram music bot project, allow you to play music on voice chat group telegram.
Okaeri-Music is a telegram bot project that's allow you to play music on telegram voice chat group
Pytorch implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
MOSNet pytorch implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion" https://arxiv.org/abs/1904.08352 Dependency L
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
An elaborate and exhaustive paper list for Named Entity Recognition (NER)
Named-Entity-Recognition-NER-Papers by Pengfei Liu, Jinlan Fu and other contributors. An elaborate and exhaustive paper list for Named Entity Recognit
Official Pytorch implementation for "End2End Occluded Face Recognition by Masking Corrupted Features, TPAMI 2021"
End2End Occluded Face Recognition by Masking Corrupted Features This is the Pytorch implementation of our TPAMI 2021 paper End2End Occluded Face Recog
Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"
Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" Pre-trained Deep Convo
PyTorch implementation of our paper: Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition
Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition, arxiv This is a PyTorch implementation of our paper. 1. Re
A Comparative Review of Recent Kinect-Based Action Recognition Algorithms (TIP2020, Matlab codes)
A Comparative Review of Recent Kinect-Based Action Recognition Algorithms This repo contains: the HDG implementation (Matlab codes) for 'Analysis and
Streaming over lightweight data transformations
Description Data augmentation libarary for Deep Learning, which supports images, segmentation masks, labels and keypoints. Furthermore, SOLT is fast a
This is an OverPowered Vc Music Player! Will work for you and play music in Voice Chatz
VcPlayer This is an OverPowered Vc Music Player! Will work for you and play music in Voice Chatz Telegram Voice-Chat Bot [PyTGCalls] ⇝ Requirements ⇜
A Python module made to simplify the usage of Text To Speech and Speech Recognition.
Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul
Using computer vision method to recognize and calcutate the features of the architecture.
building-feature-recognition In this repository, we accomplished building feature recognition using traditional/dl-assisted computer vision method. Th
Complete system for facial identity system
Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring
Voice package for Pycord adding extra features.
VoiceIO Voice package for Pycord adding extra features. Example Down bellow is an example of what you can currently do. import voiceio process = voic
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described in the paper.
Data Efficient Language-Supervised Zero-Shot Recognition with Optimal Transport Distillation This repository contains PyTorch evaluation code, trainin
An Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chat Of Both Groups & Channels. Supports Live Streams, YouTube Videos & Telegram Media !!
Telegram Video Stream Bot (Py-TgCalls) An Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chat Of Both Groups & Channels. Supports Live St
Chinese license plate recognition
AgentCLPR 简介 一个基于 ONNXRuntime、AgentOCR 和 License-Plate-Detector 项目开发的中国车牌检测识别系统。 车牌识别效果 支持多种车牌的检测和识别(其中单层车牌识别效果较好): 单层车牌: [[[[373, 282], [69, 284],
J.A.R.V.I.S is an AI virtual assistant made in python.
J.A.R.V.I.S is an AI virtual assistant made in python. Running JARVIS Without Python To run JARVIS without python: 1. Head over to our installation pa
Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier
LSTMs for Human Activity Recognition Human Activity Recognition (HAR) using smartphones dataset and an LSTM RNN. Classifying the type of movement amon
Wav2Vec for speech recognition, classification, and audio classification
Soxan در زبان پارسی به نام سخن This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your
Arabic speech recognition, classification and text-to-speech.
klaam Arabic speech recognition, classification and text-to-speech using many advanced models like wave2vec and fastspeech2. This repository allows tr
Testing the Facial Emotion Recognition (FER) algorithm on animations
PegHeads-Tutorial-3 Testing the Facial Emotion Recognition (FER) algorithm on animations
A Simple Voice Music Player
📀 𝐕𝐂𝐔𝐬𝐞𝐫𝐁𝐨𝐭 √𝙏𝙚𝙖𝙢✘𝙊𝙘𝙩𝙖𝙫𝙚 NOTE JUST AN ENGLISH VERSION OF OUR PRIVATE SOURCE WAIT FOR LATEST UPDATES JOIN @𝐒𝐔𝐏𝐏𝐎𝐑𝐓 JOIN @𝐂?
Create light scenes , voice control, ifttt, fuzzywuzzy speech correction and much more with Tuya light bulbs.
LightBox Features: Auto discover tuya lights Set and create moods (aka: light profiles) Change moods via IFTTT List moods via IFTTT FuzzyWuzzy, speech
The Delegate Network: An Interactive Voice Response Delegative Democracy Implementation of Liquid Democracy
The Delegate Network Overview The delegate network is a completely transparent, easy-to-use and understand version of what is sometimes called liquid
A bot that can play songs in Telegram group voice chats like AK 47
🎧 47Music Player 🎧 A bot that can play songs in Telegram group voice chats like AK 47 ✨ Easy To Deploy Pyrogram Session Config Vars API_ID : Assista
nofacedb/faceprocessor is a face recognition engine for NoFaceDB program complex.
faceprocessor nofacedb/faceprocessor is a face recognition engine for NoFaceDB program complex. Tech faceprocessor uses a number of open source projec
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism This repository is the official PyTorch implementation of our AAAI-2022 paper, in
Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.
Welcome to Healthsea ✨ Create better access to health with spaCy. Healthsea is a pipeline for analyzing user reviews to supplement products by extract
Tensorflow Implementation of ECCV'18 paper: Multimodal Human Motion Synthesis
MT-VAE for Multimodal Human Motion Synthesis This is the code for ECCV 2018 paper MT-VAE: Learning Motion Transformations to Generate Multimodal Human
Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring
Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring
Process RunGap output file of a workout and load data into Apple Numbers Spreadsheet and my website with API calls
BSD 3-Clause License Copyright (c) 2020, Mike Bromberek All rights reserved. ProcessWorkout Exercise data is exported in JSON format to iCloud using
Code of TIP2021 Paper《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet and Pytorch versions.
SFace Code of TIP2021 Paper 《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet, PyTorch and Jittor versi
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
User-friendly Voice Cloning Application
Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transfering speaker-specific audio featur
Craft PNG files that appear completely different in Apple software
Ambiguous PNG Packer Craft PNG files that appear completely different in Apple software
UniSpeech - Large Scale Self-Supervised Learning for Speech
UniSpeech The family of UniSpeech: WavLM (arXiv): WavLM: Large-Scale Self-Supervised Pre-training for Full Stack Speech Processing UniSpeech (ICML 202
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
SEW (Squeezed and Efficient Wav2vec) The repo contains the code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speec
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Hiring We are hiring at all levels (including FTE researchers and interns)! If you are interested in working with us on NLP and large-scale pre-traine
Awesome Treasure of Transformers Models Collection
💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️
A curated list of long-tailed recognition resources.
Awesome Long-tailed Recognition A curated list of long-tailed recognition and related resources. Please feel free to pull requests or open an issue to
Pytorch codes for Feature Transfer Learning for Face Recognition with Under-Represented Data
FTLNet_Pytorch Pytorch codes for Feature Transfer Learning for Face Recognition with Under-Represented Data 1. Introduction This repo is an unofficial
A Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Training Data》
RangeLoss Pytorch This is a Pytorch reproduction of Range Loss, which is proposed in paper 《Range Loss for Deep Face Recognition with Long-Tailed Trai
A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats
TG-MusicPlayer A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Py
Python functions for summarizing and improving voice dictation input.
Helpmespeak Help me speak uses Python functions for summarizing and improving voice dictation input. Get started with OpenAI gpt-3 OpenAI is a amazing
Software for Multimodalty 2D+3D Facial Expression Recognition (FER) UI
EmotionUI Software for Multimodalty 2D+3D Facial Expression Recognition (FER) UI. demo screenshot (with RealSense) required packages Python = 3.6 num
That Hash will name that hash type! Identify MD5, SHA256 and 300+ other hashes Comes with
Call for translators! We're looking for translators to help translate this spec for everyone! Read this documentation in the following languages 한국어 中