1022 Python Speech-recognition Libraries

Paddle-Skeleton-Based-Action-Recognition - DecoupleGCN-DropGraph, ASGCN, AGCN, STGCN

Paddle-Skeleton-Action-Recognition DecoupleGCN-DropGraph, ASGCN, AGCN, STGCN. Yo

3 Nov 2, 2022

Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis

TweebankNLP This repo contains the new Tweebank-NER dataset and off-the-shelf Twitter-Stanza pipeline for state-of-the-art Tweet NLP, as described in

42 Jan 26, 2022

TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition Xue, Wenyuan, et al. "TGRNet: A Table Graph Reconstruction Network for Ta

68 Jan 4, 2023

This is the repo of the manuscript "Dual-branch Attention-In-Attention Transformer for speech enhancement"

DB-AIAT: A Dual-branch attention-in-attention transformer for single-channel SE

68 Dec 16, 2022

RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

RoNER RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2. It is meant to be an easy to use, hi

9 Nov 7, 2022

FocusFace: Multi-task Contrastive Learning for Masked Face Recognition

FocusFace This is the official repository of "FocusFace: Multi-task Contrastive Learning for Masked Face Recognition" accepted at IEEE International C

21 Nov 17, 2022

Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications

Labelbox Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications. Use this github repository to help you s

1.7k Dec 29, 2022

Bot by image recognition simulating (random) human clicks

bbbot22 bot por reconhecimento de imagem simulando cliques humanos (aleatórios) inb4: sim, esse é basicamente o mesmo bot de 2021 porque a Globo não t

2 Apr 5, 2022

Simple and understandable swin-transformer OCR project

swin-transformer-ocr ocr with swin-transformer Overview Simple and understandable swin-transformer OCR project. The model in this repository heavily r

67 Dec 31, 2022

Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks

Introduction This repository contains the modified caffe library and network architectures for our paper "Automated Melanoma Recognition in Dermoscopy

47 Nov 24, 2022

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

ademxapp Visual applications by the University of Adelaide In designing our Model A, we did not over-optimize its structure for efficiency unless it w

338 Dec 12, 2022

Some utils for auto speech recognition

About Some utils for auto speech recognition. Utils Util Description Script Reset audio Reset sample rate, sample width, etc of audios.

1 Jan 24, 2022

Exadel CompreFace is a free and open-source face recognition GitHub project

Exadel CompreFace is a leading free and open-source face recognition system Exadel CompreFace is a free and open-source face recognition service that

2.6k Jan 4, 2023

Stanford CoreNLP provides a set of natural language analysis tools written in Java

Stanford CoreNLP Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and giv

8.8k Jan 7, 2023

MasterDuel Image Recognition Translation Command Line Tool

MasterDuelTranslate(Use Ygo Card DataBase,belong win32 window shot & image match)

77 Dec 1, 2022

Simple Text-To-Speech Bot For Discord

Simple Text-To-Speech Bot For Discord This is a very simple TTS bot for discord made with python. For this bot you need FFMPEG, see installation to se

1 Sep 26, 2022

Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features

Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | paper | Official PyTorch implementation for Mul

48 Dec 28, 2022

Applying CLIP to Point Cloud Recognition.

PointCLIP: Point Cloud Understanding by CLIP This repository is an official implementation of the paper 'PointCLIP: Point Cloud Understanding by CLIP'

175 Dec 24, 2022

Local-Global Stratified Transformer for Efficient Video Recognition

DualFormer This repo is the implementation of our manuscript entitled "Local-Global Stratified Transformer for Efficient Video Recognition". Our model

19 Dec 7, 2022

Meta Self-learning for Multi-Source Domain Adaptation： A Benchmark

Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods

188 Dec 12, 2022

A real-time speech emotion recognition application using Scikit-learn and gradio

Speech-Emotion-Recognition-App A real-time speech emotion recognition application using Scikit-learn and gradio. Requirements librosa==0.6.3 numpy sou

6 Oct 4, 2022

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022

TweebankNLP - Pre-trained Tweet NLP Pipeline (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Models + Tweebank-NER

TweebankNLP This repo contains the new Tweebank-NER dataset and Twitter-Stanza p

84 Dec 20, 2022

The Face Mask recognition system uses AI technology to detect the person with or without a mask.

Face Mask Detection Face Mask Detection system built with OpenCV, Keras/TensorFlow using Deep Learning and Computer Vision concepts in order to detect

4 Apr 5, 2022

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

16 Dec 30, 2022

Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch

Multimodal Temporal Context Network (MTCN) This repository implements the model proposed in the paper: Evangelos Kazakos, Jaesung Huh, Arsha Nagrani,

13 Nov 24, 2022

Python Computer Vision from Scratch

This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos.

221 Dec 26, 2022

An image processing project uses Viola-jones technique to detect faces and then use SIFT algorithm for recognition.

Attendance_System An image processing project uses Viola-jones technique to detect faces and then use LPB algorithm for recognition. Face Detection Us

8 Jan 11, 2022

A CNN model to detect hand gestures.

Software Used python - programming language used, tested on v3.8 miniconda - for managing virtual environment Libraries Used opencv - pip install open

6 Jul 14, 2022

Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

27 Nov 7, 2022

Official implementation of Sparse Transformer-based Action Recognition

STAR Official implementation of S parse T ransformer-based A ction R ecognition Dataset download NTU RGB+D 60 action recognition of 2D/3D skeleton fro

15 Nov 2, 2022

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition

17 Dec 12, 2022

Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign language recognition, and full-body gesture control.

Pose Detection Project Description: Human pose estimation from video plays a critical role in various applications such as quantifying physical exerci

2 Jan 17, 2022

Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

3 Mar 30, 2022

As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).

HAKE-Action HAKE-Action (TensorFlow) is a project to open the SOTA action understanding studies based on our Human Activity Knowledge Engine. It inclu

94 Nov 18, 2022

Action Recognition for Self-Driving Cars

Action Recognition for Self-Driving Cars This repo contains the codes for the 2021 Fall semester project "Action Recognition for Self-Driving Cars" at

3 Apr 7, 2022

A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

DeepKE is a knowledge extraction toolkit supporting low-resource and document-level scenarios for entity, relation and attribute extraction. We provide comprehensive documents, Google Colab tutorials, and online demo for beginners.

1.6k Jan 5, 2023

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition Introduction Run attack: SGADV.py Objective function: foolbox/attacks/gradi

1 Jul 18, 2022

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus CVSS is a massively multilingual-to-English speech-to-speech translation corpus, co

26 Jan 16, 2022

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning (TPAMI 2022 - in submission)

42 Dec 6, 2022

How to detect objects in real time by using Jupyter Notebook and Neural Networks , by using Yolo3

Real Time Object Recognition From your Screen Desktop . In this post, I will explain how to build a simply program to detect objects from you desktop

2 Sep 28, 2022

IA for recognising Traffic Signs using Keras [Tensorflow]

Traffic Signs Recognition ⚠️ 🚦 Fundamentals of Intelligent Systems Introduction 📄 Development of a neural network capable of recognizing nine differ

2 Dec 19, 2022

A machine learning project which can detect and predict the skin disease through image recognition.

ML-Project-2021 A machine learning project which can detect and predict the skin disease through image recognition. The dataset used for this is the H

1 Jan 13, 2022

Face Recognition & AI Based Smart Attendance Monitoring System.

In today’s generation, authentication is one of the biggest problems in our society. So, one of the most known techniques used for authentication is h

1 Jan 14, 2022

Food recognition model using convolutional neural network & computer vision

Food recognition model using convolutional neural network & computer vision. The goal is to match or beat the DeepFood Research Paper

1 Jan 13, 2022

Object recognition using Azure Custom Vision AI and Azure Functions

Step by Step on how to create an object recognition model using Custom Vision, export the model and run the model in an Azure Function

11 Jul 8, 2022

2021 AI CUP Competition on Traditional Chinese Scene Text Recognition - Intermediate Contest

繁體中文場景文字辨識程式碼說明組別：這就是我成員：蔣明憲唐碩謙黃玥菱林冠霆蕭靖騰目錄環境套件安裝方式資料夾布局前處理-製作偵測訓練註解檔前處理-製作分類訓練樣本 part.py ：從 json 裁切出分類訓練樣本 Class.py ：將切出來的樣本按照文字分類到各資料夾

3 Jan 14, 2022

Image Recognition Model Generator

Takes a user-inputted query and generates a machine learning image recognition model that determines if an inputted image is or isn't their query

1 Jan 13, 2022

BERN2: an advanced neural biomedical namedentity recognition and normalization tool

BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by

20 Jan 13, 2022

👄 The most accurate natural language detection library for Python, suitable for long and short text alike

1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr

334 Dec 30, 2022

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus CVSS is a massively multilingual-to-English speech-to-speech translation corpus, co

118 Jan 6, 2023

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition Usage First, install PyTorch 1.7.1+, torchvision 0.8.2

40 Dec 12, 2022

I-Spy is a discord and twitter bot 🤖 that keeps a check on usage foul language, hate-speech and NSFW contents

I-Spy is a discord and twitter bot 🤖 that keeps a check on usage foul language, hate-speech and NSFW contents. It is the one stop solution to monitor your discord servers and twitter handles against community demons by offering content moderation.

5 Nov 16, 2022

PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

Lip to Speech Synthesis with Visual Context Attentional GAN This repository contains the PyTorch implementation of the following paper: Lip to Speech

6 Nov 2, 2022

The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t

20 Jan 3, 2023

HuSpaCy: industrial-strength Hungarian natural language processing

HuSpaCy: Industrial-strength Hungarian NLP HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing faciliti

120 Dec 14, 2022

BERN2: an advanced neural biomedical namedentity recognition and normalization tool

BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by

13 Jan 11, 2022

Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Convolutional Recurrent Neural Network + CTCLoss | STAR-Net Code for paper "Towards Boosting the Accuracy of Non-Latin Scene Text Recognition" Depende

7 Aug 7, 2022

Computer Vision and Pattern Recognition, NUS CS4243, 2022

CS4243_2022 Computer Vision and Pattern Recognition, NUS CS4243, 2022 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : h

142 Dec 15, 2022

People tracker on the Internet: OSINT analysis and research tool by Jose Pino

trape (stable) v2.0 People tracker on the Internet: Learn to track the world, to avoid being traced. Trape is an OSINT analysis and research tool, whi

7.3k Dec 30, 2022

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 Tensorflow 2.0

NLP-Models-Tensorflow, Gathers machine learning and tensorflow deep learning models for NLP problems, code simplify inside Jupyter Notebooks 100%. Tab

1.7k Dec 30, 2022

This is a simple face recognition mini project that was completed by a team of 3 members in 1 week's time

PeekingDuckling 1. Description This is an implementation of facial identification algorithm to detect and identify the faces of the 3 team members Cla

2 Jan 25, 2022

Fancy console logger and wise assistant within your python projects

Fancy console logger and wise assistant within your python projects. Made to save tons of hours for common routines.

5 Apr 1, 2022

Implementation of paper "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement"

DCS-Net This is the implementation of "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement" Steps to run the model Edit V

10 Apr 4, 2022

End-to-end Temporal Action Detection with Transformer. [Under review]

TadTR: End-to-end Temporal Action Detection with Transformer By Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Song Bai, Xiang Bai. This repo holds the c

105 Dec 25, 2022

End-to-end speech secognition toolkit

End-to-end speech secognition toolkit This is an E2E ASR toolkit modified from Espnet1 (version 0.9.9). This is the official implementation of paper:

147 Dec 28, 2022

Speech Recognition Database Management with python

Speech Recognition Database Management The main aim of this project is to recogn

2 Feb 2, 2022

The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals

Wearables Development Toolkit (WDK) The Wearables Development Toolkit (WDK) is a framework and set of tools to facilitate the iterative development of

114 Nov 27, 2022

This is a Python wrapper for TA-LIB based on Cython instead of SWIG.

TA-Lib This is a Python wrapper for TA-LIB based on Cython instead of SWIG. From the homepage: TA-Lib is widely used by trading software developers re

7.3k Jan 3, 2023

Speech to text streamlit app

Speech to text Streamlit-app! 👄 This speech to text recognition is powered by t

9 Jan 1, 2023

Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Sign Language Recognition Service This is a Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform s

1 Jan 8, 2022

Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

Spchcat Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi. Description spchcat is a command-line tool that read

279 Jan 3, 2023

A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

431 Jan 7, 2023

SphereFace: Deep Hypersphere Embedding for Face Recognition

SphereFace: Deep Hypersphere Embedding for Face Recognition By Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj and Le Song License SphereFa

1.5k Dec 29, 2022

Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch

Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch. Topics: Face detection with Detectron 2, Time Series anomaly detection with LSTM Autoencoders, Object Detection with YOLO v5, Build your first Neural Network, Time Series forecasting for Coronavirus daily cases, Sentiment Analysis with BERT.

1.8k Dec 31, 2022

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition PDF Abstract Explainable artificial intelligence has been gaining attention

87 Dec 26, 2022

Code for paper [ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot] (ICCV 2021, oral))

ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot This repository is the official PyTorch implementation of ICCV-21 pape

21 May 9, 2022

The official code for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".

SpeechDrivesTemplates The official repo for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates". [arxiv

53 Dec 23, 2022

Learnable Boundary Guided Adversarial Training (ICCV2021)

Learnable Boundary Guided Adversarial Training This repository contains the implementation code for the ICCV2021 paper: Learnable Boundary Guided Adve

27 Sep 25, 2022

A new codebase for Group Activity Recognition. It contains codes for ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition and some other methods.

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition The source codes for ICCV2021 Paper: Spatio-Temporal Dynamic Inference Networ

40 Dec 12, 2022

This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021)

Elaborative Rehearsal for Zero-shot Action Recognition This is an official implementation of: Shizhe Chen and Dong Huang, Elaborative Rehearsal for Ze

26 Sep 24, 2022

An automated facial recognition based attendance system (desktop application)

Facial_Recognition_based_Attendance_System An automated facial recognition based attendance system (desktop application) Made using Python, Tkinter an

1 Jun 21, 2022

Graphical Password Authentication System.

Graphical Password Authentication System. This is used to increase the protection/security of a website. Our system is divided into further 4 layers of protection. Each layer is totally different and diverse than the others. This not only increases protection, but also makes sure that no non-human can log in to your account using different activities such as Brute Force Algorithm and so on.

12 Dec 16, 2022

BERN2: an advanced neural biomedical namedentity recognition and normalization tool

BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization

99 Jan 6, 2023

Employee-Managment - Company employee registration software in the face recognition system

Employee-Managment Company employee registration software in the face recognitio

7 Jul 10, 2022

Using BERT+Bi-LSTM+CRF

Chinese Medical Entity Recognition Based on BERT+Bi-LSTM+CRF Step 1 I share the dataset on my google drive, please download the whole 'CCKS_2019_Task1

55 Dec 21, 2022

ConferencingSpeech2022; Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge

ConferencingSpeech 2022 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech 2022 challenge. For more

21 Dec 2, 2022

DeepMusic is an easy to use Spotify like app to manage and listen to your favorites musics.

DeepMusic is an easy to use Spotify like app to manage and listen to your favorites musics. Technically, this project is an Android Client and its ent

1 Jul 12, 2021

A lightweight face-recognition toolbox and pipeline based on tensorflow-lite

FaceIDLight 📘 Description A lightweight face-recognition toolbox and pipeline based on tensorflow-lite with MTCNN-Face-Detection and ArcFace-Face-Rec

16 Dec 7, 2022

Automagically synchronize subtitles with video.

FFsubsync Language-agnostic automatic synchronization of subtitles with video, so that subtitles are aligned to the correct starting point within the

5.7k Jan 6, 2023

Personals scripts using ageitgey/face_recognition

HOW TO USE pip3 install requirements.txt Add some pictures of known people in the folder 'people' : a) Create a folder called by the name of the perso

1 Jan 6, 2022

NLP techniques such as named entity recognition, sentiment analysis, topic modeling, text classification with Python to predict sentiment and rating of drug from user reviews.

This file contains the following documents sumbited for Baruch CIS9665 group 9 fall 2021. 1. Dataset: drug_reviews.csv 2. python codes for text classi

2 Jan 4, 2023

Python Speech-recognition Resources

Python speech-recognition Libraries

Paddle-Skeleton-Based-Action-Recognition - DecoupleGCN-DropGraph, ASGCN, AGCN, STGCN

Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis

TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

This is the repo of the manuscript "Dual-branch Attention-In-Attention Transformer for speech enhancement"

RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

FocusFace: Multi-task Contrastive Learning for Masked Face Recognition

Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications

Bot by image recognition simulating (random) human clicks

Simple and understandable swin-transformer OCR project

Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

Some utils for auto speech recognition

Exadel CompreFace is a free and open-source face recognition GitHub project

Stanford CoreNLP provides a set of natural language analysis tools written in Java

MasterDuel Image Recognition Translation Command Line Tool

Simple Text-To-Speech Bot For Discord

Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features

Applying CLIP to Point Cloud Recognition.

Local-Global Stratified Transformer for Efficient Video Recognition

Meta Self-learning for Multi-Source Domain Adaptation： A Benchmark

A real-time speech emotion recognition application using Scikit-learn and gradio

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

TweebankNLP - Pre-trained Tweet NLP Pipeline (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Models + Tweebank-NER

The Face Mask recognition system uses AI technology to detect the person with or without a mask.

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

Implementation of "With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition, BMVC, 2021" in PyTorch

Python Computer Vision from Scratch

An image processing project uses Viola-jones technique to detect faces and then use SIFT algorithm for recognition.

A CNN model to detect hand gestures.

Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

Official implementation of Sparse Transformer-based Action Recognition

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign language recognition, and full-body gesture control.

Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).

Action Recognition for Self-Driving Cars

A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

How to detect objects in real time by using Jupyter Notebook and Neural Networks , by using Yolo3

IA for recognising Traffic Signs using Keras [Tensorflow]

A machine learning project which can detect and predict the skin disease through image recognition.

Face Recognition & AI Based Smart Attendance Monitoring System.

Food recognition model using convolutional neural network & computer vision

Object recognition using Azure Custom Vision AI and Azure Functions

2021 AI CUP Competition on Traditional Chinese Scene Text Recognition - Intermediate Contest

Image Recognition Model Generator

BERN2: an advanced neural biomedical namedentity recognition and normalization tool

👄 The most accurate natural language detection library for Python, suitable for long and short text alike

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition

I-Spy is a discord and twitter bot 🤖 that keeps a check on usage foul language, hate-speech and NSFW contents

PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

HuSpaCy: industrial-strength Hungarian natural language processing

BERN2: an advanced neural biomedical namedentity recognition and normalization tool

Towards Boosting the Accuracy of Non-Latin Scene Text Recognition

Computer Vision and Pattern Recognition, NUS CS4243, 2022

People tracker on the Internet: OSINT analysis and research tool by Jose Pino

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 Tensorflow 2.0

This is a simple face recognition mini project that was completed by a team of 3 members in 1 week's time

Fancy console logger and wise assistant within your python projects

Implementation of paper "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement"

End-to-end Temporal Action Detection with Transformer. [Under review]

End-to-end speech secognition toolkit

Speech Recognition Database Management with python

The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals

This is a Python wrapper for TA-LIB based on Cython instead of SWIG.

Speech to text streamlit app

Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

A self-supervised learning framework for audio-visual speech

SphereFace: Deep Hypersphere Embedding for Face Recognition

Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

Code for paper [ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot] (ICCV 2021, oral))

The official code for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".