1737 Repositories
Python multi-face-recognition Libraries
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training [Arxiv] VideoMAE: Masked Autoencoders are Data-Efficient Learne
Code and pre-trained models for MultiMAE: Multi-modal Multi-task Masked Autoencoders
MultiMAE: Multi-modal Multi-task Masked Autoencoders Roman Bachmann*, David Mizrahi*, Andrei Atanov, Amir Zamir Website | arXiv | BibTeX Official PyTo
Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
📖 Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022) 🔥 If DaGAN is helpful in your photos/projects, please hel
"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)
MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (CVPRW 2022) Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Z
The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.
OC-SORT Observation-Centric SORT (OC-SORT) is a pure motion-model-based multi-object tracker. It aims to improve tracking robustness in crowded scenes
Ridogram is an advanced multi-featured Telegram UserBot.
Ridogram Ridogram is an advanced multi-featured Telegram UserBot. String Session Collect String Session by running python3 stringsession.py locally or
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
[CVPR2022] Thin-Plate Spline Motion Model for Image Animation Source code of the CVPR'2022 paper "Thin-Plate Spline Motion Model for Image Animation"
Official repository accompanying a CVPR 2022 paper EMOCA: Emotion Driven Monocular Face Capture And Animation. EMOCA takes a single image of a face as input and produces a 3D reconstruction. EMOCA sets the new standard on reconstructing highly emotional images in-the-wild
EMOCA: Emotion Driven Monocular Face Capture and Animation Radek Daněček · Michael J. Black · Timo Bolkart CVPR 2022 This repository is the official i
This is the code for the paper "Jinkai Zheng, Xinchen Liu, Wu Liu, Lingxiao He, Chenggang Yan, Tao Mei: Gait Recognition in the Wild with Dense 3D Representations and A Benchmark. (CVPR 2022)"
Gait3D-Benchmark This is the code for the paper "Jinkai Zheng, Xinchen Liu, Wu Liu, Lingxiao He, Chenggang Yan, Tao Mei: Gait Recognition in the Wild
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception
Versatile Multi-Modal Pre-Training for Human-Centric Perception Fangzhou Hong1 Liang Pan1 Zhongang Cai1,2,3 Ziwei Liu1* 1S-Lab, Nanyang Technologic
[CVPR2022] This repository contains code for the paper "Nested Collaborative Learning for Long-Tailed Visual Recognition", published at CVPR 2022
Nested Collaborative Learning for Long-Tailed Visual Recognition This repository is the official PyTorch implementation of the paper in CVPR 2022: Nes
A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.
Crosslingual Coreference Coreference is amazing but the data required for training a model is very scarce. In our case, the available training for non
Learn the basics of Python. These tutorials are for Python beginners. so even if you have no prior knowledge of Python, you won’t face any difficulty understanding these tutorials.
01_Python_Introduction Introduction 👋 Python is a modern, robust, high level programming language. It is very easy to pick up even if you are complet
This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.
Python_Natural_Language_Processing This repository contains tutorials on important topics related to Natural Language Processing (NPL). No. Name 01 01
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:
Multi-Type-TD-TSR Check it out on Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for
Face recognition system using MTCNN, FACENET, SVM and FAST API to track participants of Big Brother Brasil in real time.
BBB Face Recognizer Face recognition system using MTCNN, FACENET, SVM and FAST API to track participants of Big Brother Brasil in real time. Instalati
BankNote-Net: Open dataset and encoder model for assistive currency recognition
BankNote-Net: Open Dataset for Assistive Currency Recognition Millions of people around the world have low or no vision. Assistive software applicatio
Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.
Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog
Comprehensive-E2E-TTS - PyTorch Implementation
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS
Use Python, OpenCV, and MediaPipe to control a keyboard with facial gestures
CheekyKeys A Face-Computer Interface CheekyKeys lets you control your keyboard using your face. View a fuller demo and more background on the project
Face and Pose detector that emits MQTT events when a face or human body is detected and not detected.
Face Detect MQTT Face or Pose detector that emits MQTT events when a face or human body is detected and not detected. I built this as an alternative t
Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch
Automatic Number Plate Recognition Automatic Number Plate Recognition (ANPR) is the process of reading the characters on the plate with various optica
A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework".
Mugs: A Multi-Granular Self-Supervised Learning Framework This is a PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-
A library for building and serving multi-node distributed faiss indices.
About Distributed faiss index service. A lightweight library that lets you work with FAISS indexes which don't fit into a single server memory. It fol
A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.
Poisson Image Editing - A Parallel Implementation Jiayi Weng (jiayiwen), Zixu Chen (zixuc) Poisson Image Editing is a technique that can fuse two imag
MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.
Multi-Graph Fusion Networks for Urban Region Embedding (IJCAI-22) This is the implementation of Multi-Graph Fusion Networks for Urban Region Embedding
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
MTFAA-Net Unofficial PyTorch implementation of Baidu's MTFAA-Net: "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speec
[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.
OpenCOOD OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV
The most Advanced yet simple Multi Cloud tool to transfer Your Data from any cloud to any cloud remotely based on Rclone.⚡
Multi Cloud Transfer (Advanced!) 🔥 1.Setup and Start using Rclone on Google Colab and Create/Edit/View and delete your Rclone config file and keep th
Optical Character Recognition + Instance Segmentation for russian and english languages
Распознавание рукописного текста в школьных тетрадях Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS. Результаты
Arabic Car License Recognition. A solution to the kaggle competition Machathon 3.0.
Transformers Arabic licence plate recognition 🚗 Solution to the kaggle competition Machathon 3.0. Ranked in the top 6️⃣ at the final evaluation phase
HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools
HuggingSound HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools. I have no intention of building a very complex tool here.
Optical character recognition for Japanese text, with the main focus being Japanese manga
Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran
Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP
mmc installation git clone https://github.com/dmarx/Multi-Modal-Comparators cd 'Multi-Modal-Comparators' pip install poetry poetry build pip install d
Pytorch re-implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition (CVPR 2022)
SwinTextSpotter This is the pytorch implementation of Paper: SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text R
Ridogram is an advanced multi-featured Telegram UserBot.
Ridogram Ridogram is an advanced multi-featured Telegram UserBot. String Session Collect String Session by running python3 stringsession.py locally or
Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions This repo contains the dataset and code for the paper Benchmarking Ro
[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"
Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c
FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)
FaceVerse FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset Lizhen Wang, Zhiyuan Chen, Tao Yu, Chenguang
torchlm is aims to build a high level pipeline for face landmarks detection, it supports training, evaluating, exporting, inference(Python/C++) and 100+ data augmentations
💎A high level pipeline for face landmarks detection, supports training, evaluating, exporting, inference and 100+ data augmentations, compatible with torchvision and albumentations, can easily install with pip.
Implementation of CaiT models in TensorFlow and ImageNet-1k checkpoints. Includes code for inference and fine-tuning.
CaiT-TF (Going deeper with Image Transformers) This repository provides TensorFlow / Keras implementations of different CaiT [1] variants from Touvron
Trajectory Variational Autoencder baseline for Multi-Agent Behavior challenge 2022
MABe_2022_TVAE: a Trajectory Variational Autoencoder baseline for the 2022 Multi-Agent Behavior challenge This repository contains jupyter notebooks t
The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for LiDAR-Based Place Recognition.
OverlapTransformer The code for our paper submitted to RAL/IROS 2022: OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for
ACL 2022: CAKE: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion
CAKE ACL 2022: CAKE: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion Introduction This is the PyTorch implementation
Contains the code and data for our #ICSE2022 paper titled as "CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Naming Sequences"
CodeFill This repository contains the code for our paper titled as "CodeFill: Multi-token Code Completion by Jointly Learning from Structure and Namin
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis (CVPR2022)
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis Multi-View Consistent Generative Adversarial Networks for 3D-aware
Enterprise Scale NLP with Hugging Face & SageMaker Workshop series
Workshop: Enterprise-Scale NLP with Hugging Face & Amazon SageMaker Earlier this year we announced a strategic collaboration with Amazon to make it ea
HF's ML for Audio study group
Hugging Face Machine Learning for Audio Study Group Welcome to the ML for Audio Study Group. Through a series of presentations, paper reading and disc
code for Image Manipulation Detection by Multi-View Multi-Scale Supervision
MVSS-Net Code and models for ICCV 2021 paper: Image Manipulation Detection by Multi-View Multi-Scale Supervision Update 22.02.17, Pretrained model for
A Traffic Sign Recognition Project which can help the driver recognise the signs via text as well as audio. Can be used at Night also.
Traffic-Sign-Recognition In this report, we propose a Convolutional Neural Network(CNN) for traffic sign classification that achieves outstanding perf
A collection of educational notebooks on multi-view geometry and computer vision.
Multiview notebooks This is a collection of educational notebooks on multi-view geometry and computer vision. Subjects covered in these notebooks incl
A discord token nuker With loads of options that will screw an account up real bad, also has inbuilt massreport, GroupChat Spammer and Token/Password/Creditcard grabber and so much more!
Installation | Important | Changelogs | Discord NOTE: Hazard is not finished! You can expect bugs, crashes, and non-working functions. Please make an
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
TorchMultimodal (Alpha Release) Introduction TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
Multilingual Emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL2022).
XLM-EMO: Multilingual Emotion Prediction in Social Media Text Abstract Detecting emotion in text allows social and computational scientists to study h
(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry
Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry Official implementation of the paper Multi-View Depth Est
End-to-end face detection, cropping, norm estimation, and landmark detection in a single onnx model
onnx-facial-lmk-detector End-to-end face detection, cropping, norm estimation, and landmark detection in a single onnx model, model.onnx. Demo You can
Code for "Parallel Instance Query Network for Named Entity Recognition", accepted at ACL 2022.
README Code for Two-stage Identifier: "Parallel Instance Query Network for Named Entity Recognition", accepted at ACL 2022. For details of the model a
Realtime micro-expression recognition using OpenCV and PyTorch
Micro-expression Recognition Realtime micro-expression recognition from scratch using OpenCV and PyTorch Try it out with a webcam or video using the e
[CVPR'22] Official PyTorch Implementation of Collaborative Transformers for Grounded Situation Recognition
[CVPR'22] Collaborative Transformers for Grounded Situation Recognition Paper | Model Checkpoint This is the official PyTorch implementation of Collab
PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".
Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo
Multi-objective gym environments for reinforcement learning.
MO-Gym: Multi-Objective Reinforcement Learning Environments Gym environments for multi-objective reinforcement learning (MORL). The environments follo
Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"
MKGFormer Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion" Model Architecture Illu
Requests based multi-threaded script for increasing followers on Spotify
Proxyless Spotify Follow Bot Requests based multi-threaded script for increasing followers on Spotify. Click here to report bugs. Usage Download ZIP h
Fast and multi-threaded script to automatically claim targeted username including 14 day bypass
Instagram Username Auto Claimer Fast and multi-threaded script to automatically claim targeted username. Click here to report bugs. Usage Download ZIP
[LREC] MMChat: Multi-Modal Chat Dataset on Social Media
MMChat This repo contains the code and data for the LREC2022 paper MMChat: Multi-Modal Chat Dataset on Social Media. Dataset MMChat is a large-scale d
Official Implementation of HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation
HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation by Lukas Hoyer, Dengxin Dai, and Luc Van Gool [Arxiv] [Paper] Overview Unsup
Multi Brute Force Facebook - Crack Facebook With Login - Free For Now
✭ SAKERA CRACK Made With ❤️ By Denventa, Araya, Dapunta Author: - Denventa - Araya Dev - Dapunta Khurayra X ⇨ Fitur Login [✯] Login Cookies ⇨ Ins
Extremely easy multi instancing software for minecraft speedrunning.
Easy Multi Extremely easy multi/single instancing software for minecraft speedrunning. A couple of goals of this project: Setup multi in minutes No fi
Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch
MeMOT - Pytorch (wip) Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch. This paper is just one in a line of work, but importan
A multi-voice TTS system trained with an emphasis on quality
TorToiSe Tortoise is a text-to-speech program built with the following priorities: Strong multi-voice capabilities. Highly realistic prosody and inton
Python PostgreSQL adapter to stream results of multi-statement queries without a server-side cursor
streampq Stream results of multi-statement PostgreSQL queries from Python without server-side cursors. Has benefits over some other Python PostgreSQL
This library is helpful when creating accounts, it has everything you need for this
AccountGeneratorHelper Library to facilitate accounts generation. Unofficial API for temp email services. Receive SMS from free services. Parsing and
4st place solution for the PBVS 2022 Multi-modal Aerial View Object Classification Challenge - Track 1 (SAR) at PBVS2022
A Two-Stage Shake-Shake Network for Long-tailed Recognition of SAR Aerial View Objects 4st place solution for the PBVS 2022 Multi-modal Aerial View Ob
Official repository of the paper Privacy-friendly Synthetic Data for the Development of Face Morphing Attack Detectors
SMDD-Synthetic-Face-Morphing-Attack-Detection-Development-dataset Official repository of the paper Privacy-friendly Synthetic Data for the Development
Under the hood working of transformers, fine-tuning GPT-3 models, DeBERTa, vision models, and the start of Metaverse, using a variety of NLP platforms: Hugging Face, OpenAI API, Trax, and AllenNLP
Transformers-for-NLP-2nd-Edition @copyright 2022, Packt Publishing, Denis Rothman Contact me for any question you have on LinkedIn Get the book on Ama
This is the first released system towards complex meters` detection and recognition, which is implemented by computer vision techniques.
A three-stage detection and recognition pipeline of complex meters in wild This is the first released system towards detection and recognition of comp
3D Avatar Lip Syncronization from speech (JALI based face-rigging)
visemenet-inference Inference Demo of "VisemeNet-tensorflow" VisemeNet is an audio-driven animator centric speech animation driving a JALI or standard
Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment
Voronoi Multi_Robot Collaborate Exploration Introduction In the unknown environment, the cooperative exploration of multiple robots is completed by Vo
Repository for "Improving evidential deep learning via multi-task learning," published in AAAI2022
Improving evidential deep learning via multi task learning It is a repository of AAAI2022 paper, “Improving evidential deep learning via multi-task le
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
Memory Efficient Attention Pytorch Implementation of a memory efficient multi-head attention as proposed in the paper, Self-attention Does Not Need O(
Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)
Optical Character Recognition OCR (Optical Character Recognition) is a technology that enables the conversion of document types such as scanned paper
Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)
One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python = 3.6 , Pytorch
Multi-resolution SeqMatch based long-term Place Recognition
MRS-SLAM for long-term place recognition In this work, we imply an multi-resolution sambling based visual place recognition method. This work is based
Tensorflow 1.13.X implementation for our NN paper: Wei Xia, Sen Wang, Ming Yang, Quanxue Gao, Jungong Han, Xinbo Gao: Multi-view graph embedding clustering network: Joint self-supervision and block diagonal representation. Neural Networks 145: 1-9 (2022)
Multi-view graph embedding clustering network: Joint self-supervision and block diagonal representation Simple implementation of our paper MVGC. The d
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High
Spacy-ginza-ner-webapi - Named Entity Recognition API with spaCy and GiNZA
Named Entity Recognition API with spaCy and GiNZA I wrote a blog post about this
Flexible-Modal Face Anti-Spoofing: A Benchmark
Flexible-Modal FAS This is the official repository of "Flexible-Modal Face Anti-
Digitalizing-Prescription-Image - PIRDS - Prescription Image Recognition and Digitalizing System is a OCR make with Tensorflow
Digitalizing-Prescription-Image PIRDS - Prescription Image Recognition and Digit
Hcaptcha-challenger - Gracefully face hCaptcha challenge with Yolov5(ONNX) embedded solution
hCaptcha Challenger 🚀 Gracefully face hCaptcha challenge with Yolov5(ONNX) embe
MOT-Tracking-by-Detection-Pipeline - For Tracking-by-Detection format MOT (Multi Object Tracking), is it a framework that separates Detection and Tracking processes?
MOT-Tracking-by-Detection-Pipeline Tracking-by-Detection形式のMOT(Multi Object Trac
Transcript-Extractor-Bot - Yet another Telegram Voice Recognition bot but using vosk and supports 20+ languages
transcript extractor Yet another Telegram Voice Recognition bot but using vosk a
Some simple programs built in Python: webcam with cv2 that detects eyes and face, with grayscale filter
Programas en Python Algunos programas simples creados en Python: 📹 Webcam con c
Yolox-bytetrack-sample - Python sample of MOT (Multiple Object Tracking) using YOLOX and ByteTrack
yolox-bytetrack-sample YOLOXとByteTrackを用いたMOT(Multiple Object Tracking)のPythonサン
Gesture recognition on Event Data
Event based Gesture Recognition Gesture recognition on Event Data usually involv
Multi Account Generator Minecraft/NordVPN/Hulu/Origin And ...
Multi Account Generator Minecraft/NordVPN/Hulu/Origin And ...
Skforecast is a python library that eases using scikit-learn regressors as multi-step forecasters
Skforecast is a python library that eases using scikit-learn regressors as multi-step forecasters. It also works with any regressor compatible with the scikit-learn API (pipelines, CatBoost, LightGBM, XGBoost, Ranger...).
MVSDF - Learning Signed Distance Field for Multi-view Surface Reconstruction
MVSDF - Learning Signed Distance Field for Multi-view Surface Reconstruction This is the official implementation for the ICCV 2021 paper Learning Sign
This repository contains the source code of Auto-Lambda and baselines from the paper, Auto-Lambda: Disentangling Dynamic Task Relationships.
Auto-Lambda This repository contains the source code of Auto-Lambda and baselines from the paper, Auto-Lambda: Disentangling Dynamic Task Relationship
CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors
CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors In order to facilitate the res
Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence
Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence, etc. This article aims to provide an introduction on how to make use of the SpeechRecognition and pyttsx3 library of Python.