1440 Repositories
Python video-recognition Libraries
This python script extracts all the video URLs from any youtube channel. Then it extracts all the information like the name of the youtube channel, published date, likes, dislikes, comments, views, etc for all the videos in that channel.
youtube-channel-video-url-extractor This python script extracts all the video URLs from any youtube channel. Then it extracts all the information like
FAST-RIR: FAST NEURAL DIFFUSE ROOM IMPULSE RESPONSE GENERATOR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
PyTorch implementation of DUL (Data Uncertainty Learning in Face Recognition, CVPR2020)
PyTorch implementation of DUL (Data Uncertainty Learning in Face Recognition, CVPR2020)
Official code of paper: MovingFashion: a Benchmark for the Video-to-Shop Challenge
SEAM Match-RCNN Official code of MovingFashion: a Benchmark for the Video-to-Shop Challenge paper Installation Requirements: Pytorch 1.5.1 or more rec
A 10000+ hours dataset for Chinese speech recognition
WenetSpeech Official website | Paper A 10000+ Hours Multi-domain Chinese Corpus for Speech Recognition Download Please visit the official website, rea
Shallow Convolutional Neural Networks for Human Activity Recognition using Wearable Sensors
-IEEE-TIM-2021-1-Shallow-CNN-for-HAR [IEEE TIM 2021-1] Shallow Convolutional Neural Networks for Human Activity Recognition using Wearable Sensors All
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition [ArXiv] [Project Page] This repository is the official implementation of AdaMML:
Neural network for recognizing the gender of people in photos
Neural Network For Gender Recognition How to test it? Install requirements.txt file using pip install -r requirements.txt command Run nn.py using pyth
Video Autoencoder: self-supervised disentanglement of 3D structure and motion
Video Autoencoder: self-supervised disentanglement of 3D structure and motion This repository contains the code (in PyTorch) for the model introduced
[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval
CONQUER: Contexutal Query-aware Ranking for Video Corpus Moment Retreival PyTorch implementation of CONQUER: Contexutal Query-aware Ranking for Video
A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"
SlowFast A PyTorch implementation of SlowFast based on ICCV 2019 paper SlowFast Networks for Video Recognition. Requirements Anaconda PyTorch conda in
A web app to scan crypto markets based on candlestick pattern recognition from
Crypto_Scanner A web app to scan crypto markets based on candlestick pattern recognition from "Japanese Candlestick Charting Techniques: A Contemporar
Deep Face Recognition in PyTorch
Face Recognition in PyTorch By Alexey Gruzdev and Vladislav Sovrasov Introduction A repository for different experimental Face Recognition models such
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥
face.evoLVe: High-Performance Face Recognition Library based on PaddlePaddle & PyTorch Evolve to be more comprehensive, effective and efficient for fa
Pytorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)
Large-Scale Long-Tailed Recognition in an Open World [Project] [Paper] [Blog] Overview Open Long-Tailed Recognition (OLTR) is the author's re-implemen
Command line tool to keep track of your favorite playlists on YouTube and many other places.
Command line tool to keep track of your favorite playlists on YouTube and many other places.
A 10000+ hours dataset for Chinese speech recognition
A 10000+ hours dataset for Chinese speech recognition
Sign Language is detected in realtime using video sequences. Our approach involves MediaPipe Holistic for keypoints extraction and LSTM Model for prediction.
RealTime Sign Language Detection using Action Recognition Approach Real-Time Sign Language is commonly predicted using models whose architecture consi
A GUI for Face Recognition, based upon Docker, Tkinter, GPU and a camera device.
Face Recognition GUI This repository is a GUI version of Face Recognition by Adam Geitgey, where e.g. Docker and Tkinter are utilized. All the materia
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)
Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim
VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).
VSR-Transformer By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool This paper proposes a new Transformer for video super-resolution (called VSR-Transf
Code release for ICCV 2021 paper "Anticipative Video Transformer"
Anticipative Video Transformer Ranked first in the Action Anticipation task of the CVPR 2021 EPIC-Kitchens Challenge! (entry: AVT-FB-UT) [project page
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"
This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)
Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim
GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition
GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition
Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.
Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.
Add filters (background blur, etc) to your webcam on Linux.
Add filters (background blur, etc) to your webcam on Linux.
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
GluonMM is a library of transformer models for computer vision and multi-modality research
GluonMM is a library of transformer models for computer vision and multi-modality research. It contains reference implementations of widely adopted baseline models and also research work from Amazon Research.
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"
This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep
A Telegram Bot To Stream Videos in Telegram Voice Chat.
Video Stream X Bot Telegram bot project for streaming video on telegram video chat, powered by tgcalls and pyrogram Deploy to Heroku 👨🔧 The easy wa
📢 Video Chat Stream Telegram Bot. Can ⏳ Stream Live Videos, Radios, YouTube Videos & Telegram Video Files On Your Video Chat Of Channels & Groups !
Telegram Video Chat Bot (Beta) 📢 Video Chat Stream Telegram Bot 🤖 Can Stream Live Videos, Radios, YouTube Videos & Telegram Video Files On Your Vide
🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.
English | 简体中文 | 繁體中文 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained mo
Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple
Silero Models: pre-trained speech-to-text, text-to-speech models and benchmarks made embarrassingly simple
A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
SVHNClassifier-PyTorch A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks If
Speech Recognition using DeepSpeech2.
deepspeech.pytorch Implementation of DeepSpeech2 for PyTorch using PyTorch Lightning. The repo supports training/testing and inference using the DeepS
A PyTorch Implementation of Single Shot MultiBox Detector
SSD: Single Shot MultiBox Object Detector, in PyTorch A PyTorch implementation of Single Shot MultiBox Detector from the 2016 paper by Wei Liu, Dragom
moviepy-cli: Command line interface for MoviePy.
Moviepy-cli is designed to apply several video editing in a single command with Moviepy as an alternative to Video-cli.
A facial recognition doorbell system using a Raspberry Pi
Facial Recognition Doorbell This project expands on the person-detecting doorbell system to allow it to identify faces, and announce names accordingly
This project converts your human voice input to its text transcript and to an automated voice too.
Human Voice to Automated Voice & Text Introduction: In this project, whenever you'll speak, it will turn your voice into a robot voice and furthermore
A collection of SOTA Image Classification Models in PyTorch
A collection of SOTA Image Classification Models in PyTorch
A Telegram Bot to return Youtube Video Tags Using YoutubeTags API
YouTube-TagFind-Bot A Telegram Bot to return Youtube Video Tags Using YoutubeTags API YoutubeTags API Wrapper YoutubeTags is a python third-party api
A Bot For Streaming Videos In Tg Voice Chats.
「•ᴍɪsᴇʀʏ ᴠɪᴅᴇᴏ sᴛʀᴇᴀᴍᴇʀ•」 ᴀ ғɪɴᴇ & ғɪʀsᴛ ᴄʟᴀss ᴘʀᴏᴊᴇᴄᴛ ғᴏʀ ᴘʟᴀʏɪɴɢ ᴠɪᴅᴇᴏs ɪɴ ᴠᴏɪᴄᴇ ᴄʜᴀᴛ ʙʏ xᴇʙᴏʀɴ | •ᴘᴏᴡᴇʀᴇᴅ ʙʏ ᴛɢᴄᴀʟʟs and ᴘʏʀᴏ •ᴅᴇᴘʟᴏʏ ᴍɪsᴇʀʏ ᴛᴏ ʜᴇʀ
Code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition"
SEW (Squeezed and Efficient Wav2vec) The repo contains the code of the paper "Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speec
Out-of-boundary View Synthesis towards Full-frame Video Stabilization
Out-of-boundary View Synthesis towards Full-frame Video Stabilization Introduction | Update | Results Demo | Introduction This repository contains the
Official implementation of the ICCV 2021 paper "Joint Inductive and Transductive Learning for Video Object Segmentation"
JOINT This is the official implementation of Joint Inductive and Transductive learning for Video Object Segmentation, to appear in ICCV 2021. @inproce
[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"
CTR-GCN This repo is the official implementation for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition. The pap
This code renames subtitle file names to your video files names, so you don't need to rename them manually.
Rename Subtitle This code renames your subtitle file names to your video file names so you don't need to do it manually Note: It only works for series
[Official] Exploring Temporal Coherence for More General Video Face Forgery Detection(ICCV 2021)
Exploring Temporal Coherence for More General Video Face Forgery Detection(FTCN) Yinglin Zheng, Jianmin Bao, Dong Chen, Ming Zeng, Fang Wen Accepted b
Tool to add main subject to items on Wikidata using a WMFs CirrusSearch for named entity recognition or a manually supplied list of QIDs
ItemSubjector Tool made to add main subject statements to items based on the title using a home-brewed CirrusSearch-based Named Entity Recognition alg
Telegram Vc Video Player Bot
Telegram Video Player Bot Telegram bot project for streaming video on telegram video chat, powered by tgcalls and pyrogram Deploy to Heroku 👨🔧 The
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
video streaming userbot (vsu) based on pytgcalls for streaming video trought the telegram video chat group.
VIDEO STREAM USERBOT ✨ an another telegram userbot for streaming video trought the telegram video chat. Environmental Variables 📌 API_ID : Get this v
“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)
Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository
Using Youtube downloader is the fast and easy way to download and save any YouTube video.
Youtube video downloader using Django Using Django as a backend along with pytube module to create Youtbue Video Downloader. https://yt-videos-downloa
Desktop music recognition application for windows
MusicRecognizer Music recognition application for windows You can choose from which of the devices the recording will be made. If you choose speakers,
A Video Streaming Telegram Bot written in Python with Pyrogram and PyTgcalls
Video Stream Bot A Video Streaming Telegram Bot written in Python using Pyrogram and PyTgcalls Requirements Python 3.9 Telegram API Telegram Bot Token
[EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
RoSTER The source code used for Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training, p
Official Pytorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.
This repository is the official Pytorch implementation of Test-Agnostic Long-Tailed Recognition by Test-Time Aggregating Diverse Experts with Self-Supervision.
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Combating Embedding Barrier in Multilingual Models for Low-Resource Language Understanding".
BanglaBERT This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced i
All Tools In One is a Script Developed with Python3. It gathers a total of 14 Discord tools (including a RAT, a Raid Tool, a Token Grabber, a Crash Video Maker, etc). It has a pleasant and intuitive interface to facilitate the use of all with help and explanations for each of them.
[Discord] - All Tools In One [Discord] - All Tools In One is a Script Gathering for Windows systems written in Python. Disclaimer This project was cre
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network.
face-mask-detection Face Mask Detection is a project to determine whether someone is wearing mask or not, using deep neural network. It contains 3 scr
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.
FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu
Python Library to Extract youtube video Tags without Youtube API
YoutubeTags Python Library to Extract youtube video Tags without Youtube API Installation pip install YoutubeTags Example import YoutubeTags from Yout
Classifying audio using Wavelet transform and deep learning
Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C
A youtube video link or id to video thumbnail python package.
Youtube-Video-Thumbnail A youtube video link or id to video thumbnail python package. Made with Python3
Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV
Realtime Face Anti-Spoofing Detection 🤖 Realtime Face Anti Spoofing Detection with Face Detector to detect real and fake faces Please star this repo
A tool to fuck a video/audio quality using FFmpeg
Media quality fucker A tool to fuck a video/audio quality using FFmpeg How to use Download the source Download Python Extract FFmpeg Put what you want
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation
Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This
Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"
One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on table detection and table structure recognition.
WTW-Dataset This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on ICCV 2021. Here, you can download the
PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and Multi-Step Knowledge Distillation
PocketNet This is the official repository of the paper: PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and M
BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond
BasicVSR BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond Ported from https://github.com/xinntao/BasicSR Dependencie
Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"
Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm
Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments
Cross-Quality Labeled Faces in the Wild (XQLFW) Here, we release the database, evaluation protocol and code for the following paper: Cross Quality LFW
this is a telegram bot repository, that can stream video on telegram group video chat.
VIDEO STREAM BOT telegram bot project for streaming video on telegram video chat, powered by tgcalls and pyrogram 🛠 Commands: /vstream (reply to vide
A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups
Yui Vidio Player A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups Demo Setting up Add this Bot t
An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️
Telegram Video Player Bot (Beta) An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. Special Features Supports Live Streaming From
Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.
Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.
Add filters (background blur, etc) to your webcam on Linux.
webcam-filters Add filters (background blur, etc) to your webcam on Linux. Video conferencing applications tend to either lack video effects altogethe
[ICCV 2021] Released code for Causal Attention for Unbiased Visual Recognition
CaaM This repo contains the codes of training our CaaM on NICO/ImageNet9 dataset. Due to my recent limited bandwidth, this codebase is still messy, wh
DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)
DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)
VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)
Preparation Please see dataset/README.md to get more details about our datasets-VIL100 Please see INSTALL.md to install environment and evaluation too
Asymmetric Bilateral Motion Estimation for Video Frame Interpolation, ICCV2021
ABME (ICCV2021) Junheum Park, Chul Lee, and Chang-Su Kim Official PyTorch Code for "Asymmetric Bilateral Motion Estimation for Video Frame Interpolati
Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"
CSRA This is the official code of ICCV 2021 paper: Residual Attention: A Simple But Effective Method for Multi-Label Recoginition Demo, Train and Vali
Just for testing video streaming using pytgcalls.
tgvc-video-tests Just for testing video streaming using pytgcalls. Note: The features used in this repository is highly experimental and you might not
This solves the autonomous driving issue which is supported by deep learning technology. Given a video, it splits into images and predicts the angle of turning for each frame.
Self Driving Car An autonomous car (also known as a driverless car, self-driving car, and robotic car) is a vehicle that is capable of sensing its env
Download Photo and Video from Wall of specific user or community
vkontakte-downloader Download Photo and Video from Wall of specific User or Community on https://vk.com Setup Clone the project git clone https://gith
This repo implements a Topological SLAM: Deep Visual Odometry with Long Term Place Recognition (Loop Closure Detection)
This repo implements a topological SLAM system. Deep Visual Odometry (DF-VO) and Visual Place Recognition are combined to form the topological SLAM system.
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific
AdaFocus (ICCV 2021) Adaptive Focus for Efficient Video Recognition
AdaFocus (ICCV 2021) This repo contains the official code and pre-trained models for AdaFocus. Adaptive Focus for Efficient Video Recognition Referenc
The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".
Deep Exemplar-based Video Colorization (Pytorch Implementation) Paper | Pretrained Model | Youtube video 🔥 | Colab demo Deep Exemplar-based Video Col
Source Code For Template-Based Named Entity Recognition Using BART
Template-Based NER Source Code For Template-Based Named Entity Recognition Using BART Training Training train.py Inference inference.py Corpus ATIS (h
【ACMMM 2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning (ACMMM 2021) Overview We release the code of the DSANet (Dynamic S
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!