940 Repositories
Python video-to-audio Libraries
A Survey on Deep Learning Technique for Video Segmentation
A Survey on Deep Learning Technique for Video Segmentation Wenguan Wang, Tianfei Zhou, Fati
WaveFake: A Data Set to Facilitate Audio DeepFake Detection
WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper
Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.
Training Script for Reuse-VOS This is the code implementation of the CVPR 2021 paper: Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Vi
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing
CapsuleVOS This is the code for the ICCV 2019 paper CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing. Arxiv Link: https://a
Rocks VC Userbot: A Telegram Bot Project That Allows You To Play Audio And Video Music In Telegram Voice Chat Groups
⭐️ Rocks VC Userbot ⭐️ Telegram Userbot To Play Audio And Video Songs In Voice Chats
YouTube video downloader and info extractor for Python.
tube_dl Tube_dl is a simple YouTube video downloader for Python. A modular approach to bypass and download YouTube videos and playlists from YouTube us
Library for Python 3 to communicate with the Google Chromecast.
pychromecast Library for Python 3.6+ to communicate with the Google Chromecast. It currently supports: Auto discovering connected Chromecasts on the n
A timer for bird lovers, plays a random birdcall while displaying its image and info.
Birdcall Timer A timer for bird lovers. Siriema hatchling by Junior Peres Junior Background My partner needed a customizable timer for sitting and sta
Telegram music and video bot that plays music directly
⚡ NOINOI MUSIC PLAYER 🎵 SUPERFAST MUSIC BOT THAT CAN PLAY SONGS DIRECTLY IN TELEGRAM VOICE CHATS AND CAN ALSO PLAY VIDEO IN VOICE CHATS ✨ Heroku Deploy YOU CA
A stable and fast Telegram video converter bot that can compress, convert (video into audio and other video formats), rename with a permanent thumbnail, and trim.
ᴠɪᴅᴇᴏ ᴄᴏɴᴠᴇʀᴛᴏʀ A stable and fast Telegram video converter bot that can compress, convert (video into audio and other video formats), rename, and trim.
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
ALPRO Align and Prompt: Video-and-Language Pre-training with Entity Prompts [Paper] Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C.H
Intelligent Video Analytics toolkit based on different inference backends.
OpenIVA OpenIVA is an end-to-end intelligent video analytics development toolkit based on different inference backends, designed to help
Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.
SAFA: Structure Aware Face Animation (3DV2021) Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation. Getting Started
An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
CPC_audio This code implements the Contrastive Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we
Unsupervised Learning of Video Representations using LSTMs
Unsupervised Learning of Video Representations using LSTMs Code for paper Unsupervised Learning of Video Representations using LSTMs by Nitish Srivast
Official implementation of ACMMM'20 paper 'Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework'
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework Official code for paper, Self-supervised Video Representation Le
Implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".
PRP Introduction This is the implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".
Code for our ECCV 2020 paper: Self-supervised Video Representation Learning by Pace Prediction
Video_Pace This repository contains the code for the following paper: Jiangliu Wang, Jianbo Jiao and Yunhui Liu, "Self-Supervised Video Representation
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.
Video Representation Learning by Recognizing Temporal Transformations [Project Page] Simon Jenni, Givi Meishvili, and Paolo Favaro. In ECCV, 2020. Thi
[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
CoCLR: Self-supervised Co-Training for Video Representation Learning This repository contains the implementation of: InfoNCE (MoCo on videos) UberNCE
[arXiv 2020] Video Representation Learning with Visual Tempo Consistency
Video Representation Learning with Visual Tempo Consistency [Paper] [Project Page] News Full codebase is coming soon Pretrained Models For now, we provi
Official PyTorch implementation for AAAI2021 paper (RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning)
RSPNet Official PyTorch implementation for AAAI2021 paper "RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning" [Suppleme
Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
[Paper] [Project page] This repository contains code for the paper: Andrew Owens, Alexei A. Efros. Audio-Visual Scene Analysis with Self-Supervised Mu
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
AdaFocusV2 This repo contains the official code and pre-trained models for AdaFo
Python package to display video in GUI using OpenCV-Python and PySide6
Python package to display video in GUI using OpenCV-Python and PySide6. Introduction cv2PySide6 is a package which provides utility classes and functi
Asad Alexa VC Bot Is A Telegram Bot Project That Allows You To Play Audio And Video Music In Telegram Voice Chat Groups.
Asad Alexa VC Bot Is A Telegram Bot Project That Allows You To Play Audio And Video Music In Telegram Voice Chat Groups.
Telegram music and video bot that plays music directly
Telegram music and video bot that plays music directly
A cross platform front-end GUI of the popular youtube-dl written in wxPython.
youtube-dlG A cross platform front-end GUI of the popular youtube-dl media downloader written in wxPython. Supported sites Screenshots Requirements Py
XViT - Space-time Mixing Attention for Video Transformer
XViT - Space-time Mixing Attention for Video Transformer This is the official implementation of the XViT paper: @inproceedings{bulat2021space, title
A curated list of the latest breakthroughs in AI (in 2021) by release date with a clear video explanation, link to a more in-depth article, and code.
2021: A Year Full of Amazing AI papers- A Review 📌 A curated list of the latest breakthroughs in AI by release date with a clear video explanation, l
Tkinter-based YouTube video downloader that works with pytube 11.0.2. Can download YouTube videos in 720p (HD), 144p, or audio only.
YouTube-Downloader Tkinter-based YouTube video downloader that works with pytube 11.0.2. Can download YouTube videos in 720p (HD), 144p, or audio only. G
Easily download audio described movies and TV shows found on audiovault.net
AudioVault Downloader A convenient downloader for audio described movies and TV shows found on the Audio Vault. get latest binary release for Windows
FingerPy is an algorithm to measure, analyse, and monitor heartbeat using only a video of the user's finger on a mobile phone camera.
FingerPy is an algorithm using Python, SciPy, and FFT to measure, analyse, and monitor heartbeat using only a video of the user's finger on a m
EmoTag helps you train an emotion detection model for Chinese audio
emoTag emoTag helps you train an emotion detection model for Chinese audio. Environment pip install -r requirement.txt Data We used Emotional Speech Dat
Python-based bot that sends a notification to your Telegram whenever a new video is released on a YouTube channel!
YTnotifier Python-based bot that sends a notification to your Telegram whenever a new video is released on a YouTube channel! REQUIREMENTS telethon python-de
Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning
Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning Yansong Tang *, Zhenyu Jiang *, Zhenda Xie *, Yue
An easy to use GUI based video to image sequence converter (and vice versa).
Vdo & Img Conversion Tools This is a quick conversion tool made with python that can save you a lot of time. With this tool you can extract image sequ
Traditional deepdream with VQGAN+CLIP and optical flow. Ready to use in Google Colab
VQGAN-CLIP-Video cat.mp4 policeman.mp4 schoolboy.mp4 forsenBOG.mp4
Telegram bot for streaming music or video on Telegram
KYURA MUSIC Telegram bot for streaming music or video on Telegram, powered by PyTgCalls and Pyrogram Help Need Help me to translate this repo, click the
A python library for working with praat, textgrids, time aligned audio transcripts, and audio files.
praatIO Questions? Comments? Feedback? A library for working with praat, time aligned audio transcripts, and audio files that comes with batteries inc
A collection of python scripts for extracting and analyzing acoustics from audio files.
pyAcoustics A collection of python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe
Utility for Google Text-To-Speech batch audio files generator. Ideal for prompt files creation with Google voices for application in offline IVRs
Google Text-To-Speech Batch Prompt File Maker Are you in need of IVR prompts, but you have no voice actors? Let Google talk your prompts like a pr
Code for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing
Contrast and Mix (CoMix) The repository contains the code for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Backgroun
Automatically segment in-video YouTube sponsorships.
SponsorBlock Auto Segment [Model Download] Automatically segment in-video YouTube sponsorships. Trained on a large dataset of YouTube sponsor transcri
🐥Flappy Birds🐤 Video game. With your help I can go through🚀 the pipes. All UI is made with 🐍Pygame🐍
🐠 Flappy Fish 🐢 I am Flappy Fish 🐟 . With your help I can jump through the pipes and experience an interesting and exciting flight deep into the fi
Audio pitch-shifting & re-sampling utility, based on the EMU SP-1200
Pitcher.py Free & OS emulation of the SP-12 & SP-1200 signal chain (now with GUI) Pitch shift / bitcrush / resample audio files Written and tested in
Create a Video Membership app using FastAPI & NoSQL
Video Membership Create a Video Membership app using FastAPI & NoSQL. In this series, we're going to explore building a membership application using F
Turn any live video stream or locally stored video into a dataset of interesting samples for ML training, or any other type of analysis.
Sieve Video Data Collection Example Find samples that are interesting within hours of raw video, for free and completely automatically using Sieve API
A Simple YouTube Video Downloader With Python
Simple YouTube Video Downloader Simple YouTube Video Downloader is an open source project with a very simple UI that tries to speed up the process of
PyTorch implementation of Super SloMo by Jiang et al.
Super-SloMo PyTorch implementation of "Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation" by Jiang H., Sun
Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".
RNN-MBP Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring (AAAI-2022) by Chao Zhu, Hang Dong, Jinshan Pan
Code for the paper "Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation", accepted at AAAI 2022
Contrastive Spatio Temporal Pretext Learning for Self-supervised Video Representation (AAAI 2022) The code for paper "Contrastive Spatio-Temporal Pret
VIsually-Pivoted Audio and(N) Text
VIP-ANT: VIsually-Pivoted Audio and(N) Text Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowled
Video Stream: an advanced Telegram bot that allows you to play video and music in Telegram group video chats
Video Stream is an advanced Telegram bot that allows you to play video and music in Telegram group video chats 🧪 Get SESSION_NAME from below: Pyrogram
A script that renders video in the console. Nothing extra.
video-to-ascii A script that renders video in the console. Nothing extra. Requirements Minimum screen resolution: 1280x720 Video in 360p quality 10-45f
A Telegram bot to transcribe audio, video and image into text.
Transcriber Bot A Telegram bot to transcribe audio, video and image into text. Deploy to Heroku Local Deploying Install the FFmpeg. Make sure you have
A Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chats Of Both Groups & Channels. Supports Live Streams, YouTube Videos & Telegram Media!
Telegram Video Stream Bot (Py-TgCalls) A Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chats Of Both Groups & Channels. Supports Live St
Wav2Vec for speech recognition, classification, and audio classification
Soxan (called "سخن" in Persian) This repository consists of models, scripts, and notebooks that help you to use all the benefits of Wav2Vec 2.0 in your
Python script for downloading audio from YouTube songs/videos.
Python script for downloading audio from YouTube songs/videos. Specify the path to your folder, type the name of the song or video, and the audio will be downloaded into that folder.
Make an audio file (really) long-winded
longwind Make an audio file (really) long-winded Daily repetitions are an illusion anyway.
Convert human motion from video to .bvh
video_to_bvh Convert human motion from video to .bvh with Google Colab Usage 1. Open video_to_bvh.ipynb in Google Colab Go to https://colab.research.g
Efficient 3D human pose estimation in video using 2D keypoint trajectories
3D human pose estimation in video with temporal convolutions and semi-supervised training This is the implementation of the approach described in the
A python program to download one or multiple videos from YouTube.
YouTube-Video-Downloader A python program to download one or multiple videos from YouTube. Quick Start guide First Clone The Project git clone https:/
Spatio-Temporal Entropy Model (STEM) for end-to-end learned video compression.
Spatio-Temporal Entropy Model A PyTorch Reproduction of Spatio-Temporal Entropy Model (STEM) for end-to-end learned video compression. More details can
pyo is a Python module written in C to help digital signal processing script creation.
pyo is a Python module written in C to help digital signal processing script creation.
Automatically remove the mosaics in images and videos, or add mosaics to them.
Automatically remove the mosaics in images and videos, or add mosaics to them.
[ICCV 2021] Target Adaptive Context Aggregation for Video Scene Graph Generation
Target Adaptive Context Aggregation for Video Scene Graph Generation This is a PyTorch implementation for Target Adaptive Context Aggregation for Vide
[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding Official Pytorch implementation of Negative Sample Matter
Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020] by Kaisiyuan Wang, Qianyi Wu, Linsen Song, Zhuoqian Yang, Wa
Code for Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019)
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019) We propose Disentangled Audio-Visual System (DAVS) to ad
AudioDVP:Photorealistic Audio-driven Video Portraits
AudioDVP This is the official implementation of Photorealistic Audio-driven Video Portraits. Major Requirements Ubuntu = 18.04 PyTorch = 1.2 GCC =
This is the official PyTorch implementation of the CVPR 2020 paper "TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting".
TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting Project Page | YouTube | Paper This is the official PyTorch implementation of the C
User-friendly Voice Cloning Application
Multi-Language-RTVC stands for Multi-Language Real Time Voice Cloning and is a Voice Cloning Tool capable of transferring speaker-specific audio featur
PyYoutubemanager is a fully web-automated bot capable of general tasks such as uploading a video, downloading, adding a title, description, and listing types, and adding a thumbnail
PyYoutubemanager About The Project PyYoutubemanager is a fully web-automated bot capable of Ge
Wonkey - an open source programming language for the creation of cross-platform video games
Wonkey Programming Language Wonkey is an open source programming language for the creation of cross-platform video games, highly inspired by the “Blit
Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downstream tasks like translation and summarisation.
PART 2: CHAIN LINKING AUDIO-TO-TEXT NLP TASKS 2A: TRANSCRIBE-TRANSLATE-SENTIMENT-ANALYSIS In notebook3.0, I demo a simple workflow to: transcribe a lo
A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats
TG-MusicPlayer A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Py
A self-hosted streaming platform with Discord authentication, auto-recording and more!
A self-hosted streaming platform with Discord authentication, auto-recording and more!
Delta TTA (Text To Audio) Software
Text-To-Audio-Windows Delta TTA (Text To Audio) Software Info You can use it to convert your text to an audio file. You just write your text and your end
SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation
SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation SeqFormer SeqFormer: a Frustratingly Simple Model for Video Instance Segmentat
👻🟡 Download all Snapchat video & photo memories from a data export.
Snapchat "Memories" Fetcher In compliance with the California Consumer Privacy Act of 2018 (“CCPA”), businesses which collect and store user data must
Text Classification in Turkish Texts with Bert
You can watch the details of the project on my youtube channel Project Interface Project Second Interface Goal= Correctly guessing the classification
Lets you download entire YouTube playlists.
Youtube MP3 Playlist Downloader Lets you download entire YouTube playlists as MP3 files. This application is basically a script that makes it easier
Dynamic View Synthesis from Dynamic Monocular Video
Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus
Official PyTorch implementation for Deep Contextual Video Compression, NeurIPS 2021
Introduction Official PyTorch implementation for Deep Contextual Video Compression, NeurIPS 2021 Prerequisites Python 3.8 and conda, get Conda CUDA 11
YouTube Video Search Engine For Python
YouTube-Video-Search-Engine Introduction With the increasing demand for electronic devices, it is hard for people to choose the best products from mul
A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities
MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-
Video Matting Refinement For Python
Video-matting refinement Library (use pip to install) scikit-image numpy av matplotlib Run Static background python path_to_video.mp4 Moving backgroun
Dynamic View Synthesis from Dynamic Monocular Video
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a
BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting
BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o
nextdl - download videos from youtube.com or other video platforms
nextdl - download videos from youtube.com or other video platforms
Rune - a video miniplayer made with Python.
Rune - a video miniplayer made with Python.
A GUI-based audio player with support for a large variety of formats
Miza-Player A GUI-based audio player with support for a large variety of formats, able to play from web-hosted media platforms such as YouTube, includ
MelGAN test on audio decoding
Official repository for the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis The original work URL: https://github.com
PyTorch implementation of a collection of scalable Video Transformer Benchmarks.
PyTorch implementation of Video Transformer Benchmarks This repository is mainly built upon PyTorch and PyTorch Lightning. We wish to maintain a colle
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
PyTorch implementation of Continuous Augmented Positional Embeddings (CAPE), by Likhomanenko et al. Enhance your Transformer positional embeddings with easy-to-use augmentations!
Real-Time Spherical Microphone Renderer for binaural reproduction in Python
ReTiSAR Implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python [1][2]. Contents: | Requirements | Setup | Q