280 Repositories
Python captioning-videos Libraries
Language Models Can See: Plugging Visual Controls in Text Generation
Language Models Can See: Plugging Visual Controls in Text Generation Authors: Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lin
Simple Python script to download images and videos from public subreddits without using Reddit's API 😎
Subreddit Media Downloader Download images and videos from any public subreddit without using Reddit's API Made with ❤ by Nico 💬 About: This script a
A Persian Image Captioning model based on Vision Encoder Decoder Models of the transformers🤗.
Persian-Image-Captioning We fine-tune the Vision Encoder Decoder Model for the task of image captioning on the coco-flickr-farsi dataset. The implem
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene
[CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos Created by Muheng Li, Lei Chen, Yueqi Duan, Zhilan Hu, Jianjiang Feng, Jie
A Telegram bot to download posts, videos, reels, IGTV and a user profile picture from Instagram!
Telegram Bot A telegram bot to download media from Instagram! No API Key or Login Needed! Requirements You must have python installed (of course) You
A standalone pytube wrapper for downloading individual videos from YouTube.
pytube-runner This is a Python CLI script for downloading individual videos from YouTube. The pytube project is the core of this runner, so naturally
Moving object detection for satellite videos.
DSFNet: Dynamic and Static Fusion Network for Moving Object Detection in Satellite Videos Algorithm Introduction DSFNet: Dynamic and Static Fusion Net
Digan - Official PyTorch implementation of Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks
DIGAN (ICLR 2022) Official PyTorch implementation of "Generating Videos with Dyn
Stitch it in Time: GAN-Based Facial Editing of Real Videos
STIT - Stitch it in Time [Project Page] Stitch it in Time: GAN-Based Facial Edit
Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Scrape the 42 Intranet's elearning videos in a single click
42intra_scraper Scrape the 42 Intranet's elearning videos in a single click. Why would you want to use it? Adjust speed at your convenience. (The intr
End-to-end image captioning with EfficientNet-b3 + LSTM with Attention
Image captioning End-to-end image captioning with EfficientNet-b3 + LSTM with Attention Model is seq2seq model. In the encoder pretrained EfficientNet
Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling
TGraM Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling, Qibin He, Xian Sun, Zhiyuan Yan, Beibei Li, Kun Fu Abstract Rece
A GUI-based glitch tool that uses FFmpeg to create motion-interpolated glitches in your videos.
FF Dissolve Glitch This is a GUI-based glitch tool that uses FFmpeg to create awesome and weird motion-interpolated glitches in videos. I call it FF d
YoutubeDownloader - Repo for downloading YT audio and videos
YoutubeDownloader Downloads video/playlist/audio from a YouTube URL. Install all t
camKapture is an open source application that allows users to access their webcam device and take pictures or create videos.
A CLI tool for searching and watching videos on YouTube without spyware, using MPV and yt-dlp
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
TalkingHead-1KH is a talking-head dataset consisting of YouTube videos
TalkingHead-1KH Dataset TalkingHead-1KH is a talking-head dataset consisting of YouTube videos, originally created as a benchmark for face-vid2vid: On
YouPlay is a Python-based tool for downloading YouTube videos by URL
YouPlay is a Python-based tool for downloading YouTube videos by URL. It can also download videos from YouTube playlists and extract only the audio from a video. It can read URLs from files and download content as instructed.
Goal of the project: Detecting Temporal Boundaries in Sign Language videos
MVA RecVis course final project. Goal of the project: Detecting Temporal Boundaries in Sign Language videos. Sign language automatic indexing is an
NeuralDiff: Segmenting 3D objects that move in egocentric videos
NeuralDiff: Segmenting 3D objects that move in egocentric videos Project Page | Paper + Supplementary | Video About This repository contains the offic
Manim is an engine for precise programmatic animations, designed for creating explanatory math videos
Manim is an engine for precise programmatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This rep
ComPhy: Compositional Physical Reasoning of Objects and Events from Videos
ComPhy This repository holds the code for the paper. ComPhy: Compositional Physical Reasoning of Objects and Events from Videos, (Under review) PDF Pro
Meme-videos - Scrapes memes and turns them into video compilations
Meme Videos Scrapes memes from reddit using praw and request and then converts t
Learning to Segment Instances in Videos with Spatial Propagation Network
Learning to Segment Instances in Videos with Spatial Propagation Network This paper is available at the 2017 DAVIS Challenge website. Check our result
A GUI application for cropping images from videos
v-trimming-gui A GUI application for cropping images from videos. An app for taking screenshots while scrubbing through a video with the seek bar. Requirement Python =3.7 opencv-python ^4.5.5 PyS
Video-Captioning - A machine learning project to generate captions for video frames that indicate the relationships between the objects in the video
Image Captioning on Google Cloud Platform based on IoT
Image-Captioning-on-google-cloud-platform-based-on-iot - Image Captioning on Google Cloud Platform based on IoT
Compact Bidirectional Transformer for Image Captioning
Compact Bidirectional Transformer for Image Captioning Requirements Python 3.8 Pytorch 1.6 lmdb h5py tensorboardX Prepare Data Please use git clone --
Awesome AI Learning with +100 AI Cheat-Sheets, Free online Books, Top Courses, Best Videos and Lectures, Papers, Tutorials, +99 Researchers, Premium Websites, +121 Datasets, Conferences, Frameworks, Tools
All about AI with Cheat-Sheets(+100 Cheat-sheets), Free Online Books, Courses, Videos and Lectures, Papers, Tutorials, Researchers, Websites, Datasets
PianoVisuals - Create background videos synced with piano music using opencv
Steps Record piano video Use Neural Network to do body segmentation (video matti
Python program to extract slides from videos
A Python program, which I wrote in a few hours and which probably has bugs, for extracting slides from videos.
This is an Instagram reposter that reposts TikTok videos.
from-tiktok-to-instagram-reposter This script reposts videos from TikTok to your Instagram account. You must enter the username and password and slee
Desktop utility to download images/videos/music/text from various websites, and more
Automatically skip sponsor segments in YouTube videos playing on Apple TV.
iSponsorBlockTV Skip sponsor segments in YouTube videos playing on an Apple TV. This project is written in asynchronous Python and should be pretty quic
Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries
In this project, I play with the YouTube Data API and extract trending videos in Nigeria on a particular day
YouTubeTrendingVideosAnalysis In this project, I played with the YouTube Data API and extracted trending videos in Nigeria on a particular day. This
A simple PyQt5 downloader for files, YouTube videos, and YouTube playlists
Python package for Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 64-bit comparable hash-value for any video.
The Python package for near duplicate video detection ⭐️ Introduction Videohash is a Python package for detecting near-duplicate videos (Perceptual Vi
Make YouTube video tasks in Todoist faster and more time-efficient!
Youtubist Basically a fork of the yt-dlp Python module adapted to my needs. You can paste a playlist or channel link from YouTube. It will automatically format to s
VCPlayerBot - Telegram bot to stream videos in Telegram voice chats for both groups and channels. Supports live streams, YouTube videos and Telegram media
VCPlayerBot Telegram bot to stream videos in telegram voicechat for both groups
Pyvidplayer - An extremely easy to use module that plays videos on Pygame
pyvidplayer An extremely easy to use module that plays videos on Pygame Example
Anomaly detection related books, papers, videos, and toolboxes
Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
PDVC Official implementation for End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021) [paper] [valse论文速递(Chinese)] This repo supports:
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
bottom-up-attention This code implements a bottom-up attention model, based on multi-gpu training of Faster R-CNN with ResNet-101, using object and at
Python code for YouTube videos.
This is an open source project. Python 3 These files are mainly intended to accompany my series of YouTube tutorial videos here, https://www.youtube.c
FaceAnon - Anonymize people in images and videos using yolov5-crowdhuman
Face Anonymizer Blur faces from image and video files in /input/ folder. Require
Nyon-stream - A python script that uses webtorrent to stream nyaa videos directly to mpv
nyon-stream A rather shitty script that uses webtorrent to stream nyaa videos di
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans Introduction We introduce the task of dense captioning in 3D scans from commodity RGB-D sensor
Official pytorch implementation of paper Dual-Level Collaborative Transformer for Image Captioning (AAAI 2021).
Dual-Level Collaborative Transformer for Image Captioning This repository contains the reference code for the paper Dual-Level Collaborative Transform
Official pytorch implementation of the AAAI 2021 paper Semantic Grouping Network for Video Captioning
Semantic Grouping Network for Video Captioning Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo. AAAI 2021. [arxiv] Environment Ubuntu 16.04 CU
LaBERT - A length-controllable and non-autoregressive image captioning model.
Length-Controllable Image Captioning (ECCV2020) This repo provides the implementation of the paper Length-Controllable Image Captioning. Install conda
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
Learning to Generate Grounded Visual Captions without Localization Supervision This is the PyTorch implementation of our paper: Learning to Generate G
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.
This repo contains some of the code for the following paper Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code
Character Grounding and Re-Identification in Story of Videos and Text Descriptions
Character in Story Identification Network (CiSIN) This project hosts the code for our paper. Youngjae Yu, Jongseok Kim, Heeseung Yun, Jiwan Chung and
More Grounded Image Captioning by Distilling Image-Text Matching Model
More Grounded Image Captioning by Distilling Image-Text Matching Model Requirements Python 3.7 Pytorch 1.2 Prepare data Please use git clone --recurse
Meshed-Memory Transformer for Image Captioning. CVPR 2020
M²: Meshed-Memory Transformer This repository contains the reference code for the paper Meshed-Memory Transformer for Image Captioning (CVPR 2020). Pl
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]
Introduction This repository is for X-Linear Attention Networks for Image Captioning (CVPR 2020). The original paper can be found here. Please cite wi
[CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning
Transform and Tell: Entity-Aware News Image Captioning This repository contains the code to reproduce the results in our CVPR 2020 paper Transform and
WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption
PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning PyTorch code for our ACL 2020 paper "MART: Memory-Augmented Recur
Code for paper Adaptively Aligned Image Captioning via Adaptive Attention Time
Adaptively Aligned Image Captioning via Adaptive Attention Time This repository includes the implementation for Adaptively Aligned Image Captioning vi
Implementation of the Object Relation Transformer for Image Captioning
Object Relation Transformer This is a PyTorch implementation of the Object Relation Transformer published in NeurIPS 2019. You can find the paper here
Unsupervised captioning - Code for Unsupervised Image Captioning
Unsupervised Image Captioning by Yang Feng, Lin Ma, Wei Liu, and Jiebo Luo Introduction Most image captioning models are trained using paired image-se
This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.
Code-and-Dataset-for-CapSal This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detec
GoodNews Everyone! Context driven entity aware captioning for news images
This is the code for a CVPR 2019 paper, called GoodNews Everyone! Context driven entity aware captioning for news images. Enjoy! Model preview: Huge T
This repository focuses on Image Captioning, Video Captioning, Seq-to-Seq Learning, and NLP
Awesome-Visual-Captioning Table of Contents ACL-2021 CVPR-2021 AAAI-2021 ACMMM-2020 NeurIPS-2020 ECCV-2020 CVPR-2020 ACL-2020 AAAI-2020 ACL-2019 NeurI
Show-attend-and-tell - TensorFlow Implementation of "Show, Attend and Tell"
Show, Attend and Tell Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attent
Deep-Learning-Image-Captioning - Implementing convolutional and recurrent neural networks in Keras to generate sentence descriptions of images
Deep Learning - Image Captioning with Convolutional and Recurrent Neural Nets
Image captioning - Tensorflow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Introduction This neural system for image captioning is roughly based on the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual
Youtube-dislikes-adder - Add dislikes to the description of your YouTube videos.
Add the number of dislikes to the description of your YouTube videos. The count is kept up to date if you let this run as a bot.
Videocaptioning.pytorch - A simple implementation of video captioning
PyTorch implementation of video captioning. Recommend installing pytorch and pyth
BlogBot - a Python script that creates blogs from YouTube videos.
BlogBot - Convert YouTube Videos To Blogs BlogBot is a Python script that creates blogs from YouTube videos.
YouTube-Video-Downloader - Download Youtube Videos for free.
YouTube-Video-Downloader Download Youtube Videos for free. Installing Dependencies:- Windows pip install pytube Mac/Linux pip3 install pytube Clonin
Multilingual Image Captioning
Multilingual Image Captioning Authors: Bhavitvya Malik, Gunjan Chhablani Demo Link: https://huggingface.co/spaces/flax-community/multilingual-image-ca
Unsupervised Learning of Video Representations using LSTMs
Unsupervised Learning of Video Representations using LSTMs Code for paper Unsupervised Learning of Video Representations using LSTMs by Nitish Srivast
An unsupervised learning framework for depth and ego-motion estimation from monocular videos
SfMLearner This codebase implements the system described in the paper: Unsupervised Learning of Depth and Ego-Motion from Video Tinghui Zhou, Matthew
Efficient face emotion recognition in photos and videos
This repository contains code of face emotion recognition that was developed in the RSF (Russian Science Foundation) project no. 20-71-10010 (Efficien
Tkinter-based YouTube video downloader that works on pytube 11.0.2. Can download YouTube videos in 720p (HD), 144p, or audio only.
YouTube-Downloader Tkinter-based YouTube video downloader that works on pytube 11.0.2. Can download YouTube videos in 720p (HD), 144p, or audio only. G
Download clips from youtube videos with a few clicks and a GUI!
YouClip v2.0.0 Table Of Contents: What Is YouClip Installation Usage Stuff To Fix Changelog What Is YouClip? ! IMPORTANT: The source files are a total
Tool to get Canvas cover videos from Spotify tracks.
Spotify Canvas Downloader Tool to get Canvas cover videos from Spotify tracks. ✨ Try it out Building Clone the repository git clone https://github.com
A PyTorch implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize
Download YouTube videos that are available in a playlist
Youtube-Playlist-Downloader Download YouTube videos that are in a playlist Project assets: music downloaded music folder. (will be generated) music.db
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.
Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m
Download YouTube videos that are available in the given playlist
Youtube-Playlist-Downloader Download YouTube videos that are available in the given playlist Project assets: music downloaded music folder. (will be g
TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? Source: Improving Vision Transformer Efficiency and Accuracy by Learning to Tokenize
A Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chats Of Both Groups & Channels. Supports Live Streams, YouTube Videos & Telegram Media!
Telegram Video Stream Bot (Py-TgCalls) A Telegram Bot By @ZauteKm To Stream Videos In Telegram Voice Chats Of Both Groups & Channels. Supports Live St
TensorFlow Implementation of "Show, Attend and Tell"
Show, Attend and Tell Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attent
Python script for downloading audio from YouTube songs/videos.
Python script for downloading audio from YouTube songs/videos. Specify the path to your folder, type the name of the song or video, and the audio is downloaded into that folder.
An Inline Telegram bot that can download YouTube videos with permanent thumbnail support
Tube (YouTube Downloader) An Inline Telegram bot that can download YouTube videos with permanent thumbnail support About Bot need to be in Inline Mode
3D HourGlass Networks for Human Pose Estimation Through Videos
3D-HourGlass-Network 3D CNN Based Hourglass Network for Human Pose Estimation (3D Human Pose) from videos. This was my summer'18 research project. Dis
Motion Reconstruction Code and Data for Skills from Videos (SFV)
Motion Reconstruction Code and Data for Skills from Videos (SFV) This repo contains the data and the code for motion reconstruction component of the S
A python program to download one or multiple videos from YouTube.
YouTube-Video-Downloader A python program to download one or multiple videos from YouTube. Quick Start guide First Clone The Project git clone https:/
Automatically remove the mosaics in images and videos, or add mosaics to them.
Download YouTube videos/music and images in MP4, JPG with this tool.
ABOUT THE TOOL Download YouTube videos, music and images in MP4, JPG with this tool, with an easy to understand interface. This tool works with both,
🎥 PYnema is a simple UDP server written in Python that lets you watch downloaded videos.
This is a repository for a playlist of videos in which I teach how to build a RESTful API with Flask and Flask extensions.
Build And Deploy A REST API with Flask This is code for a series of videos in which we look at the various concepts involved when building a REST API