1232 Python Images-to-video Libraries

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training [Arxiv] VideoMAE: Masked Autoencoders are Data-Efficient Learne

Multimedia Computing Group, Nanjing University

697 Jan 7, 2023

Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral

Temporally Efficient Vision Transformer for Video Instance Segmentation Temporally Efficient Vision Transformer for Video Instance Segmentation (CVPR

203 Dec 31, 2022

Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

📖 Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022) 🔥 If DaGAN is helpful in your photos/projects, please hel

503 Jan 4, 2023

The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.

OC-SORT Observation-Centric SORT (OC-SORT) is a pure motion-model-based multi-object tracker. It aims to improve tracking robustness in crowded scenes

325 Jan 5, 2023

ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers

ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers Official implementation of ViewFormer. ViewFormer is a NeRF-free neural rend

169 Dec 30, 2022

Official code for "Towards An End-to-End Framework for Flow-Guided Video Inpainting" (CVPR2022)

E2FGVI (CVPR 2022) English | 简体中文 This repository contains the official implementation of the following paper: Towards An End-to-End Framework for Flo

Media Computing Group @ Nankai University

537 Jan 7, 2023

Official repository of "BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment"

BasicVSR_PlusPlus (CVPR 2022) [Paper] [Project Page] [Code] This is the official repository for BasicVSR++. Please feel free to raise issue related to

227 Jan 1, 2023

Official repository accompanying a CVPR 2022 paper EMOCA: Emotion Driven Monocular Face Capture And Animation. EMOCA takes a single image of a face as input and produces a 3D reconstruction. EMOCA sets the new standard on reconstructing highly emotional images in-the-wild

EMOCA: Emotion Driven Monocular Face Capture and Animation Radek Daněček · Michael J. Black · Timo Bolkart CVPR 2022 This repository is the official i

339 Dec 30, 2022

Python package to generate image embeddings with CLIP without PyTorch/TensorFlow

imgbeddings A Python package to generate embedding vectors from images, using OpenAI's robust CLIP model via Hugging Face transformers. These image em

81 Jan 4, 2023

Generate images from texts. In Russian

ruDALL-E Generate images from texts pip install rudalle==1.1.0rc0 🤗 HF Models: ruDALL-E Malevich (XL) ruDALL-E Emojich (XL) (readme here) ruDALL-E S

1.6k Dec 31, 2022

🤗🖼️ HuggingPics: Fine-tune Vision Transformers for anything using images found on the web.

🤗 🖼️ HuggingPics Fine-tune Vision Transformers for anything using images found on the web. Check out the video below for a walkthrough of this proje

185 Dec 21, 2022

Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition:

Multi-Type-TD-TSR Check it out on Source Code of our Paper: Multi-Type-TD-TSR Extracting Tables from Document Images using a Multi-stage Pipeline for

178 Dec 27, 2022

Building a real-time environment using webcam frame division in OpenCV and classifying cropped images using a fine-tuned vision transformer on hybrid dataset samples for facial emotion recognition.

Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog

8 Dec 12, 2022

scene-linear test images

Scene-Referred Image Collection A collection of OpenEXR Scene-Referred images, encoded as max 2048px width, DWAA 80 compression. All exrs are encoded

7 Aug 25, 2022

Video Frame Interpolation with Transformer (CVPR2022)

VFIformer Official PyTorch implementation of our CVPR2022 paper Video Frame Interpolation with Transformer Dependencies python = 3.8 pytorch = 1.8.0

63 Dec 16, 2022

Python library for tracking human heads with FLAME (a 3D morphable head model)

Video Head Tracker 3D tracking library for human heads based on FLAME (a 3D morphable head model). The tracking algorithm is inspired by face2face. It

61 Dec 25, 2022

A python-image-classification web application project, written in Python and served through the Flask microframework. This project implements the VGG16 convolutional neural network, through Keras and TensorFlow wrappers, to make predictions on uploaded images.

Image Classification in Python Implementing image classification in Flask using Keras. The VGG16 is a convolution neural network model architecture th

19 Dec 12, 2022

[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c

151 Dec 30, 2022

Simple Python script to download images and videos from public subreddits without using Reddit's API 😎

Subreddit Media Downloader Download images and videos from any public subreddit without using Reddit's API Made with ❤ by Nico 💬 About: This script a

106 Jan 7, 2023

ASL sign language classification on static images using graph neural networks.

SignLangGNN When GNNs 💜 MediaPipe. This is a starter project where I tried to implement some traditional image classification problem i.e. the ASL si

10 Nov 9, 2022

Telegram Group Calls Streaming bot with some useful features, written in Python with Pyrogram and Py-Tgcalls. Supporting platforms like Youtube, Spotify, Resso, AppleMusic, Soundcloud and M3u8 Links.

Yukki Music Bot Yukki Music Bot is a Powerful Telegram Music+Video Bot written in Python using Pyrogram and Py-Tgcalls by which you can stream songs,

996 Dec 28, 2022

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

nvdiffrec Joint optimization of topology, materials and lighting from multi-view image observations as described in the paper Extracting Triangular 3D

1.4k Jan 1, 2023

Implements VQGAN+CLIP for image and video generation, and style transfers, based on text and image prompts. Emphasis on ease-of-use, documentation, and smooth video creation.

VQGAN-CLIP-GENERATOR Overview This is a package (with available notebook) for running VQGAN+CLIP locally, with a focus on ease of use, good documentat

98 Dec 30, 2022

Official Implementation of "Third Time's the Charm? Image and Video Editing with StyleGAN3" https://arxiv.org/abs/2201.13433

Third Time's the Charm? Image and Video Editing with StyleGAN3 Yuval Alaluf*, Or Patashnik*, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, Da

531 Dec 20, 2022

PyTorch Implementation for "ForkGAN with Single Rainy Night Images: Leveraging the RumiGAN to See into the Rainy Night"

ForkGAN with Single Rainy Night Images: Leveraging the RumiGAN to See into the Rainy Night By Seri Lee, Department of Engineering, Seoul National Univ

52 Oct 12, 2022

CLIPfa: Connecting Farsi Text and Images

CLIPfa: Connecting Farsi Text and Images OpenAI released the paper Learning Transferable Visual Models From Natural Language Supervision in which they

66 Dec 14, 2022

Ego4d dataset repository. Download the dataset, visualize, extract features & example usage of the dataset

Ego4D EGO4D is the world's largest egocentric (first person) video ML dataset and benchmark suite, with 3,600 hrs (and counting) of densely narrated v

118 Jan 7, 2023

MAGMA - a GPT-style multimodal model that can understand any combination of images and language

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning Authors repo (alphabetical) Constantin (CoEich), Mayukh (Mayukh

331 Jan 3, 2023

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

TubeDETR: Spatio-Temporal Video Grounding with Transformers Website • STVG Demo • Paper This repository provides the code for our paper. This includes

108 Dec 27, 2022

PromptDet: Expand Your Detector Vocabulary with Uncurated Images

PromptDet: Expand Your Detector Vocabulary with Uncurated Images Paper Website Introduction The goal of this work is to establish a scalable pipeline

103 Dec 20, 2022

This project contains the ClonedPerson dataset and code described in our paper "Cloning Outfits from Real-World Images to 3D Characters for Generalizable Person Re-Identification".

ClonedPerson This is the official repository for the ClonedPerson project, which contains the ClonedPerson dataset and code described in our paper "Cl

55 Dec 27, 2022

PyTorch implementations of the paper: "DR.VIC: Decomposition and Reasoning for Video Individual Counting, CVPR, 2022"

DRNet for Video Individual Counting (CVPR 2022) Introduction This is the official PyTorch implementation of paper: DR.VIC: Decomposition and Reasoning

35 Nov 22, 2022

Official implementation of "Watermarking Images in Self-Supervised Latent-Spaces"

🔍 Watermarking Images in Self-Supervised Latent-Spaces PyTorch implementation and pretrained models for the paper. For details, see Watermarking Imag

32 Dec 13, 2022

Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).

Bridging Video-text Retrieval with Multiple Choice Questions, CVPR 2022 (Oral) Paper | Project Page | Pre-trained Model | CLIP-Initialized Pre-trained

Applied Research Center (ARC), Tencent PCG

99 Jan 6, 2023

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Unified Multi-modal Transformers This repository maintains the official implementation of the paper UMT: Unified Multi-modal Transformers for Joint Vi

84 Jan 4, 2023

Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data - Official PyTorch Implementation (CVPR 2022)

Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data (CVPR 2022) Potentials of primitive shapes f

31 Sep 27, 2022

Official Pytorch implementation of "Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes", CVPR 2022

Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes / 3DCrowdNet News 💪 3DCrowdNet achieves the state-of-the-art accuracy on 3D

113 Dec 21, 2022

Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers

DALLE2 Video (wip) ** only to be built after DALLE2 image is done and replicated, and the importance of the prior network is validated ** Direct appli

105 May 15, 2022

A Human-in-the-Loop workflow for creating HD images from text

A Human-in-the-Loop workflow for creating HD images from text DALL·E Flow is an interactive workflow for generating high-definition images from text

2.5k Jan 2, 2023

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect video data.

11 Oct 22, 2022

An open source video/music streamer based on MPV and piped.

🎶 Harmony Music An easy way to stream videos or music from Youtube from the command line while regaining your privacy. 📖 Table Of Contents ❔ What's

16 Nov 15, 2022

Fast TikTok NO Watermark Video Downloader (username or url)

💎 TD [ TikDown v4 ] Star ⭐ if you want more Discord Server * discord.gg/onlp | Waxor#9999 Why not open source anymore ? * BECAUSE PEOPLE SKID, STEA

26 Dec 1, 2022

A telegram bot written in Python to fetch random SFW & NSFW anime images

Tsuzumi A telegram bot written in python to fetch both random SFW & NSFW Anime images using nekos.life & waifu.pics API Commands SFW Commands : /

3 Oct 12, 2022

[arXiv22] Disentangled Representation Learning for Text-Video Retrieval

Disentangled Representation Learning for Text-Video Retrieval This is a PyTorch implementation of the paper Disentangled Representation Learning for T

49 Dec 18, 2022

"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.

ViGA: Video moment retrieval via Glance Annotation This is the official repository of the paper "Video Moment Retrieval from Text Queries via Single F

38 Dec 31, 2022

YouTube Downloader is an extremely simple program for downloading songs or playlists (in audio or video) from YouTube. Created using Python, PyTube and PySimpleGUI.

YouTube Downloader YouTube Downloader is an extremely simple program for downloading songs or playlists (in audio or video) from YouTube. Disclaimer It's

3 Dec 14, 2022

A way to store images in YAML.

YAMLImg A way to store images in YAML. I made this after seeing Roadcrosser's JSON-G because it was too inspiring to ignore this opportunity. Installa

5 Mar 14, 2022

Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022.

Jadena Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022. arXiv

13 Nov 29, 2022

Automatically updates the Twitter banner with the images of the 5 latest followers, using Tweepy in Python

Auto twitter banner Automatically updates the twitter banner every few seconds with follower profile pics on it Here's how it looks! Installation git

7 Jul 4, 2022

scrape tiktok/douyin video list from specific user or keyword

get-tiktok-user-video-list scrape tiktok/douyin video list from specific user or keyword Taking **https://www.douyin.com/user/MS4wLjABAAAAUpIowEL3ygUAahQB47

4 Jul 6, 2022

MAU: A Motion-Aware Unit for Video Prediction and Beyond, NeurIPS2021

MAU (NeurIPS2021) Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xinguang Xiang, Wen Gao. Official PyTorch Code for "MAU: A Motion-Aware

20 Nov 25, 2022

2022-bridge - Example code belonging to the Bridge pattern video

Let's Take The Bridge Pattern To The Next Level This video covers how the bridge

11 Jun 14, 2022

Imgrerite - A command-line tool to hide and reveal information inside images

ImgReRite A command line tool to hide and reveal information inside images (work

7 Feb 17, 2022

Programmers-quest - Programmer's Quest! An open source MMO built on top of the Panda3D game engine and Astron server

Programmer's Quest! Programmer's Quest! The open source Python 3 2D MMORPG showc

5 Oct 7, 2022

Digan - Official PyTorch implementation of Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks

DIGAN (ICLR 2022) Official PyTorch implementation of "Generating Videos with Dyn

147 Dec 31, 2022

TwitterBot-ImageCollector - Twitter bot that collects images from likes and saves them

TwitterBot-ImageCollector Twitter bot that collects images from the

4 Jun 1, 2022

Detecting drunk people through thermal images using Deep Learning (CNN)

Drunk Detection CNN Detecting drunk people through thermal images using Deep Learning (CNN) Dataset We used thermal images provided by Electronics Lab

3 Oct 27, 2022

Source code to accompany Defunctland's video "FASTPASS: A Complicated Legacy"

Shapeland Simulator Source code to accompany Defunctland's video "FASTPASS: A Complicated Legacy" Download the video at https://www.youtube.com/watch?

70 Dec 14, 2022

RIFE - Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE - Real-Time Intermediate Flow Estimation for Video Frame Interpolation YouTube | BiliBili 16X interpolation results from two input images: Introd

28 Dec 9, 2022

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [toc] 1. Introduction This repository provides the code for our paper at

13 Dec 8, 2022

Vitrix is an open-source FPS video game coded in python

Vitrix is an open-source FPS video game coded in python Table of contents Usage Game Server Installing Requirements Hardware Requirements Software Req

1 Feb 13, 2022

A command line tool to hide and reveal information inside images (works for both PNGs and JPGs)

Imgrerite A command line tool to hide and reveal information inside images (works for both PNGs and JPGs) Dependencies Python 3 Git Most of the Linux

10 Jul 27, 2022

Tensorflow 2 implementation of our high quality frame interpolation neural network

FILM: Frame Interpolation for Large Scene Motion Project | Paper | YouTube | Benchmark Scores Tensorflow 2 implementation of our high quality frame in

1.6k Dec 28, 2022

A Python script that uses three feature extraction methods (SIFT, SURF and ORB) to find a book in a video clip and project the trailer of a movie based on that book onto it.

3 Feb 10, 2022

Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

53 Nov 22, 2022

Official code release for 3DV 2021 paper Human Performance Capture from Monocular Video in the Wild.

58 Dec 24, 2022

Emotion Recognition from Facial Images

Emotion recognition from facial images. This project implements a simple classifier that uses deep learning techniques and transfe

2 Feb 9, 2022

This bot plays the most recent video from the Daily Silksong News Youtube Channel whenever a specific user enters voice chat once a day.

Do you have that one friend who really likes Hollow Knight? Are they waiting for Silksong to come out? Heckle them with this Discord bot.

2 Feb 9, 2022

Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj

845 Jan 3, 2023

YouTube Downloader is a simple but highly efficient YouTube video downloader, made completely using Python

2 Nov 26, 2022

Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionaries

Dictionary Learning for Clustering on Hyperspectral Images Overview Framework for Spectral Clustering on the Sparse Coefficients of Learned Dictionari

6 Oct 25, 2022

A motion tracking system for arbitrary points in a video frame.

PointTracking This code is written by Majid Masoumi. I have used the Lucas-Kanade optical flow technique to track the points b

1 Feb 9, 2022

Official implementation for paper Render In-between: Motion Guided Video Synthesis for Action Interpolation

Render In-between: Motion Guided Video Synthesis for Action Interpolation [Paper] [Supp] [arXiv] [4min Video] This is the official Pytorch implementat

8 Oct 27, 2022

🖼️ Draw Images or GIFs in your terminal

Drawitor Draw Images/GIFs in your terminal. Install pip install drawitor CLI Tool drawitor cat_dancing.gif Library The library is written in a simple

7 Dec 15, 2022

Steganography is the art of hiding the fact that communication is taking place, by hiding information in other information.

7 Nov 9, 2022

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

This repository is the official PyTorch implementation of Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

4 Dec 11, 2022

Software for visualization of RTStruct structures on CT images

This script drives the program: it creates the GUI and processes images from DICOM files. The interface, including its buttons and functions, is built with the PyQt5 library.

0 Jun 29, 2022

A python Tk GUI that creates, writes text and attaches images into a custom spreadsheet file

13 Dec 9, 2022

A Multi-Tool with 30+ Options.

15 Apr 12, 2022

Generate Cartoon Images using Generative Adversarial Network

AvatarGAN ✨ Generate Cartoon Images using DC-GAN Deep Convolutional GAN is a generative adversarial network architecture. It uses a couple of guidelin

50 Dec 29, 2022

Basic functions manipulating images using the OpenCV library

OpenCV Basic functions manipulating images using the OpenCV library. Reading Ima

3 Feb 17, 2022

RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

RuCLIPtiny Zero-shot image classification model for Russian language RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network

26 Sep 20, 2022

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.

145 Jan 1, 2023

Art directed cropping, useful for responsive images

Art direction sets a focal point and can be used when you need multiple copies of the same image in different proportions.

1 Aug 16, 2022

A GUI based glitch tool that uses FFMPEG to create motion interpolated glitches in your videos.

FF Dissolve Glitch This is a GUI based glitch tool that uses FFmpeg to create awesome and weird motion interpolated glitches in videos. I call it FF d

19 Nov 10, 2022

VideoMergeDcBot1 - Video Merge Dc Bot for telegram

VIDEO MERGE BOT A Telegram Bot Demo 👉 @VideoMergeDcBot To Merge multiple Video

2 Feb 4, 2022

Splat a video into a mosaic by sampling a frame at regular intervals

Splat a video into a mosaic by sampling a frame at regular intervals. Useful for seeing the changes over time of an entire video or movie.

4 Oct 16, 2022

Simple Python script that lets you upload image/video to imgur

Pymgur 🐍 Simple Python script that lets you upload image/video to imgur! Usage 🔨 Git Clone this repository install the requirements (pip install -r

3 Feb 20, 2022

Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities

This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to prove that computer vision can be used to identify cognates known to exist, and perhaps lead linguists to evidence of unknown cognates.

1 Feb 3, 2022

JoplinPdf2Images - Converts a PDF to images in Joplin and adds it to the specified note as a printout

joplinPdf2Images Converts a PDF to images in Joplin and adds it to the specified

2 Apr 20, 2022

Video-face-extractor - Video face extractor with Python

Python face extractor Setup Create the srcvideos and faces directories Put your

2 Feb 3, 2022

Terminal-Video-Player - A program that can display video in the terminal using ascii characters

15 Nov 10, 2022

Python script to generate posters out of the images in a directory.

Poster-Maker Python script to generate posters out of the images in a directory. This version is very basic lightweight code to combine and organize images

1 Feb 2, 2022

Harmonious Textual Layout Generation over Natural Images via Deep Aesthetics Learning

Harmonious Textual Layout Generation over Natural Images via Deep Aesthetics Learning Code for the paper Harmonious Textual Layout Generation over Nat

7 Aug 9, 2022

A Python script to convert images to upscaled versions made out of one-colour emojis.

ABOUT This is a Python script to convert PNG, JPG and GIF (output isn't animated :( ) images to scaled versions made out of one-colour emojis. Please n

0 Oct 19, 2022

Converting Images Into Minecraft Houses

Converting Images Into Minecraft Houses In this particular project, we turned a 2D Image into Minecraft pixel art and then scaled it in 3D such that i

1 Feb 2, 2022

traiNNer is an open source image and video restoration (super-resolution, denoising, deblurring and others) and image to image translation toolbox based on PyTorch.

traiNNer traiNNer is an open source image and video restoration (super-resolution, denoising, deblurring and others) and image to image translation to

202 Jan 4, 2023

camKapture is an open source application that allows users to access their webcam device and take pictures or create videos.

1 Jun 21, 2022

A scrapy pipeline that provides an easy way to store files and images using various folder structures.

scrapy-folder-tree This is a scrapy pipeline that provides an easy way to store files and images using various folder structures. Supported folder str

7 Oct 23, 2022

Video-stream - A telegram video stream bot repo

This is a Telegram Video stream Bot. Binary Tech 💫 Features stream videos downl

1 Feb 2, 2022
