2402 Repositories
Python video-image-extractor Libraries
Code for paper "Adversarial score matching and improved sampling for image generation"
Adversarial score matching and improved sampling for image generation This repo contains the official implementation for the ICLR 2021 paper Adversari
MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition;
MoViNet-pytorch Pytorch unofficial implementation of MoViNets: Mobile Video Networks for Efficient Video Recognition. Authors: Dan Kondratyuk, Liangzh
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"
CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c
4D Human Body Capture from Egocentric Video via 3D Scene Grounding
4D Human Body Capture from Egocentric Video via 3D Scene Grounding [Project] [Paper] Installation: Our method requires the same dependencies as SMPLif
A Python script that creates subtitles of a given length from text paragraphs that can be easily imported into any Video Editing software such as FinalCut Pro for further adjustments.
Text to Subtitles - Python This python file creates subtitles of a given length from text paragraphs that can be easily imported into any Video Editin
CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data
CLIP-Indonesian CLIP (Radford et al., 2021) is a multimodal model that can connect images and text by training a vision encoder and a text encoder joi
Vision-Language Pre-training for Image Captioning and Question Answering
VLP This repo hosts the source code for our AAAI2020 work Vision-Language Pre-training (VLP). We have released the pre-trained model on Conceptual Cap
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
UNITER: UNiversal Image-TExt Representation Learning This is the official repository of UNITER (ECCV 2020). This repository currently supports finetun
Oscar and VinVL
Oscar: Object-Semantics Aligned Pre-training for Vision-and-Language Tasks VinVL: Revisiting Visual Representations in Vision-Language Models Updates
Contrastive Language-Image Pretraining
CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
VirTex: Learning Visual Representations from Textual Annotations Karan Desai and Justin Johnson University of Michigan CVPR 2021 arxiv.org/abs/2006.06
Video Frame Interpolation without Temporal Priors (a general method for blurry video interpolation)
Video Frame Interpolation without Temporal Priors (NeurIPS2020) [Paper] [video] How to run Prerequisites NVIDIA GPU + CUDA 9.0 + CuDNN 7.6.5 Pytorch 1
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Ro
Code for the paper "Controllable Video Captioning with an Exemplar Sentence"
SMCG Code for the paper "Controllable Video Captioning with an Exemplar Sentence" Introduction We investigate a novel and challenging task, namely con
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. IEEE ISM 2021
PGL-SUM: Combining Global and Local Attention with Positional Encoding for Video Summarization PyTorch Implementation of PGL-SUM From "PGL-SUM: Combin
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation Woncheol Shin1, Gyubok Lee1, Jiyoung Lee1, Joonseok Lee2,3, Edward Ch
[NeurIPS'21 Spotlight] PyTorch code for our paper "Aligned Structured Sparsity Learning for Efficient Image Super-Resolution"
ASSL This repository is for a new network pruning method (Aligned Structured Sparsity Learning, ASSL) for efficient single image super-resolution (SR)
MonoScene: Monocular 3D Semantic Scene Completion
MonoScene: Monocular 3D Semantic Scene Completion MonoScene: Monocular 3D Semantic Scene Completion] [arXiv + supp] | [Project page] Anh-Quan Cao, Rao
Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition"
CLIPstyler Official Pytorch implementation of "CLIPstyler:Image Style Transfer with a Single Text Condition" Environment Pytorch 1.7.1, Python 3.6 $ c
Code of Adverse Weather Image Translation with Asymmetric and Uncertainty aware GAN
Adverse Weather Image Translation with Asymmetric and Uncertainty-aware GAN (AU-GAN) Official Tensorflow implementation of Adverse Weather Image Trans
A Telegram Video Watermark Adder Bot
Watermark-Bot A Telegram Video Watermark Adder Bot by @VideosWaterMarkRobot Features: Save Custom Watermark Image. Auto Resize Watermark According to
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016
Temporal Segment Networks (TSN) We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation fo
A Docker image for plotting and farming the Chia™ cryptocurrency on one computer or across many.
An easy-to-use WebUI for crypto plotting and farming. Offers Plotman, MadMax, Chiadog, Bladebit, Farmr, and Forktools in a Docker container. Supports Chia, Cactus, Chives, Flax, Flora, HDDCoin, Maize, N-Chain, Staicoin, and Stor among others.
Media Replay Engine (MRE) is a framework to build automated video clipping and replay (highlight) generation pipelines for live and video-on-demand content.
Media Replay Engine (MRE) is a framework for building automated video clipping and replay (highlight) generation pipelines using AWS services for live
clustimage is a python package for unsupervised clustering of images.
clustimage The aim of clustimage is to detect natural groups or clusters of images. Image recognition is a computer vision task for identifying and ve
DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral
Generative Image Inpainting An open source framework for generative image inpainting task, with the support of Contextual Attention (CVPR 2018) and Ga
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Website • Docs • Twitter • Join Slack Community What is Label Studio? Label Studio is an open source data labeling tool. It lets you label data types
A simple image-level annotation tool supporting multi-channel images for napari.
napari-labelimg4classification A simple image-level annotation tool supporting multi-channel images for napari. This napari plugin was generated with
Making Structure-from-Motion (COLMAP) more robust to symmetries and duplicated structures
SfM disambiguation with COLMAP About Structure-from-Motion generally fails when the scene exhibits symmetries and duplicated structures. In this repos
This is used to convert a string to an Image with Handwritten Characters.
Text-to-Handwriting-using-python This is used to convert a string to an Image with Handwritten Characters. text_to_handwriting(string: str, save_to: s
Fast and robust clustering of point clouds generated with a Velodyne sensor.
Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo
Train the HRNet model on ImageNet
High-resolution networks (HRNets) for Image classification News [2021/01/20] Add some stronger ImageNet pretrained models, e.g., the HRNet_W48_C_ssld_
On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation
On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation On Nonlinear Latent Transformations for GAN-based Image Editi
Self-Supervised Image Denoising via Iterative Data Refinement
Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S
Revisiting Temporal Alignment for Video Restoration
Revisiting Temporal Alignment for Video Restoration [arXiv] Kun Zhou, Wenbo Li, Liying Lu, Xiaoguang Han, Jiangbo Lu We provide our results at Google
EdiBERT, a generative model for image editing
EdiBERT, a generative model for image editing EdiBERT is a generative model based on a bi-directional transformer, suited for image manipulation. The
Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning
Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning This is the official repository of "Camera Distortion-
Official implementation for "Low-light Image Enhancement via Breaking Down the Darkness"
Low-light Image Enhancement via Breaking Down the Darkness by Qiming Hu, Xiaojie Guo. 1. Dependencies Python3 PyTorch=1.0 OpenCV-Python, TensorboardX
Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting
QAConv Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting This PyTorch code is proposed in
Official implementation of the NeurIPS'21 paper 'Conditional Generation Using Polynomial Expansions'.
Conditional Generation Using Polynomial Expansions Official implementation of the conditional image generation experiments as described on the NeurIPS
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [Project Website] [Replicate.ai Project] StyleGAN-NADA: CLIP-Guided Domain Adaptation
Official Implementation of "Designing an Encoder for StyleGAN Image Manipulation"
Designing an Encoder for StyleGAN Image Manipulation (SIGGRAPH 2021) Recently, there has been a surge of diverse methods for performing image editing
Official Implementation for Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation We present a generic image-to-image translation framework, pixel2style2pixel (pSp
SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches
SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches [Paper] [Project Page] [Interactive Demo] [Supplementary Material] Usag
Official Implementation for HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing
HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing Yuval Alaluf*, Omer Tov*, Ron Mokady, Rinon Gal, Amit H. Bermano *Denotes equ
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
PySlowFast PySlowFast is an open source video understanding codebase from FAIR that provides state-of-the-art video classification models with efficie
OpenL3: Open-source deep audio and image embeddings
OpenL3 OpenL3 is an open-source Python library for computing deep audio and image embeddings. Please refer to the documentation for detailed instructi
Memory tests solver with using OpenCV
Human Benchmark project This project is OpenCV based programs which are puzzle solvers for 7 different games for https://humanbenchmark.com/. made as
Official implementation of Pixel-Level Bijective Matching for Video Object Segmentation
BMVOS This is the official implementation of Pixel-Level Bijective Matching for Video Object Segmentation, to appear in WACV 2022. @article{cho2021pix
A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats
TG-MusicPlayer A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Py
Official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution"
RealBasicVSR [Paper] This is the official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution, arXiv". This repository contain
Real-time VIBE: Frame by Frame Inference of VIBE (Video Inference for Human Body Pose and Shape Estimation)
Real-time VIBE Inference VIBE frame-by-frame. Overview This is a frame-by-frame inference fork of VIBE at [https://github.com/mkocabas/VIBE]. Usage: i
Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis
Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis
Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic [Paper] [Colab is coming soon] Approach Example Usage To r
Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows.
Swin-Transformer Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows. For more details, ple
Code from the 2021 Signal Video Superclass
Twilio Video Demo This is the code written during the live Twilio Video demo during Twilio's Signal 2021 Superclass. It creates a simple Video applica
Near-Duplicate Video Retrieval with Deep Metric Learning
Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr
[NeurIPS 2021] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data (NeurIPS 2021) This repository provides the official PyTorch implementation
This plugin generates json files used by deovr allowing you to play 2d and 3d video's using the player
deovr-plugin This plugin generates json files used by deovr allowing you to play 2d and 3d video's using the player. Deovr looks for an index file /de
Official implementation for "Low-light Image Enhancement via Breaking Down the Darkness"
Low-light Image Enhancement via Breaking Down the Darkness by Qiming Hu, Xiaojie Guo. 1. Dependencies Python3 PyTorch=1.0 OpenCV-Python, TensorboardX
Official repository of the paper "GPR1200: A Benchmark for General-PurposeContent-Based Image Retrieval"
GPR1200 Dataset GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval (ArXiv) Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus J
AdaDM: Enabling Normalization for Image Super-Resolution
AdaDM AdaDM: Enabling Normalization for Image Super-Resolution. You can apply BN, LN or GN in SR networks with our AdaDM. Pretrained models (EDSR*/RDN
COIN the currently largest dataset for comprehensive instruction video analysis.
COIN Dataset COIN is the currently largest dataset for comprehensive instruction video analysis. It contains 11,827 videos of 180 different tasks (i.e
Official Pytorch Implementation of Unsupervised Image Denoising With Frequency Domain Knowledge (BMVC2021 Oral Accepted Paper)
Unsupervised Image Denoising with Frequency Domain Knowledge (BMVC 2021 Oral) : Official Project Page This repository provides the official PyTorch im
Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Pytorch Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic [Paper] [Colab is coming soon] Approach Example Usage To r
End-to-End Referring Video Object Segmentation with Multimodal Transformers
End-to-End Referring Video Object Segmentation with Multimodal Transformers This repo contains the official implementation of the paper: End-to-End Re
TensorFlow implementation of the algorithm in the paper "Decoupled Low-light Image Enhancement"
Decoupled Low-light Image Enhancement Shijie Hao1,2*, Xu Han1,2, Yanrong Guo1,2 & Meng Wang1,2 1Key Laboratory of Knowledge Engineering with Big Data
PyTorch code of paper "LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering"
LiVLR-VideoQA We propose a Lightweight Visual-Linguistic Reasoning framework (LiVLR) for VideoQA. The overview of LiVLR: Evaluation on MSRVTT-QA Datas
A simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)
this is a simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)
A multi-scale unsupervised learning for deformable image registration
A multi-scale unsupervised learning for deformable image registration Shuwei Shao, Zhongcai Pei, Weihai Chen, Wentao Zhu, Xingming Wu and Baochang Zha
EFB Docker image with efb-telegram-master and efb-wechat-slave
efb-wechat-docker EFB Docker image with efb-telegram-master and efb-wechat-slave Features Container run by non-root user. Support add environment vari
Official Pytorch Implementation of Unsupervised Image Denoising with Frequency Domain Knowledge
Unsupervised Image Denoising with Frequency Domain Knowledge (BMVC 2021 Oral) : Official Project Page This repository provides the official PyTorch im
Torch implementation of "Enhanced Deep Residual Networks for Single Image Super-Resolution"
NTIRE2017 Super-resolution Challenge: SNU_CVLab Introduction This is our project repository for CVPR 2017 Workshop (2nd NTIRE). We, Team SNU_CVLab, (B
This project uses Template Matching technique for object detecting by detection of template image over base image.
Object Detection Project Using OpenCV This project uses Template Matching technique for object detecting by detection the template image over base ima
🤖 Telegram UserBot Untuk Memutar Lagu Dan Video Di Obrolan Suara Telegram.
🤖 Telegram UserBot Untuk Memutar Lagu Dan Video Di Obrolan Suara Telegram.
Extracts random colours from an image
EXTRACT COLOURS This repo contains all the project files. Project Description A Program that extracts 10 random colours from an image and saves the rg
A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats.
VC UserBot A Telegram Userbot to play Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram Requirements Python
Image Captioning using CNN ,LSTM and Attention
Image Captioning using CNN ,LSTM and Attention This is a deeplearning model which tries to summarize an image into a text . Installation Install this
Official Pytorch Code for the paper TransWeather
TransWeather Official Code for the paper TransWeather, Arxiv Tech Report 2021 Paper | Website About this repo: This repo hosts the implentation code,
A procedural Blender pipeline for photorealistic training image generation
BlenderProc2 A procedural Blender pipeline for photorealistic rendering. Documentation | Tutorials | Examples | ArXiv paper | Workshop paper Features
Napari simpleitk image processing
napari-simpleitk-image-processing (n-SimpleITK) Process images using SimpleITK in napari Usage Filters of this napari plugin can be found in the Tools
Experimenting with computer vision techniques to generate annotated image datasets from gameplay recordings automatically.
Experimenting with computer vision techniques to generate annotated image datasets from gameplay recordings automatically. The collected data will then be used to train a deep neural network that can detect enemy player models in real time, during gameplay. Finally, a virtual input device will adjust the player's crosshair based on live detections for greater accuracy.
Fake videos detection by tracing the source using video hashing retrieval.
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos 🎉️ 📜 Directory Introduction VTL Trace Samples and Acc of Hash
Open source software for image correlation, distance and analysis
Douglas-Quaid Project Open source software for image correlation, distance and analysis. Strongly related to : Carl-Hauser Problem statement (@CIRCL)
✂️ EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video.
EyeLipCropper EyeLipCropper is a Python tool to crop eyes and mouth ROIs of the given video. The whole process consists of three parts: frame extracti
Telegram bot to stream videos in telegram voicechat for both groups and channels.
Telegram bot to stream videos in telegram voicechat for both groups and channels. Supports live streams, YouTube videos and telegram media. With record stream support, Schedule streams, and many more.
Image Super-Resolution Using Very Deep Residual Channel Attention Networks
Image Super-Resolution Using Very Deep Residual Channel Attention Networks
Making a music video with Wav2CLIP and VQGAN-CLIP
music2video Overview A repo for making a music video with Wav2CLIP and VQGAN-CLIP. The base code was derived from VQGAN-CLIP The CLIP embedding for au
Play Video & Music on Telegram Group Video Chat
🖤 DEMONGIRL 🖤 ʜᴇʟʟᴏ ❤️ 🇱🇰 Join us ᴠɪᴅᴇᴏ sᴛʀᴇᴀᴍ ɪs ᴀɴ ᴀᴅᴠᴀɴᴄᴇᴅ ᴛᴇʟᴇʀᴀᴍ ʙᴏᴛ ᴛʜᴀᴛ's ᴀʟʟᴏᴡ ʏᴏᴜ ᴛᴏ ᴘʟᴀʏ ᴠɪᴅᴇᴏ & ᴍᴜsɪᴄ ᴏɴ ᴛᴇʟᴇɢʀᴀᴍ ɢʀᴏᴜᴘ ᴠɪᴅᴇᴏ ᴄʜᴀᴛ 🧪 ɢ
Translate U is capable of translating the text present in an image from one language to the other.
Translate U is capable of translating the text present in an image from one language to the other. The app uses OCR and Google translate to identify and translate across 80+ languages.
Pytorch and Keras Implementations of Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects.
The repository contains the implementations for Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects. Model
SwinIR: Image Restoration Using Swin Transformer
SwinIR: Image Restoration Using Swin Transformer This repository is the official PyTorch implementation of SwinIR: Image Restoration Using Shifted Win
alfred-py: A deep learning utility library for **human**
Alfred Alfred is command line tool for deep-learning usage. if you want split an video into image frames or combine frames into a single video, then a
Official pytorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"
SinGAN Project | Arxiv | CVF | Supplementary materials | Talk (ICCV`19) Official pytorch implementation of the paper: "SinGAN: Learning a Generative M
Implementation of ICCV19 Paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network"
OANet implementation Pytorch implementation of OANet for ICCV'19 paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network", by
An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020
UnpairedSR An unofficial implementation of "Unpaired Image Super-Resolution using Pseudo-Supervision." CVPR2020 turn RCAN(modified) -- xmodel(xilinx
Deep Sketch-guided Cartoon Video Inbetweening
Cartoon Video Inbetweening Paper | DOI | Video The source code of Deep Sketch-guided Cartoon Video Inbetweening by Xiaoyu Li, Bo Zhang, Jing Liao, Ped
Unofficial pytorch implementation of 'Image Inpainting for Irregular Holes Using Partial Convolutions'
pytorch-inpainting-with-partial-conv Official implementation is released by the authors. Note that this is an ongoing re-implementation and I cannot f
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018
Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int
labelpix is a graphical image labeling interface for drawing bounding boxes
Welcome to labelpix 👋 labelpix is a graphical image labeling interface for drawing bounding boxes. 🏠 Homepage Install pip install -r requirements.tx