847 Python Dense-video-captioning Libraries

A youtube video link or id to video thumbnail python package.

Youtube-Video-Thumbnail A youtube video link or id to video thumbnail python package. Made with Python3

10 Oct 21, 2022

A tool to fuck a video/audio quality using FFmpeg

Media quality fucker A tool to fuck a video/audio quality using FFmpeg How to use Download the source Download Python Extract FFmpeg Put what you want

8 Jan 25, 2022

[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This

55 Dec 22, 2022

Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide

406 Dec 23, 2022

Image Captioning using CNN and Transformers

Image-Captioning Keras/Tensorflow Image Captioning application using CNN and Transformer as encoder/decoder. In particulary, the architecture consists

24 Dec 28, 2022

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

BasicVSR BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond Ported from https://github.com/xinntao/BasicSR Dependencie

8 Jun 7, 2022

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,

91 Nov 22, 2022

this is a telegram bot repository, that can stream video on telegram group video chat.

VIDEO STREAM BOT telegram bot project for streaming video on telegram video chat, powered by tgcalls and pyrogram 🛠 Commands: /vstream (reply to vide

319 Aug 15, 2022

A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups

Yui Vidio Player A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups Demo Setting up Add this Bot t

32 Sep 16, 2021

An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️

Telegram Video Player Bot (Beta) An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. Special Features Supports Live Streaming From

206 Jan 3, 2023

Add filters (background blur, etc) to your webcam on Linux.

webcam-filters Add filters (background blur, etc) to your webcam on Linux. Video conferencing applications tend to either lack video effects altogethe

480 Dec 14, 2022

DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)

15.8k Jan 4, 2023

VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)

Preparation Please see dataset/README.md to get more details about our datasets-VIL100 Please see INSTALL.md to install environment and evaluation too

82 Dec 15, 2022

Asymmetric Bilateral Motion Estimation for Video Frame Interpolation, ICCV2021

ABME (ICCV2021) Junheum Park, Chul Lee, and Chang-Su Kim Official PyTorch Code for "Asymmetric Bilateral Motion Estimation for Video Frame Interpolati

86 Dec 28, 2022

Just for testing video streaming using pytgcalls.

tgvc-video-tests Just for testing video streaming using pytgcalls. Note: The features used in this repository is highly experimental and you might not

34 Dec 27, 2022

This solves the autonomous driving issue which is supported by deep learning technology. Given a video, it splits into images and predicts the angle of turning for each frame.

Self Driving Car An autonomous car (also known as a driverless car, self-driving car, and robotic car) is a vehicle that is capable of sensing its env

4 Sep 4, 2021

Download Photo and Video from Wall of specific user or community

vkontakte-downloader Download Photo and Video from Wall of specific User or Community on https://vk.com Setup Clone the project git clone https://gith

4 Jul 20, 2022

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

2 Aug 21, 2022

AdaFocus (ICCV 2021) Adaptive Focus for Efficient Video Recognition

AdaFocus (ICCV 2021) This repo contains the official code and pre-trained models for AdaFocus. Adaptive Focus for Efficient Video Recognition Referenc

115 Dec 21, 2022

The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".

Deep Exemplar-based Video Colorization (Pytorch Implementation) Paper | Pretrained Model | Youtube video 🔥 | Colab demo Deep Exemplar-based Video Col

253 Dec 27, 2022

【ACMMM 2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning

DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning (ACMMM 2021) Overview We release the code of the DSANet (Dynamic S

46 Dec 27, 2022

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

6.5k Jan 4, 2023

Temporal Segment Networks (TSN) in PyTorch

TSN-Pytorch We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation for TSN as well as oth

1k Jan 3, 2023

Open-source Monocular Python HawkEye for Tennis

Tennis Tracking 🎾 Objectives Track the ball Detect court lines Detect the players To track the ball we used TrackNet - deep learning network for trac

188 Jan 8, 2023

Code for the Python code smells video on the ArjanCodes channel.

7 Python code smells This repository contains the code for the Python code smells video on the ArjanCodes channel (watch the video here). The example

55 Dec 29, 2022

[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

SADRNet Paper link: SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction Requirements python

Multimedia Computing Group, Nanjing University

99 Dec 30, 2022

We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

Multi-Modal Self-Supervision using GDT and StiCa This is an official pytorch implementation of papers: Multi-modal Self-Supervision from Generalized D

42 Dec 9, 2022

X-modaler is a versatile and high-performance codebase for cross-modal analytics.

X-modaler X-modaler is a versatile and high-performance codebase for cross-modal analytics. This codebase unifies comprehensive high-quality modules i

910 Dec 28, 2022

The official repo for CVPR2021——ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search.

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search [paper] Introduction This is the official implementation of ViPNAS: Efficient V

42 Sep 26, 2022

TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two tasks.

264 Jan 9, 2023

3D ResNets for Action Recognition (CVPR 2018)

3D ResNets for Action Recognition Update (2020/4/13) We published a paper on arXiv. Hirokatsu Kataoka, Tenga Wakamiya, Kensho Hara, and Yutaka Satoh,

3.5k Jan 6, 2023

Code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in Video".

Consistent Depth of Moving Objects in Video This repository contains training code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in

203 Jan 5, 2023

A curated list of resources for Image and Video Deblurring

1.7k Jan 1, 2023

This is an official implementation of the High-Resolution Transformer for Dense Prediction.

High-Resolution Transformer for Dense Prediction Introduction This is the official implementation of High-Resolution Transformer (HRT). We present a H

403 Dec 13, 2022

A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video

45 Nov 29, 2022

An unreferenced image captioning metric (ACL-21)

UMIC This repository provides an unferenced image captioning metric from our ACL 2021 paper UMIC: An Unreferenced Metric for Image Captioning via Cont

14 Nov 20, 2022

In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

Contrastive Learning of Object Representations Supervisor: Prof. Dr. Gemma Roig Institutions: Goethe University CVAI - Computational Vision & Artifici

6 Dec 8, 2022

3D position tracking for soccer players with multi-camera videos

This repo contains a full pipeline to support 3D position tracking of soccer players, with multi-view calibrated moving/fixed video sequences as inputs.

72 Dec 27, 2022

This application makes a webrtc video call with jitsi meet signaling

gstreamer-jitsi-meet This application makes a webrtc video call with jitsi meet signaling. Other end can be any jitsi meet app or web app. It doesn't

7 Apr 26, 2022

PyTorch implementations for our SIGGRAPH 2021 paper: Editable Free-viewpoint Video Using a Layered Neural Representation.

st-nerf We provide PyTorch implementations for our paper: Editable Free-viewpoint Video Using a Layered Neural Representation SIGGRAPH 2021 Jiakai Zha

258 Jan 2, 2023

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment (ICCV2021)

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment This is a pytorch project for the paper Seeing Dynamic Scene i

21 Nov 28, 2022

Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.

An Image Captioning codebase This is a codebase for image captioning research. It supports: Self critical training from Self-critical Sequence Trainin

906 Jan 3, 2023

Search Youtube Video and Get Video info

PyYouTube Get Video Data from YouTube link Installation pip install PyYouTube How to use it ? Get Videos Data from pyyoutube import Data yt = Data("ht

35 Nov 25, 2022

HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021)

Code for HDR Video Reconstruction HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021) Guanying Chen, Cha

64 Nov 19, 2022

Extreme Rotation Estimation using Dense Correlation Volumes

Extreme Rotation Estimation using Dense Correlation Volumes This repository contains a PyTorch implementation of the paper: Extreme Rotation Estimatio

29 Nov 18, 2022

Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022

Using VideoBERT to tackle video prediction

VideoBERT This repo reproduces the results of VideoBERT (https://arxiv.org/pdf/1904.01766.pdf). Inspiration was taken from https://github.com/MDSKUL/M

75 Dec 14, 2022

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

VisualGPT Our Paper VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning Main Architecture of Our VisualGPT Downloa

140 Dec 28, 2022

Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge

VidVRD-tracklets This repository contains codes for Video Visual Relation Detection (VidVRD) tracklets generation based on MEGA and deepSORT. These tr

25 Dec 21, 2022

AutoVideo: An Automated Video Action Recognition System

AutoVideo is a system for automated video analysis. It is developed based on D3M infrastructure, which describes machine learning with generic pipeline languages. Currently, it focuses on video action recognition, supporting various state-of-the-art video action recognition algorithms. It also supports automated model selection and hyperparameter tuning. AutoVideo is developed by DATA Lab at Texas A&M University.

Data Analytics Lab at Texas A&M University

267 Dec 17, 2022

The original Tetris tile-matching game as adopted by IBM PC.

🕹️ Tetris Game The original Tetris tile-matching game as adopted by IBM PC. Game developed purely on python using the Dear PyGui Framework. 📖 Instru

14 Nov 12, 2022

This is official implementaion of paper "Token Shift Transformer for Video Classification".

This is official implementaion of paper "Token Shift Transformer for Video Classification". We achieve SOTA performance 80.40% on Kinetics-400 val. Paper link

60 Dec 30, 2022

Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation

Implicit Internal Video Inpainting Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation paper | project

202 Dec 30, 2022

Video Contrastive Learning with Global Context

Video Contrastive Learning with Global Context (VCLR) This is the official PyTorch implementation of our VCLR paper. Install dependencies environments

143 Dec 26, 2022

[CVPR 2021] Forecasting the panoptic segmentation of future video frames

Panoptic Segmentation Forecasting Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021 [Link to paper] We propose

44 Nov 29, 2022

implement of SwiftNet:Real-time Video Object Segmentation

SwiftNet The official PyTorch implementation of SwiftNet:Real-time Video Object Segmentation, which has been accepted by CVPR2021. Requirements Python

64 Dec 14, 2022

Code for Motion Representations for Articulated Animation paper

Motion Representations for Articulated Animation This repository contains the source code for the CVPR'2021 paper Motion Representations for Articulat

851 Jan 9, 2023

[CVPR'21] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

IVOS-W Paper Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild Zhaoyun Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanli

38 Dec 12, 2022

Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

1.3k Jan 2, 2023

This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effects in Video."

Omnimatte in PyTorch This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effect

728 Dec 28, 2022

The Original Snake Game. Maneuver a snake in its burrow and earn points while avoiding the snake itself and the walls of the snake burrow.

Maneuver a snake in its burrow and earn points while avoiding the snake itself and the walls of the snake burrow. The snake grows when it eats an apple by default which can be disabled in the settings tab where you can also find all the other customization options.

17 Nov 12, 2022

RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation Ported from https://github.com/hzwer/arXiv2020-RIFE Dependencies NumPy

49 Jan 7, 2023

This's an implementation of deepmind Visual Interaction Networks paper using pytorch

Visual-Interaction-Networks An implementation of Deepmind visual interaction networks in Pytorch. Introduction For the purpose of understanding the ch

166 Dec 6, 2022

This is the official implementation of "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".

CORA This is the official implementation of the following paper: Akari Asai, Xinyan Yu, Jungo Kasai and Hannaneh Hajishirzi. One Question Answering Mo

59 Dec 28, 2022

A command line tool to remove background from video and image

A command line tool to remove background from video and image, brought to you by BackgroundRemover.app which is an app made by nadermx powered by this tool

1.7k Jan 1, 2023

MoviePy is a Python library for video editing, can read and write all the most common audio and video formats

MoviePy is a Python library for video editing: cutting, concatenations, title insertions, video compositing (a.k.a. non-linear editing), video processing, and creation of custom effects. See the gallery for some examples of use.

10k Jan 8, 2023

Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch

SRDenseNet-pytorch Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch (http://openaccess.thecvf.com/content_ICC

114 Nov 26, 2022

Compressed Video Action Recognition

Compressed Video Action Recognition Chao-Yuan Wu, Manzil Zaheer, Hexiang Hu, R. Manmatha, Alexander J. Smola, Philipp Krähenbühl. In CVPR, 2018. [Proj

479 Dec 26, 2022

Send notification to your telegram group/channel/private whenever a new video is uploaded on a youtube channel!

YouTube Feeds Bot. Send notification to your telegram group/channel/private whenever a new video is uploaded on a youtube channel! Variables BOT_TOKEN

30 Dec 7, 2022

Implementation of "A MLP-like Architecture for Dense Prediction"

A MLP-like Architecture for Dense Prediction (arXiv) Updates (22/07/2021) Initial release. Model Zoo We provide CycleMLP models pretrained on ImageNet

244 Dec 27, 2022

Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

vid2vid Project | YouTube(short) | YouTube(full) | arXiv | Paper(full) Pytorch implementation for high-resolution (e.g., 2048x1024) photorealistic vid

8.1k Jan 1, 2023

Implementation of character based convolutional neural network

Character Based CNN This repo contains a PyTorch implementation of a character-level convolutional neural network for text classification. The model a

248 Nov 21, 2022

Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax

Clockwork VAEs in JAX/Flax Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax, ported

26 Oct 5, 2022

A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

225 Dec 25, 2022

improvement of CLIP features over the traditional resnet features on the visual question answering, image captioning, navigation and visual entailment tasks.

CLIP-ViL In our paper "How Much Can CLIP Benefit Vision-and-Language Tasks?", we show the improvement of CLIP features over the traditional resnet fea

310 Dec 28, 2022

Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference

RawVSR This repo contains the official codes for our paper: Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference Xiaoh

23 Oct 8, 2022

ESTDepth: Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks (CVPR 2021)

ESTDepth: Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks (CVPR 2021) Project Page | Video | Paper | Data We present a novel metho

65 Nov 28, 2022

A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

ManhattanSLAM Authors: Raza Yunus, Yanyan Li and Federico Tombari ManhattanSLAM is a real-time SLAM library for RGB-D cameras that computes the camera

117 Dec 28, 2022

PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR)

This is a PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR), using subpixel convolution to optimize the inference speed of TecoGAN VSR model. Please refer to the official implementation ESPCN and TecoGAN for more information.

789 Jan 4, 2023

A PyTorch implementation of "DGC-Net: Dense Geometric Correspondence Network"

DGC-Net: Dense Geometric Correspondence Network This is a PyTorch implementation of our work "DGC-Net: Dense Geometric Correspondence Network" TL;DR A

191 Dec 16, 2022

Unsupervised Video Interpolation using Cycle Consistency

Unsupervised Video Interpolation using Cycle Consistency Project | Paper | YouTube Unsupervised Video Interpolation using Cycle Consistency Fitsum A.

100 Nov 30, 2022

Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

76 Dec 6, 2022

Gesture-controlled Video Game. Just swing your finger and play the game without touching your PC

Gesture Controlled Video Game Detailed Blog : https://www.analyticsvidhya.com/blog/2021/06/gesture-controlled-video-game/ Introduction This project is

35 Jan 6, 2023

Real-time multi-object tracker using YOLO v5 and deep sort

This repository contains a two-stage-tracker. The detections generated by YOLOv5, a family of object detection architectures and models pretrained on the COCO dataset, are passed to a Deep Sort algorithm which tracks the objects. It can track any object that your Yolov5 model was trained to detect.

3.6k Jan 5, 2023

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences", CVPR 2021.

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature fo

50 Dec 21, 2022

Python Dense-video-captioning Resources

Python dense-video-captioning Libraries

A youtube video link or id to video thumbnail python package.

A tool to fuck a video/audio quality using FFmpeg

[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

Image Captioning using CNN and Transformers

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

this is a telegram bot repository, that can stream video on telegram group video chat.

A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups

An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️

Add filters (background blur, etc) to your webcam on Linux.

DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)

VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)

Asymmetric Bilateral Motion Estimation for Video Frame Interpolation, ICCV2021

Just for testing video streaming using pytgcalls.

This solves the autonomous driving issue which is supported by deep learning technology. Given a video, it splits into images and predicts the angle of turning for each frame.

Download Photo and Video from Wall of specific user or community

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

AdaFocus (ICCV 2021) Adaptive Focus for Efficient Video Recognition

The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".

【ACMMM 2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Temporal Segment Networks (TSN) in PyTorch

Open-source Monocular Python HawkEye for Tennis

Code for the Python code smells video on the ArjanCodes channel.

[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

X-modaler is a versatile and high-performance codebase for cross-modal analytics.

The official repo for CVPR2021——ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search.

TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral

3D ResNets for Action Recognition (CVPR 2018)

Code for the SIGGRAPH 2021 paper "Consistent Depth of Moving Objects in Video".

A curated list of resources for Image and Video Deblurring

This is an official implementation of the High-Resolution Transformer for Dense Prediction.

A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

An unreferenced image captioning metric (ACL-21)

In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

3D position tracking for soccer players with multi-camera videos

This application makes a webrtc video call with jitsi meet signaling

PyTorch implementations for our SIGGRAPH 2021 paper: Editable Free-viewpoint Video Using a Layered Neural Representation.

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment (ICCV2021)

Unofficial pytorch implementation for Self-critical Sequence Training for Image Captioning. and others.

Search Youtube Video and Get Video info

HDR Video Reconstruction: A Coarse-to-fine Network and A Real-world Benchmark Dataset (ICCV 2021)

Extreme Rotation Estimation using Dense Correlation Volumes

Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

Using VideoBERT to tackle video prediction

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

Video Visual Relation Detection (VidVRD) tracklets generation. also for ACM MM Visual Relation Understanding Grand Challenge

AutoVideo: An Automated Video Action Recognition System

The original Tetris tile-matching game as adopted by IBM PC.

This is official implementaion of paper "Token Shift Transformer for Video Classification".

Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation

Video Contrastive Learning with Global Context

[CVPR 2021] Forecasting the panoptic segmentation of future video frames

implement of SwiftNet:Real-time Video Object Segmentation

Code for Motion Representations for Articulated Animation paper

[CVPR'21] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Dense Prediction Transformers

This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effects in Video."

The Original Snake Game. Maneuver a snake in its burrow and earn points while avoiding the snake itself and the walls of the snake burrow.

RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

This's an implementation of deepmind Visual Interaction Networks paper using pytorch

This is the official implementation of "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".

A command line tool to remove background from video and image

MoviePy is a Python library for video editing, can read and write all the most common audio and video formats

Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch

Compressed Video Action Recognition

Send notification to your telegram group/channel/private whenever a new video is uploaded on a youtube channel!

Implementation of "A MLP-like Architecture for Dense Prediction"

Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Implementation of character based convolutional neural network

Implementation of experiments in the paper Clockwork Variational Autoencoders (project website) using JAX and Flax

A Joint Video and Image Encoder for End-to-End Retrieval

improvement of CLIP features over the traditional resnet features on the visual question answering, image captioning, navigation and visual entailment tasks.

Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference

ESTDepth: Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks (CVPR 2021)

A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.