1318 Repositories
Python images-to-video-converter Libraries
Demo code for paper "Learning optical flow from still images", CVPR 2021.
Depthstillation Demo code for "Learning optical flow from still images", CVPR 2021. [Project page] - [Paper] - [Supplementary] This code is provided t
Bulk Downloader for Reddit
saveddit is a bulk media downloader for reddit pip3 install saveddit Setting up authorization Register an application with Reddit Write down your clie
Implementation of ViViT: A Video Vision Transformer
ViViT: A Video Vision Transformer Unofficial implementation of ViViT: A Video Vision Transformer. Notes: This is in WIP. Model 2 is implemented, Model
[AAAI 2021] MVFNet: Multi-View Fusion Network for Efficient Video Recognition
MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021) Overview We release the code of the MVFNet (Multi-View Fusion Network).
Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning The predictive learning of spatiotemporal sequences aims to generate future
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral
NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video Project Page | Paper NeuralRecon: Real-Time Coherent 3D Reconstruction from Mon
SimDeblur is a simple framework for image and video deblurring, implemented by PyTorch
SimDeblur (Simple Deblurring) is an open source framework for image and video deblurring toolbox based on PyTorch, which contains most deep-learning based state-of-the-art deblurring algorithms. It is easy to implement your own image or video deblurring or other restoration algorithms.
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)
An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al
DrawBot lets you draw images taken from the internet on Skribbl.io, Gartic Phone and Paint
DrawBot You don't speak french? No worries, english translation is over here. C'est quoi ? DrawBot est un logiciel codé par V2F qui va prendre possess
The free and open-source Download Manager written in pure Python
The free and open-source Download Manager written in pure Python
Command-line program to download videos from YouTube.com and other video sites
youtube-dl - download videos from youtube.com or other video platforms
Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images
SASSnet Code for paper: Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images(MICCAI 2020) Our code is origin from UA-MT You can fin
Adversarial Color Enhancement: Generating Unrestricted Adversarial Images by Optimizing a Color Filter
ACE Please find the preliminary version published at BMVC 2020 in the folder BMVC_version, and its extended journal version in Journal_version. Datase
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
STAM - Pytorch Implementation of STAM (Space Time Attention Model), yet another pure and simple SOTA attention model that bests all previous models in
A Telegram Video Watermark Adder Bot in Pyrogram by @AbirHasan2005
Watermark-Bot A Telegram Video Watermark Adder Bot by @AbirHasan2005 Features: Save Custom Watermark Image. Auto Resize Watermark According to Video q
Hybrid Neural Fusion for Full-frame Video Stabilization
FuSta: Hybrid Neural Fusion for Full-frame Video Stabilization Project Page | Video | Paper | Google Colab Setup Setup environment for [Yu and Ramamoo
Learning Calibrated-Guidance for Object Detection in Aerial Images
Learning Calibrated-Guidance for Object Detection in Aerial Images arxiv We propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance
This code extends the neural style transfer image processing technique to video by generating smooth transitions between several reference style images
Neural Style Transfer Transition Video Processing By Brycen Westgarth and Tristan Jogminas Description This code extends the neural style transfer ima
Streamlink is a CLI utility which pipes video streams from various services into a video player
Streamlink is a CLI utility which pipes video streams from various services into a video player
Rembg is a tool to remove images background.
Rembg is a tool to remove images background.
Rembg Video Virtual Green Screen Edition
Rembg Virtual Greenscreen Edition is a tool to create a green screen matte for videos
Search and download Copernicus Sentinel satellite images
sentinelsat Sentinelsat makes searching, downloading and retrieving the metadata of Sentinel satellite images from the Copernicus Open Access Hub easy
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
Conversion of Image, video, text into ASCII format
asciju Python package that converts image to ascii Free software: MIT license
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models
merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept
Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrieval.
Dual Encoding for Video Retrieval by Text Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding
~1000 book pages + OpenCV + python = page regions identified as paragraphs, lines, images, captions, etc.
cosc428-structor I had an open-ended Computer Vision assignment to complete, and an out-of-copyright book that I wanted to turn into an ebook. Convent
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a
Binarize document images
Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to
Generate text images for training deep learning ocr model
New version release:https://github.com/oh-my-ocr/text_renderer Text Renderer Generate text images for training deep learning OCR model (e.g. CRNN). Su
Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.
Total-Text-Dataset (Official site) Updated on April 29, 2020 (Detection leaderboard is updated - highlighted E2E methods. Thank you shine-lcy.) Update
A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).
OCR Resources This repository contains a collection of resources (including the papers and datasets) of OCR (Optical Character Recognition). Contents
A curated list of resources dedicated to scene text localization and recognition
Scene Text Localization & Recognition Resources A curated list of resources dedicated to scene text localization and recognition. Any suggestions and
OCR system for Arabic language that converts images of typed text to machine-encoded text.
Arabic OCR OCR system for Arabic language that converts images of typed text to machine-encoded text. The system currently supports only letters (29 l
A tool for extracting text from scanned documents (via OCR), with user-defined post-processing.
The project is based on older versions of tesseract and other tools, and is now superseded by another project which allows for more granular control o
Text Detection from images using OpenCV
EAST Detector for Text Detection OpenCV’s EAST(Efficient and Accurate Scene Text Detection ) text detector is a deep learning model, based on a novel
EAST for ICPR MTWI 2018 Challenge II (Text detection of network images)
EAST_ICPR2018: EAST for ICPR MTWI 2018 Challenge II (Text detection of network images) Introduction This is a repository forked from argman/EAST for t
Recognizing cropped text in natural images.
ASTER: Attentional Scene Text Recognizer with Flexible Rectification ASTER is an accurate scene text recognizer with flexible rectification mechanism.
An Implementation of the seglink alogrithm in paper Detecting Oriented Text in Natural Images by Linking Segments
Tips: A more recent scene text detection algorithm: PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link Contents: Introduc
This is a c++ project deploying a deep scene text reading pipeline with tensorflow. It reads text from natural scene images. It uses frozen tensorflow graphs. The detector detect scene text locations. The recognizer reads word from each detected bounding box.
DeepSceneTextReader This is a c++ project deploying a deep scene text reading pipeline. It reads text from natural scene images. Prerequsites The proj
Python library to extract tabular data from images and scanned PDFs
Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d
Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.
Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr
Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"
TableNet Unofficial implementation of ICDAR 2019 paper : TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Detect and fix skew in images containing text
Alyn Skew detection and correction in images containing text Image with skew Image after deskew Install and use via pip! Recommended way(using virtual
Deskewing images with slanted content
skew_correction De-skewing images with slanted content by finding the deviation using Canny Edge Detection. To Run: In python 3.6, from deskew import
An application of high resolution GANs to dewarp images of perturbed documents
Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat
About A python based Apple Quicktime protocol,you can record audio and video from real iOS devices
介绍 本应用程序使用 python 实现,可以通过 USB 连接 iOS 设备进行屏幕共享 高帧率(30〜60fps) 高画质 低延迟(200ms) 非侵入性 支持多设备并行 Mac OSX 安装 python =3.7 brew install libusb pkg-config 如需使用 g
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)
TTNet-Pytorch The implementation for the paper "TTNet: Real-time temporal and spatial video analysis of table tennis" An introduction of the project c
Create docsets for Dash.app-compatible API browser.
doc2dash: Create Docsets for Dash.app and Clones doc2dash is an MIT-licensed extensible Documentation Set generator intended to be used with the Dash.
Simple screen recorder
Kooha Simple screen recorder Description Kooha is a simple screen recorder built with GTK. It allows you to record your screen and also audio from you
FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction
FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction. It uses a customized encoder decoder architecture with spatio-temporal convolutions and channel gating to capture and interpolate complex motion trajectories between frames to generate realistic high frame rate videos. This repository contains original source code for the paper accepted to CVPR 2021.
[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers
VisTR: End-to-End Video Instance Segmentation with Transformers This is the official implementation of the VisTR paper: Installation We provide instru
Python package that generates hardware pinout diagrams as SVG images
PinOut A Python package that generates hardware pinout diagrams as SVG images. The package is designed to be quite flexible and works well for general
[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.
TBE The source code for our paper "Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Le
This Bot can extract audios and subtitles from video files
Send any valid video file and the bot shows you available streams in it that can be extracted!!
Convert tables stored as images to an usable .csv file
Convert an image of numbers to a .csv file This Python program aims to convert images of array numbers to corresponding .csv files. It uses OpenCV for
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.
MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage
Image augmentation for machine learning experiments.
imgaug This python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much lar
Simple text to phones converter for multiple languages
Phonemizer -- foʊnmaɪzɚ The phonemizer allows simple phonemization of words and texts in many languages. Provides both the phonemize command-line tool
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
Insular email distribution - mail server as Docker images
Mailu is a simple yet full-featured mail server as a set of Docker images. It is free software (both as in free beer and as in free speech), open to s
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic. Exclusiv
The open-source core of Pinry, a tiling image board system for people who want to save, tag, and share images, videos and webpages in an easy to skim through format.
The open-source core of Pinry, a tiling image board system for people who want to save, tag, and share images, videos and webpages in an easy to skim
Been busy guys, will be reviewing and integrating pull requests shortly. Thanks to all contributors! LATEST RELEASE: 6.0.0 - flatpak @ https://flathub.org/apps/details/com.ozmartians.VidCutter - snap @ https://snapcraft.io/vidcutter - see https://github.com/ozmartian/vidcutter/releases for more details...
VidCutter 6 released on Flathub! VidCutter is now available as a flatpak at Flathub and is the most reliable option for Linux. All dependencies come b
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
OpenShot Video Editor is an award-winning free and open-source video editor for Linux, Mac, and Windows, and is dedicated to delivering high quality video editing and animation solutions to the world.
OpenShot Video Editor is an award-winning free and open-source video editor for Linux, Mac, and Windows, and is dedicated to delivering high quality v
Video Editor for Linux
Project on break until late March. NEW RELEASE 2.8 IS OUT NOW. INSTALLING: see here. RELEASE NOTES AVAILABLE here. Introduction Features Releases Inst
Repository to create Ascii art in CMD based on video file.
Made to take any file format, and transform it into ascii art, displayed as a video in the cmd. If the cmd formatting is wrong, try zooming a little and remember to make cmd fullscreen. I made my cmd fullscreen, and zoomed out one tic. Written in Python 3.9
Todos os exercícios do Curso de Python, do canal Curso em Vídeo, resolvidos em Python, Javascript, Java, C++, C# e mais...
Exercícios - CeV Oferecido por Linguagens utilizadas atualmente O que vai encontrar aqui? 👀 Esse repositório é dedicado a armazenar todos os enunciad
Dynamic image server for web and print
Quru Image Server - dynamic imaging for web and print QIS is a high performance web server for creating and delivering dynamic images. It is ideal for
Implementation of TimeSformer, a pure attention-based solution for video classification
TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.
Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks
Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks. It takes raw videos/images + text as inputs, and outputs task predictions. ClipBERT is designed based on 2D CNNs and transformers, and uses a sparse sampling strategy to enable efficient end-to-end video-and-language learning.
This program is to make a video based on Deep Dream
This program is to make a video based on Deep Dream. The program is modified from DeepDreamAnim and DeepDreamVideo with additional functions for bleding two frames based on the optical flows. It also supports the image division to apply the Deep Dream algorithm to a large image.
Knowledge Management for Humans using Machine Learning & Tags
HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.
Knowledge Management for Humans using Machine Learning & Tags
HyperTag helps humans intuitively express how they think about their files using tags and machine learning. Represent how you think using tags. Find what you look for using semantic search for your text documents (yes, even PDF's) and images.
Neural Re-rendering for Full-frame Video Stabilization
NeRViS: Neural Re-rendering for Full-frame Video Stabilization Project Page | Video | Paper | Google Colab Setup Setup environment for [Yu and Ramamoo
NeRViS: Neural Re-rendering for Full-frame Video Stabilization
Neural Re-rendering for Full-frame Video Stabilization
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
Regex Converter for Flask URL Routes
Flask-Reggie Enable Regex Routes within Flask Installation pip install flask-reggie Configuration To enable regex routes within your application from
A tool to convert AWS EC2 instances back and forth between On-Demand and Spot billing models.
ec2-spot-converter This tool converts existing AWS EC2 instances back and forth between On-Demand and 'persistent' Spot billing models while preservin
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.
Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt
A vision library for performing sliced inference on large images/small objects
SAHI: Slicing Aided Hyper Inference A vision library for performing sliced inference on large images/small objects Overview Object detection and insta
Efficient 3D Backbone Network for Temporal Modeling
VoV3D is an efficient and effective 3D backbone network for temporal modeling implemented on top of PySlowFast. Diverse Temporal Aggregation and
the code used for the preprint Embedding-based Instance Segmentation of Microscopy Images.
EmbedSeg Introduction This repository hosts the version of the code used for the preprint Embedding-based Instance Segmentation of Microscopy Images.
Download courses from khanacademy.org
khan-dl A python script to download courses from Khan Academy using youtube-dl and beautifulsoup4.
Real-time video and audio streams over the network, with Streamlit.
streamlit-webrtc Example You can try out the sample app using the following commands.
CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images
CFC-Net This project hosts the official implementation for the paper: CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Dete
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
Regex Converter for Flask URL Routes
Flask-Reggie Enable Regex Routes within Flask Installation pip install flask-reggie Configuration To enable regex routes within your application from
Easy file uploads for Flask.
Library that works with Flask & SqlAlchemy to store files on your server & in your database Read the docs: Documentation Installation Please install t
Script for downloading Coursera.org videos and naming them.
Coursera Downloader Coursera Downloader Introduction Features Disclaimer Installation instructions Recommended installation method for all Operating S
Python Script to download hundreds of images from 'Google Images'. It is a ready-to-run code!
Google Images Download Python Script for 'searching' and 'downloading' hundreds of Google images to the local hard disk! Documentation Documentation H
Command-line program to download videos from YouTube.com and other video sites
youtube-dl - download videos from youtube.com or other video platforms INSTALLATION DESCRIPTION OPTIONS CONFIGURATION OUTPUT TEMPLATE FORMAT SELECTION
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search
CLIP-GLaSS Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search An in-browser demo is
Boltstream Live Video Streaming Website + Backend
Boltstream Self-hosted Live Video Streaming Website + Backend Reference
a pull switch (or BYO button) that gets you out of video calls, quick
zoomout a pull switch (or BYO button) that gets you out of video calls, quick. As seen on Twitter System compatibility Tested on macOS Catalina (10.15