2148 Repositories
Python referring-video-object-segmentation Libraries
RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection
RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection For more details, please refer to our paper. Citing Please cite the related works
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Use Convolutional Recurrent Neural Network to recognize the Handwritten line text image without pre segmentation into words or characters. Use CTC loss Function to train.
Handwritten Line Text Recognition using Deep Learning with Tensorflow Description Use Convolutional Recurrent Neural Network to recognize the Handwrit
Handwritten Number Recognition using CNN and Character Segmentation
Handwritten-Number-Recognition-With-Image-Segmentation Info About this repository This Repository is aimed at reading handwritten images of numbers an
Document Layout Analysis
Eynollah Document Layout Analysis Introduction This tool performs document layout analysis (segmentation) from image data and returns the results as P
OCR-D-compliant page segmentation
ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e
a deep learning model for page layout analysis / segmentation.
OCR Segmentation a deep learning model for page layout analysis / segmentation. dependencies tensorflow1.8 python3 dataset: uw3-framed-lines-degraded-
ocroseg - This is a deep learning model for page layout analysis / segmentation.
ocroseg This is a deep learning model for page layout analysis / segmentation. There are many different ways in which you can train and run it, but by
Page to PAGE Layout Analysis Tool
P2PaLA Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks. 💥 Try our new DEMO for online baseli
Generic framework for historical document processing
dhSegment dhSegment is a tool for Historical Document Processing. Its generic approach allows to segment regions and extract content from different ty
Deep Learning Chinese Word Segment
引用 本项目模型BiLSTM+CRF参考论文:http://www.aclweb.org/anthology/N16-1030 ,IDCNN+CRF参考论文:https://arxiv.org/abs/1702.02098 构建 安装好bazel代码构建工具,安装好tensorflow(目前本项目需
Detect handwritten words in a text-line (classic image processing method).
Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro
Character Segmentation using TensorFlow
Character Segmentation Segment characters and spaces in one text line,from this paper Chinese English mixed Character Segmentation as Semantic Segment
Detect textlines in document images
Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data
Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.
Deskew by Marek Mauder https://galfar.vevb.net/deskew https://github.com/galfar/deskew v1.30 2019-06-07 Overview Deskew is a command line tool for des
MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition
MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition Python 2.7 Python 3.6 MORAN is a network with rectification mechanism for
About A python based Apple Quicktime protocol,you can record audio and video from real iOS devices
介绍 本应用程序使用 python 实现,可以通过 USB 连接 iOS 设备进行屏幕共享 高帧率(30〜60fps) 高画质 低延迟(200ms) 非侵入性 支持多设备并行 Mac OSX 安装 python =3.7 brew install libusb pkg-config 如需使用 g
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
Unofficial implementation of "TTNet: Real-time temporal and spatial video analysis of table tennis" (CVPR 2020)
TTNet-Pytorch The implementation for the paper "TTNet: Real-time temporal and spatial video analysis of table tennis" An introduction of the project c
OpenMMLab Detection Toolbox and Benchmark
MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.
Two-stage CenterNet
Probabilistic two-stage detection Two-stage object detectors that use class-agnostic one-stage detectors as the proposal network. Probabilistic two-st
Track to Detect and Segment: An Online Multi-Object Tracker (CVPR 2021)
Track to Detect and Segment: An Online Multi-Object Tracker (CVPR 2021) Track to Detect and Segment: An Online Multi-Object Tracker Jialian Wu, Jiale
Simple screen recorder
Kooha Simple screen recorder Description Kooha is a simple screen recorder built with GTK. It allows you to record your screen and also audio from you
Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)
ReDet: A Rotation-equivariant Detector for Aerial Object Detection ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR2021), Jiam
PubMed Mapper: A Python library that map PubMed XML to Python object
pubmed-mapper: A Python Library that map PubMed XML to Python object 中文文档 1. Philosophy view UML Programmatically access PubMed article is a common ta
CTC segmentation python package
CTC segmentation CTC segmentation can be used to find utterances alignments within large audio files. This repository contains the ctc-segmentation py
FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction
FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction. It uses a customized encoder decoder architecture with spatio-temporal convolutions and channel gating to capture and interpolate complex motion trajectories between frames to generate realistic high frame rate videos. This repository contains original source code for the paper accepted to CVPR 2021.
The next generation relational database.
What is EdgeDB? EdgeDB is an open-source object-relational database built on top of PostgreSQL. The goal of EdgeDB is to empower its users to build sa
[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers
VisTR: End-to-End Video Instance Segmentation with Transformers This is the official implementation of the VisTR paper: Installation We provide instru
An official implementation of "SFNet: Learning Object-aware Semantic Correspondence" (CVPR 2019, TPAMI 2020) in PyTorch.
PyTorch implementation of SFNet This is the implementation of the paper "SFNet: Learning Object-aware Semantic Correspondence". For more information,
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation
CoTr: Efficient 3D Medical Image Segmentation by bridging CNN and Transformer This is the official pytorch implementation of the CoTr: Paper: CoTr: Ef
Repo for CVPR2021 paper "QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information"
QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information by Masato Tamura, Hiroki Ohashi, and Tomoaki Yosh
[CVPR2021] The source code for our paper 《Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning》.
TBE The source code for our paper "Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Le
Tensorflow 2 Object Detection API kurulumu, GPU desteği, custom model hazırlama
Tensorflow 2 Object Detection API Bu tutorial, TensorFlow 2.x'in kararlı sürümü olan TensorFlow 2.3'ye yöneliktir. Bu, görüntülerde / videoda nesne a
This Bot can extract audios and subtitles from video files
Send any valid video file and the bot shows you available streams in it that can be extracted!!
Object-Centric Learning with Slot Attention
Slot Attention This is a re-implementation of "Object-Centric Learning with Slot Attention" in PyTorch (https://arxiv.org/abs/2006.15055). Requirement
[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator
involution Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVP
[CVPR2021 Oral] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers This is the official PyTorch implementation and models for UP-DETR paper: @a
This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).
TransBTS: Multimodal Brain Tumor Segmentation Using Transformer This repo is the official implementation for TransBTS: Multimodal Brain Tumor Segmenta
Object Depth via Motion and Detection Dataset
ODMD Dataset ODMD is the first dataset for learning Object Depth via Motion and Detection. ODMD training data are configurable and extensible, with ea
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021
Language: 简体中文 | English Introduction This is the code for Multiple Instance Active Learning for Object Detection, CVPR 2021. Installation A Linux pla
Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125
Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc
Image augmentation for machine learning experiments.
imgaug This python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much lar
Gluon CV Toolkit
Gluon CV Toolkit | Installation | Documentation | Tutorials | GluonCV provides implementations of the state-of-the-art (SOTA) deep learning models in
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥
TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens
Official code for paper "Optimization for Oriented Object Detection via Representation Invariance Loss".
Optimization for Oriented Object Detection via Representation Invariance Loss By Qi Ming, Zhiqiang Zhou, Lingjuan Miao, Xue Yang, and Yunpeng Dong. Th
Object detection on multiple datasets with an automatically learned unified label space.
Simple multi-dataset detection An object detector trained on multiple large-scale datasets with a unified label space; Winning solution of E
Localization Distillation for Object Detection
Localization Distillation for Object Detection This repo is based on mmDetection. This is the code for our paper: Localization Distillation
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic.
Automatic Video Library Manager for TV Shows. It watches for new episodes of your favorite shows, and when they are posted it does its magic. Exclusiv
Been busy guys, will be reviewing and integrating pull requests shortly. Thanks to all contributors! LATEST RELEASE: 6.0.0 - flatpak @ https://flathub.org/apps/details/com.ozmartians.VidCutter - snap @ https://snapcraft.io/vidcutter - see https://github.com/ozmartian/vidcutter/releases for more details...
VidCutter 6 released on Flathub! VidCutter is now available as a flatpak at Flathub and is the most reliable option for Linux. All dependencies come b
plumi video sharing
December 2017 update We are moving tickets from the Plumi tracker (trac.plumi.org) here, for historical reasons. Plumi video sharing system Plumi is a
OpenShot Video Editor is an award-winning free and open-source video editor for Linux, Mac, and Windows, and is dedicated to delivering high quality video editing and animation solutions to the world.
OpenShot Video Editor is an award-winning free and open-source video editor for Linux, Mac, and Windows, and is dedicated to delivering high quality v
Video Editor for Linux
Project on break until late March. NEW RELEASE 2.8 IS OUT NOW. INSTALLING: see here. RELEASE NOTES AVAILABLE here. Introduction Features Releases Inst
Pytorch Code for "Medical Transformer: Gated Axial-Attention for Medical Image Segmentation"
Medical-Transformer Pytorch Code for the paper "Medical Transformer: Gated Axial-Attention for Medical Image Segmentation" About this repo: This repo
Repository to create Ascii art in CMD based on video file.
Made to take any file format, and transform it into ascii art, displayed as a video in the cmd. If the cmd formatting is wrong, try zooming a little and remember to make cmd fullscreen. I made my cmd fullscreen, and zoomed out one tic. Written in Python 3.9
CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
CLOCs is a novel Camera-LiDAR Object Candidates fusion network. It provides a low-complexity multi-modal fusion framework that improves the performance of single-modality detectors. CLOCs operates on the combined output candidates of any 3D and any 2D detector, and is trained to produce more accurate 3D and 2D detection results.
Todos os exercícios do Curso de Python, do canal Curso em Vídeo, resolvidos em Python, Javascript, Java, C++, C# e mais...
Exercícios - CeV Oferecido por Linguagens utilizadas atualmente O que vai encontrar aqui? 👀 Esse repositório é dedicado a armazenar todos os enunciad
Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.
SETR - Pytorch Since the original paper (Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.) has no official
A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection
Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection 1. 介绍 用以替代 NMS,在所有 bbox 中挑选出最优的集合。 NMS 仅考虑了 bbox 的得分,然后根据 IOU 来
This is a new web-based photo management application. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, location awareness, color analysis and other ML algorithms.
Photonix Photo Manager This is a photo management application based on web technologies. Run it on your home server and it will let you find what you
This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
TransUNet This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation Usage
SSD: Single Shot MultiBox Detector pytorch implementation focusing on simplicity
SSD: Single Shot MultiBox Detector Introduction Here is my pytorch implementation of 2 models: SSD-Resnet50 and SSDLite-MobilenetV2.
Implementation of TimeSformer, a pure attention-based solution for video classification
TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.
Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks
Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks. It takes raw videos/images + text as inputs, and outputs task predictions. ClipBERT is designed based on 2D CNNs and transformers, and uses a sparse sampling strategy to enable efficient end-to-end video-and-language learning.
Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals.
Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals This repo contains the Pytorch implementation of our paper: Unsupervised Seman
This program is to make a video based on Deep Dream
This program is to make a video based on Deep Dream. The program is modified from DeepDreamAnim and DeepDreamVideo with additional functions for bleding two frames based on the optical flows. It also supports the image division to apply the Deep Dream algorithm to a large image.
Official PyTorch implementation of Joint Object Detection and Multi-Object Tracking with Graph Neural Networks
This is the official PyTorch implementation of our paper: "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks". Our project website and video demos are here.
Reviving Iterative Training with Mask Guidance for Interactive Segmentation
This repository provides the source code for training and testing state-of-the-art click-based interactive segmentation models with the official PyTorch implementation
Puzzle-CAM: Improved localization via matching partial and full features.
Puzzle-CAM The official implementation of "Puzzle-CAM: Improved localization via matching partial and full features".
Seach Losses of our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.
CSE-Autoloss Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models
Neural Re-rendering for Full-frame Video Stabilization
NeRViS: Neural Re-rendering for Full-frame Video Stabilization Project Page | Video | Paper | Google Colab Setup Setup environment for [Yu and Ramamoo
PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection
Unbiased Teacher for Semi-Supervised Object Detection This is the PyTorch implementation of our paper: Unbiased Teacher for Semi-Supervised Object Detection
NeRViS: Neural Re-rendering for Full-frame Video Stabilization
Neural Re-rendering for Full-frame Video Stabilization
A general 3D Object Detection codebase in PyTorch.
Det3D is the first 3D Object Detection toolbox which provides off the box implementations of many 3D object detection algorithms such as PointPillars, SECOND, PIXOR, etc, as well as state-of-the-art methods on major benchmarks like KITTI(ViP) and nuScenes(CBGS).
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.
pySBD: Python Sentence Boundary Disambiguation (SBD) pySBD - python Sentence Boundary Disambiguation (SBD) - is a rule-based sentence boundary detecti
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
Unsupervised text tokenizer for Neural Network-based text generation.
SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.
Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
Exploring Cross-Image Pixel Contrast for Semantic Segmentation Exploring Cross-Image Pixel Contrast for Semantic Segmentation, Wenguan Wang, Tianfei Z
A vision library for performing sliced inference on large images/small objects
SAHI: Slicing Aided Hyper Inference A vision library for performing sliced inference on large images/small objects Overview Object detection and insta
Efficient 3D Backbone Network for Temporal Modeling
VoV3D is an efficient and effective 3D backbone network for temporal modeling implemented on top of PySlowFast. Diverse Temporal Aggregation and
Code for "Learning to Segment Rigid Motions from Two Frames".
rigidmask Code for "Learning to Segment Rigid Motions from Two Frames". ** This is a partial release with inference and evaluation code.
Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation
Unseen Object Clustering: Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation Introduction In this work, we propose a new method
the code used for the preprint Embedding-based Instance Segmentation of Microscopy Images.
EmbedSeg Introduction This repository hosts the version of the code used for the preprint Embedding-based Instance Segmentation of Microscopy Images.
Download courses from khanacademy.org
khan-dl A python script to download courses from Khan Academy using youtube-dl and beautifulsoup4.
Real-time video and audio streams over the network, with Streamlit.
streamlit-webrtc Example You can try out the sample app using the following commands.
CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images
CFC-Net This project hosts the official implementation for the paper: CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Dete
Unsupervised text tokenizer focused on computational efficiency
YouTokenToMe YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE)
🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.
pySBD: Python Sentence Boundary Disambiguation (SBD) pySBD - python Sentence Boundary Disambiguation (SBD) - is a rule-based sentence boundary detecti
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
Unsupervised text tokenizer for Neural Network-based text generation.
SentencePiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabu
Async ODM (Object Document Mapper) for MongoDB based on python type hints
ODMantic Documentation: https://art049.github.io/odmantic/ Asynchronous ODM(Object Document Mapper) for MongoDB based on standard python type hints. I
A Pythonic, object-oriented interface for working with MongoDB.
PyMODM MongoDB has paused the development of PyMODM. If there are any users who want to take over and maintain this project, or if you just have quest
Pony Object Relational Mapper
Downloads Pony Object-Relational Mapper Pony is an advanced object-relational mapper. The most interesting feature of Pony is its ability to write que
A Python Object-Document-Mapper for working with MongoDB
MongoEngine Info: MongoEngine is an ORM-like layer on top of PyMongo. Repository: https://github.com/MongoEngine/mongoengine Author: Harry Marr (http:
Object-oriented file system path manipulation
path (aka path pie, formerly path.py) implements path objects as first-class entities, allowing common operations on files to be invoked on those path
Python library for serializing any arbitrary object graph into JSON. It can take almost any Python object and turn the object into JSON. Additionally, it can reconstitute the object back into Python.
jsonpickle jsonpickle is a library for the two-way conversion of complex Python objects and JSON. jsonpickle builds upon the existing JSON encoders, s