1269 Python Low-level-vision Libraries

Deep Surface Reconstruction from Point Clouds with Visibility Information

Data, code and pretrained models for the paper Deep Surface Reconstruction from Point Clouds with Visibility Information.

23 Jan 4, 2023

Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj

845 Jan 3, 2023

A Novel Plug-in Module for Fine-grained Visual Classification

Pytorch implementation for A Novel Plug-in Module for Fine-Grained Visual Classification. fine-grained visual classification task.

109 Dec 20, 2022

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation This reposi

First Person Vision @ Image Processing Laboratory - University of Catania

1 Aug 21, 2022

Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

50 Nov 12, 2022

Pytorch implementation of TailCalibX : Feature Generation for Long-tail Classification

TailCalibX : Feature Generation for Long-tail Classification by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi [arXiv] [

34 Jan 2, 2023

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition Paper: MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition accepted fo

64 Dec 18, 2022

Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

Welcome to AirSim AirSim is a simulator for drones, cars and more, built on Unreal Engine (we now also have an experimental Unity release). It is open

13.8k Jan 3, 2023

An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification The repostiory consists of the code, results and data set links for

12 Dec 26, 2022

Face recognition project by matching the features extracted using SIFT.

MV_FaceDetectionWithSIFT Face recognition project by matching the features extracted using SIFT. By : Aria Radmehr Professor : Ali Amiri Dependencies

4 May 31, 2022

Building a Mario-like, classic platformer game in Python using the PyGame Library

1 Feb 6, 2022

An awesome list of AI for art and design - resources, and popular datasets and how we may apply computer vision tasks to art and design.

Awesome AI for Art & Design An awesome list of AI for art and design - resources, and popular datasets and how we may apply computer vision tasks to a

20 Dec 21, 2022

Reso is a low-level circuit design language and simulator, inspired by things like Redstone, Conway's Game of Life, and Wireworld.

Reso Reso is a low-level circuit design language and simulator, inspired by things like Redstone, Conway's Game of Life, and Wireworld. What is Reso?

287 Nov 26, 2022

Python Computer Vision Aim Bot for Roblox's Phantom Forces

Python-Phantom-Forces-Aim-Bot Python Computer Vision Aim Bot for Roblox's Phanto

2 Jul 11, 2022

Semantic Segmentation Suite in TensorFlow

Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!

2.5k Jan 6, 2023

LabelMe annotation tool source code

LabelMe annotation tool source code Here you will find the source code to install the LabelMe annotation tool on your server. LabelMe is an annotation

1.3k Jan 3, 2023

Mapomatic - Automatic mapping of compiled circuits to low-noise sub-graphs

mapomatic Automatic mapping of compiled circuits to low-noise sub-graphs Overvie

27 Nov 6, 2022

This project proposes a camera vision based cursor control system, using hand moment captured from a webcam through a landmarks of hand by using Mideapipe module

2 Feb 20, 2022

Wordle-Python - A simple low-key clone of the popular game WORDLE made with python and a 2D Graphics module Pygame

Wordle-Python A simple low-key clone of the popular game WORDLE made with python

7 Feb 10, 2022

Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

ADE20k Semantic segmentation with MAE Getting started Install the mmsegmentation

97 Dec 17, 2022

Real-time domain adaptation for semantic segmentation

Advanced-Machine-Learning This repository contains the code for the project Real

1 Jan 30, 2022

The 7th edition of NTIRE: New Trends in Image Restoration and Enhancement workshop will be held on June 2022 in conjunction with CVPR 2022.

NTIRE 2022 - Image Inpainting Challenge Important dates 2022.02.01: Release of train data (input and output images) and validation data (only input) 2

37 Nov 27, 2022

Python based utilities for interacting with digital multimeters that are built on the FS9721-LP3 chipset.

1 Feb 2, 2022

The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain

bct_file_generator_for_EasyGSH The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain

1 Jul 8, 2022

PCAfold is an open-source Python library for generating, analyzing and improving low-dimensional manifolds obtained via Principal Component Analysis (PCA).

4 Oct 13, 2022

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

1.3k Dec 31, 2022

Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta

88 Dec 28, 2022

Multilingual finetuning of Machine Translation model on low-resource languages. Project for Deep Natural Language Processing course.

Low-resource-Machine-Translation This repository contains the code for the project relative to the course Deep Natural Language Processing. The goal o

3 Jun 22, 2022

Image processing is one of the most common term in computer vision

Image processing is one of the most common term in computer vision. Computer vision is the process by which computers can understand images and videos, and how they are stored, manipulated, and retrieve details from them. OpenCV is an open source computer vision image processing library for machine learning, deep leaning and AI application which plays a major role in real-time operation which is very important in today’s systems.

3 Feb 15, 2022

Optical character recognition for Japanese text, with the main focus being Japanese manga

Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran

327 Jan 1, 2023

DocEnTr: An end-to-end document image enhancement transformer

DocEnTR Description Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer. This model is implemented on to

19 Jan 29, 2022

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and rearranging captions and pictures. Unlike other versions of the model we use BERT for text encoder and SWIN transformer for image encoder.

ruCLIP-SB RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and re

5 Apr 13, 2022

Computer Vision Paper Reviews with Key Summary of paper, End to End Code Practice and Jupyter Notebook converted papers

Computer-Vision-Paper-Reviews Computer Vision Paper Reviews with Key Summary along Papers & Codes. Jonathan Choi 2021 The repository provides 100+ Pap

2 Mar 17, 2022

[ECE NTUA] 👁 Computer Vision - Lab Projects & Theoretical Problem Sets (2020-2021)

Computer Vision - NTUA (2020-2021) This repository hosts the lab projects and theoretical problem sets of the Computer Vision course held by ECE NTUA

6 Jul 21, 2022

🤖 Project template for your next awesome AI project. 🦾

🤖 AI Awesome Project Template 👋 Template author You may want to adjust badge links in a README.md file. 💎 Installation with pip Installation is as

18 Nov 23, 2022

Deep ViT Features as Dense Visual Descriptors

dino-vit-features [paper] [project page] Official implementation of the paper "Deep ViT Features as Dense Visual Descriptors". We demonstrate the effe

113 Dec 24, 2022

PyTorch implementation of "VRT: A Video Restoration Transformer"

VRT: A Video Restoration Transformer Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, Luc Van Gool Computer

837 Jan 9, 2023

This is the code repository for the paper A hierarchical semantic segmentation framework for computer-vision-based bridge column damage detection

Bridge-damage-segmentation This is the code repository for the paper A hierarchical semantic segmentation framework for computer-vision-based bridge c

5 Dec 7, 2022

Annotate datasets with a semi-trained or fully trained YOLOv5 model

YOLOv5 Auto Annotator Annotate datasets with a semi-trained or fully trained YOLOv5 model Prerequisites Ubuntu =20.04 Python =3.7 System dependencie

3 May 14, 2022

This repository contains a toolkit for collecting, labeling and tracking object keypoints

This repository contains a toolkit for collecting, labeling and tracking object keypoints. Object keypoints are semantic points in an object's coordinate frame.

13 Dec 12, 2022

Geometric Interpretation of Matrix Square Root and Inverse Square Root

Fast Differentiable Matrix Sqrt Root Geometric Interpretation of Matrix Square Root and Inverse Square Root This repository constains the official Pyt

5 Jan 24, 2022

PyTorch implementation for the paper Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime

Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime Created by Prarthana Bhattacharyya. Disclaimer: This is n

5 Nov 8, 2022

Explore the Expression: Facial Expression Generation using Auxiliary Classifier Generative Adversarial Network

Explore the Expression: Facial Expression Generation using Auxiliary Classifier Generative Adversarial Network This is the official implementation of

2 Jul 9, 2022

Low Complexity Channel estimation with Neural Network Solutions

Interpolation-ResNet Invited paper for WSA 2021, called 'Low Complexity Channel estimation with Neural Network Solutions'. Low complexity residual con

10 Dec 10, 2022

Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer.

DocEnTR Description Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer. This model is implemented on to

74 Jan 7, 2023

Vision transformers (ViTs) have found only limited practical use in processing images

CXV Convolutional Xformers for Vision Vision transformers (ViTs) have found only limited practical use in processing images, in spite of their state-o

23 Sep 10, 2022

Labelme is a graphical image annotation tool, It is written in Python and uses Qt for its graphical interface

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

9.6k Jan 9, 2023

This repository holds those infrastructure-level modules, that every application requires that follows the core 12-factor principles.

py-12f-common About This repository holds those infrastructure-level modules, that every application requires that follows the core 12-factor principl

1 Dec 15, 2022

Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch

Semantic Segmentation Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch Features Applicable to followin

530 Jan 5, 2023

Object classification with basic computer vision techniques

naive-image-classification Object classification with basic computer vision techniques. Final assignment for the computer vision course I took at univ

2 Jul 1, 2022

Computer vision applications project (Flask and OpenCV)

Computer Vision Applications Project This project is at it's initial phase. This is all about the implementation of different computer vision techniqu

1 Jan 26, 2022

Eye-Blink-Counter - Python based Computer Vision project which counts how many time a person blinks

Eye Blink Counter OpenCV and Mediapipe No Blink Blink

2 Mar 24, 2022

Vit-ImageClassification - Pytorch ViT for Image classification on the CIFAR10 dataset

Vit-ImageClassification Introduction This project uses ViT to perform image clas

4 Jun 1, 2022

Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications

Labelbox Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications. Use this github repository to help you s

1.7k Dec 29, 2022

CVAT is free, online, interactive video and image annotation tool for computer vision

Computer Vision Annotation Tool (CVAT) CVAT is free, online, interactive video and image annotation tool for computer vision. It is being used by our

8.6k Jan 4, 2023

A curated list and survey of awesome Vision Transformers.

English | 简体中文 A curated list and survey of awesome Vision Transformers. You can use mind mapping software to open the mind mapping source file. You c

281 Dec 21, 2022

DaCe is a parallel programming framework that takes code in Python/NumPy and other programming languages

aCe - Data-Centric Parallel Programming Decoupling domain science from performance optimization. DaCe is a parallel programming framework that takes c

330 Dec 30, 2022

Awesome Transformers in Medical Imaging

This repo supplements our Survey on Transformers in Medical Imaging Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat,

666 Jan 6, 2023

Pytorch Implementation of Residual Vision Transformers(ResViT)

ResViT Official Pytorch Implementation of Residual Vision Transformers(ResViT) which is described in the following paper: Onat Dalmaz and Mahmut Yurt

41 Dec 8, 2022

This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"

ICCV Workshop 2021 VTGAN This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"

25 Dec 8, 2022

COVID-VIT: Classification of Covid-19 from CT chest images based on vision transformer models

COVID-ViT COVID-VIT: Classification of Covid-19 from CT chest images based on vision transformer models This code is to response to te MIA-COV19 compe

17 Dec 30, 2022

Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (CVAMD)

Is it Time to Replace CNNs with Transformers for Medical Images? Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (C

80 Dec 27, 2022

Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets

Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets (including obl

1.7k Dec 22, 2022

🛰️ Awesome Satellite Imagery Datasets

Awesome Satellite Imagery Datasets List of aerial and satellite imagery datasets with annotations for computer vision and deep learning. Newest datase

3k Jan 3, 2023

Source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree.

self-driving-car In this repository I will share the source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree. Hope this might

2.4k Dec 29, 2022

Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation

Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation (CVPR2019) This is a pytorch implementatio

280 Jan 1, 2023

Llvlir - Low Level Variable Length Intermediate Representation

Low Level Variable Length Intermediate Representation Low Level Variable Length

2 Jan 24, 2022

PyTorch implemention of ICCV'21 paper SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation

SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation This is the PyTorch implemention of ICCV'21 paper SGPA: Structure

24 Dec 5, 2022

Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore

[AI6122] Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course is Prof. Lu Shijian.

5 Sep 12, 2022

Exadel CompreFace is a free and open-source face recognition GitHub project

Exadel CompreFace is a leading free and open-source face recognition system Exadel CompreFace is a free and open-source face recognition service that

2.6k Jan 4, 2023

TetrisAI - Tetris AI Bot using computer vision to play game automatically

Tetris AI Tetris AI Bot using computer vision to play game automatically bot.py

11 Aug 29, 2022

Pytorch implementation of ICASSP 2022 paper Attention Probe: Vision Transformer Distillation in the Wild

Attention Probe: Vision Transformer Distillation in the Wild Jiahao Wang, Mingdeng Cao, Shuwei Shi, Baoyuan Wu, Yujiu Yang In ICASSP 2022 This code is

6 Sep 21, 2022

Attention Probe: Vision Transformer Distillation in the Wild

Attention Probe: Vision Transformer Distillation in the Wild Jiahao Wang, Mingdeng Cao, Shuwei Shi, Baoyuan Wu, Yujiu Yang In ICASSP 2022 This code is

3 Oct 31, 2022

Learn Basic to advanced level Data visualisation techniques from this Repository

Data visualisation Hey, You can learn Basic to advanced level Data visualisation techniques from this Repository. Data visualization is the graphic re

16 Jan 3, 2023

Are you obsessed with playing the increasingly-popular word game Wordle?

WORDLE-VISION Up your Wordle game! Are you obsessed with playing the increasingly-popular word game Wordle? Ever wondered what the optimal first word

5 May 10, 2022

WATTS provides a set of Python classes that can manage simulation workflows for multiple codes where information is exchanged at a coarse level

WATTS (Workflow and Template Toolkit for Simulation) provides a set of Python classes that can manage simulation workflows for multiple codes where information is exchanged at a coarse level.

13 Dec 23, 2022

OMNIVORE is a single vision model for many different visual modalities

Omnivore: A Single Model for Many Visual Modalities [paper][website] OMNIVORE is a single vision model for many different visual modalities. It learns

451 Dec 27, 2022

Cereal box identification in store shelves using computer vision and a single train image per model.

Product Recognition on Store Shelves Description You can read the task description here. Report You can read and download our report here. Step A - Mu

1 Jan 21, 2022

A competition for forecasting electricity demand at the country-level using a standard backtesting framework

5 Jul 12, 2022

Fast Differentiable Matrix Sqrt Root

Official Pytorch implementation of ICLR 22 paper Fast Differentiable Matrix Square Root

42 Dec 30, 2022

Deep Learning Topics with Computer Vision & NLP

Deep learning Udacity Course Deep Learning Topics with Computer Vision & NLP for the AWS Machine Learning Engineer Nanodegree Program Tasks are mostly

1 Jan 20, 2022

Global topography (referenced to sea-level) in a 10 arcminute resolution grid

Earth - Topography grid at 10 arc-minute resolution Global 10 arc-minute resolution grids of topography (ETOPO1 ice-surface) referenced to mean sea-le

1 Jan 20, 2022

FFCV: Fast Forward Computer Vision (and other ML workloads!)

Fast Forward Computer Vision: train models at a fraction of the cost with accele

2.3k Jan 3, 2023

GNES enables large-scale index and semantic search for text-to-text, image-to-image, video-to-video and any-to-any content form

GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural network.

1.2k Jan 6, 2023

Collect some papers about transformer with vision. Awesome Transformer with Computer Vision (CV)

Awesome Visual-Transformer Collect some Transformer with Computer-Vision (CV) papers. If you find some overlooked papers, please open issues or pull r

2.8k Jan 8, 2023

Transformer in Vision

Transformer-in-Vision Recent Transformer-based CV and related works. Welcome to comment/contribute! Keep updated. Resource SCENIC: A JAX Library for C

1.1k Dec 30, 2022

Transformer in Computer Vision

Transformer-in-Vision A paper list of some recent Transformer-based CV works. If you find some ignored papers, please open issues or pull requests. **

506 Dec 26, 2022

A general python framework for visual object tracking and video object segmentation, based on PyTorch

PyTracking A general python framework for visual object tracking and video object segmentation, based on PyTorch. 📣 Two tracking/VOS papers accepted

2.6k Jan 4, 2023

Dynamic Token Normalization Improves Vision Transformers

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

20 Oct 9, 2022

FSL-Mate: A collection of resources for few-shot learning (FSL).

FSL-Mate is a collection of resources for few-shot learning (FSL). In particular, FSL-Mate currently contains FewShotPapers: a paper list which tracks

1.5k Jan 8, 2023

Meta Self-learning for Multi-Source Domain Adaptation： A Benchmark

Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark Project | Arxiv | YouTube | | Abstract In recent years, deep learning-based methods

188 Dec 12, 2022

PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet

PyTorch Image Classification Following papers are implemented using PyTorch. ResNet (1512.03385) ResNet-preact (1603.05027) WRN (1605.07146) DenseNet

1.2k Jan 4, 2023

X-VLM: Multi-Grained Vision Language Pre-Training

X-VLM: learning multi-grained vision language alignments Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts. Yan Zeng, Xi

286 Dec 23, 2022

Pipeline for employing a Lightweight deep learning models for LOW-power systems

PL-LOW A high-performance deep learning model lightweight pipeline that gradually lightens deep neural networks in order to utilize high-performance d

9 Aug 13, 2022

A sample pytorch Implementation of ACL 2021 research paper "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

Span-ASTE-Pytorch This repository is a pytorch version that implements Ali's ACL 2021 research paper Learning Span-Level Interactions for Aspect Senti

10 Dec 6, 2022

Python Low-level-vision Resources

Python low-level-vision Libraries

Deep Surface Reconstruction from Point Clouds with Visibility Information

Python code to fuse multiple RGB-D images into a TSDF voxel volume.

A Novel Plug-in Module for Fine-grained Visual Classification

Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

Pytorch implementation of TailCalibX : Feature Generation for Long-tail Classification

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research

An implementation of IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification

Face recognition project by matching the features extracted using SIFT.

Building a Mario-like, classic platformer game in Python using the PyGame Library

An awesome list of AI for art and design - resources, and popular datasets and how we may apply computer vision tasks to art and design.

Reso is a low-level circuit design language and simulator, inspired by things like Redstone, Conway's Game of Life, and Wireworld.

Python Computer Vision Aim Bot for Roblox's Phantom Forces

Semantic Segmentation Suite in TensorFlow

LabelMe annotation tool source code

Mapomatic - Automatic mapping of compiled circuits to low-noise sub-graphs

This project proposes a camera vision based cursor control system, using hand moment captured from a webcam through a landmarks of hand by using Mideapipe module

Wordle-Python - A simple low-key clone of the popular game WORDLE made with python and a 2D Graphics module Pygame

Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

Real-time domain adaptation for semantic segmentation

The 7th edition of NTIRE: New Trends in Image Restoration and Enhancement workshop will be held on June 2022 in conjunction with CVPR 2022.

Python based utilities for interacting with digital multimeters that are built on the FS9721-LP3 chipset.

The aim is to extract timeseries water level 2D information for any designed boundaries within the EasyGSH model domain

PCAfold is an open-source Python library for generating, analyzing and improving low-dimensional manifolds obtained via Principal Component Analysis (PCA).

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

Multilingual finetuning of Machine Translation model on low-resource languages. Project for Deep Natural Language Processing course.

Image processing is one of the most common term in computer vision

Optical character recognition for Japanese text, with the main focus being Japanese manga

DocEnTr: An end-to-end document image enhancement transformer

RuCLIP-SB (Russian Contrastive Language–Image Pretraining SWIN-BERT) is a multimodal model for obtaining images and text similarities and rearranging captions and pictures. Unlike other versions of the model we use BERT for text encoder and SWIN transformer for image encoder.

Computer Vision Paper Reviews with Key Summary of paper, End to End Code Practice and Jupyter Notebook converted papers

[ECE NTUA] 👁 Computer Vision - Lab Projects & Theoretical Problem Sets (2020-2021)

🤖 Project template for your next awesome AI project. 🦾

Deep ViT Features as Dense Visual Descriptors

PyTorch implementation of "VRT: A Video Restoration Transformer"

This is the code repository for the paper A hierarchical semantic segmentation framework for computer-vision-based bridge column damage detection

Annotate datasets with a semi-trained or fully trained YOLOv5 model

This repository contains a toolkit for collecting, labeling and tracking object keypoints

Geometric Interpretation of Matrix Square Root and Inverse Square Root

PyTorch implementation for the paper Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime

Explore the Expression: Facial Expression Generation using Auxiliary Classifier Generative Adversarial Network

Low Complexity Channel estimation with Neural Network Solutions

Pytorch implementation of the paper DocEnTr: An End-to-End Document Image Enhancement Transformer.

Vision transformers (ViTs) have found only limited practical use in processing images

Labelme is a graphical image annotation tool, It is written in Python and uses Qt for its graphical interface

This repository holds those infrastructure-level modules, that every application requires that follows the core 12-factor principles.

Easy to use and customizable SOTA Semantic Segmentation models with abundant datasets in PyTorch

Object classification with basic computer vision techniques

Computer vision applications project (Flask and OpenCV)

Eye-Blink-Counter - Python based Computer Vision project which counts how many time a person blinks

Vit-ImageClassification - Pytorch ViT for Image classification on the CIFAR10 dataset

Labelbox is the fastest way to annotate data to build and ship artificial intelligence applications

CVAT is free, online, interactive video and image annotation tool for computer vision

A curated list and survey of awesome Vision Transformers.

DaCe is a parallel programming framework that takes code in Python/NumPy and other programming languages

Awesome Transformers in Medical Imaging

Pytorch Implementation of Residual Vision Transformers(ResViT)

This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"

COVID-VIT: Classification of Covid-19 from CT chest images based on vision transformer models

Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (CVAMD)

Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets

🛰️ Awesome Satellite Imagery Datasets

Source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree.

Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation

Llvlir - Low Level Variable Length Intermediate Representation

PyTorch implemention of ICCV'21 paper SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation

Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore

Exadel CompreFace is a free and open-source face recognition GitHub project

TetrisAI - Tetris AI Bot using computer vision to play game automatically

Pytorch implementation of ICASSP 2022 paper Attention Probe: Vision Transformer Distillation in the Wild

Attention Probe: Vision Transformer Distillation in the Wild

Learn Basic to advanced level Data visualisation techniques from this Repository

Are you obsessed with playing the increasingly-popular word game Wordle?

WATTS provides a set of Python classes that can manage simulation workflows for multiple codes where information is exchanged at a coarse level

OMNIVORE is a single vision model for many different visual modalities

Cereal box identification in store shelves using computer vision and a single train image per model.