1080 Python Computer-vision Libraries

Apply different text recognition services to images of handwritten documents.

Handprint The Handwritten Page Recognition Test is a command-line program that invokes HTR (handwritten text recognition) services on images of docume

117 Jan 2, 2023

Implementation EfficientDet: Scalable and Efficient Object Detection in PyTorch

1.4k Dec 29, 2022

kyle's vision of how datadog's python client should look

kyle's datadog python vision/proposal not for production use See examples/comprehensive.py for a mostly working example of the proposed API. 📈 🐶 ❤️

2 Nov 21, 2021

The best way to convert files on your computer, be it .pdf to .png, .pdf to .docx, .png to .ico, or anything you can imagine.

2 Nov 20, 2021

This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

TransMix: Attend to Mix for Vision Transformers This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transf

32 Nov 21, 2021

A deep learning object detector framework written in Python for supporting Land Search and Rescue Missions.

AIR: Aerial Inspection RetinaNet for supporting Land Search and Rescue Missions AIR is a deep learning based object detection solution to automate the

13 Dec 22, 2022

This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

TransMix: Attend to Mix for Vision Transformers This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transf

130 Jan 1, 2023

A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image.

Minimal Body A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image. The model file is only 51.2 MB and runs a

49 Dec 5, 2022

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners This is a coarse version for MAE, only make the pretrain model, the fine

214 Dec 29, 2022

QED-C: The Quantum Economic Development Consortium provides these computer programs and software for use in the fields of quantum science and engineering.

Application-Oriented Performance Benchmarks for Quantum Computing This repository contains a collection of prototypical application- or algorithm-cent

67 Nov 30, 2022

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park, Rares Ambrus, Vitor Guizilini, Jie Li, and Adrien Gaidon.

DD3D: "Is Pseudo-Lidar needed for Monocular 3D Object detection?" Install // Datasets // Experiments // Models // License // Reference Full video Offi

Toyota Research Institute - Machine Learning

364 Dec 27, 2022

PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-supervised ViT.

MAE for Self-supervised ViT Introduction This is an unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-sup

36 Oct 30, 2022

AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

AgeGuesser AgeGuesser is an end-to-end, deep-learning based Age Estimation system, presented at the CAIP 2021 conference. You can find the related pap

5 Nov 10, 2022

Generating Videos with Scene Dynamics

Generating Videos with Scene Dynamics This repository contains an implementation of Generating Videos with Scene Dynamics by Carl Vondrick, Hamed Pirs

706 Jan 4, 2023

Stacked Generative Adversarial Networks

Stacked Generative Adversarial Networks This repository contains code for the paper "Stacked Generative Adversarial Networks", CVPR 2017. Part of the

241 May 7, 2022

MoCoGAN: Decomposing Motion and Content for Video Generation

MoCoGAN: Decomposing Motion and Content for Video Generation This repository contains an implementation and further details of MoCoGAN: Decomposing Mo

514 Dec 18, 2022

🔥3D-RecGAN in Tensorflow (ICCV Workshops 2017)

3D Object Reconstruction from a Single Depth View with Adversarial Learning Bo Yang, Hongkai Wen, Sen Wang, Ronald Clark, Andrew Markham, Niki Trigoni

125 Nov 26, 2022

Official Chainer implementation of GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral)

GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral) [Project] [Paper] [Demo] [Related Work: A2RL (for Auto Image Cropping)] [C

402 Dec 27, 2022

[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs

Context Encoders: Feature Learning by Inpainting CVPR 2016 [Project Website] [Imagenet Results] Sample results on held-out images: This is the trainin

829 Dec 31, 2022

Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

CycleGAN PyTorch | project page | paper Torch implementation for learning an image-to-image translation (i.e. pix2pix) without input-output pairs, for

11.5k Dec 30, 2022

Image-to-image translation with conditional adversarial nets

pix2pix Project | Arxiv | PyTorch Torch implementation for learning a mapping from input images to output images, for example: Image-to-Image Translat

9.3k Jan 8, 2023

A simple interface for editing natural photos with generative neural networks.

Neural Photo Editor A simple interface for editing natural photos with generative neural networks. This repository contains code for the paper "Neural

2.1k Dec 29, 2022

Interactive Image Generation via Generative Adversarial Networks

iGAN: Interactive Image Generation via Generative Adversarial Networks Project | Youtube | Paper Recent projects: [pix2pix]: Torch implementation for

3.9k Dec 23, 2022

FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)

FCOS: Fully Convolutional One-Stage Object Detection This project hosts the code for implementing the FCOS algorithm for object detection, as presente

3.1k Jan 5, 2023

Powerful and efficient Computer Vision Annotation Tool (CVAT)

Computer Vision Annotation Tool (CVAT) CVAT is free, online, interactive video and image annotation tool for computer vision. It is being used by our

8.6k Jan 1, 2023

🔎 Super-scale your images and run experiments with Residual Dense and Adversarial Networks.

Image Super-Resolution (ISR) The goal of this project is to upscale and improve the quality of low resolution images. This project contains Keras impl

4k Jan 8, 2023

Gesture Volume Control Using OpenCV and MediaPipe

This Project Uses OpenCV and MediaPipe Hand solutions to identify hands and Change system volume by taking thumb and index finger positions

6 Sep 12, 2022

Official implementation for "Image Quality Assessment using Contrastive Learning"

Image Quality Assessment using Contrastive Learning Pavan C. Madhusudana, Neil Birkbeck, Yilin Wang, Balu Adsumilli and Alan C. Bovik This is the offi

67 Dec 30, 2022

Implementation of light baking system for ray tracing based on Activision's UberBake

Vulkan Light Bakary MSU Graphics Group Student's Diploma Project Treefonov Andrey [GitHub] [LinkedIn] Project Goal The goal of the project is to imple

7 Dec 27, 2022

Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Temporal copying and local hallucination for video inpainting This repository contains the implementation of my master's thesis "Temporal copying and

1 Dec 2, 2022

Pytorch implementation of forward and inverse Haar Wavelets 2D

9 Oct 30, 2022

A clean and extensible PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

A clean and extensible PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners A PyTorch re-implementation of Mask Autoencoder trai

23 Dec 13, 2022

PyTorch implementation of MulMON

MulMON This repository contains a PyTorch implementation of the paper: Learning Object-Centric Representations of Multi-object Scenes from Multiple Vi

16 Nov 3, 2022

Summary of related papers on visual attention

This repo is built for paper: Attention Mechanisms in Computer Vision: A Survey paper Vision-Attention-Papers Channel attention Spatial attention Temp

2.1k Dec 30, 2022

Open source hardware and software platform to build a small scale self driving car.

Donkeycar is minimalist and modular self driving library for Python. It is developed for hobbyists and students with a focus on allowing fast experimentation and easy community contributions.

2.4k Jan 4, 2023

🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code

Knock Knock A small library to get a notification when your training is complete or when it crashes during the process with two additional lines of co

2.5k Jan 7, 2023

Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

GSCNN This is the official code for: Gated-SCNN: Gated Shape CNNs for Semantic Segmentation Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler

859 Dec 26, 2022

UPSNet: A Unified Panoptic Segmentation Network

UPSNet: A Unified Panoptic Segmentation Network Introduction UPSNet is initially described in a CVPR 2019 oral paper. Disclaimer This repository is te

622 Dec 26, 2022

Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

Learning to Adapt Structured Output Space for Semantic Segmentation Pytorch implementation of our method for adapting semantic segmentation from the s

782 Dec 30, 2022

A Kitti Road Segmentation model implemented in tensorflow.

KittiSeg KittiSeg performs segmentation of roads by utilizing an FCN based model. The model achieved first place on the Kitti Road Detection Benchmark

890 Jan 4, 2023

Real-time Joint Semantic Reasoning for Autonomous Driving

MultiNet MultiNet is able to jointly perform road segmentation, car detection and street classification. The model achieves real-time speed and state-

518 Dec 12, 2022

Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images

Keras-ICNet [paper] Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images. Training in progress! Requisites Python 3.6.3 K

87 Dec 16, 2022

TensorFlow implementation of ENet

TensorFlow-ENet TensorFlow implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. This model was tested on th

255 Oct 17, 2022

TensorFlow implementation of ENet, trained on the Cityscapes dataset.

segmentation TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e

248 Dec 16, 2022

Fully convolutional networks for semantic segmentation

FCN-semantic-segmentation Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], remo

186 Dec 25, 2022

Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

fcn - Fully Convolutional Networks Chainer implementation of Fully Convolutional Networks. Installation pip install fcn Inference Inference is done as

218 Oct 27, 2022

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{you2019torchcv, author = {Ansheng You and Xiangtai Li and Zhen Zhu a

2.2k Jan 6, 2023

SegNet-like Autoencoders in TensorFlow

SegNet SegNet is a TensorFlow implementation of the segmentation network proposed by Kendall et al., with cool features like strided deconvolution, a

66 Nov 5, 2021

Semantic segmentation models, datasets and losses implemented in PyTorch.

Semantic Segmentation in PyTorch Semantic Segmentation in PyTorch Requirements Main Features Models Datasets Losses Learning rate schedulers Data augm

1.3k Jan 7, 2023

Hack-All is a simple CLI tool that helps ethical-hackers to make a reverse connection without knowing the target device in use is it computer or phone

Hack-All is a simple CLI tool that helps ethical-hackers to make a reverse connection without knowing the target device in use is it computer

5 Nov 22, 2022

Anime Face Detector using mmdet and mmpose

Anime Face Detector This is an anime face detector using mmdetection and mmpose. (To avoid copyright issues, I use generated images by the TADNE model

198 Jan 7, 2023

History Aware Multimodal Transformer for Vision-and-Language Navigation

History Aware Multimodal Transformer for Vision-and-Language Navigation This repository is the official implementation of History Aware Multimodal Tra

46 Nov 23, 2022

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners This repository is built upon BEiT, thanks very much! Now, we on

2.3k Jan 4, 2023

Coursework project for DIP class. The goal is to use vision to guide the Dashgo robot through two traffic cones in bright color.

3 Oct 24, 2022

ML for NLP and Computer Vision.

Sparrow is our open-source ML product. It runs on Skipper MLOps infrastructure.

2 Nov 28, 2021

History Aware Multimodal Transformer for Vision-and-Language Navigation

History Aware Multimodal Transformer for Vision-and-Language Navigation This repository is the official implementation of History Aware Multimodal Tra

46 Nov 23, 2022

A program that uses computer vision to detect hand gestures, used for controlling movie players.

HandGestureDetection This program uses a Haar Cascade algorithm to detect the presence of your hand, and then passes it on to a self-created and self-

2 Nov 22, 2022

A free, multiplatform SDK for real-time facial motion capture using blendshapes, and rigid head pose in 3D space from any RGB camera, photo, or video.

mocap4face by Facemoji mocap4face by Facemoji is a free, multiplatform SDK for real-time facial motion capture based on Facial Action Coding System or

591 Dec 27, 2022

This repo. is an implementation of ACFFNet, which is accepted for in Image and Vision Computing.

Attention-Guided-Contextual-Feature-Fusion-Network-for-Salient-Object-Detection This repo. is an implementation of ACFFNet, which is accepted for in I

5 Nov 21, 2022

Investigating automatic navigation towards standard US views integrating MARL with the virtual US environment developed in CT2US simulation

AutomaticUSnavigation Investigating automatic navigation towards standard US views integrating MARL with the virtual US environment developed in CT2US

6 Dec 5, 2022

Image Segmentation using U-Net, U-Net with skip connections and M-Net architectures

Brain-Image-Segmentation Segmentation of brain tissues in MRI image has a number of applications in diagnosis, surgical planning, and treatment of bra

8 Oct 27, 2022

Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling"

Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling" Pipeline of Tip-Adapter Tip-Adapter can provid

187 Dec 28, 2022

Laser device for neutralizing - mosquitoes, weeds and pests

Laser device for neutralizing - mosquitoes, weeds and pests (in progress) Here I will post information for creating a laser device. A warning!! How It

1k Jan 2, 2023

A set of tools to pre-calibrate and calibrate (multi-focus) plenoptic cameras (e.g., a Raytrix R12) based on the libpleno.

COMPOTE: Calibration Of Multi-focus PlenOpTic camEra. COMPOTE is a set of tools to pre-calibrate and calibrate (multifocus) plenoptic cameras (e.g., a

4 May 10, 2022

Certified Patch Robustness via Smoothed Vision Transformers

Certified Patch Robustness via Smoothed Vision Transformers This repository contains the code for replicating the results of our paper: Certified Patc

35 Dec 14, 2022

[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Just Ask: Learning to Answer Questions from Millions of Narrated Videos Webpage • Demo • Paper This repository provides the code for our paper, includ

87 Jan 5, 2023

[ICCV'21] Pri3D: Can 3D Priors Help 2D Representation Learning?

Pri3D: Can 3D Priors Help 2D Representation Learning? [ICCV 2021] Pri3D leverages 3D priors for downstream 2D image understanding tasks: during pre-tr

124 Jan 6, 2023

VLGrammar: Grounded Grammar Induction of Vision and Language

27 Dec 23, 2022

Sketch Your Own GAN: Customizing a GAN model with hand-drawn sketches.

Sketch Your Own GAN Project | Paper | Youtube | Slides Our method takes in one or a few hand-drawn sketches and customizes an off-the-shelf GAN to mat

522 Nov 8, 2021

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, ICCV 2021 Update: 2021/03/11: update our new results. Now our T2T-ViT-14 w

1k Dec 31, 2022

This is a collection of our NAS and Vision Transformer work.

AutoML - Neural Architecture Search This is a collection of our AutoML-NAS work iRPE (NEW): Rethinking and Improving Relative Position Encoding for Vi

832 Jan 8, 2023

ICCV2021 Papers with Code

1.4k Jan 2, 2023

RSC-Net: 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos

RSC-Net: 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos Implementation for "3D Human Pose, Shape and Texture from Low-Resoluti

42 Nov 10, 2022

Neural Scene Flow Prior (NeurIPS 2021 spotlight)

Neural Scene Flow Prior Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey Will appear on Thirty-fifth Conference on Neural Information Processing Syste

85 Jan 3, 2023

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code

2.9k Jan 8, 2023

AWS Blog post code for running feature-extraction on images using AWS Batch and Cloud Development Kit (CDK).

Batch processing with AWS Batch and CDK Welcome This repository demostrates provisioning the necessary infrastructure for running a job on AWS Batch u

7 Oct 18, 2022

Real-Time High-Resolution Background Matting

Real-Time High-Resolution Background Matting Official repository for the paper Real-Time High-Resolution Background Matting. Our model requires captur

6.1k Jan 3, 2023

A useful tool to generate chord progressions according to melody MIDIs

Auto chord generator, pure python package that generate chord progressions according to given melodies

53 Dec 30, 2022

Official implementation of "Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention" (BMVC 2021).

Multi-Glimpse Network Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention arXiv Require

5 Nov 4, 2021

This repository summarized computer vision theories.

3 Feb 4, 2022

Controlling the computer volume with your hands // OpenCV

HandsControll-AI Controlling the computer volume with your hands // OpenCV Step 1 git clone https://github.com/Hayk-21/HandsControll-AI.git pip instal

1 Nov 4, 2021

Qtas（Quite a Storage）is an experimental distributed storage system developed by Q-team in BJFU Advanced Computer Network sources.

Qtas（Quite a Storage）is a experimental distributed storage system developed by Q-team in BJFU Advanced Computer Network sources.

3 Jan 12, 2022

Control your Puffco Peak Pro from your computer!

PuffcoPC Control your Puffco Peak Pro from your computer! Contributions Pull requests are welcome. For major changes, please open an issue first to di

5 Nov 2, 2022

Qtas（Quite a Storage）is an experimental distributed storage system developed by Q-team in BJFU Advanced Computer Network sources.

Qtas（Quite a Storage）is a experimental distributed storage system developed by Q-team in BJFU Advanced Computer Network sources.

3 Jan 12, 2022

Voilà, install macOS on ANY Computer! This is really and magic easiest way!

OSX-PROXMOX - Run macOS on ANY Computer - AMD & Intel Install Proxmox VE v7.02 - Next, Next & Finish (NNF). Open Proxmox Web Console - Datacenter N

654 Jan 9, 2023

The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".

Energy-based Conditional Generative Adversarial Network (ECGAN) This is the code for the NeurIPS 2021 paper "A Unified View of cGANs with and without

22 May 28, 2022

Algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos

Bioinformatics This is a repository of all the algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos Algorithm

16 Jun 30, 2022

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021]

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng

Automation and Intelligence for Civil Engineering (AI4CE) Lab @ NYU

98 Dec 21, 2022

With Google Drive API. My computer and my phone are in love now.

Channel trought Google Drive Google Drive API In this case, "Google Drive App" is the program. To install everything you need(has some extra things),

1 Dec 15, 2021

Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)

Discovering Non-monotonic Autoregressive Orderings with Variational Inference Description This package contains the source code implementation of the

10 Dec 29, 2022

AlienFX is a CLI and GUI utility to control the lighting effects of your Alienware computer.

AlienFX is a Linux utility to control the lighting effects of your Alienware computer. At present there is a CLI version (alienfx) and a gtk GUI versi

218 Dec 26, 2022

Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

371 Dec 30, 2022

This is a repository filled with scripts that were made with Python, and designed to exploit computer systems.

PYTHON-EXPLOITATION This is a repository filled with scripts that were made with Python, and designed to exploit computer systems. Networking tcp_clin

1 Oct 30, 2021

Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond

CRF - Conditional Random Fields A library for dense conditional random fields (CRFs). This is the official accompanying code for the paper Regularized

21 Nov 26, 2022

A simple python-function, to gain all wlan passwords from stored wlan-profiles on a computer.

Wlan Fetcher Windows10 Description A simple python-function, to gain all wlan passwords from stored wlan-profiles on a computer. Usage This Script onl

2 Nov 20, 2021

Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

1 Oct 27, 2021

Concept Modeling: Topic Modeling on Images and Text

Concept is a technique that leverages CLIP and BERTopic-based techniques to perform Concept Modeling on images.

120 Dec 27, 2022

Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021).

AA-RMVSNet Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021) in PyTorch. paper link: arXiv | CVF Change Log Ju

97 Dec 30, 2022

Code repository for the paper: Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild (ICCV 2021)

Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild Akash Sengupta, Ignas Budvytis, Robert

149 Dec 14, 2022

PoseCamera is python based SDK for human pose estimation through RGB webcam.

PoseCamera PoseCamera is python based SDK for human pose estimation through RGB webcam. Install install posecamera package through pip pip install pos

7 Jul 20, 2021

Python Computer-vision Resources

Python computer-vision Libraries

Apply different text recognition services to images of handwritten documents.

Implementation EfficientDet: Scalable and Efficient Object Detection in PyTorch

kyle's vision of how datadog's python client should look

The best way to convert files on your computer, be it .pdf to .png, .pdf to .docx, .png to .ico, or anything you can imagine.

This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

A deep learning object detector framework written in Python for supporting Land Search and Rescue Missions.

This repository includes the official project for the paper: TransMix: Attend to Mix for Vision Transformers.

A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image.

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners

QED-C: The Quantum Economic Development Consortium provides these computer programs and software for use in the fields of quantum science and engineering.

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.

PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners for self-supervised ViT.

AgeGuesser: deep learning based age estimation system. Powered by EfficientNet and Yolov5

Generating Videos with Scene Dynamics

Stacked Generative Adversarial Networks

MoCoGAN: Decomposing Motion and Content for Video Generation

🔥3D-RecGAN in Tensorflow (ICCV Workshops 2017)

Official Chainer implementation of GP-GAN: Towards Realistic High-Resolution Image Blending (ACMMM 2019, oral)

[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs

Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

Image-to-image translation with conditional adversarial nets

A simple interface for editing natural photos with generative neural networks.

Interactive Image Generation via Generative Adversarial Networks

FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)

Powerful and efficient Computer Vision Annotation Tool (CVAT)

🔎 Super-scale your images and run experiments with Residual Dense and Adversarial Networks.

Gesture Volume Control Using OpenCV and MediaPipe

Official implementation for "Image Quality Assessment using Contrastive Learning"

Implementation of light baking system for ray tracing based on Activision's UberBake

Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Pytorch implementation of forward and inverse Haar Wavelets 2D

A clean and extensible PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

PyTorch implementation of MulMON

Summary of related papers on visual attention

Open source hardware and software platform to build a small scale self driving car.

🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code

Gated-Shape CNN for Semantic Segmentation (ICCV 2019)

UPSNet: A Unified Panoptic Segmentation Network

Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

A Kitti Road Segmentation model implemented in tensorflow.

Real-time Joint Semantic Reasoning for Autonomous Driving

Keras implementation of Real-Time Semantic Segmentation on High-Resolution Images

TensorFlow implementation of ENet

TensorFlow implementation of ENet, trained on the Cityscapes dataset.

Fully convolutional networks for semantic segmentation

Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

SegNet-like Autoencoders in TensorFlow

Semantic segmentation models, datasets and losses implemented in PyTorch.

Hack-All is a simple CLI tool that helps ethical-hackers to make a reverse connection without knowing the target device in use is it computer or phone

Anime Face Detector using mmdet and mmpose

History Aware Multimodal Transformer for Vision-and-Language Navigation

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

Coursework project for DIP class. The goal is to use vision to guide the Dashgo robot through two traffic cones in bright color.

ML for NLP and Computer Vision.

History Aware Multimodal Transformer for Vision-and-Language Navigation

A program that uses computer vision to detect hand gestures, used for controlling movie players.

A free, multiplatform SDK for real-time facial motion capture using blendshapes, and rigid head pose in 3D space from any RGB camera, photo, or video.

This repo. is an implementation of ACFFNet, which is accepted for in Image and Vision Computing.

Investigating automatic navigation towards standard US views integrating MARL with the virtual US environment developed in CT2US simulation

Image Segmentation using U-Net, U-Net with skip connections and M-Net architectures

Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling"

Laser device for neutralizing - mosquitoes, weeds and pests

A set of tools to pre-calibrate and calibrate (multi-focus) plenoptic cameras (e.g., a Raytrix R12) based on the libpleno.

Certified Patch Robustness via Smoothed Vision Transformers

[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

[ICCV'21] Pri3D: Can 3D Priors Help 2D Representation Learning?

VLGrammar: Grounded Grammar Induction of Vision and Language

Sketch Your Own GAN: Customizing a GAN model with hand-drawn sketches.

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

This is a collection of our NAS and Vision Transformer work.

ICCV2021 Papers with Code

RSC-Net: 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos

Neural Scene Flow Prior (NeurIPS 2021 spotlight)

A curated list of the latest breakthroughs in AI by release date with a clear video explanation, link to a more in-depth article, and code.

AWS Blog post code for running feature-extraction on images using AWS Batch and Cloud Development Kit (CDK).

Real-Time High-Resolution Background Matting

A useful tool to generate chord progressions according to melody MIDIs

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park, Rares Ambrus, Vitor Guizilini, Jie Li, and Adrien Gaidon.