902 Repositories
Python computer-camera Libraries
Code repository for the paper: Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild (ICCV 2021)
Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild Akash Sengupta, Ignas Budvytis, Robert
PoseCamera is a Python-based SDK for human pose estimation through an RGB webcam.
PoseCamera PoseCamera is a Python-based SDK for human pose estimation through an RGB webcam. Install Install the posecamera package through pip: pip install pos
Repository for playing with computer vision apps: People analytics on Raspberry Pi.
play-with-torch Repository for playing with computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware Raspberry Pi 4 Model B here, RA
A dataset handling library for computer vision datasets in LOST format
Deep Learning for Computer Vision final project
These data visualizations were created for my introductory computer science course using Python
Homework 2: Matplotlib and Data Visualization Overview These data visualizations were created for my introductory computer science course using Python
Implementation of an ordered dithering algorithm used in computer graphics
Ordered Dithering Project In this project, we use an ordered dithering method to turn an RGB image, first to a gray scale image and then, turn the gra
EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale
EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale Paper: EgoNN: Egocentric Neural Network for Point Cloud
A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!
CoVA: Context-aware Visual Attention for Webpage Information Extraction Abstract Webpage information extraction (WIE) is an important step to create k
Hierarchical probabilistic 3D U-Net, with attention mechanisms (Attention U-Net, SEResNet) and a nested decoder structure with deep supervision (UNet++).
Hierarchical probabilistic 3D U-Net, with attention mechanisms (Attention U-Net, SEResNet) and a nested decoder structure with deep supervision (UNet++). Built in TensorFlow 2.5. Configured for voxel-level clinically significant prostate cancer detection in multi-channel 3D bpMRI scans.
IEEE Winter Conference on Applications of Computer Vision 2022 Accepted
SSKT(Accepted WACV2022) Concept map Dataset Image dataset CIFAR10 (torchvision) CIFAR100 (torchvision) STL10 (torchvision) Pascal VOC (torchvision) Im
A little Python application to auto tag your photos with the power of machine learning.
Tag Machine A little Python application to auto tag your photos with the power of machine learning. Report a bug or request a feature Table of Content
Convolutional neural network web app trained to track our infant's sleep schedule using our Google Nest camera.
Machine Learning Sleep Schedule Tracker What is it? Convolutional neural network web app trained to track our infant's sleep schedule using our Google
Official implementation of "Learning Proposals for Practical Energy-Based Regression", 2021.
ebms_proposals Official implementation (PyTorch) of the paper: Learning Proposals for Practical Energy-Based Regression, 2021 [arXiv] [project]. Fredr
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.
Convolutional Recurrent Neural Network This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC l
Convolutional Neural Network for 3D meshes in PyTorch
MeshCNN in PyTorch SIGGRAPH 2019 [Paper] [Project Page] MeshCNN is a general-purpose deep neural network for 3D triangular meshes, which can be used f
This will help to read QR codes using Raspberry Pi and Pi Camera
Raspberry-Pi-Generate-and-Read-QR-code This will help to read QR codes using Raspberry Pi and Pi Camera Install the required libraries first in your T
PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)
Unsupervised Depth Completion with Calibrated Backprojection Layers PyTorch implementation of Unsupervised Depth Completion with Calibrated Backprojec
FlingBot: The Unreasonable Effectiveness of Dynamic Manipulations for Cloth Unfolding
This repository contains code for training and evaluating FlingBot in both simulation and real-world settings on a dual-UR5 robot arm setup for Ubuntu 18.04
CVNets: A library for training computer vision networks
CVNets: A library for training computer vision networks This repository contains the source code for training computer vision models. Specifically, it
Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].
OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich, Tim Prangemeier, Özdemir Cetin & Heinz Koeppl | Pr
Source code for 2021 ICCV paper "In-the-Wild Single Camera 3D Reconstruction Through Moving Water Surfaces"
In-the-Wild Single Camera 3D Reconstruction Through Moving Water Surfaces This is the PyTorch implementation for 2021 ICCV paper "In-the-Wild Single C
Trajectory Extraction of road users via Traffic Camera
Traffic Monitoring Citation The associated paper for this project will be published here as soon as possible. When using this software, please cite th
A Python implementation of the Basic Photometric Stereo Algorithm
Photometric-Stereo A Python implementation of the Basic Photometric Stereo Algorithm Result Usage run Photometric_Stereo.py Code Tree |data # raw data, tga format
meProp: Sparsified Back Propagation for Accelerated Deep Learning (ICML 2017)
meProp The codes were used for the paper meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting (ICML 2017) [pdf]
Vision Deep-Learning using Tensorflow, Keras.
Welcome! I am a computer vision deep learning developer working in Korea. This is my blog, and you can see everything I've studied here. https://www.n
A MNIST-like fashion product database. Benchmark
Fashion-MNIST Table of Contents Why we made Fashion-MNIST Get the Data Usage Benchmark Visualization Contributing Contact Citing Fashion-MNIST License
Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions"
Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions" Environment requirement This code is based on Python
🔥🔥 High-Performance Face Recognition Library on PaddlePaddle & PyTorch 🔥🔥
face.evoLVe: High-Performance Face Recognition Library based on PaddlePaddle & PyTorch Evolve to be more comprehensive, effective and efficient for fa
Project code for weakly supervised 3D object detectors using wide-baseline multi-view traffic camera data: WIBAM.
WIBAM (Work in progress) Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data 3D object dete
📷 This repository collects implementations of various OpenCV features in Python.
📷 This repository collects implementations of various OpenCV features in Python. The aim is to have a minimal implementation of all OpenCV features together, under one roof.
Hardware-accelerated ROS2 packages for camera image processing.
Isaac ROS Image Pipeline Overview This metapackage offers similar functionality as the standard, CPU-based image_pipeline metapackage, but does so by
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks
Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka
A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku.
Automatic_Background_Remover A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku. https:
realsense d400 - jpg + csv
Realsense-capture realsense d400 - jpg + csv Requirements RealSense sdk : Installation Python3 pyrealsense2 (RealSense SDK) Numpy OpenCV Tkinter Run
1st ranked 'driver careless behavior detection' for AI Online Competition 2021, hosted by MSIT Korea.
2021AICompetition-03 This repo is for the "[Image] Driver careless behavior detection model for preventing driving accidents" task of the 2021 AI Online Competition, in which we participated as team mAy-I Inc. In the competition hosted by the Ministry of Science and ICT, mAy-I
PoseViz - Multi-person, multi-camera 3D human pose visualization tool built using Mayavi.
PoseViz - 3D Human Pose Visualizer Multi-person, multi-camera 3D human pose visualization tool built using Mayavi. As used in MeTRAbs visualizations.
Multi-choice answer sheet correction system using computer vision with opencv & python.
Multi choice answer correction 🔴 5 answer sheet samples with a specific solution for detecting answers and sheet correction. 🔴 By running the soluti
A Collection of Papers and Codes for ICCV2021 Low Level Vision and Image Generation
Kimimaro: Skeletonize Densely Labeled Images
Kimimaro: Skeletonize Densely Labeled Images # Produce SWC files from volumetric images. kimimaro forge labels.npy --progress # writes to ./kimimaro_o
Computer art based on joining transparent images
Computer Art There is no must in art because art is free. Introduction The following tutorial explains how to generate computer art based on a series
A lightweight, easy-to-use virtual keyboard project for anyone to use
virtual_Keyboard A lightweight, easy-to-use virtual keyboard project for anyone to use. Motivation I made this for this year's recruitment for RobEn AA
SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from UAV Images.
SSPNet: Scale Selection Pyramid Network for Tiny Person Detection from UAV Images (IEEE GRSL 2021) Code (based on mmdetection) for SSPNet: Scale Selec
TicTacToe using Socket Server
TicTacToe using Socket Server This is a project for the class : 18CSC302J - Computer Networks by Dr. S.Babu Contributors Suvodeep Sinha RA191100301010
My implementation of transformer-related papers for computer vision in PyTorch
vision_transformers This is my personal repo to implement new transformer-based and other computer vision DL models I am currently working without a
Novel deep learning research works with PaddlePaddle
Research Releases cutting-edge research work based on PaddlePaddle, including top-conference papers and competition-winning models in CV, NLP, KG, STDM, and other fields. Contents Computer Vision Natural Language Processing Knowledge Graph Spatial-Temporal Data Mining
Open source style Deep Dream project
DeepDream ⚠️ If you don't have a GPU with CUDA, the style transfer execution time will be much longer. Prerequisites Python =3.8.10 How to Install sud
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come
IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-quality pre-trained models from torchvision, MMLabs, and soon Pytorch Image Models. It orchestrates the end-to-end deep learning workflow allowing to train networks with easy-to-use robust high-performance libraries such as Pytorch-Lightning and Fastai
Multimodal Descriptions of Social Concepts: Automatic Modeling and Detection of (Highly Abstract) Social Concepts evoked by Art Images
MUSCO - Multimodal Descriptions of Social Concepts Automatic Modeling of (Highly Abstract) Social Concepts evoked by Art Images This project aims to i
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19) Tianyu Wang*, Xin Yang*, Ke Xu, Shaozhe Chen, Qiang Zhang, Ry
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation
RIFE - Real Time Video Interpolation arXiv | YouTube | Colab | Tutorial | Demo Table of Contents Introduction Collection Usage Evaluation Training and
Scalable computer implemented in the game of life.
scalable-gol-computer This is a computer built in Conway's game of life. It supports variable sizes of 8, 16 and 32 bit. Maximum program size: 256 lin
Code for ICCV2021 paper PARE: Part Attention Regressor for 3D Human Body Estimation
PARE: Part Attention Regressor for 3D Human Body Estimation [ICCV 2021] PARE: Part Attention Regressor for 3D Human Body Estimation, Muhammed Kocabas,
Code for ICCV2021 paper SPEC: Seeing People in the Wild with an Estimated Camera
SPEC: Seeing People in the Wild with an Estimated Camera [ICCV 2021] SPEC: Seeing People in the Wild with an Estimated Camera, Muhammed Kocabas, Chun-
[ICCV21] Code for RetrievalFuse: Neural 3D Scene Reconstruction with a Database
RetrievalFuse Paper | Project Page | Video RetrievalFuse: Neural 3D Scene Reconstruction with a Database Yawar Siddiqui, Justus Thies, Fangchang Ma, Q
StyleTransfer - Open source style transfer project, based on VGG19
TorchOk - The toolkit for fast Deep Learning experiments in Computer Vision
Original Pytorch Implementation of FLAME: Facial Landmark Heatmap Activated Multimodal Gaze Estimation
FLAME Original Pytorch Implementation of FLAME: Facial Landmark Heatmap Activated Multimodal Gaze Estimation, accepted at the 17th IEEE Internation Co
[ICCV '21] In this repository you find the code to our paper Keypoint Communities
Keypoint Communities In this repository you will find the code to our ICCV '21 paper: Keypoint Communities Duncan Zauss, Sven Kreiss, Alexandre Alahi,
Vision-and-Language Navigation in Continuous Environments using Habitat
Vision-and-Language Navigation in Continuous Environments (VLN-CE) Project Website | VLN-CE Challenge | RxR-Habitat Challenge Official implementations
Official code of paper: MovingFashion: a Benchmark for the Video-to-Shop Challenge
SEAM Match-RCNN Official code of MovingFashion: a Benchmark for the Video-to-Shop Challenge paper Installation Requirements: Pytorch 1.5.1 or more rec
A 2D physics sim for orbits. Made using pygame and tkinter. High degree of interactivity, allowing you to create celestial bodies of a custom mass and velocity within the simulation, select what specifically is displayed, and move the camera.
Python-Orbit-Sim A 2D physics sim for orbits. Made using pygame and tkinter. High degree of interactivity, allowing you to create celestial bodies of
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition [ArXiv] [Project Page] This repository is the official implementation of AdaMML:
A Robust Avatar Generator with a huge number of templates
CoolAvatars Welcome to this repository of CoolAvatars. Using this project, you can generate cool avatars not only from the samples present in my image
DeepCAD: A Deep Generative Network for Computer-Aided Design Models
DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,
Dynamic Realtime Animation Control
Our project aims to build an application that dynamically detects the user's expressions and gestures and projects them onto animation software, which then renders a 2D/3D animation in real time that is broadcast live.
BabelCalib: A Universal Approach to Calibrating Central Cameras. In ICCV (2021)
BabelCalib: A Universal Approach to Calibrating Central Cameras This repository contains the MATLAB implementation of the BabelCalib calibration frame
The first public PyTorch implementation of Attentive Recurrent Comparators
arc-pytorch PyTorch implementation of Attentive Recurrent Comparators by Shyam et al. A blog explaining Attentive Recurrent Comparators Visualizing At
Bald-to-Hairy Translation Using CycleGAN
GANiry: Bald-to-Hairy Translation Using CycleGAN Official PyTorch implementation of GANiry. GANiry: Bald-to-Hairy Translation Using CycleGAN, Fidan Sa
A simple malware that tries to explain the logic of computer viruses with Python.
Simple-Virus-With-Python A simple malware that tries to explain the logic of computer viruses with Python. What Is The Virus ? Computer viruses are ma
PyTorch Implementation of Small Lesion Segmentation in Brain MRIs with Subpixel Embedding (ORAL, MICCAIW 2021)
Small Lesion Segmentation in Brain MRIs with Subpixel Embedding PyTorch implementation of Small Lesion Segmentation in Brain MRIs with Subpixel Embedd
Real-Time Social Distance Monitoring tool using Computer Vision
Social Distance Detector A Real-Time Social Distance Monitoring Tool Table of Contents Motivation YOLO Theory Detection Output Tech Stack Functionalit
Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)
Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021) Jiaxi Jiang, Kai Zhang, Radu Timofte Computer Vision Lab, ETH Zurich, Switzerland 🔥
Official code release for ICCV 2021 paper SNARF: Differentiable Forward Skinning for Animating Non-rigid Neural Implicit Shapes.
CLIPort: What and Where Pathways for Robotic Manipulation
CLIPort CLIPort: What and Where Pathways for Robotic Manipulation Mohit Shridhar, Lucas Manuelli, Dieter Fox CoRL 2021 CLIPort is an end-to-end imitat
Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]
Adaptive Task-Relational Context (ATRC) This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense
Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images (ICCV 2021)
Table of Content Introduction Getting Started Datasets Installation Experiments Training & Testing Pretrained models Texture fine-tuning Demo Toward R
Deep Face Recognition in PyTorch
Face Recognition in PyTorch By Alexey Gruzdev and Vladislav Sovrasov Introduction A repository for different experimental Face Recognition models such
Pytorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)
Large-Scale Long-Tailed Recognition in an Open World [Project] [Paper] [Blog] Overview Open Long-Tailed Recognition (OLTR) is the author's re-implemen
A Blender 2.9x add-on for managing camera settings
TMG-Camera-Tools A Blender 2.9x add-on for managing camera settings Tutorial showcasing current features
Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019
PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image" Introduction This repo is official Py
A repository that shares tuning results of trained models generated by TensorFlow / Keras. Post-training quantization (Weight Quantization, Integer Quantization, Full Integer Quantization, Float16 Quantization), Quantization-aware training. TensorFlow Lite. OpenVINO. CoreML. TensorFlow.js. TF-TRT. MediaPipe. ONNX. [.tflite,.h5,.pb,saved_model,tfjs,tftrt,mlmodel,.xml/.bin, .onnx]
PINTO_model_zoo Please read the contents of the LICENSE file located directly under each folder before using the model. My model conversion scripts ar
PyTorch Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)
pytorch-fcn PyTorch implementation of Fully Convolutional Networks. Requirements pytorch = 0.2.0 torchvision = 0.1.8 fcn = 6.1.5 Pillow scipy tqdm
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 200 universities.
D2L.ai: Interactive Deep Learning Book with Multi-Framework Code, Math, and Discussions Book website | STAT 157 Course at UC Berkeley | Latest version
Quickly and easily create / train a custom DeepDream model
Dream-Creator This project aims to simplify the process of creating a custom DeepDream model by using pretrained GoogleNet models and custom image dat
PyTorch implementation of DeepDream algorithm
neural-dream This is a PyTorch implementation of DeepDream. The code is based on neural-style-pt. Here we DeepDream a photograph of the Golden Gate Br
Use ZWO astronomy camera as an IP camera.
ZWO Astronomy Camera as IP Camera Astronomy cameras are known for their high sensitivity and flexibility on whether to have IR pass through and bayer
A GUI for Face Recognition, based upon Docker, Tkinter, GPU and a camera device.
Face Recognition GUI This repository is a GUI version of Face Recognition by Adam Geitgey, where e.g. Docker and Tkinter are utilized. All the materia
In this project we will be using the live feed coming from the webcam to create a virtual mouse with complete functionalities.
Virtual Mouse Using OpenCV In this project we will be using the live feed coming from the webcam to create a virtual mouse using hand tracking. Projec
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021
KeyKatcher is a keylogger that records keystrokes made on a computer and sends them by email.
What is a keylogger? A keylogger is a software application or piece of hardware that monitors and records keystrokes made on a computer keyboard. The
Hand Gesture Volume Control | OpenCV | Computer Vision
Gesture Volume Control Hand Gesture Volume Control | OpenCV | Computer Vision Use gesture control to change the volume of a computer. First we look i
Scripts for training an AI to play the endless runner Subway Surfers using a supervised machine learning approach by imitation and a convolutional neural network (CNN) for image classification
About subwAI subwAI - a project for training an AI to play the endless runner Subway Surfers using a supervised machine learning approach by imitation
GluonMM is a library of transformer models for computer vision and multi-modality research
GluonMM is a library of transformer models for computer vision and multi-modality research. It contains reference implementations of widely adopted baseline models and also research work from Amazon Research.
The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining