697 Repositories
Python dataset-creation Libraries
Sequence clustering and database creation using mmseqs, from local fasta files
Sequence clustering and database creation using mmseqs, from local fasta files
Inverted index creation and query search mechanism on Wikipedia pages.
WikiPedia Search Engine Step 1 : Installing Requirements Install "stemming" module for python using pip. Step 2 : Parsing the Data To parse the data,
Python code for loading the Aschaffenburg Pose Dataset.
Aschaffenburg Pose Dataset (APD) This repository contains Python code for loading and filtering the Aschaffenburg Pose Dataset. The dataset itself and
Unofficial PyTorch implementation of "RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving" (ECCV 2020)
RTM3D-PyTorch The PyTorch Implementation of the paper: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (ECCV 2020
Benchmarks for the Optimal Power Flow Problem
Power Grid Lib - Optimal Power Flow This benchmark library is curated and maintained by the IEEE PES Task Force on Benchmarks for Validation of Emergi
The devkit of the nuScenes dataset.
nuScenes devkit Welcome to the devkit of the nuScenes and nuImages datasets. Overview Changelog Devkit setup nuImages nuImages setup Getting started w
A large-scale face dataset for face parsing, recognition, generation and editing.
CelebAMask-HQ [Paper] [Demo] CelebAMask-HQ is a large-scale face image dataset that has 30,000 high-resolution face images selected from the CelebA da
Python script for diving image data to train test and val
dataset-division-to-train-val-test-python python script for dividing image data to train test and val If you have an image dataset in the following st
An open source algorithm and dataset for finding poop in pictures.
The shitspotter module is where I will be work on the "shitspotter" poop-detection algorithm and dataset. The primary goal of this work is to allow for the creation of a phone app that finds where your dog pooped, because you ran to grab the doggy-bags you forgot, and now you can't find the damn thing.
A colab notebook for training Stylegan2-ada on colab, transfer learning onto your own dataset.
Stylegan2-Ada-Google-Colab-Starter-Notebook A no thrills colab notebook for training Stylegan2-ada on colab. transfer learning onto your own dataset h
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come
An Agnostic Object Detection Framework IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-q
Command-line tool for downloading and extending the RedCaps dataset.
RedCaps Downloader This repository provides the official command-line tool for downloading and extending the RedCaps dataset. Users can seamlessly dow
Official Implementation of "Tracking Grow-Finish Pigs Across Large Pens Using Multiple Cameras"
Multi Camera Pig Tracking Official Implementation of Tracking Grow-Finish Pigs Across Large Pens Using Multiple Cameras CVPR2021 CV4Animals Workshop P
Curating a dataset for bioimage transfer learning
CytoImageNet A large-scale pretraining dataset for bioimage transfer learning. Motivation In past few decades, the increase in speed of data collectio
The source code and dataset for the RecGURU paper (WSDM 2022)
RecGURU About The Project Source code and baselines for the RecGURU paper "RecGURU: Adversarial Learning of Generalized User Representations for Cross
A tutorial on training a DarkNet YOLOv4 model for the CrowdHuman dataset
YOLOv4 CrowdHuman Tutorial This is a tutorial demonstrating how to train a YOLOv4 people detector using Darknet and the CrowdHuman dataset. Table of c
Lacmus is a cross-platform application that helps to find people who are lost in the forest using computer vision and neural networks.
lacmus The program for searching through photos from the air of lost people in the forest using Retina Net neural nwtwork. The project is being develo
Blender addon that enables exporting of xmodels from blender. Great for custom asset creation for cod games
Birdman's XModel Tools For Blender Greetings everyone in the custom cod community. This blender addon should finally enable exporting of custom assets
Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.
TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too
Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).
Face Recognition: Too Bias, or Not Too Bias? Robinson, Joseph P., Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. "Face recognition:
Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.
Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da
This repository will help you get label for images in Stanford Cars Dataset.
STANFORD CARS DATASET stanford-cars "The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and
Json2Xml tool will help you convert from json COCO format to VOC xml format in Object Detection Problem.
JSON 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Json2Xml t
Accelerating model creation and evaluation.
EmeraldML A machine learning library for streamlining the process of (1) cleaning and splitting data, (2) training, optimizing, and testing various mo
iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms.
iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms. You can find its main page and description via this link. If you are familiar with NILM-TK API, you probably know that you can work with iAWE hdf5 data file in NILM-TK.
Neurons Dataset API - The official dataloader and visualization tools for Neurons Datasets.
Neurons Dataset API - The official dataloader and visualization tools for Neurons Datasets. Introduction We propose our dataloader API for loading and
VGGFace2-HQ - A high resolution face dataset for face editing purpose
The first open source high resolution dataset for face swapping!!! A high resolution version of VGGFace2 for academic face editing purpose
Multi-modal Content Creation Model Training Infrastructure including the FACT model (AI Choreographer) implementation.
AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ [ICCV-2021]. Overview This package contains the model implementation and training
SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts
[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh
Reproduction process of BERT on SST2 dataset
BERT-SST2-Prod Reproduction process of BERT on SST2 dataset 安装说明 下载代码库 git clone https://github.com/JunnYu/BERT-SST2-Prod 进入文件夹,安装requirements pip ins
PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.
PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data
GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data By Shuchang Zhou, Taihong Xiao, Yi Yang, Dieqiao Feng, Qinyao He, W
A DCGAN to generate anime faces using custom mined dataset
Anime-Face-GAN-Keras A DCGAN to generate anime faces using custom dataset in Keras. Dataset The dataset is created by crawling anime database websites
Extract rooms type, door, neibour rooms, rooms corners nad bounding boxes, and generate graph from rplan dataset
Housegan-data-reader House-GAN++ (data-reader) Code and instructions for converting rplan dataset (raster images) to housegan++ data format. House-GAN
Powerful and efficient Computer Vision Annotation Tool (CVAT)
Computer Vision Annotation Tool (CVAT) CVAT is free, online, interactive video and image annotation tool for computer vision. It is being used by our
For making Tagtog annotation into csv dataset
tagtog_relation_extraction for making Tagtog annotation into csv dataset How to Use On Tagtog 1. Go to Project Downloads 2. Download all documents,
Simple Baselines for Human Pose Estimation and Tracking
Simple Baselines for Human Pose Estimation and Tracking News Our new work High-Resolution Representations for Labeling Pixels and Regions is available
implementation of the KNN algorithm on crab biometrics dataset for CS16
crab-knn implementation of the KNN algorithm in Python applied to biometrics data of purple rock crabs (leptograpsus variegatus) to classify the sex o
CCQA A New Web-Scale Question Answering Dataset for Model Pre-Training
CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training This is the official repository for the code and models of the paper CCQA: A N
Pyfunctools is a module that provides functions, methods and classes that help in the creation of projects in python
Pyfunctools Pyfunctools is a module that provides functions, methods and classes that help in the creation of projects in python, bringing functional
This repository contains the needed resources to build the HIRID-ICU-Benchmark dataset
HiRID-ICU-Benchmark This repository contains the needed resources to build the HIRID-ICU-Benchmark dataset for which the manuscript can be found here.
GiantMIDI-Piano is a classical piano MIDI dataset contains 10,854 MIDI files of 2,786 composers
GiantMIDI-Piano is a classical piano MIDI dataset contains 10,854 MIDI files of 2,786 composers
Using modified BiSeNet for face parsing in PyTorch
face-parsing.PyTorch Contents Training Demo References Training Prepare training data: -- download CelebAMask-HQ dataset -- change file path in the pr
TensorFlow implementation of ENet, trained on the Cityscapes dataset.
segmentation TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e
Semantic segmentation task for ADE20k & cityscapse dataset, based on several models.
semantic-segmentation-tensorflow This is a Tensorflow implementation of semantic segmentation models on MIT ADE20K scene parsing dataset and Cityscape
Using fully convolutional networks for semantic segmentation with caffe for the cityscapes dataset
Using fully convolutional networks for semantic segmentation (Shelhamer et al.) with caffe for the cityscapes dataset How to get started Download the
PyTorch Implementations for DeeplabV3 and PSPNet
Pytorch-segmentation-toolbox DOC Pytorch code for semantic segmentation. This is a minimal code to run PSPnet and Deeplabv3 on Cityscape dataset. Shor
This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset.
DeepLab-ResNet-TensorFlow This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset. Up
UNet model with VGG11 encoder pre-trained on Kaggle Carvana dataset
TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation By Vladimir Iglovikov and Alexey Shvets Introduction TernausNet is
Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells.
Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells. The project retrains itself after every prediction, making it more robust and generalized over time.
Recommendation System to recommend top books from the dataset
recommendersystem Recommendation System to recommend top books from the dataset Introduction The recom.py is the main program code. The dataset is als
Python dataset creator to construct datasets composed of OpenFace extracted features and Shimmer3 GSR+ Sensor datas
Python dataset creator to construct datasets composed of OpenFace extracted features and Shimmer3 GSR+ Sensor datas
FS-Mol: A Few-Shot Learning Dataset of Molecules
FS-Mol is A Few-Shot Learning Dataset of Molecules, containing molecular compounds with measurements of activity against a variety of protein targets. The dataset is presented with a model evaluation benchmark which aims to drive few-shot learning research in the domain of molecules and graph-structured data.
PyScaffold is a project generator for bootstrapping high quality Python packages
PyScaffold is a project generator for bootstrapping high quality Python packages, ready to be shared on PyPI and installable via pip. It is easy to use and encourages the adoption of the best tools and practices of the Python ecosystem, helping you and your team to stay sane, happy and productive. The best part? It is stable and has been used by thousands of developers for over half a decade!
Code and models for "Pano3D: A Holistic Benchmark and a Solid Baseline for 360 Depth Estimation", OmniCV Workshop @ CVPR21.
Pano3D A Holistic Benchmark and a Solid Baseline for 360o Depth Estimation Pano3D is a new benchmark for depth estimation from spherical panoramas. We
This is an example of how to automate Ridit Analysis for a dataset with large amount of questions and many item attributes
This is an example of how to automate Ridit Analysis for a dataset with large amount of questions and many item attributes
Multi-Modal Fingerprint Presentation Attack Detection: Evaluation On A New Dataset
PADISI USC Dataset This repository analyzes the PADISI-Finger dataset introduced in Multi-Modal Fingerprint Presentation Attack Detection: Evaluation
Solo CLF project about the creation of the FlickColor game in Python with very precise instructions.
Solo CLF project about the creation of the FlickColor game in Python with very precise instructions.
Start-to-finish tutorial for interactive music co-creation in PyTorch and Tensorflow.js
Start-to-finish tutorial for interactive music co-creation in PyTorch and Tensorflow.js
Dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.
CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.
A map update dataset and benchmark
MUNO21 MUNO21 is a dataset and benchmark for machine learning methods that automatically update and maintain digital street map datasets. Previous dat
Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach
Introduction Datasets and source code for our paper Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach Datasets: WebFG-496
The official PyTorch code implementation of "Human Trajectory Prediction via Counterfactual Analysis" in ICCV 2021.
Human Trajectory Prediction via Counterfactual Analysis (CausalHTP) The official PyTorch code implementation of "Human Trajectory Prediction via Count
Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"
Dataset and Code for RealVSR Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme Xi Yang, Wangmeng Xiang,
Exploring Simple 3D Multi-Object Tracking for Autonomous Driving (ICCV 2021)
Exploring Simple 3D Multi-Object Tracking for Autonomous Driving Chenxu Luo, Xiaodong Yang, Alan Yuille Exploring Simple 3D Multi-Object Tracking for
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)
DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings
Codes for the ICCV'21 paper "FREE: Feature Refinement for Generalized Zero-Shot Learning"
FREE This repository contains the reference code for the paper "FREE: Feature Refinement for Generalized Zero-Shot Learning". [arXiv][Paper] 1. Prepar
Introduction to Databases Coursework 2 (SQL) - dataset generator
Introduction to Databases Coursework 2 (SQL) - dataset generator This is python script generates a text file with insert queries for the schema.sql fi
Script for the creation of metadatas and the randomization of images of MekaVerse
MekaVerse-random Script for the creation of metadata and the randomization of images of MekaVerse Step to replay the random : Create a folder : output
The InvGears workbench for FreeCAD allows the creation of gear systems
FreeCAD InvGears workbench Current version 0.1.1 Overview The InvGears workbench allows the creation of gear systems. The gear generation algorithm is
General tricks that may help you find bad, or noisy, labels in your dataset
doubtlab A lab for bad labels. Warning still in progress. This repository contains general tricks that may help you find bad, or noisy, labels in your
MAUS: A Dataset for Mental Workload Assessment Using Wearable Sensor - Baseline system
MAUS: A Dataset for Mental Workload Assessment Using Wearable Sensor - Baseline system Getting started To start working on this assignment, you should
This is a python script to navigate and extract the FSD50K dataset
FSD50K navigator This is a script I use to navigate the sound dataset from FSK50K.
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region
WORD: Revisiting Organs Segmentation in the Whole Abdominal Region. This repository provides the codebase and dataset for our work WORD: Revisiting Or
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
The Hypersim Dataset For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real i
Simulated garment dataset for virtual try-on
Simulated garment dataset for virtual try-on This repository contains the dataset used in the following papers: Self-Supervised Collision Handling via
A pytorch-based real-time segmentation model for autonomous driving
CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implicit Bayesian Inference"
GINC small-scale in-context learning dataset GINC (Generative In-Context learning Dataset) is a small-scale synthetic dataset for studying in-context
Python script to scrape users/id/badges/creation-date from a Discord Server Memberlist
Discord Server Badges Scraper - Credits to bytixo he made this Custom Discord Vanity Creator How To Use Install discum 1.2 Open a terminal and execute
UV matrix decompostion using movielens dataset
UV-matrix-decompostion-with-kfold UV matrix decompostion using movielens dataset upload the 'ratings.dat' file install the following python libraries
This project aims to be a handler for input creation and running of multiple RICEWQ simulations.
What is autoRICEWQ? This project aims to be a handler for input creation and running of multiple RICEWQ simulations. What is RICEWQ? From the descript
A lightweight Python module to interact with the Mitre Att&ck Enterprise dataset.
enterpriseattack - Mitre's Enterprise Att&ck A lightweight Python module to interact with the Mitre Att&ck Enterprise dataset. Built to be used in pro
Pipeline and Dataset helpers for complex algorithm evaluation.
tpcp - Tiny Pipelines for Complex Problems A generic way to build object-oriented datasets and algorithm pipelines and tools to evaluate them pip inst
SW components and demos for visual kinship recognition. An emphasis is put on the FIW dataset-- data loaders, benchmarks, results in summary.
FIW Data Development Kit Table of Contents Introduction Families In the Wild Database Publications Organization To Do License Getting Involved Introdu
This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-grained Classification".
HA-in-Fine-Grained-Classification This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-g
A python package which can be pip installed to perform statistics and visualize binomial and gaussian distributions of the dataset
GBiStat package A python package to assist programmers with data analysis. This package could be used to plot : Binomial Distribution of the dataset p
PyTorch implementation of a Real-ESRGAN model trained on custom dataset
Real-ESRGAN PyTorch implementation of a Real-ESRGAN model trained on custom dataset. This model shows better results on faces compared to the original
Use Google's BERT for named entity recognition (CoNLL-2003 as the dataset).
For better performance, you can try NLPGNN, see NLPGNN for more details. BERT-NER Version 2 Use Google's BERT for named entity recognition (CoNLL-2003
This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Skin Lesion detection using YOLO This project deal
Dataset para entrenamiento de yoloV3 para 4 clases
Deteccion de objetos en video Este repo basado en el proyecto PyTorch YOLOv3 para correr detección de objetos sobre video. Construí sobre este proyect
[IROS2021] NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences
NYU-VPR This repository provides the experiment code for the paper Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymiza
This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset.
FACT This repo provides a demo for the CVPR 2021 paper "A Fourier-based Framework for Domain Generalization" on the PACS dataset. To cite, please use:
Command-line tool for downloading and extending the RedCaps dataset.
Command-line tool for downloading and extending the RedCaps dataset.
IndoNLI: A Natural Language Inference Dataset for Indonesian
IndoNLI: A Natural Language Inference Dataset for Indonesian This is a repository for data and code accompanying our EMNLP 2021 paper "IndoNLI: A Natu
CM building dataset Timisoara
CM_building_dataset_Timisoara Date created: Febr-2020 The Timi\c{s}oara Building Dataset - TMBuD - is composed of 160 images with the resolution of 76
[v1 (ISBI'21) + v2] MedMNIST: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification
MedMNIST Project (Website) | Dataset (Zenodo) | Paper (arXiv) | MedMNIST v1 (ISBI'21) Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bili
Data, model training, and evaluation code for "PubTables-1M: Towards a universal dataset and metrics for training and evaluating table extraction models".
PubTables-1M This repository contains training and evaluation code for the paper "PubTables-1M: Towards a universal dataset and metrics for training a
A library for augmentation of a YOLO-formated dataset
YOLO Dataset Augmentation lib Инструкция по использованию этой библиотеки Запуск всех файлов осуществлять из консоли. GoogleCrawl_to_Dataset.py Это ск
A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.
IconQA About IconQA is a new diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and c