566 Repositories
Python awesome-public-datasets Libraries
This is the source code for generating the ASL-Skeleton3D and ASL-Phono datasets. Check out the README.md for more details.
ASL-Skeleton3D and ASL-Phono Datasets Generator The ASL-Skeleton3D contains a representation based on mapping into the three-dimensional space the coo
A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets
HOW TO USE THIS PROJECT A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets Based on DeepLabCut toolbox, we run wit
[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021
Pedestron Pedestron is a MMdetection based repository, that focuses on the advancement of research on pedestrian detection. We provide a list of detec
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)
Awesome & interesting talks about programming
Programming Talks I watch a lot of talks that I love to share with my friends, fellows and coworkers. As I consider all GitHubbers my friends (oh yeah
Awesome Cheatsheet
Awesome Cheatsheet List of useful cheatsheets Inspired by @sindresorhus awesome and improved by these amazing contributors. If you see a link here is
Jupyter notebook and datasets from the pandas Q&A video series
Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note
Robot Swerve Test Public With Python
Robot-Swerve-Test-Public The codebase for our swerve drivetrain prototype robot.
Predicting Price of house by considering ,house age, Distance from public transport
House-Price-Prediction Predicting Price of house by considering ,house age, Distance from public transport, No of convenient stores around house etc..
A general and strong 3D object detection codebase that supports more methods, datasets and tools (debugging, recording and analysis).
ALLINONE-Det ALLINONE-Det is a general and strong 3D object detection codebase built on OpenPCDet, which supports more methods, datasets and tools (de
Transformers based fully on MLPs
Awesome MLP-based Transformers papers An up-to-date list of Transformers based fully on MLPs without attention! Why this repo? After transformers and
Datasets, tools, and benchmarks for representation learning of code.
The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided
Anomaly detection related books, papers, videos, and toolboxes
Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify
Largest list of models for Core ML (for iOS 11+)
Since iOS 11, Apple released Core ML framework to help developers integrate machine learning models into applications. The official documentation We'v
YoutubeDownloader - Download any public Playlist from Youtube
YoutubeDownloader Download any public Youtube Channel / Playlist Features Bulk d
Customer Service Requests Analysis is one of the practical life problems that an analyst may face. This Project is one such take. The project is a beginner to intermediate level project. This repository has a Source Code, README file, Dataset, Image and License file.
Customer Service Requests Analysis Project 1 DESCRIPTION Background of Problem Statement : NYC 311's mission is to provide the public with quick and e
A curated list of awesome Active Learning
Awesome Active Learning 🤩 A curated list of awesome Active Learning ! 🤩 Background (image source: Settles, Burr) What is Active Learning? Active lea
Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.
Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.
A Python script that parses and checks public proxies. Multithreading is supported.
A Python script that parses and checks public proxies. Multithreading is supported.
Streamlit tool to explore coco datasets
What is this This tool given a COCO annotations file and COCO predictions file will let you explore your dataset, visualize results and calculate impo
This repository contains datasets and baselines for benchmarking Chinese text recognition.
Benchmarking-Chinese-Text-Recognition This repository contains datasets and baselines for benchmarking Chinese text recognition. Please see the corres
Interactive dimensionality reduction for large datasets
BlosSOM 🌼 BlosSOM is a graphical environment for running semi-supervised dimensionality reduction with EmbedSOM. You can use it to explore multidimen
QR-Generator - An awesome QR Generator to create or customize your QR's
QR Generator An awesome QR Generator to create or customize your QR's! Table of
CryptoApp - Python code to pull wallet balances from a variety of different chains through nothing other than your public key.
CryptoApp - Python code to pull wallet balances from a variety of different chains through nothing other than your public key.
Pixoo-Awesome is a tool to get more out of your Pixoo Devices.
Pixoo-Awesome is a tool to get more out of your Pixoo Devices. It uses the Pixoo-Client to connect to your Pixoo devices and send data to them. I targ
This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.
Code-and-Dataset-for-CapSal This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detec
This repository focus on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP
Awesome-Visual-Captioning Table of Contents ACL-2021 CVPR-2021 AAAI-2021 ACMMM-2020 NeurIPS-2020 ECCV-2020 CVPR-2020 ACL-2020 AAAI-2020 ACL-2019 NeurI
A collection of awesome resources image-to-image translation.
awesome image-to-image translation A collection of resources on image-to-image translation. Contributing If you think I have missed out on something (
Both social media sentiment and stock market data are crucial for stock price prediction
Relating-Social-Media-to-Stock-Movement-Public - We explore the application of Machine Learning for predicting the return of the stock by using the information of stock returns. A trading strategy based on this analysis leads to increased trading profits up to three times compared with a simple buy and hold strategy.
Ip-Seeker - See Details With Public Ip && Find Web Ip Addresses
IP SEEKER See Details With Public Ip && Find Web Ip Addresses Tool By Heshan
Awesome-google-colab - Google Colaboratory Notebooks and Repositories
Unofficial Google Colaboratory Notebook and Repository Gallery Please contact me to take over and revamp this repo (it gets around 30k views and 200k
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.
Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading
Awesome AI books Some awesome AI related books and pdfs for downloading and learning. Preface This repo only used for learning, do not use in business
🕵 Artificial Intelligence for social control of public administration
Non-tech crash course into Operação Serenata de Amor Tech crash course into Operação Serenata de Amor Contributing with code and tech skills Supportin
Awesome Graph Classification - A collection of important graph embedding, classification and representation learning papers with implementations.
A collection of graph classification methods, covering embedding, deep learning, graph kernel and factorization papers
Public repository containing materials used for Feed Forward (FF) Neural Networks article.
Art041_NN_Feed_Forward Public repository containing materials used for Feed Forward (FF) Neural Networks article. -- Illustration of a very simple Fee
SPT_LSA_ViT - Implementation for Visual Transformer for Small-size Datasets
Vision Transformer for Small-Size Datasets Seung Hoon Lee and Seunghyun Lee and Byung Cheol Song | Paper Inha University Abstract Recently, the Vision
StyleGAN2-ADA-training-jupyter - Training custom datasets in styleGAN2-ADA by NVIDIA using Jupyter
styleGAN2-ADA-training-jupyter Training custom datasets in styleGAN2-ADA on Jupyter Official StyleGAN2-ADA by NIVIDIA Paper Training Generative Advers
Genshin-assets - 👧 Public documentation & static assets for Genshin Impact data.
genshin-assets This repo provides easy access to the Genshin Impact assets, primarily for use on static sites. Sources Genshin Optimizer - An Artifact
Instagram_scrapper - This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or excel file easily.
Instagram_scrapper This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or exce
Datasets for new state-of-the-art challenge in disentanglement learning
High resolution disentanglement datasets This repository contains the Falcor3D and Isaac3D datasets, which present a state-of-the-art challenge for co
Meta Learning for Semi-Supervised Few-Shot Classification
few-shot-ssl-public Code for paper Meta-Learning for Semi-Supervised Few-Shot Classification. [arxiv] Dependencies cv2 numpy pandas python 2.7 / 3.5+
This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database.
This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database. Its original intention is to monitor cryptocurrency related channels, but it can be configured to read any Telegram data that is accessible through the API.
A curated list of awesome Model-Based RL resources
Awesome Model-Based Reinforcement Learning This is a collection of research papers for model-based reinforcement learning (mbrl). And the repository w
Quickly download, clean up, and install public datasets into a database management system
Finding data is one thing. Getting it ready for analysis is another. Acquiring, cleaning, standardizing and importing publicly available data is time
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least complexity possible
Location of public benchmarking; primarily final results
CSL_public_benchmark This repo is intended to provide a periodically-updated, public view into genome sequencing benchmarks managed by HudsonAlpha's C
A NASA MEaSUREs project to provide automated, low latency, global glacier flow and elevation change datasets
Notebooks A NASA MEaSUREs project to provide automated, low latency, global glacier flow and elevation change datasets This repository provides tools
Code to generate datasets used in "How Useful is Self-Supervised Pretraining for Visual Tasks?"
Synthetic dataset rendering Framework for producing the synthetic datasets used in: How Useful is Self-Supervised Pretraining for Visual Tasks? Alejan
gcrypter: an encryption algorithm based on bytes and their correspondent numbers to encode strings
gcrypter: an encryption algorithm based on bytes and their correspondent numbers to encode strings
Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision
Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size. The hub data layout enables rapid transformations and streaming of data while training models at scale. Hub is used by Google, Waymo, Red Cross, Oxford University, and Omdena.
A collection of awesome sqlite tools, scripts, books, etc
Awesome Series @ Planet Open Data World (Countries, Cities, Codes, ...) • Football (Clubs, Players, Stadiums, ...) • SQLite (Tools, Books, Schemas, ..
Datasets with Softcatalà website content
softcatala-web-dataset This repository contains Sofcatalà web site content (articles and programs descriptions). Dataset are available in the dataset
LynxKite: a complete graph data science platform for very large graphs and other datasets.
LynxKite is a complete graph data science platform for very large graphs and other datasets. It seamlessly combines the benefits of a friendly graphical interface and a powerful Python API.
A Python library that simplifies the extraction of datasets from XML content.
xmldataset: simple xml parsing 🗃️ XML Dataset: simple xml parsing Documentation: https://xmldataset.readthedocs.io A Python library that simplifies t
This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the robots of the future.
This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the robots of the future.
Index different CKAN entities in Solr, not just datasets
ckanext-sitesearch Index different CKAN entities in Solr, not just datasets Requirements This extension requires CKAN 2.9 or higher and Python 3 Featu
Final project code: Implementing MAE with downscaled encoders and datasets, for ESE546 FA21 at University of Pennsylvania
546 Final Project: Masked Autoencoder Haoran Tang, Qirui Wu 1. Training To train the network, please run mae_pretraining.py. Please modify folder path
Some bravo or inspiring research works on the topic of curriculum learning.
Awesome Curriculum Learning Some bravo or inspiring research works on the topic of curriculum learning. A curated list of awesome Curriculum Learning
Open source annotation tool for machine learning practitioners.
doccano doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling and sequ
AKShare is an elegant and simple financial data interface library for Python, built for human beings
AKShare is an elegant and simple financial data interface library for Python, built for human beings
Flashback is an awesome, retro IRC based app built using Django
Flashback Flashback is an awesome, retro IRC based app built using Django (and the Django Rest Framework) for the backend as well as React for the fro
An elaborate and exhaustive paper list for Named Entity Recognition (NER)
Named-Entity-Recognition-NER-Papers by Pengfei Liu, Jinlan Fu and other contributors. An elaborate and exhaustive paper list for Named Entity Recognit
Generate bitcoin public and private keys and check if they match a filelist of existing addresses that have a nonzero balance
btc-heist Running Install deps, i.e., python3 -m pip install -r requirements.txt Download the CSV dump of all bitcoin addresses with a balance and cut
Public repository created to store my custom-made tools for Just Dance (UbiArt Engine)
Woody's Just Dance Tools Public repository created to store my custom-made tools for Just Dance (UbiArt Engine) Development and updates Almost all of
Public Mirror of Team 15's Code and Reports for RBE 3002 B21
RBE3002 Team 15 Lab Repository Team 15's Repository for all code written for RBE 3002 using the Robotis TurtleBot3 Written By Matthew Haahr, Leo Morri
Contains links to publicly available datasets for modeling health outcomes using speech and language.
speech-nlp-datasets Contains links to publicly available datasets for modeling various health outcomes using speech and language. Speech-based Corpora
🏆 • 5050 most frequent words in 109 languages
🏆 Most Common Words Multilingual 5000 most frequent words in 109 languages. Uses wordfrequency.info as a source. 🔗 License source code license data
A curated list of resources dedicated to reinforcement learning applied to cyber security.
Awesome Reinforcement Learning for Cyber Security A curated list of resources dedicated to reinforcement learning applied to cyber security. Note that
A collection of resources on neural rendering.
awesome neural rendering A collection of resources on neural rendering. Contributing If you think I have missed out on something (or) have any suggest
A curated list of awesome neural radiance fields papers
Awesome Neural Radiance Fields A curated list of awesome neural radiance fields papers, inspired by awesome-computer-vision. How to submit a pull requ
demir.ai Dataset Operations
demir.ai Dataset Operations With this application, you can have the empty values (nan/null) deleted or filled before giving your dataset to machine le
🏃♀️ A curated list about human motion capture, analysis and synthesis.
Awesome Human Motion 🏃♀️ A curated list about human motion capture, analysis and synthesis. Contents Introduction Human Models Datasets Data Process
A simple, fast, and awesome discord nuke bot! The only thing you need to add is your bot token.
SimpleNukeBot A simple, fast, and awesome discord nuke bot! The only thing you need to add is your bot token. Instructions: All you need to do is crea
DataShare - Simple library for data sharing between scripts and public functions calling
DataShare - Simple library for data sharing between scripts and public functions calling. Installation. Install code, Delete LICENSE, README, readme.t
A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization
University1652-Baseline [Paper] [Slide] [Explore Drone-view Data] [Explore Satellite-view Data] [Explore Street-view Data] [Video Sample] [中文介绍] This
This repository contains the code, models and datasets discussed in our paper "Few-Shot Question Answering by Pretraining Span Selection"
Splinter This repository contains the code, models and datasets discussed in our paper "Few-Shot Question Answering by Pretraining Span Selection", to
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Tensor2Tensor Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and ac
Awesome Treasure of Transformers Models Collection
💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️
Provides script to download and format public IP lists related to the Log4j exploit.
Provides script to download and format public IP lists related to the Log4j exploit. Current format includes: plain list, Cisco ASA Network Group.
A public data repository for datasets created from TransLink GTFS data.
TransLink Spatial Data What: TransLink is the statutory public transit authority for the Metro Vancouver region. This GitHub repository is a collectio
The datasets and code of ACL 2021 paper "Aspect-Category-Opinion-Sentiment Quadruple Extraction with Implicit Aspects and Opinions".
Aspect-Category-Opinion-Sentiment (ACOS) Quadruple Extraction This repo contains the data sets and source code of our paper: Aspect-Category-Opinion-S
Accurately separate the TLD from the registered domain and subdomains of a URL, using the Public Suffix List.
tldextract Python Module tldextract accurately separates the gTLD or ccTLD (generic or country code top-level domain) from the registered domain and s
A collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to provide an easy-to-use access to them.
KGQA Datasets Brief Introduction This repository is a collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to
flask extension for integration with the awesome pydantic package
Flask-Pydantic Flask extension for integration of the awesome pydantic package with Flask. Installation python3 -m pip install Flask-Pydantic Basics U
CleanX is an open source python library for exploring, cleaning and augmenting large datasets of X-rays, or certain other types of radiological images.
cleanX CleanX is an open source python library for exploring, cleaning and augmenting large datasets of X-rays, or certain other types of radiological
Binary classification for arrythmia detection with ECG datasets.
HEART DISEASE AI DATATHON 2021 [Eng] / [Kor] #English This is an AI diagnosis modeling contest that uses the heart disease echocardiography and electr
ProxyBroker is an open source tool that asynchronously finds public proxies from multiple sources and concurrently checks them
ProxyBroker is an open source tool that asynchronously finds public proxies from multiple sources and concurrently checks them. Features F
wxPython's Project Phoenix. A new implementation of wxPython, better, stronger, faster than he was before.
wxPython Project Phoenix Introduction Welcome to wxPython's Project Phoenix! Phoenix is the improved next-generation wxPython, "better, stronger, fast
Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets
Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets What is LASSL • How to Use What is LASSL LASSL은 LAnguage Semi-Super
Synthetic Data Generation for tabular, relational and time series data.
An Open Source Project from the Data to AI Lab, at MIT Website: https://sdv.dev Documentation: https://sdv.dev/SDV User Guides Developer Guides Github
Goal: Enable awesome tooling for Bazel users of the C language family.
Hedron's Compile Commands Extractor for Bazel — User Interface What is this project trying to do for me? First, provide Bazel users cross-platform aut
Models, datasets and tools for Facial keypoints detection
Template for Data Science Project This repo aims to give a robust starting point to any Data Science related project. It contains readymade tools setu
Collection of awesome Python types, stubs, plugins, and tools to work with them.
Awesome Python Typing Collection of awesome Python types, stubs, plugins, and tools to work with them. Contents Static type checkers Dynamic type chec
A simple CLI tool for getting region-specific status of Logz.io components.
About A simple CLI tool for checking the current status of Logz.io components per region. Built With Python 3 The following packeges (see requirements
Deep Learning Datasets Maker is a QGIS plugin to make datasets creation easier for raster and vector data.
Deep Learning Dataset Maker Deep Learning Datasets Maker is a QGIS plugin to make datasets creation easier for raster and vector data. How to use Down
Template for a rest app with flask, flask-rest and more...
Flask REST Template About the project (some comments): The ideia behind the project is to create an useful and simple template for an rest app . Besid
[NeurIPS 2019] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma This is the offi