3238 Repositories
Python open-house-london-data Libraries
Data-driven reduced order modeling for nonlinear dynamical systems
SSMLearn Data-driven Reduced Order Models for Nonlinear Dynamical Systems This package perform data-driven identification of reduced order model based
WorldsCollide - Final Fantasy VI Randomizer
FFVI Worlds Collide Worlds Collide is an open worlds randomizer for Final Fantas
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi
A curated list of awesome game datasets, and tools to artificial intelligence in games
🎮 Awesome Game Datasets In computer science, Artificial Intelligence (AI) is intelligence demonstrated by machines. Its definition, AI research as th
An awesome Data Science repository to learn and apply for real world problems.
AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
applied-ml Curated papers, articles, and blogs on data science & machine learning in production. ⚙️ Figuring out how to implement your ML project? Lea
A curated list of awesome DataOps tools
Awesome DataOps A curated list of awesome DataOps tools. Awesome DataOps Data Catalog Data Exploration Data Ingestion Data Lake Data Processing Data Q
Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord.
numpy2tfrecord Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord. Installation
DeepAmandine is an artificial intelligence that allows you to talk to it for hours, you won't know the difference.
DeepAmandine This is an artificial intelligence based on GPT-3 that you can chat with, it is very nice and makes a lot of jokes. We wish you a good ex
Drobo Status is a python program that will connect to your Drobo and return JSON data regarding your Drobo
This is a simple python script that will run a docker container to pull data from Drobo. It will give information like (Name, serial, firmware, disk-total, disk-used, disk-free and individual disk status). I have an older Drobo FS 8 disk array. Please let me know if anyone can test. The results are outputted via JSON
This project consists of data analysis and data visualization (done using python)of all IPL seasons from 2008 to 2019 and answering the most asked questions about the IPL.
IPL-data-analysis This project consists of data analysis and data visualization of all IPL seasons from 2008 to 2019 and answering the most asked ques
Astrostatistics class for the MSc degree in Astrophysics at the University of Milan-Bicocca (Italy)
Astrostatistics Davide Gerosa - [email protected] University of Milano-Bicocca, 2022. Schedule Introduction Probability and Statistics I Probabi
An open-source outlier detection package by Getcontact Data Team
pyfbad The pyfbad library supports anomaly detection projects. An end-to-end anomaly detection application can be written using the source codes of th
Open-source keylogger write in python
Python open-source keylogger Language Python open-source keylogger using pynput module Using Install dependences in archive setup.py or install.sh in
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.
PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand
The final project of "Applying AI to 2D Medical Imaging Data" of "AI for Healthcare" nanodegree - Udacity.
Pneumonia Detection from X-Rays Project Overview In this project, you will apply the skills that you have acquired in this 2D medical imaging course t
Credit EDA Case Study Using Python
This case study aims to identify patterns which indicate if a client has difficulty paying their installments which may be used for taking actions such as denying the loan, reducing the amount of loan, lending (to risky applicants) at a higher interest rate, etc
The final project for "Applying AI to Wearable Device Data" course from "AI for Healthcare" - Udacity.
Motion Compensated Pulse Rate Estimation Overview This project has 2 main parts. Develop a Pulse Rate Algorithm on the given training data. Then Test
Project for the discipline of Visual Data Analysis at EMAp FGV.
Analysis of the dissemination of fake news about COVID-19 on Twitter This project was the final work for the discipline of Visual Data Analysis of the
The final project of "Applying AI to EHR Data" of "AI for Healthcare" nanodegree - Udacity.
Patient Selection for Diabetes Drug Testing Project Overview EHR data is becoming a key source of real-world evidence (RWE) for the pharmaceutical ind
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.
Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result
Using Global fishing watch's data to build a machine learning model that can identify illegal fishing and poaching activities through satellite and geo-location data.
Using Global fishing watch's data to build a machine learning model that can identify illegal fishing and poaching activities through satellite and geo-location data.
A list of Python Bots used to extract data from several websites
A list of Python Bots used to extract data from several websites. Data extraction is for products on e-commerce (ecommerce) websites. Data fetched i
Scout: an open-source version of the monitoring tool
Badger Scout Scout is an open-source version of the monitoring tool used by Badg
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research.The pipeline is based on nn-UNet and has the capability to segment 120 unique tissue classes from a whole-body 18F-FDG PET/CT image.
A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022)
DFC2022 Baseline A simple baseline for the 2022 IEEE GRSS Data Fusion Contest (DFC2022) This repository uses TorchGeo, PyTorch Lightning, and Segmenta
Open-source offline translation library written in Python. Uses OpenNMT for translations
Open source neural machine translation in Python. Designed to be used either as a Python library or desktop application. Uses OpenNMT for translations and PyQt for GUI.
Calendar heatmaps from Pandas time series data
Note: See MarvinT/calmap for the maintained version of the project. That is also the version that gets published to PyPI and it has received several f
Unauthenticated Sqlinjection that leads to dump data base but this one impersonated Admin and drops a interactive shell
Unauthenticated Sqlinjection that leads to dump database but this one impersonated Admin and drops a interactive shell
Med to csv - A simple way to parse MedAssociate output file in tidy data
MedAssociates to CSV file A simple way to parse MedAssociate output file in tidy
Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)
Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)
This project is related to a No-SQL database, whose data are referred to autoctone botanic species
This project is related to a No-SQL database, whose data are referred to autoctone botanic species. The final goal is creating a function that performs the estimation of the ornamental value, given the specific characteristics of a single species.
CellRank's reproducibility repository.
CellRank's reproducibility repository We believe that reproducibility is key and have made it as simple as possible to reproduce our results. Please e
Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection
An official implementation of paper Data-Uncertainty Guided Multi-Phase Learning for Semi-supervised Object Detection
ALSPAC data analysis studying links between screen-usage and mental health issues in children. Provided data has been synthesised.
ADSMH - Mental Health and Screen Time Group coursework for Applied Data Science at the University of Bristol. Overview The data set that you have was
The repository is my code for various types of data visualization cases based on the Matplotlib library.
ScienceGallery The repository is my code for various types of data visualization cases based on the Matplotlib library. It summarizes the code and cas
STARCH compuets regional extreme storm physical characteristics and moisture balance based on spatiotemporal precipitation data from reanalysis or climate model data.
STARCH (Storm Tracking And Regional CHaracterization) STARCH computes regional extreme storm physical and moisture balance characteristics based on sp
Data aggregated from the reports found at the MCPS COVID Dashboard into a set of visualizations.
Montgomery County Public Schools COVID-19 Visualizer Contents About this project Data Support this project About this project Data All data we use can
This is the course project of AI3602: Data Mining of SJTU
This is the course project of AI3602: Data Mining of SJTU. Group Members include Jinghao Feng, Mingyang Jiang and Wenzhong Zheng.
Automatically pick a winner who Retweeted, Commented, and Followed your Twitter account!
AutomaticTwitterGiveaways automates selecting winners for "Retweet, Comment, Follow" type Twitter giveaways.
K Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching (To appear in RA-L 2022)
KCP The official implementation of KCP: k Closest Points and Maximum Clique Pruning for Efficient and Effective 3D Laser Scan Matching, accepted for p
RP2 is a privacy-focused, free, open-source US cryptocurrency tax calculator
Privacy-focused, free, open-source cryptocurrency US tax calculator, up to date for 2021: it handles multiple coins/exchanges and computes long/short-term capital gains, cost bases, in/out lot relationships, and account balances. It supports FIFO and LIFO and it generates output in form 8949 format. It has a programmable plugin architecture.
Intel® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs
Intel® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep learning frameworks to pursue optimal inference performance.
Python data loader for Solar Orbiter's (SolO) Energetic Particle Detector (EPD).
Data loader (and downloader) for Solar Orbiter/EPD energetic charged particle sensors EPT, HET, and STEP. Supports level 2 and low latency data provided by ESA's Solar Orbiter Archive.
Data Analytics on Genomes and Genetics
Data Analytics performed on On genomes and Genetics dataset to predict genetic disorder and disorder subclass. DONE by TEAM SIGMA!
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.
Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This
Predict the Site EUI, given the characteristics of the building and the weather data for the location of the building.
wids_datathon_2022 Description: Contains a data pipeline used to predict energy EUI Goals: Dataset exploration Automating the parameter fitting, gener
In this project , I play with the YouTube data API and extract trending videos in Nigeria on a particular day
YouTubeTrendingVideosAnalysis In this project , I played with the YouTube data API and extracted trending videos in Nigeria on a particular day. This
This repository structures data in title, summary, tags, sentiment given a fragment of a conversation
Understand-conversation-AI This repository structures data in title, summary, tags, sentiment given a fragment of a conversation How to install: pip i
Kinetics-Data-Preprocessing
Kinetics-Data-Preprocessing Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like Slow
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources.
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications
AWS Serverless Application Model (AWS SAM) The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications
Python Machine Learning Jupyter Notebooks (ML website)
Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also
Open-source demos hosted on Dash Gallery
Dash Sample Apps This repository hosts the code for over 100 open-source Dash apps written in Python or R. They can serve as a starting point for your
Semi-Automated Data Processing
Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meaningful decision to achieve a low-bias and low-variance model.
SiliconCompiler is an open source compiler framework that automates translation from source code to silicon.
SiliconCompiler is an open source compiler framework that aims to automate translation from source code to silicon.
The bidirectional mapping library for Python.
bidict The bidirectional mapping library for Python. Status bidict: has been used for many years by several teams at Google, Venmo, CERN, Bank of Amer
Python Library to get fast extensive Dummy Data for testing
Dumda Python Library to get fast extensive Dummy Data for testing https://pypi.org/project/dumda/ Installation pip install dumda Usage: Cities from d
Displaying plot of death rates from past years in Poland. Data source from these years is in readme
Average-Death-Rate Displaying plot of death rates from past years in Poland The goal collect the data from a CSV file count the ADR (Average Death Rat
Advanced raster and geometry manipulations
buzzard In a nutshell, the buzzard library provides powerful abstractions to manipulate together images and geometries that come from different kind o
Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
Download and display GOES-East and GOES-West data GOES-East and GOES-West satellite data are made available on Amazon Web Services through NOAA's Big
A collection of command-line interface games written in python
Command Line Interface Python Games Collection of some starter python game projects for beginners How to play these games Clone this repository git cl
Shelf DB is a tiny document database for Python to stores documents or JSON-like data
Shelf DB Introduction Shelf DB is a tiny document database for Python to stores documents or JSON-like data. Get it $ pip install shelfdb shelfquery S
Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle.
2019-indian-election-eda Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle. This project is a part of the Cou
DCM is a set of tools that helps you to keep your data in your Django Models consistent.
Django Consistency Model DCM is a set of tools that helps you to keep your data in your Django Models consistent. Motivation You have a lot of legacy
To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.
To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.
The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.
The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.
Data Science Course at Dept. of Computer Engineering, Chula 2022
2110446 Data Science Course at Chula 2022 Short links for exercises: Week1: Intro to Numpy, Pandas Numpy: https://colab.research.google.com/github/kao
Udacity - Data Analyst Nanodegree - Project 4 - Wrangle and Analyze Data
WeRateDogs Twitter Data from 2015 to 2017 Udacity - Data Analyst Nanodegree - Project 4 - Wrangle and Analyze Data Table of Contents Introduction Proj
BSDotPy, A module to get a bombsquad player's account data.
BSDotPy BSDotPy, A module to get a bombsquad player's account data from bombsquad's servers. Badges Provided By: shields.io Acknowledgements Issues Pu
Using knowledge-informed machine learning on the PRONOSTIA (FEMTO) and IMS bearing data sets. Predict remaining-useful-life (RUL).
Knowledge Informed Machine Learning using a Weibull-based Loss Function Exploring the concept of knowledge-informed machine learning with the use of a
Orange Chicken: Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation
Orange Chicken: Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation This repository contains code and data f
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks
Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks arXiv preprint: https://arxiv.org/abs/2201.02143. Architec
Source code for the plant extraction workflow introduced in the paper “Agricultural Plant Cataloging and Establishment of a Data Framework from UAV-based Crop Images by Computer Vision”
Plant extraction workflow Source code for the plant extraction workflow introduced in the paper "Agricultural Plant Cataloging and Establishment of a
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.
Active Transport Analytics Model: A new strategic transport modelling and data visualization framework
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”
Historic weather - Home Assistant custom component for accessing historic weather data
Historic Weather for Home Assistant (CC) 2022 by Andreas Frisch github@fraxinas.
In this repo, I will put all the code related to data science using python libraries like Numpy, Pandas, Matplotlib, Seaborn and many more.
Python-for-DS In this repo, I will put all the code related to data science using python libraries like Numpy, Pandas, Matplotlib, Seaborn and many mo
This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge.
Data-Science-Intern-Challenge This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge. Summer 2022 Data Science Inte
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew
Validate arbitrary image uploads from incoming data urls while preserving file integrity but removing EXIF and unwanted artifacts and RCE exploit potential
Validate arbitrary base64-encoded image uploads as incoming data urls while preserving image integrity but removing EXIF and unwanted artifacts and mitigating RCE-exploit potential.
This is a simple website crawler which asks for a website link from the user to crawl and find specific data from the given website address.
This is a simple website crawler which asks for a website link from the user to crawl and find specific data from the given website address.
Deep learning with TensorFlow and earth observation data.
Deep Learning with TensorFlow and EO Data Complete file set for Jupyter Book Autor: Development Seed Date: 04 October 2021 ISBN: (to come) Notebook tu
Big Data & Cloud Computing for Oceanography
DS2 Class 2022, Big Data & Cloud Computing for Oceanography Home of the 2022 ISblue Big Data & Cloud Computing for Oceanography class (IMT-A, ENSTA, I
Generating new names based on trends in data using GPT2 (Transformer network)
MLOpsNameGenerator Overall Goal The goal of the project is to develop a model that is capable of creating Pokémon names based on its description, usin
Python PID Tuner - Based on a FOPDT model obtained using a Open Loop Process Reaction Curve
PythonPID_Tuner Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a rough e
Official git for "CTAB-GAN: Effective Table Data Synthesizing"
CTAB-GAN This is the official git paper CTAB-GAN: Effective Table Data Synthesizing. The paper is published on Asian Conference on Machine Learning (A
Uses Open AI Gym environment to create autonomous cryptocurrency bot to trade cryptocurrencies.
Crypto_Bot Uses Open AI Gym environment to create autonomous cryptocurrency bot to trade cryptocurrencies. Steps to get started using the bot: Sign up
A Python package that can be used to download post and comment data from Reddit.
Reddit Data Collector Reddit Data Collector is a Python package that allows a user to collect post and comment data from Reddit. It is built on top of
WebApp Maker make web apps (Duh). It is open source and make with python and shell.
WebApp Maker make web apps (Duh). It is open source and make with python and shell. This app can take any website and turn it into an app. I highly recommend turning these few websites into webapps: - Krunker.io (Fps Game) - play.fancade.com (Minigame Arcade) - Your Own Website If You Have One Apart from that enjoy my app By 220735540 (a.k.a RP400)
A practical ML pipeline for data labeling with experiment tracking using DVC.
Auto Label Pipeline A practical ML pipeline for data labeling with experiment tracking using DVC Goals: Demonstrate reproducible ML Use DVC to build a
A vanilla 3D face modeling on pose-invariant and multi-lightning image data
3D-Face-Modeling A vanilla 3D face modeling on pose-invariant and multi-lightning image data Table of Contents Background Install Usage Contributing B
🎁 3,000,000+ Unsplash images made available for research and machine learning
The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of
A collection of machine learning examples and tutorials.
machine_learning_examples A collection of machine learning examples and tutorials.
Open-source jailbreaking tool for many iOS devices
Open-source jailbreaking tool for many iOS devices *Read disclaimer before using this software. checkm8 permanent unpatchable bootrom exploit for hund
Always know what to expect from your data.
Great Expectations Always know what to expect from your data. Introduction Great Expectations helps data teams eliminate pipeline debt, through data t
Jupyter notebook and datasets from the pandas Q&A video series
Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note
Low-Cost Open Source Ventilator or PAPR
Last updated 2020/04/19 Low-Cost Open-Source Ventilator-ish Device or PAPR NOTE: This is currently an independent project not affiliated with any comm
Code and data accompanying Natural Language Processing with PyTorch
Natural Language Processing with PyTorch Build Intelligent Language Applications Using Deep Learning By Delip Rao and Brian McMahan Welcome. This is a