290 Repositories
Python process-mining Libraries
PyPOTS - A Python Toolbox for Data Mining on Partially-Observed Time Series
A python toolbox/library for data mining on partially-observed time series, supporting tasks of forecasting/imputation/classification/clustering on incomplete multivariate time series with missing values.
Process text, including tokenizing and representing sentences as vectors and Applying some concepts like RNN, LSTM and GRU to create a classifier can detect the language in which a sentence is written from among 17 languages.
Language Identifier What is this ? The goal of this project is to create a model that is able to predict a given sentence language through text proces
Aiming at the common training datsets split, spectrum preprocessing, wavelength select and calibration models algorithm involved in the spectral analysis process
Aiming at the common training datsets split, spectrum preprocessing, wavelength select and calibration models algorithm involved in the spectral analysis process, a complete algorithm library is established, which is named opensa (openspectrum analysis).
Snakemake worflow to process and filter long read data from Oxford Nanopore Technologies.
Nanopore-Workflow Snakemake workflow to process and filter long read data from Oxford Nanopore Technologies. It is designed to compare whole human gen
An IPC based on Websockets, fast, stable, and reliable
winerp An IPC based on Websockets. Fast, Stable, and easy-to-use, for inter-communication between your processes or discord.py bots. Key Features Fast
This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)
GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa
Process JSON files for neural recording sessions using Medtronic's BrainSense Percept PC neurostimulator
percept_processing This code processes JSON files for streamed neural data using Medtronic's Percept PC neurostimulator with BrainSense Technology for
This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 - treatments and vaccinations.
Project: Text Analysis - This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 -
This repository is used to simplify the process of cloning the SSM documents across the AWS regions.
SSM Cloner Introduction This module is created in order to simplify the process of copying the SSM documents from one region to another regions. As an
The purpose of this bot is to take soundcloud track requests, that are posted in the stream-requests channel, and add them to a playlist, to make the process of scrolling through the requests easier for Root
Discord Song Collector Dont know if anyone is actually going to read this, but the purpose of this bot is to check every message in the stream-request
IPscan - This Script is Framework To automate IP process large scope For Bug Hunting
IPscan This Script is Framework To automate IP process large scope For Bug Hunti
dash-manufacture-spc-dashboard is a dashboard for monitoring read-time process quality along manufacture production line
In our solution based on plotly, dash and influxdb, the user will firstly generate the specifications for different robots, and then a wide range of interactive visualizations for different machines for machine power, machine cost, and total cost based on the energy time and total energy getting dynamically from sensors. If a threshold is met, the alert email is generated for further operation.
Crowd-Kit is a powerful Python library that implements commonly-used aggregation methods for crowdsourced annotation and offers the relevant metrics and datasets
Crowd-Kit: Computational Quality Control for Crowdsourcing Documentation Crowd-Kit is a powerful Python library that implements commonly-used aggregat
GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.
The GT4SD (Generative Toolkit for Scientific Discovery) is an open-source platform to accelerate hypothesis generation in the scientific discovery process. It provides a library for making state-of-the-art generative AI models easier to use.
A research of IT labor market based especially on hh.ru. Salaries, rate of technologies and etc.
hh_ru_research Проект реализован в учебных целях анализа рынка труда, в особенности по hh.ru Input data В качестве входных данных используются сериали
CLASSIX is a fast and explainable clustering algorithm based on sorting
CLASSIX Fast and explainable clustering based on sorting CLASSIX is a fast and explainable clustering algorithm based on sorting. Here are a few highl
Start and stop your NiceHash miners using this script.
NiceHash Mining Scheduler Use this script to schedule your NiceHash Miner(s). Electricity costs between 4-9pm are high in my area and I want NiceHash
Genetic algorithms are heuristic search algorithms inspired by the process that supports the evolution of life.
Genetic algorithms are heuristic search algorithms inspired by the process that supports the evolution of life. The algorithm is designed to replicate the natural selection process to carry generation, i.e. survival of the fittest of beings.
GraphNLI: A Graph-based Natural Language Inference Model for Polarity Prediction in Online Debates
GraphNLI: A Graph-based Natural Language Inference Model for Polarity Prediction in Online Debates Vibhor Agarwal, Sagar Joglekar, Anthony P. Young an
This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities
MLOps with Vertex AI This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities. The ex
Enable geospatial data mining through Google Earth Engine in Grasshopper 3D, via its most recent Hops component.
AALU_Geo Mining This repository is produced for a masterclass at the Architectural Association Landscape Urbanism programme. Requirements Rhinoceros (
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊
The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it inside a loop of Design, Model Development and Operations.
MLOps The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it insid
WikiPron - a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary
WikiPron WikiPron is a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary, as well as a database of pronuncia
Audio Source Separation is the process of separating a mixture into isolated sounds from individual sources
Audio Source Separation is the process of separating a mixture into isolated sounds from individual sources (e.g. just the lead vocals).
Towards Fine-Grained Reasoning for Fake News Detection
FinerFact This is the PyTorch implementation for the FinerFact model in the AAAI 2022 paper Towards Fine-Grained Reasoning for Fake News Detection (Ar
This jupyter notebook project was completed by me and my friend using the dataset from Kaggle
ARM This jupyter notebook project was completed by me and my friend using the dataset from Kaggle. The world Happiness 2017, which ranks 155 countries
Feature engineering and machine learning: together at last
Feature engineering and machine learning: together at last! Lambdo is a workflow engine which significantly simplifies data analysis by unifying featu
Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining
**Tutorials, examples, collections, and everything else that falls into the categories: pattern classification, machine learning, and data mining.** S
An ongoing process to make a physics engine using python.
Simple_Physics_Engine An ongoing process to make a physics engine using python. I am using this goal as a way to learn python in and out. I am trying
Process your transactions from etherscan (and other forks) into excel file for easier manipulation.
DEGEN TRACKER Read first This is my first Python open source project and it is very likely full of bad practices and security issues. You should not u
Python PID Tuner - Makes a model of the System from a Process Reaction Curve and calculates PID Gains
PythonPID_Tuner_SOPDT Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a r
Node-level Graph Regression with Deep Gaussian Process Models
Node-level Graph Regression with Deep Gaussian Process Models Prerequests our implementation is mainly based on tensorflow 1.x and gpflow 1.x: python
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning (TPAMI 2022 - in submission)
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi
A curated list of awesome game datasets, and tools to artificial intelligence in games
🎮 Awesome Game Datasets In computer science, Artificial Intelligence (AI) is intelligence demonstrated by machines. Its definition, AI research as th
An awesome Data Science repository to learn and apply for real world problems.
AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s
Astrostatistics class for the MSc degree in Astrophysics at the University of Milan-Bicocca (Italy)
Astrostatistics Davide Gerosa - [email protected] University of Milano-Bicocca, 2022. Schedule Introduction Probability and Statistics I Probabi
Alienworlds-bot - A Python script made to automate the tidious job of mining on AlienWorlds
AlienWorlds bot A Python script designed to automate the tedious work of mining
This is the course project of AI3602: Data Mining of SJTU
This is the course project of AI3602: Data Mining of SJTU. Group Members include Jinghao Feng, Mingyang Jiang and Wenzhong Zheng.
Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.
DistilBERT-Text-mining-authorship-attribution Dataset used: https://www.kaggle.com/azimulh/tweets-data-for-authorship-attribution-modelling/version/2
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
Designed a greedy algorithm based on Markov sequential decision-making process in MATLAB/Python to optimize using Gurobi solver
Designed a greedy algorithm based on Markov sequential decision-making process in MATLAB/Python to optimize using Gurobi solver, the wheel size, gear shifting sequence by modeling drivetrain constraints to achieve maximum laps in a race with a 2-hour time window.
Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
Download and display GOES-East and GOES-West data GOES-East and GOES-West satellite data are made available on Amazon Web Services through NOAA's Big
Tattoo - System for automating the Gentoo arch testing process
Naming origin Well, naming things is very hard. Thankfully we have an excellent
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by
Python PID Tuner - Based on a FOPDT model obtained using a Open Loop Process Reaction Curve
PythonPID_Tuner Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a rough e
macOS development environment setup: Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process.
dev-setup Motivation Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process. dev-setup aims to simplify the process w
Implements a fake news detection program using classifiers.
Fake news detection Implements a fake news detection program using classifiers for Data Mining course at UoA. Description The project is the categoriz
(JMLR' 19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Python Outlier Detection (PyOD) Deployment & Documentation & Stats & License PyOD is a comprehensive and scalable Python toolkit for detecting outlyin
Analyzed the data of VISA applicants to build a predictive model to facilitate the process of VISA approvals.
Analyzed the data of Visa applicants, built a predictive model to facilitate the process of visa approvals, and based on important factors that significantly influence the Visa status recommended a suitable profile for the applicants for whom the visa should be certified or denied.
PAMI stands for PAttern MIning. It constitutes several pattern mining algorithms to discover interesting patterns in transactional/temporal/spatiotemporal databases
Introduction PAMI stands for PAttern MIning. It constitutes several pattern mining algorithms to discover interesting patterns in transactional/tempor
SQL centered, docker process running game
REQUIREMENTS Linux Docker Python/bash set up image "docker build -t game ." create db container "run my_whatever/game_docker/pdb create" # creating po
Anomaly detection related books, papers, videos, and toolboxes
Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify
Django CMS Project for quicksetup with minimal installation process.
Django CMS Project for quicksetup with minimal installation process.
BERN2: an advanced neural biomedical namedentity recognition and normalization tool
BERN2 We present BERN2 (Advanced Biomedical Entity Recognition and Normalization
A Telegram crawler to search groups and channels automatically and collect any type of data from them.
Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb
Process incoming JSON-RPC requests in Python
August 16, 2021: Version 5 has been released. Read about the changes in version 5, or read the full documentation. Version 5 is for Python 3.8+ only.
Algorand-app - This tutorial is designed to get you started with Algorand development in a step by step process
Getting Started This tutorial is designed to get you started with Algorand devel
Codebase for Attentive Neural Hawkes Process (A-NHP) and Attentive Neural Datalog Through Time (A-NDTT)
Introduction Codebase for the paper Transformer Embeddings of Irregularly Spaced Events and Their Participants. This codebase contains two packages: a
Churn-Prediction-Project - In this project, a churn prediction model is developed for a private bank as a term project for Data Mining class.
Churn-Prediction-Project In this project, a churn prediction model is developed for a private bank as a term project for Data Mining class. Project in
Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading
Awesome AI books Some awesome AI related books and pdfs for downloading and learning. Preface This repo only used for learning, do not use in business
Mining-the-Social-Web-3rd-Edition - The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)
Mining the Social Web, 3rd Edition The official code repository for Mining the Social Web, 3rd Edition (O'Reilly, 2019). The book is available from Am
Youtube_dl_helper - A hacky python script meant to automate the process of downloading mp3 files from YouTube using youtube-dl library
youtube_dl_helper A helper program meant to automate the process of downloading mp3 files from YouTube using youtube-dl library Dependencies In order
A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.
TennisBusinessIntelligenceProject - A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.
Apriori - An algorithm for frequent item set mining and association rule learning over relational databases
Apriori Apriori is an algorithm for frequent item set mining and association rul
Vibrating-perimeter - Simple helper mod that logs how fast you are mining together with a simple buttplug.io script to control a vibrator
Vibrating Perimeter This project consists of a small minecraft helper mod that writes too a log file and a script that reads said log. Currently it on
A minimal implementation of Gaussian process regression in PyTorch
pytorch-minimal-gaussian-process In search of truth, simplicity is needed. There exist heavy-weighted libraries, but as you know, we need to go bare b
Class-imbalanced / Long-tailed ensemble learning in Python. Modular, flexible, and extensible
IMBENS: Class-imbalanced Ensemble Learning in Python Language: English | Chinese/中文 Links: Documentation | Gallery | PyPI | Changelog | Source | Downl
Img-process-manual - Utilize Python Numpy and Matplotlib to realize OpenCV baisc image processing function
Img-process-manual - Opencv Library basic graphic processing algorithm coding reproduction based on Numpy and Matplotlib library
Implementation of paper "Self-supervised Learning on Graphs:Deep Insights and New Directions"
SelfTask-GNN A PyTorch implementation of "Self-supervised Learning on Graphs: Deep Insights and New Directions". [paper] In this paper, we first deepe
A script to automate the process of downloading Markdown and CSV backups of Notion
Automatic-Notion-Backup A script to automate the process of downloading Markdown and CSV backups of Notion. In addition, the data is processed to remo
Mycodo is open source software for the Raspberry Pi that couples inputs and outputs in interesting ways to sense and manipulate the environment.
Mycodo Environmental Regulation System Latest version: 8.12.9 Mycodo is open source software for the Raspberry Pi that couples inputs and outputs in i
A Python package to request and process seismic waveform data from Hi-net.
HinetPy is a Python package to simplify tedious data request, download and format conversion tasks related to NIED Hi-net. NIED Hi-net | Source Code |
🍋 A Python package to process food
Pyfood is a simple Python package to process food, in different languages. Pyfood's ambition is to be the go-to library to deal with food, recipes, on
A Python package to create, run, and post-process MODFLOW-based models.
Version 3.3.5 — release candidate Introduction FloPy includes support for MODFLOW 6, MODFLOW-2005, MODFLOW-NWT, MODFLOW-USG, and MODFLOW-2000. Other s
A Python library that tees the standard output & standard error from the current process to files on disk, while preserving terminal semantics
A Python library that tees the standard output & standard error from the current process to files on disk, while preserving terminal semantics (so breakpoint(), etc work as normal)
A tool to help the Poly copy-reading process! :D
PolyBot A tool to help the Poly copy-reading process! :D Let's face it-computers are better are repeatitive tasks. And, in spite of what one may want
A complete Python application to automatize the process of uploading files to Amazon S3
Upload files or folders (even with subfolders) to Amazon S3 in a totally automatized way taking advantage of: Amazon S3 Multipart Upload: The uploaded
Robotic Process Automation in Windows and Linux by using Driagrams.net BPMN diagrams.
BPMN_RPA Robotic Process Automation in Windows and Linux by using BPMN diagrams. With this Framework you can draw Business Process Model Notation base
Python PID Controller and Process Simulator (FOPDT) with GUI.
PythonPID_Simulator Python PID Controller and Process Simulator (FOPDT) with GUI. Run the File. Then select Model Values and Tune PID.. Hit Refresh to
Monitor the stability of a pandas or spark dataframe ⚙︎
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
CleverCSV is a Python package for handling messy CSV files.
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Source code of the paper "Deep Learning of Latent Variable Models for Industrial Process Monitoring".
Source code of the paper "Deep Learning of Latent Variable Models for Industrial Process Monitoring".
A dynamic multi-STL, multi-process OpenSCAD build system with autoplating support
scad-build This is a multi-STL OpenSCAD build system based around GNU make. It supports dynamic build targets, intelligent previews with user-defined
🍰 ConnectMP - An easy and efficient way to share data between Processes in Python.
ConnectMP - Taking Multi-Process Data Sharing to the moon 🚀 Contribute · Community · Documentation 🎫 Introduction : 🍤 ConnectMP is the easiest and
This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CNPs), Neural Processes (NPs), Attentive Neural Processes (ANPs).
The Neural Process Family This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CN
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.
Multi-Process Vulnerability Tool
Multi-Process Vulnerability Tool
This Python script will automate the process of uploading your project to GitHub.
ProjectToGithub This Python script will help you to upload your project to Github without having to type in any commands !!! Quick Start guide First C
FFPuppet is a Python module that automates browser process related tasks to aid in fuzzing
FFPuppet FFPuppet is a Python module that automates browser process related tasks to aid in fuzzing. Happy bug hunting! Are you fuzzing the browser? G
demir.ai Dataset Operations
demir.ai Dataset Operations With this application, you can have the empty values (nan/null) deleted or filled before giving your dataset to machine le
Process RunGap output file of a workout and load data into Apple Numbers Spreadsheet and my website with API calls
BSD 3-Clause License Copyright (c) 2020, Mike Bromberek All rights reserved. ProcessWorkout Exercise data is exported in JSON format to iCloud using
Skull shaped MOSFET cells for the Efabless's 130nm process
SkullFET Skull shaped MOSFET cells for the Efabless's 130nm process List of cells Inverter Copyright (C) 2021 Uri Shaked
Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.
Codes-for-Algorithms Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.
Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification
Fine-grainedImageClassification Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification We trained model here: lin
Kaggle Competition using 15 numerical predictors to predict a continuous outcome.
Kaggle-Comp.-Data-Mining Kaggle Competition using 15 numerical predictors to predict a continuous outcome as part of a final project for a stats data
Main purpose of this project is to provide the service to automate the API testing process
PPTester project Main purpose of this project is to provide the service to automate the API testing process. In order to deploy this service use you s
Mercer Gaussian Process (MGP) and Fourier Gaussian Process (FGP) Regression
Mercer Gaussian Process (MGP) and Fourier Gaussian Process (FGP) Regression We provide the code used in our paper "How Good are Low-Rank Approximation
A lightweight library designed to accelerate the process of training PyTorch models by providing a minimal
A lightweight library designed to accelerate the process of training PyTorch models by providing a minimal, but extensible training loop which is flexible enough to handle the majority of use cases, and capable of utilizing different hardware options with no code changes required.