290 Repositories
Python process-mining Libraries
Analyze, visualize and process sound field data recorded by spherical microphone arrays.
Sound Field Analysis toolbox for Python The sound_field_analysis toolbox (short: sfa) is a Python port of the Sound Field Analysis Toolbox (SOFiA) too
GTK and Python based, system performance and usage monitoring tool
System Monitoring Center GTK3 and Python 3 based, system performance and usage monitoring tool. Features: Detailed system performance and usage usage
This script has been created in order to find what are the most common demanded technologies in Data Engineering field.
This is a Python script that given a whole corpus of job descriptions and a file with keywords it extracts the number of number of ocurrences of these keywords and write it to a file. This script it is easy to extend to accept more functionalities
Project developed as part of a selection process for the company Denox
📝 Tabela de conteúdos Sobre Requisitos para rodar o projeto Instalação Rotas da API Observações 🧐 Sobre Projeto desenvolvido como parte de um proces
Papers about explainability of GNNs
Papers about explainability of GNNs
Reading streams of Twitter data, save them to Kafka, then process with Kafka Stream API and Spark Streaming
Using Streaming Twitter Data with Kafka and Spark Reading streams of Twitter data, publishing them to Kafka topic, process message using Kafka Stream
Process GPX files (adding sensor metrics, uploading to InfluxDB, etc.) exported from imxingzhe.com
Xingzhe GPX Processor 行者轨迹处理工具 Xingzhe sells cheap GPS bike meters with sensor support including cadence, heart rate and power. But the GPX files expo
This project is created to visualize the system statistics such as memory usage, CPU usage, memory accessible by process and much more using Kibana Dashboard with Elasticsearch.
System Stats Visualizer This project is created to visualize the system statistics such as memory usage, CPU usage, memory accessible by process and m
An assignment from my grad-level data mining course demonstrating some experience with NLP/neural networks/Pytorch
NLP-Pytorch-Assignment An assignment from my grad-level data mining course (before I started personal projects) demonstrating some experience with NLP
A fast, pure python implementation of the MuyGPs Gaussian process realization and training algorithm.
Fast implementation of the MuyGPs Gaussian process hyperparameter estimation algorithm MuyGPs is a GP estimation method that affords fast hyperparamet
A simple framwork to streamline the Domain Adaptation training process.
FastDA Introduction This is a simple framework for domain adaptation training. You can use it to build your own training process. It heavily relies on
Multi-Process / Censorship Detection
Multi-Process / Censorship Detection
A set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI.
Overview This is a set of simple scripts to process the Imagenet-1K dataset as TFRecords and make index files for NVIDIA DALI. Make TFRecords To run t
Discerning Decision-Making Process of Deep Neural Networks with Hierarchical Voting Transformation
Configurations Change HOME_PATH in CONFIG.py as the current path Data Prepare CENSINCOME Download data Put census-income.data and census-income.test i
Bayesian inference for Permuton-induced Chinese Restaurant Process (NeurIPS2021).
Permuton-induced Chinese Restaurant Process Note: Currently only the Matlab version is available, but a Python version will be available soon! This is
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms
MatrixProfile MatrixProfile is a Python 3 library, brought to you by the Matrix Profile Foundation, for mining time series data. The Matrix Profile is
Various Algorithms for Short Text Mining
Short Text Mining in Python Introduction This package shorttext is a Python package that facilitates supervised and unsupervised learning for short te
Full ELT process on GCP environment.
Rent Houses Germany - GCP Pipeline Project: The goal of the project is to extract data about house rentals in Germany, store, process and analyze it u
IMBENS: class-imbalanced ensemble learning in Python.
IMBENS: class-imbalanced ensemble learning in Python. Links: [Documentation] [Gallery] [PyPI] [Changelog] [Source] [Download] [知乎/Zhihu] [中文README] [a
Process dataframe in a easily way.
Popanda Written by Shengxuan Wang at OSU. Used for processing dataframe, especially for machine learning. The name is from "Po" in the movie Kung Fu P
Morpheus is a telegram bot that helps to simplify the process of making custom telegram stickers.
😎 Morpheus is a telegram bot that helps to simplify the process of making custom telegram stickers. As you may know, Telegram's official Sti
Conversational text Analysis using various NLP techniques
PyConverse Let me try first Installation pip install pyconverse Usage Please try this notebook that demos the core functionalities: basic usage noteb
Digital image process Basic algorithm
These are some basic algorithms that I have implemented by my hands in the process of learning digital image processing, such as mean and median filtering, sharpening algorithms, interpolation scaling algorithms, histogram equalization algorithms, etc.
This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression problems
Doctoral dissertation of Zheng Zhao This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression pro
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Command line utilities for tabular data files This is a set of command line utilities for manipulating large tabular data files. Files of numeric and
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Feature Engineering & Feature Selection A comprehensive guide [pdf] [markdown] for Feature Engineering and Feature Selection, with implementations and
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
This toolkit provides codes to download and pre-process the SLUE datasets, train the baseline models, and evaluate SLUE tasks.
slue-toolkit We introduce Spoken Language Understanding Evaluation (SLUE) benchmark. This toolkit provides codes to download and pre-process the SLUE
A Python package to process & model ChEMBL data.
insilico: A Python package to process & model ChEMBL data. ChEMBL is a manually curated chemical database of bioactive molecules with drug-like proper
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy
Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a
A Burp Suite extension made to automate the process of finding reverse proxy path based SSRF.
TProxer A Burp Suite extension made to automate the process of finding reverse proxy path based SSRF. How • Install • Todo • Join Discord How it works
Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.
counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code
Reproduction process of BERT on SST2 dataset
BERT-SST2-Prod Reproduction process of BERT on SST2 dataset 安装说明 下载代码库 git clone https://github.com/JunnYu/BERT-SST2-Prod 进入文件夹,安装requirements pip ins
A faster Python generator that get function results from multi-process workers
multiyield This package implements a Python generator that get function results from multi-process workers. The faster_fifo Queue (instead of the stan
PyClustering is a Python, C++ data mining library.
pyclustering is a Python, C++ data mining library (clustering algorithm, oscillatory networks, neural networks). The library provides Python and C++ implementations (C++ pyclustering library) of each algorithm or model. C++ pyclustering library is a part of pyclustering and supported for Linux, Windows and MacOS operating systems.
The Fundamental Clustering Problems Suite (FCPS) summaries 54 state-of-the-art clustering algorithms, common cluster challenges and estimations of the number of clusters as well as the testing for cluster tendency.
FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu
Contains an implementation (sklearn API) of the algorithm proposed in "GENDIS: GEnetic DIscovery of Shapelets" and code to reproduce all experiments.
GENDIS GENetic DIscovery of Shapelets In the time series classification domain, shapelets are small subseries that are discriminative for a certain cl
Mining the Stack Overflow Developer Survey
Mining the Stack Overflow Developer Survey A prototype data mining application to compare the accuracy of decision tree and random forest regression m
Code and real data for the paper "Counterfactual Temporal Point Processes", available at arXiv.
counterfactual-tpp This is a repository containing code and real data for the paper Counterfactual Temporal Point Processes. Pre-requisites This code
This repository contains Prior-RObust Bayesian Optimization (PROBO) as introduced in our paper "Accounting for Gaussian Process Imprecision in Bayesian Optimization"
Prior-RObust Bayesian Optimization (PROBO) Introduction, TOC This repository contains Prior-RObust Bayesian Optimization (PROBO) as introduced in our
DeepSTD: Mining Spatio-temporal Disturbances of Multiple Context Factors for Citywide Traffic Flow Prediction
DeepSTD: Mining Spatio-temporal Disturbances of Multiple Context Factors for Citywide Traffic Flow Prediction This is the implementation of DeepSTD in
MASS (Mueen's Algorithm for Similarity Search) - a python 2 and 3 compatible library used for searching time series sub-sequences under z-normalized Euclidean distance for similarity.
Introduction MASS allows you to search a time series for a subquery resulting in an array of distances. These array of distances enable you to identif
Gaussian Process Optimization using GPy
End of maintenance for GPyOpt Dear GPyOpt community! We would like to acknowledge the obvious. The core team of GPyOpt has moved on, and over the past
Responsible Machine Learning with Python
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Python module providing a framework to trace individual edges in an image using Gaussian process regression.
Edge Tracing using Gaussian Process Regression Repository storing python module which implements a framework to trace individual edges in an image usi
Turn crypto miner on/off depending on powerwall charge level
Mining Crypto with Tesla Solar and Powerwalls This script turns a crypto miner on and off when the Tesla Powerwall level drops/rises above a certain t
Python script using Twitter API to change user banner to see 100DaysOfCode process.
100DaysOfCode - Automatic Banners 👩💻 Adds a number to your twitter banner indicating the number of days you have in the #100DaysOfCode challenge Se
Obsei is a low code AI powered automation tool.
Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand image analysis, comparative study and more .
[ICCV 2021 Oral] Mining Latent Classes for Few-shot Segmentation
Mining Latent Classes for Few-shot Segmentation Lihe Yang, Wei Zhuo, Lei Qi, Yinghuan Shi, Yang Gao. This codebase contains baseline of our paper Mini
Non-Homogeneous Poisson Process Intensity Modeling and Estimation using Measure Transport
Non-Homogeneous Poisson Process Intensity Modeling and Estimation using Measure Transport This GitHub page provides code for reproducing the results i
A small Python Library to process Game Boy Camera images
GameBEye GameBEye is a Python Library to process Game Boy Camera images. Source code 📁 : https://github.com/mtouzot/GameBEye Issues 🆘 : https://gith
Python functions to run WASS stereo wave processing executables, and load and post process WASS output files.
wass-pyfuns Python functions to run the WASS stereo wave processing executables, and load and post process the WASS output files. General WASS (Waves
This is a tool to calculate a resulting color of the alpha blending process.
blec: alpha blending calculator This is a tool to calculate a resulting color of the alpha blending process. A gamma correction is enabled and the def
For the paper entitled ''A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining''
Summary This is the source code for the paper "A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining", which was accepted as fu
Implementation of association rules mining algorithms (Apriori|FPGrowth) using python.
Association Rules Mining Using Python Implementation of association rules mining algorithms (Apriori|FPGrowth) using python. As a part of hw1 code in
A tool to analyze leveraged liquidity mining and find optimal option combination for hedging.
LP-Option-Hedging Description A Python program to analyze leveraged liquidity farming/mining and find the optimal option combination for hedging imper
RMNA: A Neighbor Aggregation-Based Knowledge Graph Representation Learning Model Using Rule Mining
RMNA: A Neighbor Aggregation-Based Knowledge Graph Representation Learning Model Using Rule Mining Our code is based on Learning Attention-based Embed
PyTorch implementation for "Mining Latent Structures with Contrastive Modality Fusion for Multimedia Recommendation"
MIRCO PyTorch implementation for paper: Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation Dependencies Python 3.
Bayes-Newton—A Gaussian process library in JAX, with a unifying view of approximate Bayesian inference as variants of Newton's algorithm.
Bayes-Newton Bayes-Newton is a library for approximate inference in Gaussian processes (GPs) in JAX (with objax), built and actively maintained by Wil
SNV calling pipeline developed explicitly to process individual or trio vcf files obtained from Illumina based pipeline (grch37/grch38).
SNV Pipeline SNV calling pipeline developed explicitly to process individual or trio vcf files obtained from Illumina based pipeline (grch37/grch38).
A large-scale database for graph representation learning
A large-scale database for graph representation learning
Enhancing Knowledge Tracing via Adversarial Training
Enhancing Knowledge Tracing via Adversarial Training This repository contains source code for the paper "Enhancing Knowledge Tracing via Adversarial T
A python open source CMS scanner that automates the process of detecting security flaws of the most popular CMSs
CMSmap CMSmap is a python open source CMS scanner that automates the process of detecting security flaws of the most popular CMSs. The main purpose of
Limit your docker image size with a simple CLI command. Perfect to be used inside your CI process.
docker-image-size-limit Limit your docker image size with a simple CLI command. Perfect to be used inside your CI process. Read the announcing post. I
Easily Process a Batch of Cox Models
ezcox: Easily Process a Batch of Cox Models The goal of ezcox is to operate a batch of univariate or multivariate Cox models and return tidy result. ⏬
These are After Effects and Python files that were made in the process of creating the video for the contest.
spirograph These are After Effects and Python files that were made in the process of creating the video for the contest. In the python file you can qu
Gorrabot is a bot made to automate checks and processes in the development process.
Gorrabot is a Gitlab bot made to automate checks and processes in the Faraday development. Features Check that the CHANGELOG is modified By default, m
K-means clustering is a method used for clustering analysis, especially in data mining and statistics.
K Means Algorithm What is K Means This algorithm is an iterative algorithm that partitions the dataset according to their features into K number of pr
The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".
The HIST framework for stock trend forecasting The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining C
Convenient tool for speeding up the intern/officer review process.
icpc-app-screen Convenient tool for speeding up the intern/officer applicant review process. Eliminates the pain from reading application responses of
The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".
The HIST framework for stock trend forecasting The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining C
This tutorial will guide you through the process of self-hosting Polygon
Hosting guide This tutorial will guide you through the process of self-hosting Polygon Before starting Make sure you have the following tools installe
This is REST-API for Indonesian Text Summarization using Non-Negative Matrix Factorization for the algorithm to summarize documents and FastAPI for the framework.
Indonesian Text Summarization Using FastAPI This is REST-API for Indonesian Text Summarization using Non-Negative Matrix Factorization for the algorit
This is an official source code for implementation on Extensive Deep Temporal Point Process
Extensive Deep Temporal Point Process This is an official source code for implementation on Extensive Deep Temporal Point Process, which is composed o
A simple key-based text encryption process that encrypts a string based in a list of characteres pairs.
Simple Cipher Encrypter About | New Features | Exemple | How To Use | License ℹ️ About A simple key-based text encryption process that encrypts a stri
Blue Brain text mining toolbox for semantic search and structured information extraction
Blue Brain Search Source Code DOI Data & Models DOI Documentation Latest Release Python Versions License Build Status Static Typing Code Style Securit
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥
face.evoLVe: High-Performance Face Recognition Library based on PaddlePaddle & PyTorch Evolve to be more comprehensive, effective and efficient for fa
Python Implementation of algorithms in Graph Mining, e.g., Recommendation, Collaborative Filtering, Community Detection, Spectral Clustering, Modularity Maximization, co-authorship networks.
Graph Mining Author: Jiayi Chen Time: April 2021 Implemented Algorithms: Network: Scrabing Data, Network Construbtion and Network Measurement (e.g., P
XMRiGUI is free and open-source crypto miner for Linux. It uses XMRig for mining and GTK3 for GUI.
XMRiGUI is free and open-source crypto miner for Linux. It uses XMRig for mining and GTK3 for GUI.
10x faster matrix and vector operations
Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations. If yo
eyes is a Public Opinion Mining System focusing on taiwanese forums such as PTT, Dcard.
eyes is a Public Opinion Mining System focusing on taiwanese forums such as PTT, Dcard. Features 🔥 Article monitor: helps you capture the trend at a
novel deep learning research works with PaddlePaddle
Research 发布基于飞桨的前沿研究工作,包括CV、NLP、KG、STDM等领域的顶会论文和比赛冠军模型。 目录 计算机视觉(Computer Vision) 自然语言处理(Natrual Language Processing) 知识图谱(Knowledge Graph) 时空数据挖掘(Spa
Code to reproduce the experiments from our NeurIPS 2021 paper " The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective"
Code To run: python runner.py new --save SAVE_NAME --data PATH_TO_DATA_DIR --dataset DATASET --model model_name [options] --n 1000 - train - t
Hapi is a Python library for building Conceptual Distributed Model using HBV96 lumped model & Muskingum routing method
Current build status All platforms: Current release info Name Downloads Version Platforms Hapi - Hydrological library for Python Hapi is an open-sourc
Framework to collect and process weather data from wttr.in.
Weathercrawler Automatic extraction and processing framework for weather data from wttr.in Installation tested with: Python 3.7.3 Python 3.9.4 git clo
novel deep learning research works with PaddlePaddle
Research 发布基于飞桨的前沿研究工作,包括CV、NLP、KG、STDM等领域的顶会论文和比赛冠军模型。 目录 计算机视觉(Computer Vision) 自然语言处理(Natrual Language Processing) 知识图谱(Knowledge Graph) 时空数据挖掘(Spa
Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection, AAAI 2021.
Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection This repository is an official implementation of the AAAI 2021 paper Co-mi
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Python Package for DataHerb: create, search, and load datasets.
The Python Package for DataHerb A DataHerb Core Service to Create and Load Datasets.
Code for Mining the Benefits of Two-stage and One-stage HOI Detection
Status: Archive (code is provided as-is, no updates expected) PPO-EWMA [Paper] This is code for training agents using PPO-EWMA and PPG-EWMA, introduce
Code for our NeurIPS 2021 paper Mining the Benefits of Two-stage and One-stage HOI Detection
CDN Code for our NeurIPS 2021 paper "Mining the Benefits of Two-stage and One-stage HOI Detection". Contributed by Aixi Zhang*, Yue Liao*, Si Liu, Mia
Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection
Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection (NimPme) The official implementation of Novel Instances Mining with
Python bytecode manipulation and import process customization to do evil stuff with format strings. Nasty!
formathack Python bytecode manipulation and import process customization to do evil stuff with format strings. Nasty! This is an answer to a StackOver
Command Line (CLI) Application to automate creation of tasks in Redmine, issues on Github and the sync process of them.
Task Manager Automation Tool (TMAT) CLI Command Line (CLI) Application to automate creation of tasks in Redmine, issues on Github and the sync process
A python script developed to process Windows memory images based on triage type.
Overview A python script developed to process Windows memory images based on triage type. Requirements Python3 Bulk Extractor Volatility2 with Communi
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥
face.evoLVe: High-Performance Face Recognition Library based on PaddlePaddle & PyTorch Evolve to be more comprehensive, effective and efficient for fa
The (Python-based) mining software required for the Game Boy mining project.
The (Python-based) mining software required for the Game Boy mining project.
Automating the process of sorting files in my downloads folder by file type.
downloads-folder-automation Automating the process of sorting files in a user's downloads folder on Windows by file type. This script iterates through
Automates the process to obtain an appointment for NIE in spain.
get-nie-appointment A Python script that automates the process of getting an appointment for NIE assignation. It can be modified in order to change th
Labelling platform for text using distant supervision
With DataQA, you can label unstructured text documents using rule-based distant supervision.