203 Repositories
Python approximate-statistics Libraries
Code for "Long Range Probabilistic Forecasting in Time-Series using High Order Statistics"
Long Range Probabilistic Forecasting in Time-Series using High Order Statistics This is the code produced as part of the paper Long Range Probabilisti
A collection of differentiable SVD methods and also the official implementation of the ICCV21 paper "Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling?"
Differentiable SVD Introduction This repository contains: The official Pytorch implementation of ICCV21 paper Why Approximate Matrix Square Root Outpe
ecoglib: visualization and statistics for high density microecog signals
ecoglib: visualization and statistics for high density microecog signals This library contains high-level analysis tools for "topos" and "chronos" asp
Data science, Data manipulation and Machine learning package.
duality Data science, Data manipulation and Machine learning package. Use permitted according to the terms of use and conditions set by the attached l
Bayes-Newton—A Gaussian process library in JAX, with a unifying view of approximate Bayesian inference as variants of Newton's algorithm.
Bayes-Newton Bayes-Newton is a library for approximate inference in Gaussian processes (GPs) in JAX (with objax), built and actively maintained by Wil
A python package which can be pip installed to perform statistics and visualize binomial and gaussian distributions of the dataset
GBiStat package A python package to assist programmers with data analysis. This package could be used to plot : Binomial Distribution of the dataset p
Working Time Statistics of working hours and working conditions by industry and company
Working Time Statistics of working hours and working conditions by industry and company
A bot to get Statistics like the Playercount from your Minecraft-Server on your Discord-Server
Hey Thanks for reading me. Warning: My English is not the best I have programmed this bot to show me statistics about the player numbers and ping of m
Command-line interface to PyPI Stats API to get download stats for Python packages
pypistats Python 3.6+ interface to PyPI Stats API to get aggregate download statistics on Python packages on the Python Package Index without having t
Code for approximate graph reduction techniques for cardinality-based DSFM, from paper
SparseCard Code for approximate graph reduction techniques for cardinality-based DSFM, from paper "Approximate Decomposable Submodular Function Minimi
pypinfo is a simple CLI to access PyPI download statistics via Google's BigQuery.
pypinfo: View PyPI download statistics with ease. pypinfo is a simple CLI to access PyPI download statistics via Google's BigQuery. Installation pypin
Can a machine learning project be implemented to estimate the salaries of baseball players whose salary information and career statistics for 1986 are shared?
END TO END MACHINE LEARNING PROJECT ON HITTERS DATASET Can a machine learning project be implemented to estimate the salaries of baseball players whos
K-means clustering is a method used for clustering analysis, especially in data mining and statistics.
K Means Algorithm What is K Means This algorithm is an iterative algorithm that partitions the dataset according to their features into K number of pr
This is a python package that turns any images into MIDI files that views the same as them
image_to_midi This is a python package that turns any images into MIDI files that views the same as them. This package firstly convert the image to AS
Code for the Weighted, Accelerated and Restarted Primal-dual algorithm. This algorithm achieves stable linear convergence for reconstruction from undersampled noisy measurements under an approximate sharpness condition. See the paper for details.
WARPd Code for the Weighted, Accelerated and Restarted Primal-dual algorithm. This algorithm achieves stable linear convergence for reconstruction fro
Adaptive, interpretable wavelets across domains (NeurIPS 2021)
Adaptive wavelets Wavelets which adapt given data (and optionally a pre-trained model). This yields models which are faster, more compressible, and mo
BasstatPL is a package for performing different tabulations and calculations for descriptive statistics.
BasstatPL is a package for performing different tabulations and calculations for descriptive statistics. It provides: Frequency table constr
topalias - Linux alias generator from bash/zsh command history with statistics, written on Python.
topalias topalias - Linux alias generator from bash/zsh command history with statistics, written on Python. Features Generate short alias for popular
PyImpetus is a Markov Blanket based feature subset selection algorithm that considers features both separately and together as a group in order to provide not just the best set of features but also the best combination of features
PyImpetus PyImpetus is a Markov Blanket based feature selection algorithm that selects a subset of features by considering their performance both indi
A new mini-batch framework for optimal transport in deep generative models, deep domain adaptation, approximate Bayesian computation, color transfer, and gradient flow.
BoMb-OT Python3 implementation of the papers On Transportation of Mini-batches: A Hierarchical Approach and Improving Mini-batch Optimal Transport via
Focal Statistics
Focal-Statistics The Focal statistics tool in many GIS applications like ArcGIS, QGIS and GRASS GIS is a standard method to gain a local overview of r
Small Python script to parse endlessh's output and print some neat statistics
endlessh_parser endlessh_parser is a small Python script that parses endlessh's output and prints some neat statistics about it Usage Install all the
High-quality implementations of standard and SOTA methods on a variety of tasks.
Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo
Speeding-Up Back-Propagation in DNN: Approximate Outer Product with Memory
Approximate Outer Product Gradient Descent with Memory Code for the numerical experiment of the paper Speeding-Up Back-Propagation in DNN: Approximate
StudyLion is a Discord bot that tracks members' study and work time while offering members to view their statistics and use productivity tools such as: To-do lists, Pomodoro timers, reminders, and much more.
StudyLion - Discord Productivity Bot StudyLion is a Discord bot that tracks members' study and work time while offering members the ability to view th
A repository for collating all the resources such as articles, blogs, papers, and books related to Bayesian Statistics.
A repository for collating all the resources such as articles, blogs, papers, and books related to Bayesian Statistics.
wikirepo is a Python package that provides a framework to easily source and leverage standardized Wikidata information
Python based Wikidata framework for easy dataframe extraction wikirepo is a Python package that provides a framework to easily source and leverage sta
Better GitHub statistics images for your profile, with stats from private and public repos
Better GitHub statistics images for your profile, with stats from private and public repos
ADOP: Approximate Differentiable One-Pixel Point Rendering
ADOP: Approximate Differentiable One-Pixel Point Rendering Abstract: We present a novel point-based, differentiable neural rendering pipeline for scen
Statistical Analysis 📈 focused on statistical analysis and exploration used on various data sets for personal and professional projects.
Statistical Analysis 📈 This repository focuses on statistical analysis and the exploration used on various data sets for personal and professional pr
A data analysis using python and pandas to showcase trends in school performance.
A data analysis using python and pandas to showcase trends in school performance. A data analysis to showcase trends in school performance using Panda
Module to use some statistics from Spotify API
statify Module to use some statistics from Spotify API To use it you have to import the functions into your own project. You have also to authenticate
Graphsignal is a machine learning model monitoring platform.
Graphsignal is a machine learning model monitoring platform. It helps ML engineers, MLOps teams and data scientists to quickly address issues with data and models as well as proactively analyze model performance and availability.
distfit - Probability density fitting
Python package for probability density function fitting of univariate distributions of non-censored data
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
Hierarchical neural-net interpretations (ACD) 🧠 Produces hierarchical interpretations for a single prediction made by a pytorch neural network. Offic
A discord bot for tracking Iranian Minecraft servers and showing the statistics of them
A discord bot for tracking Iranian Minecraft servers and showing the statistics of them
Functional Data Analysis, or FDA, is the field of Statistics that analyses data that depend on a continuous parameter.
Functional Data Analysis Python package
scrilla: A Financial Optimization Application
A python application that wraps around AlphaVantage, Quandl and IEX APIs, calculates financial statistics and optimizes portfolio allocations.
skimpy is a light weight tool that provides summary statistics about variables in data frames within the console.
skimpy Welcome Welcome to skimpy! skimpy is a light weight tool that provides summary statistics about variables in data frames within the console. Th
TextDescriptives - A Python library for calculating a large variety of statistics from text
A Python library for calculating a large variety of statistics from text(s) using spaCy v.3 pipeline components and extensions. TextDescriptives can be used to calculate several descriptive statistics, readability metrics, and metrics related to dependency distance.
track your GitHub statistics
GitHub-Stalker track your github statistics 👀 features find new followers or unfollowers find who got a star on your project or remove stars find who
Monitor and log Network and Disks statistics in MegaBytes per second.
iometrics Monitor and log Network and Disks statistics in MegaBytes per second. Install pip install iometrics Usage Pytorch-lightning integration from
Pytorch implementation of CVPR2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation"
MUST-GAN Code | paper The Pytorch implementation of our CVPR2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generat
A simple, transparent, open-source key logger, written in Python, for tracking your own key-usage statistics.
A simple, transparent, open-source key logger, written in Python, for tracking your own key-usage statistics, originally intended for keyboard layout optimization.
ANNchor is a python library which constructs approximate k-nearest neighbour graphs for slow metrics.
Fast k-NN graph construction for slow metrics
A corona statistics and information telegram bot.
A corona statistics and information telegram bot.
COVID-19 deaths statistics around the world
COVID-19-Deaths-Dataset COVID-19 deaths statistics around the world This is a daily updated dataset of COVID-19 deaths around the world. The dataset c
Statistics and Visualization of acceptance rate, main keyword of CVPR 2021 accepted papers for the main Computer Vision conference (CVPR)
Statistics and Visualization of acceptance rate, main keyword of CVPR 2021 accepted papers for the main Computer Vision conference (CVPR)
Graphsignal Logger
Graphsignal Logger Overview Graphsignal is an observability platform for monitoring and troubleshooting production machine learning applications. It h
Prometheus exporter for several chia node statistics
prometheus-chia-exporter Prometheus exporter for several chia node statistics It's assumed that the full node, the harvester and the wallet run on the
Statsmodels: statistical modeling and econometrics in Python
About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an
Domain Generalization with MixStyle, ICLR'21.
MixStyle This repo contains the code of our ICLR'21 paper, "Domain Generalization with MixStyle". The OpenReview link is https://openreview.net/forum?
One Stop Anomaly Shop: Anomaly detection using two-phase approach: (a) pre-labeling using statistics, Natural Language Processing and static rules; (b) anomaly scoring using supervised and unsupervised machine learning.
One Stop Anomaly Shop (OSAS) Quick start guide Step 1: Get/build the docker image Option 1: Use precompiled image (might not reflect latest changes):
TorchPQ is a python library for Approximate Nearest Neighbor Search (ANNS) and Maximum Inner Product Search (MIPS) on GPU using Product Quantization (PQ) algorithm.
Efficient implementations of Product Quantization and its variants using Pytorch and CUDA
A complete guide to start and improve in machine learning (ML)
A complete guide to start and improve in machine learning (ML), artificial intelligence (AI) in 2021 without ANY background in the field and stay up-to-date with the latest news and state-of-the-art techniques!
Newt - a Gaussian process library in JAX.
Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\
Summary statistics of geospatial raster datasets based on vector geometries.
rasterstats rasterstats is a Python module for summarizing geospatial raster datasets based on vector geometries. It includes functions for zonal stat
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis
Statistical package in Python based on Pandas
Pingouin is an open-source statistical package written in Python 3 and based mostly on Pandas and NumPy. Some of its main features are listed below. F
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilis
Using approximate bayesian posteriors in deep nets for active learning
Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.
pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit
Gaussian processes in TensorFlow
Website | Documentation (release) | Documentation (develop) | Glossary Table of Contents What does GPflow do? Installation Getting Started with GPflow
Probabilistic reasoning and statistical analysis in TensorFlow
TensorFlow Probability TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in TensorFlow. As part of the TensorFl
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
imbalanced-learn imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in datasets showing strong between-cla
A python library for Bayesian time series modeling
PyDLM Welcome to pydlm, a flexible time series modeling library for python. This library is based on the Bayesian dynamic linear model (Harrison and W
Supply a wrapper ``StockDataFrame`` based on the ``pandas.DataFrame`` with inline stock statistics/indicators support.
Stock Statistics/Indicators Calculation Helper VERSION: 0.3.2 Introduction Supply a wrapper StockDataFrame based on the pandas.DataFrame with inline s
Implementation of Kalman Filter in Python
Kalman Filter in Python This is a basic example of how Kalman filter works in Python. I do plan on refactoring and expanding this repo in the future.
Multiple Pairwise Comparisons (Post Hoc) Tests in Python
scikit-posthocs is a Python package that provides post hoc tests for pairwise multiple comparisons that are usually performed in statistical data anal
Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.
weightedcalcs weightedcalcs is a pandas-based Python library for calculating weighted means, medians, standard deviations, and more. Features Plays we
Supply a wrapper ``StockDataFrame`` based on the ``pandas.DataFrame`` with inline stock statistics/indicators support.
Stock Statistics/Indicators Calculation Helper VERSION: 0.3.2 Introduction Supply a wrapper StockDataFrame based on the pandas.DataFrame with inline s
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Probabilistic Programming and Statistical Inference in PyTorch
PtStat Probabilistic Programming and Statistical Inference in PyTorch. Introduction This project is being developed during my time at Cogent Labs. The
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
imbalanced-learn imbalanced-learn is a python package offering a number of re-sampling techniques commonly used in datasets showing strong between-cla
Open source time series library for Python
PyFlux PyFlux is an open source time series library for Python. The library has a good array of modern time series models, as well as a flexible array
Module for statistical learning, with a particular emphasis on time-dependent modelling
Operating system Build Status Linux/Mac Windows tick tick is a Python 3 module for statistical learning, with a particular emphasis on time-dependent
50% faster, 50% less RAM Machine Learning. Numba rewritten Sklearn. SVD, NNMF, PCA, LinearReg, RidgeReg, Randomized, Truncated SVD/PCA, CSR Matrices all 50+% faster
[Due to the time taken @ uni, work + hell breaking loose in my life, since things have calmed down a bit, will continue commiting!!!] [By the way, I'm
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin
Statsmodels: statistical modeling and econometrics in Python
About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an
scikit-learn: machine learning in Python
scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started
curl statistics made simple
httpstat httpstat visualizes curl(1) statistics in a way of beauty and clarity. It is a single file 🌟 Python script that has no dependency 👏 and is
A Python API to retrieve and read MLB GameDay data
mlbgame mlbgame is a Python API to retrieve and read MLB GameDay data. mlbgame works with real time data, getting information as games are being playe
Statsmodels: statistical modeling and econometrics in Python
About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis
Multi-class confusion matrix library in Python
Table of contents Overview Installation Usage Document Try PyCM in Your Browser Issues & Bug Reports Todo Outputs Dependencies Contribution References
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
🌊 Online machine learning in Python
In a nutshell River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.
pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer