203 Repositories
Python approximate-statistics Libraries
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch
Memorizing Transformers - Pytorch Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memori
Forecasting for knowable future events using Bayesian informative priors (forecasting with judgmental-adjustment).
What is judgyprophet? judgyprophet is a Bayesian forecasting algorithm based on Prophet, that enables forecasting while using information known by the
Lightning ⚡️ fast forecasting with statistical and econometric models.
Nixtla Statistical ⚡️ Forecast Lightning fast forecasting with statistical and econometric models StatsForecast offers a collection of widely used uni
This repository contains the best free, hand-picked Data Science resources to equip you with industry-driven skills and an interview preparation kit.
Best Data Science Resources Hey, Data Enthusiasts out there! Finally, after lots of requests from the community I finally came up with the best free D
This repository contains implementations of all Machine Learning Algorithms from scratch in Python. Mathematics required for ML and many projects have also been included.
👏 Prerequisites to Machine Learning
A program that uses real statistics to choose the best times to bet on BloxFlip's crash gamemode
Bloxflip Smart Bet A program that uses real statistics to choose the best times to bet on BloxFlip's crash gamemode. https://bloxflip.com/crash. THIS
STATS305C: Applied Statistics III (Spring, 2022)
STATS305C: Applied Statistics III Instructor: Scott Linderman TA: Matt MacKay, James Yang Term: Spring 2022 Stanford University Course Description: Pr
Statistics and Mathematics for Machine Learning, Deep Learning, Deep NLP
Stat4ML Statistics and Mathematics for Machine Learning, Deep Learning, Deep NLP This is the first course from our trio courses: Statistics Foundatio
Framework for evaluating ANNS algorithms on billion scale datasets.
Billion-Scale ANN http://big-ann-benchmarks.com/ Install The only prerequisite is Python (tested with 3.6) and Docker. Works with newer versions of Py
View cool stats related to your Discord account.
DiscoStats cool statistics generated using your discord data. How? DiscoStats is not a service that breaks the Discord Terms of Service or Community G
Athena is the only tool that you will ever need to optimize your portfolio.
Athena Portfolio optimization is the process of selecting the best portfolio (asset distribution), out of the set of all portfolios being considered,
Live Corona statistics and information site with flask.
Flask Live Corona Info Live Corona statistics and information site with flask. Tools Flask Scrapy Matplotlib How to Run Project Download Codes git clo
Everything you need to know about NumPy (creating arrays, indexing, math, statistics, reshaping).
A study of the IT labor market, based mainly on hh.ru: salaries, technology popularity, and more.
hh_ru_research This project was built for educational purposes to analyze the labor market, particularly on hh.ru. Input data The input data consists of seriali
Python package for concise, transparent, and accurate predictive modeling
Python package for concise, transparent, and accurate predictive modeling. All sklearn-compatible and easy to use. 📚 docs • 📖 demo notebooks Modern
Pihole-eink-display - A simple Python script to display PiHole statistics on an eInk Display
Tarstats - A simple Python command-line application that collects statistics about tarfiles
A simple Python command-line application that collects statistics about tarfiles.
Data analysis project with SQL
Project-Analizyng-International-Debt-Statistics- Data analysis project with SQL - DataCamp platform. Project description: It is not that we human
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python
HEAM: High-Efficiency Approximate Multiplier Optimization for Deep Neural Networks
Approximate Multiplier by HEAM What's HEAM? HEAM is a general optimization method to generate high-efficiency approximate multipliers for specific app
The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure
miseval: a metric library for Medical Image Segmentation EVALuation The open-source and free to use Python package miseval was developed to establish
Python based project to pull useful account statistics from the Algorand block chain.
PlanetWatchStats Python based project to pull useful account statistics from the Algorand block chain. Setup pip install -r requirements.txt Run pytho
FairLens is an open source Python library for automatically discovering bias and measuring fairness in data
FairLens FairLens is an open source Python library for automatically discovering bias and measuring fairness in data. The package can be used to quick
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences. Copula and functional Principal Component Analysis (fPCA) are statistical models that allow these properties to be simulated (Joe 2014). As such, copula-generated data have shown potential to improve the generalization of machine learning (ML) emulators (Meyer et al. 2021) or to anonymize real datasets (Patki et al. 2016).
ParaMonte is a serial/parallel library of Monte Carlo routines for sampling mathematical objective functions of arbitrary dimensions
ParaMonte is a serial/parallel library of Monte Carlo routines for sampling mathematical objective functions of arbitrary dimensions, in particular the posterior distributions of Bayesian models in data science, machine learning, and scientific inference. Its design goal is to unify automation (of Monte Carlo simulations), user-friendliness (of the library), accessibility (from multiple programming environments), high performance (at runtime), and scalability (across many parallel processors).
Jupyter notebooks for the book "The Elements of Statistical Learning".
This repository contains Jupyter notebooks implementing the algorithms found in the book, along with a summary of the textbook.
Bayesian A/B testing
bayesian_testing is a small package for quick evaluation of A/B (or A/B/C/...) tests using a Bayesian approach.
Tree-based Search Graph for Approximate Nearest Neighbor Search
TBSG: Tree-based Search Graph for Approximate Nearest Neighbor Search. TBSG is a graph-based algorithm for ANNS based on Cover Tree, which is also an
Astrostatistics class for the MSc degree in Astrophysics at the University of Milan-Bicocca (Italy)
Astrostatistics Davide Gerosa, University of Milano-Bicocca, 2022. Schedule Introduction Probability and Statistics I Probabi
Generate daily updated visualizations of user and repository statistics from the GitHub API using GitHub Actions
Generate daily updated visualizations of user and repository statistics from the GitHub API using GitHub Actions for any combination of private and public repositories - dark mode supported
Generate SVG (dark/light) images visualizing (private/public) GitHub repo statistics for profile/website.
Generate daily updated visualizations of GitHub user and repository statistics from the GitHub API using GitHub Actions for any combination of private and public repositories, whether owned or contributed to - no server required.
A Python tutorial on Bayesian modeling techniques (PyMC3)
Bayesian Modelling in Python Welcome to "Bayesian Modelling in Python" - a tutorial for those interested in learning how to apply bayesian modelling t
Python Machine Learning Jupyter Notebooks (ML website)
Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also
Specification language for generating Generalized Linear Models (with or without mixed effects) from conceptual models
tisane Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships TL;DR: Analysts can use Tisane to author gener
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.
Survival analysis in Python
What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu
A Python library for vectorized time-series smoothing and outlier detection.
tsmoothie A python library for time-series smoothing and outlier detection in a vectorized way. Overview tsmoothie computes, in a fast and efficient w
examify-io is an online examination system that offers automatic grading, exam statistics, proctoring, programming tests, and multiple user roles
examify-io is an online examination system that offers automatic grading, exam statistics, proctoring, programming tests, and multiple user roles (Examiner, Supervisor, Student)
IPython notebook presentations for getting started with basic programming, statistics and machine learning techniques
Data Science 45-min Intros Every week*, our data science team @Gnip (aka @TwitterBoulder) gets together for about 50 minutes to learn something. While
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Here are the sections: Data Science Cheatsheets Data Science EBooks Data Science Question Bank Data Science Case Studies Data Science Portfolio Data J
Introduction to Statistics and Basics of Mathematics for Data Science - The Hacker's Way
HackerMath for Machine Learning “Study hard what interests you the most in the most undisciplined, irreverent and original manner possible.” ― Richard
MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
MNE-Python MNE-Python software is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, E
Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.
Think Bayes 2 by Allen B. Downey The HTML version of this book is here. Think Bayes is an introduction to Bayesian statistics using computational meth
Pytorch implementations of Bayes By Backprop, MC Dropout, SGLD, the Local Reparametrization Trick, KF-Laplace, SG-HMC and more
Bayesian Neural Networks Pytorch implementations for the following approximate inference methods: Bayes by Backprop Bayes by Backprop + Local Reparame
An iris prediction model for classifying iris species, created with Julia's DecisionTree, DataFrames, JLD2, PlotlyJS and Statistics packages.
Iris Species Predictor Iris prediction is used to classify iris species using their sepal length, sepal width, petal length and petal width created us
Tool for monitoring the risk of health system collapse in Brazilian municipalities due to Covid-19.
FarolCovid 🚦 Tool for monitoring the risk of health system collapse in Brazilian municipalities due to Covid-19. Monitoring tool & simulati
Python’s bokeh, holoviews, matplotlib, plotly, seaborn package-based visualizations about COVID statistics eventually hosted as a web app on Heroku
COVID-Watch-NYC-Python-Visualization-App Python’s bokeh, holoviews, matplotlib, plotly, seaborn package-based visualizations about COVID statistics ev
ssma is a tool that helps you collect your badges on the satr platform
satr-statistics-maker ssma is a tool that helps you collect your badges in a satr platform 🎖️ Requirements python = 3.7 Installation first clone the
Practical-statistics-for-data-scientists - Code repository for O'Reilly book
Code repository Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python by Peter Bruce, Andrew Bruce, and Peter Gedeck Pub
Tautulli - A Python based monitoring and tracking tool for Plex Media Server.
Tautulli A python based web application for monitoring, analytics and notifications for Plex Media Server. This project is based on code from Headphon
Prml - Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop
Pattern Recognition and Machine Learning (PRML) This project contains Jupyter notebooks of many of the algorithms presented in Christopher Bishop's Patte
Spin-off Notice: the modules and functions used by our research notebooks have been refactored into another repository
Fecon235 - Notebooks for financial economics. Keywords: Jupyter notebook pandas Federal Reserve FRED Ferbus GDP CPI PCE inflation unemployment wage income debt Case-Shiller housing asset portfolio equities SPX bonds TIPS rates currency FX euro EUR USD JPY yen XAU gold Brent WTI oil Holt-Winters time-series forecasting statistics econometrics
Statistical-Rethinking-with-Python-and-PyMC3 - Python/PyMC3 port of the examples in "Statistical Rethinking: A Bayesian Course with Examples in R and Stan" by Richard McElreath
Statistical Rethinking with Python and PyMC3 This repository has been deprecated in favour of this one, please check that repository for updates, for
50-days-of-Statistics-for-Data-Science - This repository consists of a 50-day program
50-days-of-Statistics-for-Data-Science - This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.
Personal implementation of the paper "Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval"
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval This repo provides personal implementation of paper Approximate Ne
Generate visualizations of GitHub user and repository statistics using GitHub Actions
GitHub Stats Visualization Generate visualizations of GitHub user and repository
Generate visualizations of GitHub user and repository statistics using GitHub Actions.
GitHub Stats Visualization Generate visualizations of GitHub user and repository statistics using GitHub Actions. This project is currently a work-in-
Statistics Calculator module for all types of Stats calculations.
Statistics-Calculator This calculator uses the formulas and methods to find the statistical values listed. Statistics Calculator module for all types
whylogs: A Data and Machine Learning Logging Standard
whylogs: A Data and Machine Learning Logging Standard whylogs is an open source standard for data and ML logging whylogs logging agent is the easiest
Osu statistics right on your desktop, made with PyQt
Osu!Stat Osu statistics right on your desktop, made with Qt5 Credits Would like to thank these creators for their projects and contributions. ppy, osu
Monitor the stability of a pandas or spark dataframe ⚙︎
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
"zpool iostats" for humans; find the slow parts of your ZFS pool
Getting the gist of zfs statistics vpool-demo.mp4 The ZFS command "zpool iostat" provides a histogram listing of how often it takes to do things in pa
Library for Memory Trace Statistics in Python
Memory Search Library for Memory Trace Statistics in Python The library uses tracemalloc as a core module, which is why it is only available for Pytho
Automatic labeling, conversion of different data set formats, sample size statistics, model cascade
Simple Gadget Collection for Object Detection Tasks Automatic image annotation Conversion between different annotation formats Obtain statistical info
More detailed upload statistics for Nicotine+
More Upload Statistics A small plugin for Nicotine+ 3.1+ to create more detailed upload statistics. ⚠ No data previous to enabling this plugin will be
Here is some Python code that allows you to read in SVG files and approximate their paths using a Fourier series.
Here is some Python code that allows you to read in SVG files and approximate their paths using a Fourier series. The Fourier series can be animated and visualized, the function can be output as a two dimensional vector for Desmos and there is a method to output the coefficients as LaTeX code.
Convert Table data to approximate values with GUI
Table_Editor Convert Table data to approximate values with GUIs... usage - Import methods for extension Tables. Imported method supposed to have only
Fastshap: A fast, approximate shap kernel
fastshap: A fast, approximate shap kernel fastshap was designed to be: Fast Calculating shap values can take an extremely long time. fastshap utilizes
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
About The ROOT system provides a set of OO frameworks with all the functionality needed to handle and analyze large amounts of data in a very efficien
Fit models to your data in Python with Sherpa.
Table of Contents Sherpa License How To Install Sherpa Using Anaconda Using pip Building from source History Release History Sherpa Sherpa is a modeli
Telegram Group Chat Statistics With Python
Telegram Group Chat Statistics How to Run First, add the repository root directory to the PYTHONPATH environment variable by running: export PYTHONPATH=${PWD}
Baseball Discord bot that can post up-to-date scores, lineups, and home runs.
Sunny Day Discord Bot Baseball Discord bot that can post up-to-date scores, lineups, and home runs. Uses webscraping techniques to scrape baseball dat
Revisiting Global Statistics Aggregation for Improving Image Restoration
Revisiting Global Statistics Aggregation for Improving Image Restoration Xiaojie Chu, Liangyu Chen, Chengpeng Chen, Xin Lu Paper: https://arxiv.org/pd
Sample python script for monitoring Rocketchat database and get statistics of users.
rocketchat-DB-monitoring Sample python script for monitoring Rocketchat database and get statistics of users. 1. Update python: yum check-update && yu
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.
This project is created to visualize the system statistics such as memory usage, CPU usage, memory accessible by process and much more using Kibana Dashboard with Elasticsearch.
System Stats Visualizer This project is created to visualize the system statistics such as memory usage, CPU usage, memory accessible by process and m
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)
A simple API for working with University of California, Irvine (UCI) Machine Learning (ML) repository Table of Contents Introduction About Page of the
The code reproduces the figures and statistics in the paper "Controlling for multiple covariates" by Mark Tygert.
The accompanying codes reproduce all figures and statistics presented in "Controlling for multiple covariates" by Mark Tygert. This repository also pr
An .exe file that automatically notifies you of your Chia profit and warning messages.
chia-Notify-with-Line alert program An .exe file that automatically notifies you of your Chia profit and warning messages. This is a small program I designed myself; it has been converted to an .exe file, so it can be used without
Generate visualizations of GitHub user and repository statistics using GitHub Actions.
GitHub Stats Visualization Generate visualizations of GitHub user and repository statistics using GitHub Actions. This project is currently a work-in-
SPTAG: A library for fast approximate nearest neighbor search
SPTAG: A library for fast approximate nearest neighbor search SPTAG SPTAG (Space Partition Tree And Graph) is a library for large scale vector approxi
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Command line utilities for tabular data files This is a set of command line utilities for manipulating large tabular data files. Files of numeric and
stability-selection - A scikit-learn compatible implementation of stability selection
stability-selection - A scikit-learn compatible implementation of stability selection stability-selection is a Python implementation of the stability
PyNNDescent is a Python nearest neighbor descent for approximate nearest neighbors.
PyNNDescent PyNNDescent is a Python nearest neighbor descent for approximate nearest neighbors. It provides a python implementation of Nearest Neighbo
Approximate Nearest Neighbor Search for Sparse Data in Python!
Approximate Nearest Neighbor Search for Sparse Data in Python! This library is well suited to finding nearest neighbors in sparse, high dimensional spaces (like text documents).
Find usage statistics (imports, function calls, attribute access) for Python code-bases
Python Library stats This is a small library that allows you to query some useful statistics for Python code-bases. We currently report library import
Export Statistics for a Telegram Group Chat
Telegram Statistics Export Statistics for a Telegram Group Chat How to Run First, in main repo directory, run the following code to add src to your PY
A pairs trade is a market-neutral trading strategy enabling traders to profit from virtually any market conditions.
A pairs trade is a market-neutral trading strategy enabling traders to profit from virtually any market conditions. This strategy is categorized as a statistical arbitrage and convergence trading strategy.
Python module for performing linear regression for data with measurement errors and intrinsic scatter
Linear regression for data with measurement errors and intrinsic scatter (BCES) Python module for performing robust linear regression on (X,Y) data po
Render tokei's output to interactive sunburst chart.
GWAS summary statistics files QC tool
SSrehab dependencies: python 3.8+ a GNU/Linux with bash v4 or 5. python packages in requirements.txt bcftools (only for prepare_dbSNPs) gz-sort (only
Distance correlation and related E-statistics in Python
dcor dcor: distance correlation and related E-statistics in Python. E-statistics are functions of distances between statistical observations in metric
Computations and statistics on manifolds with geometric structures.
Geomstats Code Continuous Integration Code coverage (numpy) Code coverage (autograd, tensorflow, pytorch) Documentation Community NEWS: Geomstats is r
Kats is a kit for analyzing time series data: a lightweight, easy-to-use, generalizable, and extendable framework for time series analysis, from understanding key statistics and characteristics, through detecting change points and anomalies, to forecasting future trends.
Description Kats is a toolkit to analyze time series data, a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Ti
Optimal space decomposition-based product quantization for approximate nearest neighbor search
Optimal space decomposition-based product quantization for approximate nearest neighbor search Abstract Product quantization (PQ) is an effective neare
Statistical and Algorithmic Investing Strategies for Everyone
Eiten - Algorithmic Investing Strategies for Everyone Eiten is an open source toolkit by Tradytics that implements various statistical and algorithmic
🌊 River is a Python library for online machine learning.
River is a Python library for online machine learning. It is the result of a merger between creme and scikit-multiflow. River's ambition is to be the go-to library for doing machine learning on streaming data.
A collection of video resources for machine learning
Machine Learning Videos This is a collection of recorded talks at machine learning conferences, workshops, seminars, summer schools, and miscellaneous
A paper using optimal transport to solve the graph matching problem.
GOAT A paper using optimal transport to solve the graph matching problem. https://arxiv.org/abs/2111.05366 Repo structure .github: Files specifying ho
Important dataframe statistics with a single command
quick_eda Receiving dataframe statistics with one command Project description A python package for Data Scientists, Students, ML Engineers and anyone