1149 Repositories
Python document-layout-analysis Libraries
Explorative Data Analysis Guidelines
Explorative Data Analysis Get data into a usable format! Find out if the following predictive modeling phase will be successful! Combine everything in
Kaggler is a Python package for lightweight online machine learning algorithms and utility functions for ETL and data analysis.
Kaggler is a Python package for lightweight online machine learning algorithms and utility functions for ETL and data analysis. It is distributed under the MIT License.
MCML is a toolkit for semi-supervised dimensionality reduction and quantitative analysis of Multi-Class, Multi-Label data
MCML is a toolkit for semi-supervised dimensionality reduction and quantitative analysis of Multi-Class, Multi-Label data. We demonstrate its use
An example project using OpenPrompt under pytorch-lightning for prompt-based SST2 sentiment analysis model
pl_prompt_sst An example project using OpenPrompt under the framework of pytorch-lightning for a training prompt-based text classification model on SS
Python factor analysis library (PCA, CA, MCA, MFA, FAMD)
Prince is a library for doing factor analysis. This includes a variety of methods including principal component analysis (PCA) and correspondence anal
A high-performance topological machine learning toolbox in Python
giotto-tda is a high-performance topological machine learning toolbox in Python built on top of scikit-learn and is distributed under the G
Single-Cell Analysis in Python. Scales to 1M cells.
Scanpy – Single-Cell Analysis in Python Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It inc
Falcon: Interactive Visual Analysis for Big Data
Falcon: Interactive Visual Analysis for Big Data Crossfilter millions of records without latencies. This project is work in progress and not documente
Streamlit — The fastest way to build data apps in Python
Welcome to Streamlit 👋 The fastest way to build and share data apps. Streamlit lets you turn data scripts into sharable web apps in minutes, not week
Introduction to Geospatial Analysis in Python
Introduction to Geospatial Analysis in Python This repository is in support of a talk on geospatial data. Data To recreate all of the examples, the da
Infomap is a network clustering algorithm based on the Map equation.
Infomap Infomap is a network clustering algorithm based on the Map equation. For detailed documentation, see mapequation.org/infomap. For a list of re
Code to reproduce the results in the paper "Tensor Component Analysis for Interpreting the Latent Space of GANs".
Tensor Component Analysis for Interpreting the Latent Space of GANs [ paper | project page ] Code to reproduce the results in the paper "Tensor Compon
Python beta calculator that retrieves stock and market data and provides linear regressions.
Stock and Index Beta Calculator Python script that calculates the beta (β) of a stock against the chosen index. The script retrieves the data and resa
A Powerful Serverless Analysis Toolkit That Takes Trial And Error Out of Machine Learning Projects
KXY: A Seemless API to 10x The Productivity of Machine Learning Engineers Documentation https://www.kxy.ai/reference/ Installation From PyPi: pip inst
Detectron2 for Document Layout Analysis
Detectron2 trained on PubLayNet dataset This repo contains the training configurations, code and trained models trained on PubLayNet dataset using Det
Python Blood Vessel Topology Analysis
Python Blood Vessel Topology Analysis This repository is not being updated anymore. The new version of PyVesTo is called PyVaNe and is available at ht
This repo contains implementation of different architectures for emotion recognition in conversations.
Emotion Recognition in Conversations Updates 🔥 🔥 🔥 Date Announcements 03/08/2021 🎆 🎆 We have released a new dataset M2H2: A Multimodal Multiparty
Image classification for projects and researches
This is a tool to help you quickly solve classification problems including: data analysis, training, report results and model explanation.
A python program for visualizing MIDI files, and displaying them in a spiral layout
SpiralMusic_python A python program for visualizing MIDI files, and displaying them in a spiral layout For a hardware version using Teensy & LED displ
Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"
Scripts for "Current best-practices in single-cell RNA-seq: a tutorial" This repository is complementary to the publication: M.D. Luecken, F.J. Theis,
This project uses Youtube data API's to do youtube tags analysis based on viewCount, comments etc.
Youtube video details analyser Steps to run this project Please set the AuthKey which you can fetch from google developer console and paste it in the
zeus is a Python implementation of the Ensemble Slice Sampling method.
zeus is a Python implementation of the Ensemble Slice Sampling method. Fast & Robust Bayesian Inference, Efficient Markov Chain Monte Carlo (MCMC), Bl
Source code and notebooks to reproduce experiments and benchmarks on Bias Faces in the Wild (BFW).
Face Recognition: Too Bias, or Not Too Bias? Robinson, Joseph P., Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. "Face recognition:
Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.
Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da
Optimal Randomized Canonical Correlation Analysis
ORCCA Optimal Randomized Canonical Correlation Analysis This project is for the python version of ORCCA algorithm. It depends on Numpy for matrix calc
Empyrial is a Python-based open-source quantitative investment library dedicated to financial institutions and retail investors
By Investors, For Investors. Want to read this in Chinese? Click here Empyrial is a Python-based open-source quantitative investment library dedicated
A machine learning model for analyzing text for user sentiment and determine whether its a positive, neutral, or negative review.
Sentiment Analysis on Yelp's Dataset Author: Roberto Sanchez, Talent Path: D1 Group Docker Deployment: Deployment of this application can be found her
Backtesting an algorithmic trading strategy using Machine Learning and Sentiment Analysis.
Trading Tesla with Machine Learning and Sentiment Analysis An interactive program to train a Random Forest Classifier to predict Tesla daily prices us
AI-powered literature discovery and review engine for medical/scientific papers
AI-powered literature discovery and review engine for medical/scientific papers paperai is an AI-powered literature discovery and review engine for me
Real-time analysis of intracranial neurophysiology recordings.
py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro
This project uses unsupervised machine learning to identify correlations between daily inoculation rates in the USA and twitter sentiment in regards to COVID-19.
Twitter COVID-19 Sentiment Analysis Members: Christopher Bach | Khalid Hamid Fallous | Jay Hirpara | Jing Tang | Graham Thomas | David Wetherhold Pro
Conversational text Analysis using various NLP techniques
Conversational text Analysis using various NLP techniques
Analyse japanese ebooks using MeCab to determine the difficulty level for japanese learners
japanese-ebook-analysis This aim of this project is to make analysing the contents of a japanese ebook easy and streamline the process for non-technic
An easy-to-use library for emulating code in minidump files.
dumpulator Note: This is a work-in-progress prototype, please treat it as such. An easy-to-use library for emulating code in minidump files. Example T
A multi-platform GUI for bit-based analysis, processing, and visualization
A multi-platform GUI for bit-based analysis, processing, and visualization
Render tokei's output to interactive sunburst chart.
Render tokei's output to interactive sunburst chart.
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)
DocFormer - PyTorch Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for t
PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".
Long Short-Term Transformer for Online Action Detection Introduction This is a PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short
Training, generation, and analysis code for Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics
Location-Aware Generative Adversarial Networks (LAGAN) for Physics Synthesis This repository contains all the code used in L. de Oliveira (@lukedeo),
Sentiment Analysis Project using Count Vectorizer and TF-IDF Vectorizer
Sentiment Analysis Project This project contains two sentiment analysis programs for Hotel Reviews using a Hotel Reviews dataset from Datafiniti. The
A high performance implementation of HDBSCAN clustering.
HDBSCAN HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates
The Fundamental Clustering Problems Suite (FCPS) summaries 54 state-of-the-art clustering algorithms, common cluster challenges and estimations of the number of clusters as well as the testing for cluster tendency.
FCPS Fundamental Clustering Problems Suite The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning pu
t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology.
tree-SNE t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in s
Kats, a kit to analyze time series data, a lightweight, easy-to-use, generalizable, and extendable framework to perform time series analysis, from understanding the key statistics and characteristics, detecting change points and anomalies, to forecasting future trends.
Description Kats is a toolkit to analyze time series data, a lightweight, easy-to-use, and generalizable framework to perform time series analysis. Ti
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
pmdarima Pmdarima (originally pyramid-arima, for the anagram of 'py' + 'arima') is a statistical library designed to fill the void in Python's time se
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
pmdarima Pmdarima (originally pyramid-arima, for the anagram of 'py' + 'arima') is a statistical library designed to fill the void in Python's time se
Contains an implementation (sklearn API) of the algorithm proposed in "GENDIS: GEnetic DIscovery of Shapelets" and code to reproduce all experiments.
GENDIS GENetic DIscovery of Shapelets In the time series classification domain, shapelets are small subseries that are discriminative for a certain cl
An experimental Python-to-C transpiler and domain specific language for embedded high-performance computing
An experimental Python-to-C transpiler and domain specific language for embedded high-performance computing
A deep-learning pipeline for segmentation of ambiguous microscopic images.
Welcome to Official repository of deepflash2 - a deep-learning pipeline for segmentation of ambiguous microscopic images. Quick Start in 30 seconds se
Full-featured Decision Trees and Random Forests learner.
CID3 This is a full-featured Decision Trees and Random Forests learner. It can save trees or forests to disk for later use. It is possible to query tr
Repository containing detailed experiments related to the paper "Memotion Analysis through the Lens of Joint Embedding".
Memotion Analysis Through The Lens Of Joint Embedding This repository contains the experiments conducted as described in the paper 'Memotion Analysis
This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer Capacitor domain using text similarity indexes: An experimental analysis "
kwd-extraction-study This repository is maintained for the scientific paper tittled " Study of keyword extraction techniques for Electric Double Layer
Stochastic gradient descent with model building
Stochastic Model Building (SMB) This repository includes a new fast and robust stochastic optimization algorithm for training deep learning models. Th
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL
Stochastic Extragradient: General Analysis and Improved Rates
Stochastic Extragradient: General Analysis and Improved Rates This repository is the official implementation of the paper "Stochastic Extragradient: G
Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets
Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets Datasets Used: Iris dataset,
MASS (Mueen's Algorithm for Similarity Search) - a python 2 and 3 compatible library used for searching time series sub-sequences under z-normalized Euclidean distance for similarity.
Introduction MASS allows you to search a time series for a subquery resulting in an array of distances. These array of distances enable you to identif
Portfolio analytics for quants, written in Python
QuantStats: Portfolio analytics for quants QuantStats Python library that performs portfolio profiling, allowing quants and portfolio managers to unde
scikit-survival is a Python module for survival analysis built on top of scikit-learn.
scikit-survival scikit-survival is a Python module for survival analysis built on top of scikit-learn. It allows doing survival analysis while utilizi
Library of Stan Models for Survival Analysis
survivalstan: Survival Models in Stan author: Jacki Novik Overview Library of Stan Models for Survival Analysis Features: Variety of standard survival
PySurvival is an open source python package for Survival Analysis modeling
PySurvival What is Pysurvival ? PySurvival is an open source python package for Survival Analysis modeling - the modeling concept used to analyze or p
Deep Survival Machines - Fully Parametric Survival Regression
Package: dsm Python package dsm provides an API to train the Deep Survival Machines and associated models for problems in survival analysis. The under
Banpei is a Python package of the anomaly detection.
Banpei Banpei is a Python package of the anomaly detection. Anomaly detection is a technique used to identify unusual patterns that do not conform to
A Python package for modular causal inference analysis and model evaluations
Causal Inference 360 A Python package for inferring causal effects from observational data. Description Causal inference analysis enables estimating t
moDel Agnostic Language for Exploration and eXplanation
moDel Agnostic Language for Exploration and eXplanation Overview Unverified black box model is the path to the failure. Opaqueness leads to distrust.
Code for our paper "Multi-scale Guided Attention for Medical Image Segmentation"
Medical Image Segmentation with Guided Attention This repository contains the code of our paper: "'Multi-scale self-guided attention for medical image
A combination of autoregressors and autoencoders using XLNet for sentiment analysis
A combination of autoregressors and autoencoders using XLNet for sentiment analysis Abstract In this paper sentiment analysis has been performed in or
Sentiment analysis on streaming twitter data using Spark Structured Streaming & Python
Sentiment analysis on streaming twitter data using Spark Structured Streaming & Python This project is a good starting point for those who have little
Important dataframe statistics with a single command
quick_eda Receiving dataframe statistics with one command Project description A python package for Data Scientists, Students, ML Engineers and anyone
Data Scientist in Simple Stock Analysis of PT Bukalapak.com Tbk for Long Term Investment
Data Scientist in Simple Stock Analysis of PT Bukalapak.com Tbk for Long Term Investment Brief explanation of PT Bukalapak.com Tbk Bukalapak was found
Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production
Numerics Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production Use procedure: Initialise a new i
Predict the latency time of the deep learning models
Deep Neural Network Prediction Step 1. Genernate random parameters and Run them sequentially : $ python3 collect_data.py -gp -ep -pp -pl pooling -num
This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text.
Text Summarizer This project uses word frequency and Term Frequency-Inverse Document Frequency to summarize a text. Team Members This mini-project was
strava-offline is a tool to keep a local mirror of Strava activities for further analysis/processing:
strava-offline Overview strava-offline is a tool to keep a local mirror of Strava activities for further analysis/processing: synchronizes metadata ab
nbsafety adds a layer of protection to computational notebooks by solving the stale dependency problem when executing cells out-of-order
nbsafety adds a layer of protection to computational notebooks by solving the stale dependency problem when executing cells out-of-order
gdsfactory is an EDA (electronics design automation) tool to Layout Integrated Circuits.
gdsfactory 3.5.5 gdsfactory is an EDA (electronics design automation) tool to Layout Integrated Circuits. It is build on top of phidl gdspy and klayou
Analysis scripts for QG equations
qg-edgeofchaos Analysis scripts for QG equations FIle/Folder Structure eigensolvers.py - Spectral and finite-difference solvers for Rossby wave eigenf
Log processor for nginx or apache that extracts user and user sessions and calculates other types of useful data for bot detection or traffic analysis
Log processor for nginx or apache that extracts user and user sessions and calculates other types of useful data for bot detection or traffic analysis
A Python adaption of Augur to prioritize cell types in perturbation analysis.
A Python adaption of Augur to prioritize cell types in perturbation analysis.
Powerful, efficient particle trajectory analysis in scientific Python.
freud Overview The freud Python library provides a simple, flexible, powerful set of tools for analyzing trajectories obtained from molecular dynamics
Pymwp is a tool for automatically performing static analysis on programs written in C
pymwp: MWP analysis in Python pymwp is a tool for automatically performing static analysis on programs written in C, inspired by "A Flow Calculus of m
Office365 (Microsoft365) audit log analysis tool
Office365 (Microsoft365) audit log analysis tool The header describes it all WHY?? The first line of code was written long time before other colleague
This is an example of how to automate Ridit Analysis for a dataset with large amount of questions and many item attributes
This is an example of how to automate Ridit Analysis for a dataset with large amount of questions and many item attributes
Source code, data, and evaluation details for “Cross-Lingual Citations in English Papers: A Large-Scale Analysis of Prevalence, Formation, and Ramifications”
Analysis of cross-lingual citations in English papers Contents initial_analysis Source code, data, and evaluation details as published at ICADL2020 ci
A package for music online and offline rhythmic information analysis including music Beat, downbeat, tempo and meter tracking.
BeatNet A package for music online and offline rhythmic information analysis including music Beat, downbeat, tempo and meter tracking. This repository
Efficient electromagnetic solver based on rigorous coupled-wave analysis for 3D and 2D multi-layered structures with in-plane periodicity
Efficient electromagnetic solver based on rigorous coupled-wave analysis for 3D and 2D multi-layered structures with in-plane periodicity, such as gratings, photonic-crystal slabs, metasurfaces, surface-emitting lasers, nano-antennas, and more.
Obsei is a low code AI powered automation tool.
Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand image analysis, comparative study and more .
4CAT: Capture and Analysis Toolkit
4CAT: Capture and Analysis Toolkit 4CAT is a research tool that can be used to analyse and process data from online social platforms. Its goal is to m
Robocop is a tool that performs static code analysis of Robot Framework code.
Robocop Introduction Documentation Values Requirements Installation Usage Example Robotidy FAQ Watch our talk from RoboCon 2021 about Robocop and Robo
The official PyTorch code implementation of "Human Trajectory Prediction via Counterfactual Analysis" in ICCV 2021.
Human Trajectory Prediction via Counterfactual Analysis (CausalHTP) The official PyTorch code implementation of "Human Trajectory Prediction via Count
Keyboard Layout Change - Extension for Ulauncher
Keyboard Layout Change - Extension for Ulauncher
Document blur detection based on Laplacian operator and text detection.
Document Blur Detection For general blurred image, using the variance of Laplacian operator is a good solution. But as for the blur detection of docum
Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework
VFedPCA+VFedAKPCA This is the official source code for the Paper: Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-
A flexible ML framework built to simplify medical image reconstruction and analysis experimentation.
meddlr Getting Started Meddlr is a config-driven ML framework built to simplify medical image reconstruction and analysis problems. Installation To av
Exploratory analysis and data visualization of aircraft accidents and incidents in Brazil.
Exploring aircraft accidents in Brazil Occurrencies with aircraft in Brazil are investigated by the Center for Investigation and Prevention of Aircraf
Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions.
About Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions. The tool provides rich data and a summary g
cLoops2: full stack analysis tool for chromatin interactions
cLoops2: full stack analysis tool for chromatin interactions Introduction cLoops2 is an extension of our previous work, cLoops. From loop-calling base
Tool for automatically reordering python imports. Similar to isort but uses static analysis more.
reorder_python_imports Tool for automatically reordering python imports. Similar to isort but uses static analysis more. Installation pip install reor
Multifunctional Analysis of Regions through Input-Output
MARIO Multifunctional Analysis of Regions through Input-Output. (Documents) What is it MARIO is a python package for handling input-output tables and
LSTM Neural Networks for Spectroscopic Studies of Type Ia Supernovae
Package Description The difficulties in acquiring spectroscopic data have been a major challenge for supernova surveys. snlstm is developed to provide