1190 Repositories
Python analysis-pipeline Libraries
ALSPAC data analysis studying links between screen-usage and mental health issues in children. Provided data has been synthesised.
ADSMH - Mental Health and Screen Time Group coursework for Applied Data Science at the University of Bristol. Overview The data set that you have was
El Niño - Southern Oscillation analysis compared to minimum flow rates of rivers in northeast Brazil
ENSO (El Niño - Southern Oscillation) analysis in northeast Brazil É comprovada a influência dos fenômenos El Niño e La Niña nas secas no nordesde bra
EchoDNS - Analyze your DNS traffic super easy, shows all requested DNS traffic
EchoDNS - Analyze your DNS traffic super easy, shows all requested DNS traffic
This is RFA-Toolbox, a simple and easy-to-use library that allows you to optimize your neural network architectures using receptive field analysis (RFA) and create graph visualizations of your architecture.
ReceptiveFieldAnalysisToolbox This is RFA-Toolbox, a simple and easy-to-use library that allows you to optimize your neural network architectures usin
Sentiment-Analysis and EDA on the IMDB Movie Review Dataset
Sentiment-Analysis and EDA on the IMDB Movie Review Dataset The main part of the work focuses on the exploration and study of different approaches whi
ioztat is a storage load analysis tool for OpenZFS
ioztat is a storage load analysis tool for OpenZFS. It provides iostat-like statistics at an individual dataset/zvol level.
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries by disassembling and analyzing the compiled python byte-code(.pyc) files across all python versions (including Python 3.10.*)
Financial portfolio optimisation in python, including classical efficient frontier, Black-Litterman, Hierarchical Risk Parity
PyPortfolioOpt has recently been published in the Journal of Open Source Software 🎉 PyPortfolioOpt is a library that implements portfolio optimizatio
Practical Machine Learning with Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Semi-Automated Data Processing
Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meaningful decision to achieve a low-bias and low-variance model.
Security audit Python project dependencies against security advisory databases.
Security audit Python project dependencies against security advisory databases.
Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle.
2019-indian-election-eda Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle. This project is a part of the Cou
API Server for VoIP analysis (CDR + Audio CODECs)
Swagger generated server Overview This server was generated by the swagger-codegen project. By using the OpenAPI-Spec from a remote server, you can ea
AB-test-analyzer - Python class to perform AB test analysis
AB-test-analyzer Python class to perform AB test analysis Overview This repo con
Contains analysis of trends from Fitbit Dataset (source: Kaggle) to see how the trends can be applied to Bellabeat customers and Bellabeat products
Contains analysis of trends from Fitbit Dataset (source: Kaggle) to see how the trends can be applied to Bellabeat customers and Bellabeat products.
Repo for The Crown: Exploratory Analysis of Nim Malware DEF CON 615 talk
Repo for "The Crown: Exploratory Analysis of Nim Malware" DEF CON 615 talk
A computer vision pipeline to identify the "icons" in Christian paintings
Christian-Iconography A computer vision pipeline to identify the "icons" in Christian paintings. A bit about iconography. Iconography is related to id
Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis
Introduction This is an implementation of our paper Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis.
HuSpaCy: industrial-strength Hungarian natural language processing
HuSpaCy: Industrial-strength Hungarian NLP HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing faciliti
Repository for the AugmentedPCA Python package.
Overview This Python package provides implementations of Augmented Principal Component Analysis (AugmentedPCA) - a family of linear factor models that
Specification language for generating Generalized Linear Models (with or without mixed effects) from conceptual models
tisane Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships TL;DR: Analysts can use Tisane to author gener
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.
Active Transport Analytics Model: A new strategic transport modelling and data visualization framework
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew
Sentinel-1 SAR time series analysis for OSINT use
SARveillance Sentinel-1 SAR time series analysis for OSINT use. Description Generates a time lapse GIF of the Sentinel-1 satellite images for the loca
End-to-end MLOps pipeline of a BERT model for emotion classification.
image source EmoBERT-MLOps The goal of this repository is to build an end-to-end MLOps pipeline based on the MLOps course from Made with ML, but this
A practical ML pipeline for data labeling with experiment tracking using DVC.
Auto Label Pipeline A practical ML pipeline for data labeling with experiment tracking using DVC Goals: Demonstrate reproducible ML Use DVC to build a
People tracker on the Internet: OSINT analysis and research tool by Jose Pino
trape (stable) v2.0 People tracker on the Internet: Learn to track the world, to avoid being traced. Trape is an OSINT analysis and research tool, whi
Always know what to expect from your data.
Great Expectations Always know what to expect from your data. Introduction Great Expectations helps data teams eliminate pipeline debt, through data t
Jupyter notebook and datasets from the pandas Q&A video series
Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
100 pandas puzzles Puzzles notebook Solutions notebook Inspired by 100 Numpy exerises, here are 100* short puzzles for testing your knowledge of panda
FMA: A Dataset For Music Analysis
FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information
Python library for the analysis of dynamic measurements
Python library for the analysis of dynamic measurements The goal of this library is to provide a starting point for users in metrology and related are
Streamlit App For Product Analysis - Streamlit App For Product Analysis
Streamlit_App_For_Product_Analysis Здравствуйте! Перед вами дашборд, позволяющий
Delta Conformity Sociopatterns Analysis - Delta Conformity Sociopatterns Analysis
Delta_Conformity_Sociopatterns_Analysis ∆-Conformity is a local homophily measur
Deep Learning pipeline for motor-imagery classification.
BCI-ToolBox 1. Introduction BCI-ToolBox is deep learning pipeline for motor-imagery classification. This repo contains five models: ShallowConvNet, De
Practical Time-Series Analysis, published by Packt
Practical Time-Series Analysis This is the code repository for Practical Time-Series Analysis, published by Packt. It contains all the supporting proj
U-Time: A Fully Convolutional Network for Time Series Segmentation
U-Time & U-Sleep Official implementation of The U-Time [1] model for general-purpose time-series segmentation. The U-Sleep [2] model for resilient hig
A simple flask application to collect annotations for the Turing Change Point Dataset, a benchmark dataset for change point detection algorithms
AnnotateChange Welcome to the repository of the "AnnotateChange" application. This application was created to collect annotations of time series data
Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)
taganomaly Anomaly detection labeling tool, specifically for multiple time series (one time series per category). Taganomaly is a tool for creating la
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection
Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat
Survival analysis in Python
What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu
:spaghetti: Pastas is an open-source Python framework for the analysis of hydrological time series.
Pastas: Analysis of Groundwater Time Series Pastas: what is it? Pastas is an open source python package for processing, simulating and analyzing groun
(JMLR' 19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Python Outlier Detection (PyOD) Deployment & Documentation & Stats & License PyOD is a comprehensive and scalable Python toolkit for detecting outlyin
Hierarchical Time Series Forecasting with a familiar API
scikit-hts Hierarchical Time Series with a familiar API. This is the result from not having found any good implementations of HTS on-line, and my work
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis
bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P
Highly comparative time-series analysis
〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu
Timeseries analysis for neuroscience data
=================================================== Nitime: timeseries analysis for neuroscience data ===============================================
A Python library for unevenly-spaced time series analysis
traces A Python library for unevenly-spaced time series analysis. Why? Taking measurements at irregular intervals is common, but most tools are primar
This is a Python wrapper for TA-LIB based on Cython instead of SWIG.
TA-Lib This is a Python wrapper for TA-LIB based on Cython instead of SWIG. From the homepage: TA-Lib is widely used by trading software developers re
Software Engineer Salary Prediction
Based on 2021 stack overflow data, this machine learning web application helps one predict the salary based on years of experience, level of education and the country they work in.
Medical appointments No-Show classifier
Medical Appointments No-shows Why do 20% of patients miss their scheduled appointments? A person makes a doctor appointment, receives all the instruct
DevSecOps pipeline for Python based web app using Jenkins, Ansible, AWS, and open-source security tools and checks.
DevSecOps pipeline for Python Web App A Jenkins end-to-end DevSecOps pipeline for Python web application, hosted on AWS Ubuntu 20.04 Note: This projec
A general and strong 3D object detection codebase that supports more methods, datasets and tools (debugging, recording and analysis).
ALLINONE-Det ALLINONE-Det is a general and strong 3D object detection codebase built on OpenPCDet, which supports more methods, datasets and tools (de
Cherche (search in French) allows you to create a neural search pipeline using retrievers and pre-trained language models as rankers.
Cherche (search in French) allows you to create a neural search pipeline using retrievers and pre-trained language models as rankers. Cherche is meant to be used with small to medium sized corpora. Cherche's main strength is its ability to build diverse and end-to-end pipelines.
Finite Element Analysis
FElupe - Finite Element Analysis FElupe is a Python 3.6+ finite element analysis package focussing on the formulation and numerical solution of nonlin
This is the offline-training-pipeline for our project.
offline-training-pipeline This is the offline-training-pipeline for our project. We adopt the offline training and online prediction Machine Learning
General Assembly's 2015 Data Science course in Washington, DC
DAT8 Course Repository Course materials for General Assembly's Data Science course in Washington, DC (8/18/15 - 10/29/15). Instructor: Kevin Markham (
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Spark Python Notebooks This is a collection of IPython notebook/Jupyter notebooks intended to train the reader on different Apache Spark concepts, fro
List of papers, code and experiments using deep learning for time series forecasting
Deep Learning Time Series Forecasting List of state of the art papers focus on deep learning and resources, code and experiments using deep learning f
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch. Topics: Face detection with Detectron 2, Time Series anomaly detection with LSTM Autoencoders, Object Detection with YOLO v5, Build your first Neural Network, Time Series forecasting for Coronavirus daily cases, Sentiment Analysis with BERT.
A Practitioner's Guide to Natural Language Processing
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, Text Analytics with Python published by Apress/Springer.
A powerful and user-friendly binary analysis platform!
angr angr is a platform-agnostic binary analysis framework. It is brought to you by the Computer Security Lab at UC Santa Barbara, SEFCOM at Arizona S
Anomaly detection related books, papers, videos, and toolboxes
Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify
Dshell is a network forensic analysis framework.
Dshell An extensible network forensic analysis framework. Enables rapid development of plugins to support the dissection of network packet captures. K
Complete pipeline for crawling online newspaper article.
Complete pipeline for crawling online newspaper article. The articles are stored to MongoDB. The whole pipeline is dockerized, thus the user does not need to worry about dependencies. Additionally, docker-compose is available to increase the useability for the user.
Free and open source qualitative research tool
Taguette A spin on the phrase "tag it!", Taguette is a free and open source qualitative research tool that allows users to: Import PDFs, Word Docs (.d
Resco: A simple python package that report the effect of deep residual learning
resco Description resco is a simple python package that report the effect of dee
Customer Service Requests Analysis is one of the practical life problems that an analyst may face. This Project is one such take. The project is a beginner to intermediate level project. This repository has a Source Code, README file, Dataset, Image and License file.
Customer Service Requests Analysis Project 1 DESCRIPTION Background of Problem Statement : NYC 311's mission is to provide the public with quick and e
Analysis of a daily word game "Wordle"
Wordle Analysis of a daily word game "Wordle" https://www.powerlanguage.co.uk/wordle/ Description Worlde is a daily word game in which a player attemp
Uproot is a library for reading and writing ROOT files in pure Python and NumPy.
Uproot is a library for reading and writing ROOT files in pure Python and NumPy. Unlike the standard C++ ROOT implementation, Uproot is only an I/O li
NNR conformation conditional and global probabilities estimation and analysis in peptides or proteins fragments
NNR and global probabilities estimation and analysis in peptides or protein fragments This module calculates global and NNR conformation dependent pro
PyMove is a Python library to simplify queries and visualization of trajectories and other spatial-temporal data
Use PyMove and go much further Information Package Status License Python Version Platforms Build Status PyPi version PyPi Downloads Conda version Cond
A lightweight face-recognition toolbox and pipeline based on tensorflow-lite
FaceIDLight 📘 Description A lightweight face-recognition toolbox and pipeline based on tensorflow-lite with MTCNN-Face-Detection and ArcFace-Face-Rec
BigDL - Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems
Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems.
Imaging, analysis, and simulation software for radio interferometry
ehtim (eht-imaging) Python modules for simulating and manipulating VLBI data and producing images with regularized maximum likelihood methods. This ve
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Knowledge Repo The Knowledge Repo project is focused on facilitating the sharing of knowledge between data scientists and other technical roles using
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python
Scalene: a high-performance CPU, GPU and memory profiler for Python by Emery Berger, Sam Stern, and Juan Altmayer Pizzorno. Scalene community Slack Ab
A distributed crawler for weibo, building with celery and requests.
A distributed crawler for weibo, building with celery and requests.
DUQ is a python package for working with physical Dimensions, Units, and Quantities.
DUQ is a python package for working with physical Dimensions, Units, and Quantities.
An analysis tool for Python that blurs the line between testing and type systems.
CrossHair An analysis tool for Python that blurs the line between testing and type systems. THE LATEST NEWS: Check out the new crosshair cover command
This project deals with a simplified version of a more general problem of Aspect Based Sentiment Analysis.
Aspect_Based_Sentiment_Extraction Created on: 5th Jan, 2022. This project deals with an important field of Natural Lnaguage Processing - Aspect Based
An interactive DNN Model deployed on web that predicts the chance of heart failure for a patient with an accuracy of 98%
Heart Failure Predictor About A Web UI deployed Dense Neural Network Model Made using Tensorflow that predicts whether the patient is healthy or has c
NLP techniques such as named entity recognition, sentiment analysis, topic modeling, text classification with Python to predict sentiment and rating of drug from user reviews.
This file contains the following documents sumbited for Baruch CIS9665 group 9 fall 2021. 1. Dataset: drug_reviews.csv 2. python codes for text classi
Fine tuning keras-ocr python package with custom synthetic dataset from scratch
OCR-Pipeline-with-Keras The keras-ocr package generally consists of two parts: a Detector and a Recognizer: Detector is responsible for creating bound
Data and analysis code for an MS on SK VOC genomes phenotyping/neutralisation assays
Description Summary of phylogenomic methods and analyses used in "Immunogenicity of convalescent and vaccinated sera against clinical isolates of ance
A stock analysis app with streamlit
StockAnalysisApp A stock analysis app with streamlit. You select the ticker of the stock and the app makes a series of analysis by using the price cha
Automated Exploration Data Analysis on a financial dataset
Automated EDA on financial dataset Just a simple way to get automated Exploration Data Analysis from financial dataset (OHLCV) using Streamlit and ta.
This repository contains the code to replicate the analysis from the paper "Moving On - Investigating Inventors' Ethnic Origins Using Supervised Learning"
Replication Code for 'Moving On' - Investigating Inventors' Ethnic Origins Using Supervised Learning This repository contains the code to replicate th
Malware arcane - Scripts and notes on my malware analysis journey
Malware Arcane Repository of notes and scripts I use when doing malware analysis
CSE-519---Project - Job Title Analysis (Project for CSE 519 - Data Science Fundamentals)
A Multifaceted Approach to Job Title Analysis CSE 519 - Data Science Fundamentals Project Description Project consists of three parts: Salary Predicti
Automatic caption evaluation metric based on typicality analysis.
SeMantic and linguistic UndeRstanding Fusion (SMURF) Automatic caption evaluation metric described in the paper "SMURF: SeMantic and linguistic UndeRs
Streaming Finance Data with AWS Lambda
A data pipeline consisting of an AWS lambda function reading data from yfinance API, an AWS Kinesis stream to receive & store data in S3 buckets and AWS Glue crawler & Athena to run SQL queries.
inding a method to objectively quantify skill versus chance in games, using reinforcement learning
Skill-vs-chance-games-analysis - Finding a method to objectively quantify skill versus chance in games, using reinforcement learning
AmiEviL - This program uses the Virus Total API to determine if your suspicious file is malicious or not
AmiEviL - This program uses the Virus Total API to determine if your suspicious file is malicious or not. The program requests the hash of the file and outputs information (if any). This version will output: the file type, names seen in the wild, the number of security vendors that have flagged it as malicious, undetected, and unable to process the file.
DeepSpeech - Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.
(简体中文|English) Quick Start | Documents | Models List PaddleSpeech is an open-source toolkit on PaddlePaddle platform for a variety of critical tasks i
NLP-Project - Used an API to scrape 2000 reddit posts, then used NLP analysis and created a classification model to mixed succcess
Project 3: Web APIs & NLP Problem Statement How do r/Libertarian and r/Neoliberal differ on Biden post-inaguration? The goal of the project is to see
Twitter-Sentiment-Analysis - Analysis of twitter posts' positive and negative score.
Twitter-Sentiment-Analysis The hands-on project is in Python 3 Programming class offered by University of Michigan via Coursera. The task is to build
LSTM based Sentiment Classification using Tensorflow - Amazon Reviews Rating
LSTM based Sentiment Classification using Tensorflow - Amazon Reviews Rating (Dataset) The dataset is from Amazon Review Data (2018)
A web-based analysis toolkit for the System Usability Scale providing calculation, plotting, interpretation and contextualization utility
System Usability Scale Analysis Toolkit The System Usability Scale (SUS) Analysis Toolkit is a web-based python application that provides a compilatio