3311 Repositories
Python first-year-data-analysis Libraries
This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge.
Data-Science-Intern-Challenge This repository contains answers of the Shopify Summer 2022 Data Science Intern Challenge. Summer 2022 Data Science Inte
Active Transport Analytics Model (ATAM) is a new strategic transport modelling and data visualization framework for Active Transport as well as emerging micro-mobility modes
{ATAM} Active Transport Analytics Model Active Transport Analytics Model (“ATAM”) is a new strategic transport modelling and data visualization framew
Sentinel-1 SAR time series analysis for OSINT use
SARveillance Sentinel-1 SAR time series analysis for OSINT use. Description Generates a time lapse GIF of the Sentinel-1 satellite images for the loca
Validate arbitrary image uploads from incoming data urls while preserving file integrity but removing EXIF and unwanted artifacts and RCE exploit potential
Validate arbitrary base64-encoded image uploads as incoming data urls while preserving image integrity but removing EXIF and unwanted artifacts and mitigating RCE-exploit potential.
This is a simple website crawler which asks for a website link from the user to crawl and find specific data from the given website address.
This is a simple website crawler which asks for a website link from the user to crawl and find specific data from the given website address.
Deep learning with TensorFlow and earth observation data.
Deep Learning with TensorFlow and EO Data Complete file set for Jupyter Book Autor: Development Seed Date: 04 October 2021 ISBN: (to come) Notebook tu
Big Data & Cloud Computing for Oceanography
DS2 Class 2022, Big Data & Cloud Computing for Oceanography Home of the 2022 ISblue Big Data & Cloud Computing for Oceanography class (IMT-A, ENSTA, I
Generating new names based on trends in data using GPT2 (Transformer network)
MLOpsNameGenerator Overall Goal The goal of the project is to develop a model that is capable of creating Pokémon names based on its description, usin
Official git for "CTAB-GAN: Effective Table Data Synthesizing"
CTAB-GAN This is the official git paper CTAB-GAN: Effective Table Data Synthesizing. The paper is published on Asian Conference on Machine Learning (A
A Python package that can be used to download post and comment data from Reddit.
Reddit Data Collector Reddit Data Collector is a Python package that allows a user to collect post and comment data from Reddit. It is built on top of
A practical ML pipeline for data labeling with experiment tracking using DVC.
Auto Label Pipeline A practical ML pipeline for data labeling with experiment tracking using DVC Goals: Demonstrate reproducible ML Use DVC to build a
A vanilla 3D face modeling on pose-invariant and multi-lightning image data
3D-Face-Modeling A vanilla 3D face modeling on pose-invariant and multi-lightning image data Table of Contents Background Install Usage Contributing B
🎁 3,000,000+ Unsplash images made available for research and machine learning
The Unsplash Dataset The Unsplash Dataset is made up of over 250,000+ contributing global photographers and data sourced from hundreds of millions of
People tracker on the Internet: OSINT analysis and research tool by Jose Pino
trape (stable) v2.0 People tracker on the Internet: Learn to track the world, to avoid being traced. Trape is an OSINT analysis and research tool, whi
A collection of machine learning examples and tutorials.
machine_learning_examples A collection of machine learning examples and tutorials.
Always know what to expect from your data.
Great Expectations Always know what to expect from your data. Introduction Great Expectations helps data teams eliminate pipeline debt, through data t
Jupyter notebook and datasets from the pandas Q&A video series
Python pandas Q&A video series Read about the series, and view all of the videos on one page: Easier data analysis in Python with pandas. Jupyter Note
Code and data accompanying Natural Language Processing with PyTorch
Natural Language Processing with PyTorch Build Intelligent Language Applications Using Deep Learning By Delip Rao and Brian McMahan Welcome. This is a
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)
100 pandas puzzles Puzzles notebook Solutions notebook Inspired by 100 Numpy exerises, here are 100* short puzzles for testing your knowledge of panda
FMA: A Dataset For Music Analysis
FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information
Python library for the analysis of dynamic measurements
Python library for the analysis of dynamic measurements The goal of this library is to provide a starting point for users in metrology and related are
Python for downloading model data (HRRR, RAP, GFS, NBM, etc.) from NOMADS, NOAA's Big Data Program partners (Amazon, Google, Microsoft), and the University of Utah Pando Archive System.
Python for downloading model data (HRRR, RAP, GFS, NBM, etc.) from NOMADS, NOAA's Big Data Program partners (Amazon, Google, Microsoft), and the University of Utah Pando Archive System.
Streamlit App For Product Analysis - Streamlit App For Product Analysis
Streamlit_App_For_Product_Analysis Здравствуйте! Перед вами дашборд, позволяющий
Delta Conformity Sociopatterns Analysis - Delta Conformity Sociopatterns Analysis
Delta_Conformity_Sociopatterns_Analysis ∆-Conformity is a local homophily measur
Predictive Modeling & Analytics on Home Equity Line of Credit
Predictive Modeling & Analytics on Home Equity Line of Credit Data (Python) HMEQ Data Set In this assignment we will use Python to examine a data set
Implements a fake news detection program using classifiers.
Fake news detection Implements a fake news detection program using classifiers for Data Mining course at UoA. Description The project is the categoriz
A collection of data structures and algorithms I'm writing while learning
Data Structures and Algorithms: This is a collection of data structures and algorithms that I write while learning the subject Stack: stack.py A stack
Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data
1 Meta-FDMIxup Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data. (ACM MM 2021) paper News! the rep
Code for "Multi-Time Attention Networks for Irregularly Sampled Time Series", ICLR 2021.
Multi-Time Attention Networks (mTANs) This repository contains the PyTorch implementation for the paper Multi-Time Attention Networks for Irregularly
Script that allows to download data with satellite's orbit height and create CSV with their change in time.
Satellite orbit height ◾ Requirements Python = 3.8 Packages listen in reuirements.txt (run pip install -r requirements.txt) Account on Space Track ◾
A tool for RaceRoom Racing Experience which shows you launch data
R3E Launch Tool A tool for RaceRoom Racing Experience which shows you launch data. Usage Run the tool, change the Stop Speed to whatever you want, and
This is a web scraper, using Python framework Scrapy, built to extract data from the Deals of the Day section on Mercado Livre website.
Deals of the Day This is a web scraper, using the Python framework Scrapy, built to extract data such as price and product name from the Deals of the
A Simple Key-Value Data-store written in Python
mercury-db This is a File Based Key-Value Datastore that supports basic CRUD (Create, Read, Update, Delete) operations developed using Python. The dat
To attract customers, the hotel chain has added to its website the ability to book a room without prepayment
To attract customers, the hotel chain has added to its website the ability to book a room without prepayment. We need to predict whether the customer is going to reject the booking or not. Since in case of refusal, the hotel incurs losses.
LSTM built using Keras Python package to predict time series steps and sequences. Includes sin wave and stock market data
LSTM Neural Network for Time Series Prediction LSTM built using the Keras Python package to predict time series steps and sequences. Includes sine wav
Practical Time-Series Analysis, published by Packt
Practical Time-Series Analysis This is the code repository for Practical Time-Series Analysis, published by Packt. It contains all the supporting proj
U-Time: A Fully Convolutional Network for Time Series Segmentation
U-Time & U-Sleep Official implementation of The U-Time [1] model for general-purpose time-series segmentation. The U-Sleep [2] model for resilient hig
A simple flask application to collect annotations for the Turing Change Point Dataset, a benchmark dataset for change point detection algorithms
AnnotateChange Welcome to the repository of the "AnnotateChange" application. This application was created to collect annotations of time series data
Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)
taganomaly Anomaly detection labeling tool, specifically for multiple time series (one time series per category). Taganomaly is a tool for creating la
The Wearables Development Toolkit - a development environment for activity recognition applications with sensor signals
Wearables Development Toolkit (WDK) The Wearables Development Toolkit (WDK) is a framework and set of tools to facilitate the iterative development of
DeltaPy - Tabular Data Augmentation (by @firmai)
DeltaPy — Tabular Data Augmentation & Feature Engineering Finance Quant Machine Learning ML-Quant.com - Automated Research Repository Introduction T
A Python package for time series augmentation
tsaug tsaug is a Python package for time series augmentation. It offers a set of augmentation methods for time series, as well as a simple API to conn
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection
Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat
Supervised forecasting of sequential data in Python.
Supervised forecasting of sequential data in Python. Intro Supervised forecasting is the machine learning task of making predictions for sequential da
Algorithms for outlier, adversarial and drift detection
Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline d
An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour
Gordo Building thousands of models with timeseries data to monitor systems. Table of content About Examples Install Uninstall Developer manual How to
Survival analysis in Python
What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical commu
:spaghetti: Pastas is an open-source Python framework for the analysis of hydrological time series.
Pastas: Analysis of Groundwater Time Series Pastas: what is it? Pastas is an open source python package for processing, simulating and analyzing groun
(JMLR' 19) A Python Toolbox for Scalable Outlier Detection (Anomaly Detection)
Python Outlier Detection (PyOD) Deployment & Documentation & Stats & License PyOD is a comprehensive and scalable Python toolkit for detecting outlyin
Forecast dynamically at scale with this unique package. pip install scalecast
🌄 Scalecast: Dynamic Forecasting at Scale About This package uses a scaleable forecasting approach in Python with common scikit-learn and statsmodels
Hierarchical Time Series Forecasting with a familiar API
scikit-hts Hierarchical Time Series with a familiar API. This is the result from not having found any good implementations of HTS on-line, and my work
An open source python library for automated feature engineering
"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to
An intuitive library to extract features from time series
Time Series Feature Extraction Library Intuitive time series feature extraction This repository hosts the TSFEL - Time Series Feature Extraction Libra
The Turing Change Point Detection Benchmark: An Extensive Benchmark Evaluation of Change Point Detection Algorithms on real-world data
Turing Change Point Detection Benchmark Welcome to the repository for the Turing Change Point Detection Benchmark, a benchmark evaluation of change po
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis
bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P
Highly comparative time-series analysis
〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu
Python binding for Khiva library.
Khiva-Python Build Documentation Build Linux and Mac OS Build Windows Code Coverage README This is the Khiva Python binding, it allows the usage of Kh
Timeseries analysis for neuroscience data
=================================================== Nitime: timeseries analysis for neuroscience data ===============================================
A Python library for unevenly-spaced time series analysis
traces A Python library for unevenly-spaced time series analysis. Why? Taking measurements at irregular intervals is common, but most tools are primar
This is a Python wrapper for TA-LIB based on Cython instead of SWIG.
TA-Lib This is a Python wrapper for TA-LIB based on Cython instead of SWIG. From the homepage: TA-Lib is widely used by trading software developers re
Python package for downloading ECMWF reanalysis data and converting it into a time series format.
ecmwf_models Readers and converters for data from the ECMWF reanalysis models. Written in Python. Works great in combination with pytesmo. Citation If
A way of looking at COVID-19 data that I haven't seen before.
Visualizing Omicron: COVID-19 Deaths vs. Cases Click here for other countries. Data is from Our World in Data/Johns Hopkins University. About this pro
Software Engineer Salary Prediction
Based on 2021 stack overflow data, this machine learning web application helps one predict the salary based on years of experience, level of education and the country they work in.
Analyzed the data of VISA applicants to build a predictive model to facilitate the process of VISA approvals.
Analyzed the data of Visa applicants, built a predictive model to facilitate the process of visa approvals, and based on important factors that significantly influence the Visa status recommended a suitable profile for the applicants for whom the visa should be certified or denied.
Medical appointments No-Show classifier
Medical Appointments No-shows Why do 20% of patients miss their scheduled appointments? A person makes a doctor appointment, receives all the instruct
Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máquina.
Estatistica para Ciência de Dados e Machine Learning Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máqui
A general and strong 3D object detection codebase that supports more methods, datasets and tools (debugging, recording and analysis).
ALLINONE-Det ALLINONE-Det is a general and strong 3D object detection codebase built on OpenPCDet, which supports more methods, datasets and tools (de
Prometheus Exporter for data scraped from datenplattform.darmstadt.de
darmstadt-opendata-exporter Scrapes data from https://datenplattform.darmstadt.de and presents it in the Prometheus Exposition format. Pull requests w
Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort
Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort
PyTorch EO aims to make Deep Learning for Earth Observation data easy and accessible to real-world cases and research alike.
Pytorch EO Deep Learning for Earth Observation applications and research. 🚧 This project is in early development, so bugs and breaking changes are ex
Blender Add-on that sets a Material's Base Color to one of Pantone's Colors of the Year
Blender PCOY (Pantone Color of the Year) MCMC (Mid-Century Modern Colors) HG71 (House & Garden Colors 1971) Blender Add-ons That Assign a Custom Color
Full-Stack application that visualizes amusement park safety.
Amusement Park Ride Safety Analysis Project Proposal We have chosen to look into amusement park data to explore ride safety relationships visually, in
Finite Element Analysis
FElupe - Finite Element Analysis FElupe is a Python 3.6+ finite element analysis package focussing on the formulation and numerical solution of nonlin
Datasets, tools, and benchmarks for representation learning of code.
The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided
General Assembly's 2015 Data Science course in Washington, DC
DAT8 Course Repository Course materials for General Assembly's Data Science course in Washington, DC (8/18/15 - 10/29/15). Instructor: Kevin Markham (
Ipython notebook presentations for getting starting with basic programming, statistics and machine learning techniques
Data Science 45-min Intros Every week*, our data science team @Gnip (aka @TwitterBoulder) gets together for about 50 minutes to learn something. While
A middle-to-high level algorithm book designed with coding interview at heart!
Hands-on Algorithmic Problem Solving A one-stop coding interview prep book! About this book In short, this is a middle-to-high level algorithm book de
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Here are the sections: Data Science Cheatsheets Data Science EBooks Data Science Question Bank Data Science Case Studies Data Science Portfolio Data J
A site that displays up to date COVID-19 stats, powered by fastpages.
https://covid19dashboards.com This project was built with fastpages Background This project showcases how you can use fastpages to create a static das
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Spark Python Notebooks This is a collection of IPython notebook/Jupyter notebooks intended to train the reader on different Apache Spark concepts, fro
🤖 ⚡ scikit-learn tips
🤖 ⚡ scikit-learn tips New tips are posted on LinkedIn, Twitter, and Facebook. 👉 Sign up to receive 2 video tips by email every week! 👈 List of all
Introduction to Statistics and Basics of Mathematics for Data Science - The Hacker's Way
HackerMath for Machine Learning “Study hard what interests you the most in the most undisciplined, irreverent and original manner possible.” ― Richard
List of papers, code and experiments using deep learning for time series forecasting
Deep Learning Time Series Forecasting List of state of the art papers focus on deep learning and resources, code and experiments using deep learning f
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch
Jupyter Notebook tutorials on solving real-world problems with Machine Learning & Deep Learning using PyTorch. Topics: Face detection with Detectron 2, Time Series anomaly detection with LSTM Autoencoders, Object Detection with YOLO v5, Build your first Neural Network, Time Series forecasting for Coronavirus daily cases, Sentiment Analysis with BERT.
A Practitioner's Guide to Natural Language Processing
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, Text Analytics with Python published by Apress/Springer.
Koç University deep learning framework.
Knet Knet (pronounced "kay-net") is the Koç University deep learning framework implemented in Julia by Deniz Yuret and collaborators. It supports GPU
A powerful and user-friendly binary analysis platform!
angr angr is a platform-agnostic binary analysis framework. It is brought to you by the Computer Security Lab at UC Santa Barbara, SEFCOM at Arizona S
Anomaly detection related books, papers, videos, and toolboxes
Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify
Dshell is a network forensic analysis framework.
Dshell An extensible network forensic analysis framework. Enables rapid development of plugins to support the dissection of network packet captures. K
My solution to the book A Collection of Data Science Take-Home Challenges
DS-Take-Home Solution to the book "A Collection of Data Science Take-Home Challenges". Note: Please don't contact me for the dataset. This repository
Free and open source qualitative research tool
Taguette A spin on the phrase "tag it!", Taguette is a free and open source qualitative research tool that allows users to: Import PDFs, Word Docs (.d
Cleaned test data list of DukeMTMC-reID, ICCV2021
Cleaned DukeMTMC-reID Cleaned data list of DukeMTMC-reID released with our paper accepted by ICCV 2021: Learning Instance-level Spatial-Temporal Patte
Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data
Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data This is the official PyTorch implementation of the SeCo paper: @articl
Resco: A simple python package that report the effect of deep residual learning
resco Description resco is a simple python package that report the effect of dee
a curated list of docker-compose files prepared for testing data engineering tools, databases and open source libraries.
data-services A repository for storing various Data Engineering docker-compose files in one place. How to use it ? Set the required settings in .env f
Customer Service Requests Analysis is one of the practical life problems that an analyst may face. This Project is one such take. The project is a beginner to intermediate level project. This repository has a Source Code, README file, Dataset, Image and License file.
Customer Service Requests Analysis Project 1 DESCRIPTION Background of Problem Statement : NYC 311's mission is to provide the public with quick and e
Analysis of a daily word game "Wordle"
Wordle Analysis of a daily word game "Wordle" https://www.powerlanguage.co.uk/wordle/ Description Worlde is a daily word game in which a player attemp
Creating a python package to convert /transfer excelsheet data to a mysql Database Table
Creating a python package to convert /transfer excelsheet data to a mysql Database Table
Lol qq parser - A League of Legends parser for QQ data
lol_qq_parser A League of Legends parser for QQ data Sources This package relies
Python bindings for Basler's VisualApplets TCL script generation
About visualapplets.py The Basler AG company provides a TCL scripting engine to automatize the creation of VisualApplets designs (a former Silicon Sof