188 Repositories
Python analaytics-engineering Libraries
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Streamify A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! Description Objective The project will stre
Free Data Engineering course!
Data Engineering Zoomcamp Register in DataTalks.Club's Slack Join the #course-data-engineering channel The videos are published to DataTalks.Club's Yo
This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.
Python_Natural_Language_Processing This repository contains tutorials on important topics related to Natural Language Processing (NPL). No. Name 01 01
A machine learning malware analysis framework for Android apps.
🕵️ A machine learning malware analysis framework for Android apps. ☢️ DroidDetective is a Python tool for analysing Android applications (APKs) for p
DLO8012: Natural Language Processing & CSL804: Computational Lab - II Semester VIII
NATURAL-LANGUAGE-PROCESSING-AND-COMPUTATIONAL-LAB-II DLO8012: NLP & CSL804: CL-II [SEMESTER VIII] Syllabus NLP - Reference Books THE WALL MEGA SATISH
Esse é o meu primeiro repo tratando de fim a fim, uma pipeline de dados abertos do governo brasileiro relacionado a compras de contrato e cronogramas anuais com spark, em pyspark e SQL!
Olá! Esse é o meu primeiro repo tratando de fim a fim, uma pipeline de dados abertos do governo brasileiro relacionado a compras de contrato e cronogr
Open-source data observability for modern data teams
Use cases Monitor your data warehouse in minutes: Data anomalies monitoring as dbt tests Data lineage made simple, reliable, and automated dbt operati
Programming assignments and quizzes from all courses within the Machine Learning Engineering for Production (MLOps) specialization offered by deeplearning.ai
Machine Learning Engineering for Production (MLOps) Specialization on Coursera (offered by deeplearning.ai) Programming assignments from all courses i
Fully cross-platform toolkit (and library!) for MachO+Obj-C editing/analysis
fully cross-platform toolkit (and library!) for MachO+Obj-C editing/analysis. Includes a cli kit, a curses GUI, ObjC header dumping, and much more.
An end-to-end Python-based Infrastructure as Code framework for network automation and orchestration.
Nectl An end-to-end Python-based Infrastructure as Code framework for network automation and orchestration. Features Data modelling and validation. Da
In this project we predict the forest cover type using the cartographic variables in the training/test datasets.
Kaggle Competition: Forest Cover Type Prediction In this project we predict the forest cover type (the predominant kind of tree cover) using the carto
Unofficial Playdate reverse-engineering notes/tools - covers file formats, server API and USB commands
Unofficial Playdate reverse-engineering notes/tools - covers file formats, server API and USB commands ⚠️ This documentation is unofficial and is not
Diabetes-Feature-Engineering - A machine learning model that can predict whether people have diabetes when their characteristics are specified
Diabetes-Feature-Engineering Aim Developing a machine learning model that can pr
Files related to PoC||GTFO 21:21 - NSA’s Backdoor of the PX1000-Cr
Files related to PoC||GTFO 21:21 - NSA’s Backdoor of the PX1000-Cr 64bit2key.py
LotteryBuyPredictionWebApp - Lottery Purchase Prediction Model
Lottery Purchase Prediction Model Objective and Goal Predict the lottery type th
Patching - Interactive Binary Patching for IDA Pro
Patching - Interactive Binary Patching for IDA Pro Overview Patching assembly code to change the behavior of an existing program is not uncommon in ma
A Radare2 based Python module for Binary Analysis and Reverse Engineering.
Zepu1chr3 A Radare2 based Python module for Binary Analysis and Reverse Engineering. Installation You can simply run this command. pip3 install zepu1c
Binjago - Set of tools aiding in analysis of stripped Golang binaries with Binary Ninja
Binjago 🥷 Set of tools aiding in analysis of stripped Golang binaries with Bina
Convert monolithic Jupyter notebooks into Ploomber pipelines.
Soorgeon Join our community | Newsletter | Contact us | Blog | Website | YouTube Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeo
Reverse engineering the dengue virus (under development construction)
Reverse engineering the dengue virus (under development 🚧 ) What is dengue? Dengue is a viral infection transmitted to humans through the bite of inf
Data Engineering ZoomCamp
Data Engineering ZoomCamp I'm partaking in a Data Engineering Bootcamp / Zoomcamp and will be tracking my progress here. I can't promise these notes w
🤖 Project template for your next awesome AI project. 🦾
🤖 AI Awesome Project Template 👋 Template author You may want to adjust badge links in a README.md file. 💎 Installation with pip Installation is as
An easy-to-use feature store
A feature store is a data storage system for data science and machine-learning. It can store raw data and also transformed features, which can be fed straight into an ML model or training script.
Feature engineering and machine learning: together at last
Feature engineering and machine learning: together at last! Lambdo is a workflow engine which significantly simplifies data analysis by unifying featu
This is an open solution to the Home Credit Default Risk challenge 🏡
Home Credit Default Risk: Open Solution This is an open solution to the Home Credit Default Risk challenge 🏡 . More competitions 🎇 Check collection
Estimation of whether or not the persons given information will have diabetes.
Diabetes Business Problem : It is desired to develop a machine learning model that can predict whether people have diabetes when their characteristics
Play WORDLE game in your terminal.
Wordle TUI Play WORDLE game in your terminal. The game will be kept the same as the Web version. Prerequisites Python 3.7+ Linux/MacOS (Windows is not
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
applied-ml Curated papers, articles, and blogs on data science & machine learning in production. ⚙️ Figuring out how to implement your ML project? Lea
A curated list of awesome DataOps tools
Awesome DataOps A curated list of awesome DataOps tools. Awesome DataOps Data Catalog Data Exploration Data Ingestion Data Lake Data Processing Data Q
Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified
Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified
Aircraft design optimization made fast through modern automatic differentiation
Aircraft design optimization made fast through modern automatic differentiation. Plug-and-play analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries
A Python Bytecode Disassembler helping reverse engineers in dissecting Python binaries by disassembling and analyzing the compiled python byte-code(.pyc) files across all python versions (including Python 3.10.*)
Semi-Automated Data Processing
Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meaningful decision to achieve a low-bias and low-variance model.
Data Science Course at Dept. of Computer Engineering, Chula 2022
2110446 Data Science Course at Chula 2022 Short links for exercises: Week1: Intro to Numpy, Pandas Numpy: https://colab.research.google.com/github/kao
People tracker on the Internet: OSINT analysis and research tool by Jose Pino
trape (stable) v2.0 People tracker on the Internet: Learn to track the world, to avoid being traced. Trape is an OSINT analysis and research tool, whi
Always know what to expect from your data.
Great Expectations Always know what to expect from your data. Introduction Great Expectations helps data teams eliminate pipeline debt, through data t
Feature engineering library that helps you keep track of feature dependencies, documentation and schema
Feature engineering library that helps you keep track of feature dependencies, documentation and schema
DeltaPy - Tabular Data Augmentation (by @firmai)
DeltaPy — Tabular Data Augmentation & Feature Engineering Finance Quant Machine Learning ML-Quant.com - Automated Research Repository Introduction T
Automated Time Series Forecasting
AutoTS AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale. There are dozens of forecasting mod
An open source python library for automated feature engineering
"One of the holy grails of machine learning is to automate more and more of the feature engineering process." ― Pedro Domingos, A Few Useful Things to
An intuitive library to extract features from time series
Time Series Feature Extraction Library Intuitive time series feature extraction This repository hosts the TSFEL - Time Series Feature Extraction Libra
3ds-Ghidra-Scripts - Ghidra scripts to help with 3ds reverse engineering
3ds Ghidra Scripts These are ghidra scripts to help with 3ds reverse engineering
Medical appointments No-Show classifier
Medical Appointments No-shows Why do 20% of patients miss their scheduled appointments? A person makes a doctor appointment, receives all the instruct
PyResToolbox - A collection of Reservoir Engineering Utilities
pyrestoolbox A collection of Reservoir Engineering Utilities This set of functio
a curated list of docker-compose files prepared for testing data engineering tools, databases and open source libraries.
data-services A repository for storing various Data Engineering docker-compose files in one place. How to use it ? Set the required settings in .env f
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)
PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J
Python client for using Prefect Cloud with Saturn Cloud
prefect-saturn prefect-saturn is a Python package that makes it easy to run Prefect Cloud flows on a Dask cluster with Saturn Cloud. For a detailed tu
Code repo for the book "Feature Engineering for Machine Learning," by Alice Zheng and Amanda Casari, O'Reilly 2018
feature-engineering-book This repo accompanies "Feature Engineering for Machine Learning," by Alice Zheng and Amanda Casari. O'Reilly, 2018. The repo
Quilt is a self-organizing data hub for S3
Quilt is a self-organizing data hub Python Quick start, tutorials If you have Python and an S3 bucket, you're ready to create versioned datasets with
This is my personal version of Pac-Man using python, which is the first assignment of EA Software Engineering Virtual Experience Program from Forage.com
Vac-Man in Python This is my personal version of Vax-man game using python, which is the first task of EA Software Engineering Virtual Experience Prog
An interactive python script that enables root access on the T-Mobile (Wingtech) TMOHS1, as well as providing several useful utilites to change the configuration of the device.
TMOHS1 Root Utility Description An interactive python script that enables root access on the T-Mobile (Wingtech) TMOHS1, as well as providing several
CoreSE - basic of social Engineering tool
Core Social Engineering basic of social Engineering tool. just for fun :) About First of all, I must say that I wrote such a project because of my int
Pti-file-format - Reverse engineering the Polyend Tracker instrument file format
pti-file-format Reverse engineering the Polyend Tracker instrument file format.
Iowa Project - My second project done at General Assembly, focused on feature engineering and understanding Linear Regression as a concept
Project 2 - Ames Housing Data and Kaggle Challenge PROBLEM STATEMENT Inferring or Predicting? What's more valuable for a housing model? When creating
Data-science-on-gcp - Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
data-science-on-gcp Source code accompanying book: Data Science on the Google Cloud Platform, 2nd Edition Valliappa Lakshmanan O'Reilly, Jan 2022 Bran
GEF (GDB Enhanced Features) - a modern experience for GDB with advanced debugging features for exploit developers & reverse engineers ☢
GEF (GDB Enhanced Features) - a modern experience for GDB with advanced debugging features for exploit developers & reverse engineers ☢
Tensorflow-Project-Template - A best practice for tensorflow project template architecture.
Tensorflow Project Template A simple and well designed structure is essential for any Deep Learning project, so after a lot of practice and contributi
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.
Mortgage-loan-prediction - Show how to perform advanced Analytics and Machine Learning in Python using a full complement of PyData utilities
Mortgage-loan-prediction - Show how to perform advanced Analytics and Machine Learning in Python using a full complement of PyData utilities. This is aimed at those looking to get into the field of Data Science or those who are already in the field and looking to solve a real-world project with python.
House_prices_kaggle - Predict sales prices and practice feature engineering, RFs, and gradient boosting
House Prices - Advanced Regression Techniques Predicting House Prices with Machine Learning This project is build to enhance my knowledge about machin
Houseprices - Predict sales prices and practice feature engineering, RFs, and gradient boosting
House Prices - Advanced Regression Techniques Predicting House Prices with Machine Learning This project is build to enhance my knowledge about machin
An open-source systems and controls toolbox for Python3
harold A control systems package for Python=3.6. Introduction This package is written with the ambition of providing a full-fledged control systems s
This repository includes different versions of the prescribed-time controller as Simulink blocks and MATLAB script codes for engineering applications.
Prescribed-time Control Prescribed-time control (PTC) blocks in Simulink environment, MATLAB R2020b. For more theoretical details, refer to the papers
A curated list of python programming language blogs
Python Blogs A curated list of python programming language blogs Contribute Companies/Organization # A B C D E F G H I J K L M N O P Q R S T U V W X Y
Minitel 5 somewhat reverse-engineered
Minitel 5 The Minitel was a french dumb terminal with an embedded modem which had its Golden Age before the rise of Internet. Typically cubic, with an
Chromepass - Hacking Chrome Saved Passwords
Chromepass - Hacking Chrome Saved Passwords and Cookies View Demo · Report Bug · Request Feature Table of Contents About the Project AV Detection Gett
Python Libraries with functions and constants related to electrical engineering.
ElectricPy Electrical-Engineering-for-Python Python Libraries with functions and constants related to electrical engineering. The functions and consta
CRC Reverse Engineering Tool in Python
CRC Beagle CRC Beagle is a tool for reverse engineering CRCs. It is designed for commnication protocols where you often have several messages of the s
Materials (slides, code, assignments) for the NYU class I teach on NLP and ML Systems (Master of Engineering).
FREE_7773 Repo containing material for the NYU class (Master of Engineering) I teach on NLP, ML Sys etc. For context on what the class is trying to ac
Feature Store for Machine Learning
Overview Feast is an open source feature store for machine learning. Feast is the fastest path to productionizing analytic data for model training and
An extremely configurable markdown reverser for Python3.
🔄 Unmarkd A markdown reverser. Unmarkd is a BeautifulSoup-powered Markdown reverser written in Python and for Python. Why This is created as a StackS
yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data.
The yt Project yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data. yt supports structured, varia
A 3D structural engineering finite element library for Python.
An easy to use elastic 3D structural engineering finite element analysis library for Python.
This script has been created in order to find what are the most common demanded technologies in Data Engineering field.
This is a Python script that given a whole corpus of job descriptions and a file with keywords it extracts the number of number of ocurrences of these keywords and write it to a file. This script it is easy to extend to accept more functionalities
Automatic and platform-independent unpacker for Windows binaries based on emulation
_ _ __ _ __ _ | | | | / / (_) \ \ | | | | | |_ __ | | _ | | _ __ __ _ ___| | _____ _ __
Machine Learning automation and tracking
The Open-Source MLOps Orchestration Framework MLRun is an open-source MLOps framework that offers an integrative approach to managing your machine-lea
An ongoing curated list of frameworks, libraries, learning tutorials, software and resources in Python Language.
Python Development Welcome to the world of Python. An ongoing curated list of frameworks, libraries, learning tutorials, software and resources in Pyt
IDA scripts for hypervisor (Hyper-v) analysis and reverse engineering automation
Re-Scripts IA32-VMX-Helper (IDA-Script) IA32-MSR-Decoder (IDA-Script) IA32 VMX Helper It's an IDA script (Updated IA32 MSR Decoder) which helps you to
Parser for the GeoSuite[tm] PRV export format
Parser for the GeoSuite[tm] PRV export format This library provides functionality to parse geotechnical investigation data in .prv files generated by
The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete.
Feature-Engineering The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete. Business
Slientruss3d : Python for stable truss analysis tool
slientruss3d : Python for stable truss analysis tool Desciption slientruss3d is a python package which can solve the resistances, internal forces and
PLUR is a collection of source code datasets suitable for graph-based machine learning.
PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. We provide scripts for downloading, processing, and loading the datasets. This is done by offering a unified API and data structures for all datasets.
Simulate Attacks With Mininet And Hping3
Miniattack Simulate Attacks With Mininet And Hping3 It measures network load with bwm-ng when the net is under attack and plots the result. This demo
Flexible time series feature extraction & processing
tsflex is a toolkit for flexible time series processing & feature extraction, that is efficient and makes few assumptions about sequence data. Useful
A collection of online resources to help you on your Tech journey.
Everything Tech Resources & Projects About The Project Coming from an engineering background and looking to up skill yourself on a new field can be di
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
Kaggler is a Python package for lightweight online machine learning algorithms and utility functions for ETL and data analysis.
Kaggler is a Python package for lightweight online machine learning algorithms and utility functions for ETL and data analysis. It is distributed under the MIT License.
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Feature Engineering & Feature Selection A comprehensive guide [pdf] [markdown] for Feature Engineering and Feature Selection, with implementations and
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models. Hyperactive: is very easy to lear
Collection of AWS Fault Injection Simulator (FIS) experiment templates.
Collection of AWS Fault Injection Simulator (FIS) experiment templates. These templates let you perform chaos engineering experiments on resources (applications, network, and infrastructure) in the AWS Cloud.
Predicting the usefulness of reviews given the review text and metadata surrounding the reviews.
Predicting Yelp Review Quality Table of Contents Introduction Motivation Goal and Central Questions The Data Data Storage and ETL EDA Data Pipeline Da
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)
PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J
Building house price data pipelines with Apache Beam and Spark on GCP
This project contains the process from building a web crawler to extract the raw data of house price to create ETL pipelines using Google Could Platform services.
Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.
Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.
A multi-platform GUI for bit-based analysis, processing, and visualization
A multi-platform GUI for bit-based analysis, processing, and visualization
QED-C: The Quantum Economic Development Consortium provides these computer programs and software for use in the fields of quantum science and engineering.
Application-Oriented Performance Benchmarks for Quantum Computing This repository contains a collection of prototypical application- or algorithm-cent
BoobSnail allows generating Excel 4.0 XLM macro.
BoobSnail allows generating Excel 4.0 XLM macro. Its purpose is to support the RedTeam and BlueTeam in XLM macro generation.
A data preprocessing and feature engineering script for a machine learning pipeline is prepared.
FEATURE ENGINEERING Business Problem: A data preprocessing and feature engineering script for a machine learning pipeline needs to be prepared. It is
Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared
Feature-Engineering Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared. When the dataset