Data preprocessing and feature engineering script for a machine learning pipeline

kemalgunay

Last update: Apr 21, 2022

Related tags

Machine Learning Feature-Engineering

Overview

Feature-Engineering

A data preprocessing and feature engineering script needs to be prepared for a machine learning pipeline.

Once the dataset has been passed through this script, it is expected to be ready for modeling.
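As a rough illustration of what "ready for modeling" could mean, here is a minimal sketch using pandas. The README does not list the actual steps, so everything here is an assumption: the function name `preprocess`, median and mode imputation, the 0/1 encoding of `Sex`, and the choice of columns to drop. Column names follow the variable list (with `SibSp` spelled as in the public Kaggle file, which is also an assumption), and the sample rows are made up.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a model-ready copy of a Titanic-style DataFrame (hypothetical helper)."""
    out = df.copy()
    # Assumed strategy: fill missing ages with the column median.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Assumed strategy: fill a missing port with the most frequent port.
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Encode the two-valued Sex column as 0/1.
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    # One-hot encode the port of embarkation.
    out = pd.get_dummies(out, columns=["Embarked"], dtype=int)
    # Drop identifier-like and free-text columns a model cannot use directly.
    return out.drop(columns=["PassengerId", "Name", "Ticket", "Cabin"], errors="ignore")


# Tiny hand-made sample, not real passenger data.
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 3, 2],
    "Name": ["A", "B", "C", "D"],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, None, 26.0, 30.0],
    "SibSp": [1, 1, 0, 0],
    "Parch": [0, 0, 0, 0],
    "Ticket": ["t1", "t2", "t3", "t4"],
    "Fare": [7.25, 71.28, 7.92, 13.0],
    "Cabin": [None, "C85", None, None],
    "Embarked": ["S", "C", None, "S"],
})
ready = preprocess(sample)
print(ready.head())
```

The function works on a copy, so the raw frame stays untouched and the same call can be reused on a separate test split.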

Dataset Story

The dataset describes the passengers who were aboard the Titanic when it sank. It consists of 768 observations and 12 variables. The target variable is "Survived": 1 indicates that the passenger survived, 0 that the passenger did not.
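Before any preprocessing, it can help to confirm the dataset's size and how the target is distributed. The sketch below is only an illustration: `describe_target` is a hypothetical helper, and the five rows are invented so the block runs on its own. With the real file one would pass the loaded DataFrame instead.

```python
import pandas as pd


def describe_target(df: pd.DataFrame, target: str = "Survived") -> dict:
    """Summarise the frame's size and the 0/1 target (hypothetical helper)."""
    counts = df[target].value_counts()
    return {
        "n_observations": len(df),
        "n_variables": df.shape[1],
        # Because the target is coded 0/1, its mean is the share of survivors.
        "survival_rate": float(df[target].mean()),
        "class_counts": {int(k): int(v) for k, v in counts.items()},
    }


# Invented rows, used only to make the example self-contained.
toy = pd.DataFrame({"Survived": [0, 1, 1, 0, 0], "Pclass": [3, 1, 2, 3, 3]})
summary = describe_target(toy)
print(summary)
```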

Variables

PassengerId: ID of the passenger
Survived: Survival status (0: not survived, 1: survived)
Pclass: Ticket class (1: 1st class (upper), 2: 2nd class (middle), 3: 3rd class (lower))
Name: Name of the passenger
Sex: Gender of the passenger (male, female)
Age: Age in years
SibSp: Number of siblings/spouses aboard the Titanic
- Sibling = Brother, sister, stepbrother, stepsister
- Spouse = Husband, wife (mistresses and fiancés were ignored)
Parch: Number of parents/children aboard the Titanic
- Parent = Mother, father
- Child = Daughter, son, stepdaughter, stepson
- Some children travelled only with a nanny, therefore Parch = 0 for them.
Ticket: Ticket number
Fare: Passenger fare
Cabin: Cabin number
Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
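The variables above lend themselves to a few derived features. The following is a sketch under stated assumptions, not the repository's actual script: the feature names (`FamilySize`, `IsAlone`, `HasCabin`, `Title`) and the helper `add_features` are hypothetical, the column is spelled `SibSp` as in the public Kaggle file, and the regular expression assumes names formatted as "Surname, Title. Given names". The sample rows are invented.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a few derived columns to a Titanic-style DataFrame (hypothetical helper)."""
    out = df.copy()
    # Relatives aboard plus the passenger themselves.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    # 1 when no listed relatives are aboard, else 0.
    out["IsAlone"] = (out["FamilySize"] == 1).astype(int)
    # 1 when a cabin number was recorded, else 0.
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    # Take the text between the comma and the first period as the title.
    out["Title"] = out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    return out


# Invented passengers in the assumed "Surname, Title. Given names" format.
people = pd.DataFrame({
    "Name": ["Doe, Mr. John", "Roe, Mrs. Jane", "Poe, Miss. Ann"],
    "SibSp": [1, 1, 0],
    "Parch": [0, 2, 0],
    "Cabin": [None, "C85", None],
})
featured = add_features(people)
print(featured[["FamilySize", "IsAlone", "HasCabin", "Title"]])
```

Note that `FamilySize` counts only the relatives recorded in `SibSp` and `Parch`, so a child travelling with a nanny would still appear as alone.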

REFERENCE: Data Science and ML Boot Camp, 2021, Veri Bilimi Okulu (https://www.veribilimiokulu.com/)


