Built various Machine Learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting and XGBoost. etc)

Overview

Heart-Failure-Prediction

Built various Machine Learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting and XGBoost. etc). Structured a custom ensemble model and a neural network. Found a outperformed model for heart failure prediction accuracy of 88 percent.

Introduction

Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Four out of 5CVD deaths are due to heart attacks and strokes. Heart failure is a common event caused by CVDs and this dataset contains 11 features that can be used to predict a possible heart disease.

People with cardiovascular disease or who are at high cardiovascular risk need early detection and management wherein a machine learning model can be of great help.

Table of Contents

  • [Data]

    • [What We need to do]
  • [Exploratory Data Analysis]

    • [Target Variable]
    • [Features]
  • [Model Selection]

    • [Model Creation and Comparison]
    • [Bulid a custom ensemble (superlearner) with best three of models]
    • [Neural Networks]
    • [Feature Importance]
  • [Conclusion]

DATA

1 Age: Age of the patient [years]

2 Sex: Sex of the patient [M: Male, F: Female]

3 ChestPainType: [TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic]

4 RestingBP: Resting blood pressure [mm Hg]

5 Cholesterol: Serum cholesterol [mm/dl]

6 FastingBS: Fasting blood sugar [1: if FastingBS > 120 mg/dl, 0: otherwise]

7 RestingECG: Resting electrocardiogram results [Normal: Normal, ST: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), LVH: showing probable or definite left ventricular hypertrophy by Estes' criteria]

8 MaxHR: Maximum heart rate achieved [Numeric value between 60 and 202]

9 ExerciseAngina: Exercise-induced angina [Y: Yes, N: No]

10 Oldpeak: ST [Numeric value measured in depression] (

11 ST_Slope: The slope of the peak exercise ST segment [Up: upsloping, Flat: flat, Down: downsloping]

12 HeartDisease: Output class [1: heart disease, 0: Normal]

Reference: https://www.kaggle.com/fedesoriano/heart-failure-prediction

You might also like...
TensorFlow Decision Forests (TF-DF) is a collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models.

TensorFlow Decision Forests (TF-DF) is a collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models. The library is a collection of Keras models and supports classification, regression and ranking. TF-DF is a TensorFlow wrapper around the Yggdrasil Decision Forests C++ libraries. Models trained with TF-DF are compatible with Yggdrasil Decision Forests' models, and vice versa.

Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Automatically create Faiss knn indices with the most optimal similarity search parameters.

It selects the best indexing parameters to achieve the highest recalls given memory and query speed constraints.

Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with Dask to handle millions of rows.
Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Now updated with Dask to handle millions of rows.

Auto_TS: Auto_TimeSeries Automatically build multiple Time Series models using a Single Line of Code. Now updated with Dask. Auto_timeseries is a comp

XGBoost + Optuna

AutoXGB XGBoost + Optuna: no brainer auto train xgboost directly from CSV files auto tune xgboost using optuna auto serve best xgboot model using fast

Mortality risk prediction for COVID-19 patients using XGBoost models

Mortality risk prediction for COVID-19 patients using XGBoost models Using demographic and lab test data received from the HM Hospitales in Spain, I b

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and A* Search (Manhattan Distance Heuristic)
A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and A* Search (Manhattan Distance Heuristic)

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and the A* Search (using the Manhattan Distance Heuristic)

CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning applications.

SmartSim Example Zoo This repository contains CrayLabs and user contibuted examples of using SmartSim for various simulation and machine learning appl

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

Automated Machine Learning Pipeline for tabular data. Designed for predictive maintenance applications, failure identification, failure prediction, condition monitoring, etc.

Owner
Chris Yuan
Motivated, open-minded, and detail-oriented student currently working towards a degree in Data Science.
Chris Yuan
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

CatBoost 6.9k Jan 5, 2023
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 14.5k Jan 7, 2023
Machine Learning Algorithms ( Desion Tree, XG Boost, Random Forest )

implementation of machine learning Algorithms such as decision tree and random forest and xgboost on darasets then compare results for each and implement ant colony and genetic algorithms on tsp map, play blackjack game and robot in grid world and evaluate reward for it

Mohamadreza Rezaei 1 Jan 19, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 23.6k Jan 3, 2023
Predicting diabetes over a five year period using logistic regression and the Pima First-Nation dataset

Diabetes This script uses the Pima First Nations dataset to create a model to predict whether or not an individual will develop Diabetes Mellitus Type

null 1 Mar 28, 2022
Diabetes Prediction with Logistic Regression

Diabetes Prediction with Logistic Regression Exploratory Data Analysis Data Preprocessing Model & Prediction Model Evaluation Model Validation: Holdou

AZİZE SULTAN PALALI 2 Oct 23, 2021
A logistic regression model for health insurance purchasing prediction

Logistic_Regression_Model A logistic regression model for health insurance purchasing prediction This code is using these packages, so please make sur

ShawnWang 1 Nov 29, 2021
Houseprices - Predict sales prices and practice feature engineering, RFs, and gradient boosting

House Prices - Advanced Regression Techniques Predicting House Prices with Machine Learning This project is build to enhance my knowledge about machin

null 1 Jan 1, 2022
Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Regularized Greedy Forest Regularized Greedy Forest (RGF) is a tree ensemble machine learning method described in this paper. RGF can deliver better r

RGF-team 363 Dec 14, 2022
It is a forest of random projection trees

rpforest rpforest is a Python library for approximate nearest neighbours search: finding points in a high-dimensional space that are close to a given

Lyst 211 Dec 29, 2022