Diabetes-Feature-Engineering
Aim
The aim is to develop a machine learning model that predicts whether a person has diabetes, given their recorded characteristics. The necessary data analysis and feature engineering steps are carried out before the model is developed.
Information about the dataset
The dataset is part of a larger dataset held at the National Institute of Diabetes and Digestive and Kidney Diseases in the USA. It was collected for diabetes research on Pima Indian women aged 21 and over living in Phoenix, the fifth-largest city in the state of Arizona. The target variable is "Outcome": 1 indicates a positive diabetes test result, 0 a negative one.
Variables
Pregnancies: Number of pregnancies
Glucose: 2-hour plasma glucose concentration in an oral glucose tolerance test
BloodPressure: Diastolic blood pressure (mm Hg)
SkinThickness: Triceps skin fold thickness (mm)
Insulin: 2-hour serum insulin (mu U/ml)
BMI: Body mass index (weight in kg / (height in m)^2)
DiabetesPedigreeFunction: A function that scores the likelihood of diabetes based on family history
Age: Age (years)
Outcome: Has the disease (1) or not (0)
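As a minimal sketch of one feature engineering step that could follow from the variable list, the snippet below treats a value of 0 in the clinical measurement columns as a missing reading and fills it with the column median. Both the zero-means-missing rule and the median imputation are assumptions made for illustration, not steps stated in this README. The helper names (mark_zeros_as_missing, fill_missing_with_median) and the sample rows are hypothetical, and the code uses only the Python standard library so it runs without extra dependencies.

```python
from statistics import median

# Assumption: a 0 in these columns is a missing reading, not a real measurement.
# Pregnancies is deliberately excluded, because 0 is a valid value there.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def mark_zeros_as_missing(rows, columns=ZERO_AS_MISSING):
    """Return copies of the rows with 0 replaced by None in the given columns."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in columns:
            if new_row[col] == 0:
                new_row[col] = None
        cleaned.append(new_row)
    return cleaned


def fill_missing_with_median(rows, columns=ZERO_AS_MISSING):
    """Return copies of the rows with None filled by each column's median."""
    filled = [dict(row) for row in rows]
    for col in columns:
        observed = [r[col] for r in filled if r[col] is not None]
        if not observed:
            continue  # no observed values to compute a median from
        col_median = median(observed)
        for r in filled:
            if r[col] is None:
                r[col] = col_median
    return filled


# Made-up rows for illustration only; they are not taken from the dataset.
sample = [
    {"Pregnancies": 2, "Glucose": 120, "BloodPressure": 70, "SkinThickness": 30,
     "Insulin": 0, "BMI": 32.0, "DiabetesPedigreeFunction": 0.4, "Age": 35, "Outcome": 1},
    {"Pregnancies": 0, "Glucose": 100, "BloodPressure": 0, "SkinThickness": 20,
     "Insulin": 80, "BMI": 25.0, "DiabetesPedigreeFunction": 0.2, "Age": 24, "Outcome": 0},
    {"Pregnancies": 0, "Glucose": 0, "BloodPressure": 80, "SkinThickness": 0,
     "Insulin": 120, "BMI": 28.0, "DiabetesPedigreeFunction": 0.6, "Age": 41, "Outcome": 0},
]

marked = mark_zeros_as_missing(sample)
filled = fill_missing_with_median(marked)
```

Keeping the marking and the filling as separate steps makes it possible to inspect how many values are missing per column before choosing an imputation strategy.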