Overview of Credit Card Analysis
In this project, the RandomOverSampler and SMOTE algorithms were used for oversampling, the ClusterCentroids algorithm was used for undersampling, and the SMOTEENN algorithm was applied as a combined over- and undersampling approach to the credit card dataset from LendingClub. Two machine learning models, BalancedRandomForestClassifier and EasyEnsembleClassifier, were then used to predict credit risk.
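To make the two oversampling ideas concrete, here is a minimal standard-library sketch of what random oversampling and SMOTE-style interpolation do conceptually. It is not the imbalanced-learn implementation used in the project; the function names `random_oversample` and `smote_like`, the toy rows, and the `high_risk`/`low_risk` labels are all illustrative assumptions.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_res, y_res = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_res.append(rng.choice(rows))
            y_res.append(label)
    return X_res, y_res


def smote_like(X_min, n_new, k=2, seed=0):
    """Create n_new synthetic rows, each interpolated between a minority
    row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        # Rank the other minority rows by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in X_min if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical toy data: two features per loan, imbalanced labels.
X = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [5.5, 8.5], [5.2, 8.1]]
y = ["high_risk", "high_risk", "low_risk", "low_risk", "low_risk", "low_risk"]

X_res, y_res = random_oversample(X, y)
minority = [x for x, lab in zip(X, y) if lab == "high_risk"]
synthetic = smote_like(minority, n_new=3, k=1)
```

Random oversampling only repeats existing minority rows, while the SMOTE-style step produces new rows that lie between existing ones. In this sketch both are applied to the whole toy set for brevity; in practice, resampling is normally applied to the training split only, so that the test split keeps the original class distribution.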
Results
1. Naive Random Oversampling
2. SMOTE Oversampling
3. Undersampling
4. Combination (Over and Under) Sampling
5. Balanced Random Forest Classifier
6. Easy Ensemble AdaBoost Classifier
Summary
1. Comparing the resampling techniques with the ensemble techniques, the Easy Ensemble AdaBoost Classifier achieved the highest credit risk prediction accuracy, at 93%. The Easy Ensemble AdaBoost Classifier is therefore recommended to reduce bias in prediction.
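The summary does not state which accuracy metric the 93% refers to. On imbalanced data a common choice is balanced accuracy, the mean of the per-class recalls, because plain accuracy can look high even when the minority class is never predicted. The sketch below illustrates the difference on hypothetical toy labels; the helper name `balanced_accuracy` and the data are assumptions, not the project's results.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical toy case: a model that always predicts the majority class.
y_true = ["high_risk"] * 2 + ["low_risk"] * 8
y_pred = ["low_risk"] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain_accuracy:.2f}")
print(f"balanced accuracy: {balanced:.2f}")
```

Here the always-majority model is right on 8 of 10 rows, yet its recall on the high-risk class is zero, which the balanced score exposes.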