Hepatitis C Blood-Based Detection
Final project for machine learning (CSC 590).
Dataset from Kaggle.
Using data from previous hepatitis C blood panels for patients aged 19 to 77, a classifier can estimate how far the infection has progressed. Such a tool can help doctors diagnose patients properly and act early to prevent serious illness and death. The data set contains lab records from blood donors and from patients diagnosed with hepatitis C, fibrosis, and cirrhosis. The K-Nearest Neighbors classifier can be applied to this data in several ways. The first analysis asks a simple question: can the classifier predict a patient's level of hepatitis C infection? Given all attributes, the classifier predicts the infection level with 92.85% accuracy. Further iterations could improve the classifier, for example with a larger data set or with newer, more informative blood work. For now, the results suggest that, from a data mining standpoint, current diagnostic blood panels carry enough information to classify most patients correctly.
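To illustrate the idea behind K-Nearest Neighbors, here is a minimal, dependency-free sketch in Python. It is not the project's actual code, which may well use a library implementation. The function names (knn_predict, accuracy), the two-feature rows, and the labels are hypothetical and made up for illustration; none of the values come from the Kaggle data set.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Euclidean distance from the query to every training row.
    distances = [
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    ]
    # Keep the k closest rows and count how often each label appears.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical values for two lab features; NOT taken from the real data set.
train_X = [(0.9, 1.0), (1.1, 0.8), (1.0, 1.2), (3.0, 3.1), (3.2, 2.9), (2.9, 3.3)]
train_y = ["donor", "donor", "donor", "cirrhosis", "cirrhosis", "cirrhosis"]
test_X = [(1.0, 1.0), (3.1, 3.0)]
test_y = ["donor", "cirrhosis"]

print(accuracy(train_X, train_y, test_X, test_y, k=3))
```

With real blood panel data, the features sit on very different scales, so they would typically be standardized before computing distances, and accuracy would be measured on records held out from training.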