Enhancing Early Detection of Type II Diabetes with Machine Learning: A Performance Evaluation
DOI:
https://doi.org/10.11113/oiji2025.13n1.328
Keywords:
Diabetes, Machine Learning, Disease Prediction, Feature Selection, Risk Factors
Abstract
Type II diabetes is a common condition that often takes a long time to detect. Detection relies heavily on clinical results from medical professionals, which require significant time, manpower, and expense. Machine learning findings may serve as a reference for gaining a preliminary understanding of the disease. It is crucial to achieve early detection of type II diabetes in a feasible and efficient manner for broader populations. This study aims to evaluate the performance of selected machine learning models for type II diabetes prediction. The 2021 Behavioral Risk Factor Surveillance System dataset was used in this study. The five attributes with the highest Cramér's V correlation were selected: high blood pressure, high cholesterol, BMI, general health, and walking difficulty. Five machine learning models were identified through a literature review and analyzed in the study: (i) Decision Tree, (ii) Neural Network, (iii) Random Forest, (iv) Logistic Regression, and (v) AdaBoost. The performance of each model was evaluated based on accuracy, precision, sensitivity, and F1-score. All the algorithms showed acceptable performance, with reported scores ranging from 68.8% to 74.7%. Neural Network showed the highest accuracy and F1-score, at 71.0% and 71.9%, respectively. Decision Tree had the highest sensitivity among all the algorithms, at 74.7%. This study suggests Neural Network as the algorithm with the best overall performance for diabetes prediction and Decision Tree as the most suitable algorithm specifically for diabetes screening, where sensitivity matters most. Preliminary diagnosis based on the interpretation of risk factors may greatly reduce the workload of clinical professionals by identifying the high-risk group for type II diabetes that should proceed to further clinical diagnosis.
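The two quantitative steps named in the abstract, ranking attributes by Cramér's V and scoring a classifier by accuracy, precision, sensitivity, and F1-score, can be illustrated with a minimal sketch. This is not the study's code: the function names `cramers_v` and `classification_metrics` are hypothetical, the sketch uses the uncorrected form of Cramér's V (the abstract does not say whether a bias correction was applied), and it assumes every row and column of the contingency table has a nonzero total.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for an r x c contingency table.

    `table` is a list of rows of counts, e.g. attribute level x diabetes status.
    Assumes every row total and column total is nonzero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum over cells of (O - E)^2 / E
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by n * (min(r, c) - 1) so the result lies in [0, 1]
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall), and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }
```

In a screening setting, sensitivity is the fraction of true diabetes cases that the model flags, which is why a model with the highest sensitivity can be preferred for screening even when another model has higher accuracy.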