Machine Learning Analysis of Factors Contributing to Diabetes Development
Keywords: diabetes, machine learning, predictive models, performance evaluation, feature selection
Diabetes is a chronic condition that affects how the body processes blood sugar. Early diagnosis and management are essential for preventing its complications. Machine Learning (ML) techniques offer an effective means of diagnosing diabetes accurately by identifying key risk factors and developing predictive models. In this study, we assess the performance of 11 ML algorithms on four diabetes prediction datasets, using the top 2, the top 3, and all attributes. We use k-fold cross-validation to obtain robust and generalizable results, and we report a set of standard evaluation metrics: accuracy, precision, recall, F1-score, and the area under the Receiver Operating Characteristic curve (ROC AUC). Our analysis aims to determine the optimal number of features and to assess how performance changes as features are added. Notably, some ML classifiers achieve satisfactory classification and predictive abilities using only the top 2 or 3 features. Furthermore, performance varies across algorithms and datasets, which highlights the need to assess multiple models to identify the most suitable one. These findings support the creation of dependable models that improve patient outcomes by combining effective algorithms with pertinent features.
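The evaluation protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the synthetic data from `make_classification`, the ANOVA F-score ranking (`f_classif`), the two example classifiers, the 5-fold setting, and the helper name `evaluate` are all assumptions, since the abstract does not specify the ranking method, the fold count, or which 11 algorithms were used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Metrics named in the abstract, as scikit-learn scorer strings.
METRICS = ["accuracy", "precision", "recall", "f1", "roc_auc"]


def evaluate(X, y, models, feature_counts, n_splits=5, seed=0):
    """Cross-validate each model on the top-k features (hypothetical helper).

    Feature selection sits inside the pipeline, so the ranking is
    recomputed on the training part of every fold and never sees the
    held-out part.
    """
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        for k in feature_counts:
            pipe = make_pipeline(
                StandardScaler(),
                SelectKBest(f_classif, k=k),  # k may be an int or "all"
                model,
            )
            scores = cross_validate(pipe, X, y, cv=cv, scoring=METRICS)
            # Average each metric over the folds.
            results[(name, k)] = {
                m: float(scores[f"test_{m}"].mean()) for m in METRICS
            }
    return results


# Synthetic stand-in for a diabetes dataset (assumption: 8 numeric
# attributes and a binary outcome).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3, random_state=0
)

# Two illustrative classifiers; the study compares 11.
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = evaluate(X, y, models, feature_counts=[2, 3, "all"])

for (name, k), metrics in results.items():
    summary = ", ".join(f"{m}={v:.3f}" for m, v in metrics.items())
    print(f"{name:>7} | k={k!s:>3} | {summary}")
```

Comparing the rows for k=2, k=3, and k="all" within each model gives the kind of view the abstract refers to: how much each metric changes as features are added, and whether a small feature subset is already sufficient for a given classifier.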
Copyright (c) 2024 Edgar Ceh-Varela, Larry Maes, Sarbagya Ratna Shakya
This work is licensed under a Creative Commons Attribution 4.0 International License.