Τεχνικές data mining και προβλεπτική αναλυτικής σακχαρώδους διαβήτη
Data mining techniques and predictive analysis for diabetes mellitus
Keywords
Diabetes mellitus; Machine learning; Logistic regression; Random forest; K-nearest neighbors; Normalization; Scaling technique; Feature engineering
Abstract
Diabetes mellitus is a prevalent chronic disease with significant implications for public health and
individual well-being. In recent years, machine learning (ML) techniques have emerged as powerful tools
for predicting diabetes outcomes, offering the potential to improve early detection. This paper presents a
comprehensive evaluation of various ML models, including logistic regression (LR), support vector
machines (SVM), random forest (RF), k-nearest neighbors (KNN), multilayer perceptrons (MLPs), and
gradient boosting (GB), in the context of diabetes prediction. Making extensive use of the
capabilities of the scikit-learn library in Python, we analyze the performance of these models using the
PIMA dataset and investigate the impact of preprocessing techniques such as Normalization,
Standardization, and Feature Engineering (Binning and One-hot encoding techniques). Our findings highlight
the random forest (RF) model as the most efficient algorithm, achieving 85% accuracy when combined
with feature engineering (Binning and One-hot encoding techniques). This research contributes to the
growing body of knowledge on ML-based healthcare analytics by providing insight into the strengths and
limitations of different algorithms for diabetes prediction. Furthermore, our study offers practical guidance
for healthcare professionals and researchers, facilitating early detection and personalized interventions
for improved diabetes management. By identifying future research directions, including personalized
medicine and improved preprocessing methods, this work aims to stimulate continued innovation in the
field of ML-based healthcare analytics and promote interdisciplinary collaboration to address the challenges of chronic disease management.
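
The comparison described in the abstract (six classifiers, each combined with a preprocessing step) can be sketched with scikit-learn pipelines. This is only an illustrative sketch, not the authors' code. It rests on several assumptions: a synthetic dataset from `make_classification` stands in for the PIMA data (768 rows, 8 numeric features), `MinMaxScaler` is taken to mean "Normalization", `KBinsDiscretizer` with one-hot output stands in for the binning and one-hot encoding step, and all hyperparameters are library defaults or arbitrary choices. The accuracies it produces are therefore not comparable to the 85% reported above.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer, MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic stand-in for the PIMA dataset (768 rows, 8 features).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumption: these three transformers approximate the preprocessing
# techniques named in the abstract; the bin count is arbitrary.
preprocessors = {
    "normalization": MinMaxScaler(),
    "standardization": StandardScaler(),
    "binning+one-hot": KBinsDiscretizer(
        n_bins=4, encode="onehot", strategy="quantile"
    ),
}

# The six model families listed in the abstract, with illustrative settings.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),
}

# Fit every (preprocessing, model) pair and record held-out accuracy.
results = {}
for prep_name, prep in preprocessors.items():
    for model_name, model in models.items():
        # clone() gives each pipeline fresh, unfitted estimators.
        pipe = make_pipeline(clone(prep), clone(model))
        pipe.fit(X_train, y_train)
        results[(prep_name, model_name)] = pipe.score(X_test, y_test)

for (prep_name, model_name), acc in sorted(results.items()):
    print(f"{prep_name:16s} {model_name:4s} accuracy={acc:.3f}")
```

Placing the preprocessing step inside the pipeline means it is fitted on the training split only, which avoids leaking test-set statistics into the scaler or the bin edges. A single train/test split is used here for brevity; cross-validation would give a more stable comparison.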