Heart Disease Diagnosis Using Machine Learning / Roccotelli, Michele; Ali, Wasim A.; Fanti, Maria Pia. - (2025), pp. 1-12. (4th International Conference on Innovation in Engineering, ICIE 2025 cze 2025) [10.1007/978-3-031-94223-5_1].
Heart Disease Diagnosis Using Machine Learning
Roccotelli, Michele; Ali, Wasim A.; Fanti, Maria Pia
2025
Abstract
The growing impact of heart disease on global health calls for improved diagnostic techniques that enable more timely and accurate diagnosis. Machine Learning (ML) has demonstrated significant potential in supporting the identification and classification of heart diseases, thanks to its ability to analyze large volumes of data and learn complex patterns. The aim of this work is to explore the application of ML algorithms to heart disease diagnosis, using two datasets available on the web: ‘Heart Disease Cleveland’ and ‘Heart Failure Prediction Dataset’. Each dataset is enriched with 1000 synthetic instances generated by a purpose-designed Generative Adversarial Network model. Several ML-based classification models, including Random Forest, Logistic Regression, Stochastic Gradient Descent and XGBoost, are compared on standard performance metrics. In addition, a stacking ensemble that combines these four models has been developed and tested. The results show the effectiveness of ML models in the diagnosis of cardiac diseases, with the stacking model standing out for its superior performance on the majority of metrics.
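The abstract does not list which "standard performance metrics" are used to compare the classifiers. As a minimal sketch, assuming a binary task (1 = disease, 0 = no disease) and the commonly reported accuracy, precision, recall and F1 score, such a comparison could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = disease, 0 = no disease)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 (assumed metric set)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only (not data from either dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Applying the same function to the predictions of each model, including the stacking ensemble, on a shared held-out test set would give a like-for-like comparison. In a diagnostic setting recall deserves particular attention, since a false negative corresponds to a missed case of disease.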

