Fuzzy-driven gradient boosting for interpretable cardiovascular disease risk prediction
Fadhillah Azmi1, Amir Saleh2, Nurul Khairina3, Surya Alpacino Manurung1 and Muhammad Fadhil1
Department of Computer Engineering, Politeknik Negeri Medan, Jl. Almamater No. 1, Padang Bulan, Medan 20155, North Sumatra, Indonesia2
Department of Informatics Engineering, Universitas Medan Area, Jl. Kolam No. 1, Medan Estate, Percut Sei Tuan, Medan 20223, North Sumatra, Indonesia3
Corresponding Author: Fadhillah Azmi
Received: 30-Jun-2025; Revised: 15-Feb-2026; Accepted: 17-Feb-2026
Abstract
Cardiovascular disease (CVD) remains the leading cause of mortality worldwide, which calls for predictive models that are both accurate and clinically interpretable. However, most machine learning (ML) models operate as “black boxes,” which limits their usefulness in practical medical decision-making. This paper presents a fuzzy-driven gradient boosting (GB) framework that incorporates fuzzy logic into the ML workflow to enhance interpretability while maintaining predictive performance. Using the fuzzy c-means (FCM) method, triangular membership functions (MFs) are constructed to convert five principal clinical variables, namely age, height, weight, systolic blood pressure (ap_hi), and diastolic blood pressure (ap_lo), into linguistically interpretable categories. Each data point is transformed into a set of fuzzy membership values derived from the corresponding MFs, representing its partial association with the different linguistic categories. The framework is evaluated using 10-fold cross-validation (CV) on three feature types: raw data, fuzzy-linguistic labels, and fuzzy MF values, with the last being the core fuzzy representation. The experimental results indicate that the hybrid model achieves the best performance, with an accuracy of 73.37% and an area under the receiver operating characteristic curve (AUROC) of 0.800 on Dataset I, and an accuracy of 98.83% with an AUROC of 0.996 on Dataset II. The fuzzy transformation substantially improved model transparency and provided clinically meaningful explanations. Overall, the proposed framework offers a promising direction for the development of interpretable, reliable, and high-performance artificial intelligence (AI)-based clinical decision support systems (CDSS) for predicting CVD risk.
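The fuzzification step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the cluster centers for a variable are already available (for example from FCM) and places one triangular MF on each center, with its feet at the neighbouring centers and the outermost MFs treated as shoulders. The function names, the example centers for age, and the labels are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 1 at the peak, falling to 0 at the left and right feet."""
    if x == peak:
        return 1.0
    if left < x < peak:
        return (x - left) / (peak - left)
    if peak < x < right:
        return (right - x) / (right - peak)
    return 0.0


def fuzzify(x: float, centers: Sequence[float]) -> List[float]:
    """Return one membership value per center, in ascending order of center.

    Assumption: each triangle peaks at a center and has its feet at the
    neighbouring centers; the outermost MFs act as shoulders.
    """
    cs = sorted(centers)
    # Clamp x so that values beyond the outer centers belong fully to the edge category.
    x = min(max(x, cs[0]), cs[-1])
    memberships = []
    for i, peak in enumerate(cs):
        left = cs[i - 1] if i > 0 else peak
        right = cs[i + 1] if i < len(cs) - 1 else peak
        memberships.append(triangular(x, left, peak, right))
    return memberships


def linguistic_label(x: float, centers: Sequence[float], labels: Sequence[str]) -> str:
    """Pick the label with the highest membership (labels follow ascending centers)."""
    m = fuzzify(x, centers)
    return labels[m.index(max(m))]


# Hypothetical centers for age in years; these are not values reported in the paper.
AGE_CENTERS = [40.0, 52.0, 62.0]
AGE_LABELS = ["young", "middle", "old"]
```

With these assumed centers, `fuzzify(46, AGE_CENTERS)` yields the fuzzy MF values for a 46-year-old, and `linguistic_label(60, AGE_CENTERS, AGE_LABELS)` yields the corresponding fuzzy-linguistic label. Either representation could then be passed to a gradient boosting classifier in place of the raw value.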
Keywords
Cardiovascular disease (CVD), Fuzzy logic, Gradient boosting, Fuzzy c-means (FCM), Clinical decision support systems (CDSS).
Cite this article
Azmi F, Saleh A, Khairina N, Manurung SA, Fadhil M. Fuzzy-driven gradient boosting for interpretable cardiovascular disease risk prediction. International Journal of Advanced Technology and Engineering Exploration. 2026;13(135):225-244. DOI: 10.19101/IJATEE.2025.121220881