Enhancing transparency in breast cancer diagnosis through LIME-driven machine learning models
Zulfikar Ali Ansari1, Md Shamsul Haque Ansari2, Alka Singh3, Naziya Hussain4, Sandeep Keshav3, Shaik Sanheera5 and Anwar Ahamed Shaikh6
School of Computer Science, UPES Dehradun, Misraspatti, Uttarakhand, India2
School of Computer Applications, Noida Institute of Engineering and Technology, Greater Noida, Uttar Pradesh, India3
Department of Computer Science and Engineering, SVKM's NMIMS Deemed to be University, Shirpur Campus, Mumbai, Maharashtra, India4
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, 522302, India5
Department of Computer Science and Engineering, School of Engineering and Technology, Sanjivani University, Kopergaon, Maharashtra, India6
Corresponding Author: Zulfikar Ali Ansari
Received: 03-October-2024; Revised: 12-March-2026; Accepted: 24-March-2026
Abstract
Breast cancer is a serious global health concern, and there is a critical need for diagnostic tools that combine strong predictive capability with interpretability. This study proposes a diagnostic framework based on machine learning (ML) algorithms, integrated with the local interpretable model-agnostic explanations (LIME) technique, to improve the interpretability of model predictions. Six ML models are considered for breast cancer diagnosis: random forest (RF), decision tree (DT), naïve Bayes (NB), extra trees (ET), extreme gradient boosting (XGBoost), and CatBoost (CB). The models are evaluated on two benchmark datasets obtained from the University of California, Irvine (UCI) ML repository. Performance is assessed using accuracy, precision, recall, F1-score, Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (ROC-AUC). In addition, a fidelity score is computed to quantify the agreement between model predictions and their explanations. The results indicate that the ensemble models exhibit strong predictive performance. Among all models, the ET classifier achieves the best results on both datasets, with accuracies of 97.66% on Dataset 1 and 97.62% on Dataset 2. Although the ensemble models demonstrate superior predictive capability, the DT model attains the highest fidelity score, indicating greater interpretability. Overall, the study highlights the importance of balancing predictive performance with interpretability in ML-based diagnostic systems.
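As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes accuracy, precision, recall, F1-score, and MCC from confusion-matrix counts. It is not the authors' code. The function names, the confusion-matrix counts, and the label lists are hypothetical, and the `fidelity` function assumes one common definition (the fraction of instances on which the explanation's surrogate prediction matches the model's prediction), which may differ from the definition used in the study.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient: ranges from -1 to +1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


def fidelity(model_labels, surrogate_labels):
    """Assumed definition: share of instances where the explanation's
    surrogate prediction agrees with the black-box model's prediction."""
    if len(model_labels) != len(surrogate_labels):
        raise ValueError("label lists must have the same length")
    agree = sum(m == s for m, s in zip(model_labels, surrogate_labels))
    return agree / len(model_labels)


# Hypothetical counts and labels, chosen only to exercise the functions;
# they are not results from the study.
example = classification_metrics(tp=60, fp=2, fn=3, tn=106)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
print(f"fidelity: {fidelity([1, 0, 1, 1], [1, 0, 0, 1]):.2f}")
```

MCC is reported alongside accuracy because it uses all four confusion-matrix cells, which makes it less sensitive to class imbalance than accuracy alone.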
Keywords
Breast cancer diagnosis, Machine learning, Explainable AI (XAI), LIME, Ensemble learning, Model interpretability.
Cite this article
Ansari ZA, Ansari MSH, Singh A, Hussain N, Keshav S, Sanheera S, Shaikh AA. Enhancing transparency in breast cancer diagnosis through LIME-driven machine learning models. International Journal of Advanced Technology and Engineering Exploration. 2026;13(136):410-427. DOI: 10.19101/IJATEE.2024.111101806
