Leveraging ensemble methods with pretrained CNNs for image-based sign language classification
Yasir Altaf1, Abdul Wahid1 and Mudasir Manzoor Kirmani2
Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Ganderbal, Srinagar, 190025, Jammu and Kashmir, India2
Corresponding Author : Yasir Altaf
Received : 19-April-2025; Revised : 23-May-2026; Accepted : 25-May-2026
Abstract
Sign language serves as a vital means of communication for individuals who are deaf and hard of hearing (DHH), providing an effective visual alternative when spoken interaction is impractical or impossible. Among the various modalities of sign language, hand gestures constitute the most expressive and frequently used component for conveying meaning. However, accurately recognizing and classifying these gestures remains a significant challenge in the development of reliable assistive communication technologies. This study proposes an ensemble convolutional neural network (ECNN) framework that integrates three high-performing pre-trained convolutional neural network (CNN) architectures: densely connected convolutional network 201 (DenseNet201), residual network 152 (ResNet152), and visual geometry group 19 (VGG19), through a logistic regression (LR)–based meta-learner. The ensemble leverages the complementary feature representations of its constituent models using adaptive feature distillation and selection (AFDS) to improve recognition accuracy and generalization across multiple sign language datasets. Experimental evaluations demonstrate that the proposed ECNN achieves superior classification performance, attaining accuracies of 99.86% and 98.65% on the American Sign Language (ASL) and Indian Sign Language (ISL) datasets, respectively. Furthermore, the model generalizes effectively to cross-domain benchmarks, achieving accuracies of 98.0% on the National University of Singapore (NUS) Hand Posture dataset and 96.0% on the Arabic Sign Language (ArSL) dataset, thereby outperforming existing state-of-the-art approaches. These results validate the robustness, scalability, and cross-lingual adaptability of the proposed ensemble model, highlighting its potential as a reliable foundation for real-time sign language recognition (SLR) and assistive communication systems.
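The stacking idea described in the abstract, in which several base CNNs feed an LR-based meta-learner, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the base models are represented only by made-up class-probability vectors (no DenseNet201, ResNet152, or VGG19 is actually run), the task has two classes, the meta-learner is a plain multinomial logistic regression trained by batch gradient descent, and the AFDS step is omitted entirely. All function names and the `HELD_OUT` data are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def stack_features(per_model_probs):
    """Concatenate the class-probability vectors from every base model."""
    return [p for probs in per_model_probs for p in probs]

def class_scores(weights, biases, x):
    """Linear score per class: w_c . x + b_c."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def train_meta_learner(features, labels, num_classes, lr=0.5, epochs=500):
    """Fit multinomial logistic regression with batch gradient descent."""
    dim = len(features[0])
    n = len(features)
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(features, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(num_classes):
                # Cross-entropy gradient: predicted probability minus one-hot target.
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for j in range(dim):
                    grad_w[c][j] += err * x[j]
        for c in range(num_classes):
            biases[c] -= lr * grad_b[c] / n
            for j in range(dim):
                weights[c][j] -= lr * grad_w[c][j] / n
    return weights, biases

def predict(weights, biases, x):
    """Return the index of the most probable class."""
    probs = softmax(class_scores(weights, biases, x))
    return max(range(len(probs)), key=lambda c: probs[c])

# Hypothetical held-out outputs: for each image, one [P(class 0), P(class 1)]
# vector per base model (three models), followed by the true label.
HELD_OUT = [
    ([[0.9, 0.1], [0.6, 0.4], [0.8, 0.2]], 0),
    ([[0.7, 0.3], [0.4, 0.6], [0.9, 0.1]], 0),
    ([[0.2, 0.8], [0.3, 0.7], [0.4, 0.6]], 1),
    ([[0.4, 0.6], [0.1, 0.9], [0.6, 0.4]], 1),
]

features = [stack_features(probs) for probs, _ in HELD_OUT]
labels = [label for _, label in HELD_OUT]
weights, biases = train_meta_learner(features, labels, num_classes=2)

# Classify a new image from its (again made-up) base-model outputs.
new_image = stack_features([[0.95, 0.05], [0.80, 0.20], [0.90, 0.10]])
print(predict(weights, biases, new_image))
```

The second and fourth toy images show why a learned combiner can help: one base model disagrees with the other two, and the meta-learner weighs each model's output rather than relying on any single prediction. In practice the meta-learner would be fit on predictions for data the base models were not trained on, so that it does not simply learn their overfitting.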
Keywords
Sign language recognition (SLR), Ensemble convolutional neural network, Adaptive feature distillation and selection (AFDS), Hand gesture classification, Assistive communication systems.
Cite this article
Altaf Y, Wahid A, Kirmani MM. Leveraging ensemble methods with pretrained CNNs for image-based sign language classification. International Journal of Advanced Technology and Engineering Exploration. 2026;13(138):701-730. DOI : 10.19101/IJATEE.2025.121220510
