Ensemble learning–based automatic text summarization using HIN-MELM-AE and DePori for multi-document and multilingual texts
Sunil Upadhyay1 and Hemant Kumar Soni1
Corresponding Author : Sunil Upadhyay
Received : 19-December-2024; Revised : 13-December-2025; Accepted : 16-December-2025
Abstract
Automatic text summarization (ATS), which involves generating a concise and accurate summary from a longer text document, has emerged in response to the rapid growth of textual information. Most existing studies have not adequately addressed multi-document (MD) and multilingual summarization. To overcome these limitations, this paper proposes an improved ensemble learning–based ATS framework incorporating slang filtering using a hyperfan-IN multilayer extreme learning machine autoencoder (HIN-MLELM-AE) and the Dehghani poor and rich optimization (DePori) algorithm. Initially, the text documents are collected and preprocessed. Subsequently, slang identification and filtering are performed on the preprocessed text using the DePori optimization technique. The slang-filtered text is then transformed using info-squared fuzzy c-means (InS-FCM) clustering, latent Dirichlet allocation (LDA)–based topic modeling, term frequency–inverse document frequency (TF-IDF) analysis, and frequent term selection. Part-of-speech (POS) tagging is carried out on the transformed data using a sememe similarity induced hidden Markov model (SemSim-HMM). Next, significant entities are extracted from the transformed and POS-tagged text. For entity vectorization, sentence bidirectional encoder representations from transformers (SBERT) is employed. Finally, ATS is performed using an ensemble of models, including HIN-MLELM-AE, autoencoder (AE), variational autoencoder (VAE), and SBERT. The outputs of the ensemble members are scored using cosine similarity, followed by voting-based fusion, re-ranking, and optimal sentence selection to generate the final summary. Experimental results demonstrate that the proposed model achieves an accuracy of 98.72%, outperforming existing methods.
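The final stage described above (cosine-similarity scoring by several models, then voting-based fusion and sentence selection) can be illustrated with a minimal sketch. It is not the authors' implementation. The three encoders here (raw term counts, binary term presence, and a smoothed TF-IDF) are simple stand-ins assumed for illustration in place of HIN-MLELM-AE, AE, VAE, and SBERT. Scoring each sentence against the document centroid and fusing the rankings with a Borda count are likewise assumptions, since the abstract does not specify the voting rule. All function names are hypothetical, and the re-ranking step is omitted.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Lowercase whitespace tokenizer that strips surrounding punctuation.
    words = (w.strip(".,;:!?").lower() for w in sentence.split())
    return [w for w in words if w]


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# --- Stand-in "ensemble members" (assumption: NOT the paper's models) ---
def tf_vectors(sentences):
    return [{w: float(c) for w, c in Counter(tokenize(s)).items()}
            for s in sentences]


def binary_vectors(sentences):
    return [{w: 1.0 for w in set(tokenize(s))} for s in sentences]


def tfidf_vectors(sentences):
    tokens = [tokenize(s) for s in sentences]
    n = len(tokens)
    df = Counter(w for t in tokens for w in set(t))
    # Smoothed IDF: log((1 + n) / (1 + df)) + 1
    return [{w: c * (math.log((1 + n) / (1 + df[w])) + 1.0)
             for w, c in Counter(t).items()} for t in tokens]


def rank_by_centroid(vectors):
    # Score each sentence by cosine similarity to the sum of all vectors,
    # then return sentence indices from best to worst.
    centroid = {}
    for vec in vectors:
        for k, v in vec.items():
            centroid[k] = centroid.get(k, 0.0) + v
    scores = [cosine(vec, centroid) for vec in vectors]
    return sorted(range(len(vectors)), key=lambda i: -scores[i])


def borda_fuse(rankings, k):
    # Voting-based fusion (assumed Borda count): position p in a ranking
    # of n items earns n - p points; keep the k highest totals.
    n = len(rankings[0])
    points = [0] * n
    for ranking in rankings:
        for position, idx in enumerate(ranking):
            points[idx] += n - position
    top = sorted(range(n), key=lambda i: (-points[i], i))[:k]
    return sorted(top)  # restore document order


def summarize(sentences, k=2):
    encoders = [tf_vectors, binary_vectors, tfidf_vectors]
    rankings = [rank_by_centroid(enc(sentences)) for enc in encoders]
    return [sentences[i] for i in borda_fuse(rankings, k)]
```

Calling `summarize(sentences, k)` returns `k` sentences in their original order. Swapping a stand-in encoder for a learned one only requires a function that maps a list of sentences to a list of sparse vectors.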
Keywords
Automatic text summarization, Ensemble learning, Multi-document summarization, Multilingual text processing, Slang filtering, Extreme learning machine autoencoder.
Cite this article
Upadhyay S, Soni HK. Ensemble learning–based automatic text summarization using HIN-MELM-AE and DePori for multi-document and multilingual texts. International Journal of Advanced Technology and Engineering Exploration. 2025;12(133):1783-1806. DOI : 10.19101/IJATEE.2024.111102227
