Ensemble learning–based automatic text summarization using HIN-MELM-AE and DePori for multi-document and multilingual texts
Sunil Upadhyay1 and Hemant Kumar Soni1
Corresponding Author: Sunil Upadhyay
Received: 19-Dec-2024; Revised: 13-Dec-2025; Accepted: 16-Dec-2025
Abstract
Automatic text summarization (ATS), which involves generating a concise and accurate summary from a longer text document, has emerged in response to the rapid growth of textual information. Most existing studies have not adequately addressed ATS for multi-document (MD) and multilingual summarization. To overcome these limitations, this paper proposes an improved ensemble learning–based ATS framework built on a hyperfan-IN multilayer extreme learning machine autoencoder (HIN-MLELM-AE) and incorporating slang filtering based on the Dehghani poor and rich optimization (DePori) algorithm. Initially, the text documents are collected and preprocessed. Subsequently, slang identification and filtering are performed on the preprocessed text using the DePori optimization technique. The slang-filtered text is then transformed using info-squared fuzzy c-means (InS-FCM) clustering, latent Dirichlet allocation (LDA)–based topic modeling, term frequency–inverse document frequency (TF-IDF) analysis, and frequent term selection. Part-of-speech (POS) tagging is carried out on the transformed data using a sememe similarity induced hidden Markov model (SemSim-HMM). Next, significant entities are extracted from the transformed and POS-tagged text. For entity vectorization, sentence bidirectional encoder representations from transformers (SBERT) is employed. Finally, ATS is performed using an ensemble of models comprising HIN-MLELM-AE, an autoencoder (AE), a variational autoencoder (VAE), and SBERT. The outputs of these ensemble models are evaluated using cosine similarity, followed by voting-based fusion, re-ranking, and optimal sentence selection to generate the final summary. Experimental results demonstrate that the proposed model achieves an accuracy of 98.72%, outperforming existing conventional methods.
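To make the final fusion stage concrete, the following minimal Python sketch illustrates sentence scoring with cosine similarity, voting-based fusion of several scorers, re-ranking, and sentence selection. It is an illustrative sketch only: TF-IDF vectors stand in for the paper's HIN-MLELM-AE, AE, VAE, and SBERT encoders, and the Borda-style vote counting is an assumption, since the abstract does not specify the exact fusion rule.

```python
# Sketch of the scoring-fusion stage: several sentence scorers rank candidate
# sentences, the rankings are fused by voting, and the top-ranked sentences
# form the summary. TF-IDF stands in here for the paper's HIN-MLELM-AE / AE /
# VAE / SBERT encoders, which are not reproduced in this illustration.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def summarize(sentences, n_select=3):
    # Encode sentences; each row is one candidate sentence.
    vectors = TfidfVectorizer().fit_transform(sentences)
    centroid = np.asarray(vectors.mean(axis=0))

    # Scorer 1: cosine similarity to the document centroid.
    centroid_scores = cosine_similarity(vectors, centroid).ravel()

    # Scorer 2: average similarity to the other sentences (graph-style salience).
    pairwise = cosine_similarity(vectors)
    salience_scores = (pairwise.sum(axis=1) - 1.0) / max(len(sentences) - 1, 1)

    # Voting-based fusion: convert each scorer's output to ranks (Borda-style),
    # then re-rank sentences by their accumulated votes.
    votes = np.zeros(len(sentences))
    for scores in (centroid_scores, salience_scores):
        votes += np.argsort(np.argsort(scores))  # higher score -> more votes

    # Sentence selection: keep the top-voted sentences in document order.
    selected = sorted(np.argsort(votes)[::-1][:n_select])
    return " ".join(sentences[i] for i in selected)


if __name__ == "__main__":
    doc = [
        "Automatic text summarization condenses long documents into short summaries.",
        "The weather today is sunny with a light breeze.",
        "Extractive summarization selects the most informative sentences from the source.",
        "Ensemble methods combine several scorers and fuse their rankings by voting.",
    ]
    print(summarize(doc, n_select=2))
```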
Keywords
Automatic text summarization, Ensemble learning, Multi-document summarization, Multilingual text processing, Slang filtering, Extreme learning machine autoencoder.
Cite this article
Upadhyay S, Soni HK. Ensemble learning–based automatic text summarization using HIN-MELM-AE and DePori for multi-document and multilingual texts. International Journal of Advanced Technology and Engineering Exploration. 2025;12(133):1783-1806. DOI: 10.19101/IJATEE.2024.111102227
