International Journal of Advanced Computer Research (IJACR)

ISSN (Print):2249-7277    ISSN (Online):2277-7970
Volume-13 Issue-63 June-2023
Paper Title : A review and analysis for the text-based classification
Author Name : Prince Kumar and Animesh Kumar Dubey
Abstract :

In the current information-rich era, efficient retrieval and classification of text-based documents have become crucial tasks. With the exponential growth of digital content, retrieving the most relevant documents has become a pressing concern. Effective document retrieval not only saves time and effort but also supports knowledge discovery and decision-making. To address these challenges, various text-based classification techniques have been developed and implemented. This paper provides a comprehensive review and analysis of these techniques. The objectives are to evaluate existing methods, identify their strengths and limitations, and suggest potential avenues for future research. The paper analyzes the algorithms, feature extraction techniques, and evaluation metrics employed in text-based classification. It also investigates the impact of factors such as document size, language, and domain specificity on classification performance.

Keywords : Text-based classification, Knowledge discovery, Inherent ambiguity, Extraction mechanism.
Cite this article : Kumar P, Dubey AK. A review and analysis for the text-based classification. International Journal of Advanced Computer Research. 2023; 13(63):23-28. DOI:10.19101/IJACR.2023.1362008.