Combating deep-fake news in Bengali: detection and prevention of manipulated social media’s content using machine learning
Md. Jane Alam, Md. Ashikur Rahman Khan, Ishtiaq Ahammad, Abir Hosen and Joysri Rani Das
Corresponding Author : Md. Jane Alam
Received : 26-September-2025; Revised : 14-January-2026; Accepted : 26-March-2026
Abstract
The rapid expansion of social media and digital news platforms has accelerated the spread of deep-fake and manipulated content, posing serious risks to public trust and social stability. These risks are particularly pronounced in low-resource languages such as Bengali, which remains underexplored despite its widespread use. To address this gap, this study investigates Bengali deep-fake news detection using both classical machine learning (ML) and transformer-based deep learning approaches on newly released, expert-annotated Bangla news datasets. After comprehensive data pre-processing, multiple classifiers were evaluated: logistic regression (LR), decision tree (DT), random forest (RF), gradient boosting (GB), and a fine-tuned Bangla Bidirectional Encoder Representations from Transformers (BERT) model. Class imbalance was handled carefully, and cross-validation (CV) was applied to the classical models. Experimental results show that performance improves consistently as the amount of training data grows, and that the Bangla BERT model significantly outperforms the traditional methods, achieving 94.49% accuracy, an F1-score of 89.03%, a recall of 90.28%, and a receiver operating characteristic area under the curve (ROC-AUC) of 98.36%, demonstrating superior contextual and semantic understanding of Bengali text. The findings confirm that language-specific transformer models are highly effective for combating deep-fake news in Bengali and establish a strong baseline for future research and real-world deployment of misinformation detection systems in Bangla digital media.
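For readers less familiar with the four reported metrics, the following minimal sketch shows how accuracy, recall, F1-score, and ROC-AUC can be computed from binary predictions. It is an illustration only, not the study's evaluation code: the function names, the toy labels and scores, the 0.5 decision threshold, and the choice of label 1 for the "fake" class are all assumptions made for the example.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 ("fake", an assumption) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive example is scored
    above a random negative one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical, imbalanced toy data: 3 fake (1) and 7 real (0) articles.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]  # assumed model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On imbalanced toy data like this, accuracy comes out higher than F1 because the many correctly classified real articles raise accuracy without contributing to the positive-class F1. This is one reason why all four figures are worth reading together.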
Keywords
Bangla fake news, BERT classifier, Fake news detection, Random forest, Decision tree, Artificial intelligence, Machine learning.
Cite this article
Alam M, Khan MR, Ahammad I, Hosen A, Das JR. Combating deep-fake news in Bengali: detection and prevention of manipulated social media’s content using machine learning. International Journal of Advanced Computer Research. 2026;16(76):57-76. DOI : 10.19101/IJACR.2025.1570025
