Context-aware misinformation detection using BERT-based neural network with TF-IDF integration
Saberi Goswami1, Supratim Bhattacharya1 and Jayanta Poray1
Corresponding Author : Saberi Goswami
Received : 19-Jul-2025; Revised : 23-Oct-2025; Accepted : 11-Nov-2025
Abstract
The widespread propagation of misinformation, especially during pandemics, is often followed by severe consequences, including public distrust in healthcare systems, increased anxiety, and disruption of critical treatments. A robust system is therefore needed to detect misinformation and to prevent its rapid spread, so as to maintain stability and avoid disorder in society. This study proposes a novel hybrid artificial intelligence model that combines a bidirectional encoder representations from transformers (BERT)-based neural network with term frequency–inverse document frequency (TF–IDF) features to detect and reduce misinformation effectively. The framework follows clearly defined steps to ensure reproducibility: fusion of TF–IDF and contextual BERT embeddings through concatenation, feature normalization using min–max scaling, and classification using fully connected layers. The model achieves an accuracy of 96–98% on benchmark datasets and consistently outperforms existing methodologies. The experimental results demonstrate the efficacy of this method in identifying and tracking misinformation trends, especially during health crises, while the integration of explainable artificial intelligence (XAI) improves transparency.
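The three steps named in the abstract (concatenation of TF–IDF and contextual embeddings, min–max scaling, and a fully connected classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy documents, the layer sizes, and the random, untrained weights are all assumptions made for illustration, and `placeholder_embedding` is a hypothetical stand-in that returns pseudo-random vectors where a real system would use BERT sentence embeddings.

```python
import math
import random


def tfidf_matrix(docs):
    """Plain TF-IDF: tf = count / doc length, idf = log(N / document frequency)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    df = {t: sum(t in doc for doc in tokenized) for t in vocab}
    rows = [
        [(doc.count(t) / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        for doc in tokenized
    ]
    return rows, vocab


def placeholder_embedding(text, dim=8):
    """Hypothetical stand-in for a contextual BERT embedding (NOT a real model)."""
    rng = random.Random(sum(map(ord, text)))  # deterministic per input text
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def min_max_scale(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def dense(x, weights, bias):
    """One fully connected layer: weights @ x + bias."""
    return [sum(w * v for w, v in zip(w_row, x)) + b
            for w_row, b in zip(weights, bias)]


def build_features(docs, dim=8):
    """Concatenate TF-IDF and embedding vectors, then min-max scale the result."""
    tfidf, _ = tfidf_matrix(docs)
    fused = [t + placeholder_embedding(d, dim) for t, d in zip(tfidf, docs)]
    return min_max_scale(fused)


def classify(features, hidden=4, seed=0):
    """Forward pass through two fully connected layers (assumed sizes, untrained)."""
    rng = random.Random(seed)
    n_in = len(features[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]]
    scores = []
    for x in features:
        h = [max(0.0, v) for v in dense(x, w1, [0.0] * hidden)]  # ReLU
        z = dense(h, w2, [0.0])[0]
        scores.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid score in (0, 1)
    return scores


# Toy inputs, invented for illustration only.
docs = [
    "garlic cures the virus overnight",
    "health agency publishes vaccine trial results",
    "drinking hot water kills the virus",
]
features = build_features(docs)
scores = classify(features)
```

Because the weights are random, the scores carry no meaning; the sketch only shows how the fused and scaled feature vector flows into the classifier. In practice the layers would be trained on labeled data and the placeholder replaced by embeddings from a pretrained BERT model.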
Keywords
Misinformation Detection, BERT, TF-IDF, Explainable artificial intelligence (XAI), Pandemic information analysis.
Cite this article
Goswami S, Bhattacharya S, Poray J. Context-aware misinformation detection using BERT-based neural network with TF-IDF integration. International Journal of Advanced Computer Research. 2026;16(75):23-37. DOI : 10.19101/IJACR.2025.1570016
