International Journal of Advanced Technology and Engineering Exploration ISSN (Print): 2394-5443    ISSN (Online): 2394-7454 Volume-13 Issue-138 May-2026
Hybrid non-maximum suppression-based Telugu scene text detection

Vishnuvardhan Atmakuri1, Vayunandan Kumar Konakalla2, M. Radha3, Sape Chittibabulu4, Nalluri Brahma Naidu5 and G. Satyanarayana6

Research Scholar, Jawaharlal Nehru Technological University Hyderabad, Kukatpally, Hyderabad, Telangana-500085, India1
Senior Software Engineer, Fidelity Investments, Dallas, Texas-75201, USA2
Assistant Professor, Department of Information Technology, Vallurupalli Nageswara Rao, Vignana Jyothi Institute of Engineering and Technology, Bachupally, Hyderabad, Telangana-500118, India3
Assistant Professor, Department of Computer Science and Engineering, Aditya University, Surampalem, Andhra Pradesh-533437, India4
Assistant Professor, Department of Computer Science and Engineering, Vignan's Foundation for Science, Technology & Research, Guntur-Tenali Rd, Vadlamudi, Andhra Pradesh-522213, India5
Associate Professor, Department of Computer Science and Engineering, Vasireddy Venkatadri International Technological University, Guntur, Andhra Pradesh-522508, India6
Corresponding Author : Vishnuvardhan Atmakuri

Received : 22-April-2025; Revised : 23-May-2026; Accepted : 26-May-2026

Abstract

Detecting text in natural scenes is crucial for bridging visual and linguistic information and supports applications such as multilingual information retrieval, assistive technologies, and scene understanding. Telugu script presents unique detection challenges due to its rounded glyph structures, complex ligatures, and frequent background interference. This paper proposes an enhanced Telugu scene text detection framework based on the efficient and accurate scene text detector (EAST), augmented with a hybrid non-maximum suppression (NMS) strategy that integrates Soft-NMS and adaptive NMS. The proposed hybrid NMS improves post-processing reliability by suppressing redundant detections while preserving overlapping true positives (TPs), resulting in a more balanced precision–recall (PR) performance. The proposed method is evaluated on the Indian Institute of Information Technology-Indic Language Scene Text (IIIT-ILST) dataset, achieving a precision of 0.75, recall of 0.80, and an F1-score of 0.77, thereby outperforming existing Telugu text detection approaches. Further analysis using PR curves at intersection over union (IoU) thresholds of 0.5 and 0.75, along with mean average precision (mAP) evaluation, confirms the robustness of the proposed model under varying localization constraints. Overall, the hybrid NMS-enhanced EAST method establishes a strong benchmark for Telugu scene text detection and provides a practical solution for language-specific scene understanding applications.
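
The abstract does not spell out how Soft-NMS and adaptive NMS are combined, so the following is only a minimal sketch of one plausible combination, not the authors' implementation. It assumes axis-aligned boxes (EAST itself can output rotated boxes), a Gaussian Soft-NMS decay, and a per-box density estimate supplied by the caller. The names `hybrid_nms`, `densities`, `base_thresh`, `sigma`, and `min_score` are hypothetical and chosen for illustration.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hybrid_nms(boxes, scores, densities,
               base_thresh=0.3, sigma=0.5, min_score=0.3):
    """Return indices of kept boxes, in the order they were selected.

    Assumption: densities[i] in [0, 1] estimates how crowded the region
    around box i is (for example, from a separate density prediction).
    """
    scores = list(scores)            # work on a copy; scores get decayed
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        # Pick the highest-scoring candidate that is still alive.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < min_score:
            break                    # every other candidate scores lower
        keep.append(best)
        # Adaptive step: tolerate more overlap where the scene is crowded.
        thresh = max(base_thresh, densities[best])
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            if overlap > thresh:
                # Soft step: decay the score instead of deleting the box.
                scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return keep
```

In this sketch, a near-duplicate box in a sparse region has its score decayed below `min_score` and is dropped, while the same box next to a selected detection with a high density estimate is left untouched and survives. That is the behaviour the abstract describes as suppressing redundant detections while preserving overlapping true positives.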

Keywords

Telugu scene text detection, Efficient and accurate scene text detector (EAST), Hybrid non-maximum suppression, Scene understanding, Indic language processing, Optical character recognition

Cite this article

Atmakuri V, Konakalla VK, Radha M, Chittibabulu S, Naidu NB, Satyanarayana G. Hybrid non-maximum suppression-based Telugu scene text detection. International Journal of Advanced Technology and Engineering Exploration. 2026;13(138):749-768. DOI : 10.19101/IJATEE.2025.121220525
