(Publisher of Peer Reviewed Open Access Journals)

International Journal of Advanced Technology and Engineering Exploration (IJATEE)

ISSN (Print):2394-5443    ISSN (Online):2394-7454
Volume-10 Issue-98 January-2023
Paper Title : Framework for deep learning based model for human activity recognition (HAR) using adapted PSRA6 dataset
Author Name : Rukhsarbano S. Sheikh, Sudhir Madhav Patil and Maneetkumar R. Dhanvijay
Abstract :

Perimeter surveillance at critical infrastructure sites is a crucial concern for the owners of such sites, who increasingly deploy artificial intelligence (AI)-based smart cameras along the perimeter border to monitor suspicious activities and ground-level movements. In recent years, AI has been used in surveillance systems deployed at critical areas to obtain a live feed of the ground situation, enabling the detection of human intrusions and the classification of targets through human activity recognition (HAR). HAR is an important task for the timely prevention of any kind of attack or intrusion, and surveillance is the most common application of vision-based HAR research. In recent years, deep learning has enabled many AI applications in surveillance. This paper reports a customised video dataset for perimeter-surveillance-related activity covering 6 human action classes (PSRA6) pertaining to suspicious human activity recognised through HAR. Three simple, built-from-scratch deep learning based convolutional neural network (CNN) architectures are used for the intended HAR: convolution and long short-term memory (CONVLSTM), long-term recurrent convolutional network (LRCN), and a 2-layer CNN. A Python interface for all three architectures is provided using the Keras library. The performance of these architectures is investigated in terms of accuracy, precision, recall and F1 score. This work presents an effective method for collecting and characterising the adapted PSRA6 dataset. Based on the performance comparison, the 2-layer CNN architecture outperforms the others with an accuracy of 96.77%, a loss of 0.21, and a weighted average precision, recall and F1 score of 97% each. Although the designed architectures are limited by the available computational power, the 2-layer CNN model performed the best.
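For readers unfamiliar with the weighted-average metrics reported above, the following minimal sketch shows how weighted average precision, recall and F1 score are computed from a confusion matrix. The 3-class confusion matrix here is a hypothetical illustration, not the PSRA6 results; PSRA6 itself has 6 classes.

```python
# Weighted-average precision/recall/F1 from a confusion matrix.
# Rows are true classes, columns are predicted classes.
# The 3x3 matrix below is a made-up example, not PSRA6 results.

def weighted_metrics(cm):
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    w_prec = w_rec = w_f1 = 0.0
    for c in range(n_classes):
        tp = cm[c][c]
        support = sum(cm[c])                              # true instances of class c
        predicted = sum(cm[r][c] for r in range(n_classes))
        prec = tp / predicted if predicted else 0.0
        rec = tp / support if support else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        weight = support / total                          # weight by class frequency
        w_prec += weight * prec
        w_rec += weight * rec
        w_f1 += weight * f1
    return w_prec, w_rec, w_f1

cm = [[50, 2, 3],
      [4, 45, 1],
      [2, 3, 40]]
p, r, f = weighted_metrics(cm)
print(round(p, 3), round(r, 3), round(f, 3))  # prints 0.9 0.9 0.9
```

Note that the weighted average recall always equals the overall accuracy (total true positives divided by total samples), which is why the paper's 97% weighted recall tracks its 96.77% accuracy so closely.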

Keywords : CNN, Deep learning, Keras, Human action recognition, PSRA6, Neural network.
Cite this article : Sheikh RS, Patil SM, Dhanvijay MR. Framework for deep learning based model for human activity recognition (HAR) using adapted PSRA6 dataset. International Journal of Advanced Technology and Engineering Exploration. 2023; 10(98):37-66. DOI:10.19101/IJATEE.2021.876325.
References :
[1]Goyal A, Anandamurthy SB, Dash P, Acharya S, Bathla D, Hicks D, et al. Automatic border surveillance using machine learning in remote video surveillance systems. In emerging trends in electrical, communications, and information technologies 2020 (pp. 751-60). Springer, Singapore.
[2]Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electronic Markets. 2021; 31(3):685-95.
[3]Vrigkas M, Nikou C, Kakadiaris IA. A review of human activity recognition methods. Frontiers in Robotics and AI. 2015; 2:1-28.
[4]Jegham I, Khalifa AB, Alouani I, Mahjoub MA. Vision-based human action recognition: an overview and real world challenges. Forensic Science International: Digital Investigation. 2020; 32:1-14.
[5]Reddy KK, Shah M. Recognizing 50 human action categories of web videos. Machine Vision and Applications. 2013; 24(5):971-81.
[6]Soomro K, Zamir AR, Shah M. UCF101: a dataset of 101 human actions classes from videos in the wild. Center for Research in Computer Vision, University of Central Florida. 2012: 1-8.
[7]Kuehne H, Jhuang H, Garrote E, Poggio T, Serre T. HMDB: a large video database for human motion recognition. In international conference on computer vision 2011 (pp. 2556-63). IEEE.
[8]Schuldt C, Laptev I, Caputo B. Recognizing human actions: a local SVM approach. In proceedings of the 17th international conference on pattern recognition 2004 (pp. 32-6). IEEE.
[9]https://academictorrents.com/details/184d11318372f70018cf9a72ef867e2fb9ce1d26. Accessed 12 March 2022.
[10]Li A, Thotakuri M, Ross DA, Carreira J, Vostrikov A, Zisserman A. The ava-kinetics localized human actions video dataset. Computing Research Repository. 2020; 5(7):1-8.
[11]Cheng M, Cai K, Li M. RWF-2000: an open large scale video database for violence detection. In 25th international conference on pattern recognition 2021 (pp. 4183-90). IEEE.
[12]Barekatain M, Martí M, Shih HF, Murray S, Nakayama K, Matsuo Y, et al. Okutama-action: an aerial view video dataset for concurrent human action detection. In proceedings of the conference on computer vision and pattern recognition workshops 2017 (pp. 28-35). IEEE.
[13]Singh S, Velastin SA, Ragheb H. Muhavi: a multicamera human action video dataset for the evaluation of action recognition methods. In international conference on advanced video and signal based surveillance 2010 (pp. 48-55). IEEE.
[14]Demir U, Rawat YS, Shah M. Tinyvirat: low-resolution video action recognition. In 25th international conference on pattern recognition 2021 (pp. 7387-94). IEEE.
[15]Ranganarayana K, Rao GV. Action recognition in low resolution videos using FO-SVM. Indian Journal of Computer Science and Engineering. 2021; 12(4):1149-62.
[16]Sargano AB, Wang X, Angelov P, Habib Z. Human action recognition using transfer learning with deep representations. In international joint conference on neural networks 2017 (pp. 463-9). IEEE.
[17]Ji S, Xu W, Yang M, Yu K. 3D convolutional neural networks for human action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2012; 35(1):221-31.
[18]Mutegeki R, Han DS. A CNN-LSTM approach to human activity recognition. In international conference on artificial intelligence in information and communication 2020 (pp. 362-6). IEEE.
[19]Geng C, Song J. Human action recognition based on convolutional neural networks with a convolutional auto-encoder. In 5th international conference on computer sciences and automation engineering 2016 (pp. 933-8). Atlantis Press.
[20]Aggarwal JK, Ryoo MS. Human activity analysis: a review. ACM Computing Surveys. 2011; 43(3):1-43.
[21]Mustafa T, Dhavale S, Kuber MM. Performance analysis of inception-v2 and yolov3-based human activity recognition in videos. SN Computer Science. 2020; 1(3):1-7.
[22]Zeng M, Nguyen LT, Yu B, Mengshoel OJ, Zhu J, Wu P, et al. Convolutional neural networks for human activity recognition using mobile sensors. In 6th international conference on mobile computing, applications and services 2014 (pp. 197-205). IEEE.
[23]Weinland D, Ronfard R, Boyer E. A survey of vision-based methods for action representation, segmentation and recognition. Computer Vision and Image Understanding. 2011; 115(2):224-41.
[24]Serrano I, Deniz O, Espinosa-Aranda JL, Bueno G. Fight recognition in video using hough forests and 2D convolutional neural network. IEEE Transactions on Image Processing. 2018; 27(10):4787-97.
[25]Sharma R, Singh A. An integrated approach towards efficient image classification using deep CNN with transfer learning and PCA. Advances in Technology Innovation. 2022; 2022(2):105-17.
[26]Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Communications of the ACM. 2017; 60(6):84-90.
[27]Islam SS, Dey EK, Tawhid MN, Hossain BM. A CNN based approach for garments texture design classification. Advances in Technology Innovation. 2017; 2(4):119-25.
[28]Rajakumaran S, Dr JS. Improvement in tongue color image analysis for disease identification using deep learning based depth wise separable convolution model. Indian Journal of Computer Science and Engineering. 2021; 12(1):21-34.
[29]https://machinelearningmastery.com/cnn-models-for-human-activity-recognition-time-series-classification/. Accessed 12 March 2022.
[30]Liu J, Luo J, Shah M. Recognizing realistic actions from videos “in the wild”. In IEEE conference on computer vision and pattern recognition 2009 (pp. 1996-2003). IEEE.
[31]Ashhar SM, Mokri SS, Abd RAA, Huddin AB, Zulkarnain N, Azmi NA, et al. Comparison of deep learning convolutional neural network (CNN) architectures for CT lung cancer classification. International Journal of Advanced Technology and Engineering Exploration. 2021; 8(74):126-34.
[32]Ankalaki S, Thippeswamy MN. A customized 1D-CNN approach for sensor-based human activity recognition. International Journal of Advanced Technology and Engineering Exploration. 2022; 9(87):216-31.
[33]Qin Y, Mo L, Ye J, Du Z. Multi-channel features fitted 3D CNNs and LSTMs for human activity recognition. In 10th international conference on sensing technology 2016 (pp. 1-5). IEEE.
[34]Suresha M, Kuppa S, Raghukumar DS. A study on deep learning spatiotemporal models and feature extraction techniques for video understanding. International Journal of Multimedia Information Retrieval. 2020; 9(2):81-101.
[35]Uddin MZ, Khaksar W, Torresen J. Human activity recognition using robust spatiotemporal features and convolutional neural network. In international conference on multisensor fusion and integration for intelligent systems 2017 (pp. 144-9). IEEE.
[36]Beddiar DR, Nini B, Sabokrou M, Hadid A. Vision-based human activity recognition: a survey. Multimedia Tools and Applications. 2020; 79(41):30509-55.
[37]Arunnehru J, Chamundeeswari G, Bharathi SP. Human action recognition using 3D convolutional neural networks with 3D motion cuboids in surveillance videos. Procedia Computer Science. 2018; 133:471-7.
[38]Chen H, Mahfuz S, Zulkernine F. Smart phone based human activity recognition. In international conference on bioinformatics and biomedicine 2019 (pp. 2525-32). IEEE.
[39]Bilal M, Maqsood M, Mehmood I, Javaid M, Rho S. An activity recognition framework for overlapping activities using transfer learning. In international conference on computational science and computational intelligence 2020 (pp. 701-5). IEEE.
[40]Sun Z, Ke Q, Rahmani H, Bennamoun M, Wang G, Liu J. Human action recognition from various data modalities: a review. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022: 1-20.
[41]Arif S, Wang J, Ul HT, Fei Z. 3D-CNN-based fused feature maps with LSTM applied to action recognition. Future Internet. 2019; 11(2):1-17.