International Journal of Advanced Technology and Engineering Exploration ISSN (Print): 2394-5443    ISSN (Online): 2394-7454 Volume-12 Issue-128 July-2025
ConvLSTM3D for elderly human activity recognition: an integrated spatiotemporal attention-based approach

Jeevan Babu Maddala1 and Shaheda Akthar2

Research Scholar, Department of Computer Science and Engineering, Acharya Nagarjuna University, Nagarjuna Nagar, Guntur, Andhra Pradesh 522510, India1
Assistant Professor, Department of Computer Science and Engineering, Government College for Women (A), Sambasiva Pet, Guntur, Andhra Pradesh 522001, India2
Corresponding Author: Jeevan Babu Maddala

Received: 17-Jun-2024; Revised: 25-Jun-2025; Accepted: 05-Jul-2025

Abstract

Human activity recognition (HAR) plays a vital role in monitoring the daily movements of the elderly, enabling early detection of potential health concerns and facilitating timely interventions during emergencies. Traditional HAR approaches often process spatial and temporal data sequentially, which may limit their ability to effectively capture the dynamic nature of human activities. To address this limitation, a comprehensive convolutional long short-term memory 3D (ConvLSTM3D) model is proposed. This model integrates spatial and temporal information through a novel attention mechanism inspired by Transformer architectures. The attention mechanism dynamically prioritizes critical features in video frames, emphasizing significant movements or postural changes while suppressing irrelevant information such as background clutter or occlusions. An adaptive relevance attention module (ARAM) further enhances the model’s sensitivity by dynamically weighting spatiotemporal features, allowing it to focus on salient patterns in noisy environments. The ConvLSTM3D architecture combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks within a three-dimensional framework. Evaluated on the MSR Daily Activity3D dataset, the model achieved F1-scores of 93% and 94% for activities such as watching TV and combing, respectively, and an accuracy of 93% for brushing. These results validate the model’s effectiveness in recognizing complex activities and demonstrate its adaptability to diverse real-world scenarios. By incorporating advanced attention mechanisms, the ConvLSTM3D model offers a transformative solution for elderly care and paves the way for more responsive and personalized interventions in assisted living environments.
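
As a rough illustration of the attention idea described in the abstract, the sketch below applies scaled dot-product attention over per-frame feature vectors so that frames with larger responses receive larger weights. It is not the authors' ARAM or ConvLSTM3D implementation: the function names `softmax` and `temporal_attention`, the fixed `query` vector, and the toy feature values are all assumptions chosen for illustration, and a real model would learn the query and operate on convolutional feature maps rather than short lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frame_features, query):
    """Weight each frame by scaled dot-product similarity to a query.

    frame_features: list of per-frame feature vectors (hypothetical inputs).
    query: vector of the same length (assumed fixed here; learned in practice).
    Returns the attention weights and the attention-pooled feature vector.
    """
    d = len(query)
    # Scaled dot-product score for every frame.
    scores = [
        sum(f * q for f, q in zip(frame, query)) / math.sqrt(d)
        for frame in frame_features
    ]
    weights = softmax(scores)
    # Weighted sum of the frame features, one value per feature dimension.
    pooled = [
        sum(w * frame[j] for w, frame in zip(weights, frame_features))
        for j in range(d)
    ]
    return weights, pooled


# Toy example: the middle frame carries the strongest (assumed) motion response.
frames = [[0.1, 0.0, 0.2], [0.9, 0.8, 1.0], [0.2, 0.1, 0.0]]
query = [1.0, 1.0, 1.0]
weights, pooled = temporal_attention(frames, query)
```

In this toy case the middle frame receives the largest weight, which mirrors the abstract's description of emphasizing significant movements while down-weighting less informative frames.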

Keywords

Adaptive relevance attention module, ConvLSTM3D, Elderly monitoring, Human activity recognition, Spatiotemporal modelling, Transformer-based attention.
