ConvLSTM3D for elderly human activity recognition: an integrated spatiotemporal attention-based approach
Jeevan Babu Maddala1 and Shaheda Akthar2
Assistant Professor, Department of Computer Science and Engineering, Government College for Women (A), Sambasiva Pet, Guntur, Andhra Pradesh 522001, India2
Corresponding Author : Jeevan Babu Maddala
Received : 17-June-2024; Revised : 25-June-2025; Accepted : 05-July-2025
Abstract
Human activity recognition (HAR) plays a vital role in monitoring the daily movements of the elderly, enabling early detection of potential health concerns and facilitating timely interventions during emergencies. Traditional HAR approaches often process spatial and temporal data sequentially, which may limit their ability to capture the dynamic nature of human activities. To address this limitation, a convolutional long short-term memory 3D (ConvLSTM3D) model is proposed. The model integrates spatial and temporal information through a novel attention mechanism inspired by Transformer architectures. The attention mechanism dynamically prioritizes critical features in video frames, emphasizing significant movements or postural changes while suppressing irrelevant information such as background clutter or occlusions. An adaptive relevance attention module (ARAM) further enhances the model’s sensitivity by dynamically weighting spatiotemporal features, allowing it to focus on salient patterns in noisy environments. The ConvLSTM3D architecture combines convolutional neural networks (CNNs) and long short-term memory (LSTM) networks within a three-dimensional framework. Evaluated on the MSR DailyActivity3D dataset, the model achieved F1-scores of 93% and 94% for watching TV and combing, respectively, and an accuracy of 93% for brushing. These results validate the model’s effectiveness in recognizing complex activities and demonstrate its adaptability to diverse real-world scenarios. By incorporating advanced attention mechanisms, the ConvLSTM3D model offers a transformative solution for elderly care and paves the way for more responsive and personalized interventions in assisted living environments.
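The idea of dynamically weighting spatiotemporal features can be illustrated with a minimal sketch. This is not the authors' ARAM, whose internals the abstract does not specify. It is a generic scaled dot-product relevance weighting over per-frame feature vectors, and the names `relevance_weighted_pool`, `frames`, and `query` are hypothetical. In a trained model the query would be learned and the features would come from the convolutional layers.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relevance_weighted_pool(features, query):
    """Pool T per-frame feature vectors into one vector (illustrative only).

    features: list of T feature vectors, each a list of D floats.
    query: hypothetical relevance vector of length D (assumed, not from the paper).
    Returns (weights, pooled); the weights sum to 1 over the T frames.
    """
    dim = len(query)
    # Scaled dot-product score of each frame against the query.
    scores = [sum(f * q for f, q in zip(frame, query)) / math.sqrt(dim)
              for frame in features]
    weights = softmax(scores)
    # Weighted sum over frames: high-scoring frames dominate the pooled vector.
    pooled = [sum(w * frame[d] for w, frame in zip(weights, features))
              for d in range(dim)]
    return weights, pooled

# Toy input: three frames with 2-D features; the second frame aligns best with the query.
frames = [[0.1, 0.0], [2.0, 1.5], [0.2, 0.1]]
query = [1.0, 1.0]
weights, pooled = relevance_weighted_pool(frames, query)
print(weights, pooled)
```

In this toy input the second frame receives the largest weight, which mirrors the abstract's description of emphasizing salient movements while down-weighting less informative frames.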
Keywords
Adaptive relevance attention module, ConvLSTM3D, Elderly monitoring, Human activity recognition, Spatiotemporal modelling, Transformer-based attention.
Cite this article
Maddala JB, Akthar S. ConvLSTM3D for elderly human activity recognition: an integrated spatiotemporal attention-based approach. International Journal of Advanced Technology and Engineering Exploration. 2025;12(128):1106-1123. DOI : 10.19101/IJATEE.2024.111101047
