LRCN-HTP: Leveraging Hybrid Temporal Processing for Enhanced Activity Recognition in Multi-Human Scenarios

Authors

  • P. Pravanya, K. Lakshmi Priya, SK. Khamarjaha, K. Buela Likhitha, P. M. Ashok Kumar

Keywords

Convolutional Neural Networks, Deep Learning, Feature Extraction, Hybrid Long-Term Recurrent Convolutional-Network Temporal Processing, Multi-Human Activity Recognition, Spatial Context, Temporal Convolutional Networks, Temporal Dependency.

Abstract

Multi-human activity recognition remains a challenging domain, with significant research focused on utilizing diverse datasets to accurately identify human activities in everyday scenarios. This paper introduces an innovative approach that employs a Hybrid Long-Term Recurrent Convolutional-Network Temporal Processing (LRCN-HTP) model for enhanced multi-human activity recognition. By integrating advanced computing technology with deep neural networks, the approach addresses socially relevant challenges and paves the way for applications requiring a nuanced understanding of human interactions. The LRCN-HTP model synergizes the spatial context understanding of Convolutional Neural Networks (CNNs) with the long-term temporal dependency management of Recurrent Neural Networks (RNNs), particularly Long Short-Term Memory (LSTM) networks. In doing so, it offers a comprehensive framework that leverages the strengths of CNNs for feature extraction and of LSTMs for sequential data processing. This hybrid approach ensures that the model captures both the fine-grained details and the broader patterns of human activity. To enhance the model's performance and mitigate common deep learning challenges, such as the dependency on extensive labeled datasets, the LRCN-HTP architecture integrates dilated and causal convolutions within Temporal Convolutional Networks (TCNs) to extend the receptive field while maintaining the sequence's temporal integrity. The robust feature maps generated by the convolutional layers undergo a learning process involving various activation functions and filters, and are subsequently integrated with the LSTM's sequential processing to form accurate predictions. Our architecture is tailored to effectively address the intricate problem of sequence prediction with spatial inputs. Tested on the extensive UCF101 dataset, our proposed LRCN-HTP model achieves an impressive accuracy of 97.22%, outperforming several existing models. The results underscore the model's reliability and superior capability in recognizing various activities, confirming the effectiveness of our integrated approach to human activity recognition.
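For concreteness, the following is a minimal PyTorch sketch of the kind of hybrid pipeline the abstract describes: a per-frame CNN encoder, a dilated causal 1-D convolution (TCN-style) over the frame features, and an LSTM head. This is an illustrative assumption, not the published LRCN-HTP implementation; all layer sizes and names, the 8-frame clip length, and the 101-class output (matching UCF101) are chosen here only for the example.

# A minimal sketch (not the authors' released code) of an LRCN-style hybrid:
# per-frame CNN features -> dilated causal temporal convolution -> LSTM.
# Every layer size below is an illustrative assumption.
import torch
import torch.nn as nn

class LRCNHTPSketch(nn.Module):
    def __init__(self, num_classes: int = 101, feat_dim: int = 128):
        super().__init__()
        # Per-frame spatial feature extractor (stands in for a deeper CNN).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, feat_dim)
        # Dilated causal temporal convolution: padding only on the left so
        # each step sees past frames, while dilation widens the receptive field.
        self.kernel, self.dilation = 3, 2
        self.tcn = nn.Conv1d(feat_dim, feat_dim, kernel_size=self.kernel,
                             dilation=self.dilation)
        # LSTM models the longer-range sequential dependencies.
        self.lstm = nn.LSTM(feat_dim, feat_dim, batch_first=True)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        x = self.cnn(clip.flatten(0, 1)).flatten(1)      # (b*t, 64)
        x = self.proj(x).view(b, t, -1)                  # (b, t, feat)
        # Left-pad the time axis so the convolution stays causal and
        # length-preserving.
        pad = (self.kernel - 1) * self.dilation
        x = nn.functional.pad(x.transpose(1, 2), (pad, 0))
        x = torch.relu(self.tcn(x)).transpose(1, 2)      # (b, t, feat)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])                     # last-step logits

# Usage: logits = LRCNHTPSketch()(torch.randn(2, 8, 3, 64, 64))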




Published

24.03.2024

How to Cite

Pravanya, P., Lakshmi Priya, K., Khamarjaha, SK., Buela Likhitha, K., & Ashok Kumar, P. M. (2024). LRCN-HTP: Leveraging Hybrid Temporal Processing for Enhanced Activity Recognition in Multi-Human Scenarios. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 2590–2598. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5731

Issue

Vol. 12 No. 3 (2024)

Section

Research Article
