Deep Learning-Based Group Activity Recognition Using Multiperson Relational Graph

Authors

  • Smita S. Kulkarni, Sangeeta Jadhav, Avinash N Bhute, Harsha A Bhute, Ashitosh D. Chavan

Keywords

Group Activity Recognition, Graph Convolution Neural Networks, Deformable CNN

Abstract

Group activity recognition (GAR) is an important problem in computer vision because it supports the analysis and understanding of human behavior patterns. Existing methods mostly concentrate on interpersonal-level interactions within a group. Sociological research, however, emphasizes that individual characteristics, multi-person interactions, and the overall structure of the group all matter for recognizing group activities. This work therefore develops adaptable and effective multi-person relational graphs (MRG) that represent the relationships between people's locations and appearances for GAR. A Graph Convolution Network (GCN) with sparse temporal sampling is applied to infer over the multi-person relational graphs efficiently. A deformable CNN first extracts features and classifies individual actions; the GCN then recognizes the group activity, and through relational reasoning the proposed network distinguishes group activity from individual interaction. Visualization samples and experimental results show that the approach outperforms current state-of-the-art methods in multi-level interaction reasoning and group structure modeling. These findings highlight the importance of multi-person relational graph (MRG) representations for recognizing group activities.
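The pipeline summarized in the abstract (sparse temporal sampling, a relation graph built from appearance and location, and graph convolution over that graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scaled dot-product similarity, the distance threshold `max_dist`, and the toy feature values are all assumptions chosen for illustration, and the per-actor features stand in for what a deformable CNN backbone would produce.

```python
import math

def sparse_sample(num_frames, num_segments):
    """Pick the middle frame index of each of num_segments equal spans."""
    return [(2 * i + 1) * num_frames // (2 * num_segments)
            for i in range(num_segments)]

def relation_graph(feats, centers, max_dist):
    """Row-normalized adjacency: softmax over appearance similarity,
    restricted to actors whose box centers lie within max_dist (assumed rule)."""
    n, d = len(feats), len(feats[0])
    adj = []
    for i in range(n):
        scores = []
        for j in range(n):
            near = math.dist(centers[i], centers[j]) <= max_dist
            sim = sum(a * b for a, b in zip(feats[i], feats[j])) / math.sqrt(d)
            scores.append(sim if near else None)
        # An actor is always near itself, so at least one score is present.
        peak = max(s for s in scores if s is not None)
        exps = [0.0 if s is None else math.exp(s - peak) for s in scores]
        total = sum(exps)
        adj.append([e / total for e in exps])
    return adj

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj, feats, weight):
    """One graph convolution step: ReLU(A X W)."""
    out = matmul(matmul(adj, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy input: three actors with 2-D appearance features and box centers.
frames = sparse_sample(num_frames=12, num_segments=3)
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
centers = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
adj = relation_graph(feats, centers, max_dist=2.0)
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only
hidden = gcn_layer(adj, feats, weight)
# Max-pool over actors to obtain a single group-level descriptor.
group_feature = [max(col) for col in zip(*hidden)]
```

In this toy setup the third actor is far from the other two, so its row of the adjacency matrix attends only to itself, while the first two actors exchange information. In a trained model the weight matrix would be learned, and the pooled group descriptor would feed a classifier over activity labels.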

Published

16.06.2024

How to Cite

Smita S. Kulkarni. (2024). Deep Learning-Based Group Activity Recognition Using Multiperson Relational Graph. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 424–435. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6229

Section

Research Article