Human Pose Estimation in Thermal Frames using Deep Learning Techniques

Authors

  • Dhananjay kumar Prasad, Sonu Airen, Chandra Prakash Singar, Puja Gupta

Keywords

Action Recognition, Human Pose Estimation, Thermal Imaging, Image Analysis, YOLOv8-Pose.

Abstract

This paper presents a deep learning-based framework for human action recognition from thermal images, with a specific emphasis on pose estimation. The proposed framework processes thermal video in stages. First, frames are extracted from the thermal video. The frames are then preprocessed: they are resized, augmented, and annotated with action-class labels, bounding boxes, and 17 COCO-style keypoints. We developed a custom dataset covering nine human actions, including walking, sitting, lying, and an abnormal-behaviour class. Finally, we trained a YOLOv8-Pose model on the Thermal-IM dataset to jointly detect humans and estimate their pose. Among the tested variants, YOLOv8n-pose offered the best accuracy-efficiency trade-off. On the Thermal-IM validation set, it achieved an mAP@0.5 of 0.98 for both bounding boxes and pose, with mAP@0.5:0.95 scores of 0.96–0.97. Bounding-box precision and recall were 0.94 and 0.96, respectively, and pose precision and recall were 0.93 and 0.96, respectively. These results indicate that the model can reliably detect subtle changes in human pose from thermal imagery under varied and difficult thermal conditions. Overall, they support pose-based analysis of thermal imagery as a suitable, privacy-respecting, and illumination-independent method for automated human behaviour monitoring in complex indoor scenarios, with direct relevance to surveillance, healthcare, and security applications.
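The annotation step described above (an action class, a bounding box, and 17 keypoints per person) can be illustrated with a minimal sketch. It assumes, since the abstract does not specify the file format, that each person is stored as one text line in the layout commonly used for YOLO-style pose training: a class index, a normalized centre/size box, then an (x, y, visibility) triple per keypoint. The names `PoseLabel`, `parse_pose_label`, and `to_pixels` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUM_KEYPOINTS = 17  # COCO-style keypoint count, as stated in the abstract


@dataclass
class PoseLabel:
    action_class: int                          # index of the action class
    box: Tuple[float, ...]                     # normalized (x_center, y_center, width, height)
    keypoints: List[Tuple[float, float, int]]  # normalized (x, y, visibility flag)


def parse_pose_label(line: str) -> PoseLabel:
    """Parse one label line under the assumed layout (not confirmed by the paper)."""
    fields = line.split()
    expected = 1 + 4 + NUM_KEYPOINTS * 3  # class + box + 17 * (x, y, v) = 56 fields
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    action_class = int(fields[0])
    box = tuple(float(v) for v in fields[1:5])
    keypoints = []
    for i in range(NUM_KEYPOINTS):
        x, y, v = fields[5 + 3 * i: 8 + 3 * i]
        keypoints.append((float(x), float(y), int(float(v))))
    return PoseLabel(action_class, box, keypoints)


def to_pixels(label: PoseLabel, width: int, height: int) -> List[Tuple[float, float]]:
    """Convert labelled keypoints (visibility > 0) to pixel coordinates."""
    return [(x * width, y * height) for x, y, v in label.keypoints if v > 0]
```

For a 640x480 thermal frame, `to_pixels(parse_pose_label(line), 640, 480)` would return the pixel positions of every keypoint marked as labelled, which is a convenient check that annotations line up with the resized frames.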

Published

06.08.2024

How to Cite

Dhananjay kumar Prasad. (2024). Human Pose Estimation in Thermal Frames using Deep Learning Techniques. International Journal of Intelligent Systems and Applications in Engineering, 12(23s), 3713 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7845

Section

Research Article