BCRNVSRM: Design of an Iterative Fusion of BiLSTM & BiGRU with Convolutionally Recurrent Neural Networks to Enhance Summarization Efficiency of Videos with Rapid Movements

Darshankumar D. Billur; Manu T. M.

Authors

Darshankumar D. Billur, KLE College of Engineering & Technology, Department of ECE, Chikodi-591201, India
Manu T. M., KLE Institute of Technology, Department of ECE, Hubballi-580030, India

Keywords:

Video Summarization, BiLSTM & BiGRU Fusion, Grey Wolf Optimizer, Convolutionally Recurrent Neural Networks, Rapid Movement Videos

Abstract

With the rapid growth of digital video content, accurate and efficient video summarization has become imperative, especially for videos exhibiting rapid movements. Such videos are challenging because of their intrinsically high variability and complexity, and they require advanced techniques to capture and condense meaningful information effectively. Traditional summarization techniques often fail to harness the multidomain features inherent to dynamic video sequences, leading to imprecise and inefficient summaries. Existing models lack robust fusion mechanisms and are limited in their ability to cope with high-variance scenarios in videos with swift movements. In this paper, we introduce a novel framework that employs a fusion of BiLSTM and BiGRU operations to transform frame sequences into multidomain features. These features are then enriched and converted into high-variance descriptors using the Grey Wolf Optimizer (GWO). To combine these modalities, a GWO-guided weighted sum is used, which optimizes the integration process. Summary profiles are then generated from the fused data samples by Convolutionally Recurrent Neural Networks. The entire scheme is tailored to capture the underlying patterns and temporal consistencies in rapidly moving video sequences. The proposed model improves video summarization performance: quantitative evaluations report improvements of 3.9% in precision, 2.9% in accuracy, 4.5% in recall, 3.5% in AUC, and 4.8% in specificity. The method also reduces delay by 1.9%, indicating a promising direction for real-time video processing and summarization. In conclusion, this work helps bridge the gap between complex video content and concise summarization, paving the way for advanced video processing tools.
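
The GWO-guided weighted-sum fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fitness function, so the sketch assumes a mean squared error against reference frame-importance scores. The two feature lists are hypothetical stand-ins for per-frame BiLSTM and BiGRU outputs, and the names `weighted_sum_fusion` and `gwo_minimize` are illustrative only.

```python
import random


def weighted_sum_fusion(feat_a, feat_b, w):
    """Fuse two equal-length feature vectors as w * a + (1 - w) * b."""
    return [w * a + (1.0 - w) * b for a, b in zip(feat_a, feat_b)]


def gwo_minimize(objective, lower, upper, n_wolves=8, n_iters=40, seed=0):
    """Minimal one-dimensional Grey Wolf Optimizer (needs n_wolves >= 3).

    Returns the best position seen over the whole run.
    """
    rng = random.Random(seed)
    wolves = [rng.uniform(lower, upper) for _ in range(n_wolves)]
    best = min(wolves, key=objective)
    for it in range(n_iters):
        # The three fittest wolves lead the pack.
        alpha, beta, delta = sorted(wolves, key=objective)[:3]
        a = 2.0 * (1.0 - it / n_iters)  # shrinks linearly from 2 toward 0
        new_wolves = []
        for x in wolves:
            moves = []
            for leader in (alpha, beta, delta):
                big_a = 2.0 * a * rng.random() - a
                big_c = 2.0 * rng.random()
                dist = abs(big_c * leader - x)
                moves.append(leader - big_a * dist)
            pos = sum(moves) / 3.0
            new_wolves.append(min(max(pos, lower), upper))  # clamp to bounds
        wolves = new_wolves
        candidate = min(wolves, key=objective)
        if objective(candidate) < objective(best):
            best = candidate
    return best


# Hypothetical per-frame scores standing in for BiLSTM and BiGRU features.
bilstm_feats = [0.9, 0.1, 0.4, 0.8, 0.2, 0.6]
bigru_feats = [0.2, 0.7, 0.5, 0.1, 0.9, 0.3]
# Assumed reference importance scores; built here so the ideal weight is 0.3.
reference = weighted_sum_fusion(bilstm_feats, bigru_feats, 0.3)


def fusion_error(w):
    """Assumed fitness: mean squared error of the fused scores."""
    fused = weighted_sum_fusion(bilstm_feats, bigru_feats, w)
    return sum((f - r) ** 2 for f, r in zip(fused, reference)) / len(fused)


best_w = gwo_minimize(fusion_error, 0.0, 1.0)
fused_scores = weighted_sum_fusion(bilstm_feats, bigru_feats, best_w)
```

In this toy setting the search only has to recover a single scalar weight. A fuller pipeline would presumably optimize one weight per feature domain and pass the fused sequence on to the summarization network.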
