IGRCVRM: Design of an Iterative Graph Based Recurrent Convolutional Model for Content Based Video Retrieval Using Multidomain Features

Authors

  • Shubhangini Ugale, G H Raisoni University Amravati, Anjangaon Bari Road, Maharashtra 444701, India
  • Wani Patil, G H Raisoni University Amravati, Anjangaon Bari Road, Maharashtra 444701, India
  • Vivek Kapur, G H Raisoni Institute of Engineering and Technology, Nagpur, Shraddha Park, MIDC, Hingna Wadi Link Road, Nagpur-440016, India

Keywords:

Content-based Video Retrieval, Multidomain Features, Ant Lion Firefly Optimizer, Graph-based Recurrent Neural Network, Video Summarization

Abstract

The proliferation of video content has necessitated effective content-based video retrieval (CBVR) systems. Current CBVR methods suffer from limitations in feature representation and selection, resulting in suboptimal precision, accuracy, recall, and computational efficiency. To address these limitations, we introduce an Iterative Graph-based Recurrent Convolutional Model (IGRCVRM) designed to improve the quality of video representation. IGRCVRM draws on features from multiple domains, including Fourier components, Z-transform components, S-transform components, Laplace components, and convolutional transforms. Feature selection is performed by the Ant Lion Firefly Optimizer (ALFO), which filters candidate features to improve variance control and produce a more robust representation. The selected features are then fused within a Graph-based Recurrent Neural Network (GRNN) that combines a Graph Convolutional Network (GCN) with Recurrent Neural Networks (RNNs), capturing both spatial and temporal structure and yielding a comprehensive framework for content-based video retrieval. Experimental results show that, compared with existing methods, the proposed model improves video summarization precision by 5.9%, accuracy by 4.5%, recall by 4.9%, Area Under the Curve (AUC) by 5.5%, and specificity by 3.5%, while reducing delay by 2.9%. Together, these improvements represent a substantial advance in CBVR, with applications ranging from video surveillance to multimedia indexing and retrieval.
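The pipeline outlined in the abstract lends itself to a compact sketch. The Python/PyTorch snippet below is illustrative only: it computes a per-frame multidomain descriptor (2-D FFT magnitudes plus a Laplacian-filter response standing in for Laplace-domain components; Z- and S-transform components would be appended analogously), replaces the ALFO selection step with a simple variance-ranking placeholder, and fuses the selected frame features with one GCN-style propagation followed by a GRU, which is one common way to realize a graph-based recurrent network. All layer sizes, function names, and the similarity-based frame graph are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: frame-level multidomain descriptors and a
# GCN + GRU fusion, loosely following the IGRCVRM pipeline described in
# the abstract. Layer sizes, names, and the similarity-based frame graph
# are assumptions; the paper's ALFO selection step is replaced here by a
# simple variance-ranking placeholder.
import numpy as np
import torch
import torch.nn as nn
from scipy.ndimage import laplace


def frame_descriptor(frame: np.ndarray, keep: int = 64) -> np.ndarray:
    """Concatenate a few multidomain statistics for one grayscale frame.

    Fourier components come from the 2-D FFT magnitude; the Laplacian
    response stands in for Laplace-domain components. Z- and S-transform
    components would be appended analogously (no single standard call).
    """
    fft_mag = np.abs(np.fft.fft2(frame))
    lap = laplace(frame.astype(np.float64))
    return np.concatenate([
        np.sort(fft_mag.ravel())[-keep:],      # dominant Fourier magnitudes
        np.sort(np.abs(lap).ravel())[-keep:],  # strongest Laplacian responses
    ])


def select_by_variance(X: np.ndarray, k: int) -> np.ndarray:
    """Placeholder for ALFO: keep the k highest-variance feature columns."""
    idx = np.argsort(X.var(axis=0))[-k:]
    return X[:, idx]


class GRNN(nn.Module):
    """Minimal graph-based recurrent model over per-frame features.

    One GCN-style propagation mixes information between similar frames,
    then a GRU models temporal order; its final hidden state serves as
    the video embedding used for retrieval.
    """

    def __init__(self, in_dim: int, hid_dim: int = 128):
        super().__init__()
        self.gcn_w = nn.Linear(in_dim, hid_dim, bias=False)
        self.gru = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_frames, in_dim)
        sim = torch.softmax(x @ x.t() / x.shape[1] ** 0.5, dim=-1)  # frame graph
        h = torch.relu(sim @ self.gcn_w(x))                         # GCN step
        _, h_last = self.gru(h.unsqueeze(0))                        # temporal step
        return h_last.squeeze(0).squeeze(0)                         # video embedding


if __name__ == "__main__":
    frames = np.random.rand(16, 64, 64)                   # toy 16-frame clip
    X = np.stack([frame_descriptor(f) for f in frames])   # (16, 128)
    X = select_by_variance(X, k=96)                        # ALFO placeholder
    emb = GRNN(in_dim=X.shape[1])(torch.tensor(X, dtype=torch.float32))
    print(emb.shape)                                       # torch.Size([128])
```

In a retrieval setting, video embeddings produced this way would be compared against a query embedding, for example by cosine similarity, to rank candidate videos.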




Published

24.11.2023

How to Cite

Ugale, S., Patil, W., & Kapur, V. (2023). IGRCVRM: Design of an Iterative Graph Based Recurrent Convolutional Model for Content Based Video Retrieval Using Multidomain Features. International Journal of Intelligent Systems and Applications in Engineering, 12(5s), 243–257. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3882

Issue

Section

Research Article