IGRCVRM: Design of an Iterative Graph Based Recurrent Convolutional Model for Content Based Video Retrieval Using Multidomain Features
Keywords: Content-based Video Retrieval, Multidomain Features, Ant Lion Firefly Optimizer, Graph-based Recurrent Neural Network, Video Summarization

Abstract
The proliferation of video content has necessitated the development of effective content-based video retrieval (CBVR) systems. Current CBVR methods suffer from limitations in feature representation and selection, resulting in suboptimal precision, accuracy, recall, and computational efficiency. To address these limitations, we introduce an Iterative Graph-based Recurrent Convolutional Model (IGRCVRM) designed to improve the quality of video representation. IGRCVRM draws on features from multiple domains, including Fourier components, Z-transform components, S-transform components, Laplace components, and convolutional transforms. Feature selection is performed by the Ant Lion Firefly Optimizer (ALFO), which prunes candidate features to improve variance control and yield a more robust representation. The selected features are then distilled and fused within a Graph-based Recurrent Neural Network (GRNN), a combination of a Graph Convolutional Network (GCN) and Recurrent Neural Networks (RNNs) that captures both spatial and temporal structure, producing a comprehensive framework for content-based video retrieval. Experimental results demonstrate that the proposed model improves the precision of video summarization by 5.9%, accuracy by 4.5%, recall by 4.9%, the Area Under the Curve (AUC) by 5.5%, and specificity by 3.5%, while reducing delay by 2.9% compared with existing methods.
Collectively, these improvements signify a substantial advancement in the field of CBVR, with potential implications for various applications ranging from video surveillance to multimedia indexing and retrieval.
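As a concrete illustration of the multidomain-feature idea described above, the sketch below extracts per-frame Fourier-magnitude descriptors and stacks them into a video-level matrix. This is a minimal, hypothetical example under assumed settings (grayscale 32×32 frames, top-`k` coefficients, synthetic data), not the paper's implementation, which additionally draws on Z-transform, S-transform, Laplace, and convolutional components.

```python
import numpy as np

def fourier_features(frame: np.ndarray, k: int = 16) -> np.ndarray:
    """Return the k largest Fourier magnitude coefficients of a grayscale frame."""
    spectrum = np.abs(np.fft.fft2(frame))
    return np.sort(spectrum.ravel())[::-1][:k]

def video_descriptor(frames, k: int = 16) -> np.ndarray:
    """Stack per-frame Fourier features into a (num_frames, k) descriptor matrix."""
    return np.stack([fourier_features(f, k) for f in frames])

# Synthetic stand-in for 8 decoded grayscale frames.
rng = np.random.default_rng(0)
frames = [rng.random((32, 32)) for _ in range(8)]
desc = video_descriptor(frames)
print(desc.shape)  # (8, 16)
```

Descriptors from the other domains would be computed analogously per frame and concatenated column-wise before feature selection.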
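The GCN-plus-RNN fusion that the abstract attributes to the GRNN can be sketched as follows: a graph convolution mixes each frame's features with its neighbours in a frame-similarity graph, and a recurrent pass over the refined features yields a video-level embedding. All dimensions, weights, and the random similarity graph here are assumptions for illustration; this is not the authors' GRNN.

```python
import numpy as np

rng = np.random.default_rng(1)

def gcn_layer(X, A, W):
    """One graph convolution: symmetrically normalised adjacency (with
    self-loops) mixes each frame's features with its graph neighbours."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.tanh(A_norm @ X @ W)

def rnn_readout(H, W_h, W_x):
    """Elman-style recurrence over graph-refined frame features; the final
    hidden state serves as the video-level embedding."""
    h = np.zeros(W_h.shape[0])
    for x in H:                      # iterate frames in temporal order
        h = np.tanh(W_h @ h + W_x @ x)
    return h

T, F, D, S = 8, 16, 8, 4             # frames, feature dim, GCN width, state size
X = rng.random((T, F))               # per-frame multidomain features
A = (rng.random((T, T)) > 0.5).astype(float)
A = np.maximum(A, A.T)               # undirected frame-similarity graph
H = gcn_layer(X, A, rng.standard_normal((F, D)) * 0.1)
embedding = rnn_readout(H,
                        rng.standard_normal((S, S)) * 0.1,
                        rng.standard_normal((S, D)) * 0.1)
print(embedding.shape)  # (4,)
```

The graph step captures spatial (inter-frame similarity) structure and the recurrence captures temporal order, mirroring the two roles the abstract assigns to the GCN and RNN components.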
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open-access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets readers share and adapt the material, provided they give appropriate credit, provide a link to the license, indicate if changes were made, and, if they remix, transform, or build upon the material, distribute their contributions under the same license as the original.