Relative Depth Prediction of Objects in Hazy Images Using Depthmap and Object Detection
Keywords: Deep learning, Dehazing, Depth map estimation, Object detection model, Relative depths

Abstract
The ability to establish the relative distances of objects in a single image is essential for many computer vision applications, including scene understanding, augmented reality, and robotics. In this study, we present a method that combines object detection and monocular depth estimation to infer the relative depth of objects within an image. First, we locate and identify objects using a state-of-the-art object detection model, which yields a set of bounding box coordinates; we then estimate a monocular depth map with a deep learning model. From the estimated depth map, we compute the average depth value of the pixels enclosed within each bounding box and use it to estimate the relative distance of objects in the scene. Closeness is measured by comparing each object's average depth value against a hyper-parameter: an object whose average depth value exceeds the hyper-parameter is closer to the camera, whereas one whose average depth value falls below it is farther away. On the basis of how the average depth value relates to the hyper-parameter, we categorize the relative depths of objects into four levels. Experimental evaluations on standard and real-time datasets show that the proposed strategy is effective and precise, underscoring its potential applicability in several computer vision areas.
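The paper does not publish code, but the pipeline described in the abstract (per-box mean depth compared against a threshold hyper-parameter, then bucketed into four levels) can be sketched as follows. The function name `classify_relative_depth`, the `band` margin used to split each side of the threshold into two sub-levels, and the level names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def classify_relative_depth(depth_map, boxes, threshold, band=0.15):
    """Bucket detected objects into four relative-depth levels.

    depth_map : 2-D array of per-pixel depth values; following the
                paper's convention, higher values mean closer objects.
    boxes     : list of (x1, y1, x2, y2) detector bounding boxes.
    threshold : the hyper-parameter the mean depth is compared against.
    band      : hypothetical margin that splits each side of the
                threshold into two sub-levels (four levels in total).
    """
    levels = []
    for (x1, y1, x2, y2) in boxes:
        # Average depth of the pixels enclosed by the bounding box.
        mean_depth = float(depth_map[y1:y2, x1:x2].mean())
        if mean_depth >= threshold * (1 + band):
            levels.append("very near")
        elif mean_depth >= threshold:
            levels.append("near")
        elif mean_depth >= threshold * (1 - band):
            levels.append("far")
        else:
            levels.append("very far")
    return levels
```

In practice the boxes would come from a detector such as YOLOv5 and the depth map from a monocular estimator; the exact boundaries between the four levels depend on how the hyper-parameter is tuned for the depth model's output range.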
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.