AI-Optimized VMware Horizon VDI: Predictive Resource Scaling for GPU-Intensive Workloads in Hybrid Cloud Environments

Authors

  • Naga Subrahmanyam, Cherukupalle

Keywords:

VMware Horizon, GPU Resource Scaling, Hybrid Cloud, Machine Learning, Instant Clone, Cost-Performance Optimization

Abstract

This paper proposes an AI-driven framework for predictive GPU resource scaling in VMware Horizon Virtual Desktop Infrastructure (VDI). The framework targets performance and cost-efficiency for GPU-intensive workloads such as CAD, AI training, and medical imaging in hybrid cloud environments. By integrating machine learning (ML) models with VMware’s Instant Clone technology, the system forecasts GPU demand and provisions resources dynamically while balancing on-premises and public cloud infrastructure costs. A hybrid Long Short-Term Memory (LSTM) and Reinforcement Learning (RL) model achieves 92% prediction accuracy for GPU utilization and reduces idle resource costs by 35% compared to static allocation. Experimental results demonstrate a 40% improvement in workload latency and 28% savings in public cloud spending.
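The forecast-then-provision loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the moving-average forecaster is a placeholder for the hybrid LSTM and RL model, and the names `forecast_demand`, `plan_capacity`, and `ScalingPlan`, the headroom factor, and the cost figures are all hypothetical. The sketch also assumes that on-premises GPUs are cheaper per hour and are therefore filled before bursting to the public cloud.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPlan:
    on_prem_gpus: int   # GPUs served from on-premises hosts
    cloud_gpus: int     # GPUs burst to the public cloud
    hourly_cost: float  # estimated cost of this plan per hour


def forecast_demand(history, window=3):
    """Placeholder forecaster: mean of the last `window` samples.

    Assumption: this stands in for the paper's LSTM + RL model,
    which is not reproduced here.
    """
    recent = history[-window:]
    return sum(recent) / len(recent)


def plan_capacity(predicted_gpus, on_prem_capacity,
                  on_prem_cost, cloud_cost, headroom=0.1):
    """Split predicted GPU demand between on-premises and cloud.

    Assumption: on-premises capacity is used first, and only the
    overflow is provisioned in the public cloud.
    """
    needed = math.ceil(predicted_gpus * (1 + headroom))
    on_prem = min(needed, on_prem_capacity)
    cloud = needed - on_prem
    cost = on_prem * on_prem_cost + cloud * cloud_cost
    return ScalingPlan(on_prem, cloud, cost)


# Example: recent GPU demand samples (hypothetical values)
history = [10, 12, 14]
predicted = forecast_demand(history)
plan = plan_capacity(predicted, on_prem_capacity=10,
                     on_prem_cost=0.5, cloud_cost=2.0)
print(plan)
```

In a real deployment the plan would drive Instant Clone provisioning through the platform's own management interfaces, which this sketch deliberately leaves out.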

Published

30.07.2024

How to Cite

Naga Subrahmanyam. (2024). AI-Optimized VMware Horizon VDI: Predictive Resource Scaling for GPU-Intensive Workloads in Hybrid Cloud Environments. International Journal of Intelligent Systems and Applications in Engineering, 12(22s), 2062 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7519

Section

Research Article