AI-Optimized VMware Horizon VDI: Predictive Resource Scaling for GPU-Intensive Workloads in Hybrid Cloud Environments
Keywords:
VMware Horizon, GPU Resource Scaling, Hybrid Cloud, Machine Learning, Instant Clone, Cost-Performance Optimization
Abstract
This paper proposes an AI-driven framework for predictive GPU resource scaling in VMware Horizon Virtual Desktop Infrastructure (VDI). The goal is to optimize performance and cost-efficiency for GPU-intensive workloads such as CAD, AI training, and medical imaging in hybrid cloud environments. By integrating machine learning (ML) models with VMware’s Instant Clone technology, the system forecasts GPU demand and provisions resources dynamically while balancing on-premises and public cloud infrastructure costs. A hybrid Long Short-Term Memory (LSTM) and Reinforcement Learning (RL) model achieves 92% prediction accuracy for GPU utilization and reduces idle resource costs by 35% compared to static allocation. Experimental results demonstrate a 40% improvement in workload latency and 28% savings in public cloud spending.
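The forecast-then-provision loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a moving average stands in for the hybrid LSTM and RL model, the placement rule (fill on-premises capacity first, then burst to the public cloud) is an assumed simplification, and every name, capacity, and price below is hypothetical. No VMware Horizon or cloud provider API is called.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Placement:
    on_prem_desktops: int
    cloud_desktops: int
    hourly_cloud_cost: float


def forecast_demand(history, window=3):
    """Stand-in forecaster: mean of the last `window` samples.

    The paper uses a hybrid LSTM + RL model; a moving average is swapped in
    here only to keep the sketch dependency-free.
    """
    recent = history[-window:]
    return sum(recent) / len(recent)


def plan_capacity(history, on_prem_capacity, cloud_cost_per_desktop_hour,
                  headroom=0.10):
    """Fill on-premises capacity first, then burst the remainder to the cloud.

    `headroom` is an assumed safety margin added on top of the forecast.
    """
    predicted = forecast_demand(history)
    needed = ceil(predicted * (1 + headroom))
    on_prem = min(needed, on_prem_capacity)
    cloud = needed - on_prem
    return Placement(on_prem, cloud, cloud * cloud_cost_per_desktop_hour)


# Hypothetical hourly counts of concurrently active GPU desktops.
history = [40, 44, 52, 58, 60]
plan = plan_capacity(history, on_prem_capacity=50,
                     cloud_cost_per_desktop_hour=1.20)
print(plan)
```

In a real deployment the resulting plan would drive instant-clone pool sizing on each side of the hybrid environment. Filling on-premises capacity first reflects the assumption that already-owned GPUs are cheaper per hour than cloud burst capacity.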
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made. If they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.