Modified Deep Q Optimizer with Updated SARSA for Improving Learning Efficiency through Optimum Resource Utilization in Reinforcement Learning

Authors

  • Vaqar Ahmed Ansari, Assistant Professor, St. Francis Institute of Technology
  • Kamal Shah, Professor (IT Dept), Thakur College of Engineering and Technology, Mumbai, India
  • Rohini Patil, Assistant Professor, Terna Engineering College, Navi Mumbai, India
  • Anil Vasoya, Associate Professor, Thakur College of Engineering and Technology, Mumbai, India
  • Payel Saha, Professor, EXTC Department, Thakur College of Engineering and Technology
  • Mary Margaret, Assistant Professor, Thakur College of Engineering and Technology, Mumbai
  • Suresh Rajpurohit, Assistant Professor, Computer Engineering, St. Francis Institute of Technology

Keywords:

Reinforcement learning, Deep Q optimizer, Cost model, Reward function

Abstract

In Reinforcement Learning (RL), the efficiency of an algorithm is ensured by reducing the cost of learning while maximizing rewards. This paper introduces a new technique for RL-based Deep Q optimizers that combines an Updated SARSA algorithm with a newly defined linear cost function. The modified Deep Q (DQ) optimizer performs significantly faster after learning because it uses the value for a particular epoch instead of the value for the whole dataset. The proposed linear cost model yields a wide range of weight parameters "W" in which the mean value is always close to the minimum cost, which makes it easier to cluster and train the features. The proposed modified DQ achieves an 88.2% reduction in the variation of the relative mean cost under the proposed cost model. With the proposed cost model, training execution time for the modified DQ is reduced by 19.35% compared to the existing DQ, and accuracy improves by 3-4% with optimum reward.
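For context, the Updated SARSA algorithm named in the abstract builds on the textbook on-policy SARSA update. The paper's exact modification and cost model are not reproduced here; the sketch below shows only the standard tabular SARSA rule it departs from, and all function names, parameters, and the toy table sizes are illustrative assumptions.

```python
import numpy as np

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """One tabular SARSA step. On-policy: the TD target uses the action
    actually taken next (a_next), unlike Q-learning, whose target takes
    the max over actions in s_next."""
    td_target = r + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Toy usage on a 3-state, 2-action table (illustrative only).
Q = np.zeros((3, 2))
Q = sarsa_update(Q, s=0, a=1, r=1.0, s_next=1, a_next=0)
print(Q[0, 1])  # 0.1 * (1.0 + 0.99 * 0.0 - 0.0) = 0.1
```

The on-policy target is what distinguishes SARSA-style updates from the off-policy Deep Q target; the paper's contribution lies in how this update is combined with a per-epoch value and a linear cost model.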




Published

29.01.2024

How to Cite

Ansari, V. A., Shah, K., Patil, R., Vasoya, A., Saha, P., Margaret, M., & Rajpurohit, S. (2024). Modified Deep Q Optimizer with Updated SARSA for Improving Learning Efficiency through Optimum Resource Utilization in Reinforcement Learning. International Journal of Intelligent Systems and Applications in Engineering, 12(13s), 653–662. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/4634

Section

Research Article