Enhancing Software Development through Prompt Engineering: A Study on Large Language Models for Code Generation and Developer Productivity

Authors

  • Jimit Patel, Meet Bipinchandra Patel, Nishil Sureshkumar Prajapati, Rahul Rathi, Raghavendra Kamarthi Eranna, Pratikkumar Prajapati, Krishna Chaitanaya Chittoor

Keywords:

Prompt engineering, code generation, software development productivity, large language models (LLMs), create tools, AI in software engineering, intelligent programming assistants

Abstract

Large Language Models (LLMs) are reshaping software development through their ability to generate code snippets, assist with debugging, and support design work. Successful use of LLMs depends heavily on prompt engineering: well-designed prompts improve the quality of generated code, streamline developer workflows, and foster effective human-computer interaction. This study examines how prompt engineering improves developer productivity through a structured exploration of prompting strategies for code generation. It introduces a taxonomy of prompt engineering techniques organized around four experimental approaches to coding tasks: instruction-based prompts, example-based prompts, chain-of-thought prompts, and hybrid prompts. Beyond technical quality, the study focuses on developer-oriented productivity metrics, including reduced overall development time, fewer errors, improved readability (e.g., better code structure), and improvements to development tooling such as integrated development environments and collaborative coding tools. A comparative evaluation shows that different prompt patterns have variable impact not only on code quality but also on developer experience, suggesting that prompt engineering can mitigate the persistent burden of debugging and support faster delivery of software to clients. The paper describes barriers to adoption in practice, such as prompt sensitivity, context limitations, and limited reusability, and offers a roadmap for integrating adaptive prompting systems directly into developer environments. By relating LLM capabilities to productivity outcomes, this work provides a new perspective on bridging prompt engineering research with real-world software development pipelines.
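To make the taxonomy concrete, the four prompt patterns named in the abstract can be sketched as simple template functions. This is an illustrative sketch only, not the paper's actual experimental materials; all function names, wording, and the sample task are hypothetical.

```python
# Hypothetical sketch of the four prompt patterns from the abstract's
# taxonomy: instruction-based, example-based (few-shot),
# chain-of-thought, and hybrid. The task text is an arbitrary example.

TASK = "Write a function that returns the n-th Fibonacci number."

def instruction_prompt(task: str) -> str:
    # Instruction-based: state the task directly, with explicit constraints.
    return f"{task}\nUse Python. Include type hints and a docstring."

def example_prompt(task: str) -> str:
    # Example-based (few-shot): prepend an input/output demonstration.
    demo = ("Task: Write a function that reverses a string.\n"
            "Answer: def reverse(s: str) -> str:\n    return s[::-1]\n")
    return f"{demo}\nTask: {task}\nAnswer:"

def chain_of_thought_prompt(task: str) -> str:
    # Chain-of-thought: ask the model to reason before producing code.
    return (f"{task}\nFirst outline the algorithm step by step, "
            "then write the final code.")

def hybrid_prompt(task: str) -> str:
    # Hybrid: combine a demonstration, an instruction, and a reasoning cue.
    return example_prompt(task) + "\nThink step by step. Use Python."

for build in (instruction_prompt, example_prompt,
              chain_of_thought_prompt, hybrid_prompt):
    print(build.__name__, "->", len(build(TASK)), "chars")
```

In an evaluation like the one the abstract describes, each pattern would be sent to the same LLM for the same task set, and the resulting code compared on correctness, readability, and the time developers need to reach a working solution.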

References

Rose, Leema. "An Efficient Transformer-Based Model for Automated Code Generation: Leveraging Large Language Models for Software Engineering." International Journal of Emerging Research in Engineering and Technology 1.3 (2020): 1-9.

Liu, F., Li, G., Zhao, Y. and Jin, Z., 2020, December. Multi-task learning based pre-trained language model for code completion. In Proceedings of the 35th IEEE/ACM international conference on automated software engineering (pp. 473-485).

Solaiman I, Brundage M, Clark J, Askell A, Herbert-Voss A, Wu J, Radford A, Krueger G, Kim JW, Kreps S, McCain M. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203. 2019 Aug 24.

Hellendoorn, Vincent J., Premkumar T. Devanbu, and Alberto Bacchelli. "Will they like this? evaluating code contributions with language models." In 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, pp. 157-167. IEEE, 2015.

Sivaraman, Hariprasad. "Integrating Large Language Models for Automated Test Case Generation in Complex Systems." (2020).

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A. and Agarwal, S., 2020. Language models are few-shot learners. Advances in neural information processing systems, 33, pp.1877-1901.

Domhan, Tobias, and Felix Hieber. "Using target-side monolingual data for neural machine translation through multi-task learning." (2017).

Tucker, George, Minhua Wu, Ming Sun, Sankaran Panchapagesan, Gengshen Fu, and Shiv Vitaladevuni. "Model compression applied to small-footprint keyword spotting." (2016).

Schelter S, Biessmann F, Januschowski T, Salinas D, Seufert S, Szarvas G. On challenges in machine learning model management.

Klöckner, Andreas, et al. "PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation." Parallel computing 38.3 (2012): 157-174.

Dathathri, Sumanth, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, and Rosanne Liu. "Plug and play language models: A simple approach to controlled text generation." arXiv preprint arXiv:1912.02164 (2019).

Xia, Xin, Lingfeng Bao, David Lo, Zhenchang Xing, Ahmed E. Hassan, and Shanping Li. "Measuring program comprehension: A large-scale field study with professionals." IEEE Transactions on Software Engineering 44, no. 10 (2017): 951-976.

Voelter, Markus, Bernd Kolb, Klaus Birken, Federico Tomassetti, Patrick Alff, Laurent Wiart, Andreas Wortmann, and Arne Nordmann. "Using language workbenches and domain-specific languages for safety-critical software development." Software & Systems Modeling 18, no. 4 (2019): 2507-2530.

Schelter, Sebastian, Joos-Hendrik Boese, Johannes Kirschnick, Thoralf Klein, and Stephan Seufert. "Automatically tracking metadata and provenance of machine learning experiments." (2017).

Gupta, Deepali. "The aspects of artificial intelligence in software engineering." Journal of Computational and Theoretical Nanoscience 17, no. 9-10 (2020): 4635-4642.

Schmitt C, Kuckuk S, Köstler H, Hannig F, Teich J. An evaluation of domain-specific language technologies for code generation. In 2014 14th International Conference on Computational Science and Its Applications 2014 Jun 30 (pp. 18-26). IEEE.

Deeptimahanti, D. K., & Sanyal, R. (2011, February). Semi-automatic generation of UML models from natural language requirements. In Proceedings of the 4th India Software Engineering Conference (pp. 165-174).

Sadowski, Caitlin, and Thomas Zimmermann. Rethinking productivity in software engineering. Springer Nature, 2019.

Tomassetti F, Torchiano M, Tiso A, Ricca F, Reggio G. Maturity of software modelling and model driven engineering: A survey in the Italian industry. In 16th International Conference on Evaluation & Assessment in Software Engineering (EASE 2012) 2012 May 14 (pp. 91-100). Stevenage UK: IET.

Klein, John, Harry Levinson, and Jay Marchetti. Model-driven engineering: Automatic code generation and beyond. No. DM0001604. 2015.

Tufano, M., Drain, D., Svyatkovskiy, A., Deng, S.K. and Sundaresan, N., 2020. Unit test case generation with transformers and focal context. arXiv preprint arXiv:2009.05617.

Kats, Lennart CL, Richard G. Vogelij, Karl Trygve Kalleberg, and Eelco Visser. "Software development environments on the web: a research agenda." In Proceedings of the ACM international symposium on New ideas, new paradigms, and reflections on programming and software, pp. 99-116. 2012.

Meyer, André N., Earl T. Barr, Christian Bird, and Thomas Zimmermann. "Today was a good day: The daily life of software developers." IEEE Transactions on Software Engineering 47, no. 5 (2019): 863-880.

Erlenhov, L., Neto, F. G. D. O., & Leitner, P. (2020, November). An empirical study of bots in software development: Characteristics and challenges from a practitioner’s perspective. In Proceedings of the 28th ACM joint meeting on european software engineering conference and symposium on the foundations of software engineering (pp. 445-455).

Mayer, Philip, Michael Kirsch, and Minh Anh Le. "On multi-language software development, cross-language links and accompanying tools: a survey of professional software developers." Journal of Software Engineering Research and Development 5, no. 1 (2017): 1.

Dang, Yingnong, Dongmei Zhang, Song Ge, Ray Huang, Chengyun Chu, and Tao Xie. "Transferring code-clone detection and analysis to practice." In 2017 IEEE/ACM 39th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP), pp. 53-62. IEEE, 2017.

Nagaria, Bhaveet, and Tracy Hall. "How software developers mitigate their errors when developing code." IEEE Transactions on Software Engineering 48, no. 6 (2020): 1853-1867.

Schelter, S., Schmidt, P., Rukat, T., Kiessling, M., Taptunov, A., Biessmann, F. and Lange, D., 2018. Deequ-data quality validation for machine learning pipelines.

Meyer, Andre N., et al. "Design recommendations for self-monitoring in the workplace: Studies in software development." Proceedings of the ACM on Human-Computer Interaction 1.CSCW (2017): 1-24.

Schmucker R, Donini M, Perrone V, Archambeau C. Multi-objective multi-fidelity hyperparameter optimization with application to fairness.


Kevic, Katja, Braden M. Walters, Timothy R. Shaffer, Bonita Sharif, David C. Shepherd, and Thomas Fritz. "Tracing software developers' eyes and interactions for change tasks." In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering, pp. 202-213. 2015.

Mukhtar, M. I., & Galadanci, B. S. (2018). Automatic code generation from UML diagrams: the state-of-the-art. Science World Journal, 13(4), 47-60.

Voelter, Markus, Bernd Kolb, Tamás Szabó, Daniel Ratiu, and Arie van Deursen. "Lessons learned from developing mbeddr: a case study in language engineering with MPS." Software & Systems Modeling 18, no. 1 (2019): 585-630.

Nguyen G, Dlugolinsky S, Bobák M, Tran V, Lopez Garcia A, Heredia I, Malík P, Hluchý L. Machine learning and deep learning frameworks and libraries for large-scale data mining: a survey. Artificial Intelligence Review. 2019 Jun 1;52(1):77-124.

Klimkov, Viacheslav, Adam Nadolski, Alexis Moinet, Bartosz Putrycz, Roberto Barra-Chicote, Tom Merritt, and Thomas Drugman. "Phrase break prediction for long-form reading TTS: Exploiting text structure information." (2018).

Klein, Casey, John Clements, Christos Dimoulas, Carl Eastlund, Matthias Felleisen, Matthew Flatt, Jay A. McCarthy, Jon Rafkind, Sam Tobin-Hochstadt, and Robert Bruce Findler. "Run your research: on the effectiveness of lightweight mechanization." ACM SIGPLAN Notices 47, no. 1 (2012): 285-296.

Gedik B, Andrade H. A model‐based framework for building extensible, high performance stream processing middleware and programming language for IBM InfoSphere Streams. Software: Practice and Experience. 2012 Nov;42(11):1363-91.

Schelter, S., Böse, J.H., Kirschnick, J., Klein, T. and Seufert, S., 2018. Declarative metadata management: A missing piece in end-to-end machine learning.

Yi Q. POET: a scripting language for applying parameterized source‐to‐source program transformations. Software: Practice and Experience. 2012 Jun;42(6):675-706.

Vilar, David. "Learning hidden unit contribution for adapting neural machine translation models." (2018).

von Davier M. Training Optimus prime, MD: Generating medical certification items by fine-tuning OpenAI's gpt2 transformer model. arXiv preprint arXiv:1908.08594. 2019 Aug 23.

Vogel-Heuser, Birgit, Alexander Fay, Ina Schaefer, and Matthias Tichy. "Evolution of software in automated production systems: Challenges and research directions." J. Syst. Softw. 110, no. 110 (2015): 54-84.

LaToza, T.D., Towne, W.B., Adriano, C.M. and Van Der Hoek, A., 2014, October. Microtask programming: Building software with a crowd. In Proceedings of the 27th annual ACM symposium on User interface software and technology (pp. 43-54).

Lockhart, D., Zibrat, G., & Batten, C. (2014, December). PyMTL: A unified framework for vertically integrated computer architecture research. In 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 280-292). IEEE.

Arsikere, Harish, Ashtosh Sapru, and Sri Garimella. "Multi-dialect acoustic modeling using phone mapping and online i-vectors." (2019).

Cho, Hyunsu, and Mu Li. "Treelite: toolbox for decision tree deployment." (2018).

Vetter, J.S., Brightwell, R., Gokhale, M., McCormick, P., Ross, R., Shalf, J., Antypas, K., Donofrio, D., Humble, T., Schuman, C. and Van Essen, B., 2018. Extreme heterogeneity 2018-productive computational science in the era of extreme heterogeneity: Report for DOE ASCR workshop on extreme heterogeneity. USDOE Office of Science (SC), Washington, DC (United States).

Devanbu, Prem, Thomas Zimmermann, and Christian Bird. "Belief & evidence in empirical software engineering." In Proceedings of the 38th international conference on software engineering, pp. 108-119. 2016.

King, Brian, I-Fan Chen, Yonatan Vaizman, Yuzong Liu, Roland Maas, Sree Hari Krishnan Parthasarathi, and Björn Hoffmeister. "Robust speech recognition via anchor word representations." (2017).

Fan, Angela, Beliz Gokkaya, Mark Harman, Mitya Lyubarskiy, Shubho Sengupta, Shin Yoo, and Jie M. Zhang. "Large language models for software engineering: Survey and open problems." In 2023 IEEE/ACM International Conference on Software Engineering: Future of Software Engineering (ICSE-FoSE), pp. 31-53. IEEE, 2023.

Hou, X., Zhao, Y., Liu, Y., Yang, Z., Wang, K., Li, L., Luo, X., Lo, D., Grundy, J. and Wang, H., 2024. Large language models for software engineering: A systematic literature review. ACM Transactions on Software Engineering and Methodology, 33(8), pp.1-79.

Shanuka KA, Wijayanayake J, Vidanage K. Systematic Literature Review on Analyzing the Impact of Prompt Engineering on Efficiency, Code Quality, and Security in Crud Application Development. Journal of Desk Research Review and Analysis. 2024 Dec 30;2(1).

Viswanadhapalli V. AI-Augmented Software Development: Enhancing Code Quality and Developer Productivity Using Large Language Models.

Wang, Guoqing, Zeyu Sun, Zhihao Gong, Sixiang Ye, Yizhou Chen, Yifan Zhao, Qingyuan Liang, and Dan Hao. "Do advanced language models eliminate the need for prompt engineering in software engineering?." arXiv preprint arXiv:2411.02093 (2024).

Marvin G, Hellen N, Jjingo D, Nakatumba-Nabende J. Prompt engineering in large language models. In International Conference on Data Intelligence and Cognitive Informatics 2023 Jun 27 (pp. 387-402). Singapore: Springer Nature Singapore.

Weber, T., Brandmaier, M., Schmidt, A., & Mayer, S. (2024). Significant productivity gains through programming with large language models. Proceedings of the ACM on Human-Computer Interaction, 8(EICS), 1-29.

Gao, Cuiyun, et al. "The current challenges of software engineering in the era of large language models." ACM Transactions on Software Engineering and Methodology 34.5 (2025): 1-30.

Ding H, Fan Z, Guehring I, Gupta G, Ha W, Huan J, Liu L, Omidvar-Tehrani B, Wang S, Zhou H. Reasoning and planning with large language models in code development. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024 Aug 25 (pp. 6480-6490).

Wang T, Zhou N, Chen Z. Enhancing computer programming education with llms: A study on effective prompt engineering for python code generation. arXiv preprint arXiv:2407.05437. 2024 Jul 7.


Li Y, Shi J, Zhang Z. An approach for rapid source code development based on ChatGPT and prompt engineering. IEEE Access. 2024 Apr 8;12:53074-87.

Zheng, Zibin, Kaiwen Ning, Qingyuan Zhong, Jiachi Chen, Wenqing Chen, Lianghong Guo, Weicheng Wang, and Yanlin Wang. "Towards an understanding of large language models in software engineering tasks." Empirical Software Engineering 30, no. 2 (2025): 50.

White, Jules, et al. "Chatgpt prompt patterns for improving code quality, refactoring, requirements elicitation, and software design." Generative AI for Effective Software Development. Cham: Springer Nature Switzerland, 2024. 71-108.

Khojah, R., de Oliveira Neto, F.G., Mohamad, M. and Leitner, P., 2025. The impact of prompt programming on function-level code generation. IEEE Transactions on Software Engineering.

Belzner, Lenz, Thomas Gabor, and Martin Wirsing. "Large language model assisted software engineering: prospects, challenges, and a case study." In International conference on bridging the gap between AI and reality, pp. 355-374. Cham: Springer Nature Switzerland, 2023.

Zhang, Quanjun, Tongke Zhang, Juan Zhai, Chunrong Fang, Bowen Yu, Weisong Sun, and Zhenyu Chen. "A critical review of large language model on software engineering: An example from chatgpt and automated program repair." arXiv preprint arXiv:2310.08879 (2023).

Shi J, Yang Z, Lo D. Efficient and Green Large Language Models for Software Engineering: Literature Review, Vision, and the Road Ahead. ACM Transactions on Software Engineering and Methodology. 2025 May 24;34(5):1-22.

Ross, Steven I., Fernando Martinez, Stephanie Houde, Michael Muller, and Justin D. Weisz. "The programmer’s assistant: Conversational interaction with a large language model for software development." In Proceedings of the 28th International Conference on Intelligent User Interfaces, pp. 491-514. 2023.

Jiang J, Wang F, Shen J, Kim S, Kim S. A survey on large language models for code generation. arXiv preprint arXiv:2406.00515. 2024 Jun 1.

Shethiya, Aditya S. "From Code to Cognition: Engineering Software Systems with Generative AI and Large Language Models." Integrated Journal of Science and Technology 1.4 (2024).

Paul R, Hossain MM, Siddiq ML, Hasan M, Iqbal A, Santos J. Enhancing automated program repair through fine-tuning and prompt engineering. arXiv preprint arXiv:2304.07840. 2023 Apr 16.

Wang, Junjie, Yuchao Huang, Chunyang Chen, Zhe Liu, Song Wang, and Qing Wang. "Software testing with large language models: Survey, landscape, and vision." IEEE Transactions on Software Engineering 50, no. 4 (2024): 911-936.


Di Rocco, Juri, Davide Di Ruscio, Claudio Di Sipio, Phuong T. Nguyen, and Riccardo Rubei. "On the use of large language models in model-driven engineering." Software and Systems Modeling 24, no. 3 (2025): 923-948.

Li H, Su J, Chen Y, Li Q, Zhang ZX. Sheetcopilot: Bringing software productivity to the next level through large language models. Advances in Neural Information Processing Systems. 2023 Dec 15;36:4952-84.

Nazzal, Mahmoud, Issa Khalil, Abdallah Khreishah, and NhatHai Phan. "Promsec: Prompt optimization for secure generation of functional source code with large language models (llms)." In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pp. 2266-2280. 2024.

Feng, Sidong, and Chunyang Chen. "Prompting is all you need: Automated android bug replay with large language models." In Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, pp. 1-13. 2024.

Silva, Á.F., Mendes, A. and Ferreira, J.F., 2024, April. Leveraging large language models to boost Dafny’s developers productivity. In Proceedings of the 2024 IEEE/ACM 12th International Conference on Formal Methods in Software Engineering (FormaliSE) (pp. 138-142).

Rasheed Z, Sami MA, Kemell KK, Waseem M, Saari M, Systä K, Abrahamsson P. Codepori: Large-scale system for autonomous software development using multi-agent technology. arXiv preprint arXiv:2402.01411. 2024 Feb 2.


Published

31.12.2024

How to Cite

Jimit Patel. (2024). Enhancing Software Development through Prompt Engineering: A Study on Large Language Models for Code Generation and Developer Productivity. International Journal of Intelligent Systems and Applications in Engineering, 12(23s), 3885 –. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/7942

Issue

Section

Research Article