Scalable Adaptive ETL Frameworks for Real-Time Risk Scoring in Financial Data Lake Environments
Keywords:
Finance Domain, Subledger, Data Architect, Cloud ETL, Data Lake, Azure, BI, SQL
Abstract
Financial data management is changing quickly, and institutions need a scalable framework that adapts to these changes and supports real-time risk scoring in data lake environments. Traditional batch-oriented ETL pipelines rarely deliver the agility needed to handle the volume and complexity of today's financial transactions, especially when subledger systems must be connected with external market feeds. This research focuses on the design and development of a cloud ETL architecture on Azure, built with PySpark and SQL, that integrates structured and unstructured data streams. Industry reports indicate that global investment in financial data lakes exceeds $12 billion, and that more than 70% of financial institutions emphasize the importance of data architect roles for ensuring resilience and compliance. The proposed solution reads subledger data and market feeds and applies adaptive transformations to cleanse, normalize, and enrich the data in a centralized data lake. The results show an increase in fraud detection accuracy, a reduction in the time needed to assess credit risk, and straightforward integration with BI dashboards for real-time data visualization. The adaptive ETL design also supports schema evolution, scalability, and regulatory transparency in line with the requirements of IFRS 9 and Basel III. The framework connects operational data with analytical intelligence, giving financial institutions a strategic platform for becoming resilient, compliant, and competitive in a data-driven economy.
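The cleanse, normalize, and enrich steps described in the abstract can be illustrated with a minimal sketch. It uses plain Python rather than PySpark so that it runs without a cluster, and it is not the paper's implementation: the alias table `ALIASES`, the field names, the sample records, and the FX rates are all hypothetical and exist only to show how a transformation can tolerate drifting source schemas.

```python
# Hypothetical alias table: maps drifting source column names to canonical ones.
ALIASES = {
    "acct_id": "account_id",
    "accountId": "account_id",
    "amt": "amount",
    "ccy": "currency",
}


def normalize(record):
    """Cleanse and normalize one subledger record.

    Known aliases are renamed to canonical names; unknown fields are kept
    rather than dropped, so new source columns do not break the pipeline.
    """
    out = {}
    for key, value in record.items():
        canonical = ALIASES.get(key, key)
        if isinstance(value, str):
            value = value.strip()  # cleanse stray whitespace
        out[canonical] = value
    out["amount"] = float(out["amount"])
    out["currency"] = out["currency"].upper()
    return out


def enrich(record, fx_rates):
    """Attach a base-currency amount using a market-feed FX rate, if one exists."""
    rate = fx_rates.get(record["currency"])
    enriched = dict(record)
    enriched["amount_base"] = (
        None if rate is None else round(record["amount"] * rate, 2)
    )
    return enriched


# Illustrative inputs only: two records that arrive with different schemas.
fx_rates = {"EUR": 1.10, "USD": 1.0}
raw = [
    {"acct_id": " A-1 ", "amt": "100.00", "ccy": "eur"},
    {"accountId": "A-2", "amount": 20, "currency": "USD", "branch": "NY"},
]
rows = [enrich(normalize(r), fx_rates) for r in raw]
```

Both records end up with the same canonical keys, and the extra `branch` field on the second record is carried through unchanged. In a Spark-based pipeline the same idea would typically be expressed as column renames and joins against a rates table.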
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others remix, transform, and build upon the material, provided they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.


