Automated Bias Detection within the Cardiovascular Disease Dataset using MapReduce Framework with Balance Measure

Authors

  • Jyoti Prakhar
  • Tanwir Uddin Haider (SMIEEE)

Keywords:

Bias, Big dataset, Balance measure approach, MapReduce Framework

Abstract

Today, many fields, including health care, rely on decision support systems to make appropriate decisions based on datasets. A decision support system for cardiovascular disease depends entirely on a large dataset, so if that dataset is biased, it is difficult to decide whether a person has cardiovascular disease. Bias detection in cardiovascular disease datasets has become a complex task because large datasets are processed directly. Another major drawback is that biases are detected across the full set of attributes rather than on the protected attributes alone, which increases computational cost, since biases lie within the protected attributes. Identifying the protected attribute from the set of attributes is therefore a major challenge. Further, in the past, bias identification was done manually using statistical techniques, which produced unreliable results, i.e., a minimal bias value related to cardiovascular disease. Considering all these challenges, we introduce a novel framework for automated bias detection within large cardiovascular disease datasets. In the proposed methodology, we identify the protected attribute, namely gender, using the MapReduce framework. The balance measure approach is then applied to the protected attribute of the cardiovascular disease dataset to detect biases. The comparative results show that detecting biases on protected attributes outperforms existing works in terms of bias value, accuracy, precision, and F1 score, which are 28%, 72%, 73%, and 81%, respectively. Together, these metrics indicate the superior performance of the proposed methodology.
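
The abstract does not define the balance measure or give the MapReduce job itself, so the following is only an illustrative sketch under stated assumptions. It counts records per (gender, label) pair in a map/reduce style and then computes an assumed balance score, the ratio of the smallest to the largest gender group. The records, function names, and the balance formula are hypothetical and are not taken from the paper.

```python
from collections import Counter
from functools import reduce

# Hypothetical records of the form (gender, has_cvd); not data from the paper.
RECORDS = [
    ("male", 1), ("male", 0), ("male", 1), ("male", 1),
    ("female", 0), ("female", 1),
]

def map_phase(record):
    """Emit a one-entry count table keyed by (gender, label)."""
    gender, label = record
    return Counter({(gender, label): 1})

def reduce_phase(left, right):
    """Merge two partial count tables by summing matching keys."""
    return left + right

def group_sizes(counts):
    """Collapse (gender, label) counts into per-gender totals."""
    sizes = Counter()
    for (gender, _label), n in counts.items():
        sizes[gender] += n
    return sizes

def balance(sizes):
    """Assumed balance score: smallest group size / largest group size.

    A value of 1.0 means equally sized groups; values near 0 mean one
    group dominates. This is NOT necessarily the paper's definition.
    """
    return min(sizes.values()) / max(sizes.values())

counts = reduce(reduce_phase, map(map_phase, RECORDS), Counter())
sizes = group_sizes(counts)
print(dict(sizes), balance(sizes))
```

On a real cluster, the map and reduce steps would run as distributed tasks over dataset partitions; here they run in a single process purely to show the data flow.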

Published

24.03.2024

How to Cite

Prakhar, J., & Haider, T. U. (2024). Automated Bias Detection within the Cardiovascular Disease Dataset using MapReduce Framework with Balance Measure. International Journal of Intelligent Systems and Applications in Engineering, 12(3), 2188–2196. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/5688

Section

Research Article