Integrating User Attribute Influences and DL-Based Anonymization for Enhanced Privacy Protection in Medical Record Data Sharing for Publishing

Authors

  • Lingam Suman, S. Venkata Lakshmi

Keywords:

Privacy protection, Quasi-identifiers, Sensitive attributes, Sanitization, Utility

Abstract

In today's data-driven healthcare landscape, sharing medical records for research is pivotal for advancing medical knowledge and patient care. However, protecting individuals' privacy while maintaining data utility poses a significant challenge. To tackle this issue, this study proposes a novel Attribute Influence Anonymization using RVAE (AIARVAE) model for enhancing both privacy and utility in medical record data sharing. The proposed model employs a preprocessing step to identify and filter Quasi-Identifiers (QIs) and Sensitive Attributes (SAs) from the dataset. It then quantifies the susceptibility of the QIs and measures the uncertainty of the SAs using entropy. These metrics are fed into a Recurrent Variational Auto-Encoder (RVAE) model, which replaces low-entropy SAs with sanitized values generated with the help of the QI values. This approach mitigates the risk of explicit disclosure of private information while preserving data utility. By integrating attribute influences, the proposed model provides a comprehensive solution for safeguarding the privacy of medical record data during research sharing and for promoting responsible and ethical data-driven healthcare research.
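The entropy step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the column names, the toy records, the grouping of records by identical QI values, and the 0.5-bit threshold are all assumptions made for illustration. The sketch computes the Shannon entropy of one sensitive attribute within each QI group and flags the low-entropy groups, which are the ones the abstract says would receive sanitized values.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sa_entropy_by_qi_group(records, qi_names, sa_name):
    """Group records by their QI values and compute SA entropy per group."""
    groups = defaultdict(list)
    for row in records:
        key = tuple(row[q] for q in qi_names)
        groups[key].append(row[sa_name])
    return {key: shannon_entropy(vals) for key, vals in groups.items()}


def low_entropy_groups(entropies, threshold):
    """Return the QI groups whose SA entropy falls below `threshold`."""
    return [key for key, h in entropies.items() if h < threshold]


# Hypothetical toy records; the attribute names are illustrative only.
records = [
    {"age_band": "30-39", "gender": "F", "condition": "Diabetes"},
    {"age_band": "30-39", "gender": "F", "condition": "Diabetes"},
    {"age_band": "40-49", "gender": "M", "condition": "Asthma"},
    {"age_band": "40-49", "gender": "M", "condition": "Cancer"},
]

entropies = sa_entropy_by_qi_group(records, ["age_band", "gender"], "condition")
# Assumed threshold of 0.5 bits; the paper's actual criterion is not given here.
at_risk = low_entropy_groups(entropies, threshold=0.5)
print(entropies)
print(at_risk)
```

In this toy data, every record in the first group shares the same condition, so knowing the QI values alone reveals the sensitive value. That is the kind of explicit disclosure a low entropy score is meant to signal.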

Published

20.06.2024

How to Cite

Lingam Suman. (2024). Integrating User Attribute Influences and DL-Based Anonymization for Enhanced Privacy Protection in Medical Record Data Sharing for Publishing. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 688–696. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6272

Section

Research Article