Integrating User Attribute Influences and DL-Based Anonymization for Enhanced Privacy Protection in Medical Record Data Sharing for Publishing
Keywords:
Privacy protection, Quasi-identifiers, Sensitive attributes, Sanitization, Utility
Abstract
In today's data-driven healthcare landscape, sharing medical records for research is pivotal for advancing medical knowledge and patient care. However, protecting individuals' privacy while maintaining data utility poses a significant challenge. To address it, this study proposes Attribute Influence Anonymization using a Recurrent Variational Auto-Encoder (RVAE), termed AIARVAE, to enhance both privacy and utility in medical record data sharing. The proposed model first applies a preprocessing step that identifies and filters Quasi-Identifiers (QIs) and Sensitive Attributes (SAs) from the dataset. It then quantifies the susceptibility of the QIs and measures the uncertainty of the SAs using entropy. These metrics are fed into the RVAE model, which replaces low-entropy SAs with sanitized values, guided by the QI values. This approach mitigates the risk of explicit disclosure of private information while preserving data utility. By integrating attribute influences, the proposed model offers a comprehensive approach to safeguarding the privacy of medical records shared for research and promotes responsible, ethical data-driven healthcare research.
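The entropy step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: Shannon entropy is computed over the empirical distribution of a sensitive attribute, records are grouped by identical QI values, and the function names, the `threshold` parameter, and the toy attribute names are hypothetical. The RVAE replacement step is not shown.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_entropy_groups(records, qi_keys, sa_key, threshold):
    """Return {QI tuple: SA entropy} for groups whose SA entropy is below `threshold`.

    Assumption: records sharing the same QI values form one group, and a
    group with low SA entropy is a candidate for sanitization.
    """
    groups = {}
    for rec in records:
        key = tuple(rec[k] for k in qi_keys)
        groups.setdefault(key, []).append(rec[sa_key])

    flagged = {}
    for key, sa_values in groups.items():
        h = shannon_entropy(sa_values)
        if h < threshold:
            flagged[key] = h
    return flagged


if __name__ == "__main__":
    # Toy records with hypothetical attribute names, not taken from the paper's dataset.
    records = [
        {"age": "30-39", "zip": "123**", "disease": "flu"},
        {"age": "30-39", "zip": "123**", "disease": "flu"},
        {"age": "40-49", "zip": "456**", "disease": "flu"},
        {"age": "40-49", "zip": "456**", "disease": "asthma"},
    ]
    for key, h in low_entropy_groups(records, ["age", "zip"], "disease", 0.5).items():
        print(key, round(h, 3))
```

In this toy data, the first QI group shares a single disease value, so its sensitive attribute carries no uncertainty and would be the kind of low-entropy case the abstract describes as being replaced with a sanitized value.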
License
![Creative Commons License](http://i.creativecommons.org/l/by-sa/4.0/88x31.png)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.