Multi-Version Infrastructure for Privacy-Preserving AI/ML Inference at Scale
Keywords:
Privacy-Preserving Inference, Multi-Version Feature Representations, Embedding Compliance, Differential Privacy, Regulatory-Aware Machine Learning
Abstract
As regulatory regimes, multi-stakeholder data relationships, and compliance requirements multiply, privacy becomes an increasingly central architectural concern for large-scale AI/ML inference systems. Inference pipelines that apply a single, globally restrictive data policy to every inference context incur a measurable decrease in model performance, while dynamically modifying data usage per request risks introducing policy violations. To avoid both problems, our multi-version architecture explicitly maintains multiple versions of user and participant information at the feature and embedding levels. Context-aware version selection mechanisms deterministically map the metadata of an incoming request to the appropriate data usage policy at runtime. Versioned feature vectors are generated from superset representations of the available signals, and the appropriate version is selected based on the request context and its corresponding data usage policy. Model-specific embeddings are derived from these privacy-compliant feature vectors to ensure end-to-end compliance. Rule-based selection schemes, implemented as abstractions decoupled from inference execution code, allow rapid regulatory adaptation without requiring service redeployment. Continuous monitoring helps validate selection quality and detect performance regressions in production environments. The computational overhead of generating and maintaining multiple feature and embedding versions can be reduced through centralized build-once orchestration, shared feature storage schemas, and hybrid offline–online embedding generation, while staying within internet-scale latency budgets.
Beyond privacy, this architectural pattern generalizes to fairness-aware inference, multi-tenant data isolation, and auditable policy enforcement, establishing versioned feature and embedding representations as a foundational primitive for developing trustworthy, policy-compliant AI/ML systems.
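To make the selection step concrete, the following is a minimal Python sketch of rule-based version selection over a superset feature representation. It is an illustration under stated assumptions, not the implementation described in the paper: the feature names, the policy version labels ("full", "restricted", "minimal"), the request fields ("consent", "jurisdiction"), and the rules themselves are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical superset of signals for one user (names are illustrative only).
SUPERSET: Dict[str, float] = {
    "age_bucket": 3.0,
    "region_code": 12.0,
    "purchase_count": 7.0,
    "precise_location": 48.85,
}

# Hypothetical policy versions: each lists the features that version may expose.
POLICY_VERSIONS: Dict[str, Tuple[str, ...]] = {
    "full": ("age_bucket", "region_code", "purchase_count", "precise_location"),
    "restricted": ("age_bucket", "region_code", "purchase_count"),
    "minimal": ("region_code",),
}


@dataclass(frozen=True)
class Rule:
    """One selection rule, kept as data so it can change without touching inference code."""
    name: str
    matches: Callable[[dict], bool]
    version: str


# Ordered rules; the first match wins, which keeps the mapping deterministic.
RULES: List[Rule] = [
    Rule("no_consent", lambda r: not r.get("consent", False), "minimal"),
    Rule("eu_request", lambda r: r.get("jurisdiction") == "EU", "restricted"),
    Rule("consented", lambda r: r.get("consent") is True, "full"),
]

# Assumption: fall back to the most restrictive version if no rule matches.
FALLBACK_VERSION = "minimal"


def select_version(request: dict) -> str:
    """Map request metadata to a policy version using the ordered rules."""
    for rule in RULES:
        if rule.matches(request):
            return rule.version
    return FALLBACK_VERSION


def versioned_features(superset: Dict[str, float], version: str) -> List[float]:
    """Project the superset representation onto the features a version allows."""
    return [superset[name] for name in POLICY_VERSIONS[version]]


if __name__ == "__main__":
    request = {"consent": True, "jurisdiction": "EU"}
    version = select_version(request)
    print(version, versioned_features(SUPERSET, version))
```

Because the rules live in a plain list rather than in the inference path, a rule set of this shape could in principle be loaded from configuration and replaced at runtime, which is one way to read the abstract's point about adapting to regulation without service redeployment.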
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.


