File System RPO Snapshots in Near Real-Time with Asynchronous Replication
Keywords:
RPO snapshots, asynchronous delay, RPO violation, Disaster Recovery, RPO snapshots near real-time
Abstract
In File System level Disaster Recovery configurations, asynchronous data replication is often used between the local Primary File System and the remote Recovery File System so that replication adds no latency to the applications running on the local Primary File System. Individual data updates on the local Primary File System are replicated to the remote Recovery File System in the background after a deliberate delay, called the asynchronous delay. During this delay, optimizations such as coalescing small contiguous writes into a single larger write and discarding short-lived data updates reduce the network bandwidth required for replication. However, the asynchronous delay can postpone the periodic Recovery Point Objective (RPO) snapshots taken on the Recovery File System for data consistency, because a large amount of data may still be pending replication when a snapshot is due. A late RPO snapshot means more data can be lost if a disaster hits the local Primary File System, which constitutes an RPO violation, so taking RPO snapshots strictly at RPO intervals is critical. This paper describes a new, efficient procedure for taking RPO snapshots close to the RPO interval by replicating pending data updates to the remote Recovery File System early, without waiting for the asynchronous delay to expire. The early replication start time before the next RPO time is calculated from the amount of data pending replication, the aggregate network bandwidth between the local primary location and the remote recovery location, and the aggregate rate at which applications generate data.
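The abstract names the three inputs to the early replication calculation but does not give the formula. The sketch below shows one plausible way they could be combined, assuming a constant bandwidth and a constant data generation rate; it is not the paper's actual method. All names (`ReplicationState`, `drain_seconds`, `early_replication_start`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplicationState:
    # All fields are illustrative assumptions, not names from the paper.
    pending_bytes: float    # data queued for replication right now
    bandwidth_bps: float    # aggregate link bandwidth, in bytes per second
    generation_bps: float   # aggregate application write rate, in bytes per second


def drain_seconds(state: ReplicationState) -> float:
    """Seconds needed to empty the queue while new data keeps arriving.

    Assumes constant rates: the queue shrinks at (bandwidth - generation).
    """
    if state.pending_bytes <= 0:
        return 0.0
    net_rate = state.bandwidth_bps - state.generation_bps
    if net_rate <= 0:
        # The link cannot keep up, so the queue never drains.
        return float("inf")
    return state.pending_bytes / net_rate


def early_replication_start(next_rpo_time: float, state: ReplicationState) -> float:
    """Latest time at which early replication can start and still finish
    by next_rpo_time. A result in the past (or -inf) means start immediately.
    """
    return next_rpo_time - drain_seconds(state)


# Example: 6 GB pending, 125 MB/s link, applications writing 25 MB/s.
state = ReplicationState(pending_bytes=6e9, bandwidth_bps=125e6, generation_bps=25e6)
start = early_replication_start(next_rpo_time=900.0, state=state)
```

With these example numbers the queue drains at a net 100 MB/s, so replication would need to begin 60 seconds before the RPO time. A real system would likely re-evaluate this periodically, since both rates vary over time.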
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others share, remix, transform, and build upon the material, provided they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.