Abstract
Online remote data replication is critical for today’s enterprise IT organizations. Data availability is key to an organization’s success: a few hours of downtime can cost from thousands to millions of dollars. With increasing frequency, companies are instituting disaster recovery plans to ensure appropriate data availability in the event of a catastrophic failure or disaster that destroys a site (e.g., flood, fire, or earthquake).
Synchronous and asynchronous replication technologies have long been available. Synchronous replication has the advantage of no data loss, but because of latency it is limited by distance and bandwidth. Asynchronous replication, on the other hand, has no distance limitation but leads to some data loss, which is proportional to the data lag. We present a novel method, implemented within EMC RecoverPoint, that allows the system to move dynamically between these replication options without any disruption to the I/O path. As latency grows, the system moves from synchronous replication to semi-synchronous replication and then to snapshot shipping. It returns to synchronous replication when more bandwidth is available and latency allows.
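The mode transitions described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `Mode` and `choose_mode`, the `bandwidth_ok` flag, and both latency thresholds are hypothetical, and the abstract does not state how RecoverPoint actually decides when to switch.

```python
from enum import Enum


class Mode(Enum):
    SYNC = "synchronous"
    SEMI_SYNC = "semi-synchronous"
    SNAPSHOT = "snapshot shipping"


# Hypothetical thresholds, chosen only for illustration.
SYNC_MAX_LATENCY_MS = 5.0
SEMI_SYNC_MAX_LATENCY_MS = 50.0


def choose_mode(latency_ms: float, bandwidth_ok: bool) -> Mode:
    """Pick a replication mode from the observed link conditions.

    Synchronous replication is used only when latency is low and
    bandwidth is sufficient; as latency grows the choice degrades to
    semi-synchronous replication and then to snapshot shipping.
    """
    if latency_ms <= SYNC_MAX_LATENCY_MS and bandwidth_ok:
        return Mode.SYNC
    if latency_ms <= SEMI_SYNC_MAX_LATENCY_MS:
        return Mode.SEMI_SYNC
    return Mode.SNAPSHOT


if __name__ == "__main__":
    # Latency rises and then recovers, so the mode degrades and returns.
    for latency in (2.0, 20.0, 200.0, 3.0):
        print(latency, choose_mode(latency, bandwidth_ok=True).value)
```

A real policy would likely add hysteresis so that the system does not oscillate between modes when latency hovers near a threshold; the sketch leaves that out for brevity.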
Index Terms
Dynamic Synchronous/Asynchronous Replication