Abstract
Checkpoint-recovery based virtual machine (VM) replication is an attractive technique for providing VM installations with high availability. It provides seamless failover for the entire software stack executed in the VM regardless of the application or the underlying operating system (OS), it runs on commodity hardware, and it is inherently capable of dealing with the shared-memory non-determinism of symmetric multiprocessing (SMP) configurations. Several studies have aimed at alleviating the overhead of replication; however, due to consistency requirements, the network performance of the basic replication mechanism remains extremely poor.
In this paper we revisit the replication protocol and extend it with speculative communication. Speculative communication silently acknowledges TCP packets of the VM, enabling the guest's TCP stack to progress with transmission without exposing the messages to the clients before the corresponding execution state is checkpointed to the backup host. Furthermore, we propose replication-aware congestion control, an extension to the guest's TCP stack that aggressively fills up the replication buffer of the virtual machine monitor (VMM) so that speculative packets can be backed up and released to the clients earlier. We observe up to an order of magnitude improvement in bulk data transfer with speculative communication, and close to native VM network performance when replication awareness is enabled in the guest OS. We provide results of micro-benchmarks as well as application-level benchmarks.
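The buffering idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the class and method names (`SpeculativeBuffer`, `guest_sends`, `checkpoint_committed`) and the epoch counter are assumptions made for illustration. The sketch only models the ordering rule stated in the abstract: the guest receives an acknowledgement immediately, while the packet stays hidden from clients until the checkpoint covering it has been committed to the backup.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    seq: int     # first TCP sequence number carried by the segment
    length: int  # payload length in bytes


@dataclass
class SpeculativeBuffer:
    """Hypothetical VMM-side output buffer (illustrative only)."""
    epoch: int = 0
    pending: deque = field(default_factory=deque)  # (epoch, Packet) pairs
    released: list = field(default_factory=list)   # packets visible to clients

    def guest_sends(self, pkt: Packet) -> int:
        """Hold the packet back and return a speculative ACK number."""
        self.pending.append((self.epoch, pkt))
        return pkt.seq + pkt.length

    def checkpoint_committed(self) -> list:
        """The backup holds the state of the current epoch: release its packets."""
        out = []
        while self.pending and self.pending[0][0] <= self.epoch:
            out.append(self.pending.popleft()[1])
        self.released.extend(out)
        self.epoch += 1
        return out


buf = SpeculativeBuffer()
ack = buf.guest_sends(Packet(seq=1000, length=500))  # guest may keep sending
buf.guest_sends(Packet(seq=1500, length=500))
visible_before = list(buf.released)                  # nothing exposed yet
flushed = buf.checkpoint_committed()                 # both packets released
```

In this model the guest's send window advances on the speculative ACK, so more data accumulates in the buffer per checkpoint interval, which is the effect the replication-aware congestion control is meant to amplify.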
Enhancing TCP throughput of highly available virtual machines via speculative communication
VEE '12: Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments