research-article

Enhancing TCP throughput of highly available virtual machines via speculative communication

Published: 03 March 2012

Abstract

Checkpoint-recovery based virtual machine (VM) replication is an attractive technique for providing VM installations with high availability. It provides seamless failover for the entire software stack executed in the VM regardless of the application or the underlying operating system (OS), it runs on commodity hardware, and it is inherently capable of dealing with the shared-memory non-determinism of symmetric multiprocessing (SMP) configurations. Several studies have aimed at alleviating the overhead of replication; however, due to consistency requirements, the network performance of the basic replication mechanism remains extremely poor.

In this paper, we revisit the replication protocol and extend it with speculative communication. Speculative communication silently acknowledges TCP packets of the VM, enabling the guest's TCP stack to progress with transmission without exposing the messages to the clients before the corresponding execution state is checkpointed to the backup host. Furthermore, we propose replication-aware congestion control, an extension to the guest's TCP stack that aggressively fills up the VMM's replication buffer so that speculative packets can be backed up and released earlier to the clients. We observe up to an order of magnitude improvement in bulk data transfer with speculative communication, and close to native VM network performance when replication awareness is enabled in the guest OS. We provide results of both micro-benchmarks and application-level benchmarks.
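The buffering idea described above can be illustrated with a toy model. The sketch below is not the authors' implementation: the names `Packet`, `ReplicationBuffer`, `guest_send`, `start_checkpoint`, and `checkpoint_committed` are hypothetical, and the model assumes that output is grouped into checkpoint epochs and released once the backup confirms the matching checkpoint. It omits retransmission, handling of the clients' real ACKs, and congestion control.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    seq: int      # sequence number of the first payload byte
    length: int   # payload length in bytes


@dataclass
class ReplicationBuffer:
    """Hypothetical VMM-side buffer for outgoing guest TCP segments."""
    pending: deque = field(default_factory=deque)  # (epoch, packet) not yet checkpointed
    released: list = field(default_factory=list)   # packets already visible to clients
    epoch: int = 0                                 # current checkpoint epoch

    def guest_send(self, pkt: Packet) -> int:
        """Hold the segment back and return a speculative ACK number to the guest."""
        self.pending.append((self.epoch, pkt))
        return pkt.seq + pkt.length

    def start_checkpoint(self) -> int:
        """Close the current epoch and return its id for a later commit."""
        closed = self.epoch
        self.epoch += 1
        return closed

    def checkpoint_committed(self, epoch: int) -> list:
        """Release every buffered segment that belongs to an epoch <= `epoch`."""
        out = []
        while self.pending and self.pending[0][0] <= epoch:
            out.append(self.pending.popleft()[1])
        self.released.extend(out)
        return out


if __name__ == "__main__":
    buf = ReplicationBuffer()
    ack = buf.guest_send(Packet(seq=1000, length=500))  # guest is ACKed at once
    ack = buf.guest_send(Packet(seq=ack, length=500))   # and keeps transmitting
    closed = buf.start_checkpoint()                     # state is sent to the backup
    buf.guest_send(Packet(seq=ack, length=100))         # belongs to the next epoch
    sent = buf.checkpoint_committed(closed)             # backup confirmed the epoch
    print([p.seq for p in sent], len(buf.pending))
```

In this model the guest never waits for a client round trip, while clients only see data whose producing state already exists on the backup, which is the consistency property the abstract describes.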

Published in

ACM SIGPLAN Notices, Volume 47, Issue 7 (VEE '12)
July 2012
229 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2365864

VEE '12: Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments
March 2012
248 pages
ISBN: 9781450311762
DOI: 10.1145/2151024

Copyright © 2012 ACM

Publisher

Association for Computing Machinery
New York, NY, United States