Breaking the Boundaries in Heterogeneous-ISA Datacenters

Published: 04 April 2017

Abstract

Energy efficiency is one of the most important design considerations in running modern datacenters. Datacenter operating systems rely on software techniques such as execution migration to achieve energy efficiency across pools of machines. Execution migration is possible in datacenters today because they consist mainly of homogeneous-ISA machines. However, recent market trends indicate that alternate ISAs such as ARM and PowerPC are pushing into the datacenter, meaning current execution migration techniques are no longer applicable. How can execution migration be applied in future heterogeneous-ISA datacenters?

In this work we present a compiler, a runtime, and an operating system extension for enabling execution migration between heterogeneous-ISA servers. We present a new multi-ISA binary architecture and heterogeneous-OS containers for facilitating efficient migration of natively compiled applications. We build and evaluate a prototype of our design and demonstrate energy savings of up to 66% for a workload running on an ARM and an x86 server interconnected by a high-speed network.
