
Demon: An Efficient Solution for on-Device MMU Virtualization in Mediated Pass-Through

Published: 25 March 2018

Abstract

Memory Management Units (MMUs) for on-device address translation are widely used in modern devices. However, conventional solutions for on-device MMU virtualization, such as the shadow page tables implemented in mediated pass-through, still suffer from high complexity and low performance.

We present Demon, an efficient solution for on-DEvice MMU virtualizatiON in mediated pass-through. The key insight is that Demon uses the IOMMU to construct a two-dimensional address translation and dynamically switches the second-dimension page table to the proper candidate whenever the device owner switches. To support fine-grained parallelism on devices with multiple engines, we put forward a hardware proposal that separates the address space of each engine and enables simultaneous device address remapping for multiple virtual machines (VMs). We implement Demon in a prototype named gDemon, which virtualizes the Intel GPU MMU; Demon itself, however, is not limited to this particular case. Evaluations show that, compared with gVirt, gDemon provides up to 19.73x better performance in media transcoding workloads and improves performance by up to 17.09% and 13.73% in 2D and 3D benchmarks, respectively. The current release of gDemon scales up to 6 VMs with moderate performance in our experiments. In addition, gDemon simplifies the implementation of GPU MMU virtualization with a 37% code reduction.
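The two-dimensional translation described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not gDemon's implementation: it assumes the first dimension is a guest-managed on-device MMU table mapping device addresses to guest-physical addresses, and the second dimension is a per-VM IOMMU table mapping guest-physical to host-physical addresses. The single-level tables, the 4 KiB page size, and the names `PageTable`, `Device`, and `switch_owner` are all hypothetical simplifications.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageTable:
    """Toy single-level page table: page number -> frame number."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def translate(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        if page not in self.mapping:
            raise KeyError(f"page fault at address {addr:#x}")
        return self.mapping[page] * PAGE_SIZE + offset


class Device:
    """Toy device whose accesses go through two translation dimensions."""

    def __init__(self, iommu_tables):
        # Hypothetical: one second-dimension (IOMMU) table per VM id.
        self.iommu_tables = iommu_tables
        self.owner = None
        self.first_dim = None   # guest's on-device MMU table (assumed)
        self.second_dim = None  # IOMMU table of the current owner (assumed)

    def switch_owner(self, vm_id, guest_table):
        # On a device-owner switch, select the new owner's
        # second-dimension table instead of rebuilding a shadow table.
        self.owner = vm_id
        self.first_dim = guest_table
        self.second_dim = self.iommu_tables[vm_id]

    def translate(self, device_addr):
        guest_phys = self.first_dim.translate(device_addr)  # 1st dimension
        return self.second_dim.translate(guest_phys)        # 2nd dimension


# Two VMs use identical guest-side mappings but land in different host frames.
device = Device({"vm1": PageTable({5: 100}), "vm2": PageTable({5: 200})})

device.switch_owner("vm1", PageTable({0: 5}))
hpa_vm1 = device.translate(0x10)

device.switch_owner("vm2", PageTable({0: 5}))
hpa_vm2 = device.translate(0x10)
```

In this model the guest tables are used as-is, and isolation between VMs comes only from which second-dimension table is active when the device owner changes.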


Published in

ACM SIGPLAN Notices, Volume 53, Issue 3 (VEE '18), March 2018, 99 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3296975

VEE '18: Proceedings of the 14th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, March 2018, 106 pages
ISBN: 9781450355797
DOI: 10.1145/3186411

Copyright © 2018 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: tutorial, Research, Refereed limited
