Abstract
The IOMMU allows the OS to encapsulate I/O devices in their own virtual memory spaces, thus restricting their DMAs to specific memory pages. The OS uses the IOMMU to protect itself against buggy drivers and malicious or errant devices. But the added protection comes at a cost, degrading the throughput of I/O-intensive workloads by up to an order of magnitude. This cost has motivated system designers to trade off some safety for performance, e.g., by leaving stale information in the IOTLB for a while so as to amortize costly invalidations. We observe that high-bandwidth devices, such as network and PCIe SSD controllers, interact with the OS via circular ring buffers that induce a sequential, predictable workload. We design a ring IOMMU (rIOMMU) that leverages this characteristic by replacing the virtual memory page table hierarchy with a circular, flat table. A flat table is adequately supported by exactly one IOTLB entry, making every new translation an implicit invalidation of the former and thus requiring explicit invalidations only at the end of I/O bursts. Using standard networking benchmarks, we show that rIOMMU provides up to 7.56x higher throughput relative to the baseline IOMMU, and that it achieves 0.77--1.00x the throughput of a system without IOMMU protection.
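The single-entry idea described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class `RingIommuModel`, its methods, the `end_of_burst` flag, and the 4 KiB page size are all hypothetical names and choices introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass
class RingEntry:
    phys_page: int       # physical page number the device may access
    valid: bool = False  # set on map, cleared on unmap


class RingIommuModel:
    """Toy model (hypothetical): a flat circular table with one cached entry."""

    def __init__(self, size: int) -> None:
        self.table: List[RingEntry] = [RingEntry(0) for _ in range(size)]
        # The single "IOTLB" slot: (ring index, physical page) or None.
        self.cached: Optional[Tuple[int, int]] = None
        self.explicit_invalidations = 0
        self.table_lookups = 0

    def map(self, index: int, phys_page: int) -> None:
        self.table[index % len(self.table)] = RingEntry(phys_page, True)

    def unmap(self, index: int, end_of_burst: bool) -> None:
        self.table[index % len(self.table)].valid = False
        # Only at the end of a burst is there no later translation that
        # would displace the cached entry, so only then is it flushed.
        if end_of_burst:
            self.cached = None
            self.explicit_invalidations += 1

    def translate(self, index: int, offset: int) -> int:
        index %= len(self.table)
        if self.cached is None or self.cached[0] != index:
            entry = self.table[index]
            if not entry.valid:
                raise PermissionError(f"DMA to unmapped ring entry {index}")
            # Loading the new entry overwrites the previous one,
            # which acts as an implicit invalidation.
            self.cached = (index, entry.phys_page)
            self.table_lookups += 1
        return (self.cached[1] << PAGE_SHIFT) | offset


if __name__ == "__main__":
    iommu = RingIommuModel(size=4)
    for i in range(4):
        iommu.map(i, 0x100 + i)
    for i in range(4):               # the device walks the ring sequentially
        iommu.translate(i, 0x10)
    for i in range(4):               # the driver unmaps the whole burst
        iommu.unmap(i, end_of_burst=(i == 3))
    print(iommu.table_lookups, iommu.explicit_invalidations)
```

In this model a burst of four sequential DMAs costs four table lookups but a single explicit invalidation, which is the amortization the abstract attributes to the flat table.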
rIOMMU: Efficient IOMMU for I/O Devices that Employ Ring Buffers
ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems