
MultiLanes: Providing Virtualized Storage for OS-Level Virtualization on Manycores

Published: 16 June 2016

Abstract

OS-level virtualization is often used for server consolidation in data centers because of its high efficiency. However, colocated containers share the storage stack services, which incurs contention on shared kernel data structures and locks within the I/O stack. This contention leads to severe performance degradation on manycore platforms that incorporate fast storage technologies (e.g., SSDs based on nonvolatile memories).

This article presents MultiLanes, a virtualized storage system for OS-level virtualization on manycores. For each container, MultiLanes builds an isolated I/O stack on top of a virtualized storage device, which eliminates contention on kernel data structures and locks between containers and thus allows them to scale to manycores. We also propose a set of techniques that reduce the overhead induced by storage-device virtualization to a negligible level and that scale the virtualized devices to manycores on the host, which itself scales poorly. To reduce the contention within each single container, we further propose SFS, which runs multiple file-system instances on the proposed virtualized storage devices, distributes all files under each directory among the underlying file-system instances, and then stacks a unified namespace on top of them.
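The SFS idea of spreading the files of one directory over several file-system instances and presenting them through a single namespace can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how files are assigned to instances, so the CRC32-of-path policy is an assumption, and the class name `StackedNamespace`, its methods, and the use of in-memory dictionaries in place of real file-system instances are all hypothetical.

```python
import zlib


class StackedNamespace:
    """Toy model of one logical namespace stacked over several backends."""

    def __init__(self, num_instances):
        # Each dict stands in for one independent file-system instance,
        # mapping an absolute path to the file contents.
        self.instances = [dict() for _ in range(num_instances)]

    def _pick(self, path):
        # ASSUMED policy (not from the article): hash the full path so that
        # files under the same directory spread across the instances.
        return zlib.crc32(path.encode("utf-8")) % len(self.instances)

    def write(self, path, data):
        # Route the file to exactly one backend instance.
        self.instances[self._pick(path)][path] = data

    def read(self, path):
        # The same hash finds the instance that holds the file.
        return self.instances[self._pick(path)][path]

    def listdir(self, directory):
        # The unified view of a directory is the union of the entries
        # that each instance holds directly under that directory.
        prefix = directory.rstrip("/") + "/"
        names = set()
        for inst in self.instances:
            for path in inst:
                rest = path[len(prefix):]
                if path.startswith(prefix) and "/" not in rest:
                    names.add(rest)
        return sorted(names)
```

In this model, a caller would create `StackedNamespace(4)`, write files such as `/data/a.txt` and `/data/b.txt`, and see both in `listdir("/data")` regardless of which instance stores each one. Operations on different files can then touch different instances, which is the intuition behind reducing contention inside a single container.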

We evaluate a prototype built for Linux Containers (LXC) on a 32-core machine with both a RAM disk and a modern flash-based SSD. The results demonstrate that MultiLanes scales much better than Linux in micro- and macro-benchmarks, bringing significant performance improvements, and that MultiLanes with SFS further reduces the contention within each single container.

