Abstract

With Solid State Disks (SSDs) offering high degrees of parallelism, SSD controllers place data and direct requests to exploit the maximum offered hardware parallelism. In the quest to maximize parallelism and utilization, sub-requests of a request that the scheduler directs to different flash chips can experience different wait times, since the chips' individual queues are not coordinated and load balanced at all times. Because the macro request is considered complete only when its last sub-request completes, sub-requests that complete earlier must wait for this last one. This paper opens the door to a new class of schedulers that leverage such slack between sub-requests to improve response times. Specifically, the paper presents the design and implementation of a slack-enabled re-ordering scheduler, called Slacker, for sub-requests issued to each flash chip. Layered under a modern SSD request scheduler, Slacker estimates the slack of each incoming sub-request to a flash chip and allows it to jump ahead of existing sub-requests that have sufficient slack, so as not to detrimentally impact their response times. Slacker is simple to implement and imposes only marginal additions to the hardware. Using a spectrum of 21 workloads with diverse read-write characteristics, we show that Slacker provides as much as 19.5%, 13% and 14.5% improvement in response times, with average improvements of 12%, 6.5% and 8.5%, for write-intensive, read-intensive and read-write balanced workloads, respectively.
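
The re-ordering idea in the abstract can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not the paper's actual algorithm: the names `SubRequest`, `assign_slack` and `insert_with_slack`, the time units, and the rule that a queued sub-request may be bypassed only when its slack covers the newcomer's full service time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubRequest:
    req_id: int           # parent (macro) request this sub-request belongs to
    service_time: float   # time the chip needs to serve it (hypothetical units)
    slack: float = 0.0    # delay it can absorb without delaying its parent


def assign_slack(finish_times):
    """Slack of each sibling = latest sibling finish time - its own finish time.

    `finish_times` holds the estimated completion time of every sub-request
    of one macro request; the slowest sibling gets zero slack.
    """
    last = max(finish_times)
    return [last - t for t in finish_times]


def insert_with_slack(queue, new):
    """Insert `new` into one chip's queue (a list, head at index 0).

    Walk from the tail toward the head and jump ahead of every queued
    sub-request whose slack covers `new.service_time`. Stop at the first
    one that cannot absorb the delay. Assumed policy, for illustration only.
    """
    pos = len(queue)
    while pos > 0 and queue[pos - 1].slack >= new.service_time:
        pos -= 1
    # Every bypassed sub-request is delayed by the newcomer's service time.
    for bypassed in queue[pos:]:
        bypassed.slack -= new.service_time
    queue.insert(pos, new)
    return pos
```

For example, siblings estimated to finish at times 10, 25 and 40 receive slacks of 30, 15 and 0. A newcomer with a service time of 10 arriving at a queue whose entries have slacks 0, 12 and 20 would be placed directly behind the zero-slack head, and the two bypassed entries would keep slacks of 2 and 10.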
Exploiting Intra-Request Slack to Improve SSD Performance
ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems