research-article

NCQ vs. I/O scheduler: Preventing unexpected misbehaviors

Published: 05 April 2010

Abstract

Native Command Queueing (NCQ) is an optimization technology that maximizes throughput by reordering requests inside a disk drive. It has been so successful that NCQ has become standard in the SATA 2 protocol specification, and the great majority of disk vendors have adopted it for their recent disks. However, the technology may open an information gap between the OS and the disk drive. An NCQ-enabled disk tries to optimize throughput without knowing the intention of the OS, whereas the OS, lacking specific knowledge of the disk's internal mechanism, does its best under the assumption that the disk will do as it is told. We call this mismatch expectation discord; it may cause serious problems such as request starvation or performance anomalies. In this article, we (1) confirm that expectation discord actually occurs in real systems; (2) propose software-level approaches to resolve it; and (3) evaluate our mechanism. Experimental results show that our solution is simple, cheap (no special hardware required), portable, and effective.
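The starvation risk described in the abstract can be illustrated with a toy model. The sketch below is not the article's mechanism and does not model real NCQ firmware. It assumes a device that greedily serves whichever queued request is nearest to the current head position, an OS that dispatches strictly in FIFO order, and a fixed queue depth. The function name `simulate`, the block addresses, and the queue depth of 4 are all hypothetical.

```python
from collections import deque


def simulate(device_reorders, queue_depth=4, total_requests=12):
    """Return request ids in completion order for a toy disk model.

    Assumptions (illustrative only): request 0 targets a far block
    (address 1000), and requests 1..total_requests-1 target a run of
    nearby blocks. The OS hands requests to the device in FIFO order;
    the device holds at most `queue_depth` requests at once.
    """
    # (request id, block address) pairs in the order the OS issues them.
    pending = deque([(0, 1000)] + [(i, 10 + i) for i in range(1, total_requests)])
    device_queue = []
    head = 0
    completed = []

    while pending or device_queue:
        # The OS refills the device queue in FIFO order.
        while pending and len(device_queue) < queue_depth:
            device_queue.append(pending.popleft())

        if device_reorders:
            # Greedy device-side reordering: serve the nearest block first.
            chosen = min(device_queue, key=lambda req: abs(req[1] - head))
        else:
            # No reordering: serve the oldest queued request.
            chosen = device_queue[0]

        device_queue.remove(chosen)
        head = chosen[1]
        completed.append(chosen[0])

    return completed


if __name__ == "__main__":
    print("OS expectation (FIFO):  ", simulate(device_reorders=False))
    print("Greedy device reordering:", simulate(device_reorders=True))
```

In this model the OS issues request 0 first and expects it to complete first, but the greedy device keeps preferring the nearby requests that continue to arrive, so request 0 completes last. A longer stream of nearby requests would delay it further, which is the kind of gap between OS intent and device behavior that the abstract calls expectation discord.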



Published in

ACM Transactions on Storage, Volume 6, Issue 1
March 2010
99 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/1714454

Copyright © 2010 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 5 April 2010
• Received: 1 December 2009
• Accepted: 1 December 2009


Qualifiers

• research-article
• Research
• Refereed
