Eliminating receive livelock in an interrupt-driven kernel

Online: 01 August 1997

Abstract

Most operating systems use interface interrupts to schedule network tasks. Interrupt-driven systems can provide low overhead and good latency at low offered load, but degrade significantly at higher arrival rates unless care is taken to prevent several pathologies. These are various forms of receive livelock, in which the system spends all of its time processing interrupts, to the exclusion of other necessary tasks. Under extreme conditions, no packets are delivered to the user application or the output of the system. To avoid livelock and related problems, an operating system must schedule network interrupt handling as carefully as it schedules process execution. We modified an interrupt-driven networking implementation to do so; this modification eliminates receive livelock without degrading other aspects of system performance. Our modifications include the use of polling when the system is heavily loaded, while retaining the use of interrupts under lighter load. We present measurements demonstrating the success of our approach.
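
As a rough illustration of the behavior described above, the following toy model compares delivered throughput when receive interrupts always take priority with throughput when a polling loop handles a bounded number of packets per time step. It is only a sketch under stated assumptions, not the authors' implementation: the function names, the per-tick CPU budget, the per-packet costs, and the queue limit are all hypothetical values chosen to make the effect visible.

```python
def interrupt_driven(rate, ticks=1000, budget=100,
                     rx_cost=2, deliver_cost=3, queue_limit=50):
    """Toy model: receive interrupts always pre-empt delivery work.

    rate is the number of packets arriving per tick; budget is the CPU
    work available per tick (all numbers are illustrative assumptions).
    Returns the average number of packets delivered per tick.
    """
    queued = 0
    delivered = 0
    for _ in range(ticks):
        # Interrupt-level receive work runs first and may use the whole tick.
        received = min(rate, budget // rx_cost)
        remaining = budget - received * rx_cost
        # Packets that overflow the queue are dropped after CPU time
        # has already been spent receiving them.
        queued = min(queue_limit, queued + received)
        # Delivery only gets whatever CPU time is left over.
        done = min(queued, remaining // deliver_cost)
        queued -= done
        delivered += done
    return delivered / ticks


def polled_with_quota(rate, ticks=1000, budget=100,
                      rx_cost=2, deliver_cost=3):
    """Toy model: a polling loop takes at most `quota` packets per tick
    and processes each one to completion (receive plus delivery)."""
    quota = budget // (rx_cost + deliver_cost)
    delivered = 0
    for _ in range(ticks):
        # Excess packets are left at the interface and dropped there,
        # before any CPU time is spent on them.
        delivered += min(rate, quota)
    return delivered / ticks


if __name__ == "__main__":
    for rate in (10, 20, 30, 50, 100):
        print(rate, interrupt_driven(rate), polled_with_quota(rate))
```

In this model both variants behave identically at low load. Once the arrival rate exceeds what the CPU can fully process, the interrupt-driven variant delivers fewer packets and eventually none, while the quota-limited polling variant stays at its peak rate because overload is shed before any work is invested in the dropped packets.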

      Reviews

      David Michael Bowen

      The move from polling loops to interrupt systems in computer hardware and operating systems is generally considered a step forward in computer development; cycles that were once spent waiting for I/O operations to complete can now be spent doing real work for the computer user. When an interrupt-driven system is overloaded, however, it can reach a livelock state, in which all of the CPU time is spent in handling interrupts, and no cycles are available to process the information associated with the interrupts. The authors suggest that, in such livelock situations, limiting interrupts and returning to polling is a way to improve system performance. They make their case by studying several systems that are susceptible to livelock: an Internet Protocol (IP) router, a network monitor, and a network file server, all using kernels based on BSD 4.2. They obtained trace data from an instrumented kernel that show the system going into livelock. They then tested modified kernels to see if they performed better under the same load.

      While this paper describes current research, I hope people other than researchers will read it. The problem described is a real one in current network computing, and the authors have described the problem and their approach to solving it in language that can be understood by students taking a first course in operating systems. While the authors' ideas may not be the ultimate solution to this problem, they certainly seem like a good first step. Future researchers can benefit from the analysis the authors have provided.
