ABSTRACT
Inter-process communication (ipc) has to be fast and effective; otherwise programmers will not use remote procedure calls (RPC), multithreading, and multitasking adequately. Thus ipc performance is vital for modern operating systems, especially μ-kernel based ones. Surprisingly, most μ-kernels exhibit poor ipc performance, typically requiring 100 μs for a short message transfer on a modern processor running at a 50 MHz clock rate.

In contrast, we achieve 5 μs, a twentyfold improvement.

This paper describes the methods and principles used, starting from the architectural design and going down to the coding level. There is no single trick to obtaining this high performance; rather, a synergetic approach in design and implementation on all levels is needed. The methods and their synergy are illustrated by applying them to a concrete example, the L3 μ-kernel (an industrial-quality operating system in daily use at several hundred sites). The main ideas are to guide the complete kernel design by the ipc requirements, and to make heavy use of the concept of virtual address space inside the μ-kernel itself.

As the L3 experiment shows, significant performance gains are possible: compared with Mach, they range from a factor of 22 (8-byte messages) to a factor of 3 (4-Kbyte messages). Although hardware-specific details influence both the design and the implementation, these techniques are applicable to the whole class of conventional general-purpose von Neumann processors supporting virtual addresses. Furthermore, the effort required is reasonably small; for example, the dedicated parts of the μ-kernel can be concentrated in a single medium-sized module.
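To make the quoted figures concrete, the following minimal sketch converts the two latencies stated above into clock-cycle budgets at the stated 50 MHz clock rate. The function and variable names are illustrative assumptions and do not come from the paper; only the three numeric inputs are taken from the abstract.

```python
def cycles(latency_us: float, clock_mhz: float) -> float:
    """Convert a latency in microseconds to clock cycles at a given clock rate."""
    # 1 MHz is one cycle per microsecond, so the product is already in cycles.
    return latency_us * clock_mhz


# Figures quoted in the abstract; the names are hypothetical.
CLOCK_MHZ = 50.0        # processor clock rate
TYPICAL_IPC_US = 100.0  # typical short-message ipc on most μ-kernels
L3_IPC_US = 5.0         # short-message ipc reported for L3

typical_cycles = cycles(TYPICAL_IPC_US, CLOCK_MHZ)
l3_cycles = cycles(L3_IPC_US, CLOCK_MHZ)
speedup = TYPICAL_IPC_US / L3_IPC_US

print(f"typical: {typical_cycles:.0f} cycles")  # typical: 5000 cycles
print(f"L3: {l3_cycles:.0f} cycles")            # L3: 250 cycles
print(f"speedup: {speedup:.0f}x")               # speedup: 20x
```

Read this way, the 5 μs figure corresponds to a budget of roughly 250 cycles for a complete short message transfer, which suggests why the abstract stresses optimization at every level from architecture down to coding.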