Abstract
Software optimization is a constant concern in developing high-performance systems. To accelerate the execution of a specific functionality, software developers usually define and implement a fast path that speeds up the critical, commonly executed functions in the workflow. However, producing a bug-free fast path is nontrivial. Our study of the Linux kernel shows that a committed fast path can have up to 19 follow-up bug-fix patches, and most of these fix deep semantic bugs, which are difficult for existing bug-finding tools to pinpoint.
In this paper, we present fast-path bugs as a new category of software bugs, based on our study of such bugs across various system software, including virtual memory managers, file systems, networking, and device drivers. We investigate their root causes and identify five error-prone aspects of a fast path: path state, trigger condition, path output, fault handling, and assistant data structure. We find that many of the deep bugs can be prevented by applying static analysis that incorporates simple semantic information. We extract a set of rules from our findings and build a toolkit, PALLAS, to check for fast-path bugs. The evaluation results show that PALLAS can effectively reveal fast-path bugs in a variety of systems, including the Linux kernel, a mobile operating system, a software-defined networking system, and a web browser.
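To make the notion of a fast path concrete, the following is a minimal Python sketch of a cached lookup with a slow-path fallback. It is an illustration only and does not reflect how PALLAS works or any code from the studied systems; the class `SizeStore` and all of its methods are hypothetical names, and the comments mapping lines to the five aspects are one reading of the terms used above, not definitions taken from the paper.

```python
class SizeStore:
    """Toy file-size store with a cached fast path (all names hypothetical)."""

    def __init__(self):
        self._table = {}   # authoritative state, always consulted by the slow path
        self._cache = {}   # assistant data structure, consulted by the fast path

    def write(self, name, size):
        if size < 0:
            raise ValueError("size must be non-negative")
        self._table[name] = size
        # Path state: every update must also reach the cache, or the fast
        # path will later return a stale size.
        self._cache[name] = size

    def delete(self, name):
        if name not in self._table:
            raise FileNotFoundError(name)
        del self._table[name]
        self._cache.pop(name, None)  # drop the cached copy as well

    def size_slow(self, name):
        # Slow path: the general, always-correct lookup.
        if name not in self._table:
            raise FileNotFoundError(name)  # fault handling
        return self._table[name]

    def size(self, name):
        # Trigger condition: the fast path applies only on a cache hit.
        if name in self._cache:
            return self._cache[name]  # path output must equal size_slow(name)
        # Otherwise fall back to the slow path, which also reports missing files.
        result = self.size_slow(name)
        self._cache[name] = result
        return result


store = SizeStore()
store.write("a.txt", 10)
store.write("a.txt", 20)
# The fast path and the slow path agree after an update.
assert store.size("a.txt") == store.size_slow("a.txt") == 20
```

Removing the cache update in `write` or the cache eviction in `delete` would leave the program running without a crash while the fast path silently returns stale results, which is the kind of semantic error that a generic bug-finding tool would likely miss.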
Pallas: Semantic-Aware Checking for Finding Deep Bugs in Fast Path
ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems