Abstract
Increasingly many systems have to run all the time with no downtime allowed. Consider, for example, systems controlling electric power plants and e-banking servers. Nevertheless, security patches and a constant stream of new operating system versions need to be deployed without stopping running programs. These factors naturally lead to a pressing demand for live update: upgrading all or parts of the operating system without rebooting. Unfortunately, existing solutions require significant manual intervention and thus work reliably only for small operating system patches.
In this paper, we describe an automated system for live update that can safely and automatically handle major upgrades without rebooting. We have implemented our ideas in Proteos, a new research OS designed with live update in mind. Proteos relies on system support and nonintrusive instrumentation to handle even very complex updates with minimal manual effort. The key novelty is the idea of state quiescence, which allows updates to happen only in safe and predictable system states. A second novelty is the ability to automatically perform transactional live updates at the process level, ensuring a safe and stable update process. Unlike prior solutions, Proteos supports automated state transfer, state checking, and hot rollback. We have evaluated Proteos on 50 real updates and on novel live update scenarios. The results show that our techniques can effectively support both simple and complex updates, while outperforming prior solutions in terms of flexibility, security, reliability, and stability of the update process.
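The update flow summarized above (update only in a quiescent state, transfer state to the new version, check it, and roll back on failure) can be illustrated with a minimal sketch. This is not Proteos code: the `Process` class, the `live_update` function, and the `transfer` and `check` callbacks are hypothetical names chosen for illustration, and the real system operates on OS processes rather than Python objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Process:
    """Hypothetical stand-in for one OS server process; not a Proteos type."""
    version: str
    state: Dict[str, int] = field(default_factory=dict)
    quiescent: bool = False  # True only while the process sits at a safe update point


def live_update(old: Process,
                new_version: str,
                transfer: Callable[[Dict[str, int]], Dict[str, int]],
                check: Callable[[Dict[str, int]], bool]) -> Process:
    """Return the new process on success, or the untouched old one on rollback."""
    if not old.quiescent:
        # Assumption: an update is attempted only in a safe, predictable state.
        return old
    # Build the new version on the side. The old process is never mutated,
    # so abandoning the attempt is cheap (the "hot rollback" case).
    candidate = Process(version=new_version, quiescent=True)
    try:
        candidate.state = transfer(dict(old.state))
        if not check(candidate.state):
            raise ValueError("transferred state failed validation")
    except Exception:
        return old  # roll back: the old version keeps running
    return candidate  # commit: the new version takes over


old = Process("v1", {"open_files": 3}, quiescent=True)
ok = live_update(old, "v2",
                 lambda s: {**s, "cache_size": 0},
                 lambda s: "cache_size" in s)
bad = live_update(old, "v2",
                  lambda s: {},
                  lambda s: "open_files" in s)
print(ok.version, bad.version)  # prints "v2 v1"
```

The sketch keeps the old version intact until the new one has passed its check, which is what makes the update transactional at the process level: a failed transfer or check leaves the running system unchanged.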
Safe and automatic live update for operating systems
ASPLOS '13: Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems