Abstract
Rebooting an operating system is a last-resort but effective recovery technique. However, system performance degrades considerably just after a reboot because the page cache held in main memory is lost. For fast performance recovery, we propose a new reboot mechanism called the warm-cache reboot. With the help of a virtual machine monitor (VMM), the warm-cache reboot preserves the page cache during the reboot and enables the operating system to restore it afterward. To ensure correct recovery, the VMM guarantees that the reused page cache is consistent with the corresponding files on disk. We have implemented the warm-cache reboot mechanism in the Xen VMM and the Linux operating system. Our experimental results show that the warm-cache reboot reduces performance degradation just after the reboot. In addition, we confirmed that file cache corrupted by faults was not reused. The overhead of maintaining cache consistency was usually not large.
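The consistency guarantee described in the abstract can be illustrated with a toy model. The sketch below is not the Xen/Linux implementation; the class `ToyMonitor`, its methods, and the rule it uses (a page counts as consistent only when its contents last passed through disk I/O) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMonitor:
    """Hypothetical stand-in for the VMM: its state survives guest reboots."""
    disk: dict = field(default_factory=dict)       # (file, block) -> bytes on disk
    preserved: dict = field(default_factory=dict)  # (file, block) -> bytes kept in memory
    consistent: set = field(default_factory=set)   # keys known to match the disk

    def disk_read(self, key):
        # Data that arrives from disk matches the disk by construction.
        data = self.disk[key]
        self.preserved[key] = data
        self.consistent.add(key)
        return data

    def memory_write(self, key, data):
        # An in-memory update (or a fault corrupting memory) breaks the guarantee.
        self.preserved[key] = data
        self.consistent.discard(key)

    def disk_write(self, key):
        # Writing the page back makes disk and memory agree again.
        self.disk[key] = self.preserved[key]
        self.consistent.add(key)

    def reusable_after_reboot(self):
        # Only pages still known to be consistent are handed back to the guest.
        return {k: v for k, v in self.preserved.items() if k in self.consistent}


def reboot(monitor):
    """Simulate a guest reboot: the new cache starts from what the monitor vouches for."""
    return dict(monitor.reusable_after_reboot())


monitor = ToyMonitor(disk={("a.txt", 0): b"alpha", ("b.txt", 0): b"beta"})
monitor.disk_read(("a.txt", 0))
monitor.disk_read(("b.txt", 0))
monitor.memory_write(("b.txt", 0), b"dirty")  # never written back before the reboot

warm_cache = reboot(monitor)
print(sorted(warm_cache))
```

In this model the page for `a.txt` is reused after the reboot, while the modified page for `b.txt` is discarded and would have to be read from disk again. A real mechanism would also need to detect the modification itself, which this sketch simply assumes is reported to the monitor.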
Fast and correct performance recovery of operating systems using a virtual machine monitor
VEE '11: Proceedings of the 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments