Abstract
We present a system, RCV, for enabling software applications to survive divide-by-zero and null-dereference errors. RCV operates directly on off-the-shelf, production, stripped x86 binary executables. RCV implements recovery shepherding, which attaches to the application process when an error occurs, repairs the execution, tracks the repair effects as the execution continues, contains the repair effects within the application process, and detaches from the process after all repair effects are flushed from the process state. RCV therefore incurs negligible overhead during the normal execution of the application.
We evaluate RCV on all divide-by-zero and null-dereference errors in the CVE database [2] from January 2011 to March 2013 for which 1) publicly available inputs that trigger the error exist and 2) we were able to use those inputs to trigger the reported error in our experimental environment. We collected a total of 18 errors in seven real-world applications: Wireshark, the FreeType library, Claws Mail, LibreOffice, GIMP, the PHP interpreter, and Chromium. For 17 of the 18 errors, RCV enables the application to continue executing and to provide acceptable output and service to its users on the error-triggering inputs. For 13 of the 18 errors, the continued RCV execution eventually flushes all of the repair effects, and RCV detaches to restore the application to full clean functionality. A manual analysis of the source code relevant to our benchmark errors indicates that, for 11 of the 18 errors, the RCV and later patched versions produce identical or equivalent results on all inputs.
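The attach, repair, track, contain, and detach cycle described in the abstract can be illustrated with a small toy model. The sketch below is not RCV, which works on stripped x86 binaries; it is a hypothetical register-machine interpreter written only to make the control flow concrete. The function name `shepherd`, the instruction tuples, the choice of 0 as the repair value, and the policy of suppressing a `send` of repair-influenced data are all assumptions of this sketch and are not stated in the abstract.

```python
def shepherd(program):
    """Run a toy register program; return (registers, sent values, event log).

    Hypothetical instruction forms (assumptions of this sketch):
      ("const", dst, value)   ("add", dst, a, b)
      ("div", dst, a, b)      ("send", src)
    """
    regs, tainted, sent, log = {}, set(), [], []
    attached = False  # the shepherd is attached only after an error occurs

    for instr in program:
        op = instr[0]
        if op == "send":
            src = instr[1]
            if src in tainted:
                # Containment (assumed policy): repair-influenced data
                # is not allowed to leave the process.
                log.append(f"contain send({src})")
            else:
                sent.append(regs[src])
        else:
            dst, args = instr[1], instr[2:]
            repaired = False
            if op == "const":
                value, srcs = args[0], ()
            else:
                srcs = args
                x, y = regs[args[0]], regs[args[1]]
                if op == "add":
                    value = x + y
                elif op == "div":
                    if y == 0:
                        # Error: attach and repair with an assumed value of 0.
                        if not attached:
                            attached = True
                            log.append("attach")
                        value, repaired = 0, True
                        log.append(f"repair {dst}")
                    else:
                        value = x // y
                else:
                    raise ValueError(f"unknown op: {op}")
            regs[dst] = value
            # Track repair effects: dst is influenced if it was repaired or
            # computed from influenced sources; a clean overwrite flushes it.
            if repaired or any(s in tainted for s in srcs):
                tainted.add(dst)
            else:
                tainted.discard(dst)

        # Detach once every repair effect has been flushed from the state.
        if attached and not tainted:
            attached = False
            log.append("detach")

    return regs, sent, log


program = [
    ("const", "a", 10),
    ("const", "b", 0),
    ("div", "c", "a", "b"),  # divide-by-zero: attach and repair c
    ("add", "d", "c", "a"),  # d now depends on the repaired value
    ("send", "d"),           # contained while d is influenced
    ("const", "c", 1),       # clean overwrite flushes c
    ("const", "d", 2),       # clean overwrite flushes d, so detach
    ("send", "d"),           # clean value is sent normally
]
regs, sent, log = shepherd(program)
print(log)
print(sent)
```

In this toy run the set of influenced registers is empty whenever the shepherd is detached, which mirrors the abstract's point that tracking costs are paid only between an error and the moment its effects are flushed.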
References
- Chromium's multi-process architecture. http://blog.chromium.org/2008/09/multi-process-architecture.html.
- Common vulnerabilities and exposures. http://cve.mitre.org/.
- The libunwind project. http://www.nongnu.org/libunwind/.
- SPEC CPU2006. http://www.spec.org/cpu2006/.
- E. D. Berger and B. G. Zorn. DieHard: Probabilistic memory safety for unsafe languages. In Proceedings of the 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '06, pages 158--168. ACM, 2006.
- B. Buck and J. K. Hollingsworth. An API for runtime code patching. Int. J. High Perform. Comput. Appl., 14(4):317--329, Nov. 2000.
- M. Carbin, S. Misailovic, M. Kling, and M. C. Rinard. Detecting and escaping infinite loops with Jolt. In Proceedings of the 25th European Conference on Object-Oriented Programming, ECOOP '11, pages 609--633. Springer-Verlag, 2011.
- A. Carzaniga, A. Gorla, A. Mattavelli, N. Perino, and M. Pezzè. Automatic recovery from runtime failures. In Proceedings of the 2013 International Conference on Software Engineering, ICSE '13, pages 782--791. IEEE Press, 2013.
- B. Demsky and M. C. Rinard. Goal-directed reasoning for specification-based data structure repair. IEEE Trans. Software Eng., 32(12):931--951, 2006.
- K. Dobolyi and W. Weimer. Changing Java's semantics for handling null pointer exceptions. In 19th International Symposium on Software Reliability Engineering (ISSRE), pages 47--56, 2008.
- P. Dubroy and R. Balakrishnan. A study of tabbed browsing among Mozilla Firefox users. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 673--682. ACM, 2010.
- Y. H. Eom and B. Demsky. Self-stabilizing Java. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '12, pages 287--298. ACM, 2012.
- V. P. Kemerlis, G. Portokalidis, K. Jee, and A. D. Keromytis. libdft: Practical dynamic data flow tracking for commodity systems. In Proceedings of the 8th ACM SIGPLAN/SIGOPS Conference on Virtual Execution Environments, VEE '12, pages 121--132. ACM, 2012.
- Y. Khmelevsky, M. Rinard, and S. Sidiroglou. A source-to-source transformation tool for error fixing. CASCON, 2013.
- D. Kim, J. Nam, J. Song, and S. Kim. Automatic patch generation learned from human-written patches. In Proceedings of the 2013 International Conference on Software Engineering, ICSE '13, pages 802--811. IEEE Press, 2013.
- M. Kling, S. Misailovic, M. Carbin, and M. Rinard. Bolt: On-demand infinite loop escape in unmodified binaries. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA '12, pages 431--450. ACM, 2012.
- J. Lions. Ariane 5 flight 501 failure: Report by the inquiry board, 1996.
- F. Long, V. Ganesh, M. Carbin, S. Sidiroglou, and M. Rinard. Automatic input rectification. In Proceedings of the 2012 International Conference on Software Engineering, ICSE 2012, pages 80--90. IEEE Press, 2012.
- F. Long, S. Sidiroglou-Douskos, D. Kim, and M. C. Rinard. Sound input filter generation for integer overflow errors. In POPL, pages 439--452, 2014.
- C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood. Pin: Building customized program analysis tools with dynamic instrumentation. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '05, pages 190--200. ACM, 2005.
- V. Nagarajan, D. Jeffrey, and R. Gupta. Self-recovery in server programs. In Proceedings of the 2009 International Symposium on Memory Management, ISMM '09, pages 49--58. ACM, 2009.
- H. H. Nguyen and M. Rinard. Detecting and eliminating memory leaks using cyclic memory allocation. In Proceedings of the 6th International Symposium on Memory Management, ISMM '07, pages 15--30. ACM, 2007.
- J. H. Perkins, S. Kim, S. Larsen, S. Amarasinghe, J. Bachrach, M. Carbin, C. Pacheco, F. Sherwood, S. Sidiroglou, G. Sullivan, W.-F. Wong, Y. Zibin, M. D. Ernst, and M. Rinard. Automatically patching errors in deployed software. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP '09, pages 87--102. ACM, 2009.
- F. Qin, J. Tucek, Y. Zhou, and J. Sundaresan. Rx: Treating bugs as allergies--a safe method to survive software failures. ACM Trans. Comput. Syst., 25(3), Aug. 2007.
- M. Rinard, C. Cadar, D. Dumitran, D. M. Roy, T. Leu, and W. S. Beebee. Enhancing server availability and security through failure-oblivious computing. In OSDI, pages 303--316, 2004.
- M. C. Rinard. Probabilistic accuracy bounds for fault-tolerant computations that discard tasks. In ICS, pages 324--334, 2006.
- M. C. Rinard. Using early phase termination to eliminate load imbalances at barrier synchronization points. In OOPSLA, pages 369--386, 2007.
- S. Sidiroglou, Y. Giovanidis, and A. Keromytis. A dynamic mechanism for recovery from buffer overflow attacks. In Proceedings of the 8th Information Security Conference (ISC), September 2005.
- S. Sidiroglou and A. D. Keromytis. A network worm vaccine architecture. In Proceedings of the IEEE Workshop on Enterprise Technologies, June 2003.
- S. Sidiroglou, O. Laadan, C. Perez, N. Viennot, J. Nieh, and A. D. Keromytis. Assure: Automatic software self-healing using rescue points. In ASPLOS, pages 37--48, 2009.
- S. Sidiroglou, M. E. Locasto, S. W. Boyd, and A. D. Keromytis. Building a reactive immune system for software services. In Proceedings of the General Track, 2005 USENIX Annual Technical Conference, April 10--15, 2005, Anaheim, CA, USA, pages 149--161. USENIX, 2005.
- W. Weimer, T. Nguyen, C. Le Goues, and S. Forrest. Automatically finding patches using genetic programming. In Proceedings of the 31st International Conference on Software Engineering, ICSE '09, pages 364--374. IEEE Computer Society, 2009.
Automatic runtime error repair and containment via recovery shepherding
PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation