ABSTRACT
Coping with software defects that occur in the post-deployment stage is a challenging problem: bugs may occur only when the system uses a specific configuration and only under certain usage scenarios. Yet halting a production system until the bug is tracked down and fixed is often impossible, so developers have to try to reproduce the bug under laboratory conditions. Reproducing the bug often accounts for the lion's share of the debugging effort.
In this paper, we propose addressing this problem with a specialized runtime environment called QVM (Quality Virtual Machine). QVM efficiently detects defects by continuously monitoring the execution of the application in a production setting. It enables efficient checking for violations of user-specified correctness properties, e.g., typestate safety properties, Java assertions, and heap properties pertaining to ownership.
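To make the notion of a typestate safety property concrete, the following is a minimal sketch of a monitor for the property "no read after close" on a single tracked object. It is an illustration only and is not QVM's interface: the class name `TypestateMonitor`, the event names `"read"` and `"close"`, and the way violations are recorded are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not QVM's API: tracks the typestate of one object.
final class TypestateMonitor {
    enum State { OPEN, CLOSED }

    private State state = State.OPEN;
    private final List<String> violations = new ArrayList<>();

    // Process one event on the tracked object and check it against the
    // property "no read after close".
    void onEvent(String event) {
        switch (event) {
            case "read":
                if (state == State.CLOSED) {
                    violations.add("read after close");
                }
                break;
            case "close":
                state = State.CLOSED;
                break;
            default:
                throw new IllegalArgumentException("unknown event: " + event);
        }
    }

    // Violations observed so far, in the order they occurred.
    List<String> violations() {
        return violations;
    }
}
```

A monitor of this kind only records the violation and lets the program continue, which matches the goal of observing a deployed system without halting it.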
QVM differs markedly from existing techniques for continuous monitoring in that it uses a novel overhead manager, which enforces a user-specified overhead budget for quality checks. Existing tools for error detection in the field usually disrupt the operation of the deployed system. QVM, on the other hand, provides a balanced trade-off between the cost of the monitoring process and the accuracy needed to detect defects. Specifically, the overhead of using QVM instead of a standard JVM is low enough to be acceptable in production environments.
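The idea of an overhead budget can be sketched as a simple throttle: measure the time spent in checks relative to the time spent in application code, and skip further checks while that ratio is at or above the budget. The sketch below assumes this ratio-based policy for illustration. The abstract does not describe how QVM's overhead manager works internally, and the class `OverheadBudget` and its methods are hypothetical names.

```java
// Hypothetical illustration, not QVM's overhead manager: a ratio-based throttle.
final class OverheadBudget {
    private final double budget; // allowed check time as a fraction of application time, e.g. 0.05
    private long appNanos;       // accumulated application time
    private long checkNanos;     // accumulated time spent in quality checks

    OverheadBudget(double budget) {
        if (budget < 0.0) {
            throw new IllegalArgumentException("budget must be non-negative");
        }
        this.budget = budget;
    }

    void recordApplicationTime(long nanos) {
        appNanos += nanos;
    }

    void recordCheckTime(long nanos) {
        checkNanos += nanos;
    }

    // Check time divided by application time observed so far.
    double currentOverhead() {
        if (appNanos == 0) {
            return checkNanos == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return (double) checkNanos / appNanos;
    }

    // Run the next check only while measured overhead is below the budget.
    boolean shouldCheck() {
        return currentOverhead() < budget;
    }
}
```

Under this policy, checks are skipped once the budget is exhausted and resume as more application time accumulates, so detection accuracy is traded for bounded cost.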
We implemented QVM on top of IBM's J9 Java Virtual Machine and used it to detect and fix various errors in real-world applications.
Index Terms
QVM: an efficient runtime for detecting defects in deployed systems