To upgrade or not to upgrade: impact of online upgrades across multiple administrative domains

Published: 17 October 2010

Abstract

Online software upgrades are often plagued by runtime behaviors that are poorly understood and difficult to ascertain. For example, the interactions among multiple versions of the software expose the system to race conditions that can introduce latent errors or data corruption. Moreover, industry trends suggest that online upgrades are currently needed in large-scale enterprise systems, which often span multiple administrative domains (e.g., Web 2.0 applications that rely on AJAX client-side code or systems that lease cloud-computing resources). In such systems, the enterprise does not control all the tiers of the system and cannot coordinate the upgrade process, making existing techniques inadequate to prevent mixed-version races. In this paper, we present an analytical framework for impact assessment, which allows system administrators to directly compare the risk of following an online-upgrade plan with the risk of delaying or canceling the upgrade. We also describe an executable model that implements our formal impact assessment and enables a systematic approach for deciding whether an online upgrade is appropriate. Our model provides a method of last resort for avoiding undesirable program behaviors, in situations where mixed-version races cannot be avoided through other technical means.
