Recovering from distributable thread failures in distributed real-time Java

Published: 27 August 2010

Abstract

We consider the problem of recovering from the failures of distributable threads (“threads”) in distributed real-time systems that operate under runtime uncertainties, including those on thread execution times, thread arrivals, and node failure occurrences. When a thread experiences a node failure, the result is a broken thread having an orphan. Under a termination model, the orphans must be detected and aborted, and exceptions must be delivered to the farthest, contiguous surviving thread segment for resuming thread execution. Our application/scheduling model includes the proposed distributable thread programming model for the emerging Distributed Real-Time Specification for Java (DRTSJ), together with an exception-handler model. Threads are subject to time/utility function (TUF) time constraints and a utility accrual (UA) optimality criterion. A key underpinning of the TUF/UA scheduling paradigm is the notion of “best-effort,” where higher-importance threads are always favored over lower-importance ones, irrespective of thread urgency as specified by their time constraints. We present a thread scheduling algorithm called HUA and a thread integrity protocol called TPR. We show that HUA and TPR bound the orphan cleanup and recovery time with bounded loss of the best-effort property. Our implementation experience for HUA/TPR in the Reference Implementation of the proposed programming model for the DRTSJ demonstrates the algorithm/protocol's effectiveness.
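To make the termination model concrete, the following is a minimal sketch of how a broken thread might be split into a resume point and a set of orphans. It is not the TPR protocol itself. It assumes a thread can be modeled as a root-first chain of segments, each hosted on one node, and that the set of failed nodes is already known. The names `Segment` and `split_broken_thread` are hypothetical and do not come from the paper or the DRTSJ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One segment of a distributable thread (hypothetical model)."""
    node: str   # assumed identifier of the node hosting this segment
    depth: int  # position in the invocation chain, 0 = root


def split_broken_thread(segments, failed_nodes):
    """Split a root-first chain of segments around the first failed node.

    Returns (resume_segment, orphans):
      resume_segment: the farthest segment still contiguous with the root,
                      i.e. where an exception would be delivered, or None
                      if the root's own node has failed.
      orphans:        segments downstream of the break that sit on
                      surviving nodes and would have to be aborted.
    """
    # Index of the first segment whose host node has failed, if any.
    break_at = next(
        (i for i, s in enumerate(segments) if s.node in failed_nodes),
        None,
    )
    if break_at is None:
        # No failure on the chain: the thread is intact, its head is the last segment.
        return (segments[-1] if segments else None), []

    resume = segments[break_at - 1] if break_at > 0 else None
    orphans = [s for s in segments[break_at + 1:] if s.node not in failed_nodes]
    return resume, orphans


if __name__ == "__main__":
    chain = [Segment("A", 0), Segment("B", 1), Segment("C", 2), Segment("D", 3)]
    resume, orphans = split_broken_thread(chain, {"C"})
    print(resume)   # Segment(node='B', depth=1)
    print(orphans)  # [Segment(node='D', depth=3)]
```

In this sketch, a failure of node C leaves the segment on B as the farthest contiguous survivor and the segment on D as an orphan. A real protocol must also detect the failure and bound the time taken for cleanup, which this sketch does not attempt.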
