AdaFT: A Framework for Adaptive Fault Tolerance for Cyber-Physical Systems

Published: 28 March 2017

Abstract

Cyber-physical systems (CPS) frequently have to use massive redundancy to meet application requirements for high reliability. While such redundancy is required, it can be activated adaptively, based on the current state of the controlled plant. Most of the time, the plant is in a state that allows for a lower level of fault tolerance. Avoiding the continuous deployment of massive fault tolerance will greatly reduce the workload of the CPS and lower the operating temperature of the cyber sub-system, thus increasing its reliability. In this article, we extend our prior research by demonstrating a software simulation framework, Adaptive Fault Tolerance (AdaFT), that can automatically generate the sub-spaces within which our adaptive fault tolerance can be applied. We also show the theoretical benefits of AdaFT and its actual implementation in several real-world CPSs.
