Compensate or ignore? meeting control robustness requirements through adaptive soft-error handling

Published: 13 June 2016

Abstract

To avoid catastrophic events such as unrecoverable system failures caused by soft errors on mobile and embedded systems, software-based error detection and compensation techniques have been proposed. Methods such as error-correction codes or redundant execution offer high flexibility and allow application-specific selection of fault tolerance without the need for special hardware support. However, such software-based approaches may lead to system overload due to their execution-time overhead. An adaptive deployment of these techniques that meets both application requirements and system constraints is therefore desirable. In our case study, we observe that a control task can tolerate a limited number of errors with acceptable performance loss. Such tolerance can be modeled as an (m,k) constraint, which requires at least m correct runs out of any k consecutive runs. In this paper, we discuss how a given (m,k) constraint can be satisfied by adopting patterns of task instances with individual error detection and compensation capabilities. We introduce static strategies and provide a formal feasibility analysis for validation. Furthermore, we develop an adaptive scheme that extends our initial approach with online awareness, which increases efficiency while preserving the analysis results. The effectiveness of our method is shown in a real-world case study as well as for synthesized task sets.
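As an illustration of the (m,k) constraint described in the abstract, the following is a minimal sketch of a sliding-window check over a recorded sequence of run outcomes. The function name `satisfies_mk` and the representation of outcomes as a list of booleans are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the paper's static or adaptive strategies.

```python
from collections import deque


def satisfies_mk(outcomes, m, k):
    """Return True if every window of k consecutive runs contains
    at least m correct runs.

    `outcomes` is an iterable of truthy (correct) / falsy (erroneous)
    values; this representation is an assumption for illustration.
    """
    if not 0 < m <= k:
        raise ValueError("expected 0 < m <= k")

    window = deque(maxlen=k)  # holds the most recent k outcomes
    correct = 0               # number of correct runs inside the window

    for ok in outcomes:
        ok = bool(ok)
        if len(window) == k:
            # The oldest outcome is about to be evicted by append().
            correct -= window[0]
        window.append(ok)
        correct += ok
        # Only complete windows of length k are checked.
        if len(window) == k and correct < m:
            return False
    return True
```

For example, with (m,k) = (2,3), the sequence correct, correct, error repeated indefinitely satisfies the constraint, whereas any two consecutive errors violate it. A sequence shorter than k contains no complete window, so this sketch treats it as satisfying the constraint.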


Published in

ACM SIGPLAN Notices, Volume 51, Issue 5 (LCTES '16), May 2016, 122 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2980930
Editor: Andy Gill

LCTES 2016: Proceedings of the 17th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, Tools, and Theory for Embedded Systems, June 2016, 122 pages
ISBN: 9781450343169
DOI: 10.1145/2907950

Copyright © 2016 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
