research-article

Program-Invariant Checking for Soft-Error Detection using Reconfigurable Hardware

Published: 05 November 2015

Abstract

There is an increasing concern about transient errors in deep submicron processor architectures. Software-only error detection approaches that exploit program invariants for silent error detection incur large execution overheads and are unreliable as state can be corrupted after invariant checkpoints. In this article, we explore the use of configurable hardware structures for the continuous evaluation of high-level program invariants at the assembly level. We evaluate the resource requirements and performance of the proposed predicate-evaluation hardware structures when integrated with a 32-bit MIPS soft core on a contemporary reconfigurable hardware device. The results, for a small set of kernel codes, reveal that these hardware structures require a very small number of hardware resources with negligible impact on the processor core that they are integrated in. Moreover, the amount of resources is fairly insensitive to the complexity of the invariants, thus making the proposed structures an attractive alternative to software-only predicate checking.
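The contrast between checkpoint-based and continuous invariant checking can be illustrated with a small software simulation. This is a minimal sketch, not the article's hardware design: the kernel, the invariant, the fault model (a single bit flip in one variable), and all names (`invariant`, `flip_bit`, `run_kernel`, `check_steps`) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set

State = Dict[str, int]


def invariant(state: State) -> bool:
    """Hypothetical invariant for a byte-summing loop:
    the index stays in bounds and the sum cannot exceed 255 per element."""
    return (0 <= state["i"] <= state["n"]
            and 0 <= state["total"] <= 255 * state["n"])


def flip_bit(state: State, name: str, bit: int) -> None:
    """Model a transient error as a single bit flip in one variable."""
    state[name] ^= 1 << bit


def run_kernel(data: List[int],
               fault_step: Optional[int],
               check_steps: Set[int]) -> Optional[int]:
    """Sum `data`, optionally injecting a bit flip before `fault_step`.

    The invariant is evaluated only at the iterations in `check_steps`.
    Returns the iteration at which a violation was detected, or None.
    """
    state: State = {"i": 0, "n": len(data), "total": 0}
    for step, value in enumerate(data):
        if step == fault_step:
            flip_bit(state, "total", 30)  # simulated soft error
        if step in check_steps and not invariant(state):
            return step
        state["total"] += value
        state["i"] += 1
    return None


data = [1, 2, 3, 4, 5, 6, 7, 8]

# Single checkpoint at loop entry: a fault at iteration 5 goes unnoticed.
checkpoint_result = run_kernel(data, fault_step=5, check_steps={0})

# Check at every iteration (a software stand-in for continuous evaluation).
continuous_result = run_kernel(data, fault_step=5,
                               check_steps=set(range(len(data))))
```

In this sketch the single-checkpoint run returns `None` because the corruption happens after the only check, while the every-iteration run reports the violation at the iteration where the flip occurs. In software, checking on every iteration adds work to each pass of the loop, which is the overhead the abstract attributes to software-only approaches.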

Published in

ACM Transactions on Reconfigurable Technology and Systems, Volume 9, Issue 1
Special Section on the 2014 International Symposium on Applied Reconfigurable Computing
November 2015
121 pages
ISSN: 1936-7406
EISSN: 1936-7414
DOI: 10.1145/2839314
Editor: Steve Wilton

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 5 November 2015
• Accepted: 1 March 2015
• Revised: 1 February 2015
• Received: 1 June 2014
