ABSTRACT
Continuous technology scaling has brought us to a point where transistors have become extremely susceptible to cosmic radiation strikes, which cause soft errors. Inside the processor, caches are the most vulnerable to soft errors, and techniques at various levels of design abstraction, e.g., fabrication, gate design, circuit design, and microarchitecture, have been developed to protect data in caches. However, no work has been done to investigate the effect of code transformations on the vulnerability of data in caches. Data in the cache is vulnerable to soft errors only if it will be read by the processor, and not if it will be overwritten. Since code transformations can change the read-write pattern of program variables, they significantly affect the soft error vulnerability of program variables in the cache. We observe that an opportunity often exists to significantly reduce the soft error vulnerability of cache data by trading off a little performance. However, even if one wanted to exploit this trade-off, it is difficult to do so, since there are no efficient techniques to estimate the vulnerability of data in caches. To this end, this paper develops an efficient static analysis method to estimate program vulnerability in caches, which enables the compiler to exploit the performance-vulnerability trade-offs in applications. Finally, compared with simulation-based estimation, static analysis offers insight into the vulnerability calculation, which in turn suggests some simple schemes to reduce program vulnerability.
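The read-versus-overwrite rule stated in the abstract can be illustrated with a small trace-based sketch. This is not the paper's static analysis; it is a simplified, hypothetical model that assumes a single cached word, assumes the word stays in the cache for the whole trace, and ignores evictions and write-backs. The function name `vulnerable_cycles` and the two example traces are invented for illustration.

```python
from typing import Iterable, Tuple

# One access to a single cached word: (cycle, "R" for read or "W" for write).
# Assumption: the word is resident in the cache for the whole trace.
Access = Tuple[int, str]


def vulnerable_cycles(accesses: Iterable[Access]) -> int:
    """Count cycles during which the cached word holds data that is later read.

    An interval between two consecutive accesses is counted only if it ends
    in a read. An interval that ends in a write is not counted, because the
    data is overwritten before the processor can consume a corrupted value.
    """
    total = 0
    prev_cycle = None
    for cycle, op in sorted(accesses):
        if prev_cycle is not None and op == "R":
            total += cycle - prev_cycle
        prev_cycle = cycle
    return total


# Hypothetical traces: the same four accesses, with the second read
# scheduled late (original) or early (reordered) by a code transformation.
original = [(0, "W"), (10, "R"), (90, "R"), (100, "W")]
reordered = [(0, "W"), (10, "R"), (20, "R"), (100, "W")]

print(vulnerable_cycles(original))   # 90
print(vulnerable_cycles(reordered))  # 20
```

In this toy model, moving the second read closer to the first shrinks the interval that ends in a read and lengthens the one that ends in a write, so the vulnerable time drops even though the accesses themselves are unchanged.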
Cache vulnerability equations for protecting data in embedded processor caches from soft errors (LCTES '10)