Abstract
Rapid technology scaling has driven huge improvements in the performance and power efficiency of computing systems, but it has also rendered those systems susceptible to soft errors. Among soft error protection techniques, Control Flow Checking (CFC) based techniques have gained a reputation for being lightweight yet effective. The main idea behind CFC is to check whether the program is executing its instructions in the right order. To validate the protection claims of existing CFC techniques, we develop a systematic and quantitative method that evaluates the protection they achieve using the metric of vulnerability. Our quantitative analysis indicates that existing CFC techniques are not only ineffective in providing protection from soft faults, but also incur additional performance and power overheads. Our results show that software-only CFC protection schemes increase system vulnerability by 18%--21% with 17%--38% performance overhead, and that hybrid CFC protection increases vulnerability by 5%. Although vulnerability remains almost the same under hardware-only CFC protection, these schemes incur overheads in design cost, area, and power because of the hardware modifications their implementation requires.
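To make the "right order" idea concrete, the following is a minimal sketch of a control-flow check over basic blocks. It is an illustration only and does not reproduce any specific CFC technique evaluated in the paper; the graph `CFG`, the block names, and the function `first_illegal_transition` are all hypothetical.

```python
# Hypothetical control-flow graph: block name -> set of legal successor blocks.
# These block names are assumptions made for illustration only.
CFG = {
    "entry": {"loop"},
    "loop": {"loop", "exit"},
    "exit": set(),
}


def first_illegal_transition(trace, cfg=CFG):
    """Return the index of the first block in `trace` that is not a legal
    successor of the block before it, or None if every transition is legal."""
    for i in range(1, len(trace)):
        prev, curr = trace[i - 1], trace[i]
        if curr not in cfg.get(prev, set()):
            return i
    return None


# A fault-free run follows only edges that exist in the graph.
good_trace = ["entry", "loop", "loop", "exit"]

# A corrupted branch target might jump straight from entry to exit.
bad_trace = ["entry", "exit"]

print(first_illegal_transition(good_trace))
print(first_illegal_transition(bad_trace))
```

In this sketch only transitions along edges missing from the graph are flagged. A branch that goes the wrong way along a legal edge, such as leaving the loop one iteration early, passes the check unnoticed.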
Index Terms
Control Flow Checking or Not? (for Soft Errors)