
Software-Based Selective Validation Techniques for Robust CGRAs Against Soft Errors

Published: 28 January 2016
Abstract

Coarse-Grained Reconfigurable Architectures (CGRAs) are drawing significant attention because they promise both performance through parallelism and flexibility through reconfiguration. Soft errors (transient faults) are becoming a serious design concern in embedded systems, including CGRAs, because the soft error rate increases exponentially as technology scales. A recently proposed software-based technique that implements TMR (Triple Modular Redundancy) on CGRAs incurs extreme runtime and energy overheads, mainly because of the expensive voting mechanism applied to the outputs of every triplicated operation. In this article, we propose selective validation mechanisms for efficient modular redundancy in CGRA datapaths. Our techniques validate results only at synchronous operations rather than at every operation, which reduces the performance overhead of the validation mechanism. We also present an optimization that further improves runtime and energy consumption by minimizing the number of synchronous operations at which validation must be applied. Our experimental results show that our selective-validation-based TMR technique with this optimization improves runtime by 41.0% and energy consumption by 26.2% on average across benchmarks, compared with the recently proposed software-based TMR technique with full validation.
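
The difference between full and selective validation can be illustrated with a minimal Python sketch. It is not the article's implementation and does not model a CGRA: the names `vote` and `run_tmr`, the chain-of-operations model, and the fault-injection tuple are all hypothetical. It also assumes, for illustration only, that the last operation in the chain plays the role of a synchronous operation, since the abstract does not define the term.

```python
from collections import Counter


def vote(a, b, c):
    """Majority vote over three redundant results (hypothetical helper)."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        # All three copies disagree, so no majority exists.
        raise RuntimeError("no majority: uncorrectable error")
    return value


def run_tmr(ops, x, validate_at, fault=None):
    """Run a chain of unary operations as three redundant copies.

    ops         -- list of one-argument functions applied in order
    validate_at -- set of operation indices after which a vote is taken
    fault       -- optional (op_index, copy_index, corrupted_value) tuple
                   that overwrites one copy to mimic a soft error
    Returns (result of copy 0, number of votes performed).
    """
    copies = [x, x, x]
    votes = 0
    for i, op in enumerate(ops):
        copies = [op(v) for v in copies]
        if fault is not None and fault[0] == i:
            copies[fault[1]] = fault[2]  # inject the transient fault
        if i in validate_at:
            # Validation point: vote, then resynchronize all copies.
            copies = [vote(*copies)] * 3
            votes += 1
    return copies[0], votes


ops = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

# Full validation: vote after every operation.
full = run_tmr(ops, 5, validate_at={0, 1, 2}, fault=(0, 1, 99))

# Selective validation: vote only at the assumed synchronous operation.
selective = run_tmr(ops, 5, validate_at={2}, fault=(0, 1, 99))

print(full, selective)
```

In this sketch both runs recover the correct value despite the injected fault, but the selective run performs one vote instead of three. A single corrupted copy keeps diverging until the next validation point, so the saving depends on every value that leaves the redundant region passing through a vote first.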
