Abstract
Networks-on-chip need to survive manufacturing faults in order to sustain yield. An effective testing and configuration strategy, however, implies two opposing requirements. On one hand, a fast and scalable built-in self-testing and self-diagnosis procedure has to be carried out concurrently at the NoC switches. On the other hand, programming the NoC routing mechanism to go around faulty links and switches can be optimally performed by a centralized controller with global network visibility. To the best of our knowledge, this article proposes for the first time a global network testing and configuration strategy that meets both requirements by means of a fault-tolerant dual network architecture and a fast configuration algorithm for the most common failure patterns.
Experimental results report an area overhead as low as 12.5% with respect to the baseline switch architecture while achieving a high degree of fault tolerance. Even when multiple stuck-at faults are considered, the fault-masking capability of the dual network is always over 80%, and support for multiple link failures exceeds 90% in the presence of two unusable links in the main network, with minimal set-up times.
Index Terms
A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs