Execution Trace-Driven Energy-Reliability Optimization for Multimedia MPSoCs

Published: 04 May 2015

Abstract

Multiprocessor systems-on-chip (MPSoCs) are becoming a popular design choice in current and future technology nodes to accommodate the heterogeneous computing demands of the many applications enabled on these platforms. Streaming multimedia and other communication-centric applications constitute a significant fraction of the application space of these devices. Mapping an application onto an MPSoC is an NP-hard problem. Researchers have addressed it both as a stand-alone (best-effort) problem and in conjunction with other optimization objectives, such as energy and reliability. Most existing studies on joint energy-reliability optimization are static, that is, design-time based. These techniques fail to capture runtime variability, such as resource unavailability and the dynamism in application behavior that is typical of multimedia applications. The few studies that consider dynamic mapping of applications do not consider throughput degradation, which directly impacts user satisfaction. This article proposes a runtime technique that analyzes the execution trace of an application modeled as a Synchronous Data Flow Graph (SDFG) to determine its mapping on a multiprocessor system with heterogeneous processing units for different fault scenarios. Further, communication energy is minimized for each of these mappings while satisfying the throughput constraint. Experiments conducted with synthetic and real SDFGs demonstrate that the proposed technique achieves significant improvement over state-of-the-art approaches in terms of throughput and storage overhead, with less than 20% energy overhead.
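
The selection step described in the abstract (one mapping per fault scenario, with communication energy minimized subject to the throughput constraint) can be illustrated with a minimal sketch. This is not the article's algorithm: the `Mapping` record, the `select_mapping` and `build_fault_table` helpers, the scenario names, and all numbers are hypothetical, and the sketch assumes that throughput and communication energy estimates for each candidate mapping are already available (for example, from trace analysis).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Mapping:
    """Hypothetical candidate mapping with pre-computed estimates."""
    name: str
    throughput: float   # estimated throughput (arbitrary units)
    comm_energy: float  # estimated communication energy (arbitrary units)


def select_mapping(candidates: List[Mapping],
                   min_throughput: float) -> Optional[Mapping]:
    """Return the feasible candidate with the lowest communication energy.

    A candidate is feasible if it meets the throughput constraint.
    Ties on energy are broken in favour of higher throughput.
    Returns None when no candidate is feasible.
    """
    feasible = [m for m in candidates if m.throughput >= min_throughput]
    if not feasible:
        return None
    return min(feasible, key=lambda m: (m.comm_energy, -m.throughput))


def build_fault_table(candidates_by_fault: Dict[str, List[Mapping]],
                      min_throughput: float) -> Dict[str, Optional[Mapping]]:
    """Pick one mapping per fault scenario."""
    return {fault: select_mapping(cands, min_throughput)
            for fault, cands in candidates_by_fault.items()}


# Illustrative, made-up data: two fault scenarios on a small platform.
candidates_by_fault = {
    "core0_fails": [
        Mapping("A", throughput=30.0, comm_energy=5.0),
        Mapping("B", throughput=24.0, comm_energy=3.0),  # misses the constraint
        Mapping("C", throughput=26.0, comm_energy=4.0),
    ],
    "core1_fails": [
        Mapping("D", throughput=20.0, comm_energy=2.0),  # misses the constraint
    ],
}

table = build_fault_table(candidates_by_fault, min_throughput=25.0)
for fault, chosen in table.items():
    print(fault, "->", chosen.name if chosen else "no feasible mapping")
```

Filtering by throughput first and then minimizing energy treats throughput as a hard constraint, which matches the wording of the abstract. A scenario with no feasible mapping is reported explicitly rather than silently falling back to a slower one.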

Reviews

Sunil Shukla

Dynamic task scheduling and fault tolerance in multiprocessor systems-on-chip (MPSoCs) are explored in this paper. The paper describes a heterogeneous MPSoC system with a special fault-free node, called RTM, that manages the other processing nodes in the MPSoC. An application is started using a compile-time mapping. The execution traces for each node and communication edge are captured and analyzed to estimate throughput and energy consumption for a mapping with one less node. If a fault occurs, the mapping that maximizes throughput and minimizes communication energy is selected. I can see how this approach may work in an MPSoC with homogeneous nodes, but I am unable to understand how it will work in a heterogeneous system. The execution information for a node of type T1 is going to be very different from that of a node of type T2. I am not sure how the authors are going to estimate throughput using the trace belonging to a node of type T1 when it needs to be replaced by a node of type T2. Also, the use of the trace in estimating throughput could have been discussed in more detail. The authors haven't justified the focus on minimizing communication energy instead of overall energy. Is communication energy a significant proportion of the overall energy? In my opinion, it is possible to have a configuration that minimizes communication energy but is not the optimal configuration where system energy is concerned.

Online Computing Reviews Service