Abstract
As technology scales, the impact of process variation on the maximum supported frequency (FMAX) of individual cores in a multiprocessor system-on-chip (MPSoC) becomes more pronounced. Task allocation without variation-aware performance analysis can greatly compromise performance and lead to a significant loss in yield, defined as the percentage of manufactured chips satisfying the application timing requirement. We propose variation-aware task allocation for best-effort and real-time streaming applications modeled as task graphs. Our solutions are primarily based on the throughput requirement, which is the most important timing requirement in many real-time streaming applications.
The four main contributions of this work are as follows: (1) we distinguish best-effort, firm real-time, and soft real-time application classes, which require different optimization criteria; (2) using dataflow graphs, which are well suited to modeling and analyzing streaming applications, we explicitly model task execution both in clock cycles (which are independent of variation) and in seconds (which depend on the variation of the resource), and connect the two by an explicit binding; (3) we present two optimization approaches, which give different improvements at different costs; and (4) we present both exhaustive and heuristic algorithms that implement the optimization approaches. Our variation-aware mapping algorithms are tested on models of seven real applications and are compared to mapping methods that are unaware of hardware variation. Our results demonstrate (1) an improvement in performance of 3% on average for best-effort applications, and (2) yield improvements of up to 27%, with an average of 15%, for firm real-time and soft real-time applications, showing the effectiveness of our approaches.
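The relationship between cycle counts, binding, execution time in seconds, and yield can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the task cycle counts, the Gaussian FMAX model, the two-core platform, the throughput requirement, the simple pipeline throughput model (the most heavily loaded core bounds throughput), and the per-chip exhaustive re-binding. All function names are hypothetical.

```python
import itertools
import random


def exec_time_s(cycles, fmax_hz):
    """Convert a variation-independent cycle count into seconds on one core."""
    return cycles / fmax_hz


def throughput(task_cycles, binding, core_fmax):
    """Iterations per second of a pipelined chain of tasks.

    Assumption: tasks bound to the same core run back to back, so the
    most heavily loaded core bounds the steady-state throughput.
    """
    load = [0.0] * len(core_fmax)
    for cycles, core in zip(task_cycles, binding):
        load[core] += exec_time_s(cycles, core_fmax[core])
    return 1.0 / max(load)


def best_binding(task_cycles, core_fmax):
    """Exhaustively pick the task-to-core binding with the highest throughput."""
    cores = range(len(core_fmax))
    return max(
        itertools.product(cores, repeat=len(task_cycles)),
        key=lambda b: throughput(task_cycles, b, core_fmax),
    )


def yield_fraction(task_cycles, chips, required, choose_binding):
    """Fraction of sampled chips whose throughput meets the requirement."""
    ok = sum(
        throughput(task_cycles, choose_binding(chip), chip) >= required
        for chip in chips
    )
    return ok / len(chips)


# Hypothetical workload and variation model (illustrative numbers only).
rng = random.Random(0)
task_cycles = [4_000_000, 2_000_000, 3_000_000]
chips = [[rng.gauss(1.0e9, 0.1e9) for _ in range(2)] for _ in range(1000)]
required = 180.0  # iterations per second

# Variation-unaware: one binding chosen for nominal 1 GHz cores.
nominal = best_binding(task_cycles, [1.0e9, 1.0e9])
unaware = yield_fraction(task_cycles, chips, required, lambda chip: nominal)

# Variation-aware: binding chosen per chip from its sampled FMAX values.
aware = yield_fraction(
    task_cycles, chips, required, lambda chip: best_binding(task_cycles, chip)
)

print(f"yield, variation-unaware binding: {unaware:.3f}")
print(f"yield, variation-aware binding:   {aware:.3f}")
```

In this sketch the variation-aware yield can never fall below the variation-unaware yield, because the per-chip search always includes the nominal binding among its candidates. The exhaustive search grows exponentially with the number of tasks, which is the kind of cost that motivates heuristic alternatives.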
Index Terms
Process-variation-aware mapping of best-effort and real-time streaming applications to MPSoCs