Abstract
Estimating the maximum clock frequency of homogeneous Coarse-Grained Reconfigurable Arrays/Architectures (CGRAs) with an arbitrary number of Processing Elements (PEs) is difficult. Estimating the clock frequency of highly heterogeneous CGRAs must take additional factors into account and is therefore even more difficult. The main challenges are the heterogeneous set of operators in each PE and the irregular interconnect between a CGRA’s PEs. Several estimation approaches are plausible. We propose an optimized statistical estimator based on our prior work. We demonstrate that it is more accurate and more robust than state-of-the-art neural networks, especially when training data is sparse.
Advantages of a Statistical Estimation Approach for Clock Frequency Estimation of Heterogeneous and Irregular CGRAs