Abstract
Coarse-Grained Reconfigurable Array (CGRA) architectures are promising high-performance and power-efficient platforms. However, mapping applications efficiently onto a CGRA is challenging: the problem is known to be NP-complete, so finding good mapping solutions for a given CGRA architecture within a reasonable time is difficult. Scaling compilation time and memory footprint to large heterogeneous CGRAs is also a well-known problem. In this article, we present a stochastic mapping approach that efficiently explores the architecture space and allows the best solutions to be found while keeping memory use limited and steady. Experimental results show that, thanks to better exploration of the mapping solution space, our compilation flow achieves performance on low-complexity CGRA architectures that is as good as that obtained with more complex ones. The parameters considered in our experiments include the number of tiles, Register File (RF) size, number of load/store (LS) units, and network topology. Our results demonstrate that high-quality compilation for a wide range of applications is possible within reasonable run-times. Experiments with several DSP benchmarks show that the best CGRA configuration from the architectural exploration outperforms an ultra-low-power, DSP-optimized RISC-V CPU, achieving a performance gain of up to 15.28× (average 6×, minimum 3.4×) and an energy gain of up to 29.7× (average 13.5×, minimum 6.3×) with an area overhead of only 1.5×.
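To make the idea of stochastic mapping with a steady memory footprint concrete, the following is a minimal illustrative sketch and not the mapping algorithm of this article. It assumes a toy model: operations of a data-flow graph are placed at random on distinct tiles of a 2D grid, the cost of a placement is the total Manhattan distance of its edges, and only the current candidate and the best one seen so far are kept in memory. All names (`stochastic_map`, `routing_cost`, the example operations) are hypothetical.

```python
import random


def manhattan(a, b):
    """Manhattan distance between two (row, col) tile coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def routing_cost(placement, edges):
    """Toy cost model (an assumption): sum of tile distances over all edges."""
    return sum(manhattan(placement[u], placement[v]) for u, v in edges)


def random_placement(ops, rows, cols, rng):
    """Assign every operation to a distinct, randomly chosen tile."""
    tiles = [(r, c) for r in range(rows) for c in range(cols)]
    return dict(zip(ops, rng.sample(tiles, len(ops))))


def stochastic_map(ops, edges, rows, cols, trials=2000, seed=0):
    """Random-restart search that keeps only the best placement seen so far.

    Memory use does not grow with the number of trials, because earlier
    candidates are discarded as soon as they are evaluated.
    """
    if len(ops) > rows * cols:
        raise ValueError("more operations than tiles")
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(trials):
        candidate = random_placement(ops, rows, cols, rng)
        cost = routing_cost(candidate, edges)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


# Hypothetical multiply-accumulate kernel: two loads, a multiply, an add, a store.
ops = ["ld_a", "ld_b", "mul", "add", "st"]
edges = [("ld_a", "mul"), ("ld_b", "mul"), ("mul", "add"), ("add", "st")]
placement, cost = stochastic_map(ops, edges, rows=2, cols=3)
print(placement, cost)
```

A real CGRA mapper would also have to model scheduling in time, register file capacity, load/store unit placement, and the network topology, all of which this sketch leaves out.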
An Efficient and Flexible Stochastic CGRA Mapping Approach