research-article

An Efficient and Flexible Stochastic CGRA Mapping Approach

Published: 29 October 2022

Abstract

Coarse-Grained Reconfigurable Array (CGRA) architectures are promising high-performance and power-efficient platforms. However, mapping applications efficiently onto a CGRA is challenging: the problem is known to be NP-complete, so finding good mapping solutions for a given CGRA architecture within a reasonable time is difficult. Scaling compilation time and memory footprint to large heterogeneous CGRAs is also a well-known problem. In this article, we present a stochastic mapping approach that efficiently explores the architecture space and finds the best solutions while keeping memory use limited and steady. Experimental results show that, thanks to better exploration of the mapping solution space, our compilation flow achieves performance on low-complexity CGRA architectures that is as good as that obtained with more complex ones. The parameters considered in our experiments include the number of tiles, Register File (RF) size, number of load/store (LS) units, and network topology. Our results demonstrate that high-quality compilation for a wide range of applications is possible within reasonable run-times. Experiments with several DSP benchmarks show that the best CGRA configuration from the architectural exploration outperforms an ultra-low-power, DSP-optimized RISC-V CPU, achieving up to 15.28× performance gain (6× on average, 3.4× minimum) and up to 29.7× energy gain (13.5× on average, 6.3× minimum), with an area overhead of only 1.5×.

Published in

ACM Transactions on Embedded Computing Systems, Volume 22, Issue 1
January 2023
512 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3567467
Editor: Tulika Mitra

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 29 October 2022
• Online AM: 21 July 2022
• Accepted: 13 July 2022
• Revised: 1 June 2022
• Received: 15 October 2021
