Abstract
The adoption of High-Level Synthesis (HLS) has increased as the latest HLS tools have evolved to provide high-quality results while improving productivity and time-to-market. Concurrently, many works have proposed incorporating approximate computing techniques into HLS toolchains, allowing automated generation of inexact circuits for error-tolerant application domains, with the aim of trading off computation accuracy for area/power savings or performance improvements. Thus, when attempting to make a design meet timing requirements, designers of real-time systems using HLS may resort to approximation approaches. However, current approximate HLS tools do not allow real-time constraints to be specified; instead, they take an error bound as the constraint and explore area, power, or performance optimizations. In this work, we propose an approximate HLS framework for real-time systems that can be integrated with state-of-the-art HLS tools. With this framework, designers can specify real-time constraints and satisfy them while minimizing the output error. It uses scheduling information and Worst-Case Execution Time (WCET) analysis to iteratively explore time-error trade-offs of approximations in the time-critical execution path. Experimental results on signal and image processing benchmarks show that we can reduce the WCET of exact designs by up to 35% with acceptable quality degradation.
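The iterative exploration described in the abstract can be illustrated with a small sketch. It is not the framework's actual algorithm, which the abstract does not detail. It assumes a toy model in which each operation has an exact latency, a hypothetical approximate latency, and a hypothetical error cost; the WCET is taken as the longest path through the operation graph; errors are assumed to add up; and a greedy rule picks, on the critical path, the approximation with the lowest error per cycle saved. The names `Op`, `critical_path`, and `meet_deadline` are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    latency: int           # cycles in the exact schedule
    approx_latency: int    # cycles if approximated (hypothetical value)
    approx_error: float    # error added if approximated (hypothetical value)
    preds: tuple = ()      # names of predecessor operations
    approximated: bool = False

    def cycles(self):
        return self.approx_latency if self.approximated else self.latency


def critical_path(ops):
    """Return (wcet, path) for a DAG whose ops are listed in topological order."""
    finish, best_pred = {}, {}
    for op in ops:
        # The predecessor that finishes last determines when this op can start.
        pred = max(op.preds, key=lambda p: finish[p], default=None)
        start = finish[pred] if pred is not None else 0
        finish[op.name] = start + op.cycles()
        best_pred[op.name] = pred
    end = max(finish, key=finish.get)
    wcet, path = finish[end], []
    while end is not None:
        path.append(end)
        end = best_pred[end]
    return wcet, path[::-1]


def meet_deadline(ops, deadline):
    """Greedily approximate critical-path ops until the WCET meets the deadline.

    Returns (wcet, total_error, deadline_met). Summing errors is an assumption.
    """
    by_name = {op.name: op for op in ops}
    total_error = 0.0
    while True:
        wcet, path = critical_path(ops)
        if wcet <= deadline:
            return wcet, total_error, True
        candidates = [
            by_name[n] for n in path
            if not by_name[n].approximated
            and by_name[n].approx_latency < by_name[n].latency
        ]
        if not candidates:
            return wcet, total_error, False
        # Lowest error per cycle saved on the time-critical path.
        pick = min(candidates,
                   key=lambda o: o.approx_error / (o.latency - o.approx_latency))
        pick.approximated = True
        total_error += pick.approx_error


def example_ops():
    return [
        Op("load", 2, 2, 0.0),
        Op("mul", 6, 3, 0.02, preds=("load",)),
        Op("add", 3, 2, 0.01, preds=("load",)),
        Op("acc", 4, 3, 0.05, preds=("mul", "add")),
    ]


if __name__ == "__main__":
    print(critical_path(example_ops()))     # exact design
    print(meet_deadline(example_ops(), 9))  # approximate until WCET <= 9 cycles
```

In this toy example the exact design has a 12-cycle critical path through `load`, `mul`, and `acc`; approximating only `mul` is enough to meet a 9-cycle deadline, so operations off the critical path are left exact. Restricting candidates to the critical path mirrors the abstract's focus on the time-critical execution path, while the greedy selection rule is only one possible heuristic.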
Index Terms
High-Level Synthesis of Approximate Designs under Real-Time Constraints