Abstract
In this article, we describe an alternative circuit design methodology for trading off accuracy, performance, and silicon area. We compare two approaches that trade accuracy for performance. The first is the traditional approach, in which the precision used in the datapath is limited to meet a target latency. The second is a proposed new approach, which simply allows the datapath to operate without timing closure. We demonstrate analytically and experimentally that, on average, our approach achieves either smaller errors or, for equivalent error, faster operating frequencies than the traditional approach. This is because the worst case caused by timing violations happens only rarely, whereas precision loss introduces errors into most data. We also show that for basic arithmetic operations such as addition, applying our approach to the simple building block of ripple carry adders can achieve better accuracy or performance than using faster adder designs to achieve similar latency.
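The contrast between the two approaches can be illustrated with a small behavioural simulation. The sketch below is not the authors' model or experimental setup. It assumes a unit delay per full-adder stage and assumes that any carry needing to ripple through more stages than the clock period allows is simply lost. The function names, the 16-bit width, and the 8-stage budget are all hypothetical choices made for illustration.

```python
import random


def exact_add(a, b, n):
    """Reference n-bit addition (result reduced modulo 2**n)."""
    return (a + b) & ((1 << n) - 1)


def truncated_add(a, b, n, kept):
    """Traditional approach: keep only the top `kept` bits of each operand,
    so the adder is short enough to meet timing."""
    drop = n - kept
    return (((a >> drop) + (b >> drop)) << drop) & ((1 << n) - 1)


def overclocked_add(a, b, n, budget):
    """Full-width ripple-carry add under an ASSUMED timing model: sum bit i
    only sees carries generated within the `budget` stages ending at bit i.
    Carries that would have to travel further are treated as lost."""
    result = 0
    for i in range(n):
        lo = max(0, i - budget + 1)   # lowest bit whose carry can still reach bit i
        width = i - lo                # number of lower bits inside the window
        mask = (1 << width) - 1
        carry = (((a >> lo) & mask) + ((b >> lo) & mask)) >> width
        bit = ((a >> i) ^ (b >> i) ^ carry) & 1
        result |= bit << i
    return result


def compare(n=16, budget=8, trials=20000, seed=0):
    """Return {name: (fraction of wrong results, mean absolute error)}."""
    rng = random.Random(seed)
    stats = {"truncated": [0, 0], "overclocked": [0, 0]}
    for _ in range(trials):
        # Operands use n-1 bits so the exact sum never wraps around.
        a = rng.randrange(1 << (n - 1))
        b = rng.randrange(1 << (n - 1))
        ref = exact_add(a, b, n)
        outputs = (
            ("truncated", truncated_add(a, b, n, budget)),
            ("overclocked", overclocked_add(a, b, n, budget)),
        )
        for name, got in outputs:
            stats[name][0] += got != ref
            stats[name][1] += abs(got - ref)
    return {k: (wrong / trials, total / trials) for k, (wrong, total) in stats.items()}


if __name__ == "__main__":
    for name, (rate, mean_err) in compare().items():
        print(f"{name:12s} error rate = {rate:.4f}, mean |error| = {mean_err:.1f}")
```

Under these assumptions, truncation perturbs almost every result by a small amount, while the overclocked adder is wrong only when a long carry chain occurs. This mirrors the abstract's argument about how often each kind of error arises. The error magnitudes printed by this toy model depend on the assumed timing behaviour and should not be read as a reproduction of the article's results.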
Imprecise Datapath Design: An Overclocking Approach