Abstract
Consider the problem of converting decimal scientific notation for a number into the best binary floating point approximation to that number, for some fixed precision. This problem cannot be solved using arithmetic of any fixed precision. Hence the IEEE Standard for Binary Floating-Point Arithmetic does not require the result of such a conversion to be the best approximation.
This paper presents an efficient algorithm that always finds the best approximation. The algorithm uses a few extra bits of precision to compute an IEEE-conforming approximation while testing an intermediate result to determine whether the approximation could be other than the best. If the approximation might not be the best, then the best approximation is determined by a few simple operations on multiple-precision integers, where the precision is determined by the input. When using 64 bits of precision to compute IEEE double precision results, the algorithm avoids higher-precision arithmetic over 99% of the time.
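The two-tier structure described above (a cheap attempt that is trusted only when it is provably right, with a fallback to multiple-precision integers) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the fast path here is the simpler, well-known case in which both operands are exactly representable as doubles, not the 64-bit extended-precision test the abstract describes. The names `fast_path`, `nearest_double`, and `read_decimal` are hypothetical, and the sketch assumes a non-negative input `digits * 10**exp10` whose result lies in the normal double range.

```python
import math


def nearest_double(digits: int, exp10: int) -> float:
    """Slow path: nearest double to digits * 10**exp10 via exact integers.

    Assumes digits >= 0 and a result in the normal double range
    (subnormals and overflow are not handled in this sketch).
    """
    if digits == 0:
        return 0.0
    # Represent the input exactly as a ratio num / den of big integers.
    if exp10 >= 0:
        num, den = digits * 10 ** exp10, 1
    else:
        num, den = digits, 10 ** -exp10
    # Choose e so that num / (den * 2**e) lies between 2**52 and 2**54.
    e = num.bit_length() - den.bit_length() - 53
    if e >= 0:
        den <<= e
    else:
        num <<= -e
    q, r = divmod(num, den)
    if q >= 1 << 53:
        # One bit too many: halve the scaled value so q has exactly 53 bits.
        e += 1
        den <<= 1
        q, r = divmod(num, den)
    # Round to nearest, ties to even, using the exact remainder.
    if 2 * r > den or (2 * r == den and q & 1):
        q += 1
    return math.ldexp(q, e)


def fast_path(digits: int, exp10: int):
    """Cheap path: exact operands and a single correctly rounded operation.

    10**22 is the largest power of ten that a double holds exactly, so
    within these bounds one multiply or divide gives the best result.
    Returns None when that guarantee does not apply.
    """
    if 0 <= digits < 2 ** 53 and abs(exp10) <= 22:
        d = float(digits)
        p = float(10 ** abs(exp10))
        return d * p if exp10 >= 0 else d / p
    return None


def read_decimal(digits: int, exp10: int) -> float:
    """Try the cheap path first; fall back to exact integer arithmetic."""
    result = fast_path(digits, exp10)
    return result if result is not None else nearest_double(digits, exp10)
```

For example, `read_decimal(15, -1)` is resolved by the fast path, while `read_decimal(123456789012345678901234567890, -10)` falls through to the big-integer path because its digit string exceeds 53 bits.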
The input problem considered by this paper is the inverse of an output problem considered by Steele and White: Given a binary floating point number, print a correctly rounded decimal representation of it using the smallest number of digits that will allow the number to be read without loss of accuracy. The Steele and White algorithm assumes that the input problem is solved; an imperfect solution to the input problem, as allowed by the IEEE standard and ubiquitous in current practice, defeats the purpose of their algorithm.
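The dependence of the output problem on a correct reader can be seen in a small round-trip experiment. The sketch below is a brute-force stand-in, not Steele and White's algorithm: the hypothetical helper `shortest_digits` simply tries increasing digit counts until the printed string reads back as the same double, and it may occasionally use one digit more than a true shortest-output algorithm would. It assumes a finite input and relies on Python's built-in `float()` as the reader.

```python
def shortest_digits(x: float) -> str:
    """Shortest correctly rounded scientific notation that reads back as x.

    Brute force: try 1, 2, ..., 17 significant digits. The loop is only
    meaningful if float() returns the best approximation; a reader that
    may be off by one unit in the last place could accept a string that
    is too short or reject one that is long enough.
    """
    for n in range(1, 18):
        s = f"{x:.{n - 1}e}"  # n significant digits
        if float(s) == x:
            return s
    return repr(x)  # not expected to be reached for finite doubles


if __name__ == "__main__":
    for value in (0.5, 0.1, 1 / 3):
        print(value, "->", shortest_digits(value))
```

Seventeen significant digits always suffice to identify an IEEE double, which is why the loop stops there.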
- Clinger90 Clinger, William, and Jonathan Rees {editors}. Revised^n report on the algorithmic language Scheme. Technical Report CIS-TR-90-02, Department of Computer and Information Science, University of Oregon, 1990.
- Coonen80 Coonen, Jerome T. An implementation guide to a proposed standard for floating-point arithmetic. Computer 13, 1, January 1980, pages 68-79.
- Goldberg67 Goldberg, I. B. 27 bits is not enough for 8-digit accuracy. CACM 10, 2, February 1967, pages 105-106.
- HW60 Hardy, G. H., and E. M. Wright. An Introduction to the Theory of Numbers, Fourth Edition. Oxford University Press, 1960.
- IEEE85 IEEE Standard 754-1985. IEEE Standard for Binary Floating-Point Arithmetic. IEEE, New York, 1985.
- Knuth81 Knuth, Donald E. The Art of Computer Programming, Second Edition, Volume 2, Seminumerical Algorithms. Addison-Wesley, 1981.
- Matula68 Matula, David W. In-and-out conversions. CACM 11, 1, January 1968, pages 47-50.
- Matula70 Matula, David W. A formalization of floating-point numeric base conversion. IEEE Transactions on Computers, C-19, 8, August 1970, pages 681-692.
- Rees86 Rees, Jonathan, and William Clinger {editors}. Revised^3 report on the algorithmic language Scheme. ACM SIGPLAN Notices 21, 12, December 1986, pages 37-79.
- Steele90 Steele Jr, Guy Lewis, and Jon L White. How to print floating point numbers accurately. Proceedings of this conference.
Index Terms
How to read floating point numbers accurately




