ABSTRACT
Many problems in embedded compilation require one set of optimizations to be selected over another based on run-time performance. Self-tuned libraries, iterative compilation and machine learning techniques all compare multiple compiled program versions. In each, the program versions are timed to determine which performs best.
Because most performance measurements are inherently noisy, the program must be run multiple times for each version. The number of runs must be large enough to distinguish the versions despite the noise, but any runs beyond that waste time and energy. The compiler writer must either risk taking too few runs, potentially getting incorrect results, or take too many, lengthening the experiments or reducing the number of program versions that can be evaluated. Prior work uses constant-size sampling plans, in which each compiled version is executed a fixed number of times regardless of the level of noise.
In this paper we develop a sequential sampling plan that automatically adapts to the experiment, so that the compiler writer can be confident in the results and also be sure that no more runs were taken than were needed. We show that our system correctly determines the best optimization settings with between 76% and 87% fewer runs than a brute-force, constant-sample-size approach needs. We also compare our approach to JavaSTATS [10]; our approach needed 77% to 89% fewer runs.
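The idea of a sequential sampling plan can be pictured with a small sketch. This is not the algorithm from the paper, whose details the abstract does not give. It is a simplified illustration that assumes normal-approximation confidence intervals and a simple elimination rule. The function name `race`, its parameters, and the simulated timing sources are all hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def race(versions, confidence=0.95, min_runs=5, max_runs=200):
    """Sample timings round by round until one version is clearly fastest.

    versions: dict mapping a name to a zero-argument callable that returns
    one timing measurement (a hypothetical interface for this sketch).
    Returns (best_name, samples), where samples maps names to timing lists.
    """
    # Two-sided critical value under a normal approximation (an assumption).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    samples = {name: [] for name in versions}
    alive = set(versions)
    for run in range(1, max_runs + 1):
        # Take one more measurement of every version still in the race.
        for name in alive:
            samples[name].append(versions[name]())
        if run < min_runs:
            continue
        # Confidence interval for the mean time of each surviving version.
        bounds = {}
        for name in alive:
            data = samples[name]
            mean = statistics.fmean(data)
            half = z * statistics.stdev(data) / len(data) ** 0.5
            bounds[name] = (mean - half, mean + half)
        # Drop versions whose interval lies entirely above the best upper bound.
        best_upper = min(upper for _, upper in bounds.values())
        alive = {n for n in alive if bounds[n][0] <= best_upper}
        if len(alive) == 1:
            break
    best = min(alive, key=lambda n: statistics.fmean(samples[n]))
    return best, samples


def noisy(mean, sd, seed):
    """Simulated timing source: Gaussian noise around a fixed mean."""
    rng = random.Random(seed)
    return lambda: rng.gauss(mean, sd)


versions = {
    "v1": noisy(1.50, 0.05, seed=1),
    "v2": noisy(1.00, 0.05, seed=2),
    "v3": noisy(2.00, 0.05, seed=3),
}
best, samples = race(versions)
total_runs = sum(len(s) for s in samples.values())
print(best, total_runs)
```

A constant-size plan would spend `max_runs` measurements on every version. Here, clearly slower versions stop being measured as soon as their intervals separate from the leader's, so low-noise experiments finish after few runs and noisy ones continue for longer.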
- F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M. F. P. O'Boyle, J. Thomson, M. Toussaint, and C. K. I. Williams. Using machine learning to focus iterative optimization. pages 295--305, March 2006.
- L. Almagor, Keith D. Cooper, Alexander Grosul, Timothy J. Harvey, Steven W. Reeves, Devika Subramanian, Linda Torczon, and Todd Waterman. Finding effective compilation sequences. In LCTES'04: Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, pages 231--239, New York, NY, USA, 2004. ACM.
- S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanovic, T. VanDrunen, D. von Dincklage, and B. Wiedermann. The DaCapo benchmarks: Java benchmarking development and analysis. In OOPSLA 2006, ACM Conference on Object-Oriented Programming, Systems, Languages and Applications, 2006.
- J. Martin Bland and Douglas G. Altman. Transforming data. BMJ (Clinical research ed.), 312, March 1996.
- B. L. Welch. The generalization of 'Student's' problem when several different population variances are involved. Biometrika, 34:28--35, 1947.
- F. Bodin, T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, and E. Rohou. Iterative compilation in a non-linear optimisation space. In Workshop on Profile and Feedback-Directed Compilation, in Conjunction with the International Conference on Parallel Architectures and Compilation Techniques (PACT), October 1998.
- G. E. P. Box and D. R. Cox. An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2):211--252, 1964.
- E. C. Fieller. Some problems in interval estimation. Journal of the Royal Statistical Society, 16:175--185, 1954.
- Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson, Phil Barnard, Elton Ashton, Eric Courtois, Francois Bodin, Edwin Bonilla, John Thomson, Hugh Leather, Chris Williams, and Michael O'Boyle. Milepost GCC: machine learning based research compiler. In Proceedings of the GCC Developers' Summit, June 2008.
- Andy Georges, Dries Buytaert, and Lieven Eeckhout. Statistically rigorous Java performance evaluation. SIGPLAN Not., 42(10):57--76, 2007.
- Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13--30, March 1963.
- Prasad Kulkarni, Wankang Zhao, Hwashin Moon, Kyunghwan Cho, David Whalley, Jack Davidson, Mark Bailey, Yunheung Paek, and Kyle Gallivan. Finding effective optimization phase sequences. SIGPLAN Not., 38(7):12--23, 2003.
- M. A. Creasy. Confidence limits for the gradient in the linear functional relationship. Journal of the Royal Statistical Society, 18:64--69, 1956.
- H. B. Mann and D. R. Whitney. On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18:50--60, 1947.
- Oded Maron and Andrew W. Moore. Hoeffding races: Accelerating model selection search for classification and function approximation. In Advances in neural information processing systems 6, pages 59--66. Morgan Kaufmann, 1994.
- Oded Maron and Andrew W. Moore. The racing algorithm: Model selection for lazy learners. Artificial Intelligence Review, 11:193--225, 1997.
- A. Monsifrot, F. Bodin, and R. Quiniou. A machine learning approach to automatic production of compiler heuristics, 2002.
- Eliot Moss, Paul Utgoff, John Cavazos, Doina Precup, Darko Stefanović, Carla Brodley, and David Scheeff. Learning to schedule straight-line code. In Michael I. Jordan, Michael J. Kearns, and Sara A. Solla, editors, Advances in Neural Information Processing Systems, volume 10. The MIT Press, 1998.
- Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, and Peter F. Sweeney. Producing wrong data without doing anything obviously wrong! In ASPLOS'09: Fourteenth International Conference on Architectural Support for Programming Languages and Operating Systems, 2009.
- P. Armitage and M. J. R. Healy. Interpretation of χ² tests. Biometrics, 13:113--115, 1957.
- F. E. Satterthwaite. An approximate distribution of estimates of variance components. Biometrics Bulletin, 2:110--114, 1946.
- Mark Stephenson and Saman Amarasinghe. Predicting unroll factors using supervised classification. In CGO'05: Proceedings of the international symposium on Code generation and optimization, pages 123--134, Washington, DC, USA, 2005. IEEE Computer Society.
- Mark Stephenson, Saman Amarasinghe, Martin Martin, and Una-May O'Reilly. Meta optimization: Improving compiler heuristics with machine learning. June 2003.
- W. S. Gosset (Student). The probable error of a mean. Biometrika, 6:1--25, March 1908.
- Abraham Wald. Sequential Analysis. 1947.
- Stefan Wellek. Testing Statistical Hypotheses of Equivalence. CRC Press, 2003.
- G. Barrie Wetherill and K. D. Glazebrook. Sequential methods in statistics. 3rd ed. London: Chapman and Hall, 1986.
- John Whitehead. The Design and Analysis of Sequential Trials. Ellis Horwood, 1992.
- W. J. Westlake. Use of confidence intervals in analysis of comparative bioavailability trials. Journal of Pharmaceutical Science, 61:1340--1341, 1972.
Raced profiles: efficient selection of competing compiler optimizations (LCTES '09)