Abstract
We present AUTOGEN, an algorithm that, for a wide class of dynamic programming (DP) problems, automatically discovers highly efficient cache-oblivious parallel recursive divide-and-conquer algorithms from inefficient iterative descriptions of DP recurrences. AUTOGEN analyzes the set of DP table locations accessed by the iterative algorithm when run on a small DP table, automatically identifies a recursive access pattern, and derives a corresponding provably correct recursive algorithm for solving the DP recurrence. We use AUTOGEN to autodiscover efficient algorithms for several well-known problems. Our experimental results show that several autodiscovered algorithms significantly outperform parallel looping and tiled loop-based algorithms. These algorithms are also less sensitive to fluctuations in memory and bandwidth than their looping counterparts, and their running times and energy profiles remain more stable. To the best of our knowledge, AUTOGEN is the first algorithm that can automatically discover new nontrivial divide-and-conquer algorithms.
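To make the contrast between an iterative DP description and a recursive divide-and-conquer one concrete, here is a minimal sketch. It is not AUTOGEN's output or the authors' code: the choice of the longest common subsequence (LCS) recurrence, the function names `lcs_iterative` and `lcs_recursive`, and the `base` cutoff are all assumptions made for illustration. The recursive version splits the DP table into quadrants and solves them in an order that respects the recurrence's dependencies; the two middle quadrants are independent of each other and could, in principle, run in parallel.

```python
def lcs_iterative(x, y):
    """Row-by-row loop evaluation of the LCS recurrence (illustrative)."""
    n, m = len(x), len(y)
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def lcs_recursive(x, y, base=2):
    """Quadrant-recursive evaluation of the same recurrence (hypothetical sketch).

    `base` is an assumed cutoff below which a region is filled with plain loops.
    """
    n, m = len(x), len(y)
    c = [[0] * (m + 1) for _ in range(n + 1)]

    def solve(i0, i1, j0, j1):
        # Fill cells c[i][j] for i0 <= i < i1 and j0 <= j < j1.
        # Precondition: every cell above and to the left is already final.
        if i0 >= i1 or j0 >= j1:
            return
        if i1 - i0 <= base and j1 - j0 <= base:
            for i in range(i0, i1):
                for j in range(j0, j1):
                    if x[i - 1] == y[j - 1]:
                        c[i][j] = c[i - 1][j - 1] + 1
                    else:
                        c[i][j] = max(c[i - 1][j], c[i][j - 1])
            return
        im = (i0 + i1) // 2
        jm = (j0 + j1) // 2
        solve(i0, im, j0, jm)  # top-left first
        solve(i0, im, jm, j1)  # top-right: needs only top-left
        solve(im, i1, j0, jm)  # bottom-left: needs only top-left
        solve(im, i1, jm, j1)  # bottom-right: needs the other three

    solve(1, n + 1, 1, m + 1)
    return c


if __name__ == "__main__":
    a, b = "ACCGGTCGAGTG", "GTCGTTCGGAATGCCG"
    # Both evaluation orders should produce the same DP table.
    print(lcs_iterative(a, b) == lcs_recursive(a, b))
```

Because each recursive call works on a smaller sub-table, the working set eventually fits in cache at some level of the recursion without the code knowing the cache size, which is the intuition behind the cache-oblivious behavior the abstract describes. The sketch runs the quadrants serially; a real parallel version would need a task runtime.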
Recommendations
AUTOGEN: automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs
PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Autogen: Automatic Discovery of Efficient Recursive Divide-&-Conquer Algorithms for Solving Dynamic Programming Problems
Special Issue: Invited papers from PPoPP 2016, Part 1
Provably Efficient Scheduling of Cache-oblivious Wavefront Algorithms
SPAA '17: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
Iterative wavefront algorithms for evaluating dynamic programming recurrences exploit optimal parallelism but show poor cache performance. Tiled-iterative wavefront algorithms achieve optimal cache complexity and high parallelism but are cache-aware and ...





