On minimizing materializations of array-valued temporaries

Published: 01 November 2006

Abstract

We consider the analysis and optimization of code that uses operations and functions acting on entire arrays. Models are developed for studying how to minimize the number of materializations of array-valued temporaries in basic blocks, each consisting of a sequence of assignment statements involving array-valued variables. We derive lower bounds on the number of materializations required, and develop several algorithms that minimize the number of materializations, subject to a simple constraint on allowable statement rearrangement. In contrast, we also show that when statement rearrangement is unconstrained, minimizing the number of materializations becomes NP-complete, even for very simple basic blocks.
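To make the term "materialization" concrete, the following minimal Python sketch contrasts two evaluations of a two-statement basic block, T = A + B followed by C = T * D. It is only an illustration of the concept under assumed names (`with_temporary`, `fused`, and the arrays `a`, `b`, `d` are hypothetical); it does not reproduce the models or algorithms of the paper.

```python
def with_temporary(a, b, d):
    """Evaluate T = A + B; C = T * D, storing T as a full array."""
    t = [x + y for x, y in zip(a, b)]      # temporary T is materialized here
    return [x * z for x, z in zip(t, d)]   # C is computed from the stored T


def fused(a, b, d):
    """Evaluate the same block with T folded into the loop that produces C."""
    # Each element of T exists only as a scalar inside the loop body,
    # so no array is allocated for the temporary.
    return [(x + y) * z for x, y, z in zip(a, b, d)]


# Hypothetical sample data; both strategies must produce the same C.
a, b, d = [1, 2, 3], [4, 5, 6], [2, 2, 2]
assert with_temporary(a, b, d) == fused(a, b, d)
```

In this sketch the first version materializes one array-valued temporary and the second materializes none. Deciding which temporaries can be avoided this way across a longer sequence of statements is the kind of question the abstract describes.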

