
Compiling for shared-memory and message-passing computers

Published: 01 March 1993

Abstract

Many parallel languages presume a shared address space in which any portion of a computation can access any datum. Some parallel computers directly support this abstraction with hardware shared memory. Other computers provide distinct (per-processor) address spaces and communication mechanisms on which software can construct a shared address space. Since programmers have difficulty explicitly managing address spaces, there is considerable interest in compiler support for shared address spaces on the widely available message-passing computers.
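The software-constructed shared address space the abstract describes can be sketched as follows. This is an illustrative model, not the paper's system: the names (`NUM_PROCS`, `BLOCK`, `gas_read`, `gas_write`, `owner_of`) are hypothetical, and the "remote" accesses that a message-passing machine would perform with explicit messages are modeled here as direct stores into the owner's memory.

```python
# Sketch: a compiler/runtime builds a shared address space over
# distinct per-processor memories by translating global indices
# into (owner, offset) pairs.  All names here are hypothetical.

NUM_PROCS = 4
BLOCK = 8  # elements per processor in a block-distributed array

# Each "processor" owns a private memory segment.
local_mem = [[0] * BLOCK for _ in range(NUM_PROCS)]

def owner_of(global_index):
    """Map a global array index to its owning processor and local offset."""
    return global_index // BLOCK, global_index % BLOCK

def gas_write(global_index, value):
    # On a message-passing machine this would send a PUT message to
    # the owning processor; here the remote store is modeled directly.
    p, off = owner_of(global_index)
    local_mem[p][off] = value

def gas_read(global_index):
    # Likewise, this models a GET message and its reply.
    p, off = owner_of(global_index)
    return local_mem[p][off]

# A computation can now name any element without knowing where it lives:
gas_write(19, 42)   # element 19 resides on processor 2, offset 3
print(gas_read(19)) # -> 42
```

With hardware shared memory the address translation and fetch happen transparently on every load and store; with compiler-implemented shared memory, code like the above is generated (and, where possible, optimized away) by the compiler.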

At first glance, it might appear that hardware-implemented shared memory is unquestionably a better base on which to implement a language. This paper argues, however, that compiler-implemented shared memory, despite its shortcomings, has the potential to exploit the resources of a parallel computer more effectively. Hardware designers need to find mechanisms that combine the advantages of both approaches in a single system.

