article

A spatial path scheduling algorithm for EDGE architectures

Published: 20 October 2006

Abstract

Growing on-chip wire delays are motivating architectural features that expose on-chip communication to the compiler. EDGE architectures are one example of communication-exposed microarchitectures in which the compiler forms dataflow graphs that specify how the microarchitecture maps instructions onto a distributed execution substrate. This paper describes a compiler scheduling algorithm called spatial path scheduling that factors in previously fixed locations, called anchor points, for each placement. This algorithm extends easily to different spatial topologies. We augment this basic algorithm with three heuristics: (1) local and global ALU and network link contention modeling, (2) global critical path estimates, and (3) dependence chain path reservation. We use simulated annealing to explore possible performance improvements and to motivate the augmented heuristics and their weighting functions. We show that the spatial path scheduling algorithm augmented with these three heuristics achieves a 21% average performance improvement over the best prior algorithm and comes within an average of 5% of the annealed performance for our benchmarks.
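
The idea of placing each instruction relative to already-fixed anchor points can be illustrated with a small sketch. This is not the paper's algorithm: the grid shape, the Manhattan-distance routing delay, the per-tile slot limit, the load-based contention penalty, and the names `place` and `manhattan` are all assumptions made for illustration. Here the anchor points are simply the tiles of producers that have already been placed.

```python
from itertools import product


def manhattan(a, b):
    """Hop count between two tiles on a 2D grid (assumed routing delay)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def place(instructions, deps, grid=(2, 2), slots_per_tile=2, latency=1):
    """Greedily place instructions, given in topological order, on a grid.

    deps maps an instruction to the list of its producers.
    Returns (placement, finish): the tile and the estimated finish time
    of every instruction.
    """
    tiles = list(product(range(grid[0]), range(grid[1])))
    load = {t: 0 for t in tiles}  # instructions already assigned per tile
    placement, finish = {}, {}
    for inst in instructions:
        # Anchor points: locations (and finish times) of placed producers.
        anchors = [(placement[p], finish[p]) for p in deps.get(inst, [])]
        best = None
        for t in tiles:
            if load[t] >= slots_per_tile:
                continue  # tile is full
            # Earliest time all operands can reach tile t.
            arrival = max((f + manhattan(loc, t) for loc, f in anchors),
                          default=0)
            # Hypothetical cost: arrival time plus a crude contention
            # penalty; the tile coordinate breaks ties deterministically.
            cost = (arrival + load[t], t)
            if best is None or cost < best[0]:
                best = (cost, t, arrival)
        if best is None:
            raise ValueError("no free slot left on the grid")
        _, tile, arrival = best
        placement[inst] = tile
        finish[inst] = arrival + latency
        load[tile] += 1
    return placement, finish


# A three-instruction dependence chain a -> b -> c.
deps = {"b": ["a"], "c": ["b"]}
placement, finish = place(["a", "b", "c"], deps)
```

With these assumed parameters the chain stays on one tile until its slots run out and then spills to a neighbouring tile, which is the kind of trade-off between routing distance and contention that the abstract's first heuristic refers to.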

Published in

ACM SIGPLAN Notices, Volume 41, Issue 11
Proceedings of the 2006 ASPLOS Conference
November 2006
425 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1168918

ASPLOS XII: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
October 2006
440 pages
ISBN: 1595934510
DOI: 10.1145/1168857

Copyright © 2006 ACM

Publisher: Association for Computing Machinery, New York, NY, United States