DOI: 10.1145/1504176.1504195
research-article

Comparability graph coloring for optimizing utilization of stream register files in stream processors

Published: 14 February 2009

ABSTRACT

A stream processor executes an application that has been decomposed into a sequence of kernels that operate on streams of data elements. During the execution of a kernel, all streams accessed must be communicated through the SRF (Stream Register File), a non-bypassing software-managed on-chip memory. Therefore, optimizing utilization of the SRF is crucial for good performance. The key insight is that the interference graphs (IGs) formed by the streams in stream applications tend to be comparability graphs or decomposable into a set of multiple comparability graphs. We present a compiler algorithm that can find optimal or near-optimal colorings in stream IGs, thereby achieving better SRF utilization than the First-Fit bin-packing algorithm, the best in the literature.
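
To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each stream has a size in SRF words and that a transitive orientation of the interference graph is already available (computing one is not shown). Under those assumptions, the textbook result for comparability graphs applies: placing each stream at the weight of the heaviest oriented path leading into it gives an allocation whose total size equals the heaviest clique, which is optimal. The function names, the four-stream example, and the placement order used for First-Fit are all hypothetical.

```python
from collections import defaultdict


def color_transitive_orientation(sizes, arcs):
    """Assign SRF offsets from a transitive orientation of the IG.

    sizes: dict mapping stream name -> size in words (assumed input).
    arcs:  oriented interference edges (u, v); assumed acyclic and transitive.
    Returns (offsets, total_words).
    """
    succs = defaultdict(list)
    indeg = {s: 0 for s in sizes}
    for u, v in arcs:
        succs[u].append(v)
        indeg[v] += 1
    offsets = {s: 0 for s in sizes}
    ready = [s for s in sizes if indeg[s] == 0]
    done = 0
    while ready:  # process streams in topological order
        u = ready.pop()
        done += 1
        for v in succs[u]:
            # v must start after every interfering predecessor ends
            offsets[v] = max(offsets[v], offsets[u] + sizes[u])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if done != len(sizes):
        raise ValueError("orientation contains a cycle")
    total = max((offsets[s] + sizes[s] for s in sizes), default=0)
    return offsets, total


def first_fit(sizes, edges, order):
    """Baseline: place streams in the given order, each at the lowest
    offset that does not overlap an already placed interfering stream."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    offsets = {}
    for s in order:
        busy = sorted((offsets[n], offsets[n] + sizes[n])
                      for n in nbrs[s] if n in offsets)
        pos = 0
        for lo, hi in busy:
            if pos + sizes[s] <= lo:  # the gap before this neighbour fits
                break
            pos = max(pos, hi)
        offsets[s] = pos
    total = max((offsets[s] + sizes[s] for s in sizes), default=0)
    return offsets, total


# Hypothetical four-stream example: the IG is the path a-b-c-d.
sizes = {"a": 2, "b": 1, "c": 1, "d": 2}
arcs = [("a", "b"), ("c", "b"), ("c", "d")]  # one transitive orientation

opt_offsets, opt_total = color_transitive_orientation(sizes, arcs)
ff_offsets, ff_total = first_fit(sizes, arcs, order=["a", "d", "b", "c"])
print(opt_total, ff_total)
```

For this particular placement order, First-Fit needs 4 words while the orientation-based placement needs 3, which matches the heaviest interfering pair (a and b, or c and d). A different First-Fit order can also reach 3; the sketch only illustrates that First-Fit carries no optimality guarantee.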


      • Published in

        PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
        February 2009
        322 pages
        ISBN: 9781605583976
        DOI: 10.1145/1504176

        ACM SIGPLAN Notices, Volume 44, Issue 4
        PPoPP '09
        April 2009
        294 pages
        ISSN: 0362-1340
        EISSN: 1558-1160
        DOI: 10.1145/1594835

        Copyright © 2009 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 230 of 1,014 submissions, 23%