DOI: 10.1145/1375657.1375677
research-article

Post-pass periodic register allocation to minimise loop unrolling degree

Published: 12 June 2008

ABSTRACT

This paper solves an open problem regarding loop unrolling after periodic register allocation. Although software pipelining is a powerful technique for extracting fine-grain parallelism, it generates reuse circuits spanning multiple loop iterations. These circuits require periodic register allocation, which in turn poses a code generation challenge, generally addressed through: (1) hardware support (rotating register files), deemed too expensive for embedded processors; (2) insertion of register moves, with a high risk of degrading the computation throughput (the initiation interval, II) of software pipelining; and (3) post-pass loop unrolling, which does not compromise throughput but often leads to impractical code growth. The latter approach relies on the proof that MAXLIVE registers are sufficient for periodic register allocation (2; 3; 5); yet the only heuristic to control the amount of post-pass loop unrolling does not achieve this bound and leads to undesired register spills (4; 7).
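To make the MAXLIVE bound concrete, the following minimal sketch counts the peak number of simultaneously live values in the steady state of a software-pipelined loop. It assumes the usual textbook definition of MAXLIVE, which the abstract does not spell out; the function name `maxlive`, the half-open `(start, end)` lifetime encoding, and the sample data are hypothetical.

```python
def maxlive(lifetimes, ii):
    """Peak register pressure of a periodic schedule (illustrative sketch).

    lifetimes: (start, end) cycle pairs for the values of ONE iteration,
               with `end` exclusive (assumed encoding, not from the paper).
    ii:        initiation interval; a new iteration starts every `ii` cycles.
    """
    if ii <= 0:
        raise ValueError("initiation interval must be positive")
    # pressure[t] = number of values live at slot t of the steady-state kernel
    pressure = [0] * ii
    for start, end in lifetimes:
        # Folding each live cycle modulo II also counts the overlapping
        # instances of the same value coming from successive iterations.
        for cycle in range(start, end):
            pressure[cycle % ii] += 1
    return max(pressure)


# Three values per iteration, one new iteration every 2 cycles.
print(maxlive([(0, 3), (1, 5), (2, 4)], ii=2))
```

A lifetime longer than II contributes more than one register at some slot, which is why such values form reuse circuits spanning several iterations.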

We propose a periodic register allocation technique that enables software-only code generation without trading the optimality of the II for compactness of the generated code. Our idea is to exploit the remaining registers: if Rarch denotes the number of architectural registers of the target processor, then the number of remaining registers available for minimising the unrolling degree is Rarch - MAXLIVE.
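The idea of spending the Rarch - MAXLIVE remaining registers can be illustrated with a small brute-force sketch. It rests on an assumption that is not stated in this abstract: that the unrolling degree equals the least common multiple of the reuse circuit weights, and that giving a spare register to a circuit raises its weight by one. The function `min_unroll_degree` and the sample weights are hypothetical, and the exhaustive search is only a stand-in for illustration, not the algorithm of the paper.

```python
from functools import reduce
from itertools import product
from math import gcd


def lcm_all(values):
    """Least common multiple of a sequence of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def min_unroll_degree(circuit_weights, r_arch, maxlive):
    """Exhaustively spread spare registers over circuits to shrink the LCM.

    Assumption (not from the abstract): unrolling degree = LCM of the
    circuit weights, and each spare register adds 1 to one circuit's weight.
    """
    remaining = r_arch - maxlive  # registers left after the allocation
    if remaining < 0:
        raise ValueError("MAXLIVE exceeds the architectural register count")
    best_weights = tuple(circuit_weights)
    best_degree = lcm_all(best_weights)
    # Try every distribution of at most `remaining` extra registers.
    for extra in product(range(remaining + 1), repeat=len(circuit_weights)):
        if sum(extra) > remaining:
            continue
        candidate = tuple(w + e for w, e in zip(circuit_weights, extra))
        degree = lcm_all(candidate)
        if degree < best_degree:
            best_degree, best_weights = degree, candidate
    return best_degree, best_weights


# Circuits of weight 2, 3 and 5 use 10 registers; 13 are available.
print(min_unroll_degree([2, 3, 5], r_arch=13, maxlive=10))
```

In this toy instance a single spare register turns the weights (2, 3, 5) into (2, 3, 6) and lowers the degree from 30 to 6. The search is exponential in the number of circuits, so it only serves to show the effect on small inputs.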

We provide a complete formalisation of the problem and algorithm, followed by extensive experiments. We achieve practical loop unrolling degrees in most cases, with no increase of the II, whereas state-of-the-art techniques would induce register spilling, degrade the II, or lead to unacceptable code growth.

References

  1. D. de Werra, C. Eisenbeis, S. Lelait, and B. Marmol. On a Graph-Theoretical Model for Cyclic Register Allocation. Discrete Applied Mathematics, 93(2-3):191--203, July 1999.
  2. S.A.A. Touati and C. Eisenbeis. Early Periodic Register Allocation on ILP Processors. Parallel Processing Letters, Vol. 14, No. 2, June 2004. World Scientific.
  3. L. J. Hendren, G. R. Gao, E. R. Altman, and C. Mukerji. A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs. In Proceedings of the International Conference on Compiler Construction (CC'92). Lecture Notes in Computer Science, 641:176--191, 1992.
  4. M. Lam. Software Pipelining: An Effective Scheduling Technique for VLIW Machines. In Proceedings of the SIGPLAN 88 Conference on Programming Language Design and Implementation, pages 318--328, Atlanta, Georgia, June 22-24, 1988.
  5. C. Eisenbeis, S. Lelait, and B. Marmol. The Meeting Graph: A New Model for Loop Cyclic Register Allocation. In Proceedings of the IFIP WG 10.3 Working Conference on Parallel Architectures and Compilation Techniques, PACT 95, pages 264--267, Limasol, Cyprus, June 1995. ACM Press.
  6. J.C. Dehnert, P.Y. Hsu, and J.P. Bratt. Overlapped Loop Support in the Cydra 5. In Proceedings of the Third International Conference on Architectural Support for Programming Languages and Operating Systems, pages 26--38, Boston, Massachusetts, 1989.
  7. S. Lelait. Contribution à l'Allocation de Registres dans les Boucles. PhD thesis, Université d'Orléans, France, January 1996.
  8. S.A.A. Touati. Register Pressure in Instruction Level Parallelism. PhD thesis, Université de Versailles, France, June 2002.
  9. B.D. de Dinechin. A Unified Software Pipeline Construction Scheme For Modulo Scheduled Loops. In Proceedings of the 4th International Conference on Parallel Computing Technologies, pages 189--200, Yaroslavl, Russia, August 7-9, 1997.
  10. R. Cytron and J. Ferrante. What's in a Name? or the Value of Renaming for Parallelism Detection and Storage Allocation. In Proceedings of the 1987 International Conference on Parallel Processing, pages 19--27, Pennsylvania, August 1987.
  11. A. Nicolau, R. Potasman, and H. Wang. Register Allocation, Renaming and Their Impact on Fine-Grain Parallelism. In Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing, Santa Clara, California, August 1991.
  12. R. A. Huff. Lifetime-Sensitive Modulo Scheduling. In Proceedings of the ACM SIGPLAN 93 Conference on Programming Language Design and Implementation, pages 258--267, Albuquerque, New Mexico, June 23-25, 1993.
  13. J. A. Fisher, P. Faraboschi, and C. Young. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools. Morgan Kaufmann Publishers, 2005.

    • Published in

      LCTES '08: Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
      June 2008
      180 pages
      ISBN: 9781605581040
      DOI: 10.1145/1375657
      • ACM SIGPLAN Notices, Volume 43, Issue 7
        LCTES '08
        July 2008
        167 pages
        ISSN: 0362-1340
        EISSN: 1558-1160
        DOI: 10.1145/1379023

      Copyright © 2008 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 116 of 438 submissions, 26%
