Minimal placement of bank selection instructions for partitioned memory architectures

Published: 29 January 2008

Abstract

We have devised an algorithm for minimal placement of bank selections in partitioned memory architectures. This algorithm is parameterizable for a chosen metric, such as speed, space, or energy. Bank switching is a technique that increases the code and data memory in microcontrollers without extending the address buses. Given a program in which variables have been assigned to data banks, we present a novel optimization technique that minimizes the overhead of bank switching through cost-effective placement of bank selection instructions. The placement is controlled by a number of different objectives, such as runtime, low power, small code size, or a combination of these parameters. We have formulated the minimal placement of bank selection instructions as a discrete optimization problem that is mapped to a partitioned Boolean quadratic programming (PBQP) problem. We implemented the optimization as part of a PIC Microchip backend and evaluated the approach for several optimization objectives. Our benchmark suite comprises programs from MiBench and DSPStone plus a microcontroller real-time kernel and drivers for microcontroller hardware devices. Our optimization achieved a reduction in program memory space of between 2.7% and 18.2%, and an overall improvement in instruction cycles of between 5.0% and 28.8%. Our optimization achieved the minimal solution for all benchmark programs. We investigated the scalability of our approach toward the requirements of future generations of microcontrollers. This study was conducted as a worst-case analysis on the entire MiBench suite. Our results show that our optimization (1) scales well to larger numbers of memory banks, (2) scales well to the larger problem sizes that will become feasible with future microcontrollers, and (3) achieves minimal placement for more than 72% of all functions from MiBench.
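
To make the PBQP mapping concrete, the following is a minimal sketch and not the formulation or solver used in the article. It assumes a simplified model: each control-flow node chooses which bank is selected while it executes, a node that accesses a variable may only choose that variable's bank, and an edge whose endpoints choose different banks must carry one bank selection instruction, charged at the edge's execution frequency. The names `build_problem` and `solve_pbqp`, the exhaustive search, and the example graph are all illustrative assumptions.

```python
from itertools import product

INF = float("inf")

def build_problem(accessed_bank, edges, num_banks, switch_cost=1):
    """Build PBQP cost vectors and matrices (hypothetical simplified model).

    accessed_bank[n] is the bank node n must have selected, or None if
    node n touches no banked memory. edges holds (u, v, frequency) triples.
    """
    # Node cost vector: selecting the wrong bank is infeasible (infinite cost).
    node_costs = [
        [0 if bank is None or bank == state else INF
         for state in range(num_banks)]
        for bank in accessed_bank
    ]
    # Edge cost matrix: a bank change along the edge costs one selection
    # instruction, weighted by how often the edge is taken.
    edge_costs = {
        (u, v): [[0 if a == b else freq * switch_cost
                  for b in range(num_banks)]
                 for a in range(num_banks)]
        for (u, v, freq) in edges
    }
    return node_costs, edge_costs

def solve_pbqp(node_costs, edge_costs):
    """Exhaustive search; only practical for tiny illustrative instances."""
    best_cost, best_choice = INF, None
    domains = [range(len(costs)) for costs in node_costs]
    for choice in product(*domains):
        total = sum(node_costs[n][choice[n]] for n in range(len(node_costs)))
        total += sum(matrix[choice[u]][choice[v]]
                     for (u, v), matrix in edge_costs.items())
        if total < best_cost:
            best_cost, best_choice = total, choice
    return best_cost, best_choice

# Assumed example: node 0 uses bank 0, node 1 is a loop header with no
# access, node 2 is a loop body using bank 1, node 3 is the exit using bank 0.
accessed_bank = [0, None, 1, 0]
edges = [(0, 1, 1), (1, 2, 10), (2, 1, 10), (1, 3, 1)]

node_costs, edge_costs = build_problem(accessed_bank, edges, num_banks=2)
cost, states = solve_pbqp(node_costs, edge_costs)
placements = [(u, v) for (u, v, _) in edges if states[u] != states[v]]
print(cost, states, placements)  # prints: 2 (0, 1, 1, 0) [(0, 1), (1, 3)]
```

In this toy instance the cheapest assignment keeps bank 1 selected in the loop header, so the two bank selections land on the loop entry and exit edges instead of inside the loop. Replacing the edge frequencies with a constant of 1 would turn the same model into a code-size objective, which is one way to read the abstract's remark that the placement can be tuned for different metrics.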
