DOI: 10.1145/1111037.1111055 (POPL Conference Proceedings)
Article

Compiler-directed channel allocation for saving power in on-chip networks

Published: 11 January 2006

ABSTRACT

Increasing complexity in the communication patterns of embedded applications parallelized over multiple processing units makes it difficult to continue using traditional bus-based on-chip communication techniques. The main contribution of this paper is to demonstrate the importance of compiler technology in reducing the power consumption of applications designed for emerging multiprocessor, NoC (Network-on-Chip) based embedded systems. Specifically, we propose and evaluate a compiler-directed approach to NoC power management in the context of array-intensive applications, which are used frequently in embedded image/video processing. The unique characteristic of the compiler-based approach proposed in this paper is that it increases the idle periods of communication channels by reusing the same set of channels for as many communication messages as possible. The unused channels can then take better advantage of the underlying power-saving mechanism employed by the network architecture. However, this channel reuse optimization should be applied with care, as it can hurt performance if two or more simultaneous communications are mapped onto the same set of channels. Therefore, the problem addressed in this paper is one of reducing the number of channels used to implement a set of communications without significantly increasing the communication latency. To test the effectiveness of our approach, we implemented it within an optimizing compiler and performed experiments using twelve application codes and a network simulation environment. Our experiments show that the proposed compiler-based approach is very successful in practice and works well under both hardware-based and software-based channel turn-off schemes.
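The channel reuse idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out; it is a simple greedy heuristic under stated assumptions: each message has a known time window and a list of candidate routes, a route is a tuple of channel names, and two messages may share a channel only if their windows do not overlap. The names `Message`, `overlaps`, and `allocate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    name: str
    start: int     # first cycle in which the message occupies its channels
    end: int       # last cycle (inclusive)
    routes: tuple  # candidate routes, each a tuple of channel names


def overlaps(a, b):
    """True if the two messages are in flight during a common cycle."""
    return a.start <= b.end and b.start <= a.end


def allocate(messages):
    """Greedily pick, for each message, the conflict-free route that
    activates the fewest channels not already in use."""
    used = {}    # channel name -> messages already mapped onto it
    choice = {}  # message name -> chosen route
    for msg in sorted(messages, key=lambda m: m.start):
        best = None
        for route in msg.routes:
            # Reject routes that share a channel with a simultaneous message.
            if any(overlaps(msg, other)
                   for ch in route for other in used.get(ch, [])):
                continue
            new_channels = sum(1 for ch in route if ch not in used)
            if best is None or new_channels < best[0]:
                best = (new_channels, route)
        if best is None:
            # Assumption: if every route conflicts, accept contention
            # on the first candidate rather than fail.
            best = (0, msg.routes[0])
        choice[msg.name] = best[1]
        for ch in best[1]:
            used.setdefault(ch, []).append(msg)
    return choice, set(used)


# Two messages between the same endpoints at different times.
north = ("A-B", "B-D")
south = ("A-C", "C-D")
msgs = [
    Message("m1", 0, 3, (north, south)),
    Message("m2", 5, 8, (south, north)),
]
choice, used = allocate(msgs)
```

In this example `m2` is steered onto the channels `m1` already used, so the two channels of the other route are never activated and can stay in a low-power state for the whole run. A message overlapping `m1` in time would instead be pushed onto the other route, which is the latency safeguard the abstract mentions.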

