Heuristics on Reachability Trees for Bicriteria Scheduling of Stream Graphs on Heterogeneous Multiprocessor Architectures

Published: 17 February 2015

Abstract

In this article, we partition and schedule Synchronous Dataflow (SDF) graphs onto heterogeneous execution architectures so as to minimize energy consumption and maximize throughput. Partitioning and scheduling SDF graphs onto homogeneous architectures is a well-known NP-hard problem, and the heterogeneity of the execution architecture makes our problem exponentially more challenging to solve. We model the problem as a weighted sum and solve it using a novel state space exploration inspired by the theory of parallel automata. The resulting heuristic algorithm produces good schedules when implemented in an existing stream framework.
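
To make the weighted-sum idea concrete, the following is a minimal sketch of how two criteria (energy and throughput) can be folded into a single scalar objective. It is not the article's algorithm: it enumerates every mapping exhaustively instead of applying heuristics on a reachability tree, and the actor names, processor types, cost table, and weight `w` are all hypothetical values chosen for illustration. Throughput is approximated as the inverse of the load on the busiest processor, so minimizing that load stands in for maximizing throughput.

```python
from itertools import product

# Hypothetical per-actor costs on each processor type: (execution time, energy).
# These numbers are invented for illustration and do not come from the article.
COSTS = {
    "A": {"cpu": (4.0, 8.0), "dsp": (8.0, 3.0)},
    "B": {"cpu": (6.0, 12.0), "dsp": (9.0, 4.0)},
    "C": {"cpu": (2.0, 4.0), "dsp": (4.0, 1.0)},
}
PROCESSORS = ("cpu", "dsp")


def evaluate(mapping):
    """Return (period, energy) for a mapping of actor -> processor.

    The period is the load of the busiest processor; throughput is
    assumed to be 1 / period (a simplification for this sketch).
    """
    load = {p: 0.0 for p in PROCESSORS}
    energy = 0.0
    for actor, proc in mapping.items():
        time, joules = COSTS[actor][proc]
        load[proc] += time
        energy += joules
    return max(load.values()), energy


def weighted_cost(mapping, w):
    """Scalarize both criteria: w weights energy, (1 - w) weights the period."""
    period, energy = evaluate(mapping)
    return w * energy + (1.0 - w) * period


def best_mapping(w):
    """Exhaustively search all mappings and return the one with lowest cost."""
    actors = sorted(COSTS)
    candidates = (
        dict(zip(actors, choice))
        for choice in product(PROCESSORS, repeat=len(actors))
    )
    return min(candidates, key=lambda m: weighted_cost(m, w))


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        m = best_mapping(w)
        print(w, m, evaluate(m))
```

Sweeping `w` from 0 to 1 moves the chosen mapping from throughput-oriented to energy-oriented. In practice the two terms would likely need normalizing to comparable units before being summed, and exhaustive enumeration grows exponentially with the number of actors, which is why a heuristic search is needed for realistic graphs.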
