Abstract
Massive amounts of legacy sequential code need to be parallelized to make better use of modern multiprocessor architectures. Writing parallel programs, however, remains a difficult task. Automated parallelization methods can be effective at the statement and loop levels and, more recently, at the task level, but they are still restricted to specific source-code constructs or application domains. In this article, we present a toolset that supports developers in performing manual code analysis and making parallelization decisions. It automatically collects the program profile and data dependencies and represents them in an interactive graphical format that facilitates the analysis and discovery of parallelization opportunities. The toolset can be applied to arbitrary sequential C programs and parallelization patterns. Moreover, because it traces data dependencies at runtime over the whole program scope, it can both complement and benefit from tools based on static code analysis. We evaluated the effectiveness of the toolset in terms of the time needed to reach parallelization decisions and the quality of those decisions, and measured a significant improvement for several representative real-world applications.
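As a minimal illustration of the kind of opportunity that runtime dependency tracing can expose (the example and the OpenMP pragma below are ours, not part of the toolset described in the article): a loop over two pointer arguments is hard to prove dependence-free by static analysis alone, because the pointers might alias, but a trace of a representative run can show that no cross-iteration dependency actually occurs, after which the developer can parallelize the loop manually.

```c
/* Illustrative sketch only, assuming OpenMP as the manual parallelization
 * target; the toolset in the article does not rewrite code itself. */
#include <stdio.h>
#include <stdlib.h>

/* Static analysis must conservatively assume 'in' and 'out' may alias.
 * A runtime dependency trace of a representative execution can show that
 * each iteration writes only out[i] and reads only in[i], i.e., there is
 * no loop-carried dependency. */
static void scale(const double *in, double *out, size_t n, double k)
{
    #pragma omp parallel for   /* added manually after inspecting the trace */
    for (size_t i = 0; i < n; i++)
        out[i] = k * in[i];
}

int main(void)
{
    enum { N = 1000000 };
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    if (!a || !b)
        return 1;

    for (size_t i = 0; i < N; i++)
        a[i] = (double)i;

    scale(a, b, N, 2.0);
    printf("b[42] = %f\n", b[42]);

    free(a);
    free(b);
    return 0;
}
```

Compiled with an OpenMP-capable compiler (e.g., gcc -fopenmp) the loop runs in parallel; without the flag the pragma is ignored and the program stays sequential, so the sketch remains a valid sequential C program either way.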