Abstract
We present Dynamic Out-of-Order Java (DOJ), a dynamic parallelization approach. In DOJ, a developer annotates code blocks as tasks to decouple these blocks from the parent execution thread. The DOJ compiler then analyzes the code to generate heap examiners that ensure the parallel execution preserves the behavior of the original sequential program. Heap examiners dynamically extract heap dependences between code blocks and determine when it is safe to execute a code block.
We have implemented DOJ and evaluated it on twelve benchmarks. We achieved an average compilation speedup of 31.15 times over OoOJava and an average execution speedup of 12.73 times over sequential versions of the benchmarks.
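The dynamic dependence tracking described above can be pictured with a minimal Java sketch. It is not DOJ's implementation: in DOJ the compiler generates the heap examiners, whereas here each task passes its read and write sets by hand to a hypothetical `HeapDependenceScheduler`. The class names, the `submit` signature, and the use of `CompletableFuture` are all assumptions made for illustration. Tasks are issued in program order, and each one runs only after every earlier task that conflicts with it on some heap object has finished, so the result matches sequential execution.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch (not DOJ's API): orders tasks by the heap objects they touch. */
final class HeapDependenceScheduler {
    // For each object: the last task that wrote it, and the tasks that read it since.
    private final Map<Object, CompletableFuture<Void>> lastWriter = new IdentityHashMap<>();
    private final Map<Object, List<CompletableFuture<Void>>> readersSinceWrite =
            new IdentityHashMap<>();

    /** Call in program order; the body runs once all conflicting earlier tasks are done. */
    synchronized CompletableFuture<Void> submit(Set<Object> reads, Set<Object> writes,
                                                Runnable body) {
        List<CompletableFuture<Void>> deps = new ArrayList<>();
        for (Object o : reads) {
            CompletableFuture<Void> w = lastWriter.get(o);
            if (w != null) deps.add(w);                        // read-after-write
        }
        for (Object o : writes) {
            CompletableFuture<Void> w = lastWriter.get(o);
            if (w != null) deps.add(w);                        // write-after-write
            deps.addAll(readersSinceWrite.getOrDefault(o, List.of())); // write-after-read
        }
        // With no dependences, allOf() is already complete and the task starts at once.
        CompletableFuture<Void> done = CompletableFuture
                .allOf(deps.toArray(new CompletableFuture<?>[0]))
                .thenRunAsync(body);
        for (Object o : reads) {
            readersSinceWrite.computeIfAbsent(o, k -> new ArrayList<>()).add(done);
        }
        for (Object o : writes) {
            lastWriter.put(o, done);
            readersSinceWrite.put(o, new ArrayList<>());       // new write resets the readers
        }
        return done;
    }
}

/** Three tasks: T1 and T2 are independent, T3 depends on both. */
final class HeapDependenceDemo {
    static int[] run() {
        int[] a = {1};
        int[] b = {10};
        HeapDependenceScheduler s = new HeapDependenceScheduler();
        s.submit(Set.of(), Set.of(a), () -> a[0] += 1);        // T1 writes a
        s.submit(Set.of(), Set.of(b), () -> b[0] *= 2);        // T2 writes b, may overlap T1
        CompletableFuture<Void> t3 =
                s.submit(Set.of(a, b), Set.of(a), () -> a[0] += b[0]); // T3 waits for T1, T2
        t3.join();
        return new int[] {a[0], b[0]};                         // same as sequential: {22, 20}
    }
}
```

In this sketch T1 and T2 touch disjoint objects and may run concurrently, while T3 is held back until both complete. Declaring the read and write sets manually is the main simplification; the abstract attributes that work to compiler-generated heap examiners that inspect the heap at run time.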
DOJ: dynamically parallelizing object-oriented programs
PPoPP '12: Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming