Abstract
Isolation---the property that a task can access shared data without interference from other tasks---is one of the most basic concerns in parallel programming. In this paper, we present Aida, a new model of isolated execution for parallel programs that perform frequent, irregular accesses to pointer-based shared data structures. The three primary benefits of Aida are dynamism, safety and liveness guarantees, and programmability. First, Aida allows tasks to dynamically select and modify, in an isolated manner, arbitrary fine-grained regions in shared data structures, all while maintaining a high level of concurrency. Consequently, the model can achieve scalable parallelization of regular as well as irregular shared-memory applications. Second, the model guarantees freedom from data races, deadlocks, and livelocks. Third, no extra burden is imposed on programmers, who access the model via a simple, declarative isolation construct similar to that of transactional memory. The key new insight in Aida is a notion of delegation among concurrent isolated tasks (known in Aida as assemblies). Each assembly A is equipped with a region in the shared heap that it owns---the only objects A accesses are those it owns, guaranteeing race-freedom. The region owned by A can grow or shrink flexibly; however, when A needs to own a datum owned by B, A delegates itself, as well as its owned region, to B. From that point on, B carries the responsibility of re-executing the task that A set out to complete. Delegation as described above is the only inter-assembly communication primitive in Aida. In addition to reducing contention in a local, data-driven manner, it guarantees freedom from deadlocks and livelocks.
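The delegation protocol described above can be illustrated with a minimal, sequential sketch. All class and method names below (`Assembly`, `tryAcquireOrDelegate`) are illustrative, not the real Aida API, and the sketch omits the concurrency machinery of the actual implementation; it only shows the ownership transfer: when assembly A needs a datum owned by B, A's entire region and its pending task are absorbed by B, and A terminates rather than blocking.

```java
// Hypothetical sketch of Aida-style delegation (names are illustrative).
// Each assembly owns a region of the shared heap; it may only touch
// objects in that region, which guarantees race-freedom.
class Assembly {
    final java.util.Set<Object> ownedRegion = new java.util.HashSet<>();
    final java.util.Deque<Runnable> pendingTasks = new java.util.ArrayDeque<>();

    // Attempt to bring obj into this assembly's region; 'owner' is the
    // assembly currently owning obj (null if unowned).
    boolean tryAcquireOrDelegate(Object obj, Assembly owner, Runnable task) {
        if (owner == this || owner == null) {
            ownedRegion.add(obj);       // fine-grained growth of the region
            return true;                // caller proceeds in isolation
        }
        // Delegation: the owner absorbs this assembly's region and task.
        owner.ownedRegion.addAll(this.ownedRegion);
        this.ownedRegion.clear();
        owner.pendingTasks.add(task);   // owner will re-execute the task
        return false;                   // this assembly terminates; no
                                        // blocking, hence no deadlock
    }
}
```

Because a delegating assembly gives up everything and terminates instead of waiting, there is no circular wait, which is the intuition behind Aida's deadlock-freedom guarantee.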
We offer an implementation of Aida on top of the Habanero Java parallel programming language. The implementation employs several novel ideas, including the use of a union-find data structure to represent tasks and the regions that they own. A thorough evaluation using several irregular data-parallel benchmarks demonstrates the low overhead and excellent scalability of Aida, as well as its benefits over existing approaches to declarative isolation. Our results show that Aida performs on par with state-of-the-art customized implementations of irregular applications and substantially better than coarse-grained locking and transactional-memory approaches.
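The union-find representation mentioned above fits delegation naturally: merging one assembly's region into another's is a union, and finding the assembly currently responsible for a datum is a find. The following is a generic union-find sketch with path halving, not the paper's actual implementation (which must additionally be concurrency-safe); element indices standing for assemblies are an assumption for illustration.

```java
// Illustrative union-find with path halving. Each index stands for an
// assembly; after delegation, find() returns the assembly currently
// responsible for the merged region.
class UnionFind {
    private final int[] parent;

    UnionFind(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;  // each its own root
    }

    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving: near-O(1) finds
            x = parent[x];
        }
        return x;
    }

    void union(int a, int b) {              // delegation of a to b
        parent[find(a)] = find(b);
    }
}
```

With this representation, an assembly that delegates never needs to notify the objects in its region individually; a single pointer update reassigns responsibility for the whole region.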
Delegated isolation. OOPSLA '11: Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications.