Abstract
Existing dynamic race detectors suffer from at least one of the following three limitations:
(i)space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle;
(ii)sequentialization: the parallel program must be processed in a sequential order, usually depth-first [12, 24]. This prevents the analysis from scaling with available hardware parallelism, inherently limiting its performance;
(iii) inefficiency: even though race detectors with good theoretical complexity exist, they do not admit efficient implem entations and are unsuitable for practical use [4, 18].
We present a new precise dynamic race detector that leverages structured parallelism in order to address these limitations. Our algorithm requires constant space per memory location, works in parallel, and is efficient in practice. We implemented and evaluated our algorithm on a set of 15 benchmarks. Our experimental results indicate an average (geometric mean) slowdown of 2.78x on a 16-core SMP system.
Supplemental Material
Available for Download
Included files: This folder includes the following files: the only high-level tex file, Figures and Building the document instructions (to create PS and PDF files): latex draft, dvips draft, ps2pdf draft.ps
- Intel Cilk++ Programmer's Guide. http://software.intel.com/enus/ articles/download-intel-cilk-sdk/.Google Scholar
- The Computer Language Benchmarks Game. http://shootout.alioth.debian.org/.Google Scholar
- BARIK, R., ET AL. The habanero multicore software research project. In Proceeding of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications (2009), ACM, pp. 735--736. Google Scholar
Digital Library
- BENDER, M. A., ET AL. On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs. In SPAA'04 (Barcelona, Spain, June27-30 2004), pp. 133--144. Google Scholar
Digital Library
- BLUMOFE, R. D., ET AL. Cilk: an efficient multithreaded runtime system. In PPoPP'95 (Oct. 1995), pp. 207--216. Google Scholar
Digital Library
- CAVE, V., ZHAO, J., SHIRAKO, J., AND SARKAR, V. Habanero-java: the new adventures of old x10. In PPPJ'11 (2011). Google Scholar
Digital Library
- CHARLES, P., ET AL. X10: An object-oriented approach to nonuniform cluster computing. In OOPSLA 2005 Onward! Track (2005). Google Scholar
Digital Library
- CHENG, G.-I., FENG, M., LEISERSON, C. E., RANDALL, K. H., AND STARK, A. F. Detecting data races in cilk programs that use locks. In SPAA'98 (1998), pp. 298--309. Google Scholar
Digital Library
- CONG, J., SARKAR, V., REINMAN, G., AND BUI, A. Customizable Domain-Specific Computing. IEEE Design and Test, 2:28 (Mar 2011), 6--15. Google Scholar
Digital Library
- DINNING, A., AND SCHONBERG, E. An empirical comparison of monitoring algorithms for access anomaly detection. In PPoPP'90 (1990), ACM, pp. 1--10. Google Scholar
Digital Library
- DURAN, A., ET AL. Barcelona OpenMP tasks suite: A set of benchmarks targeting the exploitation of task parallelism in openmp. In ICPP'09 (2009), pp. 124--131. Google Scholar
Digital Library
- FENG,M., AND LEISERSON,C. E. Efficient detection of determinacy races in cilk programs. In SPAA'97 (1997), ACM, pp. 1--11. Google Scholar
Digital Library
- FLANAGAN, C., AND FREUND, S. N. FastTrack: efficient and precise dynamic race detection. In PLDI '09 (2009), ACM, pp. 121--133. Google Scholar
Digital Library
- FLANAGAN, C., AND FREUND, S. N. The roadrunner dynamic analysis framework for concurrent programs. In PASTE'10 (2010), ACM, pp. 1--8. Google Scholar
Digital Library
- FRIGO, M., LEISERSON, C. E., AND RANDALL, K. H. The implementation of the cilk-5 multithreaded language. In PLDI'98 (1998), ACM, pp. 212--223. Google Scholar
Digital Library
- GUO, Y., ET AL. Work-first and help-first scheduling policies for async-finish task parallelism. In IPDPS'09 (2009), IEEE Computer Society, pp. 1--12. Google Scholar
Digital Library
- GUO, Y., ZHAO, J., CAVE, V., AND SARKAR, V. Slaw: A scalable locality-aware adaptive work-stealing scheduler. In IPDPS (2010). Google Scholar
Digital Library
- KARUNARATNA, T. C. Nondeterminator-3: A provably good datarace detector that runs in parallel. Master's thesis, Department of Electrical Engineering and Computer Science, MIT,, Sept. 2005.Google Scholar
- LAMPORT, L. Concurrent reading and writing. Commun. ACM 20 (November 1977), 806--811. Google Scholar
Digital Library
- LEE, J. K., AND PALSBERG, J. Featherweight x10: a core calculus for async-finish parallelism. In PPoPP'10 (2010), ACM, pp. 25--36. Google Scholar
Digital Library
- MELLOR-CRUMMEY, J. On-the-fly detection of data races for programs with nested fork-join parallelism. In Supercomputing'91 (1991), ACM, pp. 24--33. Google Scholar
Digital Library
- MELLOR-CRUMMEY, J. Compile-time support for efficient data race detection in shared-memory parallel programs. In PADD '93: Proceedings of the ACM/ONR workshop on Parallel and distributed debugging (1993), ACM, pp. 129--139. Google Scholar
Digital Library
- OpenMP Application Program Interface v 3.0, 2008.Google Scholar
- RAMAN, R., ET AL. Efficient data race detection for async-finish parallelism. In RV'10 (2010), Springer-Verlag, pp. 368--383. Google Scholar
Digital Library
- RAMAN, R., ZHAO, J., SARKAR, V., VECHEV, M., AND YAHAV, E. Scalable and precise dynamic datarace detection for structured parallelism. Tech. Rep. TR12-01, Department of Computer Science, Rice University, Houston, TX, 2012.Google Scholar
- SAVAGE, S., ET AL. Eraser: a dynamic data race detector for multithreaded programs. ACM Trans. Comput. Syst. 15, 4 (1997), 391--411. Google Scholar
Digital Library
- SCHONBERG, E. On-the-fly detection of access anomalies. In PLDI'98 (1998), pp. 285--297. Google Scholar
Digital Library
- SMITH, L. A., AND BULL, J. M. A Parallel Java Grande Benchmark Suite. In In Supercomputing'01 (2001), ACM Press, p. 8. Google Scholar
Digital Library
- VALLEE-RAI, R., ET AL. Soot - a Java Optimization Framework. In Proceedings of CASCON 1999 (1999), pp. 125--135. Google Scholar
Digital Library
- WINSKEL, G. The Formal Semantics of Programming Languages. MIT Press, 1993. Google Scholar
Cross Ref
- ZHAO, J., AND SARKAR, V. Intermediate language extensions for parallelism. In VMIL'11 (2011), pp. 333--334. Google Scholar
Digital Library
Index Terms
Scalable and precise dynamic datarace detection for structured parallelism
Recommendations
Scalable and precise dynamic datarace detection for structured parallelism
PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and ImplementationExisting dynamic race detectors suffer from at least one of the following three limitations:
(i)space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle;
(...
Efficient data race detection for async-finish parallelism
A major productivity hurdle for parallel programming is the presence of data races . Data races can lead to all kinds of harmful program behaviors, including determinism violations and corrupted memory. However, runtime overheads of current dynamic ...
A Survey on Parallelism and Determinism
Parallelism is often required for performance. In these situations an excess of non-determinism is harmful as it means the program can have several different behaviours or even different results. Even in domains such as high-performance computing where ...







Comments