Abstract
Array computations are at the core of numerical modelling and computational science applications. However, low-level manipulation of array indices is a source of program error. Many practitioners are aware of the need to ensure program correctness, yet very few of the techniques from the programming research community are applied by scientists. We aim to change that by providing targeted lightweight verification techniques for scientific code. We focus on the all-too-common mistake of array offset errors, as a generalisation of off-by-one errors. First, we report on a code analysis study of eleven real-world computational science code bases, identifying common idioms of array usage and their spatial properties. This provides much-needed data on array programming idioms common in scientific code. From this data, we designed a lightweight declarative specification language capturing the majority of array access patterns via a small set of combinators. We detail a semantic model, and the design and implementation of a verification tool for our specification language, which both checks and infers specifications. We evaluate our tool on our corpus of scientific code. Using the inference mode, we found roughly 87,000 targets for specification across roughly 1.1 million lines of code, showing that the vast majority of array computations read from arrays in a pattern with a simple, regular, static shape. We also studied the commit logs of one of our corpus packages, finding past bug fixes for which our specification system distinguishes the change and thus could have been applied to detect such bugs.
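To make the idea of an array offset error concrete, the following is a minimal Python sketch of checking the relative offsets an array computation reads against a declared shape. It is an illustration only: the names `centered`, `SPEC`, and `check` are hypothetical and do not reproduce the syntax or implementation of the specification language or tool described above.

```python
# Hypothetical illustration: not the paper's specification syntax or tool.

def centered(depth, dim, ndims):
    """Relative offsets reaching `depth` cells either side of the origin
    along dimension `dim` (0-based) of an `ndims`-dimensional array."""
    offsets = set()
    for d in range(-depth, depth + 1):
        off = [0] * ndims
        off[dim] = d
        offsets.add(tuple(off))
    return offsets

def check(spec, reads):
    """Compare declared offsets with the offsets actually read.
    Returns (missing, unexpected) as sorted lists of offset tuples."""
    return sorted(spec - reads), sorted(reads - spec)

# Declared shape: a five-point stencil on a 2D array.
SPEC = centered(1, 0, 2) | centered(1, 1, 2)

# Offsets relative to (i, j) read by a correct update:
# a[i-1][j], a[i+1][j], a[i][j-1], a[i][j+1], a[i][j]
correct_reads = {(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)}

# The same update with a[i][j+1] mistyped as a[i][j+2].
buggy_reads = {(-1, 0), (1, 0), (0, -1), (0, 2), (0, 0)}

print(check(SPEC, correct_reads))
print(check(SPEC, buggy_reads))
```

In this sketch the correct access pattern matches the declared shape exactly, while the mistyped index shows up as one missing offset and one unexpected offset. A real tool would have to extract the offsets from source code rather than take them as hand-written sets.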
Index Terms
Verifying spatial properties of array computations