skip to main content
research-article
Open Access
Seminal Paper

Decoupling algorithms from schedules for easy optimization of image processing pipelines

Published:01 July 2012Publication History
Skip Abstract Section

Abstract

Using existing programming tools, writing high-performance image processing code requires sacrificing readability, portability, and modularity. We argue that this is a consequence of conflating what computations define the algorithm, with decisions about storage and the order of computation. We refer to these latter two concerns as the schedule, including choices of tiling, fusion, recomputation vs. storage, vectorization, and parallelism.

We propose a representation for feed-forward imaging pipelines that separates the algorithm from its schedule, enabling high-performance without sacrificing code clarity. This decoupling simplifies the algorithm specification: images and intermediate buffers become functions over an infinite integer domain, with no explicit storage or boundary conditions. Imaging pipelines are compositions of functions. Programmers separately specify scheduling strategies for the various functions composing the algorithm, which allows them to efficiently explore different optimizations without changing the algorithmic code.

We demonstrate the power of this representation by expressing a range of recent image processing applications in an embedded domain specific language called Halide, and compiling them for ARM, x86, and GPUs. Our compiler targets SIMD units, multiple cores, and complex memory hierarchies. We demonstrate that it can handle algorithms such as a camera raw pipeline, the bilateral grid, fast local Laplacian filtering, and image segmentation. The algorithms expressed in our language are both shorter and faster than state-of-the-art implementations.

Skip Supplemental Material Section

Supplemental Material

References

  1. Adams, A., Talvala, E.-V., Park, S. H., Jacobs, D. E., Ajdin, B., Gelfand, N., Dolson, J., Vaquero, D., Baek, J., Tico, M., Lensch, H. P. A., Matusik, W., Pulli, K., Horowitz, M., and Levoy, M. 2010. The Frankencamera: An experimental platform for computational photography. ACM Transactions on Graphics 29, 4 (July), 29:1--29:12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Aubry, M., Paris, S., Hasinoff, S. W., Kautz, J., and Durand, F. 2011. Fast and robust pyramid-based image processing. Tech. Rep. MIT-CSAIL-TR-2011-049, Massachusetts Institute of Technology.Google ScholarGoogle Scholar
  3. Buck, I. 2007. GPU computing: Programming a massively parallel processor. In CGO '07: Proceedings of the International Symposium on Code Generation and Optimization, IEEE Computer Society, 17. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Chen, J., Paris, S., and Durand, F. 2007. Real-time edge-aware image processing with the bilateral grid. ACM Transactions on Graphics 26, 3 (July), 103:1--103:9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. CoreImage. Apple CoreImage programming guide. http://developer.apple.com/library/mac/#documentation/GraphicsImaging/Conceptual/CoreImaging.Google ScholarGoogle Scholar
  6. Elliott, C., Finne, S., and de Moor, O. 2003. Compiling embedded languages. Journal of Functional Programming 13, 2. Updated version of paper by the same name that appeared in SAIG '00 proceedings. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Elliott, C. 2001. Functional image synthesis. In Proceedings of Bridges.Google ScholarGoogle Scholar
  8. Fatahalian, K., Horn, D. R., Knight, T. J., Leem, L., Houston, M., Park, J. Y., Erez, M., Ren, M., Aiken, A., Dally, W. J., and Hanrahan, P. 2006. Sequoia: programming the memory hierarchy. In Proceedings of the 2006 ACM/IEEE conference on Supercomputing, ACM, SC '06. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Feautrier, P. 1991. Dataflow analysis of array and scalar references. International Journal of Parallel Programming 20.Google ScholarGoogle Scholar
  10. Gordon, M. I., Thies, W., Karczmarek, M., Lin, J., Meli, A. S., Leger, C., Lamb, A. A., Wong, J., Hoffman, H., Maze, D. Z., and Amarasinghe, S. 2002. A stream compiler for communication-exposed architectures. In International Conference on Architectural Support for Programming Languages and Operating Systems. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Guenter, B., and Nehab, D. 2010. The neon image processing language. Tech. Rep. MSR-TR-2010-175, Microsoft Research.Google ScholarGoogle Scholar
  12. IPP. Intel Integrated Performance Primitives. http://software.intel.com/en-us/articles/intel-ipp/.Google ScholarGoogle Scholar
  13. Kapasi, U. J., Mattson, P., Dally, W. J., Owens, J. D., and Towles, B. 2002. Stream scheduling. Concurrent VLSI Architecture Tech Report 122, Stanford University, March.Google ScholarGoogle Scholar
  14. Kass, M., Witkin, A., and Terzopoulos, D. 1988. Snakes: Active contour models. International Journal of Computer Vision 1, 4.Google ScholarGoogle ScholarCross RefCross Ref
  15. Levoy, M. 1994. Spreadsheets for images. In Proceedings of SIGGRAPH 94, Computer Graphics Proceedings, Annual Conference Series, 139--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Li, C., Xu, C., Gui, C., and Fox, M. D. 2010. Distance regularized level set evolution and its application to image segmentation. IEEE Transactions on Image Processing 19, 12 (December), 3243--3254. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. LLVM. The LLVM compiler infrastructure. http://llvm.org.Google ScholarGoogle Scholar
  18. McCool, M. D., Qin, Z., and Popa, T. S. 2002. Shader metaprogramming. In Graphics Hardware 2002, 57--68. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Newburn, C. J., So, B., Liu, Z., McCool, M., Ghuloum, A., Toit, S. D., Wang, Z. G., Du, Z. H., Chen, Y., Wu, G., Guo, P., Liu, Z., and Zhang, D. 2011. Intel's array building blocks: A retargetable, dynamic compiler and embedded language. In Proceedings of the 2011 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization, IEEE Computer Society, CGO '11, 224--235. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. OpenCL, 2011. The OpenCL specification, version 1.2. http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf.Google ScholarGoogle Scholar
  21. OpenMP. OpenMP. http://openmp.org/.Google ScholarGoogle Scholar
  22. Paris, S., and Durand, F. 2009. A fast approximation of the bilateral filter using a signal processing approach. International Journal of Computer Vision 81, 1, 24--52. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Paris, S., Kornprobst, P., Tumblin, J., and Durand, F. 2009. Bilateral filtering: Theory and applications. Foundations and Trends in Computer Graphics and Vision. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Paris, S., Hasinoff, S. W., and Kautz, J. 2011. Local Laplacian filters: Edge-aware image processing with a Laplacian pyramid. ACM Transactions on Graphics 30, 4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Pharr, M., and Mark, W. R. 2012. ispc: A SPMD compiler for high-performance CPU programming. In Proceedings of Innovative Parallel Computing (InPar).Google ScholarGoogle Scholar
  26. PixelBender. Adobe PixelBender reference. http://www.adobe.com/content/dam/Adobe/en/devnet/pixelbender/pdfs/pixelbender_reference.pdf.Google ScholarGoogle Scholar
  27. Püschel, M., Moura, J. M. F., Johnson, J., Padua, D., Veloso, M., Singer, B., Xiong, J., Franchetti, F., Gacic, A., Voronenko, Y., Chen, K., Johnson, R. W., and Rizzolo, N. 2005. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, special issue on "Program Generation, Optimization, and Adaptation" 93, 2, 232--275.Google ScholarGoogle Scholar
  28. Shantzis, M. A. 1994. A model for efficient and flexible image computing. In Proceedings of the 21st annual conference on Computer graphics and interactive techniques, ACM, SIGGRAPH '94, 147--154. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Decoupling algorithms from schedules for easy optimization of image processing pipelines

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM Transactions on Graphics
      ACM Transactions on Graphics  Volume 31, Issue 4
      July 2012
      935 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/2185520
      Issue’s Table of Contents
      • cover image ACM Overlay Books
        Seminal Graphics Papers: Pushing the Boundaries, Volume 2
        August 2023
        893 pages
        ISBN:9798400708978
        DOI:10.1145/3596711
        • Editor:
        • Mary C. Whitton

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 July 2012
      Published in tog Volume 31, Issue 4

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader