research-article

A sort-based deferred shading architecture for decoupled sampling

Published: 21 July 2013
Abstract

Stochastic sampling in time and over the lens is essential for producing photorealistic images, and it has the potential to revolutionize real-time graphics. In this paper, we take an architectural view of the problem and propose a novel hardware architecture for efficient shading in the context of stochastic rendering. We replace previous caching mechanisms with a sorting step that extracts coherence, thereby ensuring that only non-occluded samples are shaded. Memory bandwidth is kept to a minimum by operating on tiles and using new buffer compression methods. Our architecture has several unique benefits not traditionally associated with deferred shading. First, shading is performed in primitive order, which enables late shading of vertex attributes and avoids the need to generate a G-buffer of pre-interpolated vertex attributes. Second, we support state changes, e.g., changes of shaders and resources, in the deferred shading pass, avoiding the need for a single über-shader. We perform an extensive architectural simulation to quantify the benefits of our algorithm on real workloads.
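
The sorting idea in the abstract can be illustrated with a small software sketch. It is not the paper's hardware design: the `Sample` record, the `shade_key` field (standing in for whatever shading point a visibility sample maps to under decoupled sampling), and the `shade` callback are all hypothetical names introduced here. The sketch assumes a tile's samples have already passed visibility testing, sorts them so that shading proceeds in primitive order, and shades each unique (primitive, shading point) pair once.

```python
from collections import namedtuple

# Hypothetical record for one visibility sample that survived depth testing in a tile.
# shade_key is an assumed identifier for the shading point the sample maps to.
Sample = namedtuple("Sample", ["pixel", "prim_id", "shade_key"])


def shade_tile(visible_samples, shade):
    """Shade a tile's visible samples in primitive order.

    Returns (colors, shade_calls): colors maps each pixel to the list of
    colors of its samples, and shade_calls counts shader invocations.
    """
    # Sorting groups samples of the same primitive (and the same shading
    # point) together, so duplicates become adjacent.
    ordered = sorted(visible_samples, key=lambda s: (s.prim_id, s.shade_key))

    colors = {}
    shade_calls = 0
    last_key, last_color = None, None
    for s in ordered:
        key = (s.prim_id, s.shade_key)
        if key != last_key:
            # First sample of a new shading point: run the shader once.
            last_color = shade(s.prim_id, s.shade_key)
            last_key = key
            shade_calls += 1
        # Every sample sharing the key reuses the result.
        colors.setdefault(s.pixel, []).append(last_color)
    return colors, shade_calls


# Usage: five visible samples that share three shading points.
samples = [
    Sample((0, 0), 2, 5),
    Sample((0, 1), 1, 3),
    Sample((0, 0), 1, 3),
    Sample((1, 1), 2, 5),
    Sample((1, 0), 2, 6),
]
colors, calls = shade_tile(samples, lambda prim, key: prim * 10 + key)
```

Because occluded samples never enter the list, no shading work is spent on them, and because the loop visits primitives one after another, a real implementation could switch shaders or resources between primitives instead of relying on a single über-shader.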

Supplemental Material

tp112.mp4

    • Published in

      ACM Transactions on Graphics, Volume 32, Issue 4
      July 2013
      1215 pages
      ISSN: 0730-0301
      EISSN: 1557-7368
      DOI: 10.1145/2461912

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States
