Abstract
Stochastic sampling in time and over the lens is essential to produce photo-realistic images, and it has the potential to revolutionize real-time graphics. In this paper, we take an architectural view of the problem and propose a novel hardware architecture for efficient shading in the context of stochastic rendering. We replace previous caching mechanisms with a sorting step that extracts coherence, thereby ensuring that only non-occluded samples are shaded. Memory bandwidth is kept to a minimum by operating on tiles and using new buffer compression methods. Our architecture has several unique benefits not traditionally associated with deferred shading. First, shading is performed in primitive order, which enables late shading of vertex attributes and avoids the need to generate a G-buffer of pre-interpolated vertex attributes. Second, we support state changes, such as changes of shaders and resources, in the deferred shading pass, avoiding the need for a single über-shader. We perform an extensive architectural simulation to quantify the benefits of our algorithm on real workloads.
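The idea of extracting coherence by sorting can be illustrated with a minimal software sketch. It is not the architecture described in the paper: the names `VisibleSample`, `shading_point`, and `shade_tile` are hypothetical, and the sketch assumes that visibility for a tile has already been resolved, so that every sample in the list is non-occluded. Sorting the samples by primitive and shading location groups together all samples that can share one shading result, so each result is computed once without a cache lookup.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class VisibleSample:
    """One non-occluded visibility sample left in a tile (hypothetical layout)."""
    sample_index: int   # position of the sample within the tile
    primitive_id: int   # primitive that is visible at this sample
    shading_point: int  # quantized shading location on that primitive (assumed)


def shade_tile(samples, shade):
    """Shade a tile by sorting its visible samples rather than caching results.

    `shade(primitive_id, shading_point)` stands in for a shader invocation.
    Returns (colors, invocations): colors maps sample_index to a shaded value,
    invocations counts how many times `shade` was called.
    """
    def key(s):
        return (s.primitive_id, s.shading_point)

    colors = {}
    invocations = 0
    # Sorting makes samples that share a shading result adjacent, in primitive order.
    for (prim, point), group in groupby(sorted(samples, key=key), key=key):
        value = shade(prim, point)   # one invocation per unique (primitive, point)
        invocations += 1
        for s in group:              # broadcast the result to every matching sample
            colors[s.sample_index] = value
    return colors, invocations


# Five visible samples that need only three distinct shading results.
tile = [
    VisibleSample(0, 7, 0),
    VisibleSample(1, 3, 1),
    VisibleSample(2, 7, 0),
    VisibleSample(3, 3, 1),
    VisibleSample(4, 7, 2),
]
colors, invocations = shade_tile(tile, lambda prim, point: prim * 10 + point)
```

Because the groups are visited in primitive order, a per-primitive shader or resource binding could in principle be switched between groups. That is one reading of the abstract's claim about state changes, not a description of the actual hardware.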
A sort-based deferred shading architecture for decoupled sampling