ABSTRACT
When programming for GPUs, simply porting a large CPU program into an equally large GPU kernel is generally not a good approach. Due to the SIMT execution model on GPUs, divergence in control flow carries substantial performance penalties, as does high register usage, which lessens the latency-hiding capability that is essential for the high-latency, high-bandwidth memory system of a GPU. In this paper, we implement a path tracer on a GPU using a wavefront formulation, avoiding these pitfalls, which can be especially prominent when using materials that are expensive to evaluate. We compare our performance against the traditional megakernel approach, and demonstrate that the wavefront formulation is much better suited for real-world use cases where multiple complex materials are present in the scene.
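The divergence argument can be illustrated with a toy cost model. The sketch below is not the paper's implementation. It assumes a warp of 32 threads, a hypothetical per-material shading cost table, and the simplification that a diverged warp pays for every distinct material branch it contains. The function names `megakernel_cost` and `wavefront_cost` are made up for this example. Queue management, memory traffic for path state, and register pressure are all ignored.

```python
from collections import defaultdict

WARP_SIZE = 32  # assumption: threads per warp, as on NVIDIA GPUs


def megakernel_cost(material_ids, cost):
    """Toy SIMT model: each warp runs every distinct material branch in turn."""
    total = 0
    for start in range(0, len(material_ids), WARP_SIZE):
        warp = material_ids[start:start + WARP_SIZE]
        # A diverged warp serializes its branches, so it pays for each one.
        total += sum(cost[m] for m in set(warp))
    return total


def wavefront_cost(material_ids, cost):
    """Toy model: paths are first binned into per-material queues,
    so every warp launched on a queue executes a single branch."""
    queues = defaultdict(list)
    for path_index, m in enumerate(material_ids):
        queues[m].append(path_index)
    total = 0
    for m, queue in queues.items():
        warps = -(-len(queue) // WARP_SIZE)  # ceiling division
        total += warps * cost[m]
    return total


# Hypothetical scene: 128 paths cycling through 4 materials of rising cost.
material_ids = [i % 4 for i in range(128)]
cost = {0: 1, 1: 2, 2: 8, 3: 16}

print(megakernel_cost(material_ids, cost))  # 108
print(wavefront_cost(material_ids, cost))   # 27
```

In this contrived case every megakernel warp contains all four materials, so each of the four warps pays 1 + 2 + 8 + 16 = 27 units, while the binned version runs one coherent warp per material. Real gains depend on the overheads this model leaves out.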
Index Terms
Megakernels considered harmful: wavefront path tracing on GPUs