Stackless Multi-BVH Traversal for CPU, MIC and GPU Ray Tracing

Published: 01 February 2014

Abstract

Stackless traversal algorithms for ray tracing acceleration structures require significantly less storage per ray than ordinary stack-based ones. This advantage is important for massively parallel rendering methods, where there are many rays in flight. On SIMD architectures, a commonly used acceleration structure is the MBVH, which has multiple bounding boxes per node for improved parallelism. It scales to branching factors higher than two, for which, however, only stack-based traversal methods have been proposed so far. In this paper, we introduce a novel stackless traversal algorithm for MBVHs with up to four-way branching. Our approach replaces the stack with a small bitmask, supports dynamic ordered traversal, and has a low computation overhead. We also present efficient implementation techniques for recent CPU, MIC (Intel Xeon Phi), and GPU (NVIDIA Kepler) architectures.
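
The general idea of trading a per-ray stack for a bitmask can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified stand-in that assumes parent pointers in every node, uses 1D intervals and a point query in place of bounding boxes and rays, keeps 4 pending-child bits per tree level, and visits children in slot order rather than in a dynamically chosen order. The names `Node`, `link`, `traverse`, and `tree` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    lo: float                      # interval start (stand-in for a bounding box)
    hi: float                      # interval end
    children: List["Node"] = field(default_factory=list)   # at most 4 children
    prim: Optional[str] = None     # leaf payload; None for inner nodes
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)


def link(node: Node, parent: Optional[Node] = None) -> Node:
    """Fill in the parent pointers (an assumption of this sketch)."""
    node.parent = parent
    for child in node.children:
        link(child, node)
    return node


def traverse(root: Node, x: float) -> List[str]:
    """Collect all leaves whose interval contains x, without a node stack.

    `trail` holds 4 bits per level: the children of the node on the current
    path at that level that were hit but have not been entered yet.
    """
    hits: List[str] = []
    if not root.lo <= x <= root.hi:
        return hits
    node, depth, trail = root, 0, 0
    entering = True
    while True:
        shift = 4 * depth
        if entering:
            mask = 0
            if node.prim is not None:
                hits.append(node.prim)
            else:
                for i, child in enumerate(node.children):
                    if child.lo <= x <= child.hi:
                        mask |= 1 << i
            # Overwrite this level's 4-bit field with the fresh hit mask.
            trail = (trail & ~(0xF << shift)) | (mask << shift)
        mask = (trail >> shift) & 0xF
        if mask:
            i = (mask & -mask).bit_length() - 1   # lowest pending child slot
            trail &= ~(1 << (shift + i))          # mark it as entered
            node, depth, entering = node.children[i], depth + 1, True
        elif node.parent is None:
            return hits                           # back at the root, nothing pending
        else:
            node, depth, entering = node.parent, depth - 1, False


tree = link(Node(0, 10, [
    Node(0, 5, [Node(0, 2, prim="a1"), Node(1, 5, prim="a2")]),
    Node(4, 10, prim="B"),
    Node(6, 8, prim="C"),
    Node(3, 9, [Node(3, 4.5, prim="d1"), Node(7, 9, prim="d2")]),
]))

print(traverse(tree, 4.0))   # ['a2', 'B', 'd1']
```

In this sketch the only per-query traversal state is the current node, the depth, and the integer `trail`. With 4 bits per level, a fixed 64-bit word would cover 16 levels; the encoding used in the paper may differ.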
