HART: A Hybrid Architecture for Ray Tracing Animated Scenes

Published: 01 March 2015

Abstract

We present a hybrid architecture, inspired by asynchronous BVH construction [1], for ray tracing animated scenes. Our hybrid architecture utilizes heterogeneous hardware resources: dedicated ray-tracing hardware for BVH updates and ray traversal, and a CPU for BVH reconstruction. We also present a traversal scheme using a primitive's axis-aligned bounding box (primAABB). This scheme reduces ray-primitive intersection tests by reusing the existing BVH traversal units and the primAABB data kept for tree updates; it enables the use of shallow trees to reduce tree build times, tree sizes, and bus bandwidth requirements. Furthermore, we present a cache scheme that exploits consecutive memory access by reusing data in an L1 cache block. We perform cycle-accurate simulations to verify our architecture, and the simulation results indicate that the proposed architecture can achieve real-time Whitted ray tracing of animated scenes at 1,920 × 1,200 resolution. This result comes from our high-performance hardware architecture and minimized resource requirements for tree updates.
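The primAABB idea in the abstract can be illustrated with a small software sketch: before running a full ray-triangle test, the ray is first tested against the triangle's own bounding box, and the expensive test is skipped when the box is missed or lies beyond the closest hit found so far. This is only an illustration under stated assumptions, not the paper's hardware design. The function names (`triangle_aabb`, `ray_hits_aabb`, `ray_hits_triangle`, `intersect_leaf`) are hypothetical, the box test is a standard slab test, the triangle test follows Möller and Trumbore [45], and the boxes are recomputed on the fly here, whereas the abstract describes reusing primAABB data already kept for tree updates.

```python
from math import inf

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_aabb(tri):
    """Axis-aligned bounding box of one triangle (stand-in for stored primAABB data)."""
    lo = tuple(min(v[i] for v in tri) for i in range(3))
    hi = tuple(max(v[i] for v in tri) for i in range(3))
    return lo, hi

def ray_hits_aabb(origin, direction, lo, hi, t_max=inf):
    """Slab test: True if the ray overlaps the box somewhere in [0, t_max]."""
    t_near, t_far = 0.0, t_max
    for i in range(3):
        if abs(direction[i]) < 1e-12:
            # Ray is parallel to this slab: it must start inside it.
            if origin[i] < lo[i] or origin[i] > hi[i]:
                return False
            continue
        inv = 1.0 / direction[i]
        t0 = (lo[i] - origin[i]) * inv
        t1 = (hi[i] - origin[i]) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def ray_hits_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test: returns the hit distance t, or None on a miss."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv_det = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det
    return t if t > eps else None

def intersect_leaf(origin, direction, triangles):
    """Closest hit in a leaf, using each triangle's AABB as a cheap pre-filter.

    Returns (closest_t or None, number of full ray-triangle tests performed).
    """
    closest = None
    full_tests = 0
    for tri in triangles:
        lo, hi = triangle_aabb(tri)  # assumption: hardware would read stored data instead
        limit = closest if closest is not None else inf
        if not ray_hits_aabb(origin, direction, lo, hi, limit):
            continue  # box missed or farther than the current hit: skip the costly test
        full_tests += 1
        t = ray_hits_triangle(origin, direction, tri)
        if t is not None and (closest is None or t < closest):
            closest = t
    return closest, full_tests
```

In this sketch a leaf holding many triangles costs one cheap box test per triangle plus a full test only for the survivors, which is the property that makes shallow trees with larger leaves more affordable.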

References

  [1] Ize T., Wald I., and Parker S. G., “Asynchronous BVH construction for ray tracing dynamic scenes on parallel multi-core architectures,” in Proc. Eur. Symp. Parallel Graph. Vis., 2007, pp. 101–108.
  [2] Wald I., Mark W. R., Gunther J., Boulos S., Ize T., Hunt W., Parker S. G., and Shirley P., “State of the art in ray tracing animated scenes,” Comput. Graph. Forum, vol. 28, no. 6, pp. 1691–1722, 2009.
  [3] Lauterbach C., Yoon S.-E., Tuft D., and Manocha D., “RT-DEFORM: Interactive ray tracing of dynamic scenes using BVH,” in Proc. IEEE Symp. Interactive Ray Tracing 2006, 2006, pp. 39–45.
  [4] Wald I., Boulos S., and Shirley P., “Ray tracing deformable scenes using dynamic bounding volume hierarchies,” ACM Trans. Graph., vol. 26, no. 1, pp. 6:1–6:18, 2007.
  [5] Wald I., “On fast construction of SAH-based bounding volume hierarchies,” in Proc. IEEE Symp. Interactive Ray Tracing, 2007, pp. 33–40.
  [6] Yoon S.-E., Curtis S., and Manocha D., “Ray tracing dynamic scenes using selective restructuring,” in Proc. Eur. Symp. Rendering, 2007, pp. 73–84.
  [7] Wald I., Ize T., and Parker S. G., “Fast, parallel, and asynchronous construction of BVHs for ray tracing animated scenes,” Comput. Graph., vol. 32, no. 1, pp. 3–13, 2008.
  [8] Kang Y.-S., Nah J.-H., Park W.-C., and Yang S.-B., “gkDtree: A group-based parallel update kd-tree for interactive ray tracing,” J. Syst. Archit., vol. 59, no. 3, pp. 166–175, 2013.
  [9] Kopta D., Ize T., Spjut J., Brunvand E., Davis A., and Kensler A., “Fast, effective BVH updates for animated scenes,” in Proc. ACM SIGGRAPH Symp. Interactive 3D Graph. Games, 2012, pp. 197–204.
  [10] Gu Y., He Y., Fatahalian K., and Blelloch G., “Efficient BVH construction via approximate agglomerative clustering,” in Proc. 5th High-Perform. Graph. Conf., 2013, pp. 81–88.
  [11] Wald I., Woop S., Benthin C., Johnson G. S., and Ernst M., “Embree - a kernel framework for efficient CPU ray tracing,” ACM Trans. Graph., vol. 33, no. 4, pp. 143:1–143:8, 2014.
  [12] Zhou K., Hou Q., Wang R., and Guo B., “Real-time KD-tree construction on graphics hardware,” ACM Trans. Graph., vol. 27, no. 5, pp. 1–11, 2008.
  [13] Lauterbach C., Garland M., Sengupta S., Luebke D., and Manocha D., “Fast BVH construction on GPUs,” Comput. Graph. Forum, vol. 28, no. 2, pp. 375–384, 2009.
  [14] Garanzha K., Pantaleoni J., and McAllister D., “Simpler and faster HLBVH with work queues,” in Proc. Conf. High Perform. Graph., 2011, pp. 59–64.
  [15] Karras T., “Maximizing parallelism in the construction of BVHs, octrees, and k-d trees,” in Proc. 4th Conf. High-Perform. Graph., 2012, pp. 33–37.
  [16] Karras T. and Aila T., “Fast parallel construction of high-quality bounding volume hierarchies,” in Proc. 5th High-Perform. Graph. Conf., 2013, pp. 89–99.
  [17] Wald I., “Fast construction of SAH BVHs on the Intel many integrated core (MIC) architecture,” IEEE Trans. Vis. Comput. Graph., vol. 18, no. 1, pp. 47–57, Jan. 2012.
  [18] Woop S., Marmitt G., and Slusallek P., “B-KD trees for hardware accelerated ray tracing of dynamic scenes,” in Proc. 21st ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, 2006, pp. 67–77.
  [19] Woop S., Brunvand E., and Slusallek P., “Estimating performance of a ray-tracing ASIC design,” in Proc. IEEE/EG Symp. Interactive Ray Tracing, 2006, pp. 7–14.
  [20] Doyle M. J., Fowler C., and Manzke M., “A hardware unit for fast SAH-optimised BVH construction,” ACM Trans. Graph., vol. 32, no. 4, pp. 66:1–66:13, 2013.
  [21] Bigler J., Stephens A., and Parker S. G., “Design for parallel interactive ray tracing systems,” in Proc. IEEE Symp. Interactive Ray Tracing, 2006, pp. 187–196.
  [22] Parker S. G., Bigler J., Dietrich A., Friedrich H., Hoberock J., Luebke D., McAllister D., McGuire M., Morley K., Robison A., and Stich M., “OptiX: A general purpose ray tracing engine,” ACM Trans. Graph., vol. 29, no. 4, pp. 66:1–66:13, 2010.
  [23] Nah J.-H., Park J.-S., Park C., Kim J.-W., Jung Y.-H., Park W.-C., and Han T.-D., “T&I engine: Traversal and intersection engine for hardware accelerated ray tracing,” ACM Trans. Graph., vol. 30, no. 6, pp. 160:1–160:10, 2011.
  [24] Lauterbach C., Mo Q., and Manocha D., “gProximity: Hierarchical GPU-based operations for collision and distance queries,” Comput. Graph. Forum, vol. 29, no. 2, pp. 419–428, 2010.
  [25] Schmittler J., Woop S., Wagner D., Paul W. J., and Slusallek P., “Realtime ray tracing of dynamic scenes on an FPGA chip,” in Proc. ACM SIGGRAPH/EUROGRAPHICS Conf. Graph. Hardware, 2004, pp. 95–106.
  [26] Woop S., Schmittler J., and Slusallek P., “RPU: A programmable ray processing unit for realtime ray tracing,” ACM Trans. Graph., vol. 24, no. 3, pp. 434–444, 2005.
  [27] Ramani K., Gribble C. P., and Davis A., “StreamRay: A stream filtering architecture for coherent ray tracing,” in Proc. Archit. Support Program. Language Operating Syst., 2009, pp. 325–336.
  [28] Spjut J., Kensler A., Kopta D., and Brunvand E., “TRaX: A multicore hardware architecture for real-time ray tracing,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 28, no. 12, pp. 1802–1815, Dec. 2009.
  [29] Kopta D., Spjut J., Brunvand E., and Davis A., “Efficient MIMD architectures for high-performance ray tracing,” in Proc. 28th IEEE Int. Conf. Comput. Des., 2010, pp. 9–16.
  [30] Aila T. and Karras T., “Architecture considerations for tracing incoherent rays,” in Proc. Conf. High Perform. Graph., 2010, pp. 113–122.
  [31] Lee W.-J., Lee S.-H., Nah J.-H., Kim J.-W., Shin Y., Lee J., and Jung S.-Y., “SGRT: A scalable mobile GPU architecture based on ray tracing,” in Proc. ACM SIGGRAPH 2012 Talks, 2012, p. 44:1.
  [32] Lee W.-J., Shin Y., Lee J., Kim J.-W., Nah J.-H., Jung S.-Y., Lee S.-H., Park H.-S., and Han T.-D., “SGRT: A mobile GPU architecture for real-time ray tracing,” in Proc. 5th High-Perform. Graph. Conf., 2013, pp. 109–119.
  [33] Nah J.-H., Kwon H.-J., Kim D.-S., Jeong C.-H., Park J., Han T.-D., Manocha D., and Park W.-C., “RayCore: A ray-tracing hardware architecture for mobile devices,” ACM Trans. Graph., vol. 33, no. 5, pp. 162:1–162:15, 2014.
  [34] Woop S., “A programmable hardware architecture for real-time ray tracing of coherent dynamic scenes,” Ph.D. dissertation, Saarland Univ., Saarbrücken, Germany, 2007.
  [35] Budge B. C., Anderson J. C., Garth C., and Joy K. I., “A hybrid CPU-GPU implementation for interactive ray-tracing of dynamic scenes,” Univ. California, Davis, CA, USA, Comput. Sci., Tech. Rep. CSE-2008-9, 2008.
  [36] Nah J.-H., Kang Y.-S., Lee K.-J., Lee S.-J., Han T.-D., and Yang S.-B., “MobiRT: an implementation of OpenGL ES-based CPU-GPU hybrid ray tracer for mobile devices,” in Proc. ACM SIGGRAPH ASIA 2010 Sketches, 2010, pp. 50:1–50:2.
  [37] Bikker J. and van Schijndel J., “The brigade renderer: A path tracer for real-time games,” Int. J. Comput. Games Technol., Article ID 578269, 2013, http://www.hindawi.com/journals/ijcgt/2013/578269/
  [38] Budge B., Bernardin T., Stuart J. A., Sengupta S., Joy K. I., and Owens J. D., “Out-of-core data management for path tracing on hybrid resources,” Comput. Graph. Forum, vol. 28, no. 2, pp. 385–396, 2009.
  [39] Pajot A., Barthe L., Paulin M., and Poulin P., “Combinatorial bidirectional path-tracing for efficient hybrid CPU/GPU rendering,” Comput. Graph. Forum, vol. 30, no. 2, pp. 315–324, 2011.
  [40] LuxRender Luxrays [Online]. Available: http://www.luxrender.net/wiki/LuxRays, 2014.
  [41] Reshetov A., “Faster ray packets—Triangle intersection through vertex culling,” in Proc. IEEE Symp. Interactive Ray Tracing, 2007, pp. 105–112.
  [42] Snyder J. and Barr A., “Ray tracing complex models containing surface tessellations,” ACM SIGGRAPH Comput. Graph., vol. 21, pp. 119–128, 1987.
  [43] Nah J.-H., Park W.-C., Kang Y.-S., and Han T.-D., “Ray-box culling for tree structures,” J. Inf. Sci. Eng., vol. 29, no. 6, pp. 1211–1225, 2013.
  [44] Wald I., “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation, Saarland Univ., Saarbrücken, Germany, 2004.
  [45] Möller T. and Trumbore B., “Fast, minimum storage ray-triangle intersection,” J. Graph. Tools, vol. 2, no. 1, pp. 21–28, 1997.
  [46] Aila T. and Laine S., “Understanding the efficiency of ray traversal on GPUs,” in Proc. Conf. High Perform. Graph., 2009, pp. 145–149.
  [47] Laine S., “Restart trail for stackless BVH traversal,” in Proc. Conf. High Perform. Graph., 2010, pp. 107–111.
  [48] Nah J.-H. and Manocha D., “SATO: Surface-area traversal order for shadow ray tracing,” Comput. Graph. Forum, vol. 33, no. 6, pp. 167–177, 2014.
  [49] Bakhoda A., Yuan G., Fung W., Wong H., and Aamodt T., “Analyzing CUDA workloads using a detailed GPU simulator,” in Proc. IEEE Int. Symp. Perform. Anal. Syst. Softw., 2009, pp. 163–174.
  [50] Muralimanohar N., Balasubramonian R., and Jouppi N., “Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0,” in Proc. 40th Annu. IEEE/ACM Int. Symp. Microarchit., 2007, pp. 3–14.
  [51] Patterson D. A. and Hennessy J. L., Computer Organization and Design: The Hardware/Software Interface, 4th ed. San Mateo, CA, USA: Morgan Kaufmann, 2008.
  [52] Mahesri A., Johnson D., Crago N., and Patel S. J., “Tradeoffs in designing accelerator architectures for visual computing,” in Proc. 41st Annu. IEEE/ACM Int. Symp. Microarchit., 2008, pp. 164–175.
  [53] Kim J.-W., Kim J.-M., Lee M., and Han T.-D., “Asynchronous BVH reconstruction on CPU-GPU hybrid architecture,” in Proc. ACM SIGGRAPH 2014 Posters, 2014, p. 91:1.
  [54] Stich M., Friedrich H., and Dietrich A., “Spatial splits in bounding volume hierarchies,” in Proc. Conf. High Perform. Graph., 2009, pp. 7–13.
  [55] Aila T., Laine S., and Karras T., “Understanding the efficiency of ray traversal on GPUs—Kepler and Fermi addendum,” NVIDIA Corporation, Santa Clara, CA, USA, NVIDIA Tech. Rep. NVR-2012-02, 2012.
  [56] Wald I., Benthin C., and Slusallek P., “Distributed interactive ray tracing of dynamic scenes,” in Proc. IEEE Symp. Parallel and Large-Data Vis. Graph., 2003, pp. 77–86.
  [57] Ernst M. and Woop S., “Ray tracing with shared-plane bounding volume hierarchies,” J. Graph., GPU, Game Tools, vol. 15, no. 3, pp. 141–151, 2011.
  [58] Schissler C., Mehra R., and Manocha D., “High-order diffraction and diffuse reflections for interactive sound propagation in large environments,” ACM Trans. Graph., vol. 33, no. 4, pp. 39:1–39:12, 2014.
