Abstract
GPUs are expected to gain hardware support for real-time ray tracing in the near future, for example to help render complex lighting effects in video games and to enable photorealistic augmented reality. One challenge in real-time ray tracing is dynamic scene support: the spatial data structures used to accelerate rendering must be rebuilt or updated whenever the scene geometry changes. This paper proposes PLOCTree, a hardware accelerator for tree construction based on the Parallel Locally-Ordered Clustering (PLOC) algorithm. Because tree construction is highly memory-intensive, the hardware implementation rewrites the algorithm into a bandwidth-economical form that converts most of the external memory traffic of the original software-based GPU implementation into streaming on-chip data traffic. As a result, the proposed unit is 3.9 times faster than the GPU implementation and uses 7.7 times less memory bandwidth. Compared to state-of-the-art hardware builders, PLOCTree offers a superior performance-quality tradeoff: it is nearly as fast as a state-of-the-art low-quality linear builder, while producing trees with a Surface Area Heuristic (SAH) cost similar to that of a comparatively expensive binned SAH sweep builder.
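For readers unfamiliar with PLOC, the following is a minimal software sketch of its core loop as it is commonly described: each cluster searches a small window of neighbours along a Morton-ordered array, and mutually nearest pairs are merged until one root remains. It is not the hardware design proposed in the paper. The names `Node`, `ploc_iteration`, `build`, and the default `radius` are hypothetical, and the input leaves are assumed to be already sorted along the Morton curve.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Node:
    lo: Vec3                      # minimum corner of the bounding box
    hi: Vec3                      # maximum corner of the bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def union(a: Node, b: Node) -> Tuple[Vec3, Vec3]:
    """Bounding box enclosing both nodes."""
    lo = tuple(min(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(max(x, y) for x, y in zip(a.hi, b.hi))
    return lo, hi

def area(lo: Vec3, hi: Vec3) -> float:
    """Surface area of an axis-aligned box."""
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def ploc_iteration(clusters: List[Node], radius: int) -> List[Node]:
    """One pass: windowed nearest-neighbour search, then merge mutual pairs."""
    n = len(clusters)
    nearest = []
    for i in range(n):
        best, best_d = -1, float("inf")
        # Only look at clusters within `radius` positions in the sorted order.
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            if j != i:
                d = area(*union(clusters[i], clusters[j]))  # merged-box area as distance
                if d < best_d:
                    best, best_d = j, d
        nearest.append(best)

    out = []
    for i in range(n):
        j = nearest[i]
        if j >= 0 and nearest[j] == i:
            if i < j:  # emit the merged node once, at the lower index
                lo, hi = union(clusters[i], clusters[j])
                out.append(Node(lo, hi, clusters[i], clusters[j]))
        else:
            out.append(clusters[i])  # unmatched cluster survives to the next pass
    return out

def build(leaves: List[Node], radius: int = 2) -> Node:
    """Repeat passes until a single root cluster remains (assumes sorted leaves)."""
    clusters = list(leaves)
    while len(clusters) > 1:
        clusters = ploc_iteration(clusters, radius)
    return clusters[0]
```

Every pass in this sketch re-reads the whole cluster array, which gives a rough sense of why the abstract calls tree construction memory-intensive.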
PLOCTree: A Fast, High-Quality Hardware BVH Builder