Abstract
We present a hybrid architecture, inspired by asynchronous BVH construction [1], for ray tracing animated scenes. The architecture uses heterogeneous hardware resources: dedicated ray-tracing hardware for BVH updates and ray traversal, and a CPU for BVH reconstruction. We also present a traversal scheme that uses each primitive's axis-aligned bounding box (primAABB). This scheme reduces ray-primitive intersection tests by reusing the existing BVH traversal units and the primAABB data used for tree updates; it enables the use of shallow trees, which reduces tree build times, tree sizes, and bus bandwidth requirements. Furthermore, we present a cache scheme that exploits consecutive memory accesses by reusing data in an L1 cache block. We verify the architecture with cycle-accurate simulations, and the results indicate that it can achieve real-time Whitted ray tracing of animated scenes at 1,920 × 1,200 resolution. This result stems from our high-performance hardware architecture and the minimized resource requirements for tree updates.
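The primAABB idea can be illustrated in software with a minimal sketch: before running a full ray-triangle test, the ray is first tested against that triangle's own bounding box, and the triangle is skipped on a miss. This is only an illustration under stated assumptions, not the hardware design described in the paper. The names `Triangle`, `ray_hits_aabb`, `ray_hits_triangle`, and `intersect_leaf` are hypothetical, the box test is a standard slab test, and the triangle test follows the Möller-Trumbore formulation as a stand-in for whatever intersection unit the architecture actually uses.

```python
from dataclasses import dataclass
from math import inf

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

@dataclass
class Triangle:
    v0: tuple
    v1: tuple
    v2: tuple

    def aabb(self):
        """Per-primitive bounding box (the 'primAABB' in this sketch)."""
        lo = tuple(min(self.v0[i], self.v1[i], self.v2[i]) for i in range(3))
        hi = tuple(max(self.v0[i], self.v1[i], self.v2[i]) for i in range(3))
        return lo, hi

def ray_hits_aabb(orig, dirn, lo, hi, t_max=inf):
    """Slab test: True if the ray overlaps the box within [0, t_max]."""
    t_near, t_far = 0.0, t_max
    for i in range(3):
        if abs(dirn[i]) < 1e-12:
            # Ray is parallel to this slab: reject if the origin lies outside it.
            if orig[i] < lo[i] or orig[i] > hi[i]:
                return False
            continue
        inv = 1.0 / dirn[i]
        t0 = (lo[i] - orig[i]) * inv
        t1 = (hi[i] - orig[i]) * inv
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def ray_hits_triangle(orig, dirn, tri):
    """Moller-Trumbore style test; returns the hit distance t or None."""
    e1 = sub(tri.v1, tri.v0)
    e2 = sub(tri.v2, tri.v0)
    p = cross(dirn, e2)
    det = dot(e1, p)
    if abs(det) < 1e-12:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(orig, tri.v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(dirn, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > 1e-9 else None

def intersect_leaf(orig, dirn, tris, prim_aabbs):
    """Return (closest t or None, number of full triangle tests run)."""
    best, tests = None, 0
    for tri, (lo, hi) in zip(tris, prim_aabbs):
        t_max = best if best is not None else inf
        # Cheap box test first; skip the expensive triangle test on a miss.
        if not ray_hits_aabb(orig, dirn, lo, hi, t_max):
            continue
        tests += 1
        t = ray_hits_triangle(orig, dirn, tri)
        if t is not None and (best is None or t < best):
            best = t
    return best, tests
```

In this sketch, a leaf holding four triangles spread along the x axis and a ray travelling along +z through the first one yields a hit at t = 5 after a single full triangle test; the other three triangles are rejected by their boxes. That is the sense in which a per-primitive box test can make leaves with many primitives (shallow trees) affordable.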
References
- [1] “Asynchronous BVH construction for ray tracing dynamic scenes on parallel multi-core architectures,” in Proc. Eur. Symp. Parallel Graph. Vis., 2007, pp. 101–108.
- [2] “State of the art in ray tracing animated scenes,” Comput. Graph. Forum, vol. 28, no. 6, pp. 1691–1722, 2009.
- [3] “RT-DEFORM: Interactive ray tracing of dynamic scenes using BVH,” in Proc. IEEE Symp. Interactive Ray Tracing, 2006, pp. 39–45.
- [4] “Ray tracing deformable scenes using dynamic bounding volume hierarchies,” ACM Trans. Graph., vol. 26, no. 1, pp. 6:1–6:18, 2007.
- [5] “On fast construction of SAH-based bounding volume hierarchies,” in Proc. IEEE Symp. Interactive Ray Tracing, 2007, pp. 33–40.
- [6] “Ray tracing dynamic scenes using selective restructuring,” in Proc. Eur. Symp. Rendering, 2007, pp. 73–84.
- [7] “Fast, parallel, and asynchronous construction of BVHs for ray tracing animated scenes,” Comput. Graph., vol. 32, no. 1, pp. 3–13, 2008.
- [8] “gkDtree: A group-based parallel update kd-tree for interactive ray tracing,” J. Syst. Archit., vol. 59, no. 3, pp. 166–175, 2013.
- [9] “Fast, effective BVH updates for animated scenes,” in Proc. ACM SIGGRAPH Symp. Interactive 3D Graph. Games, 2012, pp. 197–204.
- [10] “Efficient BVH construction via approximate agglomerative clustering,” in Proc. 5th High-Perform. Graph. Conf., 2013, pp. 81–88.
- [11] “Embree - a kernel framework for efficient CPU ray tracing,” ACM Trans. Graph., vol. 33, no. 4, pp. 143:1–143:8, 2014.
- [12] “Real-time KD-tree construction on graphics hardware,” ACM Trans. Graph., vol. 27, no. 5, pp. 1–11, 2008.
- [13] “Fast BVH construction on GPUs,” Comput. Graph. Forum, vol. 28, no. 2, pp. 375–384, 2009.
- [14] “Simpler and faster HLBVH with work queues,” in Proc. Conf. High Perform. Graph., 2011, pp. 59–64.
- [15] “Maximizing parallelism in the construction of BVHs, octrees, and k-d trees,” in Proc. 4th Conf. High-Perform. Graph., 2012, pp. 33–37.
- [16] “Fast parallel construction of high-quality bounding volume hierarchies,” in Proc. 5th High-Perform. Graph. Conf., 2013, pp. 89–99.
- [17] “Fast construction of SAH BVHs on the Intel many integrated core (MIC) architecture,” IEEE Trans. Vis. Comput. Graph., vol. 18, no. 1, pp. 47–57, Jan. 2012.
- [18] “B-KD trees for hardware accelerated ray tracing of dynamic scenes,” in Proc. 21st ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, 2006, pp. 67–77.
- [19] “Estimating performance of a ray-tracing ASIC design,” in Proc. IEEE/EG Symp. Interactive Ray Tracing, 2006, pp. 7–14.
- [20] “A hardware unit for fast SAH-optimised BVH construction,” ACM Trans. Graph., vol. 32, no. 4, pp. 66:1–66:13, 2013.
- [21] “Design for parallel interactive ray tracing systems,” in Proc. IEEE Symp. Interactive Ray Tracing, 2006, pp. 187–196.
- [22] “OptiX: A general purpose ray tracing engine,” ACM Trans. Graph., vol. 29, no. 4, pp. 66:1–66:13, 2010.
- [23] “T&I engine: Traversal and intersection engine for hardware accelerated ray tracing,” ACM Trans. Graph., vol. 30, no. 6, pp. 160:1–160:10, 2011.
- [24] “gProximity: Hierarchical GPU-based operations for collision and distance queries,” Comput. Graph. Forum, vol. 29, no. 2, pp. 419–428, 2010.
- [25] “Realtime ray tracing of dynamic scenes on an FPGA chip,” in Proc. ACM SIGGRAPH/EUROGRAPHICS Conf. Graph. Hardware, 2004, pp. 95–106.
- [26] “RPU: A programmable ray processing unit for realtime ray tracing,” ACM Trans. Graph., vol. 24, no. 3, pp. 434–444, 2005.
- [27] “StreamRay: A stream filtering architecture for coherent ray tracing,” in Proc. Archit. Support Program. Language Operating Syst., 2009, pp. 325–336.
- [28] “TRaX: A multicore hardware architecture for real-time ray tracing,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 28, no. 12, pp. 1802–1815, Dec. 2009.
- [29] “Efficient MIMD architectures for high-performance ray tracing,” in Proc. 28th IEEE Int. Conf. Comput. Des., 2010, pp. 9–16.
- [30] “Architecture considerations for tracing incoherent rays,” in Proc. Conf. High Perform. Graph., 2010, pp. 113–122.
- [31] “SGRT: A scalable mobile GPU architecture based on ray tracing,” in Proc. ACM SIGGRAPH 2012 Talks, 2012, p. 44:1.
- [32] “SGRT: A mobile GPU architecture for real-time ray tracing,” in Proc. 5th High-Perform. Graph. Conf., 2013, pp. 109–119.
- [33] “RayCore: A ray-tracing hardware architecture for mobile devices,” ACM Trans. Graph., vol. 33, no. 5, pp. 162:1–162:15, 2014.
- [34] S. Woop, “A programmable hardware architecture for real-time ray tracing of coherent dynamic scenes,” Ph.D. dissertation, Saarland Univ., Saarbrücken, Germany, 2007.
- [35] B. C. Budge, J. C. Anderson, C. Garth, and K. I. Joy, “A hybrid CPU-GPU implementation for interactive ray-tracing of dynamic scenes,” Univ. California, Davis, CA, USA, Comput. Sci., Tech. Rep. CSE-2008-9, 2008.
- [36] “MobiRT: an implementation of OpenGL ES-based CPU-GPU hybrid ray tracer for mobile devices,” in Proc. ACM SIGGRAPH ASIA 2010 Sketches, 2010, pp. 50:1–50:2.
- [37] “The brigade renderer: A path tracer for real-time games,” Int. J. Comput. Games Technol., Article ID 578269, 2013, http://www.hindawi.com/journals/ijcgt/2013/578269/
- [38] “Out-of-core data management for path tracing on hybrid resources,” Comput. Graph. Forum, vol. 28, no. 2, pp. 385–396, 2009.
- [39] “Combinatorial bidirectional path-tracing for efficient hybrid CPU/GPU rendering,” Comput. Graph. Forum, vol. 30, no. 2, pp. 315–324, 2011.
- [40] LuxRender LuxRays [Online]. Available: http://www.luxrender.net/wiki/LuxRays, 2014.
- [41] “Faster ray packets - Triangle intersection through vertex culling,” in Proc. IEEE Symp. Interactive Ray Tracing, 2007, pp. 105–112.
- [42] “Ray tracing complex models containing surface tessellations,” ACM SIGGRAPH Comput. Graph., vol. 21, pp. 119–128, 1987.
- [43] “Ray-box culling for tree structures,” J. Inf. Sci. Eng., vol. 29, no. 6, pp. 1211–1225, 2013.
- [44] I. Wald, “Realtime ray tracing and interactive global illumination,” Ph.D. dissertation, Saarland Univ., Saarbrücken, Germany, 2004.
- [45] “Fast, minimum storage ray-triangle intersection,” J. Graph. Tools, vol. 2, no. 1, pp. 21–28, 1997.
- [46] “Understanding the efficiency of ray traversal on GPUs,” in Proc. Conf. High Perform. Graph., 2009, pp. 145–149.
- [47] “Restart trail for stackless BVH traversal,” in Proc. Conf. High Perform. Graph., 2010, pp. 107–111.
- [48] “SATO: Surface-area traversal order for shadow ray tracing,” Comput. Graph. Forum, vol. 33, no. 6, pp. 167–177, 2014.
- [49] “Analyzing CUDA workloads using a detailed GPU simulator,” in Proc. IEEE Int. Symp. Perform. Anal. Syst. Softw., 2009, pp. 163–174.
- [50] “Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0,” in Proc. 40th Annu. IEEE/ACM Int. Symp. Microarchit., 2007, pp. 3–14.
- [51] Computer Organization and Design: The Hardware/Software Interface, 4th ed. San Mateo, CA, USA: Morgan Kaufmann, 2008.
- [52] “Tradeoffs in designing accelerator architectures for visual computing,” in Proc. 41st Annu. IEEE/ACM Int. Symp. Microarchit., 2008, pp. 164–175.
- [53] “Asynchronous BVH reconstruction on CPU-GPU hybrid architecture,” in Proc. ACM SIGGRAPH 2014 Posters, 2014, p. 91:1.
- [54] “Spatial splits in bounding volume hierarchies,” in Proc. Conf. High Perform. Graph., 2009, pp. 7–13.
- [55] T. Aila, S. Laine, and T. Karras, “Understanding the efficiency of ray traversal on GPUs - Kepler and Fermi addendum,” NVIDIA Corporation, Santa Clara, CA, USA, NVIDIA Tech. Rep. NVR-2012-02, 2012.
- [56] “Distributed interactive ray tracing of dynamic scenes,” in Proc. IEEE Symp. Parallel and Large-Data Vis. Graph., 2003, pp. 77–86.
- [57] “Ray tracing with shared-plane bounding volume hierarchies,” J. Graph., GPU, Game Tools, vol. 15, no. 3, pp. 141–151, 2011.
- [58] “High-order diffraction and diffuse reflections for interactive sound propagation in large environments,” ACM Trans. Graph., vol. 33, no. 4, pp. 39:1–39:12, 2014.
Index Terms
HART: A Hybrid Architecture for Ray Tracing Animated Scenes