research-article

RayCore: A Ray-Tracing Hardware Architecture for Mobile Devices

Published: 23 September 2014

Abstract

We present RayCore, a mobile ray-tracing hardware architecture. RayCore facilitates high-quality rendering effects, such as reflection, refraction, and shadows, on mobile devices by performing real-time Whitted ray tracing. RayCore consists of two major components: ray-tracing units (RTUs) based on a unified traversal and intersection pipeline and a tree-building unit (TBU) for dynamic scenes. The overall RayCore architecture offers considerable benefits in terms of die area, memory access, and power consumption. We have evaluated our architecture based on FPGA and ASIC evaluations and demonstrate its performance on different benchmarks. According to the results, our architecture demonstrates high performance per unit area and unit energy, making it highly suitable for use in mobile devices.

Supplemental Material

a162-sidebyside.mp4

      • Published in

        ACM Transactions on Graphics, Volume 33, Issue 5
        August 2014
        152 pages
        ISSN:0730-0301
        EISSN:1557-7368
        DOI:10.1145/2672594
        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 23 September 2014
        • Accepted: 1 March 2014
        • Received: 1 December 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
