DOI: 10.1145/1508244.1508282 (ASPLOS conference proceedings, research article)

StreamRay: a stream filtering architecture for coherent ray tracing

Published: 07 March 2009

ABSTRACT

The wide availability of commodity graphics processors has made real-time graphics an intrinsic component of the human/computer interface. These graphics cores accelerate the z-buffer algorithm and provide a highly interactive experience at a relatively low cost. However, many applications in entertainment, science, and industry require high-quality lighting effects such as accurate shadows, reflection, and refraction. These effects can be difficult to achieve with z-buffer algorithms but are straightforward to implement using ray tracing. Although ray tracing is computationally more complex, the algorithm exhibits excellent scaling and parallelism properties. Nevertheless, ray tracing memory access patterns are difficult to predict, so the promised parallel speedup is hard to achieve.

This paper highlights a novel approach to ray tracing based on stream filtering and presents StreamRay, a multicore, wide-SIMD microarchitecture that delivers interactive frame rates of 15-32 frames/second for scenes of high geometric complexity and exhibits high utilization for SIMD widths ranging from 8 to 16 elements. StreamRay consists of two main components: the ray engine, which is responsible for stream assembly and employs address generation units that generate addresses to form large SIMD vectors, and the filter engine, which implements the ray tracing operations with programmable accelerators. Results demonstrate that separating address and data processing reduces data movement and resource contention. Performance improves by 56% while simultaneously providing 11.63% power savings per accelerator core compared to a design that does not use separate resources for address and data computations.
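To make the idea of stream filtering more concrete, the following is a minimal software sketch, not the StreamRay hardware design. It assumes that a stream is a flat list of ray IDs, that a filter applies one conditional test per SIMD-width chunk, and that surviving rays are compacted into a new output stream. The names `filter_stream`, `utilization`, and `hits_node`, and the toy predicate, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def filter_stream(stream: Sequence[int],
                  predicate: Callable[[int], bool],
                  simd_width: int) -> Tuple[List[int], int]:
    """Test `stream` in SIMD-width chunks and compact the survivors.

    Returns the output stream and the number of wide operations issued.
    """
    survivors: List[int] = []
    simd_ops = 0
    for start in range(0, len(stream), simd_width):
        chunk = stream[start:start + simd_width]
        simd_ops += 1  # one wide conditional test per chunk
        mask = [predicate(ray) for ray in chunk]
        # Compaction: only rays that pass the test enter the output stream.
        survivors.extend(ray for ray, keep in zip(chunk, mask) if keep)
    return survivors, simd_ops


def utilization(active_lanes: int, simd_ops: int, simd_width: int) -> float:
    """Fraction of issued SIMD lanes that carried an active ray."""
    if simd_ops == 0:
        return 0.0
    return active_lanes / (simd_ops * simd_width)


if __name__ == "__main__":
    rays = list(range(32))               # 32 ray IDs (illustrative data)
    hits_node = lambda ray: ray % 4 == 0  # toy stand-in for a ray/node test
    width = 8

    active, test_ops = filter_stream(rays, hits_node, width)

    # Fixed packets with masking: the next step still issues one operation
    # per original packet, but only the surviving lanes do useful work.
    masked = utilization(len(active), test_ops, width)

    # Stream filtering: the next step runs over the compacted stream only.
    next_ops = -(-len(active) // width)   # ceiling division
    filtered = utilization(len(active), next_ops, width)

    print(f"masked packets: {masked:.2f}, filtered stream: {filtered:.2f}")
```

In this toy case, the compacted stream fills its SIMD lanes completely in the following step, whereas masked fixed-size packets leave most lanes idle. This is the intuition behind the high SIMD utilization reported above. The cost of assembling the compacted stream, which the abstract assigns to the ray engine, is not modeled here.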

