Abstract
In this paper we explore the viability of path tracing massive scenes using a "supercomputer" constructed on-the-fly from thousands of small, serverless cloud computing nodes. We present R2E2 (Really Elastic Ray Engine), a scene decomposition-based parallel renderer that rapidly acquires thousands of cloud CPU cores, loads scene geometry from a pre-built scene BVH into the aggregate memory of these nodes in parallel, and performs full path-traced global illumination using an inter-node messaging service designed for communicating ray data. To balance ray tracing work across many nodes, R2E2 adopts a service-oriented design that statically replicates geometry and texture data from frequently traversed scene regions onto multiple nodes based on estimates of load, and dynamically assigns ray tracing work to lightly loaded nodes holding the required data. We port pbrt's ray-scene intersection components to the R2E2 architecture, and demonstrate that scenes with up to a terabyte of geometry and texture data (where as little as 1/250th of the scene can fit on any one node) can be path traced at 4K resolution in tens of seconds using thousands of tiny serverless nodes on the AWS Lambda platform.
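The load-balancing idea in the abstract (static replication of heavily traversed scene regions, then dynamic assignment of rays to lightly loaded holders) can be illustrated with a small sketch. This is not R2E2's implementation: the function names `place_chunks` and `assign_ray`, the "one chunk per node" simplification, the proportional rounding rule, and the queue-depth load metric are all assumptions made for illustration.

```python
def place_chunks(load_estimates, node_budget):
    """Static step (hypothetical): give each scene chunk a number of replicas
    roughly proportional to its estimated load, with at least one replica.
    Each node holds exactly one chunk in this simplified model, and rounding
    means the node count may differ slightly from node_budget."""
    total = sum(load_estimates.values())
    holders = {}
    next_node = 0
    for chunk, load in sorted(load_estimates.items()):
        count = max(1, round(node_budget * load / total))
        holders[chunk] = list(range(next_node, next_node + count))
        next_node += count
    return holders


def assign_ray(chunk, holders, queue_depth):
    """Dynamic step (hypothetical): send a ray that needs `chunk` to the
    holder of that chunk with the fewest queued rays."""
    node = min(holders[chunk], key=lambda n: queue_depth.get(n, 0))
    queue_depth[node] = queue_depth.get(node, 0) + 1
    return node


if __name__ == "__main__":
    # Made-up load estimates for three scene regions.
    estimates = {"canopy": 6.0, "ground": 3.0, "ocean": 1.0}
    holders = place_chunks(estimates, node_budget=10)
    depth = {}
    for _ in range(4):
        print("ray for 'ground' ->", "node", assign_ray("ground", holders, depth))
```

In this toy model, a frequently traversed region ends up on several nodes, so rays that need it can be spread across those replicas instead of queuing behind a single holder.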
Index Terms
R2E2: low-latency path tracing of terabyte-scale scenes using thousands of cloud CPUs