Research Article | Public Access

ACORN: Adaptive Coordinate Networks for Neural Scene Representation

Published: 19 July 2021

Abstract

Neural representations have emerged as a new paradigm for applications in rendering, imaging, geometric modeling, and simulation. Compared to traditional representations such as meshes, point clouds, or volumes, they can be flexibly incorporated into differentiable learning-based pipelines. While recent improvements to neural representations now make it possible to represent signals with fine details at moderate resolutions (e.g., for images and 3D shapes), adequately representing large-scale or complex scenes has proven a challenge. Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons. Here, we introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference based on the local complexity of a signal of interest. Our approach uses a multiscale block-coordinate decomposition, similar to a quadtree or octree, that is optimized during training. The network architecture operates in two stages: using the bulk of the network parameters, a coordinate encoder generates a feature grid in a single forward pass. Then, hundreds or thousands of samples within each block can be efficiently evaluated using a lightweight feature decoder. With this hybrid implicit-explicit network architecture, we demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio. Notably, this represents an increase in scale of over 1000× compared to the resolution of previously demonstrated image-fitting experiments. Moreover, our approach is able to represent 3D shapes significantly faster and better than previous techniques; it reduces training times from days to hours or minutes and memory requirements by over an order of magnitude.
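The two-stage design described in the abstract is what makes evaluation cheap: the expensive coordinate encoder runs once per block, while every sample inside that block only pays for an interpolation into the block's feature grid plus a tiny decoder MLP. The PyTorch sketch below illustrates this division of labor for a 2D image setting. It is a minimal sketch under our own assumptions, not the authors' implementation: all module names (CoordinateEncoder, FeatureDecoder, query_blocks), layer sizes, and the choice of bilinear interpolation are illustrative, and the adaptive quadtree optimization of the block decomposition is omitted entirely.

```python
# Minimal sketch of a two-stage hybrid implicit-explicit evaluation
# (illustrative only; not the ACORN reference implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoordinateEncoder(nn.Module):
    """Maps a block's global coordinate (e.g., center + scale) to a
    C x H x W feature grid in a single forward pass."""
    def __init__(self, in_dim=3, feat_dim=16, grid_res=8, hidden=512):
        super().__init__()
        self.feat_dim, self.grid_res = feat_dim, grid_res
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim * grid_res * grid_res),
        )

    def forward(self, block_coords):  # (B, in_dim)
        feats = self.net(block_coords)
        return feats.view(-1, self.feat_dim, self.grid_res, self.grid_res)

class FeatureDecoder(nn.Module):
    """Lightweight MLP evaluated at every sample; it stays cheap because
    the heavy encoder above ran only once per block."""
    def __init__(self, feat_dim=16, hidden=64, out_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, feats):  # (B, N, feat_dim)
        return self.net(feats)

def query_blocks(encoder, decoder, block_coords, local_xy):
    """block_coords: (B, 3) global block descriptors;
    local_xy: (B, N, 2) sample coordinates within each block, in [-1, 1]."""
    grid_feats = encoder(block_coords)              # (B, C, H, W), one pass per block
    sample_grid = local_xy.unsqueeze(1)             # (B, 1, N, 2)
    feats = F.grid_sample(grid_feats, sample_grid,  # bilinear lookup per sample
                          mode='bilinear', align_corners=True)
    feats = feats.squeeze(2).permute(0, 2, 1)       # (B, N, C)
    return decoder(feats)                           # (B, N, out_dim)

# Usage: 1024 samples in each of 4 blocks, with one encoder pass per block.
enc, dec = CoordinateEncoder(), FeatureDecoder()
rgb = query_blocks(enc, dec, torch.rand(4, 3), torch.rand(4, 1024, 2) * 2 - 1)
print(rgb.shape)  # torch.Size([4, 1024, 3])
```

Because the feature grid is computed once and reused, adding more samples per block amortizes the encoder's cost; this amortization is what the abstract credits for making gigapixel-scale image fitting tractable.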


Supplemental Material

a58-martel.mp4
3450626.3459785.mp4




• Published in

ACM Transactions on Graphics, Volume 40, Issue 4
August 2021
2170 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3450626

              Copyright © 2021 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 19 July 2021
Published in TOG Volume 40, Issue 4

