Abstract
Neural representations have emerged as a new paradigm for applications in rendering, imaging, geometric modeling, and simulation. Compared to traditional representations such as meshes, point clouds, or volumes, they can be flexibly incorporated into differentiable learning-based pipelines. While recent improvements to neural representations now make it possible to represent signals with fine details at moderate resolutions (e.g., for images and 3D shapes), adequately representing large-scale or complex scenes has proven a challenge. Current neural representations fail to accurately represent images at resolutions greater than a megapixel or 3D scenes with more than a few hundred thousand polygons. Here, we introduce a new hybrid implicit-explicit network architecture and training strategy that adaptively allocates resources during training and inference based on the local complexity of a signal of interest. Our approach uses a multiscale block-coordinate decomposition, similar to a quadtree or octree, that is optimized during training. The network architecture operates in two stages: using the bulk of the network parameters, a coordinate encoder generates a feature grid in a single forward pass. Then, hundreds or thousands of samples within each block can be efficiently evaluated using a lightweight feature decoder. With this hybrid implicit-explicit network architecture, we demonstrate the first experiments that fit gigapixel images to nearly 40 dB peak signal-to-noise ratio. Notably, this represents an increase in scale of over 1000× compared to the resolution of previously demonstrated image-fitting experiments. Moreover, our approach is able to represent 3D shapes significantly faster and better than previous techniques; it reduces training times from days to hours or minutes and memory requirements by over an order of magnitude.
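The two-stage evaluation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The layer sizes, the grid resolution `GRID`, the channel count `CHANNELS`, the `(x, y, scale)` block coordinate, and the use of bilinear interpolation to read the feature grid are all assumptions made for illustration, and the weights are random rather than trained. The sketch only shows the cost structure: one expensive encoder pass per block, followed by many cheap decoder evaluations inside that block.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small ReLU MLP (untrained, illustrative only)."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the MLP; ReLU on every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical sizes: an 8x8 feature grid with 16 channels per block.
GRID, CHANNELS = 8, 16

# The encoder holds the bulk of the parameters; the decoder is lightweight.
encoder = init_mlp([3, 256, 256, GRID * GRID * CHANNELS])
decoder = init_mlp([CHANNELS, 32, 3])

def encode_block(center_xy, scale):
    """Stage 1: a single forward pass maps a block coordinate to a feature grid."""
    block_coord = np.array([center_xy[0], center_xy[1], scale], dtype=np.float64)
    return mlp(encoder, block_coord).reshape(GRID, GRID, CHANNELS)

def decode_samples(feature_grid, local_xy):
    """Stage 2: interpolate features at local coords in [0, 1]^2, then decode.

    Bilinear interpolation is an assumption of this sketch.
    """
    g = local_xy * (GRID - 1)
    i0 = np.clip(np.floor(g).astype(int), 0, GRID - 2)
    t = g - i0
    x0, y0 = i0[:, 0], i0[:, 1]
    tx, ty = t[:, 0:1], t[:, 1:2]
    feats = ((1 - tx) * (1 - ty) * feature_grid[y0, x0]
             + tx * (1 - ty) * feature_grid[y0, x0 + 1]
             + (1 - tx) * ty * feature_grid[y0 + 1, x0]
             + tx * ty * feature_grid[y0 + 1, x0 + 1])
    return mlp(decoder, feats)

# One encoder pass for the block, then 1000 cheap decoder evaluations.
grid = encode_block(center_xy=(0.25, -0.5), scale=0.125)
samples = rng.uniform(0.0, 1.0, size=(1000, 2))
rgb = decode_samples(grid, samples)
print(rgb.shape)  # (1000, 3)
```

With these hypothetical sizes the encoder has roughly 330,000 parameters and the decoder fewer than 1,000, which mirrors the split the abstract describes: the per-block cost is paid once, and each additional sample inside the block costs only an interpolation and a small decoder pass.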
Acorn: adaptive coordinate networks for neural scene representation