research-article
Open Access

Casual 3D photography

Published: 20 November 2017

Abstract

We present an algorithm that enables casual 3D photography. Given a set of input photos captured with a hand-held cell phone or DSLR camera, our algorithm reconstructs a 3D photo: a central panoramic, textured, normal-mapped, multi-layered geometric mesh representation. 3D photos can be stored compactly and are optimized for being rendered from viewpoints that are near the capture viewpoints. They can be rendered using a standard rasterization pipeline to produce perspective views with motion parallax. When viewed in VR, 3D photos provide geometrically consistent views for both eyes. Our geometric representation also allows interacting with the scene using 3D geometry-aware effects, such as adding new objects to the scene and applying artistic lighting effects.

Our 3D photo reconstruction algorithm starts with a standard structure from motion and multi-view stereo reconstruction of the scene. The dense stereo reconstruction is made robust to the imperfect capture conditions using a novel near envelope cost volume prior that discards erroneous near depth hypotheses. We propose a novel parallax-tolerant stitching algorithm that warps the depth maps into the central panorama and stitches two color-and-depth panoramas for the front and back scene surfaces. The two panoramas are fused into a single non-redundant, well-connected geometric mesh. We provide videos demonstrating users interactively viewing and manipulating our 3D photos.
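
The idea of a prior that discards erroneous near depth hypotheses can be illustrated with a toy cost volume. The sketch below is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the near envelope is computed or applied, so the per-pixel envelope is simply given as input, and the names `apply_near_envelope` and `winner_take_all` are hypothetical.

```python
import math
from typing import List

# Assumed layout: cost_volume[p][d] is the matching cost of pixel p at
# depth index d, with depth indices ordered from near (0) to far.


def apply_near_envelope(cost_volume: List[List[float]],
                        near_envelope: List[int],
                        penalty: float = math.inf) -> List[List[float]]:
    """Penalize every depth hypothesis nearer than the per-pixel envelope.

    near_envelope[p] is the nearest depth index still considered plausible
    for pixel p (an input here; how it is estimated is not covered).
    """
    filtered = []
    for costs, d_min in zip(cost_volume, near_envelope):
        filtered.append([penalty if d < d_min else c
                         for d, c in enumerate(costs)])
    return filtered


def winner_take_all(cost_volume: List[List[float]]) -> List[int]:
    """Pick the lowest-cost depth index independently for each pixel."""
    return [min(range(len(costs)), key=costs.__getitem__)
            for costs in cost_volume]


costs = [
    [0.1, 0.5, 0.3, 0.9],  # pixel 0: spuriously low cost at the nearest depth
    [0.8, 0.2, 0.6, 0.7],  # pixel 1: envelope does not restrict anything
]
envelope = [2, 0]

depth_before = winner_take_all(costs)
depth_after = winner_take_all(apply_near_envelope(costs, envelope))
```

In this toy example, pixel 0 switches from the spurious nearest hypothesis to the best hypothesis at or beyond its envelope, while pixel 1 is unaffected. A real system would likely use a soft penalty and a spatially regularized solver rather than a hard cutoff with per-pixel winner-take-all.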

Published in

ACM Transactions on Graphics, Volume 36, Issue 6
December 2017
973 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3130800

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
