Abstract
We present an approach to detailed reconstruction of complex real-world scenes with a handheld commodity range sensor. The user moves the sensor freely through the environment and images the scene. An offline registration and integration pipeline produces a detailed scene model. To deal with the complex sensor trajectories required to produce detailed reconstructions with a consumer-grade sensor, our pipeline detects points of interest in the scene and preserves detailed geometry around them while a global optimization distributes residual registration errors through the environment. Our results demonstrate that detailed reconstructions of complex scenes can be obtained with a consumer-grade camera.
Index Terms
Dense scene reconstruction with points of interest