Abstract
Long scenes can be imaged by mosaicing multiple images from cameras scanning the scene. We address the case of a video camera scanning a scene while moving along a long path, e.g. scanning a city street from a driving car, or scanning terrain from a low-flying aircraft.
We present a robust approach to this task that has been applied successfully to sequences of thousands of frames, even when captured with a hand-held camera. Examples are given on a few challenging sequences. The proposed system consists of two components: (i) motion and depth computation, and (ii) mosaic rendering.
In the first part, a "direct" method is presented for computing motion and dense depth. The robustness of the motion computation is increased by limiting the motion model of the scanning camera. An iterative graph-cuts approach, with planar labels and a flexible similarity measure, allows the computation of dense depth for the entire sequence.
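To make the idea of planar labels concrete, the sketch below treats each label as a plane d(x, y) = a*x + b*y + c in disparity space and picks, for every pixel, the plane with the lowest photometric cost between two frames. This is only an illustration under stated assumptions: the function names are hypothetical, the cost is a plain absolute difference, the shift is taken along +x, and the per-pixel winner-takes-all choice stands in for the graph-cuts optimization, which would also enforce smoothness between neighboring pixels.

```python
def plane_disparity(plane, x, y):
    """Disparity predicted at pixel (x, y) by a planar label (a, b, c)."""
    a, b, c = plane
    return a * x + b * y + c


def sample(img, x, y):
    """Nearest-neighbour lookup, clamped to the image border."""
    h, w = len(img), len(img[0])
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return img[yi][xi]


def best_plane_per_pixel(ref, other, planes):
    """Return, per pixel, the index of the plane with the lowest cost.

    Assumption: a scene point seen at (x, y) in `ref` appears at
    (x + d, y) in `other`, where d is the plane's disparity.
    No smoothness term is used (this is not graph cuts).
    """
    h, w = len(ref), len(ref[0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            costs = [
                abs(ref[y][x] - sample(other, x + plane_disparity(p, x, y), y))
                for p in planes
            ]
            labels[y][x] = costs.index(min(costs))
    return labels
```

Constant-disparity (fronto-parallel) labels are the special case a = b = 0; slanted surfaces such as a ground plane or a receding facade need nonzero a or b.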
In the second part, a new minimal aspect distortion (MAD) mosaicing method uses depth to minimize the geometric distortions of long panoramic images. In addition to MAD mosaicing, interactive visualization using X-Slits is also demonstrated.
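As background for the rendering step, the following sketch shows plain strip mosaicing: a vertical strip is cut from each frame at a fixed slit position and the strips are pasted side by side, with each strip as wide as the camera's horizontal motion for that frame. It is not the MAD method, which additionally uses depth to control distortion; the function name, the fixed slit, and the integer per-frame shifts are all assumptions made for illustration.

```python
def strip_mosaic(frames, shifts, slit_x):
    """Paste one vertical strip per frame into a long mosaic.

    frames : list of images, each a list of rows (lists of pixel values)
    shifts : horizontal image motion (in pixels) associated with each frame
    slit_x : column at which every strip starts (fixed slit, an assumption)
    """
    height = len(frames[0])
    mosaic = [[] for _ in range(height)]
    for frame, shift in zip(frames, shifts):
        # Strip width follows the motion, so consecutive strips tile the
        # scene without gaps or overlap when the shifts are accurate.
        width = max(1, int(round(shift)))
        for y in range(height):
            mosaic[y].extend(frame[y][slit_x:slit_x + width])
    return mosaic
```

With a fixed slit this yields a pushbroom-like panorama. Letting the slit position change from frame to frame changes the projection of the result, which is the kind of control that slit-based visualization builds on.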
Index Terms
Minimal Aspect Distortion (MAD) Mosaicing of Long Scenes