Abstract
Panorama stitching is an indispensable module in surveillance and space exploration systems. It lets a viewer understand the surroundings at a glance by aligning the surrounding images on a plane and fusing them naturally. The bottleneck of existing systems lies mainly in the alignment of adjacent images and the naturalness of the transitions between them. When facing dynamic foregrounds, these systems may produce outputs with misaligned semantic objects, which human observers notice readily. We address three key issues in the existing workflow that affect its efficiency and the quality of the resulting panoramic video, and present Pedestrian360, a panoramic video system based on a structured camera array (a spatial surround-view camera system). First, to obtain a geometrically aligned 360° view in the horizontal direction, we build a unified multi-camera coordinate system via a novel refinement approach that jointly optimizes camera poses. Second, to eliminate the brightness and color differences among images taken by different cameras, we design a photometric alignment approach that introduces a bias term into the baseline linear adjustment model and solves it with two-step least squares. Third, considering that the human visual system is more sensitive to high-level semantic objects, such as pedestrians and vehicles, we integrate the results of instance segmentation into the dynamic programming framework of the seam-cutting step. To our knowledge, we are the first to introduce instance segmentation to the seam-cutting problem, which can ensure the integrity of the salient objects in a panorama. Specifically, in our surveillance-oriented system, we choose the most significant target, pedestrians, as the seam-avoidance target, which accounts for the name Pedestrian360. To validate the effectiveness and efficiency of Pedestrian360, we establish a large-scale dataset composed of videos with pedestrians in five scenes. Test results on this dataset demonstrate the superiority of Pedestrian360 over its competitors. Experiments show that Pedestrian360 stitches videos at 12 to 26 fps, depending on the number of objects in the scene and how frequently they move. To make our reported results reproducible, the relevant code and collected data are publicly available at https://cslinzhang.github.io/Pedestrian360-Homepage/.
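The seam-cutting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a plain per-pixel difference map as the seam energy and a large additive penalty on pixels covered by an instance mask, so that a dynamic-programming seam routes around segmented pedestrians. The function name `find_seam` and the parameters `diff`, `instance_mask`, and `penalty` are hypothetical.

```python
import numpy as np


def find_seam(diff, instance_mask, penalty=1e6):
    """Return one column index per row for a top-to-bottom seam.

    diff          : (H, W) float array, per-pixel difference between the two
                    aligned images in their overlap (assumed energy term).
    instance_mask : (H, W) bool array, True on protected objects such as
                    pedestrians (assumed output of a segmentation model).
    penalty       : cost added to masked pixels so the seam avoids them.
    """
    cost = diff.astype(np.float64) + penalty * instance_mask
    h, w = cost.shape
    acc = cost.copy()                         # accumulated minimum cost
    back = np.zeros((h, w), dtype=np.int64)   # best predecessor column
    cols = np.arange(w)

    for y in range(1, h):
        prev = acc[y - 1]
        # Predecessors: up-left, up, up-right (borders padded with inf).
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        stacked = np.stack((left, prev, right))   # shape (3, W)
        choice = np.argmin(stacked, axis=0)       # 0, 1, or 2 per column
        acc[y] += stacked[choice, cols]
        back[y] = cols + choice - 1

    # Backtrack from the cheapest pixel in the last row.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(h - 1, 0, -1):
        seam[y - 1] = back[y, seam[y]]
    return seam
```

In this sketch, a seam crosses a masked pixel only when every connected path through the overlap is blocked, because any such crossing costs at least `penalty`. A real system would likely use a richer energy term and add temporal smoothing across video frames.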
Index Terms
Pedestrian-Aware Panoramic Video Stitching Based on a Structured Camera Array