Abstract
In many distributed wireless surveillance applications, compressed videos are used for performing automatic video analysis tasks. The accuracy of object detection, which is essential for various video analysis tasks, can be reduced due to video quality degradation caused by lossy compression. This article introduces a video encoding framework with the objective of boosting the accuracy of object detection for wireless surveillance applications. The proposed video encoding framework is based on systematic investigation of the effects of lossy compression on object detection. It has been found that current standardized video encoding schemes cause temporal domain fluctuation for encoded blocks in stable background areas and spatial texture degradation for encoded blocks in dynamic foreground areas of a raw video, both of which degrade the accuracy of object detection. Two measures, the sum-of-absolute frame difference (SFD) and the degradation of texture in 2D transform domain (TXD), are introduced to depict the temporal domain fluctuation and the spatial texture degradation in an encoded video, respectively. The proposed encoding framework is designed to suppress unnecessary temporal fluctuation in stable background areas and preserve spatial texture in dynamic foreground areas based on the two measures, and it introduces new mode decision strategies for both intra- and interframes to improve the accuracy of object detection while maintaining an acceptable rate distortion performance. Experimental results show that, compared with traditional encoding schemes, the proposed scheme improves the performance of object detection and results in lower bit rates and significantly reduced complexity with comparable quality in terms of PSNR and SSIM.
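The abstract names the two measures but does not give their formulas, so the following is only a minimal sketch under stated assumptions. Here SFD is taken to be the sum of absolute pixel differences between co-located blocks of two consecutive frames, and TXD is taken to be the sum of absolute differences between the AC coefficients of a 2D DCT of the raw block and of the encoded block. The function names `sfd`, `dct2`, and `txd`, the choice of the DCT as the 2D transform, and the exclusion of the DC term are all illustrative assumptions, not the authors' definitions.

```python
import math


def sfd(block_prev, block_curr):
    """Assumed SFD: sum of absolute differences between co-located
    blocks taken from two consecutive frames."""
    return sum(
        abs(a - b)
        for row_prev, row_curr in zip(block_prev, block_curr)
        for a, b in zip(row_prev, row_curr)
    )


def dct2(block):
    """Naive orthonormal 2D DCT-II of a square block (O(n^4), for clarity only)."""
    n = len(block)

    def alpha(k):
        # Orthonormal scaling factors of the DCT-II.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def txd(raw_block, enc_block):
    """Assumed TXD: sum of absolute differences of AC coefficients
    (DC term excluded) between the raw and the encoded block."""
    raw_coef = dct2(raw_block)
    enc_coef = dct2(enc_block)
    n = len(raw_coef)
    return sum(
        abs(raw_coef[u][v] - enc_coef[u][v])
        for u in range(n)
        for v in range(n)
        if (u, v) != (0, 0)
    )


# Hypothetical 2x2 blocks, chosen only to illustrate the two measures.
background_prev = [[10, 10], [10, 10]]
background_curr = [[12, 9], [10, 10]]   # small fluctuation after encoding
textured_raw = [[0, 255], [255, 0]]     # strong texture in the raw block
flattened_enc = [[127.5, 127.5], [127.5, 127.5]]  # texture lost in encoding

print(sfd(background_prev, background_curr))
print(txd(textured_raw, flattened_enc))
```

Under these assumptions, a large SFD in a static background block signals temporal fluctuation introduced by the encoder, while a large TXD in a foreground block signals lost texture. A uniform brightness shift changes only the DC coefficient, so it leaves this TXD sketch near zero.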
Efficient Video Encoding for Automatic Video Analysis in Distributed Wireless Surveillance Systems