Abstract
Motion estimation algorithms widely exploit the similarity between the motion vectors of neighboring blocks. However, nonneighboring blocks may also have similar motions, because they lie at close depths or belong to the same object in the scene. As a result, the motion vectors usually fall into several patterns, which reveals a clustering structure. In this article, we propose a progressive clustering algorithm that periodically counts the motion vectors of past blocks to build incremental clustering statistics. These statistics serve as motion vector predictors for the following blocks. This is shown to make it much more efficient for a block to find its best-matching candidate. We also implement the clustering-based search in CUDA for GPU acceleration. Another interesting application of the clustering statistics is persistent static object tracking. Based on the statistics, several auxiliary tracking areas are created to guide the object tracking. Even when the target object changes significantly in appearance or occasionally disappears, its position can still be predicted. Experiments on the Xiph.org Video Test Media dataset show that our clustering-based search algorithm outperforms the mainstream and some state-of-the-art motion estimation algorithms. In these experiments, it is 33 times faster on average than the full search algorithm, with only slightly higher mean-square error values. The tracking results show that the auxiliary tracking areas help to locate the target object effectively.
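The idea of turning past motion vectors into predictors can be illustrated with a minimal Python sketch. It is not the article's algorithm: the abstract does not give the clustering rule, so the city-block distance, the `radius` threshold, the top-`k` selection, and all names (`ProgressiveMVClusters`, `pick_best`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Running sum and count of the motion vectors assigned to one cluster."""
    sum_x: float
    sum_y: float
    count: int

    @property
    def centroid(self):
        return (self.sum_x / self.count, self.sum_y / self.count)


class ProgressiveMVClusters:
    """Incremental clustering statistics over motion vectors of past blocks."""

    def __init__(self, radius=2.0):
        # Assumed threshold: a vector joins a cluster only if it lies within
        # this city-block distance of the cluster centroid.
        self.radius = radius
        self.clusters = []

    def add(self, mv):
        """Fold one motion vector (x, y) into the statistics."""
        x, y = mv
        best, best_d = None, None
        for c in self.clusters:
            cx, cy = c.centroid
            d = abs(x - cx) + abs(y - cy)
            if best_d is None or d < best_d:
                best, best_d = c, d
        if best is not None and best_d <= self.radius:
            best.sum_x += x
            best.sum_y += y
            best.count += 1
        else:
            self.clusters.append(Cluster(x, y, 1))

    def predictors(self, k=3):
        """Return the rounded centroids of the k most populated clusters."""
        ranked = sorted(self.clusters, key=lambda c: c.count, reverse=True)
        return [(round(c.centroid[0]), round(c.centroid[1])) for c in ranked[:k]]


def pick_best(candidates, cost):
    """Choose the candidate motion vector with the lowest matching cost."""
    return min(candidates, key=cost)


# Motion vectors of already processed blocks (made-up values).
stats = ProgressiveMVClusters(radius=2.0)
for mv in [(4, 0), (5, 0), (4, 1), (0, 0), (0, 0), (-3, 2)]:
    stats.add(mv)

# Candidates for the next block; `cost` stands in for a block-matching error.
candidates = stats.predictors(k=2)
best = pick_best(candidates, cost=lambda mv: abs(mv[0] - 4) + abs(mv[1]))
```

In a real encoder the `cost` callable would be a block-matching error such as the mean-square error between the current block and the displaced reference block, and the chosen predictor would typically seed a small local refinement rather than be used as the final vector.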
Supplemental Material
Supplemental movie, appendix, image, and software files for "Progressive Motion Vector Clustering for Motion Estimation and Auxiliary Tracking" are available for download.