Abstract
The amount of multimedia data on the Internet has increased exponentially over the past few decades, and this trend is likely to continue. Multimedia content inherently draws on multiple information sources, so effective fusion methods are critical for data analysis and understanding. Most existing fusion methods are static with respect to time, which makes it difficult for them to handle evolving multimedia content. Several evolving fusion methods have been proposed in recent years to address this issue; however, their requirements are difficult to meet, which limits them to a narrow range of applications. In this article, we propose a novel evolving fusion method based on online portfolio selection theory. The proposed method takes into account the correlation among different information sources and evolves the fusion model as new multimedia data is added. It performs effectively on both crisp and soft decisions without requiring additional context information. Extensive experiments on concept detection and human detection tasks, conducted over the TRECVID dataset and surveillance data, show significantly better performance.
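To make the idea of an evolving fusion model concrete, the following is a minimal sketch and not the method proposed in the article. It assumes the fusion model is a weight vector over information sources and updates it with the exponentiated-gradient multiplicative rule known from the online portfolio selection literature. The reward definition in `source_returns`, the learning rate `eta`, and all function names are hypothetical choices made for illustration. Unlike the proposed method, this sketch does not model correlation among sources.

```python
import math


def fuse(weights, scores):
    """Weighted-sum fusion of per-source decision scores."""
    return sum(w * s for w, s in zip(weights, scores))


def source_returns(scores, label, floor=1e-3):
    """Hypothetical reward: how close each score in [0, 1] is to the 0/1 label.

    The floor keeps every return strictly positive, which the
    multiplicative update below needs to stay well defined.
    """
    return [max(floor, 1.0 - abs(label - s)) for s in scores]


def eg_update(weights, returns, eta=0.05):
    """One exponentiated-gradient step on the probability simplex.

    Each weight is multiplied by exp(eta * r_i / (w . r)) and the
    result is renormalized so the weights still sum to one.
    """
    portfolio_return = sum(w * r for w, r in zip(weights, returns))
    scaled = [
        w * math.exp(eta * r / portfolio_return)
        for w, r in zip(weights, returns)
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


def run_stream(stream, n_sources, eta=0.05):
    """Evolve the fusion weights as labeled samples arrive one by one."""
    weights = [1.0 / n_sources] * n_sources  # start from uniform weights
    for scores, label in stream:
        weights = eg_update(weights, source_returns(scores, label), eta)
    return weights


if __name__ == "__main__":
    # Toy stream (made-up numbers): source 0 tracks the label more closely.
    stream = [([0.9, 0.4], 1), ([0.1, 0.7], 0), ([0.8, 0.3], 1)]
    weights = run_stream(stream, n_sources=2)
    print(weights, fuse(weights, [0.85, 0.35]))
```

On the toy stream, the weight of the first source grows at every step because its scores are consistently closer to the labels. The same update accepts soft scores in [0, 1] as well as crisp 0/1 decisions.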
Up-Fusion: An Evolving Multimedia Fusion Method