A personal look back at twenty years of research in multimedia content analysis

Published: 17 October 2013
Abstract

This paper is a personal look back at twenty years of research in multimedia content analysis. It addresses the areas of audio, photo and video analysis for the purpose of indexing and retrieval from the perspective of a multimedia researcher. Whereas a general analysis of content is impossible due to the personal bias of the user, significant progress was made in the recognition of specific objects or events. The paper concludes with a brief outlook on the future.

