Abstract
This article examines the use of two kinds of context to improve the results of content-based music taggers: the relationships between tags and between the clips of songs that are tagged. We show that users agree more on tags applied to clips temporally “closer” to one another; that conditional restricted Boltzmann machine models of tags can more accurately predict related tags when they take context into account; and that when training data is “smoothed” using context, support vector machines can better rank these clips according to the original, unsmoothed tags and do this more accurately than three standard multi-label classifiers.
Index Terms
Contextual tag inference