
Discovering multirelational structure in social media streams

Published: 03 February 2012

Abstract

In this article, we present a novel algorithm to discover multirelational structures from social media streams. A media item such as a photograph exists as part of a meaningful interrelationship among several attributes, including time, visual content, users, and actions. Discovery of such relational structures enables us to understand the semantics of human activity and has applications in content organization, recommendation algorithms, and exploratory social network analysis.

We propose a novel nonnegative matrix factorization framework to characterize the relational structures of group photo streams. The factorization incorporates image content features and contextual information. The idea is to treat a cluster as a set of photos with similar relational patterns; each cluster consists of photos relating to similar content or context. Relations represent different aspects of the photo stream data, including visual content, associated tags, photo owners, and post times. The extracted structures minimize the loss of mutual information in the predicted joint distribution. We also introduce a relational modularity function to determine the structure cost penalty, and hence the number of clusters.

Extensive experiments on a large Flickr dataset suggest that our approach is able to extract meaningful relational patterns from group photo streams. We evaluate the utility of the discovered structures through a tag prediction task and a qualitative user study. Our results show that our method, based on relational structures, outperforms baseline methods, including feature-based and tag-frequency-based techniques, by 35%--420%. The user study, which evaluated the benefits of our framework for exploring group photo streams, indicates that users found that the extracted clusters clearly represent the major themes in a group; the clustering results not only reflect how users describe the group data but often lead users to discover the evolution of the group activity.
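
The abstract does not give the factorization's objective or update rules, so the following is only a minimal sketch of plain nonnegative matrix factorization applied to a single relation. It uses the multiplicative updates for squared error described by Lee and Seung (2001), and it is not the authors' multirelational framework. The photo-by-tag count matrix `V`, the helper names, and the choice of two clusters are all illustrative assumptions.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Approximate nonnegative V (n x m) by W (n x k) times H (k x m).

    Uses Lee-Seung multiplicative updates for the squared-error objective.
    Every update multiplies by a nonnegative ratio, so W and H stay nonnegative.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


def squared_error(V, W, H):
    """Sum of squared differences between V and its reconstruction W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for v_row, r_row in zip(V, WH)
               for v, r in zip(v_row, r_row))


# Hypothetical photo-by-tag counts (not data from the article):
# photos 0-1 use only tags 0-1, photos 2-3 use only tags 2-3.
V = [[2, 2, 0, 0],
     [2, 2, 0, 0],
     [0, 0, 3, 3],
     [0, 0, 3, 3]]

W, H = nmf(V, k=2)

# Assign each photo to the component with the largest weight in its row of W.
clusters = [max(range(2), key=lambda a: row[a]) for row in W]
print(clusters, squared_error(V, W, H))
```

Each row of `W` gives a photo's weights over the latent components, and each row of `H` gives a component's weights over tags. The component labels are arbitrary, so only the grouping of photos is meaningful, not which group is numbered 0 or 1. A framework like the one described would additionally couple several such relations (visual content, tags, owners, post times) rather than factorizing one matrix in isolation.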

