research-article

Discovering Latent Topics by Gaussian Latent Dirichlet Allocation and Spectral Clustering

Published: 23 January 2019

Abstract

Diversifying the results retrieved for a query improves users' search efficiency. Presenting multiple aspects of the retrieved information gives users an overview of the object and helps them locate what they need quickly. To discover these aspects, research has focused on generating image clusters from the initially retrieved results. Latent Dirichlet allocation (LDA) is an effective approach that has been shown to perform well at discovering high-level topics. However, traditional LDA is designed to process textual words and requires discrete input. When it is applied to continuous visual features, a common solution is to quantize the features into discrete form with a bag-of-visual-words algorithm. The quantization error introduced in this process inevitably causes information loss. To construct a topic model that retains the complete visual information, this work applies Gaussian latent Dirichlet allocation (GLDA) to the diversity problem in image retrieval. In this model, the traditional multinomial distribution is replaced with a Gaussian distribution to model continuous visual features. In addition, we propose a two-phase spectral clustering strategy, called dual spectral clustering, to generate clusters from the region level to the image level. Experiments on the challenging landmarks of the DIV400 database show that our proposal improves relevance and diversity by about 10% compared to traditional topic models.
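The contrast between quantized and continuous features can be illustrated with a small sketch. It is not the authors' implementation: the two-codeword codebook, the feature values, the topic parameters, and the diagonal covariance are all assumptions chosen for illustration. The sketch shows two distinct features collapsing onto the same visual word, while a Gaussian topic density still separates them.

```python
import math

def nearest_codeword(x, codebook):
    """Bag-of-visual-words step: index of the codeword closest to feature x."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, codebook[k])),
    )

def quantization_error(x, codebook):
    """Squared distance between a feature and the codeword that replaces it."""
    c = codebook[nearest_codeword(x, codebook)]
    return sum((a - b) ** 2 for a, b in zip(x, c))

def gaussian_log_density(x, mean, var):
    """Log density of x under a diagonal-covariance Gaussian (a simplification)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (a - m) ** 2 / v)
        for a, m, v in zip(x, mean, var)
    )

# Hypothetical 2-D codebook and region features (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.4, 0.4)]

# Discrete route: both features are replaced by the same visual word,
# so a multinomial topic model can no longer tell them apart.
words = [nearest_codeword(x, codebook) for x in features]
errors = [quantization_error(x, codebook) for x in features]

# Continuous route: a Gaussian topic (assumed mean and variance) scores
# the raw features directly and keeps them distinguishable.
topic_mean, topic_var = (0.0, 0.0), (0.05, 0.05)
scores = [gaussian_log_density(x, topic_mean, topic_var) for x in features]

print(words, errors, scores)
```

In this toy setting the second feature carries a much larger quantization error than the first, yet both receive the same word index; the Gaussian score preserves that difference.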


          • Published in

            ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 15, Issue 1
            February 2019
            265 pages
            ISSN: 1551-6857
            EISSN: 1551-6865
            DOI: 10.1145/3309717

            Copyright © 2019 ACM

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 23 January 2019
            • Accepted: 1 October 2018
            • Revised: 1 September 2018
            • Received: 1 May 2018
            Qualifiers

            • research-article
            • Research
            • Refereed
