research-article

Deep Attentive Multimodal Network Representation Learning for Social Media Images

Published: 16 June 2021

Abstract

The analysis of social networks, such as the socially connected Internet of Things, has shown the deep influence of intelligent information processing technology on industrial systems for Smart Cities. The goal of social media representation learning is to learn dense, low-dimensional, continuous representations for the multimodal data within social networks, facilitating many real-world applications. Since social media images are usually accompanied by rich metadata (e.g., textual descriptions, tags, groups, and submitting users), modeling the image alone is not sufficient to capture the comprehensive information in social media images. In this work, we treat an image and its textual description as multimodal content, and transform the other meta-information into links between contents (e.g., two images marked with the same tag or submitted by the same user). Based on the multimodal content and social links, we propose a Deep Attentive Multimodal Graph Embedding model, named DAMGE, for more effective social image representation learning. We conduct extensive experiments on both small- and large-scale datasets, and the results confirm the superiority of the proposed model on the tasks of social image classification and link prediction.
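The link-construction step described in the abstract (connecting two images when they share a tag or a submitting user) can be sketched as below. This is a minimal illustration, not the paper's implementation; the post schema and function name are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

def build_social_links(posts):
    """Connect two images with a link if they share a tag or a submitting user.

    Each post is a dict with an "id", a "user", and a set of "tags"
    (a hypothetical schema chosen for this sketch).
    Returns a set of undirected edges as sorted (id, id) pairs.
    """
    # Group image ids by each piece of shared metadata.
    groups = defaultdict(list)
    for post in posts:
        groups[("user", post["user"])].append(post["id"])
        for tag in post["tags"]:
            groups[("tag", tag)].append(post["id"])

    # Any two images in the same metadata group are linked.
    edges = set()
    for ids in groups.values():
        for a, b in combinations(sorted(ids), 2):
            edges.add((a, b))
    return edges

posts = [
    {"id": 0, "user": "alice", "tags": {"sunset", "beach"}},
    {"id": 1, "user": "bob",   "tags": {"beach"}},
    {"id": 2, "user": "alice", "tags": {"city"}},
]
edges = build_social_links(posts)  # images 0-1 share a tag; 0-2 share a user
```

The resulting edge set, together with the per-image content (visual features plus the textual description), is the kind of attributed multimodal graph a graph-embedding model such as DAMGE would take as input.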

