research-article

Attribute-wise Explainable Fashion Compatibility Modeling

Published: 16 April 2021

Abstract

With the boom of the fashion market and people’s daily demand for beauty, clothing matching has attracted increasing research attention. Tackling this problem essentially means modeling human notions of compatibility between fashion items, i.e., Fashion Compatibility Modeling (FCM), which plays an important role in a wide range of commercial applications, including clothing recommendation and dressing assistance. Recent advances in multimedia processing have proven remarkably effective at accurate compatibility evaluation. However, these methods work like a black box and cannot provide appropriate explanations, which are important for gaining users’ trust and improving their experience. In practice, fashion experts usually explain a compatibility evaluation through the matching patterns between fashion attributes (e.g., a silk tank top cannot go with a knit dress). Inspired by this, we devise an attribute-wise explainable FCM solution, named ExFCM, which simultaneously generates an item-level compatibility evaluation for the input fashion items and attribute-level explanations for the evaluation result. In particular, ExFCM consists of two key components: attribute-wise representation learning and attribute interaction modeling. The former learns a region-aware attribute representation for each item using threshold global average pooling. The latter adaptively compiles the attribute-level matching signals into the overall compatibility evaluation using an attentive interaction mechanism. Note that ExFCM is trained without any attribute-level compatibility annotations, which facilitates its practical application. Extensive experiments on two real-world datasets validate that ExFCM generates more accurate compatibility evaluations than existing methods, together with reasonable explanations.
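
The two components named in the abstract can be illustrated with a toy sketch. The abstract does not give the formulation, so everything below is an assumption made for illustration and is not ExFCM's actual method: threshold global average pooling is read here as averaging only the activations above a threshold, and the attentive interaction mechanism is read as a softmax-weighted sum of per-attribute matching signals. The function names `threshold_gap` and `attentive_compatibility` and all input values are hypothetical.

```python
import math


def threshold_gap(activation_map, threshold):
    """Average only the activations strictly above `threshold`.

    Assumed reading of "threshold global average pooling": low activations
    (background regions) are ignored. Returns 0.0 if nothing passes.
    """
    kept = [v for row in activation_map for v in row if v > threshold]
    return sum(kept) / len(kept) if kept else 0.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_compatibility(match_signals, attention_logits):
    """Combine attribute-level matching signals into one item-level score.

    Assumed reading of the "attentive interaction mechanism": each
    attribute's matching signal is weighted by a softmax attention weight.
    The weights are returned too, since they indicate which attributes
    drove the score and can serve as an attribute-level explanation.
    """
    weights = softmax(attention_logits)
    score = sum(w * s for w, s in zip(weights, match_signals))
    return score, weights


# Hypothetical 2x2 activation map for one attribute of one item.
attribute_map = [[0.1, 0.9], [0.7, 0.2]]
pooled = threshold_gap(attribute_map, threshold=0.5)  # averages 0.9 and 0.7

# Hypothetical matching signals for three attributes of an item pair.
score, weights = attentive_compatibility(
    match_signals=[0.2, -0.6, 0.9],
    attention_logits=[0.0, 2.0, 0.0],
)
most_influential = max(range(len(weights)), key=lambda i: weights[i])
```

In this toy run the second attribute receives the largest attention weight, so its negative matching signal dominates the overall score; inspecting the weights in this way is one simple route to an attribute-level explanation without attribute-level compatibility labels.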

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 1
February 2021
392 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3453992

Copyright © 2021 Association for Computing Machinery.

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 16 April 2021
• Accepted: 1 September 2020
• Revised: 1 August 2020
• Received: 1 March 2020

Qualifiers

• research-article
• Refereed
