
Deep Bidirectional Cross-Triplet Embedding for Online Clothing Shopping

Published: 04 January 2018

Abstract

In this article, we address the cross-domain (i.e., street and shop) clothing retrieval problem and investigate its real-world applications for online clothing shopping. The problem is challenging because of the large discrepancy between street-domain and shop-domain images. We focus on learning an effective feature-embedding model that generates robust and discriminative feature representations across domains. Existing triplet embedding models achieve promising results by finding an embedding metric in which the distance between negative pairs is larger than the distance between positive pairs plus a margin. However, existing methods do not sufficiently address the challenges of the cross-domain clothing retrieval scenario. First, the intradomain and cross-domain data relationships need to be considered simultaneously. Second, the numbers of matched and nonmatched cross-domain pairs are unbalanced. To address these challenges, we propose a deep cross-triplet embedding algorithm together with a cross-triplet sampling strategy. Extensive experimental evaluations demonstrate the effectiveness of the proposed algorithms. Furthermore, we investigate two novel online shopping applications, clothing trying on and accessories recommendation, based on a unified cross-domain clothing retrieval framework.
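
To make the triplet constraint mentioned in the abstract concrete, the following is a minimal sketch of the standard triplet hinge loss on plain Python lists. It illustrates only the generic formulation (negative distance should exceed positive distance plus a margin), not the cross-triplet embedding or sampling strategy proposed in the article. The function names, the Euclidean distance, the margin value, and the toy vectors are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet hinge loss (illustrative, not the article's loss).

    The loss is zero once the anchor-negative distance exceeds the
    anchor-positive distance by at least `margin`; otherwise it grows
    linearly with the size of the violation.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: a street photo, its matching shop item,
# and a non-matching shop item.
street_query = [0.0, 0.0]
matching_shop_item = [3.0, 4.0]      # distance 5 from the query
nonmatching_shop_item = [6.0, 8.0]   # distance 10 from the query

# Constraint satisfied: the negative is farther than positive + margin.
satisfied = triplet_hinge_loss(street_query, matching_shop_item,
                               nonmatching_shop_item)

# Constraint violated: roles swapped, so the "positive" is farther away.
violated = triplet_hinge_loss(street_query, nonmatching_shop_item,
                              matching_shop_item)
```

In this toy setup the first triplet incurs no loss, while the swapped triplet is penalized by the amount by which it violates the margin. A training procedure would minimize such a loss over many sampled triplets.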

