Abstract
The 2D virtual try-on task aims to transfer a target clothing image onto the corresponding region of a person image. Although the task has been studied extensively because of its wide range of applications, it remains challenging in the presence of non-rigid garment shapes, large occlusions, and arbitrary poses. To this end, we propose a novel network with structural and textural consistency-preserving mechanisms for producing high-fidelity try-on images. Specifically, we first generate the semantic layout of a clothing-agnostic person to obtain a segmentation map, which serves as the transformation condition for the target clothes. Based on a recurrent network structure, a transform lookup is performed to iteratively update a dense flow. Then, we adopt a thin-plate-spline-based warping method to estimate the coarse offset flow at all key-point positions. Guided by this sparse flow, a multi-scale deformable convolution module iteratively predicts fine offsets for densely sampled positions, so that the clothing item and the person's shape can be accurately aligned. Finally, we develop a refinement module that fuses global and local features to render accurate geometric structures of the body parts and maintain the texture sharpness of the clothes. Extensive experiments on benchmark datasets demonstrate that our method outperforms other state-of-the-art methods in both quantitative and qualitative try-on results. The code is available at https://github.com/TJU-WEIHAO/MLCN.
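The coarse warping step mentioned in the abstract can be illustrated with a generic thin-plate-spline (TPS) interpolation. The sketch below is not taken from the authors' repository: it assumes that matched key points on the garment (`src`) and on the person (`dst`) are already available, fits a standard TPS with kernel U(r) = r² log r, and evaluates it on a pixel grid to obtain a coarse offset flow. All function names are hypothetical, and in the paper the key-point offsets are predicted by a network rather than given.

```python
import numpy as np


def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 log r, written in terms of r^2, with U(0) = 0."""
    safe = np.where(r2 > 0, r2, 1.0)  # avoid log(0); log(1) = 0 at those entries
    return np.where(r2 > 0, 0.5 * r2 * np.log(safe), 0.0)


def fit_tps(src, dst):
    """Solve for TPS parameters mapping src key points (n, 2) onto dst (n, 2).

    Returns an (n + 3, 2) array: n kernel weights followed by 3 affine rows.
    The key points must be distinct and not all collinear.
    """
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2)
    P = np.hstack([np.ones((n, 1)), src])  # affine basis [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)


def apply_tps(params, src, pts):
    """Evaluate the fitted TPS at arbitrary points pts (m, 2)."""
    n = src.shape[0]
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    U = tps_kernel(d2)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return U @ params[:n] + P @ params[n:]


def dense_offset_flow(src, dst, height, width):
    """Coarse (height, width, 2) offset field in (dx, dy) order, induced by the key points."""
    ys, xs = np.mgrid[0:height, 0:width]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    params = fit_tps(src, dst)
    return (apply_tps(params, src, grid) - grid).reshape(height, width, 2)
```

A field like this only captures smooth, global deformation. The abstract's fine stage, in which a multi-scale deformable convolution module predicts per-position offsets guided by the sparse flow, is what would correct the local misalignments that a TPS cannot express.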
A Multi-Level Consistency Network for High-Fidelity Virtual Try-On