
Only a matter of style: age transformation using a style-based regression model

Published: 19 July 2021

Abstract

The task of age transformation illustrates the change of an individual's appearance over time. Accurately modeling this complex transformation over an input facial image is extremely challenging as it requires making convincing, possibly large changes to facial features and head shape, while still preserving the input identity. In this work, we present an image-to-image translation method that learns to directly encode real facial images into the latent space of a pre-trained unconditional GAN (e.g., StyleGAN) subject to a given aging shift. We employ a pre-trained age regression network to explicitly guide the encoder in generating the latent codes corresponding to the desired age. In this formulation, our method approaches the continuous aging process as a regression task between the input age and desired target age, providing fine-grained control over the generated image. Moreover, unlike approaches that operate solely in the latent space using a prior on the path controlling age, our method learns a more disentangled, non-linear path. Finally, we demonstrate that the end-to-end nature of our approach, coupled with the rich semantic latent space of StyleGAN, allows for further editing of the generated images. Qualitative and quantitative evaluations show the advantages of our method compared to state-of-the-art approaches. Code is available at our project page: https://yuval-alaluf.github.io/SAM.
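The abstract describes the core mechanism at a high level: a trainable encoder maps an input face together with a desired target age into the latent space of a frozen, pre-trained StyleGAN generator, while a frozen, pre-trained age regression network supervises the apparent age of the decoded image. The sketch below is only an illustration of that training signal under stated assumptions; every module, shape, and loss here is a simplified placeholder (the stubs stand in for StyleGAN and for a DEX-style age regressor), and none of it is the authors' released implementation, which is linked from the project page.

```python
# Minimal sketch (not the authors' code) of the regression-style aging objective
# described in the abstract: encoder(image, target_age) -> latent code of a frozen
# generator, with a frozen age regressor guiding the age of the generated image.
# All names, shapes, and loss weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM = 512  # assumed latent-space dimensionality

class AgingEncoder(nn.Module):
    """Stand-in encoder: RGB image plus a constant target-age channel -> latent code."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, LATENT_DIM),
        )

    def forward(self, image, target_age):
        # Condition the encoder explicitly on the desired age by appending it
        # as an extra constant channel to the input image.
        b, _, h, w = image.shape
        age_map = target_age.view(b, 1, 1, 1).expand(b, 1, h, w)
        return self.backbone(torch.cat([image, age_map], dim=1))

# Frozen "pre-trained" components; here they are untrained stubs that only mimic
# the interfaces of a StyleGAN generator and an age regression network.
generator = nn.Sequential(nn.Linear(LATENT_DIM, 3 * 64 * 64), nn.Tanh(),
                          nn.Unflatten(1, (3, 64, 64)))
age_regressor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1))
for p in list(generator.parameters()) + list(age_regressor.parameters()):
    p.requires_grad_(False)

encoder = AgingEncoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)

def training_step(image, source_age, target_age):
    """One optimization step: only the encoder is updated."""
    w = encoder(image, target_age)                 # latent code for the requested age
    aged = generator(w)                            # image decoded by the frozen GAN
    predicted_age = age_regressor(aged).squeeze(1)
    aging_loss = F.mse_loss(predicted_age, target_age)   # explicit age guidance
    # Crude stand-in: encourage reconstruction only when the requested age
    # matches the source age (a full system would add identity/perceptual terms).
    recon_loss = F.l1_loss(aged, image) * (target_age == source_age).float().mean()
    loss = aging_loss + recon_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random data, just to make the tensor shapes explicit.
images = torch.rand(2, 3, 64, 64) * 2 - 1
print(training_step(images, torch.tensor([25.0, 60.0]), torch.tensor([70.0, 60.0])))
```

In a complete system one would expect identity and perceptual losses alongside the age term, and a richer latent representation than a single vector; the sketch keeps only the age-regression signal because that is the part the abstract singles out.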


Supplemental Material

a45-alaluf.mp4
3450626.3459805.mp4



      • Published in

        ACM Transactions on Graphics, Volume 40, Issue 4
        August 2021, 2170 pages
        ISSN: 0730-0301
        EISSN: 1557-7368
        DOI: 10.1145/3450626

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 19 July 2021
        • Published in ACM Transactions on Graphics, Volume 40, Issue 4

