Abstract
The task of age transformation illustrates the change of an individual's appearance over time. Accurately modeling this complex transformation over an input facial image is extremely challenging as it requires making convincing, possibly large changes to facial features and head shape, while still preserving the input identity. In this work, we present an image-to-image translation method that learns to directly encode real facial images into the latent space of a pre-trained unconditional GAN (e.g., StyleGAN) subject to a given aging shift. We employ a pre-trained age regression network to explicitly guide the encoder in generating the latent codes corresponding to the desired age. In this formulation, our method approaches the continuous aging process as a regression task between the input age and desired target age, providing fine-grained control over the generated image. Moreover, unlike approaches that operate solely in the latent space using a prior on the path controlling age, our method learns a more disentangled, non-linear path. Finally, we demonstrate that the end-to-end nature of our approach, coupled with the rich semantic latent space of StyleGAN, allows for further editing of the generated images. Qualitative and quantitative evaluations show the advantages of our method compared to state-of-the-art approaches. Code is available at our project page: https://yuval-alaluf.github.io/SAM.
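The regression-style guidance described above can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: the abstract gives no loss formula, so the squared-error form, the `max_age` scaling, and every function name below (`aging_loss`, `age_guided_objective`, and the `toy_*` stand-ins) are assumptions made for the example. In the actual method the encoder, the StyleGAN generator, and the age regressor are neural networks.

```python
from typing import Callable, List

Image = List[float]   # toy stand-in for an image
Latent = List[float]  # toy stand-in for a latent code


def aging_loss(predicted_age: float, target_age: float, max_age: float = 100.0) -> float:
    """Squared error between predicted and target age, scaled by an assumed maximum age."""
    return ((predicted_age - target_age) / max_age) ** 2


def age_guided_objective(
    image: Image,
    target_age: float,
    encoder: Callable[[Image, float], Latent],
    generator: Callable[[Latent], Image],
    age_regressor: Callable[[Image], float],
) -> float:
    """Encode the image for a target age, decode it, and score the decoded age."""
    latent = encoder(image, target_age)   # trainable in the real setting
    output = generator(latent)            # fixed, pre-trained in the real setting
    return aging_loss(age_regressor(output), target_age)


# Hypothetical toy networks, chosen so the example runs without any ML library.
def toy_encoder(image: Image, target_age: float) -> Latent:
    # Stores the requested age in the second latent coordinate.
    return [sum(image) / len(image), target_age]


def lazy_encoder(image: Image, target_age: float) -> Latent:
    # Ignores the requested age and always encodes age 30.
    return [sum(image) / len(image), 30.0]


def toy_generator(latent: Latent) -> Image:
    # The "image" is just a copy of the latent code.
    return list(latent)


def toy_age_regressor(image: Image) -> float:
    # Reads the age back from the second coordinate.
    return image[1]


face = [0.2, 0.4, 0.6]
good = age_guided_objective(face, 70.0, toy_encoder, toy_generator, toy_age_regressor)
bad = age_guided_objective(face, 70.0, lazy_encoder, toy_generator, toy_age_regressor)
print(good, bad)
```

An encoder that respects the requested age is scored lower than one that ignores it, which is the signal a pre-trained age regressor would supply during training. Identity preservation, which the abstract also requires, is not modeled in this sketch.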
Only a matter of style: age transformation using a style-based regression model