Abstract
Deep generative models make visual content creation more accessible to novice users by automating the synthesis of diverse, realistic content based on a collected dataset. However, current machine learning approaches miss a key element of the creative process: the ability to synthesize things that go far beyond the data distribution and everyday experience. To begin to address this issue, we enable a user to "warp" a given model by editing just a handful of original model outputs with desired geometric changes. Our method applies a low-rank update to a single model layer to reconstruct the edited examples. Furthermore, to combat overfitting, we propose a latent space augmentation method based on style-mixing. Our method allows a user to create a model that synthesizes endless objects with the defined geometric changes, enabling the creation of a new generative model without the burden of curating a large-scale dataset. We also demonstrate that edited models can be composed to achieve aggregated effects, and we present an interactive interface that lets users create new models through composition. Empirical measurements on multiple test cases suggest the advantage of our method over recent GAN fine-tuning methods. Finally, we showcase several applications of the edited models, including latent space interpolation and image editing.
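The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `low_rank_update` and `style_mix`, the plain-list matrix layout, and the layer and dimension counts are all hypothetical, and the abstract does not say how the low-rank factors are optimized or which layers are mixed. The sketch only shows the shape of the operations: adding a rank-r term U Vᵀ to one layer's weight matrix, and building an augmented latent by keeping some per-layer style codes and swapping in random ones for the rest.

```python
import random


def low_rank_update(weight, u, v):
    """Return weight + u @ v^T, where weight is m x n, u is m x r, v is n x r.

    Only the r columns of u and v would be learned; the original weight
    stays fixed. (Illustrative assumption, not the paper's exact scheme.)
    """
    m, n, r = len(weight), len(weight[0]), len(u[0])
    return [
        [weight[i][j] + sum(u[i][k] * v[j][k] for k in range(r)) for j in range(n)]
        for i in range(m)
    ]


def style_mix(w_edit, w_random, crossover):
    """Keep layers [0, crossover) from w_edit and take the rest from w_random.

    Each argument is a list of per-layer style vectors. Which layers to keep
    is an assumption made here for illustration.
    """
    kept = [list(layer) for layer in w_edit[:crossover]]
    swapped = [list(layer) for layer in w_random[crossover:]]
    return kept + swapped


# Rank-1 update of a 2 x 2 identity weight.
weight = [[1.0, 0.0], [0.0, 1.0]]
u = [[1.0], [2.0]]
v = [[3.0], [4.0]]
updated = low_rank_update(weight, u, v)

# Style-mixing augmentation over 4 hypothetical layers of dimension 3.
rng = random.Random(0)
num_layers, dim = 4, 3
w_edit = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_layers)]
w_random = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_layers)]
mixed = style_mix(w_edit, w_random, crossover=2)
```

Because the update is stored as two thin factors rather than a full matrix, it adds only r(m + n) parameters to the edited layer, which is one plausible reason a handful of edited examples can suffice.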
Index Terms
Rewriting geometric rules of a GAN