Abstract
Time-lapse image sequences offer visually compelling insights into dynamic processes that are too slow to observe in real time. However, playing a long time-lapse sequence back as a video often results in distracting flicker due to random effects, such as weather, as well as cyclic effects, such as the day-night cycle. We introduce the problem of disentangling time-lapse sequences in a way that allows separate, after-the-fact control of overall trends, cyclic effects, and random effects in the images, and describe a technique based on data-driven generative models that achieves this goal. This enables us to "re-render" the sequences in ways that would not be possible with the input images alone. For example, we can stabilize a long sequence to focus on plant growth over many months, under selectable, consistent weather.
Our approach is based on Generative Adversarial Networks (GANs) conditioned on the time coordinate of the time-lapse sequence. Our architecture and training procedure are designed so that the networks learn to model random variations, such as weather, using the GAN's latent space, and to disentangle overall trends and cyclic variations by feeding the conditioning time label to the model using Fourier features with specific frequencies.
We show that our models are robust to defects in the training data, enabling us to mitigate some of the practical difficulties in capturing long time-lapse sequences, such as temporary occlusions, uneven frame spacing, and missing frames.
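To make the conditioning scheme concrete, here is a minimal sketch in PyTorch, under our own assumptions about the frequency choices and feature layout (the abstract only states that Fourier features with specific frequencies are used, and the authors' actual architecture is not reproduced here). It encodes a normalized time label with sinusoids at a daily and a yearly frequency, and shows how fixing the latent while sweeping the label corresponds to re-rendering under consistent random effects:

```python
import math
import torch

# A minimal sketch, not the authors' implementation: encode a scalar time
# label t (normalized to [0, 1] over the whole capture period) as Fourier
# features at hand-picked frequencies. The exact frequencies and feature
# layout here are illustrative assumptions.
def fourier_time_features(t: torch.Tensor, total_days: float) -> torch.Tensor:
    """t: shape (N,) in [0, 1]. Returns (N, 5) conditioning features."""
    day_cycles = total_days           # one period per day
    year_cycles = total_days / 365.0  # one period per year
    feats = [
        t,                                        # slow term for the overall trend
        torch.sin(2 * math.pi * day_cycles * t),  # day-night cycle
        torch.cos(2 * math.pi * day_cycles * t),
        torch.sin(2 * math.pi * year_cycles * t), # seasonal cycle
        torch.cos(2 * math.pi * year_cycles * t),
    ]
    return torch.stack(feats, dim=-1)

# Conceptual usage: holding the latent z fixed while sweeping t keeps the
# random factors (e.g., weather) constant, so varying only the time label
# re-renders the scene along the trend and cyclic axes.
t = torch.linspace(0.0, 1.0, steps=4)   # four frames across a two-year span
cond = fourier_time_features(t, total_days=730.0)
z = torch.randn(1, 512).expand(4, -1)   # one shared latent = one "weather"
# frames = generator(z, cond)           # hypothetical conditional generator
print(cond.shape)                       # torch.Size([4, 5])
```

The key design point suggested by the abstract is that the chosen frequencies align with the known periods of the cyclic effects (one day, one year), so the model has no incentive to explain those variations with the latent space, which is then free to absorb the purely random effects.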