Abstract
Inferring a high dynamic range (HDR) image from a single low dynamic range (LDR) input is an ill-posed problem in which we must compensate for data lost to under-/over-exposure and color quantization. To tackle this, we propose the first deep-learning-based approach for fully automatic inference using convolutional neural networks. Because directly inferring a 32-bit HDR image from an 8-bit LDR image is intractable due to the difficulty of training, we take an indirect approach: the key idea of our method is to synthesize LDR images taken with different exposures (i.e., bracketed images) based on supervised learning, and then reconstruct an HDR image by merging them. By learning the relative changes of pixel values due to increased/decreased exposures using 3D deconvolutional networks, our method can reproduce not only natural tones without introducing visible noise but also the colors of saturated pixels. We demonstrate the effectiveness of our method by comparing our results with those of conventional methods and with ground-truth HDR images.
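The final step described above, merging a stack of bracketed LDR images into one HDR image, can be illustrated with a minimal sketch. This is not the paper's network or its exact merging procedure: it is a generic weighted average in the spirit of classical multi-exposure merging, and it assumes a plain gamma-2.2 camera response, relative exposure times, and a hat-shaped weighting. The function name `merge_brackets` and its parameters are hypothetical.

```python
import numpy as np

def merge_brackets(ldr_stack, exposure_times, gamma=2.2):
    """Merge bracketed LDR images (floats in [0, 1]) into a relative-radiance HDR image.

    ldr_stack: array of shape (N, H, W, 3), one image per exposure
    exposure_times: length-N sequence of relative exposure times
    gamma: assumed camera response exponent (an assumption, not from the paper)
    """
    ldr = np.asarray(ldr_stack, dtype=np.float64)
    times = np.asarray(exposure_times, dtype=np.float64).reshape(-1, 1, 1, 1)

    # Assumed camera model: invert a plain gamma curve to get linear values.
    linear = ldr ** gamma

    # Hat weights: trust mid-tones, distrust nearly black or nearly saturated pixels.
    # The small floor keeps the denominator nonzero when every exposure is clipped.
    weights = np.maximum(1.0 - np.abs(2.0 * ldr - 1.0), 1e-4)

    # Each exposure gives a radiance estimate linear / time; average them by weight.
    return (weights * linear / times).sum(axis=0) / weights.sum(axis=0)
```

In this sketch, saturated pixels in the longer exposures receive almost no weight, so the shorter exposures determine the radiance in bright regions. In the approach described in the abstract, the bracketed stack would come from the learned up-/down-exposure networks rather than from a real camera.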
Index Terms
Deep reverse tone mapping




