Deep reverse tone mapping

Publication: ACM Transactions on Graphics, Article No.: 177. https://doi.org/10.1145/3130800.3130834

Abstract

Inferring a high dynamic range (HDR) image from a single low dynamic range (LDR) input is an ill-posed problem in which we must compensate for data lost to under-/over-exposure and color quantization. To tackle this, we propose the first deep-learning-based approach for fully automatic inference using convolutional neural networks. Because directly inferring a 32-bit HDR image from an 8-bit LDR image is intractable due to the difficulty of training, we take an indirect approach: the key idea of our method is to synthesize LDR images taken with different exposures (i.e., bracketed images) based on supervised learning, and then reconstruct an HDR image by merging them. By learning the relative changes of pixel values due to increased/decreased exposures using 3D deconvolutional networks, our method can reproduce not only natural tones without introducing visible noise but also the colors of saturated pixels. We demonstrate the effectiveness of our method by comparing our results with those of conventional methods and with ground-truth HDR images.
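The merging step named in the abstract can be illustrated with a minimal sketch. It assumes a linear camera response and known exposure times, neither of which the abstract specifies, and it uses a simple triangle weighting in the spirit of classic multi-exposure merging rather than the authors' pipeline. All function names are hypothetical, and the bracketed stack is simulated by scaling and clipping a synthetic radiance ramp instead of being synthesized by a network.

```python
import numpy as np


def simulate_ldr(radiance, exposure_time, bits=8):
    """Hypothetical stand-in for one bracketed exposure.

    Assumes a linear response: scale by exposure time, clip to [0, 1],
    then quantize to the given bit depth.
    """
    levels = 2 ** bits - 1
    exposed = np.clip(radiance * exposure_time, 0.0, 1.0)
    return np.round(exposed * levels) / levels


def hat_weight(z):
    """Triangle weight: highest for mid-tones, zero at pure black and pure white."""
    return 1.0 - np.abs(2.0 * z - 1.0)


def merge_exposures(ldr_stack, exposure_times, eps=1e-6):
    """Weighted average of per-exposure radiance estimates (pixel value / time)."""
    num = np.zeros_like(ldr_stack[0], dtype=np.float64)
    den = np.zeros_like(num)
    for z, t in zip(ldr_stack, exposure_times):
        w = hat_weight(z)
        num += w * z / t
        den += w
    # Pixels that are clipped in every exposure get zero weight and stay at 0.
    return num / np.maximum(den, eps)


# Synthetic scene radiance spanning three orders of magnitude (assumed values).
radiance = np.geomspace(0.05, 50.0, 64)
exposure_times = [1 / 64, 1 / 16, 1 / 4, 1.0, 4.0, 16.0]

stack = [simulate_ldr(radiance, t) for t in exposure_times]
hdr = merge_exposures(stack, exposure_times)

print("max relative error:", np.max(np.abs(hdr - radiance) / radiance))
```

A single exposure at unit time clips every radiance above 1, whereas the merged result draws each pixel mostly from the exposures in which it is well exposed. That is why a stack of bracketed images, whether captured or synthesized, can cover a range that no single 8-bit image can.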

