Abstract
Nearly every commodity imaging system we directly interact with, or indirectly rely on, leverages power-efficient, application-adjustable black-box image signal processors (ISPs), running either in dedicated hardware blocks or as proprietary software modules on programmable hardware. The configuration parameters of these black-box ISPs often have complex interactions with the output image, and must be adjusted prior to deployment according to application-specific quality and performance metrics. Today, this search is commonly performed manually by "golden eye" experts or algorithm developers leveraging domain expertise. We present a fully automatic system to optimize the parameters of black-box hardware and software image processing pipelines according to an arbitrary (i.e., application-specific) metric. We leverage a differentiable mapping between the configuration space and evaluation metrics, parameterized by a convolutional neural network that we train in an end-to-end fashion with imaging hardware in the loop. Unlike prior art, our differentiable proxies allow for high-dimensional parameter search with stochastic first-order optimizers, without explicitly modeling any lower-level image processing transformations. As such, we can efficiently optimize black-box image processing pipelines for a variety of imaging applications, reducing application-specific configuration times from months to hours. Our optimization method is fully automatic, even with black-box hardware in the loop. We validate our method on experimental data for real-time display applications, object detection, and extreme low-light imaging. The proposed approach outperforms manual search qualitatively and quantitatively for all domain-specific applications tested. When applied to traditional denoisers, we demonstrate that, just by changing hyperparameters, traditional algorithms can outperform recent deep learning methods by a substantial margin on recent benchmarks.
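The proxy idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: a least-squares quadratic model stands in for the convolutional neural network, and `black_box_metric` is a hypothetical stand-in for running an ISP and scoring its output. All names and constants are assumptions made for illustration. The sketch samples the black box, fits a differentiable proxy to the samples, and then runs plain gradient descent on the proxy instead of on the black box.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box_metric(theta):
    """Hypothetical 'run the ISP, then score the output' function (lower is better).
    Parameters are quantized like hardware registers, so the function is
    piecewise constant and offers no useful gradient."""
    t = np.round(np.asarray(theta, dtype=float) * 64) / 64
    return float((t[0] - 0.7) ** 2 + 2.0 * (t[1] - 0.2) ** 2 + 0.5 * t[0] * t[1])

def features(theta):
    """Quadratic feature map; the proxy is linear in these features."""
    a, b = theta
    return np.array([1.0, a, b, a * a, b * b, a * b])

def feature_jacobian(theta):
    """Derivative of each feature with respect to (a, b)."""
    a, b = theta
    return np.array([[0, 0], [1, 0], [0, 1], [2 * a, 0], [0, 2 * b], [b, a]],
                    dtype=float)

# 1) Query the black box at random parameter settings.
samples = rng.uniform(0.0, 1.0, size=(200, 2))
scores = np.array([black_box_metric(s) for s in samples])

# 2) Fit the differentiable proxy (assumption: least squares replaces CNN training).
X = np.stack([features(s) for s in samples])
w, *_ = np.linalg.lstsq(X, scores, rcond=None)

def proxy_grad(theta):
    """Gradient of the fitted proxy with respect to the parameters."""
    return feature_jacobian(theta).T @ w

# 3) First-order optimization on the proxy, keeping parameters in [0, 1].
START = np.array([0.5, 0.5])
theta_opt = START.copy()
for _ in range(500):
    theta_opt = np.clip(theta_opt - 0.1 * proxy_grad(theta_opt), 0.0, 1.0)

print("start score:", black_box_metric(START))
print("tuned score:", black_box_metric(theta_opt), "at", theta_opt)
```

The black box is only queried while collecting training samples and for the final check; every optimization step uses the proxy's gradient. In this toy setting a two-parameter quadratic is enough, whereas the abstract describes a learned network proxy suited to high-dimensional parameter spaces.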
Index Terms
Hyperparameter optimization in black-box image processing using differentiable proxies