research-article

Synergy between Semantic Segmentation and Image Denoising via Alternate Boosting

Published: 06 February 2023

Abstract

Noisy input images can degrade semantic segmentation, and denoising the image before segmentation may help. Both image denoising and semantic segmentation have advanced significantly with deep learning. In this work, we study the synergy between these two tasks using a holistic deep model. We observe not only that denoising counters the drop in segmentation accuracy caused by noisy input, but also that pixel-wise semantic information improves denoising. We therefore propose a boosting network that performs denoising and segmentation alternately. The proposed network is composed of multiple segmentation and denoising blocks (SDBs), each of which estimates a semantic map and then uses that map to regularize denoising. Experimental results show that the denoised image quality improves substantially, that the segmentation accuracy approaches that obtained on clean images, and that both segmentation and denoising improve as the number of SDBs increases. On the Cityscapes dataset, with additive white Gaussian noise of level 50, using three SDBs raises the denoising quality to 34.42 dB in PSNR and the segmentation accuracy to 66.5 in mIoU.
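
For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how additive white Gaussian noise, PSNR, and mIoU are commonly computed. It is not the authors' evaluation code. The function names are hypothetical, images are flattened to plain lists of pixel values, and the sketch assumes that the noise level denotes the standard deviation on a 0 to 255 intensity scale, which the abstract does not state.

```python
import math
import random


def add_awgn(pixels, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each pixel.

    Assumption: `sigma` is expressed on the same 0-255 scale as the pixels.
    The noisy values are not clipped or quantized here.
    """
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, sigma) for p in pixels]


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_iou(true_labels, pred_labels, num_classes):
    """Mean intersection-over-union, averaged over classes that occur
    in the ground truth or in the prediction."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(true_labels, pred_labels))
        inter = sum(1 for t, p in pairs if t == c and p == c)
        union = sum(1 for t, p in pairs if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        return float("nan")  # no class present at all
    return sum(ious) / len(ious)
```

With these definitions, `mean_iou` returns a fraction between 0 and 1, so a figure such as 66.5 would correspond to that fraction multiplied by 100. A higher PSNR means the denoised image is closer to the clean reference.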


      • Published in

        ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 2
        March 2023
        540 pages
        ISSN: 1551-6857
        EISSN: 1551-6865
        DOI: 10.1145/3572860
        • Editor: Abdulmotaleb El Saddik

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 6 February 2023
        • Online AM: 14 July 2022
        • Accepted: 7 July 2022
        • Revised: 2 June 2022
        • Received: 13 December 2021

        Qualifiers

        • research-article
        • Refereed