Abstract
Most matting research relies on advanced semantics to achieve high-quality alpha mattes, and usually combines low-level features directly to complement alpha details. However, we argue that appearance-agnostic integration can only provide biased foreground (FG) details and that alpha mattes require aggregation of features from different levels for better pixel-wise opacity perception. In this article, we propose an end-to-end hierarchical and progressive attention matting network (HAttMatting++), which can better predict the opacity of the FG from single RGB images without additional input. Specifically, we utilize channel-wise attention (CA) to distill pyramidal features and employ spatial attention (SA) at different levels to filter appearance cues. This progressive attention mechanism can estimate alpha mattes from adaptive semantics and semantics-indicated boundaries. We also introduce a hybrid loss function that fuses structural similarity, mean square error, adversarial loss, and sentry supervision to guide the network toward a better overall FG structure. In addition, we construct a large-scale and challenging image matting dataset comprising 59,000 training images and 1,000 test images (646 distinct FG alpha mattes in total), which can further improve the robustness of our hierarchical and progressive aggregation model. Extensive experiments demonstrate that the proposed HAttMatting++ can capture sophisticated FG structures and achieve state-of-the-art performance with single RGB images as input.
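To make the CA and SA steps concrete, here is a minimal NumPy sketch of one attention stage. It is not the HAttMatting++ implementation: the squeeze-and-excitation form of the channel gate, the average-plus-max spatial mask, and every name in it (`channel_attention`, `spatial_attention`, `w1`, `w2`, the reduction ratio) are assumptions chosen for illustration, and the weights are random rather than learned.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight the channels of feat (C, H, W) with a gate in [0, 1].

    Assumed squeeze-and-excitation form: global average pool,
    a two-layer bottleneck, then a sigmoid gate per channel.
    """
    squeezed = feat.mean(axis=(1, 2))         # (C,) global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)   # (C // r,) ReLU bottleneck
    weights = sigmoid(w2 @ hidden)            # (C,) per-channel gate
    return feat * weights[:, None, None], weights


def spatial_attention(guide, low):
    """Filter low-level features low (K, H, W) with a mask from guide (C, H, W).

    Assumed form: channel-wise average and max of the semantic features
    are summed and squashed into a single (H, W) mask.
    """
    mask = sigmoid(guide.mean(axis=0) + guide.max(axis=0))  # (H, W)
    return low * mask[None, :, :], mask


# Toy inputs: random values stand in for learned weights and real features.
rng = np.random.default_rng(0)
C, K, H, W, r = 8, 3, 4, 4, 2
feat = rng.normal(size=(C, H, W))   # high-level (pyramidal) features
low = rng.normal(size=(K, H, W))    # low-level appearance features
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))

gated, weights = channel_attention(feat, w1, w2)
filtered, mask = spatial_attention(gated, low)
print(gated.shape, filtered.shape, mask.shape)
```

In this sketch the channel gate selects which semantic channels matter, and the resulting features then decide which spatial positions of the appearance features pass through. Repeating the spatial step at several feature levels would approximate the progressive scheme the abstract describes.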