Abstract
Few-shot segmentation aims to segment objects of a specific class under the guidance of a few annotated examples. Most existing approaches follow the prototype learning paradigm and generate category prototypes by squeezing masked feature maps extracted from support images. Because support and query features differ considerably in distribution, these support prototypes may lead to inaccurate predictions when compared directly with features extracted from query images. We propose a query-guided prototype learning architecture that addresses this problem in two ways: (i) a cross-alignment loss for training the segmentation decoder, which improves the decoder's robustness to the distribution discrepancy between support and query features; and (ii) a dynamic fusion module that strengthens the original support prototype with a second prototype extracted from query features. Experiments show that our method achieves promising results compared to previous prototype learning methods on the PASCAL-5\(^i\) and COCO-20\(^i\) datasets.
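The prototype learning paradigm described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "squeezing" means masked average pooling and that prototypes are compared with query features by cosine similarity, both common choices in this line of work. The function names are hypothetical, and the fixed-weight `fuse_prototypes` is only a stand-in for the learned dynamic fusion module, whose actual form the abstract does not specify.

```python
import math


def masked_average_pooling(features, mask):
    """Average the feature vectors at pixels where mask == 1.

    features: H x W x C nested lists; mask: H x W nested lists of 0/1.
    Returns a C-dimensional prototype (assumed form of "squeezing").
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, keep in zip(feat_row, mask_row):
            if keep:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_map(query_features, prototype):
    """Score every query pixel against the prototype."""
    return [[cosine(vec, prototype) for vec in row] for row in query_features]


def fuse_prototypes(support_proto, query_proto, weight):
    """Hypothetical fixed-weight fusion (assumption, not the paper's module)."""
    return [weight * s + (1.0 - weight) * q
            for s, q in zip(support_proto, query_proto)]


# Toy 2 x 2 support feature map with 2 channels; the mask marks the top row.
support_features = [[[1.0, 0.0], [1.0, 0.0]],
                    [[0.0, 1.0], [0.0, 1.0]]]
support_mask = [[1, 1],
                [0, 0]]
support_proto = masked_average_pooling(support_features, support_mask)

# Toy 1 x 2 query feature map scored against the support prototype.
query_features = [[[2.0, 0.0], [0.0, 3.0]]]
scores = similarity_map(query_features, support_proto)

# Blend the support prototype with an illustrative query-side prototype.
fused = fuse_prototypes(support_proto, [0.0, 1.0], 0.5)
```

In this toy case the first query pixel points in the same direction as the support prototype and the second is orthogonal to it, so only the first would be labeled foreground. With real features the support and query distributions differ, which is the mismatch the abstract's cross-alignment loss and query-side prototype are meant to reduce.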
Query-Guided Prototype Learning with Decoder Alignment and Dynamic Fusion in Few-Shot Segmentation