Abstract
Dictionary-based Classification (DC) is a promising learning approach in multimedia computing. Previous studies focused on learning a discriminative dictionary, together with the sparsest representation over that dictionary, to cope with the complex conditions of real-world applications. However, the robustness obtained by learning a single dictionary is far from optimal. Worse, a single dictionary cannot take advantage of techniques proven in modern machine learning, such as data augmentation, to mitigate this problem. In this work, we propose a novel method that utilizes joint Augmented and Compressed Dictionaries for Robust Dictionary-based Classification (ACD-RDC). To optimize under the noise model introduced by real-world conditions, the objective function of ACD-RDC incorporates only two simple but well-designed constraints: a sparsity constraint enhanced by general data augmentation, which requires less sophisticated case-by-case tuning, and a discriminative constraint solved with a jointly learned dictionary. The optimization of the objective function is then derived theoretically as an approximate linear problem. The sparsity and discrimination enhanced by data augmentation guarantee robustness for image classification under various conditions, which constitutes the first positive case of using data augmentation to obtain robust dictionary-based classification. Numerous experiments have been conducted on popular facial and object image datasets. The results demonstrate that ACD-RDC obtains better classification results on diversely collected images than current dictionary-based classification methods. ACD-RDC is also confirmed to be a state-of-the-art classification method when using deep features as inputs.
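To make the general idea of dictionary-based classification with an augmented dictionary concrete, the following is a minimal sketch. It is not the ACD-RDC objective or its solver, which the abstract does not specify. It assumes a ridge-regularized (collaborative) coding step with a closed-form solution, additive Gaussian noise as a stand-in augmentation, and classification by the smallest class-wise reconstruction residual. The names augment, classify, noise_std, and lam, as well as the toy data, are hypothetical and chosen only for illustration.

```python
import numpy as np


def augment(D, labels, rng, noise_std=0.05, copies=1):
    """Append noisy copies of every atom (assumed augmentation, not the paper's)."""
    atoms, labs = [D], [labels]
    for _ in range(copies):
        atoms.append(D + noise_std * rng.standard_normal(D.shape))
        labs.append(labels)
    return np.hstack(atoms), np.concatenate(labs)


def classify(D, labels, y, lam=0.1):
    """Code y over the dictionary, then pick the class with the smallest residual."""
    # Normalize atoms so that no column dominates the representation.
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    # Closed-form minimizer of ||y - D x||^2 + lam * ||x||^2 (ridge coding,
    # used here as a simple substitute for a sparsity-constrained solver).
    x = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))]


# Toy data: two classes with disjoint prototypes in a 20-dimensional space.
rng = np.random.default_rng(0)
proto = np.zeros((2, 20))
proto[0, :10] = 1.0
proto[1, 10:] = 1.0
labels = np.repeat([0, 1], 5)
D = (proto[labels] + 0.1 * rng.standard_normal((10, 20))).T  # one atom per column

D_aug, labels_aug = augment(D, labels, rng)
y = proto[1] + 0.1 * rng.standard_normal(20)  # noisy query drawn from class 1
pred = classify(D_aug, labels_aug, y)
print("predicted class:", pred)
```

Swapping the ridge step for an l1-regularized solver would give a sparse-representation variant of the same pipeline; the residual-based decision rule stays unchanged.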