
Collaborative PCA/DCA Learning Methods for Compressive Privacy

Published: 07 July 2017

Abstract

In the Internet era, the data collected on consumers like us are growing exponentially, and attacks on our privacy are becoming a real threat. To better protect our privacy, it is safer to let the data owner control the data uploaded to the network rather than taking chances with data servers or third parties. To this end, we propose compressive privacy, a privacy-preserving technique that enables the data creator to compress data via collaborative learning, so that the compressed data uploaded onto the Internet will be useful only for the intended utility and cannot be easily diverted to malicious applications.

For data in a high-dimensional feature vector space, a common approach to data compression is dimension reduction or, equivalently, subspace projection. The most prominent tool is principal component analysis (PCA). For unsupervised learning, PCA best recovers the original data for a given reduced dimensionality. In a supervised learning environment, however, it is more effective to adopt a supervised variant of PCA, known as discriminant component analysis (DCA), to maximize discriminant capability.
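As a minimal, hedged illustration (not code from the article), dimension reduction by PCA-based subspace projection can be sketched in NumPy; the synthetic data and the reduced dimensionality k = 3 here are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # 200 samples, 10 features (synthetic)

# Center the data and eigen-decompose the sample covariance matrix.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # re-sort by descending variance

k = 3                                    # reduced dimensionality
W = eigvecs[:, order[:k]]                # 10 x 3 projection matrix
Z = Xc @ W                               # compressed data, 200 x 3
```

Projecting new data reuses the same centering mean and W, so only the k-dimensional Z ever needs to leave the data owner's side.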

DCA subspace analysis comprises two distinct subspaces. The signal-subspace components of DCA are associated with the discriminant distance/power (related to classification effectiveness), whereas the noise-subspace components of DCA are tightly coupled with recoverability and/or privacy protection. This article presents three DCA-related data compression methods useful for privacy-preserving applications:

Utility-driven DCA: Because the rank of the signal subspace is limited by the number of classes, DCA can effectively support classification using a relatively small dimensionality (i.e., high compression).
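The rank argument can be made concrete with a sketch in the spirit of DCA, here using Fisher-style between-class (signal) and within-class (noise) scatter matrices with a small ridge; the exact DCA formulation in the article may differ, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
# Three classes in 10-D: the between-class (signal) scatter has rank <= 2,
# so two discriminant components suffice regardless of feature dimension.
means = rng.normal(scale=3.0, size=(3, 10))
X = np.vstack([m + rng.normal(size=(50, 10)) for m in means])
y = np.repeat([0, 1, 2], 50)

mu = X.mean(axis=0)
S_B = np.zeros((10, 10))   # between-class (signal) scatter
S_W = np.zeros((10, 10))   # within-class (noise) scatter
for c in np.unique(y):
    Xc = X[y == c]
    d = (Xc.mean(axis=0) - mu)[:, None]
    S_B += len(Xc) * (d @ d.T)
    S_W += (Xc - Xc.mean(axis=0)).T @ (Xc - Xc.mean(axis=0))

# Generalized eigenproblem S_B w = lam * (S_W + rho*I) w, solved via
# Cholesky whitening of the ridge-regularized noise matrix.
rho = 1e-3
R = np.linalg.cholesky(S_W + rho * np.eye(10))
M = np.linalg.solve(R, np.linalg.solve(R, S_B).T).T   # R^-1 S_B R^-T
vals, U = np.linalg.eigh(M)
top = np.argsort(vals)[::-1][:2]
W = np.linalg.solve(R.T, U[:, top])   # top-2 discriminant directions
Z = X @ W                             # compressed, class-discriminant data
```

With three classes, the signal subspace has rank at most 2, so two components already capture all the between-class discriminant information: a high compression ratio at no cost in classification power.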

Desensitized PCA: Incorporating a signal-subspace ridge into DCA yields a variant especially effective for extracting privacy-preserving components. In this case, the eigenvalues of the noise space become insensitive to the privacy labels and are ordered according to their corresponding component powers.
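The following is an illustrative sketch only, not the article's algorithm: one simple way to obtain privacy-insensitive components is to project out the direction separating the classes of a hypothetical binary privacy label, then run PCA on the residual so that the retained components are ordered by power:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 8))
p = (X[:, 0] > 0).astype(int)      # hypothetical binary privacy label

# Privacy-signal direction: the (unit) difference of the two class means.
d = X[p == 1].mean(axis=0) - X[p == 0].mean(axis=0)
d /= np.linalg.norm(d)

# Remove that direction, then run PCA on the residual (noise-space) data.
Xc = X - X.mean(axis=0)
X_noise = Xc - np.outer(Xc @ d, d)       # project out the signal direction
eigvals, eigvecs = np.linalg.eigh(X_noise.T @ X_noise / (len(X) - 1))
order = np.argsort(eigvals)[::-1]        # order components by power
Z = X_noise @ eigvecs[:, order[:3]]      # top-3 desensitized components
```

After the projection, the two privacy classes have numerically identical means in Z, so a mean-based attack on the privacy label learns nothing from the compressed data.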

Desensitized K-means/SOM: Since revealing the K-means or self-organizing map (SOM) cluster structure could leak sensitive information, it is safer to perform K-means or SOM clustering on a desensitized PCA subspace.
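A hedged sketch of the idea, with plain PCA standing in for the desensitized subspace (a full desensitized variant would first project out the privacy-signal directions) and a bare-bones Lloyd's K-means; SOM is omitted and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 12))

# Compress to a 4-D PCA subspace; clustering happens only there, so the
# revealed cluster structure does not expose the raw 12-D features.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:4].T

# Plain Lloyd's K-means on the compressed representation.
k = 5
centers = Z[rng.choice(len(Z), size=k, replace=False)]
for _ in range(50):
    dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                    else centers[j] for j in range(k)])
    if np.allclose(new, centers):   # converged
        break
    centers = new
```

Because both the centroids and the assignments live in the compressed subspace, publishing the clustering result reveals no more than the desensitized projection itself.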

