
When Simpler Data Does Not Imply Less Information: A Study of User Profiling Scenarios With Constrained View of Mobile HTTP(S) Traffic

Published: 27 January 2018

Abstract

The exponential growth in smartphone adoption is contributing to the availability of vast amounts of human behavioral data. This data enables the development of increasingly accurate data-driven user models, which facilitate the delivery of personalized services that are often offered for free in exchange for the use of customers’ data. Although such practices have raised many privacy concerns, the increasing value of personal data is motivating diverse entities to aggressively collect and exploit it. In this article, we unfold profiling scenarios around mobile HTTP(S) traffic, focusing on those that have access to limited but meaningful segments of the data. The capability of each scenario to profile personal information is examined with real user data, collected in the wild from 61 mobile phone users for a minimum of 30 days. Our study attempts to model heterogeneous user traits and interests, including personality, boredom proneness, demographics, and shopping interests. Based on our modeling results, we discuss various implications for personalization, privacy, and personal data rights.


    • Published in

      ACM Transactions on the Web, Volume 12, Issue 2
      May 2018
      174 pages
      ISSN: 1559-1131
      EISSN: 1559-114X
      DOI: 10.1145/3176641

      Copyright © 2018 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 27 January 2018
      • Accepted: 1 September 2017
      • Revised: 1 July 2017
      • Received: 1 March 2017
      Qualifiers

      • research-article
      • Research
      • Refereed
