An Analytic System for User Gender Identification through User Shared Images

Published: 28 June 2017

Abstract

Many social media applications, such as recommendation, virality prediction, and marketing, make use of user gender, which may not be explicitly specified or may be kept private. Meanwhile, advanced mobile devices have become part of our lives, and users generate a huge amount of content every day, especially images shared by individuals in social networks. This form of user-generated content is widely accessible to others because of its sharing nature. When user gender is accessible only to exclusive parties, these user-shared images prove to be an easier way to identify user gender. This work investigated 3,152,344 images shared by 7,450 users on Fotolog and Flickr, two image-oriented social networks. It is observed that users who share visually similar images are more likely to have the same gender. A multimedia big data system that exploits this phenomenon is proposed for user gender identification, achieving 79% accuracy. These findings are useful for information or services in any social network with intensive image sharing.

          • Published in

            ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 13, Issue 3
            August 2017
            233 pages
            ISSN:1551-6857
            EISSN:1551-6865
            DOI:10.1145/3104033
            Copyright © 2017 ACM

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 28 June 2017
            • Accepted: 1 April 2017
            • Revised: 1 February 2017
            • Received: 1 June 2016

            Qualifiers

            • research-article
            • Research
            • Refereed
