Abstract
Many social media applications, such as recommendation, virality prediction, and marketing, make use of user gender, which may not be explicitly specified or may be kept private. Meanwhile, advanced mobile devices have become part of our lives, and users generate a huge amount of content every day, especially images shared in social networks. This form of user-generated content is widely accessible to others because of its sharing nature. When user gender is accessible only to exclusive parties, these user-shared images prove to be an easier way to identify user gender. This work investigated 3,152,344 images shared by 7,450 users on Fotolog and Flickr, two image-oriented social networks. It is observed that users who share visually similar images are more likely to have the same gender. A multimedia big data system that exploits this phenomenon is proposed for user gender identification and achieves 79% accuracy. These findings are useful for information or services in any social network with intensive image sharing.
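The observation that users who share visually similar images tend to have the same gender can be illustrated with a small sketch. This is not the system proposed in the paper: it assumes each user is summarized by a hypothetical profile vector (for example, a normalized histogram of visual labels over that user's images), compares users by cosine similarity, and predicts gender by majority vote among the k most similar labeled users. The names `cosine`, `predict_gender`, and `labeled_users`, and the toy vectors, are illustrative assumptions.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two equal-length profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat an all-zero profile as dissimilar to everything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_gender(target, labeled, k=3):
    """Majority vote over the k labeled users most similar to `target`.

    `labeled` is a list of (profile_vector, gender) pairs. Both the
    vector representation and the voting rule are assumptions made
    for illustration only.
    """
    ranked = sorted(labeled, key=lambda item: cosine(target, item[0]), reverse=True)
    votes = Counter(gender for _, gender in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): each vector is a user's share of three visual labels.
labeled_users = [
    ([0.7, 0.2, 0.1], "F"),
    ([0.6, 0.3, 0.1], "F"),
    ([0.1, 0.2, 0.7], "M"),
    ([0.2, 0.1, 0.7], "M"),
]

print(predict_gender([0.65, 0.25, 0.10], labeled_users, k=3))
```

In a real setting the profile vectors would come from an image feature pipeline, and the choice of k and of the similarity measure would need to be validated on labeled data.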
Index Terms
An Analytic System for User Gender Identification through User Shared Images