Abstract
Sexism, an injustice that subjects women and girls to enormous suffering, manifests in blatant as well as subtle ways. In the wake of growing documentation of experiences of sexism on the web, the automatic categorization of accounts of sexism has the potential to assist social scientists and policymakers in studying and thereby countering sexism. The existing work on sexism classification is limited in the categories of sexism it uses and/or in whether it allows those categories to co-occur. To the best of our knowledge, this is the first work on the multi-label classification of sexism of any kind(s). We also consider the related task of misogyny classification. While sexism classification is performed on textual accounts describing sexism suffered or observed, misogyny classification is carried out on tweets perpetrating misogyny. We devise a novel neural framework for classifying sexism and misogyny that can combine text representations obtained using models such as Bidirectional Encoder Representations from Transformers (BERT) with distributional and linguistic word embeddings, using a flexible architecture involving recurrent components and optional convolutional ones. Further, we leverage unlabeled accounts of sexism to infuse domain-specific elements into our framework. To evaluate the versatility of our neural approach for tasks pertaining to sexism and misogyny, we experiment with adapting it for misogyny identification. For categorizing sexism, we investigate multiple loss functions and problem transformation techniques to address the multi-label problem formulation. We develop an ensemble approach using a proposed multi-label classification model with potentially overlapping subsets of the category set. The proposed methods outperform several deep-learning as well as traditional machine learning baselines for all three tasks.
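The abstract does not say which loss functions or problem transformation techniques were investigated. As a rough illustration of what a multi-label formulation involves, the sketch below shows three standard building blocks from the multi-label learning literature: a per-label binary cross-entropy loss, the binary relevance transformation, and the label powerset transformation. These are assumptions chosen for illustration, not the authors' method, and the category names `a`, `b`, and `c` in the usage note are hypothetical placeholders rather than categories from the paper.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def binary_cross_entropy(probs: Sequence[float], targets: Sequence[int],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over independent per-label probabilities.

    Each label is scored on its own, so several labels can be active at once.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def binary_relevance(label_sets: Sequence[Set[str]],
                     categories: Sequence[str]) -> Dict[str, List[int]]:
    """Binary relevance: one independent 0/1 target list per category."""
    return {c: [1 if c in labels else 0 for labels in label_sets]
            for c in categories}


def label_powerset(label_sets: Sequence[Set[str]]
                   ) -> Tuple[List[int], Dict[Tuple[str, ...], int]]:
    """Label powerset: each distinct label combination becomes one class id."""
    class_of: Dict[Tuple[str, ...], int] = {}
    encoded: List[int] = []
    for labels in label_sets:
        key = tuple(sorted(labels))
        if key not in class_of:
            class_of[key] = len(class_of)  # ids assigned in order of first appearance
        encoded.append(class_of[key])
    return encoded, class_of
```

For example, `binary_relevance([{"a", "b"}, {"b"}], ["a", "b", "c"])` yields one target list per placeholder category, while `label_powerset` on the same input treats `{"a", "b"}` and `{"b"}` as two distinct single-label classes. Binary relevance ignores co-occurrence between categories; label powerset captures it but can produce many sparsely populated classes.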
Index Terms
Categorizing Sexism and Misogyny through Neural Approaches