Abstract
Fake news stories can polarize society, particularly during political events, and they undermine confidence in the media in general. Current NLP systems still lack the ability to properly interpret and classify Arabic fake news. Given the high stakes involved, determining truth in social media has recently become an emerging research area that is attracting tremendous attention. Our literature review indicates that applying state-of-the-art approaches to news content addresses some of the challenges in detecting the characteristics of fake news, but auxiliary information is needed to make a clear determination. Moreover, ‘social-context-based’ and ‘propagation-based’ approaches can serve as either alternative or complementary strategies to content-based approaches. The main goal of our research is to develop a model capable of automatically determining the veracity of a given Arabic news item or claim. In particular, we propose a deep neural network approach that classifies news claims as fake or real by exploiting Convolutional Neural Networks. Our approach addresses the problem from the fact-checking perspective, where the fact-checking task involves predicting whether a given news text claim is factually authentic or fake. We opt to use a balanced Arabic corpus to build our model because it unifies stance detection, stance rationale, relevant document retrieval, and fact-checking. The model is trained on different well-selected attributes. An extensive evaluation has been conducted to demonstrate the ability of the fact-checking task to detect Arabic fake news. Our model outperforms state-of-the-art approaches applied to the same Arabic dataset, achieving the highest accuracy of 91%.
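To make the idea of a convolutional classifier over claim text concrete, the following is a minimal sketch of a forward pass only, written in plain Python. It is not the authors' model: the abstract does not specify the architecture, so the filter widths, embedding size, random weights, and the function names `conv1d_max_pool` and `cnn_forward` are all assumptions made for illustration. It shows the general pattern of sliding filters over token embeddings, max-over-time pooling, and a sigmoid output.

```python
import math
import random


def conv1d_max_pool(embeddings, filt, bias):
    """Slide one filter over consecutive token windows, apply ReLU, keep the max."""
    width = len(filt)   # number of tokens the filter spans
    best = 0.0          # ReLU output is never below zero
    for start in range(len(embeddings) - width + 1):
        total = bias
        for offset in range(width):
            token_vec = embeddings[start + offset]
            total += sum(w * x for w, x in zip(filt[offset], token_vec))
        best = max(best, max(0.0, total))
    return best


def cnn_forward(embeddings, filters, biases, out_weights, out_bias):
    """Map pooled filter features to a probability with a sigmoid output unit."""
    features = [conv1d_max_pool(embeddings, f, b) for f, b in zip(filters, biases)]
    logit = out_bias + sum(w * x for w, x in zip(out_weights, features))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy input: 6 tokens with 4-dimensional embeddings.
# In practice these would come from pretrained Arabic word vectors.
random.seed(0)
dim, n_tokens = 4, 6
claim = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_tokens)]

# Two untrained filters spanning 2 and 3 tokens (assumed widths).
filters = [
    [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(width)]
    for width in (2, 3)
]
biases = [0.0, 0.0]
out_weights = [random.uniform(-0.5, 0.5) for _ in filters]

prob_fake = cnn_forward(claim, filters, biases, out_weights, 0.0)
print(f"P(fake) = {prob_fake:.3f}")
```

Because the weights are random, the printed probability carries no meaning; a real system would learn the filters and output weights from labeled claims using a deep learning framework.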
Index Terms
Arabic Fake News Detection: A Fact Checking Based Deep Learning Approach