Abstract
The rapid growth of misinformation in news articles poses a potential threat to society, and the detection of misinformation in news data has therefore become an active research area. Annotating and detecting distorted sentences in news articles is an immediate need in this research direction. Therefore, an attempt has been made to formulate a legitimacy annotation guideline, followed by the annotation and detection of legitimacy in Bengali e-papers. The sentence-level manual annotation of Bengali news has been carried out at two levels, namely “Level-1 Shallow Level Classification” and “Level-2 Deep Level Classification,” based on the semantic properties of Bengali sentences. A total of 1,300 anonymous Bengali e-paper sentences have been tagged at both levels using the tags defined in the guideline. The annotation guideline has been validated by applying benchmark supervised machine learning algorithms with lexical, syntactic, domain-specific, and Level-2-specific features at both levels. The classifiers are evaluated in terms of Accuracy, Precision, Recall, and F-Measure. At both levels, the Support Vector Machine outperforms the other benchmark classifiers, with an accuracy of 72% at Level-1 and 65% at Level-2.
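As a rough illustration of the four evaluation measures named above, the following minimal sketch computes Accuracy and macro-averaged Precision, Recall, and F-Measure from gold and predicted sentence tags. The function name `evaluate`, the tag names `legit` and `distorted`, and the choice of macro averaging are assumptions made for this example; the abstract does not state the tag set or the averaging scheme used in the paper.

```python
from collections import Counter


def evaluate(gold, pred):
    """Return accuracy and macro-averaged precision, recall, and F-measure.

    `gold` and `pred` are equal-length lists of class labels.
    Macro averaging is an assumption made for this sketch.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    precisions, recalls, f_measures = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_measure)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_measure": sum(f_measures) / n,
    }


# Hypothetical tags for four sentences (not taken from the paper's data).
gold_tags = ["legit", "legit", "distorted", "distorted"]
pred_tags = ["legit", "distorted", "distorted", "distorted"]
scores = evaluate(gold_tags, pred_tags)
print(scores)
```

Macro averaging weights every tag equally regardless of how many sentences carry it, which can matter when a tag set is imbalanced; a weighted or micro average would be an equally plausible reading of the reported figures.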