On the Usage of a Classical Arabic Corpus as a Language Resource: Related Research and Key Challenges

Published: 09 January 2019

Abstract

This article presents a literature review of computer science research applied to hadith, a kind of Arabic narration that appeared in the 7th century. We study and compare existing work in several fields of Natural Language Processing (NLP), Information Retrieval (IR), and Knowledge Extraction (KE). We thereby elicit the main drawbacks of this work and identify some perspectives that the research community may consider. We also study the characteristics of these documents by enumerating the advantages and limits of using hadith as a language resource. Moreover, our study shows that previous studies used different collections of hadiths, which makes it hard to compare their results objectively. In addition, many preprocessing steps recur across these applications, which wastes considerable time. Consequently, we discuss the key issues in building generic language resources from hadiths, taking into account the relevance of the related literature and the wide community of researchers interested in these narrations. The ultimate goal is to structure hadith books for multiple usages, thus building common collections that may be exploited in future applications.

References

  1. N. S. Abdul Karim and N. R. Hazmi. 2005. Assessing Islamic information quality on the Internet: A case of information about hadith. Malaysian Journal of Library 8 Information Science 10, 2 (2005), 51--61.Google ScholarGoogle Scholar
  2. J. Adams, H. T. A. Khan, and R. Raeside. 2007. Research Methods for Graduate Business and Social Science Students. SAGE Publications, New Delhi, India, 56.Google ScholarGoogle Scholar
  3. M. A. Ahmad. 2013. Towards the Analysis of Narrative Networks. Tech. Rep. number13-017, Department of Computer Science and Engineering, University of Minnesota. Retrieved from https://www.cs.umn.edu/sites/cs.umn.edu/files/tech_reports/13-017.pdf.Google ScholarGoogle Scholar
  4. K. A. Aldhaln, A. M. Zeki, and A. M. Zeki. 2011. Encyclopedias of hadith software: The current status and future view. In Proceedings of the 3rd National Information Technology Symposium (NITS), Riyadh, Saudi Arabia.Google ScholarGoogle Scholar
  5. A. A. Al-Echikh. 1998. Encyclopedia of the Six Major Citation Collections. Dar-esselem, Kingdom of Saudi Arabia, Ryadh.Google ScholarGoogle Scholar
  6. S. AlGahtani, W. Black, and J. McNaught. 2009. Arabic part-of-speech tagging using transformation-based learning. In Proceedings of the 2nd International Conference on Arabic Language Resources and Tools. Cairo, Egypt, 66--70.Google ScholarGoogle Scholar
  7. M. Alhawarat. 2015. A domain-based approach to extract Arabic person names using n-grams and simple rules. Asian Journal of Information Technology 14, 8 (2015), 287--293.Google ScholarGoogle Scholar
  8. M. Al-Humaid. 2000. The Similitudes of the Quran and Hadith: A Comparative Study. Ph.D. dissertation. University of Birmingham, UK.Google ScholarGoogle Scholar
  9. M. Al-Jam'aan. 2014. Jawami= Al-Kalim software: Presentation and criticism. International Journal of Islamic Applications in Computer Science and Technology 2, 3 (2014), 22--33.Google ScholarGoogle Scholar
  10. M. Alkhatib, A. A. Monem, and K. Shaalan. 2017. A rich Arabic wordnet resource for Al-Hadith Al-Shareef. Procedia Computer Science 117 (2017), 101--110.Google ScholarGoogle ScholarCross RefCross Ref
  11. H. A. Al-Muhtaseb, S. A. Mahmoud, and R. S. Qahwahi. 2009. A novel minimal script for Arabic text recognition databases and benchmarks. International Journal of Circuits, Systems and Signal Processing 3, 3 (2009), 145--153.Google ScholarGoogle Scholar
  12. A. F. Al-Mukhtar and H. M. Al-Razzo. 2014. Exploiting the A priori algorithm for mining famous hadith narrators in the chains of prophetic hadiths {in Arabic}. International Journal of Islamic Applications in Computer Science and Technology 2, 1 (2014), 26--39. (In Arabic.)Google ScholarGoogle Scholar
  13. K. S. Aloufi. 2011. Diacritic oriented Arabic information retrieval system. International Journal of Computer Science and Security 5, 1 (2011), 143--155.Google ScholarGoogle Scholar
  14. M. Alrabiah and A. M. S. Al-Salman. 2013. The design and construction of the 50 million words KSUCCA King Saud University Corpus of Classical Arabic. In Proceedings of the 2nd Workshop on Arabic Corpus Linguistics, E. Atwell and A. Hardie (Eds.). Lancaster University, UK.Google ScholarGoogle Scholar
  15. H. M. Al-Razzo. 2008. Data mining applications on Islamic knowledge resources. Retrieved from http://www.alukah.net/Culture/0/3123/. (In Arabic).Google ScholarGoogle Scholar
  16. H. M. Al-Razzo and A. F. Al-Mukhtar. 2014. Computer-based mining of Imam Muslim's narrations. International Journal of Islamic Applications in Computer Science and Technology 2, 2 (2014), 29--37. (In Arabic).Google ScholarGoogle Scholar
  17. A. Al-Rjoub. 2007. A New Approach for Arabic Root Extraction, MSc Thesis. Department of Computer Science, Jordan University of Science and Technology, Irbid, Jordan.Google ScholarGoogle Scholar
  18. A. Al-Rumkhani, M. Al-Razgan, and A. Al-Faris. 2016. TibbOnto: Knowledge representation of prophet medicine (Tibb Al-Nabawi). Procedia Computer Science 82 (2016), 138--142.Google ScholarGoogle ScholarCross RefCross Ref
  19. H. A. Al-Sanasleh and B. H. Hammo. 2017. Building domain ontology: Experiences in developing the prophetic ontology form Quran and hadith. In Proceedings of the International Conference on New Trends in Computing Sciences (ICTCS’17). 223--228.Google ScholarGoogle Scholar
  20. E. Al-Shawakfa, A. Al-Badarneh, S. Shantawi, K. Al-Rabab‘ah, and B. Bani-Ismail. 2010. A comparison study of some Arabic root finding algorithms. Journal of the American Society for Information Science 8 Technology 61, 5 (2010), 1015--1024. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. O. A. Al-Thubaity. 2014. 700M+ Arabic corpus: KACST Arabic corpus design and construction. Language Resources and Evaluation 49, 3 (2014), 721--751. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. M. Al-Turki, M. Al-Turki, I. Al-Riss, and I. Al-Nimmi. 2014. Jamii= al-sunna al-nabawia software. International Journal of Islamic Applications in Computer Science and Technology 2, 2 (2014), 1--28. (In Arabic.)Google ScholarGoogle Scholar
  23. E. Atwell, C. Brierley, K. Dukes, M. Sawalha, and A. B. Sharaf. 2011. An artificial intelligence approach to Arabic and Islamic content on the Internet. In Proceedings of the 3rd National Information Technology Symposium (NITS).Google ScholarGoogle Scholar
  24. R. Ayed, O. Ben Khiroun, I. Bounhas, B. Elayeb, and Y. Slimani. 2014a. Kunuz: A standard Arabic collection for information retrieval. In Proceedings of the 6th National Conference [email protected], A. Ben Hamadou et al. (Eds.). (In Arabic.)Google ScholarGoogle Scholar
  25. R. Ayed, I. Bounhas, B. Elayeb, N. Bellamine Ben Saoud, and F. Evrard. 2014b. Evaluation d'une approche possibiliste pour la désambiguısation des textes arabes. In Proceedings of TALN’2014 -- Traitement Automatique des Langues Naturelles. 316--327.Google ScholarGoogle Scholar
  26. R. Ayed, I. Bounhas, B. Elayeb, N. Bellamine Ben Saoud, and F. Evrard. 2014c. Improving Arabic texts morphological disambiguation using a possibilistic classifier. In Proceedings of the 19th International Conference on Applications of Natural Language to Information Systems (NLDB=14). 138--147.Google ScholarGoogle Scholar
  27. R. Ayed, I. Bounhas, B. Elayeb, F. Evrard, and N. Bellamine Ben Saoud. 2012a. A possibilistic approach for the automatic morphological disambiguation of Arabic texts. In Proceedings of the 13th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD’12). IEEE Computer Society, 187--194. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. R. Ayed, I. Bounhas, B. Elayeb, F. Evrard, and N. Bellamine Ben Saoud. 2012b. Arabic morphological analysis and disambiguation using a possibilistic classifier. In Proceedings of the 8th International Conference on Intelligent Computing (ICIC’12). Lecture Notes in Artificial Intelligence, vol. 7390. Springer-Verlag, Berlin, 274--279. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. R. Ayed. 2018. Désambiguïsation Morphologique de Textes Arabes à Base de Classification Possibiliste pour la Recherche d'Information Socio-Sémantique. Ph.D. dissertation, National School of Computer Sciences, La Manouba, Tunisia.Google ScholarGoogle Scholar
  30. M. Azami. 1978. Studies in Hadith Methodology and Literature. American Trust Publications.Google ScholarGoogle Scholar
  31. A. Azmi and N. AlBadia. 2012. Mining and visualizing the narration tree of hadiths (prophetic traditions). In Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches, C. Boonthum-Denecke, P. M. McCarthy and T. Lamkin (Eds.). 239--257. IGI Global, Hershey, PA.Google ScholarGoogle Scholar
  32. A. Azmi, F. Alkhalifah, A. Alsaeed, and A. Barnawi. 2014. Using non-conventional search schemes to retrieve Hadiths. In Proceedings of the 5th International Conference on Arabic Language Processing (CITALA’14).Google ScholarGoogle Scholar
  33. A. Azmi and A. M. AlOfaidly. 2014. A novel method to automatically pass hukm on Hadith. In Proceedings of the 5th International Conference on Arabic Language Processing (CITALA’14). 118--124.Google ScholarGoogle Scholar
  34. A. Azmi and N. Bin Badia. 2006. E-narrator - An application for creating an ontology of hadiths narration tree semantically and graphically. Arabian Journal for Science and Engineering 31, 2C (2006), 51--68.Google ScholarGoogle Scholar
  35. A. Azmi and N. Bin Badia. 2010. iTree -- automating the construction of the narration tree of hadiths (prophetic traditions). In Proceedings of the 6th IEEE Conference on Natural Language Processing 8 Knowledge Engineering.Google ScholarGoogle Scholar
  36. S. S. Balgasem and L. Q. Zakaria. 2017. A hybrid method of rule-based approach and statistical measures for recognizing narrators name in hadith. In Proceedings of the 6th International Conference on Electrical Engineering and Informatics (ICEEI’17). 1--5.Google ScholarGoogle Scholar
  37. S. Baqai, A. Basharat, H. Khalid, A. Hassan, and S. Zafar. 2009. Leveraging semantic web technologies for standardized knowledge modeling and retrieval from the Holy Qur'an and religious texts. In Proceedings of the 7th International Conference on Frontiers of Information Technology. Abbottabad, Pakistan, 42--47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. R. Baradaran and B. Mineai-Bidgoli. 2015. Event extraction from classical Arabic texts. International Arab Journal of Information Technology 12, 5 (2015), 494--502.Google ScholarGoogle Scholar
  39. R. S. Baraka and Y. M. Dalloul. 2014. Building hadith ontology to support the authenticity of isnad. International Journal on Islamic Applications in Computer Science and Technology 2, 1 (2014), 25--39.Google ScholarGoogle Scholar
  40. A. Basharat. 2016. Semantics Driven Human-Machine Computation Framework for Linked Islamic Knowledge Engineering. International Semantic Web Conference. Springer International Publishing, Kobe, Japan, 793--802. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. A. Basharat, B. Abro, I. B. Arpinar, and K. Rasheed. 2016. Semantic hadith: Leveraging linked data opportunities for Islamic knowledge. In Proceedings of the Workshop on Linked Data on the Web.Google ScholarGoogle Scholar
  42. A. Basharat, K. Rasheed, and I. B. Arpinar. 2015. Towards linked open Islamic knowledge using human computation and crowdsourcing. In Proceedings of the International Conference on Islamic Applications in Computer Science and Technology. Konya, Turkey.Google ScholarGoogle Scholar
  43. M. Ben Aouicha. 2009. Une Approche Algébrique Pour La Recherche D'information Structurée. Ph.D. Dissertation, Paul Sabatier University, Toulouse, France.Google ScholarGoogle Scholar
  44. S. Ben Guirat, I. Bounhas, and Y. Slimani. 2016. Combining indexing units for Arabic information retrieval. International Journal of Software Innovation 4, 4 (2016), 1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. O. Ben Khiroun, R. Ayed, B. Elayeb, I. Bounhas, N. Ben Saoud, and F. Evrard. 2014. Towards a new standard Arabic test collection for mono- and cross-language information retrieval. In Proceedings of the 19 th International Conference on Applications of Natural Language to Information Systems (NLDB=14). 168--171.Google ScholarGoogle Scholar
  46. M. A. Bidhendi, B. Minaei-Bidgoli, and H. Jouzi. 2012. Extracting person names from ancient Islamic Arabic texts. In LRE-Rel: Pre-Conference Workshop in Language Resource and Evaluation for Religious Texts, The 8th International Conference on Language Resources and Evaluation (LREC). 1--6.Google ScholarGoogle Scholar
  47. A. Bies, D. DiPersio, and M. Maamouri. 2012. Linguistic resources for Arabic machine translation. In Challenges for Arabic Machine Translation, A. Soudi, A. Farghaly, G. Neumann and R. Zbib (Eds.). John Benjamins, Amsterdam, The Netherlands, 15--22.Google ScholarGoogle Scholar
  48. M. Boella. 2011a. Regular expressions for interpreting and cross-referencing Hadith texts. Langues et Littératures du Monde Arabe (LLMA) 9, 3 (2011a), 25--39.Google ScholarGoogle Scholar
  49. M. Boella. 2011b. Reading a text, finding a database: An anachronistic interpretation of hadiths in light of information science. Rivista Degli Studi Orientali 84, 1/4 (2011b), 439--448.Google ScholarGoogle Scholar
  50. M. Boella, F. R. Romani, A. Al-Raies, C. Solimando, and G. Lancioni. 2011. The SALAH project: Segmentation and linguistic analysis of ḥadīṯ Arabic texts. In Information Retrieval Technology. Lecture Notes in Computer Science, vol. 7097. Springer, Berlin, 538--549. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. M. Boudchiche, A. Mazroui, M. Ould Abdallahi Ould Bebah, A. Lakhouaja, and A. Boudlal. 2016. Alkhalil morpho sys 2: A robust Arabic morpho-syntactic analyzer. Journal of King Saud University -- Computer and Information Sciences 29, 2 (2016), 141--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. I. Bounhas, B. Elayeb, F. Evrard, and Y. Slimani. 2015a. Information reliability evaluation: From Arabic storytelling to computer sciences. ACM Journal on Computing and Cultural Heritage (JOCCH) 8, 3 (2015b), 14--46. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. I. Bounhas, R. Ayed, B. Elayeb, F. Evrard, and N. Bellamine Ben Saoud. 2015b. Experimenting a discriminative possibilistic classifier with reweighting model for Arabic morphological disambiguation. Computer Speech and Language 33 (2015a), 67--87. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. I. Bounhas, R. Ayed, B. Elayeb, F. Evrard, and Bellamine Ben Saoud N. 2015c. A hybrid possibilistic approach for Arabic full morphological disambiguation. Computer Data 8 Knowledge Engineering 44, 1 (2015c), 91--126.Google ScholarGoogle Scholar
  55. I. Bounhas, B. Elayeb, F. Evrard, and Y. Slimani. 2011a. Organizing contextual knowledge for Arabic text disambiguation and terminology extraction. Knowledge Organization (KO) 38 (2011a), 473--490.Google ScholarGoogle Scholar
  56. I. Bounhas, B. Elayeb, F. Evrard, and Y. Slimani. 2011b. ArabOnto: Experimenting a new distributional approach for building Arabic ontological resources. International Journal of Metadata, Semantics and Ontologies (IJMSO) 6 (2011b), 81--95. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. I. Bounhas, B. Elayeb, F. Evrard, and Y. Slimani. 2010. Toward a computer study of the reliability of Arabic stories. Journal of the American Society for Information Science and Technology 61, 8 (2010), 1686--1705. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. I. Bounhas and Y. Slimani. 2010a. Désambiguisation de textes Arabes pour l'extraction des syntagmes nominaux: l'apport de la structure des documents. In Actes du 10 ème Colloque Africain sur la Recherche en Informatique et en Mathématiques Appliquées. 93--100.Google ScholarGoogle Scholar
  59. I. Bounhas and Y. Slimani. 2010b. Toward a generic approach for analyzing and representing Arabic document for the socio-semantic Web. In Proceedings of the International Computing Conference in Arabic. 197--210.Google ScholarGoogle Scholar
  60. I. Bounhas and Y. Slimani. 2009. A social approach for semi-structured document modeling and analysis. In Proceedings of the International Conference on Knowledge Management and Information Sharing (KMIS). 95--102.Google ScholarGoogle Scholar
  61. I. Bounhas. 2012. Construction Et Intégration D'ontologies Pour La Cartographie Socio-sémantique De Fonds Documentaires Arabes Guidée Par La Fiabilité De L'information. Ph.D. Dissertation, Tunis El Manar University, Tunisia.Google ScholarGoogle Scholar
  62. J. A. C. Brown. 2009. Hadith: Muhammad's Legacy in the Medieval and Modern World. Cambridge University Press/Oneworld Publications, London, England.Google ScholarGoogle Scholar
  63. T. Buckwalter. 2004. Buckwalter Arabic Morphological Analyzer Version 2.0. Linguistic Data Consortium (LDC), catalog number LDC2002L49, ISBN 1-58563-257-0. https://catalog.ldc.upenn.edu/ldc2004l02. Last access: October 19, 2017.Google ScholarGoogle Scholar
  64. Y. M. Dalloul. 2013. An Ontology-Based Approach to Support the Process of Judging Hadith Isnad. MSc Thesis, Islamic University of Gaza, Palestine.Google ScholarGoogle Scholar
  65. K. Darwish, W. Arafa, and M. I. Eldesouki. 2009. Stemming techniques of Arabic language: Comparative study from the information retrieval perspective. Egyptian Computer Journal 36, 1 (2009), 30--49.Google ScholarGoogle Scholar
  66. A. Dastani, B. Minaei-Bidgoli, M. R. Vafaei, and H. Jouzi. 2012. An Introduction to Noor Diacritized Corpus. In LRE-Rel: Pre-conference Workshop in Language Resource and Evaluation for Religious Texts. The Eighth International Conference on Language Resources and Evaluation (LREC). Istanbul, Turkey.Google ScholarGoogle Scholar
  67. K. Dukes. 2013. Statistical Parsing by Machine Learning from a Classical Arabic Treebank. Ph.D. Thesis. University of Leeds, UK.Google ScholarGoogle Scholar
  68. K. Dukes, E. Atwell, and N. Habash. 2013. Supervised collaboration for syntactic annotation of Quranic Arabic. Language Resources and Evaluation 47, 1 (2013), 33--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  69. B. Elayeb and I. Bounhas. 2016. Arabic cross-language information retrieval: A review. ACM Transaction on Asian and Low-Resource Language Information Processing (ACM-TALLIP) 15, 3, Article 18 (2016), 44. Google ScholarGoogle ScholarDigital LibraryDigital Library
  70. N. Fabil, Z. Ismail, Z. Shukur, S. A. Noah, and J. Salim. 2012. Information visualization (IV) application for information acquisition based on visual perception. Creative Education 3, 8B (2012), 86--89.Google ScholarGoogle ScholarCross RefCross Ref
  71. K. Faidi, R. Ayed, I. Bounhas, and B. Elayeb. 2014. Comparing Arabic NLP tools for hadith classification. In Proceedings of the 2nd International Conference on Islamic Applications in Computer Science and Technologies (IMAN).Google ScholarGoogle Scholar
  72. M. Ghanem, A. Mouloudi, and M. Mourchid. 2015. Creation and populating of an Islamic knowledge ontology using extraction pattern bootstrapping. In Proceedings of the 3rd National Day on Engineering, Networks and Telecommunications (NDENT). 36--39.Google ScholarGoogle Scholar
  73. D. Graff, M. Maamouri, B. Bouziri, S. Krouna, S. Kulick, and T. Buckwalter. 2009. Standard Arabic Morphological Analyzer (SAMA) Version 3.1. Linguistic Data Consortium (LDC), Catalog number LDC2009E73. Retrieved from https://catalog.ldc.upenn.edu/LDC2010L01.Google ScholarGoogle Scholar
  74. D. Graff and K. Walker. 2001. Arabic Newswire Part 1 Corpus (1-58563-190-6). Linguistic Data Consortium (LDC), Catalog number LDC2001T55. Retrieved from https://catalog.ldc.upenn.edu/LDC2001T55.Google ScholarGoogle Scholar
  75. A. Guillaume. 2003. Traditions of Islam: An Introduction to the Study of the Hadith Literature. Whitefish: Kessinger Publishing.Google ScholarGoogle Scholar
  76. N. Habash, O. Rambow, and R. Roth. 2009. MADA + TOKAN: A toolkit for Arabic tokenization, diacritization, morphological disambiguation, POS tagging, stemming and lemmatization. In Proceedings of the 2nd International Conference on Arabic Language Resources and Tools (MEDAR). 102--109.Google ScholarGoogle Scholar
  77. H. Hamam, M. T. B. Othman, A. Kilani, and M. Ben Othman. 2015. Data mining in Sciences of the prophet's tradition in general and in impeachment and amendment in particular. International Journal on Islamic Applications in Computer Science 8 Technology 3 (2015), 9--16.Google ScholarGoogle Scholar
  78. F. Harrag. 2011. Une Approche de Fouille des Textes Basée Sur La Classification et La Segmentation Thématique: Application Au Corpus Des Traditions Prophétiques Hadith. Ph.D. Dissertation, Ferhat Abbas University, Algeria.Google ScholarGoogle Scholar
  79. F. Harrag. 2014. Text mining approach for knowledge extraction in Sahîh Al-Bukhari. Computers in Human Behavior 30 (2014), 558--566. Google ScholarGoogle ScholarDigital LibraryDigital Library
  80. F. Harrag and A. Hamdi-Cherif. 2008. UML modeling of text mining in Arabic language and application to prophetic traditions ‘Hadith’. In Proceedings of the 1st International. Symposium on Computers and Arabic Language.Google ScholarGoogle Scholar
  81. F. Harrag, A. Hamdi-Cherif, and E. El-Qawasmah. 2008. Information retrieval architecture for hadith text mining. Journal of Digital Information Management 6, 6 (2008), 449--455.Google ScholarGoogle Scholar
  82. F. Harrag, A. Hamdi-Cherif, A. M. S. Al-Salman, and E. El-Qawasmeh. 2009. Experiments in improvement of Arabic information retrieval. In Proceedings of the 3rd International Conference on Arabic Language Processing (CITALA’09).Google ScholarGoogle Scholar
  83. F. Harrag, E. El-Qawasmeh, and A. M. S. Al-Salman. 2011a. Extracting named entities from prophetic narration texts (hadith). Software Engineering and Computer Systems 80 (2011a), 289--297.Google ScholarGoogle Scholar
  84. F. Harrag, A. Hamdi-Cherif, A. M. S. Al-Salman, and E. El-Qawasmah. 2011b. Evaluating the effectiveness of VSM model and topic segmentation in retrieving Arabic documents. International Journal of Computer Systems Science and Engineering 26, 1 (2011b).Google ScholarGoogle Scholar
  85. F. Harrag, A. Alothaim, A. Abanmy, F. Alomaigan, and S. Alsalehi. 2013. Ontology extraction approach for prophetic narration (hadith) using association rules. International Journal on Islamic Applications in Computer Science and Technology 1, 2 (2013), 48--57.Google ScholarGoogle Scholar
  86. Z. Harris. 1968. Mathematical Structures of Language. John Wiley 8 Sons, New York.Google ScholarGoogle Scholar
  87. A. H. Hassan. 2014. Hadith isnad study and criticism based on computer applications. International Journal of Islamic Applications in Computer Science and Technology 2, 2 (2014), 38--48. (In Arabic.)Google ScholarGoogle Scholar
  88. S. M. O. Hassan and E. S. Atwell. 2016a. Design requirements for multilingual hadith corpus. International Journal of Science and Research 5, 4 (2016a), 494--498.Google ScholarGoogle Scholar
  89. S. M. O. Hassan and E. S. Atwell. 2016b. Design and implementing of multilingual hadith corpus. International Journal of Recent Research in Social Sciences and Humanities 3, 2 (2016b), 100--104.Google ScholarGoogle Scholar
  90. S. M. O. Hassan and E. S. Atwell. 2016c. Concept search tool for multilingual hadith corpus. International Journal of Science and Research 5, 4 (2016c), 1326--1328.Google ScholarGoogle Scholar
  91. M. Hyder and S. Ghazanfer. 2008. Towards a database oriented hadith research using relational, algorithmic and data-warehousing techniques. The Islamic Culture, Quarterly Journal of Shaikh Zayed Islamic Center for Islamic and Arabic Studies 19 (2008), 14--19.Google ScholarGoogle Scholar
  92. A. Jaber and F. A. Zaraket. 2017. MERF: Morphology-based Entity and Relational Entity Extraction Framework for Arabic. CoRR abs/1709.05700.Google ScholarGoogle Scholar
  93. M. Javed, Z. Shafiq, S. Saeed, and A. Javed. 2011. Utilizing domain ontologies for semantically enriched knowledge management. International Journal of Recent Trends in Engineering and Technology 6, 1 (2011), 7--10.Google ScholarGoogle Scholar
  94. H. Jouzi, A. R. Zadeh, E. Baraty, and B. Minaei-Bidgoli. 2012. A new framework for detecting similar texts in Islamic Hadith Corpora. In LRE-Rel: Pre-conference Workshop in Language Resource and Evaluation for Religious Texts, The 8th International Conference on Language Resources and Evaluation (LREC). 38--41.Google ScholarGoogle Scholar
  95. S. Khoja. 2001. Khoja's Arabic Stemmer (version 1.0). London, UK. http://zeus.cs.pacificu.edu/shereen/research.htm#stemming. Last access: October 23, 2017.Google ScholarGoogle Scholar
  96. W. Lahbib, I. Bounhas, B. Elayeb, F. Evrard, and Y. Slimani. 2013. A hybrid approach for Arabic semantic relation extraction. In Proceedings of the 26th International Florida Artificial Intelligence Research Society Conference (FLAIRS=13).Google ScholarGoogle Scholar
  97. W. Lahbib, I. Bounhas, and B. Elayeb. 2014. Arabic-English domain terminology extraction from aligned corpora. In Proceedings of the 13th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE). Lecture Notes in Computer Science, vol. 8841. Springer, Berlin, 745--759.Google ScholarGoogle Scholar
  98. W. Lahbib, I. Bounhas, and Y. Slimani. 2015. Arabic terminology extraction and enrichment based on domain-specific text mining. In Proceedings of the 27th International Conference on Tools with Artificial Intelligence. Google ScholarGoogle ScholarDigital LibraryDigital Library
  99. W. Lahbib, I. Bounhas, and Y. Slimani. 2018. A possibilistic approach for Arabic domain terminology extraction and translation. In Proceedings of the 32nd International Symposium on Computer and Information Sciences.Google ScholarGoogle Scholar
  100. G. Lancioni and M. Boella. 2012. Idiomatic MWEs and machine translation, a retrieval and representation model: The AraMWE project. In Proceedings of the 4th Workshop on Computational Approaches to Arabic Script-Based Languages.Google ScholarGoogle Scholar
  101. M. Maamouri and A. Bies. 2004. Developing an Arabic treebank: Methods, guidelines, procedures, and tools. In Proceedings of the Workshop on Computational Approaches to Arabic Script-Based Languages (Semitic=04). Association for Computational Linguistics, Geneva, Switzerland, 2--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. A. Mahmood, H. U. Khan, F. K. Alarfaj, M. Ramzan, and M. Ilyas. 2018. A multilingual datasets repository of the hadith content. International Journal of Advanced Computer Science and Applications 9, 2 (2018), 165--172Google ScholarGoogle ScholarCross RefCross Ref
  103. M. Majdalawieh, F. Marir, and I. Tiemsani. 2017. Developing adaptive Islamic law business processes models for Islamic finance and banking by text mining the Holy Qur'an and hadith. In Proceedings of DataCom 2017 The 3rd IEEE International Conference on Big Data Intelligence and Computing, 1278--1283.Google ScholarGoogle Scholar
  104. J. Makhlouta, H. Harkous, and F. A. Zaraket. 2012. Arabic entity graph extraction using morphology, finite state machines, and graph transformations. In Proceedings of Computational Linguistics and Intelligent Text Processing (CICLING). Google ScholarGoogle ScholarDigital LibraryDigital Library
  105. J. Makhlouta and H. Harkous. 2010. AUBSarf: Compositional non-deterministic finite-state automata for Arabic morphological analysis. In Proceedings of the 9th Faculty of Engineering and Architecture Student Conference. American University of Beirut, Beirut, Lebanon.Google ScholarGoogle Scholar
  106. H. Maraoui, K. Haddar, and L. Romary. 2017a. Encoding prototype of Al-Hadith Al-Shareef in TEI. In Proceedings of the International Conference on Arabic Language Processing. 217--229.Google ScholarGoogle Scholar
  107. H. Maraoui, K. Haddar, and L. Romary. 2017b. Modeling of Al-Hadith Al-Shareef with TEI. In Proceedings of the International Conference on Engineering 8 MIS (ICEMIS).Google ScholarGoogle Scholar
  108. H. Maraoui, K. Haddar, and L. Romary. 2018. Segmentation tool for hadith corpus to generate TEI encoding. In Proceedings of the 4th International Conference on Advanced Intelligent Systems and Informatics (AISI’18).Google ScholarGoogle Scholar
  109. A. G. Martínez, T. Feige, and T. Eich. 2017. Clear-cut methodology for Arabic OCR and post-correction with low technical skilled annotators. In Proceedings of the 2nd International Conference on Digital Access to Textual Cultural Heritage. 67--70. Google ScholarGoogle ScholarDigital LibraryDigital Library
  110. M. M. Najeeb. 2015. Multi-agent system for hadith processing. International Journal of Software Engineering and Its Applications 9, 9 (2015), 153--166.Google ScholarGoogle ScholarCross RefCross Ref
  111. M. M. Najeeb. 2016a. XML database for hadith and narrators. American Journal of Applied Sciences 13, 1 (2016a), 55--63.Google ScholarGoogle ScholarCross RefCross Ref
  112. M. M. Najeeb. 2016b. Processing of “Hadith Isnad” based on hidden Markov model. International Journal of Engineering and Technology 6, 1 (2016b), 50--55.Google ScholarGoogle Scholar
  113. M. M. Najeeb, A. Abdelkader, and M. B. Al-Zghoul. 2014. Arabic natural language processing laboratory serving Islamic sciences. International Journal of Advanced Computer Science and Applications 5, 3 (2014), 114--117.Google ScholarGoogle Scholar
  114. M. M. Najeeb, A. Abdelkader, M. B. Al-Zghoul, and A. Osman. 2015. A lexicon for hadith science based on a corpus. International Journal of Computer Science and Information Technologies 6, 2 (2015), 1336--1340.Google ScholarGoogle Scholar
  115. R. Parker, D. Graff, K. Chen, J. Kong, and K. Maeda. 2009. Arabic Gigaword. In Linguistic Data Consortium (LDC), Catalog number LDC2002L49. Retrieved from https://catalog.ldc.upenn.edu/ldc2011t11.Google ScholarGoogle Scholar
  116. A. Pasha, M. Al-badrashiny, M. Diab, A. Kholy, R. El Eskander, N. Habash, M. Pooleery, O. Rambow, and R. M. Roth. 2014. MADAMIRA: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic. In Proceedings of the 9th Language Resources Evaluation Conference. 1094--1101.Google ScholarGoogle Scholar
  117. M. Pazienza, M. Pennacchiotti, and F. Zanzotto. 2005. Terminology extraction: An analysis of linguistic and statistical approaches. In Knowledge Mining Series: Studies in Fuzziness and Soft Computing, S. Sirmakessis (Eds.). Springer, Berlin, 255--279.Google ScholarGoogle Scholar
  118. P. Pecina and P. Schlesinger. 2006. Combining association measures for collocation extraction. In Proceedings of 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics (poster sessions). 651--658.
  119. N. A. Rahman, Z. A. Bakar, and T. M. T. Sembok. 2010. Query expansion using thesaurus in improving Malay Hadith retrieval system. In Proceedings of the IEEE International Symposium in Information Technology (ITSim) Vol. 3. 1404--1409.
  120. B. Sadeki. 2007. Narrative social structure: Anatomy of the hadith transmission network. Journal of Interdisciplinary History 38, 2 (2007), 328--329.
  121. A. R. Saeed and S. W. Jaffry. 2013. Information mining from Muslim scriptures. In Proceedings of the 4th Workshop on South and Southeast Asian NLP (WSSANLP). International Joint Conference on Natural Language Processing. 66--71.
  122. H. Sayoud and H. Hadjadj. 2017. Fusion based authorship attribution -- Application of comparison between the Quran and hadith. In Proceedings of the 6th International Conference on Arabic Language Processing.
  123. F. Shahzad. 2016. Development of innovative Islamic web applications. International Journal of Social, Behavioral, Educational, Economic, Business and Industrial Engineering 10, 9 (2016), 3067--3072.
  124. Z. Shukur, N. Fabil, J. Salim, and S. A. Noah. 2011. Visualization of the hadith chain of narrators. In Proceedings of the 2nd International Visual Informatics Conference (IVIC 2011), Visual Informatics: Sustaining Research and Innovations (Part II), Lecture Notes in Computer Science, vol. 7067. Springer, Berlin, 340--347.
  125. M. A. Siddiqui, M. E.-S. Saleh, and A. Bagais. 2013. Early results for named entity recognition in a hadith corpus. In Proceedings of WACL’2, 2nd Workshop on Arabic Corpus Linguistics, Lancaster University, UK.
  126. M. A. Siddiqui, M. E.-S. Saleh, and A. A. Bagais. 2014. Extraction and visualization of the chain of narrators from hadiths using named entity recognition and classification. International Journal of Computational Linguistics Research 5, 1 (2014), 14--25.
  127. N. Soudani, I. Bounhas, B. Elayeb, and Y. Slimani. 2014. Toward an Arabic ontology for Arabic word sense disambiguation based on normalized dictionaries. In Proceedings of the 13th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE). Lecture Notes in Computer Science, vol. 8841. Springer, Berlin, 655--658.
  128. Q. Ul Ain and A. Basharat. 2011. Ontology driven information extraction from the Holy Qur'an related documents. In Proceedings of the 26th IEEEP All Pakistan Students Research Seminar.
  129. Y. Xu and Z. Chen. 2006. Relevance judgment: What do information users consider beyond topicality? Journal of the American Society for Information Science and Technology 57, 7 (2006), 961--973.
  130. M. Zacklad, A. Bénel, L. Zaher, C. Lejeune, J. Cahier, and C. Zhou. 2007. Hypertopic: Une métasémiotique et un protocole pour le Web socio-sémantique. In Actes Des 18 ème Journées Francophones D'ingénierie Des Connaissances (IC’07). 217--228.
  131. F. A. Zaraket and J. Makhlouta. 2012. Arabic cross-document NLP for the hadith and biography literature. In Proceedings of FLAIRS (Florida Artificial Intelligence Research Society) Conference.
  132. E. Zelaci. 2014. The Translation of Metaphoric Expressions in the Holy Hadith into English. MSc Thesis, University Kasdi Merbah, Ouargla, Algeria.


        • Published in

          ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 18, Issue 3
          September 2019
          386 pages
          ISSN:2375-4699
          EISSN:2375-4702
          DOI:10.1145/3305347

          Copyright © 2019 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 9 January 2019
          • Accepted: 1 September 2018
          • Revised: 1 August 2018
          • Received: 1 March 2018