Abstract
The absence of publicly available, reusable test collections for Arabic question answering on the Holy Qur’an has impeded fair comparison of system performance in that domain. In this article, we introduce AyaTEC, a reusable test collection for verse-based question answering on the Holy Qur’an, which serves as a common experimental testbed for this task. AyaTEC includes 207 questions (with their corresponding 1,762 answers) covering 11 topic categories of the Holy Qur’an that target the information needs of both curious and skeptical users. To the best of our effort, the answers to the questions in AyaTEC (each represented as a sequence of verses) are exhaustive; that is, all Qur’anic verses that directly answer the questions were extracted and annotated. To facilitate the use of AyaTEC in evaluating systems designed for this task, we propose several evaluation measures that support the different types of questions and the nature of verse-based answers while integrating the concept of partial matching of answers into the evaluation.
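To make the idea of partial matching of verse-based answers concrete, the following is a minimal sketch only. It assumes an answer is a contiguous sequence of verses identified by hypothetical (chapter, verse) pairs, and it scores a predicted answer against a gold answer with a verse-overlap F1. The function names and the F1 choice are illustrative assumptions; the abstract does not specify the measures actually proposed for AyaTEC.

```python
from typing import Iterable, Set, Tuple

# Hypothetical verse identifier: (chapter number, verse number).
Verse = Tuple[int, int]


def expand(chapter: int, start: int, end: int) -> Set[Verse]:
    """Expand a contiguous verse range into a set of verse identifiers."""
    return {(chapter, v) for v in range(start, end + 1)}


def partial_match_f1(predicted: Set[Verse], gold: Set[Verse]) -> float:
    """Verse-overlap F1 between one predicted and one gold answer.

    Assumption: partial credit is proportional to shared verses.
    This is an illustrative choice, not the measure from the article.
    """
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def best_match(predicted: Set[Verse], gold_answers: Iterable[Set[Verse]]) -> float:
    """Score a predicted answer against its best-matching gold answer."""
    return max((partial_match_f1(predicted, g) for g in gold_answers), default=0.0)


if __name__ == "__main__":
    gold = [expand(2, 255, 256), expand(3, 1, 4)]
    predicted = expand(2, 255, 257)  # one extra verse beyond the first gold answer
    print(round(best_match(predicted, gold), 3))
```

Under this sketch, a system that returns a passage slightly longer or shorter than the gold answer still earns partial credit, whereas an exact-match criterion would score it zero.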
AyaTEC: Building a Reusable Verse-Based Test Collection for Arabic Question Answering on the Holy Qur’an