Zekr concordance of Quran

#Zekr concordance of Quran manual

Morphological Annotation of Quranic Arabic

Kais Dukes¹ and Nizar Habash²
¹School of Computing, University of Leeds, LS2 9JT, United Kingdom
²Center for Computational Learning Systems, Columbia University, New York, USA

The Quranic Arabic Corpus is an annotated linguistic resource with multiple layers of annotation including morphological segmentation, part-of-speech tagging, and syntactic analysis using dependency grammar. The motivation behind this work is to produce a resource that enables further analysis of the Quran, the 1,400-year-old central religious text of Islam. This paper describes a new approach to morphological annotation of Quranic Arabic, a genre difficult to compare with other forms of Arabic. Processing Quranic Arabic is a unique challenge from a computational point of view, since the vocabulary and spelling differ from Modern Standard Arabic. The Quranic Arabic Corpus differs from other Arabic computational resources in adopting a tagset that closely follows traditional Arabic grammar. We made this decision in order to leverage a large body of existing historical grammatical analysis, and to encourage online collaborative annotation. In this paper, we discuss how the unique challenge of morphological annotation of Quranic Arabic is solved using a multi-stage approach. The different stages include automatic morphological tagging using diacritic edit-distance, two-pass manual verification, and online collaborative annotation. This process is evaluated to validate the appropriateness of the chosen methodology.

1. Introduction

The Quranic Arabic Corpus is an online annotated linguistic resource with multiple layers of annotation including morphological segmentation, part-of-speech tagging, syntactic analysis using dependency grammar, and a semantic ontology. The motivation behind this work is to produce a resource that enables further analysis of the Quran, the 1,400-year-old central religious text of Islam. The 77,430 words of the Quran form a distinct genre difficult to compare to other texts of Arabic.

In this paper, we focus on the morphological annotation in the Quranic Arabic Corpus. Processing Quranic Arabic is a unique challenge from a computational point of view, since it differs significantly from Modern Standard Arabic (MSA). We describe a new multi-stage approach to this component. Given the importance of the Quran, special care has been taken to ensure a high level of accuracy for the final part-of-speech tagging and morphological annotation.

An initial tagging was performed using the Buckwalter Arabic Morphological Analyzer (Buckwalter, 2002), which was adapted to work with Quranic Arabic. This initial stage of annotation was reasonably accurate, but to produce a more reliable research resource, manual correction was required. The annotated corpus was then put online to allow for collaborative annotation. This proved quite effective: over an initial period of six months, 2,000 words (2.6%) were revised as a result of supervised volunteer correction. Website visitors have since grown steadily to 1,500 users per day, while the number of online corrections has reduced over time.
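The automatic tagging stage is described as using diacritic edit-distance, but this section does not define the measure. As a rough illustration only, the sketch below assumes it means a plain Levenshtein distance between the fully diacritized Quranic word and each diacritized candidate proposed by a morphological analyzer, with the closest candidate selected. The candidate list, its field names (`vocalized`, `pos`, `gloss`), and the transliterated forms are hypothetical and are not taken from the corpus or from the Buckwalter analyzer.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_candidates(quranic_form, candidates):
    """Sort candidate analyses by distance to the diacritized Quranic form."""
    return sorted(candidates,
                  key=lambda c: edit_distance(quranic_form, c["vocalized"]))


# Hypothetical analyzer output, written in an ASCII transliteration
# so that every vowel diacritic is a separate character.
candidates = [
    {"vocalized": "kutiba", "pos": "V", "gloss": "it was written"},
    {"vocalized": "kataba", "pos": "V", "gloss": "he wrote"},
    {"vocalized": "kutubi", "pos": "N", "gloss": "books"},
]

best = rank_candidates("kataba", candidates)[0]
print(best["gloss"])
```

Because the Quranic text is fully diacritized, the vowel marks carry most of the disambiguating signal here: the three candidates share the same consonants and differ only in their vowels. Ties, and any adaptation to Quranic spelling, would still need the manual verification passes described in the text.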