Quran Reader
This module includes classes and functions for reading the Quranic Arabic corpus.
The Quranic Arabic corpus provides syntactic and morphological annotation for every word in the Holy Quran.
QuranReader
This class provides methods for reading the Quranic Arabic corpus.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quran_file | str | Path to the corpus file. | required |
__init__(quran_file)
Initializes the QuranReader with the given file path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| quran_file | str | Path to the corpus file. | required |
parts()
Yields the parts of the Quranic text along with their syntactic information.
A part is not necessarily a word; for example, the word "Ar-Rahman" is composed of two parts: "Al" and "Rahman".
Examples:
>>> parts = QuranReader(quran_file='quranic_corpus_morphology.txt').parts()
>>> print(next(parts))
{'loc': (1, 1, 1, 1), 'text': 'بِ', 'tag': 'P'}
>>> print(next(parts))
{'loc': (1, 1, 1, 2), 'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'}
>>> print(next(parts))
{'loc': (1, 1, 2, 1), 'text': 'ٱللَّهِ', 'tag': 'PN', 'lem': 'ٱللَّه', 'root': 'اله'}
Yields:
| Type | Description |
|---|---|
| dict[str, str] | The next part of the Quranic text. |
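
As an illustration of how parts relate to words, the following sketch groups consecutive parts that belong to the same word. It runs on the three sample parts shown in the examples for `parts()` rather than on the corpus file. It assumes, based only on those examples, that the first three items of `loc` identify the word and the fourth is the part index within it. `group_parts_by_word` is a hypothetical helper, not part of this module.

```python
from itertools import groupby

# Sample parts copied from the examples for parts(); in real use they would
# come from QuranReader(quran_file=...).parts().
sample_parts = [
    {'loc': (1, 1, 1, 1), 'text': 'بِ', 'tag': 'P'},
    {'loc': (1, 1, 1, 2), 'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'},
    {'loc': (1, 1, 2, 1), 'text': 'ٱللَّهِ', 'tag': 'PN', 'lem': 'ٱللَّه', 'root': 'اله'},
]


def group_parts_by_word(parts):
    """Hypothetical helper: yield (word_loc, joined_text, joined_tags) per word."""
    # Assumption: loc[:3] identifies the word and loc[3] is the part index.
    for word_loc, group in groupby(parts, key=lambda part: part['loc'][:3]):
        group = list(group)
        text = ''.join(part['text'] for part in group)
        tags = '-'.join(part['tag'] for part in group)
        yield word_loc, text, tags


grouped = list(group_parts_by_word(sample_parts))
for word_loc, text, tags in grouped:
    print(word_loc, text, tags)
```

Note that `lem` and `root` do not appear on every part in the sample (the `P` part has neither), so `part.get('lem')` is a safer way to read them than `part['lem']`.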
words()
Yields morphological information for the words of the Quran.
Examples:
>>> words = QuranReader(quran_file='quranic_corpus_morphology.txt').words()
>>> print(next(words))
('1.1.1', 'بِسْمِ', 'ٱسْم', 'سمو', 'P-N', [{'text': 'بِ', 'tag': 'P'}, {'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'}])
Yields:
| Type | Description |
|---|---|
| tuple[str, str, str, str, str, list[dict[str, str]]] | Morphological information of the next word in the Quran. |
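
The six-item tuple can be easier to work with once its fields have names. The sketch below wraps the sample tuple from the example for `words()` in a named tuple. The field names (`location`, `text`, `lemma`, `root`, `tags`, `parts`) are assumptions inferred from that single example, and `QuranWord` is a hypothetical type, not something this module defines.

```python
from typing import NamedTuple


class QuranWord(NamedTuple):
    """Hypothetical record; field names are inferred from the sample output."""
    location: str  # assumed to be "chapter.verse.word"
    text: str
    lemma: str
    root: str
    tags: str      # part tags joined with '-'
    parts: list    # one dict per part of the word


# Sample tuple copied from the example for words(); in real use it would come
# from QuranReader(quran_file=...).words().
sample_word = (
    '1.1.1', 'بِسْمِ', 'ٱسْم', 'سمو', 'P-N',
    [{'text': 'بِ', 'tag': 'P'},
     {'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'}],
)

word = QuranWord(*sample_word)

# Split the dotted location into integers (assumed chapter, verse, word index).
chapter, verse, position = (int(n) for n in word.location.split('.'))

# Collect the tag of each part to compare against the joined tag string.
part_tags = [part['tag'] for part in word.parts]
print(chapter, verse, position, word.tags, part_tags)
```

In this sample, the fifth field is the tags of the word's parts joined with a hyphen, which is why `word.tags.split('-')` equals `part_tags`.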