Quran Reader

This module includes classes and functions for reading the Quranic Arabic corpus.

The Quranic Arabic corpus contains syntactic rules and morphological information for every word in the Holy Quran.

QuranReader

This class provides methods for reading the Quranic Arabic corpus.

Parameters:

Name        Type  Description               Default
quran_file  str   Path to the corpus file.  required

__init__(quran_file)

Initializes the QuranReader with the given file path.

Parameters:

Name        Type  Description               Default
quran_file  str   Path to the corpus file.  required

parts()

Yields the parts of the Quranic text along with their syntactic information.

A part is not necessarily a word; for example, the word "Ar-Rahman" is composed of two parts: "Al" and "Rahman".

Examples:

>>> parts = QuranReader(quran_file='quranic_corpus_morphology.txt').parts()
>>> print(next(parts))
{'loc': (1, 1, 1, 1), 'text': 'بِ', 'tag': 'P'}
>>> print(next(parts))
{'loc': (1, 1, 1, 2), 'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'}
>>> print(next(parts))
{'loc': (1, 1, 2, 1), 'text': 'ٱللَّهِ', 'tag': 'PN', 'lem': 'ٱللَّه', 'root': 'اله'}

Yields:

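Type            Description
dict[str, str]  The next part of the Quranic text. As the examples show, each dict has the keys 'loc', 'text', and 'tag', plus 'lem' and 'root' where present; note that 'loc' holds a tuple of four integers rather than a string.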
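
Since one word can span several parts, it can be useful to regroup the parts by word. The following is a minimal sketch, not part of the library: `group_parts_by_word` is a hypothetical helper, the sample data is copied from the examples above, and it assumes that `loc` is ordered as (chapter, verse, word, part) and that parts arrive in corpus order.

```python
from itertools import groupby

# Sample parts copied from the parts() examples above; in practice these
# would come from QuranReader(quran_file=...).parts().
sample_parts = [
    {'loc': (1, 1, 1, 1), 'text': 'بِ', 'tag': 'P'},
    {'loc': (1, 1, 1, 2), 'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'},
    {'loc': (1, 1, 2, 1), 'text': 'ٱللَّهِ', 'tag': 'PN', 'lem': 'ٱللَّه', 'root': 'اله'},
]


def group_parts_by_word(parts):
    """Hypothetical helper: yield (word_loc, joined_text, joined_tags) per word.

    Assumes loc is (chapter, verse, word, part), so the first three
    entries identify the word, and that parts are already in order.
    """
    for word_loc, group in groupby(parts, key=lambda p: p['loc'][:3]):
        group = list(group)
        text = ''.join(p['text'] for p in group)
        tags = '-'.join(p['tag'] for p in group)
        yield word_loc, text, tags


grouped = list(group_parts_by_word(sample_parts))
for word_loc, text, tags in grouped:
    print(word_loc, text, tags)
```

Under these assumptions the first two sample parts merge into a single word with the combined tag 'P-N', which matches the shape of the words() output.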

words()

Yields morphological information for the words of the Quran.

Examples:

>>> words = QuranReader(quran_file='quranic_corpus_morphology.txt').words()
>>> print(next(words))
('1.1.1', 'بِسْمِ', 'ٱسْم', 'سمو', 'P-N', [{'text': 'بِ', 'tag': 'P'}, {'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'}])

Yields:

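Type                                                  Description
tuple[str, str, str, str, str, list[dict[str, str]]]  Morphological information of the next word in the Quran. Judging by the example, the fields are, in order: the word location, the word text, its lemma, its root, the hyphen-joined tags of its parts, and the list of parts.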
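
Working with a six-element tuple by position is error-prone, so one option is to wrap each result in a named tuple. This is only a sketch: the `Word` class and its field names are hypothetical, inferred from the example output above rather than taken from the library, and the sample tuple is copied from that example.

```python
from collections import Counter
from typing import NamedTuple


class Word(NamedTuple):
    # Field names are hypothetical, inferred from the words() example.
    loc: str
    text: str
    lemma: str
    root: str
    tags: str
    parts: list


# Sample tuple copied from the words() example above; in practice these
# would come from QuranReader(quran_file=...).words().
sample_words = [
    ('1.1.1', 'بِسْمِ', 'ٱسْم', 'سمو', 'P-N',
     [{'text': 'بِ', 'tag': 'P'},
      {'text': 'سْمِ', 'tag': 'N', 'lem': 'ٱسْم', 'root': 'سمو'}]),
]

# Wrap each raw tuple so fields can be accessed by name.
words = [Word(*raw) for raw in sample_words]

# Count how often each part-level tag occurs across the wrapped words.
tag_counts = Counter(part['tag'] for word in words for part in word.parts)

for word in words:
    print(word.loc, word.text, word.tags)
print(dict(tag_counts))
```

Because `Word` is still a tuple, positional unpacking of the original six fields keeps working alongside attribute access.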