Corpus Readers
Natural language processing requires data. This data, usually organized into collections called "corpora," is the basis for pattern extraction and machine learning. Reading a corpus and converting its raw data into a format suitable for NLP tasks usually takes extra time for coding and preprocessing.
To save you time, we have provided classes and functions that make it easy to read popular Persian corpora. The classes and functions in this section are provided solely to facilitate developers' work and are not considered a core part of the Hazm library.
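To give a sense of the work a corpus reader saves you, here is a minimal sketch of one written with only the Python standard library. The class name `TinyCorpusReader`, its `sents()` method, and the file layout (one `word<TAB>tag` pair per line, with a blank line between sentences) are assumptions made for illustration. They are not Hazm's API or the format of any particular Persian corpus.

```python
import os
import tempfile
from typing import Iterator, List, Tuple


class TinyCorpusReader:
    """Hypothetical reader (not part of Hazm).

    Assumed file layout: one `word<TAB>tag` pair per line,
    with a blank line separating sentences.
    """

    def __init__(self, path: str) -> None:
        self._path = path

    def sents(self) -> Iterator[List[Tuple[str, str]]]:
        """Yield each sentence as a list of (word, tag) pairs."""
        sentence: List[Tuple[str, str]] = []
        with open(self._path, encoding="utf-8") as lines:
            for line in lines:
                line = line.strip()
                if not line:
                    # A blank line closes the current sentence.
                    if sentence:
                        yield sentence
                        sentence = []
                    continue
                word, tag = line.split("\t")
                sentence.append((word, tag))
        # Emit the last sentence if the file does not end with a blank line.
        if sentence:
            yield sentence


# Write a tiny sample file so the sketch runs without any real corpus.
sample = "من\tPRO\nآمدم\tV\n\nسلام\tN\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.tsv")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)
    sentences = list(TinyCorpusReader(path).sents())

print(len(sentences))
```

Real corpora come in many different layouts and encodings, so each one needs parsing code of this kind; the readers in this section are meant to spare you from writing it yourself.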