Utils

get_data_path(filename)

Returns the data file path in a zip-safe manner.

Parameters:

Name      Type  Description                 Default
filename  str   The name of the data file.  required

Returns:

Type  Description
Path  The path to the specified data file.
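
"Zip-safe" here means the file is located through the package's resource loader rather than a hard-coded filesystem path, so it still resolves when the package is imported from a zip archive. The sketch below illustrates the idea with the standard library's importlib.resources. It is not hazm's actual implementation: resource_info is a hypothetical helper, and the stdlib json package is only a stand-in for a package that ships data files.

```python
from importlib.resources import as_file, files


def resource_info(package: str, filename: str) -> tuple[str, bool]:
    """Hypothetical helper: locate a file shipped inside a package."""
    # files() returns a Traversable that works for on-disk and zipped packages.
    resource = files(package).joinpath(filename)
    # as_file() yields a real filesystem path; for a zipped package it
    # extracts a temporary copy that is only valid inside the with block.
    with as_file(resource) as path:
        return path.name, path.exists()


# The stdlib "json" package stands in for a package that bundles data files.
name, exists = resource_info("json", "__init__.py")
print(name, exists)
```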

maketrans(a, b)

Builds a translation table that maps each character in string a to the character at the same position in string b.

Examples:

>>> from hazm.utils import maketrans
>>> table = maketrans('012', '۰۱۲')
>>> '012'.translate(table)
'۰۱۲'

Parameters:

Name  Type  Description                              Default
a     str   A string of characters to be replaced.   required
b     str   A string of characters to replace with.  required

Returns:

Type            Description
dict[int, Any]  A dictionary mapping character ordinals to their replacements.
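
To make the return value concrete, here is a minimal sketch of how such a table could be built and used. make_table is a hypothetical stand-in written for illustration, not the library's code; it assumes the two strings are paired position by position.

```python
def make_table(a: str, b: str) -> dict[int, str]:
    """Hypothetical stand-in: pair characters of a and b by position."""
    # str.translate() looks characters up by their integer code point,
    # so the keys are ordinals rather than one-character strings.
    return {ord(src): dst for src, dst in zip(a, b)}


persian_digits = "۰۱۲"
table = make_table("012", persian_digits)

# Characters without an entry in the table pass through unchanged.
converted = "2021".translate(table)
print(converted)
```

A table of this shape can be passed straight to str.translate(), which is why the keys must be integers.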

past_roots()

Returns a string of past roots joined by a pipe character.

Examples:

>>> from hazm.utils import past_roots
>>> past_roots()[:20]
'آباد|آزمود|آسود|آشفت'

Returns:

Type  Description
str   A string containing all past roots, suitable for use in regex.
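
Because the roots are joined with a pipe, the string can be dropped into a regular expression as an alternation. The following sketch assumes only the short sample shown in the example output above, so it runs without hazm installed; the name root_pattern is illustrative.

```python
import re

# Sample taken from the example output above; the real string is much longer.
sample_roots = "آباد|آزمود|آسود|آشفت"

# A non-capturing group keeps the alternation from leaking into any
# surrounding pattern; fullmatch() then accepts only a complete root.
root_pattern = re.compile(rf"(?:{sample_roots})")

is_root = root_pattern.fullmatch("آزمود") is not None
is_not_root = root_pattern.fullmatch("کتاب") is None
print(is_root, is_not_root)
```

Wrapping the string in a group matters as soon as a prefix or suffix pattern is added, since a bare alternation would otherwise bind only to the first or last root.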

present_roots()

Returns a string of present roots joined by a pipe character.

Examples:

>>> from hazm.utils import present_roots
>>> present_roots()[:20]
'آباد|آزمای|آسای|آشوب'

Returns:

Type  Description
str   A string containing all present roots, suitable for use in regex.

regex_replace(patterns, text)

Replaces every match of each regex pattern in the text with the replacement paired with that pattern.

Examples:

>>> from hazm.utils import regex_replace
>>> patterns = [(r'apples', 'oranges'), (r'red', 'blue')]
>>> regex_replace(patterns, 'red apples')
'blue oranges'

Parameters:

Name      Type                   Description                                                Default
patterns  list[tuple[str, str]]  A list of tuples, each containing (pattern, replacement).  required
text      str                    The input text to be processed.                            required

Returns:

Type  Description
str   The modified text after all replacements.
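
The behaviour described above can be approximated with a short loop over re.sub. This is a sketch under the assumption that the pairs are applied one after another in list order; apply_replacements is a hypothetical name, not the library function.

```python
import re


def apply_replacements(patterns: list[tuple[str, str]], text: str) -> str:
    """Hypothetical stand-in: apply each (pattern, replacement) pair in turn."""
    for pattern, replacement in patterns:
        # Each substitution sees the output of the previous one.
        text = re.sub(pattern, replacement, text)
    return text


patterns = [(r"apples", "oranges"), (r"red", "blue")]
result = apply_replacements(patterns, "red apples")
print(result)
```

Under that assumption the order of the list matters: a later pattern can match text produced by an earlier replacement.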

stopwords_list(stopwords_file=default_stopwords)

Returns a sorted list of stopwords.

Examples:

>>> from hazm.utils import stopwords_list
>>> stopwords_list()[:4]
['آخرین', 'آقای', 'آمد', 'آمده']

Parameters:

Name            Type        Description                                                 Default
stopwords_file  str | Path  Path to the stopwords file. Defaults to default_stopwords.  default_stopwords

Returns:

Type       Description
list[str]  A sorted list of unique stopwords.
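
When supplying a custom stopwords_file, it may help to see what "sorted list of unique stopwords" amounts to. The sketch below assumes a UTF-8 file with one stopword per line, which is an assumption about the file format rather than something this page states; load_stopwords is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def load_stopwords(stopwords_file: Path) -> list[str]:
    """Hypothetical loader: one stopword per line, UTF-8 encoded."""
    lines = stopwords_file.read_text(encoding="utf-8").splitlines()
    # A set drops duplicates; blank lines are skipped before sorting.
    return sorted({line.strip() for line in lines if line.strip()})


# Write a small sample file so the sketch runs without hazm installed.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "stopwords.dat"
    sample.write_text("آمد\nآخرین\n\nآقای\nآمد\n", encoding="utf-8")
    stopwords = load_stopwords(sample)

print(stopwords)
```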

verbs_list()

Returns a list of verbs from the default verbs file.

Examples:

>>> from hazm.utils import verbs_list
>>> verbs_list()[:2]
['آباد#آباد', 'آزمای#آزمود']

Returns:

Type       Description
list[str]  A list of verbs.

words_list(words_file=default_words)

Returns a list of words from the specified file.

Examples:

>>> from hazm.utils import words_list
>>> words_list()[1]
('آب', 549005877, ('N', 'AJ'))

Parameters:

Name        Type        Description                                         Default
words_file  str | Path  Path to the words file. Defaults to default_words.  default_words

Returns:

Type                                    Description
list[tuple[str, int, tuple[str, ...]]]  A list of tuples, each containing (word, count, categories).
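
To show how a (word, count, categories) tuple could be derived from a line of text, here is a small parsing sketch. The tab-separated layout with comma-separated tags is an assumption made for illustration, and parse_word_line is a hypothetical helper, not part of hazm.

```python
def parse_word_line(line: str) -> tuple[str, int, tuple[str, ...]]:
    """Hypothetical parser for an assumed 'word<TAB>count<TAB>tags' line."""
    word, count, tags = line.rstrip("\n").split("\t")
    # Tags are assumed to be comma-separated, e.g. "N,AJ".
    return word, int(count), tuple(tags.split(","))


entry = parse_word_line("آب\t549005877\tN,AJ\n")
word, count, categories = entry
print(word, count, categories)
```

Returning the categories as a tuple rather than a list keeps each entry immutable and hashable, which matches the return type documented above.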