Utils
Utils module. Contains utility functions for language processing.
- levenshtein_dist: Calculates the Levenshtein distance between two strings.
- validate_lang_input: Validates the language input and ensures it is a valid tuple.
Functions
levenshtein_dist(str1, str2)
Calculate the Levenshtein distance between two strings.
The Levenshtein distance is a metric for measuring the difference between two strings, defined as the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other.
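The definition above can be illustrated with a minimal dynamic-programming sketch. The name `levenshtein_sketch` is hypothetical and the code is not taken from `simplemma/utils.py`; it only shows one common way to compute the metric, keeping two rows of the edit table in memory.

```python
def levenshtein_sketch(a: str, b: str) -> int:
    """Illustrative edit distance (hypothetical helper, not the library code)."""
    if a == b:
        return 0
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # current[0] = distance between a[:i] and the empty string.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein_sketch("kitten", "sitting"))
```

Turning "kitten" into "sitting" takes two substitutions and one insertion, so the call above prints 3.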
Parameters:
Name | Type | Description | Default |
---|---|---|---|
str1 | str | The first string. | required |
str2 | str | The second string. | required |
Returns:
Name | Type | Description |
---|---|---|
int | int | The Levenshtein distance between the two strings. |
Source code in simplemma/utils.py
validate_lang_input(lang)
Ensure that the lang argument is a valid tuple, converting a single string into a one-element tuple.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lang | Any | The language input. | required |
Returns:
Type | Description |
---|---|
Tuple[str] | A tuple containing the language code. |
Raises:
Type | Description |
---|---|
TypeError | If the lang argument is not a tuple or a string. |
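
The documented contract (a string or tuple is accepted, a tuple is returned, anything else raises TypeError) can be sketched as follows. The name `validate_lang_sketch` and the error message are assumptions made for illustration; the actual code in `simplemma/utils.py` may differ.

```python
from typing import Any, Tuple


def validate_lang_sketch(lang: Any) -> Tuple[str, ...]:
    """Illustrative validator (hypothetical helper, not the library code)."""
    # Assumption: a single language code given as a string is wrapped in a tuple.
    if isinstance(lang, str):
        lang = (lang,)
    # Anything that is still not a tuple is rejected, as the Raises table states.
    if not isinstance(lang, tuple):
        raise TypeError("lang must be a string or a tuple of language codes")
    return lang


print(validate_lang_sketch("en"))
print(validate_lang_sketch(("en", "de")))
```

Normalizing the input to a tuple up front lets downstream code iterate over language codes without checking the argument type again.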
Source code in simplemma/utils.py