Lemmatization Strategy

This file defines the LemmatizationStrategy protocol class, which every lemmatization strategy should implement to be usable by the Simplemma library.

Classes

LemmatizationStrategy

Bases: Protocol

This protocol defines the interface for lemmatization strategies. Subclasses implementing this protocol must provide an implementation for the get_lemma method.

Note

Concrete lemmatization strategy classes implement this protocol by providing their own get_lemma method.
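As a minimal sketch of what a concrete implementation might look like, the example below defines a hypothetical TinyDictionaryStrategy backed by an in-memory table. The protocol is redeclared locally so the snippet runs without Simplemma installed. The class name, the table layout, and the sample entries are illustrative assumptions, not part of the library.

```python
from abc import abstractmethod
from typing import Dict, Optional, Protocol


class LemmatizationStrategy(Protocol):
    """Local stand-in mirroring the protocol documented on this page."""

    @abstractmethod
    def get_lemma(self, token: str, lang: str) -> Optional[str]:
        raise NotImplementedError()


class TinyDictionaryStrategy(LemmatizationStrategy):
    """Hypothetical strategy backed by a small in-memory table."""

    def __init__(self, table: Dict[str, Dict[str, str]]) -> None:
        # Maps language code -> {token: lemma}.
        self._table = table

    def get_lemma(self, token: str, lang: str) -> Optional[str]:
        # dict.get returns None on a miss, which matches the protocol's
        # "return None if the lemma is not found" contract.
        return self._table.get(lang, {}).get(token)


strategy = TinyDictionaryStrategy({"en": {"mice": "mouse", "went": "go"}})
print(strategy.get_lemma("mice", "en"))  # mouse
print(strategy.get_lemma("mice", "de"))  # None
```

Because get_lemma is overridden, the subclass can be instantiated. A subclass that left the abstract method in place would fail at instantiation time with a TypeError.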

Source code in simplemma/strategies/lemmatization_strategy.py
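# Imports needed by this excerpt (typing.Protocol requires Python 3.8+).
from abc import abstractmethod
from typing import Optional, Protocol


class LemmatizationStrategy(Protocol):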
    """
    This protocol defines the interface for lemmatization strategies. Subclasses implementing this protocol
    must provide an implementation for the `get_lemma` method.

    Note:
        This protocol should be implemented by concrete lemmatization strategy classes.
        Concrete implementations of this protocol should provide a concrete implementation for the `get_lemma` method.
    """

    @abstractmethod
    def get_lemma(self, token: str, lang: str) -> Optional[str]:
        """
        This method receives a token and a language code and should return the lemma for the token in the specified language.
        If the lemma is not found, it should return `None`.

        Args:
            token (str): The input token to lemmatize.
            lang (str): The language code for the token's language.

        Returns:
            Optional[str]: The lemma for the token, or `None` if not found.

        Raises:
            NotImplementedError: If the method is not implemented by the subclass.

        """
        raise NotImplementedError()

Functions

get_lemma(token, lang) abstractmethod

This method receives a token and a language code and should return the lemma for the token in the specified language. If the lemma is not found, it should return None.

Parameters:

Name   Type  Description                                  Default
token  str   The input token to lemmatize.                required
lang   str   The language code for the token's language.  required

Returns:

Type           Description
Optional[str]  The lemma for the token, or None if not found.

Raises:

Type                 Description
NotImplementedError  If the method is not implemented by the subclass.
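The None return value lets a caller tell "no lemma found" apart from a successful lookup and move on to another strategy. The sketch below illustrates that idea with two hypothetical strategies and a hypothetical first_lemma helper. None of these names come from Simplemma, the protocol is redeclared locally so the snippet is self-contained, and this is not a description of how the library itself combines strategies.

```python
from typing import Dict, Iterable, Optional, Protocol, Tuple


class LemmatizationStrategy(Protocol):
    """Local stand-in mirroring the protocol documented on this page."""

    def get_lemma(self, token: str, lang: str) -> Optional[str]:
        ...


class LookupStrategy:
    """Hypothetical strategy: exact lookup in a (lang, token) -> lemma table."""

    def __init__(self, table: Dict[Tuple[str, str], str]) -> None:
        self._table = table

    def get_lemma(self, token: str, lang: str) -> Optional[str]:
        return self._table.get((lang, token))


class PluralSuffixStrategy:
    """Hypothetical strategy: strips a trailing "s" from longer English tokens."""

    def get_lemma(self, token: str, lang: str) -> Optional[str]:
        if lang == "en" and len(token) > 3 and token.endswith("s"):
            return token[:-1]
        return None  # Signal "not found" so the caller can try something else.


def first_lemma(
    token: str, lang: str, strategies: Iterable[LemmatizationStrategy]
) -> str:
    """Return the first non-None lemma, or the token itself if all strategies miss."""
    for strategy in strategies:
        lemma = strategy.get_lemma(token, lang)
        if lemma is not None:
            return lemma
    return token


chain = [LookupStrategy({("en", "mice"): "mouse"}), PluralSuffixStrategy()]
print(first_lemma("mice", "en", chain))  # mouse
print(first_lemma("cats", "en", chain))  # cat
print(first_lemma("Haus", "de", chain))  # Haus
```

Neither strategy here inherits from the protocol. Since Protocol classes support structural typing, any object with a matching get_lemma signature satisfies a static type checker, although explicit subclassing as shown in the class definition also works.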

Source code in simplemma/strategies/lemmatization_strategy.py
@abstractmethod
def get_lemma(self, token: str, lang: str) -> Optional[str]:
    """
    This method receives a token and a language code and should return the lemma for the token in the specified language.
    If the lemma is not found, it should return `None`.

    Args:
        token (str): The input token to lemmatize.
        lang (str): The language code for the token's language.

    Returns:
        Optional[str]: The lemma for the token, or `None` if not found.

    Raises:
        NotImplementedError: If the method is not implemented by the subclass.

    """
    raise NotImplementedError()