Package API Documentation for pylexique

API Reference for the classes in pylexique.pylexique.py

Main module of pylexique.

class pylexique.pylexique.Lexique383(lexique_path: Optional[str] = None, parser_type: str = 'csv')

Bases: object

This class handles the Lexique database. It provides methods for interacting with the Lexique DB and retrieving lexical items. All the lexical items are stored in an OrderedDict.

Parameters:
  • lexique_path – string. Path to the lexique file.

  • parser_type – string. “pandas_csv” and “csv” are valid values. “csv” is the default value.

Variables:
  • lexique – Dictionary containing all the LexicalItem objects indexed by orthography.

  • lemmes – Dictionary containing all the LexicalItem objects indexed by lemma.

  • anagrams – Dictionary containing all the LexicalItem objects indexed by anagram form.

static _parse_csv(lexique_path: str) → Generator[list, Any, None]
Parameters:

lexique_path – string. Path to the lexique file.

Returns:

generator of rows: Content of the Lexique38x database.
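
The generator-of-rows pattern described above can be sketched with the standard csv module. This is a minimal illustration, not pylexique's own implementation: the function name iter_rows, the tab delimiter, and the single header row are all assumptions.

```python
import csv
from typing import Any, Generator


def iter_rows(path: str) -> Generator[list, Any, None]:
    """Yield each data row of a delimited text file as a list of strings."""
    # Assumption: the file is UTF-8, tab-delimited, with one header row.
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row, if any
        for row in reader:
            yield row
```

Because rows are yielded one at a time, the caller can build its tables without first loading the whole file into memory.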

_parse_lexique(lexique_path: str, parser_type: str) → None
Parses the given lexique file and creates two hash tables to store the data.
Parameters:
  • lexique_path – string. Path to the lexique file.

  • parser_type – string. Either “csv” or “pandas_csv”.

Returns:

None.

_create_db(lexicon: Generator[list, Any, None]) → None
Creates the database, if it does not exist yet, as two hash tables populated with the entries in lexique.
One hash table holds the LexItems; the other holds the same data grouped by lemma, giving access to all lexical forms of a word.
Parameters:

lexicon – Iterable. Iterable containing the lexique383 entries.

Returns:

None.
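
The two-table layout described for _create_db can be sketched as follows. This is an illustration only: MiniItem, build_tables, and the three-column rows are hypothetical stand-ins, and storing homographs as a list is inferred from the return type of get_lex, not taken from the source code.

```python
from collections import OrderedDict, defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple, Union


class MiniItem(NamedTuple):
    """Hypothetical three-field stand-in for a full LexItem."""
    ortho: str
    lemme: str
    cgram: str


Entry = Union[MiniItem, List[MiniItem]]


def build_tables(
    rows: Iterable[List[str]],
) -> Tuple[Dict[str, Entry], Dict[str, List[MiniItem]]]:
    by_ortho: Dict[str, Entry] = OrderedDict()
    by_lemma: Dict[str, List[MiniItem]] = defaultdict(list)
    for row in rows:
        item = MiniItem(*row)
        existing = by_ortho.get(item.ortho)
        if existing is None:
            by_ortho[item.ortho] = item            # first form with this spelling
        elif isinstance(existing, list):
            existing.append(item)                  # third or later homograph
        else:
            by_ortho[item.ortho] = [existing, item]  # second homograph
        by_lemma[item.lemme].append(item)          # group every form under its lemma
    return by_ortho, dict(by_lemma)
```

With rows such as ["est", "être", "VER"], ["est", "est", "NOM"] and ["sont", "être", "VER"], the first table maps "est" to a list of two items, while the second maps "être" to the forms "est" and "sont".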

_convert_entries(row_fields: Union[List[str], List[Union[str, float, int, bool]]]) → Tuple[str, str, str, str, str, str, float, float, float, float, str, int, int, bool, int, int, str, str, int, int, int, int, str, int, str, str, str, str, str, float, int, float, float, str, int]
Converts entries from strings to int, bool or float and generates a new list with typed entries.
Parameters:

row_fields – List of column entries representing a row.

Returns:

ConvertedRow: List of typed column entries representing a typed row.
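
The string-to-type conversion can be sketched with one converter per column. This is a hedged illustration: convert_row, the four-column schema, and the "1"/"0" encoding of booleans are assumptions, whereas the real rows carry the 35 fields listed in the signature above.

```python
from typing import Callable, List, Tuple, Union

Typed = Union[str, float, int, bool]

# Hypothetical four-column schema, one converter per column.
CONVERTERS: Tuple[Callable[[str], Typed], ...] = (
    str,                    # ortho
    float,                  # freqfilms2
    int,                    # nblettres
    lambda s: s == "1",     # islem (assumed to be encoded as "1" or "0")
)


def convert_row(row_fields: List[str]) -> Tuple[Typed, ...]:
    """Return a typed copy of a row of strings."""
    if len(row_fields) != len(CONVERTERS):
        raise ValueError(
            f"expected {len(CONVERTERS)} fields, got {len(row_fields)}"
        )
    # int() and float() raise ValueError on malformed input.
    return tuple(conv(field) for conv, field in zip(CONVERTERS, row_fields))
```

Keeping the converters in a single tuple puts the column order and the target types in one place.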

get_lex(words: Union[Tuple[str, ...], str]) → Dict[str, Union[LexItem, List[LexItem]]]

Retrieves the lexical entries for the given word or words.

Parameters:

words – A single string, or a tuple of strings to get the LexItems for several words.

Returns:

Dictionary of LexItems.

Raises:

TypeError.
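
The accepted input shapes can be illustrated with a small stand-in function. This is a sketch under stated assumptions, not the library's code: lookup and its index argument are hypothetical, and silently skipping unknown words is an assumption that the documentation above does not confirm.

```python
from typing import Dict, Tuple, Union


def lookup(
    index: Dict[str, object], words: Union[Tuple[str, ...], str]
) -> Dict[str, object]:
    """Return the entries of `index` for one word or a tuple of words."""
    if isinstance(words, str):
        words = (words,)  # normalise a single word to a one-element tuple
    elif not isinstance(words, tuple) or not all(
        isinstance(word, str) for word in words
    ):
        raise TypeError("words must be a string or a tuple of strings")
    # Assumption: words missing from the index are skipped, not reported.
    return {word: index[word] for word in words if word in index}
```

Normalising the single-string case first lets the rest of the function handle only tuples.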

get_all_forms(word: str) → List[LexItem]

Gets all lexical forms of a given word.

Parameters:

word – String.

Returns:

List of LexItem objects sharing the same root lemma.

Raises:
  • ValueError.

  • TypeError.

get_anagrams(word: str) → List[LexItem]

Gets all anagrams of a given word.

Parameters:

word – String.

Returns:

List of LexItem objects which are anagrams of the given word.

Raises:
  • ValueError.

  • TypeError.
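
One common way to index words by anagram form is to key them by their sorted letters. The sketch below assumes that approach. The documentation does not state how pylexique computes the anagram form, so anagram_key, build_anagram_index, and anagrams_of are hypothetical names, and excluding the query word from the result is also an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def anagram_key(word: str) -> str:
    """Words that are anagrams of each other share the same sorted letters."""
    return "".join(sorted(word))


def build_anagram_index(words: Iterable[str]) -> Dict[str, List[str]]:
    index: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        index[anagram_key(word)].append(word)
    return dict(index)


def anagrams_of(word: str, index: Dict[str, List[str]]) -> List[str]:
    if not isinstance(word, str):
        raise TypeError("word must be a string")
    # Assumption: the query word itself is left out of the result.
    return [w for w in index.get(anagram_key(word), []) if w != word]
```

For example, with an index built from "chien", "niche", "chine" and "chat", looking up "chien" returns "niche" and "chine".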

static _save_errors(errors: Union[List[Tuple[List[Union[str, float, int, bool]], List[str]]], List[DefaultDict[str, List[Dict[str, str]]]]], errors_path: str) → None

Saves the key/value pairs in Lexique383 that were mismatched during type coercion.

Parameters:
  • errors – List of errors encountered while parsing Lexique38x.

  • errors_path – Path to save the errors.

Returns:

None.

class pylexique.pylexique.LexItem(ortho: str, phon: str, lemme: str, cgram: str, genre: str, nombre: str, freqlemfilms2: float, freqlemlivres: float, freqfilms2: float, freqlivres: float, infover: str, nbhomogr: int, nbhomoph: int, islem: bool, nblettres: int, nbphons: int, cvcv: str, p_cvcv: str, voisorth: int, voisphon: int, puorth: int, puphon: int, syll: str, nbsyll: int, cv_cv: str, orthrenv: str, phonrenv: str, orthosyll: str, cgramortho: str, deflem: float, defobs: int, old20: float, pld20: float, morphoder: str, nbmorph: int)

Bases: LexEntryTypes

This class defines the lexical items in Lexique383.
It uses slots for memory efficiency.

to_dict() → Dict[str, Union[str, float, int, bool]]
Converts the LexItem to a dict containing its attributes and their values.
Returns:

OrderedDict. Dictionary whose keys and values correspond to the attributes of the LexItem object.

Raises:

AttributeError.
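
The combination of slots and a to_dict method can be sketched on a much smaller class. This is an illustration under stated assumptions: SlottedItem and its three attributes are hypothetical, and the real to_dict may be implemented differently.

```python
from collections import OrderedDict


class SlottedItem:
    """Hypothetical three-attribute stand-in for LexItem."""

    # Declaring __slots__ removes the per-instance __dict__, which saves
    # memory when many instances are created.
    __slots__ = ("ortho", "lemme", "nblettres")

    def __init__(self, ortho: str, lemme: str, nblettres: int) -> None:
        self.ortho = ortho
        self.lemme = lemme
        self.nblettres = nblettres

    def to_dict(self) -> "OrderedDict[str, object]":
        # getattr raises AttributeError if a slot was never assigned.
        return OrderedDict(
            (name, getattr(self, name)) for name in self.__slots__
        )
```

Iterating over __slots__ keeps the dictionary keys in declaration order, without listing the attribute names a second time.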

class pylexique.pylexique.LexEntryTypes(ortho: str, phon: str, lemme: str, cgram: str, genre: str, nombre: str, freqlemfilms2: float, freqlemlivres: float, freqfilms2: float, freqlivres: float, infover: str, nbhomogr: int, nbhomoph: int, islem: bool, nblettres: int, nbphons: int, cvcv: str, p_cvcv: str, voisorth: int, voisphon: int, puorth: int, puphon: int, syll: str, nbsyll: int, cv_cv: str, orthrenv: str, phonrenv: str, orthosyll: str, cgramortho: str, deflem: float, defobs: int, old20: float, pld20: float, morphoder: str, nbmorph: int)

Bases: object

Type information about all the lexical attributes in a LexItem object.

API Reference for the classes in pylexique.utils.py