Nayiri for Developers

Nayiri Armenian Lexicon — Data Model

Lexeme

The Oxford Dictionary defines a lexeme as “a basic lexical unit of a language, consisting of one word or several words, considered as an abstract unit, and applied to a family of words related by form or meaning.”

In the context of the Nayiri Armenian Lexicon, a Lexeme functions as a container that groups one or more Lemmas sharing the same or closely related meaning.

For example, a single Lexeme may group Lemmas that are spelling variants of the same abstract unit, such as ճշդել and ճշտել. It may also group Lemmas that represent different morphological or historical forms of the same abstract unit, such as գիտել, գիտենալ, and գիտնալ. Finally, a Lexeme may group Lemmas that are identical in form but differ in part of speech, such as հայ, which may function as a noun, adjective, or adverb depending on context.

Lemma

A lemma is the headword of a dictionary entry, usually the canonical form of the word the entry describes. For example, dictionary entries for verbs usually use the infinitive as the lemma, such as ճշդել.

In the context of the Nayiri Armenian Lexicon, a Lemma is a container with exactly one canonical word form and exactly one part of speech, and encapsulates one or more Word Forms derived from its canonical form.
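The containment described so far can be sketched with two small Python classes. This is only an illustration, not the Nayiri file format: the class names, the field names (`canonical_form`, `part_of_speech`, `word_forms`, `lemmas`), and the part-of-speech strings are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Lemma:
    """Exactly one canonical form and exactly one part of speech."""
    canonical_form: str   # hypothetical field name
    part_of_speech: str   # hypothetical field name
    # Word Forms derived from the canonical form. A complete Lemma has at
    # least one; the list is left optional here to keep the sketch short.
    word_forms: List[str] = field(default_factory=list)


@dataclass
class Lexeme:
    """Container for one or more Lemmas of the same abstract unit."""
    lemmas: List[Lemma]

    def __post_init__(self) -> None:
        if not self.lemmas:
            raise ValueError("a Lexeme needs at least one Lemma")


# Spelling variants grouped under one Lexeme.
spelling_variants = Lexeme(lemmas=[
    Lemma("ճշդել", "verb", ["ճշդեմ", "ճշդես", "ճշդէ"]),
    Lemma("ճշտել", "verb"),
])

# Same form, different parts of speech: one Lemma per part of speech.
same_form = Lexeme(lemmas=[
    Lemma("հայ", pos) for pos in ("noun", "adjective", "adverb")
])

print([lemma.canonical_form for lemma in spelling_variants.lemmas])
print([lemma.part_of_speech for lemma in same_form.lemmas])
```

Modeling հայ as three Lemmas rather than one Lemma with three parts of speech follows from the rule that a Lemma carries exactly one part of speech.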

Word Form

A Word Form is a specific inflected realization of a lemma.

Word Forms are generated by applying one or more inflectional rules—along with any relevant exceptions—to the canonical form.

For example, ճշդեմ, ճշդես, ճշդէ, ճշդողները, and չճշդածներուն are all inflected Word Forms of the lemma ճշդել.
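As a rough sketch of rule application, the snippet below derives the first three forms listed above by swapping the infinitive ending for a suffix and consulting an exception table first. How Nayiri actually represents rules and exceptions is not described here, so the suffix-replacement approach, the rule ids, and the `inflect` function are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical rules: (ending to strip, suffix to append).
# The rule ids are placeholders, not Nayiri identifiers.
RULES: Dict[str, Tuple[str, str]] = {
    "rule-1": ("ել", "եմ"),
    "rule-2": ("ել", "ես"),
    "rule-3": ("ել", "է"),
}

# Exceptions override the regular result for a (canonical form, rule id)
# pair. Left empty because no concrete exception is given in this section.
EXCEPTIONS: Dict[Tuple[str, str], str] = {}


def inflect(
    canonical: str,
    rule_id: str,
    exceptions: Optional[Dict[Tuple[str, str], str]] = None,
) -> Optional[str]:
    """Apply one rule to a canonical form, honoring exceptions first."""
    table = EXCEPTIONS if exceptions is None else exceptions
    override = table.get((canonical, rule_id))
    if override is not None:
        return override
    ending, suffix = RULES[rule_id]
    if not canonical.endswith(ending):
        return None  # the rule does not apply to this canonical form
    return canonical[: -len(ending)] + suffix


forms = [inflect("ճշդել", rule_id) for rule_id in RULES]
print(forms)  # ['ճշդեմ', 'ճշդես', 'ճշդէ']
```

Forms such as ճշդողները and չճշդածներուն combine several rules, which this single-step sketch does not attempt to cover.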

Each Word Form is associated with exactly one Inflection object, which provides its morphological analysis.

Inflection

In the context of the Nayiri Armenian Lexicon, an Inflection represents the morphological analysis of a given Word Form.

Inflection objects are defined globally and shared across Word Forms.

In the Word Form example above, ճշդեմ is described by the Inflection with the display name “Present Tense • Subjunctive Mood • First Person • Singular”.
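One way to picture "defined globally and shared" is a registry keyed by identifier, with each Word Form holding a reference rather than its own copy of the analysis. The sketch below assumes hypothetical names throughout (`Inflection`, `WordForm`, `inflection_id`, the id `infl-1`); the second Word Form, ճշտեմ, is likewise an assumption added only to show two forms pointing at the same Inflection.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Inflection:
    inflection_id: str   # hypothetical identifier
    display_name: str


@dataclass(frozen=True)
class WordForm:
    text: str
    lemma: str           # canonical form of the owning Lemma
    inflection_id: str   # exactly one Inflection per Word Form


# Global registry: each Inflection is defined once and referenced by id.
INFLECTIONS: Dict[str, Inflection] = {
    "infl-1": Inflection(
        "infl-1",
        "Present Tense • Subjunctive Mood • First Person • Singular",
    ),
}

WORD_FORMS: List[WordForm] = [
    WordForm("ճշդեմ", "ճշդել", "infl-1"),
    # Assumption for illustration: the corresponding form of the spelling
    # variant ճշտել would reference the same Inflection.
    WordForm("ճշտեմ", "ճշտել", "infl-1"),
]


def analyze(form: WordForm) -> Inflection:
    """Resolve a Word Form to its shared Inflection object."""
    return INFLECTIONS[form.inflection_id]


for form in WORD_FORMS:
    print(form.text, "->", analyze(form).display_name)
```

Storing an id instead of the full analysis keeps each Word Form small and guarantees that every form with the same analysis resolves to the same object.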
