Nayiri Developers: Nayiri Armenian Lexicon

Nayiri Armenian Lexicon — Data Schema Reference

The lexicon data is provided in JSON format. The following sections describe the attributes of each JSON object.

Root-level

The root-level JSON Object has the following attributes:

Attribute Key	Type	Description
lexemes	JSON Array	An Array of Lexeme objects (described below)
inflections	JSON Array	An Array of globally-defined Inflection objects (described below) shared by Word Forms
metadata	JSON Object	Provides an overview of the data set, including versioning, licensing, and some basic statistics.
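
As a minimal sketch of how the root object might be read in Python, the snippet below parses a JSON document and checks that the three root-level attributes are present. The `SAMPLE` text, its version value, and the `load_lexicon` helper are illustrative assumptions rather than part of the lexicon distribution. A real release would be read from its downloaded file.

```python
import json

# Hypothetical miniature document; a real release is far larger.
SAMPLE = """
{
  "lexemes": [],
  "inflections": [],
  "metadata": {"version": "2024-01-01-v1"}
}
"""

REQUIRED_ROOT_KEYS = ("lexemes", "inflections", "metadata")


def load_lexicon(text: str) -> dict:
    """Parse lexicon JSON text and check that the three root attributes exist."""
    root = json.loads(text)
    missing = [key for key in REQUIRED_ROOT_KEYS if key not in root]
    if missing:
        raise ValueError(f"missing root attributes: {missing}")
    return root


lexicon = load_lexicon(SAMPLE)
print(len(lexicon["lexemes"]), len(lexicon["inflections"]))
```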

Lexeme

Attribute Key	Type	Description
lexemeId	String	The 4-character identifier that uniquely identifies this Lexeme in the Nayiri Lexicon. It is the base64url encoding of the underlying 24-bit unique identifier.
description	String	A human-readable description of this Lexeme. By convention, it is a comma-separated list of lemmas followed by a short English definition in parentheses, and it serves to disambiguate distinct Lexemes that share the same Lemma. For example, the Lexeme representing the postposition համար has the description “համար (for, on account of)”, whereas the Lexeme representing the noun համար has the description “համար (account, number, count, calculation, enumeration)”.
lemmaType	String	The type of Lemmas this Lexeme can contain. One of: { `NOMINAL, VERBAL, UNINFLECTED` }. Lexemes with lemmaType of NOMINAL can store Lemmas that represent Nouns, Adjectives, Adverbs, and Adpositions. Lexemes with lemmaType of VERBAL store Lemmas of Verbs only. Lexemes with lemmaType of UNINFLECTED store Lemmas that are exclusively uninflected, such as adverbs (e.g. անմիջապէս), adpositions, conjunctions, interjections, articles, and determiners.
lemmas	JSON Array	The Lemma objects belonging to this Lexeme.
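
The identifier encoding can be illustrated with a short Python sketch. It assumes that each character of an identifier is one digit of the RFC 4648 base64url alphabet (6 bits per character), with the most significant digit first; the digit order and the helper names `decode_id` and `encode_id` are assumptions, not something the schema states. Under that reading, 4 characters carry 24 bits, and the same approach works for identifiers of other lengths.

```python
import string

# RFC 4648 base64url alphabet: A-Z, a-z, 0-9, '-', '_'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
VALUE_OF = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_id(identifier: str) -> int:
    """Turn an identifier into an integer, 6 bits per character (assumed big-endian)."""
    value = 0
    for ch in identifier:
        value = (value << 6) | VALUE_OF[ch]
    return value


def encode_id(value: int, length: int) -> str:
    """Inverse of decode_id: emit exactly `length` base64url characters."""
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[value & 0x3F])  # lowest 6 bits first
        value >>= 6
    return "".join(reversed(chars))


print(decode_id("AAAB"))                # smallest non-zero 4-character id
print(encode_id(decode_id("Ab3_"), 4))  # round trip of an invented id
```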

Lemma

Attribute Key	Type	Description
lemmaId	String	The 5-character identifier that uniquely identifies this Lemma in the Nayiri Lexicon. It is the base64url encoding of the underlying 30-bit unique identifier.
lemmaString	String	The canonical word form of this Lemma. (For example, ճշդել) There may be more than one Lemma with the same lemmaString in a given Lexeme. For example, in the Uninflected Lexeme with the description “որ (that; when, whenever; if; so that, in order to)”, the two contained Uninflected Lemmas for the conjunction and adverb both have the same lemmaString (որ).
partOfSpeech	String	The part of speech of this lemma, which is one of: { `NOUN, PRONOUN, VERB, ADJECTIVE, ADVERB, CONJUNCTION, INTERJECTION, ARTICLE, DETERMINER, ADPOSITION` }
lemmaDisplayString	String	A human-readable description of this Lemma. By convention, it is the lemmaString followed by an English definition in parentheses. It is meant to provide a way to disambiguate Lemmas with the same lemmaString within the same Lexeme. In the preceding example, the Uninflected Lemma for the conjunction որ has the lemmaDisplayString “որ (that; if; so that, In order to)”, whereas the adverb has “որ (when, whenever)”.
numWordForms	Integer	A convenience attribute showing the total number of Word Forms in this Lemma.
wordForms	JSON Array	The WordForm objects attributed to this Lemma.
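
The following Python sketch shows one way a consumer might group the Lemmas of a Lexeme by `lemmaString` and cross-check `numWordForms` against the length of `wordForms`. The sample Lexeme reuses the որ descriptions quoted above, but every identifier and word form in it is invented for illustration, and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical Lexeme: the identifiers and word forms are invented placeholders.
lexeme = {
    "lexemeId": "AAAA",
    "description": "որ (that; when, whenever; if; so that, in order to)",
    "lemmaType": "UNINFLECTED",
    "lemmas": [
        {
            "lemmaId": "AAAAA",
            "lemmaString": "որ",
            "partOfSpeech": "CONJUNCTION",
            "lemmaDisplayString": "որ (that; if; so that, In order to)",
            "numWordForms": 1,
            "wordForms": [{"s": "որ", "i": "AAAA"}],
        },
        {
            "lemmaId": "AAAAB",
            "lemmaString": "որ",
            "partOfSpeech": "ADVERB",
            "lemmaDisplayString": "որ (when, whenever)",
            "numWordForms": 1,
            "wordForms": [{"s": "որ", "i": "AAAA"}],
        },
    ],
}


def lemmas_by_string(lexeme: dict) -> dict:
    """Group a Lexeme's Lemmas by lemmaString so homographs sit together."""
    groups = defaultdict(list)
    for lemma in lexeme["lemmas"]:
        groups[lemma["lemmaString"]].append(lemma)
    return dict(groups)


def count_mismatches(lexeme: dict) -> list:
    """Return the lemmaIds whose numWordForms differs from len(wordForms)."""
    return [
        lemma["lemmaId"]
        for lemma in lexeme["lemmas"]
        if lemma["numWordForms"] != len(lemma["wordForms"])
    ]


for lemma in lemmas_by_string(lexeme)["որ"]:
    # lemmaDisplayString tells the homographs apart.
    print(lemma["partOfSpeech"], lemma["lemmaDisplayString"])
```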

Word Form

Attribute Key	Type	Description
s	String	An inflected word form (e.g. ճշդեմ, ճշդես, ճշդէ) of the containing Lemma (e.g. ճշդել)
i	String	The unique identifier (inflectionId) of the globally-defined Inflection object representing the morphological analysis of this Word Form.
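
Because each Word Form stores only an identifier in `i`, a consumer will usually want an index from `inflectionId` to Inflection object. The Python sketch below builds such an index and looks up every analysis of a surface string. The sample data is hypothetical: the identifiers and display names are placeholders, and `build_inflection_index` and `analyze` are illustrative helper names.

```python
# Hypothetical miniature data set; identifiers and display names are placeholders.
root = {
    "lexemes": [
        {
            "lexemeId": "AAAA",
            "lemmaType": "VERBAL",
            "lemmas": [
                {
                    "lemmaId": "AAAAA",
                    "lemmaString": "ճշդել",
                    "wordForms": [
                        {"s": "ճշդեմ", "i": "AAAB"},
                        {"s": "ճշդես", "i": "AAAC"},
                    ],
                }
            ],
        }
    ],
    "inflections": [
        {"inflectionId": "AAAB", "lemmaType": "VERBAL", "displayName": {"en": "placeholder 1"}},
        {"inflectionId": "AAAC", "lemmaType": "VERBAL", "displayName": {"en": "placeholder 2"}},
    ],
}


def build_inflection_index(root: dict) -> dict:
    """Map each inflectionId to its globally defined Inflection object."""
    return {inflection["inflectionId"]: inflection for inflection in root["inflections"]}


def analyze(root: dict, surface: str) -> list:
    """Return (lemmaString, Inflection) pairs for every Word Form equal to `surface`."""
    index = build_inflection_index(root)
    results = []
    for lexeme in root["lexemes"]:
        for lemma in lexeme["lemmas"]:
            for form in lemma["wordForms"]:
                if form["s"] == surface:
                    results.append((lemma["lemmaString"], index[form["i"]]))
    return results


for lemma_string, inflection in analyze(root, "ճշդեմ"):
    print(lemma_string, inflection["inflectionId"], inflection["displayName"]["en"])
```

A linear scan is enough for a sketch. For repeated lookups over the full data set, a dictionary keyed by `s` would avoid rescanning every Lexeme.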

Inflection

Attribute Key	Type	Description
inflectionId	String	The 4-character unique identifier of this Inflection object. It is the base64url encoding of the underlying 24-bit unique identifier.
lemmaType	String	One of: { `NOMINAL, VERBAL, UNINFLECTED` }. Note that no attributes besides inflectionId and displayName apply to the special Inflection object with `lemmaType == UNINFLECTED`.
displayName	JSON Object	Provides an internationalized human-readable display name for this Inflection. The keys are the locale, and the values are the localized display names. Both the keys and values are Strings. At present, only the `hy` (Armenian) and `en` (English) locale Strings are supported.
verbalInflectionClass	String	Signifies the broad category of Verbal Inflections represented by this Inflection object. Applicable only when `lemmaType == VERBAL`. One of: { `REGULAR_VERB, INFINITIVE, PRESENT_PARTICIPLE, PAST_PARTICIPLE, FUTURE_PARTICIPLE, PRESENT_PARTICIPLE_SUBSTANTIVE, PAST_PARTICIPLE_SUBSTANTIVE, FUTURE_PARTICIPLE_SUBSTANTIVE` }
verbPolarity	String	Signifies the polarity of the verb for this Inflection. Applicable only when `lemmaType == VERBAL`. One of: { `POSITIVE, NEGATIVE` }
verbTense	String	Signifies the grammatical tense of the verb for this Inflection. Applicable only when `verbalInflectionClass == REGULAR_VERB`. One of: { `SIMPLE_PRESENT, PRESENT_CONTINUOUS, PRESENT_PERFECT, SIMPLE_PAST, PAST_PERFECT, PAST_IMPERFECT, PAST_CONTINUOUS, SIMPLE_FUTURE, FUTURE_PERFECT, NONE` }
verbMood	String	Signifies the grammatical mood of the verb for this Inflection. Applicable only when `verbalInflectionClass == REGULAR_VERB`. One of: { `INDICATIVE, IMPERATIVE, PROHIBITIVE, SUBJUNCTIVE, CONDITIONAL` }
grammaticalPerson	String	Signifies the grammatical person of the verb for this Inflection. Applicable only when `verbalInflectionClass == REGULAR_VERB`. One of: { `FIRST, SECOND, THIRD, NONE` }
grammaticalNumber	String	Signifies the grammatical number of the noun, verb, or substantive participle for this Inflection. Applicable only when `(lemmaType == NOMINAL) || (verbalInflectionClass == (REGULAR_VERB || PRESENT_PARTICIPLE_SUBSTANTIVE || PAST_PARTICIPLE_SUBSTANTIVE || FUTURE_PARTICIPLE_SUBSTANTIVE))`. One of: { `SINGULAR, PLURAL` }
grammaticalCase	String	Signifies the grammatical case of the noun, infinitive, or substantive participle for this Inflection. Applicable only when `(lemmaType == NOMINAL) || (verbalInflectionClass == (INFINITIVE || PRESENT_PARTICIPLE_SUBSTANTIVE || PAST_PARTICIPLE_SUBSTANTIVE || FUTURE_PARTICIPLE_SUBSTANTIVE))`. One of: { `NOMINATIVE, ACCUSATIVE, GENITIVE, DATIVE, ABLATIVE, INSTRUMENTAL, LOCATIVE` }
grammaticalArticle	String	Signifies any grammatical article appended to the noun, infinitive, or substantive participle for this Inflection. Applicable only when `(lemmaType == NOMINAL) || (verbalInflectionClass == (INFINITIVE || PRESENT_PARTICIPLE_SUBSTANTIVE || PAST_PARTICIPLE_SUBSTANTIVE || FUTURE_PARTICIPLE_SUBSTANTIVE))`. One of: { `NONE, DEFINITE_ARTICLE_UHT, DEFINITE_ARTICLE_NOO, POSSESSIVE_ARTICLE_SINGULAR_FIRST_PERSON, POSSESSIVE_ARTICLE_SINGULAR_SECOND_PERSON, POSSESSIVE_ARTICLE_UHT, POSSESSIVE_ARTICLE_NOO, DEFINITE_ARTICLE_NOO_WITH_FIRST_PERSON_POSSESSIVE_ARTICLE, DEFINITE_ARTICLE_NOO_WITH_SECOND_PERSON_POSSESSIVE_ARTICLE, DEFINITE_ARTICLE_NOO_WITH_THIRD_PERSON_POSSESSIVE_ARTICLE_UHT, DEFINITE_ARTICLE_NOO_WITH_THIRD_PERSON_POSSESSIVE_ARTICLE_NOO` }
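
The "Applicable only when" conditions in the table can be read as a small decision procedure. The Python sketch below is one possible encoding of those conditions, not an official validator. The function name `applicable_attributes` is hypothetical, and it assumes that `lemmaType` itself is always present on an Inflection object.

```python
SUBSTANTIVE_PARTICIPLES = {
    "PRESENT_PARTICIPLE_SUBSTANTIVE",
    "PAST_PARTICIPLE_SUBSTANTIVE",
    "FUTURE_PARTICIPLE_SUBSTANTIVE",
}


def applicable_attributes(inflection: dict) -> set:
    """Return the attribute keys that the table marks as applicable."""
    lemma_type = inflection["lemmaType"]
    # Assumption: lemmaType is always present alongside inflectionId and displayName.
    attrs = {"inflectionId", "lemmaType", "displayName"}
    if lemma_type == "UNINFLECTED":
        return attrs

    verbal_class = None
    if lemma_type == "VERBAL":
        attrs |= {"verbalInflectionClass", "verbPolarity"}
        verbal_class = inflection.get("verbalInflectionClass")

    if verbal_class == "REGULAR_VERB":
        attrs |= {"verbTense", "verbMood", "grammaticalPerson"}

    is_nominal = lemma_type == "NOMINAL"
    is_substantive = verbal_class in SUBSTANTIVE_PARTICIPLES
    if is_nominal or verbal_class == "REGULAR_VERB" or is_substantive:
        attrs.add("grammaticalNumber")
    if is_nominal or verbal_class == "INFINITIVE" or is_substantive:
        attrs |= {"grammaticalCase", "grammaticalArticle"}
    return attrs


print(sorted(applicable_attributes(
    {"lemmaType": "VERBAL", "verbalInflectionClass": "INFINITIVE"}
)))
```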

Metadata

The Metadata object provides version information of the lexicon data, some statistics about the data, and human-readable descriptions of its authorship, licensing, and attribution requirements.

Attribute Key	Type	Description
version	String	A version String that uniquely identifies this release. It is formatted as `YYYY-MM-DD-vN`, where `YYYY` is the year, `MM` is the month (01-12), `DD` is the day of the month (01-31), and `N` is the revision number for that day.
license	String	The license under which the data is released
attribution	String	The attribution text that consumers of the data should display in their application or derivative work when using the data
publisher	String
sponsorship	String
author	String
contactEmail	String	A contact email address for support
website	String	URL to the Nayiri website
numLexemes	Integer	The number of Lexemes in the data set
numLemmas	Integer	The total number of Lemmas across all Lexemes in the data set
numWordForms	Integer	The total number of Word Forms across all Lemmas of all Lexemes in the data set
numInflections	Integer	The number of Inflection objects defined globally
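
As a sketch of how the Metadata attributes might be consumed, the Python snippet below parses a `YYYY-MM-DD-vN` version String and recomputes the four statistics from the data itself for comparison. The helper names `parse_version` and `compute_counts` and the sample version value are illustrative assumptions. The regular expression also assumes that `N` is written as a plain decimal integer.

```python
import re
from datetime import date

# Assumes N is a plain decimal integer, e.g. "2024-03-05-v2".
VERSION_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-v(\d+)$")


def parse_version(version: str) -> tuple:
    """Split a version String into (release date, revision number)."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    year, month, day, revision = (int(part) for part in match.groups())
    # date() raises ValueError for impossible dates such as month 13.
    return date(year, month, day), revision


def compute_counts(root: dict) -> dict:
    """Recompute the Metadata statistics from the lexemes and inflections arrays."""
    lemmas = [lemma for lexeme in root["lexemes"] for lemma in lexeme["lemmas"]]
    return {
        "numLexemes": len(root["lexemes"]),
        "numLemmas": len(lemmas),
        "numWordForms": sum(len(lemma["wordForms"]) for lemma in lemmas),
        "numInflections": len(root["inflections"]),
    }


release_date, revision = parse_version("2024-03-05-v2")
print(release_date.isoformat(), revision)
```

Comparing the result of `compute_counts` with the corresponding Metadata values is a cheap integrity check after download.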
