
Morphology

Genja's morph filter (and its shorter variant m) allows inflection using Universal Dependencies (UD) features. The most commonly used UD features are:

  1. Case: Nom, Gen, Dat, Acc, Loc, Ins
  2. Tense: Pres, Fut, Past
  3. Voice: Act, Pass

Example: Národní divadlo v Praze bylo postaveno v roce 1881.

  • Specification using the full morph filter:

    Národní divadlo v {{'Praha'|morph('Case=Loc')}}
    {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}}
    v {{'rok'|morph('Case=Loc')}} 1881.
  • Equivalent version with the shorter m filter:

    Národní divadlo v {{'Praha'|m('6')}} {{'být'|m('rNS')}} {{'postavit'|m('sNS')}} v {{'rok'|m('6')}} 1881.

Agreement and reference

Reference ensures that syntactically linked items share the same grammatical features. It is needed when the grammatical features of one of the items are not known in advance. The word or phrase that copies the features is marked with the parameter agr (or @ in the shorter variant). The word or phrase the features are copied from is marked with the parameter ref (or # in the shorter variant). Both members of an agreement must carry the same index so that they can be matched during agreement postprocessing. The index cannot be 0.

{{parties[0]|name|morph('Case=Nom', ref=1)}} {{'vyhrát'|morph('Tense=Past', agr=1)}} volby. → Piráti vyhráli volby. / ODS vyhrála volby. / ANO vyhrálo volby.
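
The index matching described above can be illustrated with a small Python sketch. It is not Genja's implementation: the item dictionaries, the function name resolve_agreement, the choice of copied features (Gender, Animacy, Number) and the feature values given for ODS are all assumptions made for illustration. It only shows how an agr item picks up features from the ref item with the same non-zero index.

```python
# Hypothetical sketch: which features take part in agreement is an assumption.
AGREEMENT_FEATURES = ("Gender", "Animacy", "Number")


def _check_index(index):
    # The documentation states that the index cannot be 0.
    if index == 0:
        raise ValueError("agreement index cannot be 0")


def resolve_agreement(items):
    """Copy agreement features from each ref item to agr items with the same index."""
    sources = {}
    for item in items:
        index = item.get("ref")
        if index is not None:
            _check_index(index)
            sources[index] = item["feats"]

    resolved = []
    for item in items:
        feats = dict(item["feats"])  # do not mutate the input
        index = item.get("agr")
        if index is not None:
            _check_index(index)
            if index not in sources:
                raise ValueError(f"no ref item with index {index}")
            for name in AGREEMENT_FEATURES:
                if name in sources[index]:
                    feats[name] = sources[index][name]
        resolved.append({**item, "feats": feats})
    return resolved


items = [
    # Feature values for the party name are illustrative, not taken from Genja.
    {"lemma": "ODS", "feats": {"Case": "Nom", "Gender": "Fem", "Number": "Sing"}, "ref": 1},
    {"lemma": "vyhrát", "feats": {"Tense": "Past"}, "agr": 1},
]
verb = resolve_agreement(items)[1]
print(verb["feats"])  # {'Tense': 'Past', 'Gender': 'Fem', 'Number': 'Sing'}
```

The sketch stops at feature copying; producing the inflected form (vyhrála) is the job of the morphological generator.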

Note that agreement cannot be combined with text modifying filters (e.g., capitalize) at this time:

  • unsupported: {{'senátorka'|m('#1')|capitalize}} {{'vyhrát'|m('r@1')}}; use {{'Senátorka'|m('#1')}} {{'vyhrát'|m('r@1')}}
  • unsupported: {{5|unitNum('senátorka', m='@1')|capitalize}} {{'vyhrát'|m('r@1')}}
  • unsupported: {{2|ord_numeral_in_words|m('@1')|capitalize}} {{'senátorka'|m('#1')}}

Possessor's agreement

Possessor's agreement is a special type of agreement between an object and its possessor. It differs from the "standard" agreement in that it changes different parts of the word. In the templates, it is denoted by the parameter poss (or $ in the shorter variant). See the examples:

  • Přišel tam {{'předseda'|m('S1#1')}}. {{'Jeho'|m('@2$1')}} {{'syn'|m('S4#2')}} jsem neviděl. → Přišel tam předseda. Jeho syna jsem neviděl.
  • Přišel tam {{'předseda'|m('S1#3')}}. {{'Jeho'|m('@4$3')}} {{'dcera'|m('S4#4')}} jsem neviděl. → Přišel tam předseda. Jeho dceru jsem neviděl.
  • Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl. → Přišla tam předsedkyně. Jejího syna jsem neviděl.
  • Přišla tam {{'předsedkyně'|m('S1#7')}}. {{'Jeho'|m('@8$7')}} {{'dcera'|m('S4#8')}} jsem neviděl. → Přišla tam předsedkyně. Její dceru jsem neviděl.

The form of the word jeho differs based on the gender and number of the possessor (jeho vs její vs jejich). Moreover, in the examples above, the word jeho also differs based on the gender of the words syna/dceru; this is the "standard" agreement.

Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6', poss=5)}} {{'syn'|m('S4#6')}} jsem neviděl.

is equivalent to:

Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl. → Přišla tam předsedkyně. Jejího syna jsem neviděl.

morph – long morphology filter (Cs/En)

Inflects a phrase as specified by the argument UD features. The UD features are passed as a single argument to the morph filter. The format is as follows:

  • Name and value of a feature are separated by =, e.g. Case=Loc
  • Multiple features are separated by |, e.g., Voice=Pass|Gender=Neut|Number=Sing

Example:

  • {{'go'|morph('Tense=Past')}} → went (English)

  • {{'house'|morph('Number=Plur')}} → houses (English)

  • {{'this'|capitalize|morph('Number=Plur')}} {{'goose'|morph('Number=Plur')}} {{'go'|morph('Tense=Past')}} to see {{'PRS_OBJ'|pronoun(num='Sing', gen='Fem')}}. → These geese went to see her.

  • Národní divadlo {{'Praha'|morph('Case=Loc', prep='v')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} {{'rok'|morph('Case=Loc', prep='v')}} 1881. → Národní divadlo v Praze bylo postaveno v roce 1881.
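
The feature-string format described above (name and value joined by =, features separated by |) can be sketched in Python as follows. This is a minimal illustration of the format only, not part of Genja; the function names parse_ud_features and format_ud_features are hypothetical.

```python
def parse_ud_features(spec: str) -> dict[str, str]:
    """Split a UD feature string such as 'Case=Loc' into a name -> value dict."""
    features: dict[str, str] = {}
    for item in spec.split("|"):
        item = item.strip()
        if not item:
            continue  # tolerate an empty spec or a trailing '|'
        name, sep, value = item.partition("=")
        if not sep or not name or not value:
            raise ValueError(f"malformed feature: {item!r}")
        features[name] = value
    return features


def format_ud_features(features: dict[str, str]) -> str:
    """Join a feature dict back into the 'Name=Value|Name=Value' form."""
    return "|".join(f"{name}={value}" for name, value in features.items())


parsed = parse_ud_features("Voice=Pass|Gender=Neut|Number=Sing")
print(parsed)  # {'Voice': 'Pass', 'Gender': 'Neut', 'Number': 'Sing'}
print(format_ud_features(parsed))  # Voice=Pass|Gender=Neut|Number=Sing
```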

m – short morphology filter

Shorter variant of the morph filter. Each morphological feature is represented by a single character. Agreement indices are denoted by @ (equivalent of the agr parameter) and # (equivalent of the ref parameter). This filter supports only the most common features. To use other features, the morph filter has to be used.

List of supported features:

Category  m filter code  morph filter full form    Meaning
Gender    M              Gender=Masc|Animacy=Anim  masculine animate
          F              Gender=Fem                feminine
          N              Gender=Neut               neuter
          I              Gender=Masc|Animacy=Inan  masculine inanimate
Number    P              Number=Plur               plural
          S              Number=Sing               singular
Case      1              Case=Nom                  nominative
          2              Case=Gen                  genitive
          3              Case=Dat                  dative
          4              Case=Acc                  accusative
          5              Case=Voc                  vocative
          6              Case=Loc                  locative
          7              Case=Ins                  instrumental
Tense     p              Tense=Pres                present
          r              Tense=Past                past
          f              Tense=Fut                 future
Voice     s              Voice=Pass                passive

All the following examples are equivalent, except the last one adds the preposition z as well:

  • {{'hezký'|m('@21')}} {{'muž'|m('P2#21')}} → hezkých mužů
  • {{'hezký'|m('@21')}} {{'muž'|m('P2', ref='21')}} → hezkých mužů
  • {{'hezký'|m('@21')}} {{'muž'|m('P2', ref=21)}} → hezkých mužů
  • {{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21)}} → hezkých mužů
  • {{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21, prep='z')}} → z hezkých mužů
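
The correspondence between the two notations can be sketched in Python. The mapping below is copied from the list of supported features; everything else (the name expand_m, the regular expression, and the rule that all digits directly after @, # or $ belong to the index) is an assumption made for illustration and may differ from how Genja actually parses the string.

```python
import re

# Code -> UD features, taken from the list of supported features.
M_CODES = {
    "M": "Gender=Masc|Animacy=Anim", "F": "Gender=Fem", "N": "Gender=Neut",
    "I": "Gender=Masc|Animacy=Inan",
    "P": "Number=Plur", "S": "Number=Sing",
    "1": "Case=Nom", "2": "Case=Gen", "3": "Case=Dat", "4": "Case=Acc",
    "5": "Case=Voc", "6": "Case=Loc", "7": "Case=Ins",
    "p": "Tense=Pres", "r": "Tense=Past", "f": "Tense=Fut",
    "s": "Voice=Pass",
}
# Index markers and the keyword arguments they stand for.
INDEX_MARKERS = {"@": "agr", "#": "ref", "$": "poss"}
# Either a marker followed by digits, or a single feature character.
TOKEN = re.compile(r"([@#$])(\d+)|(.)")


def expand_m(spec):
    """Return (UD feature string, index dict) for a short m-style spec."""
    features = []
    indices = {}
    for marker, index, code in TOKEN.findall(spec):
        if marker:
            indices[INDEX_MARKERS[marker]] = int(index)
        elif code in M_CODES:
            features.append(M_CODES[code])
        else:
            raise ValueError(f"unsupported m code: {code!r}")
    return "|".join(features), indices


print(expand_m("P2#21"))  # ('Number=Plur|Case=Gen', {'ref': 21})
print(expand_m("sNS"))    # ('Voice=Pass|Gender=Neut|Number=Sing', {})
```

Under these assumptions, 'P2#21' expands to the same features and ref index as morph('Number=Plur|Case=Gen', ref=21) in the list above.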

Morphology-related arguments

In general, all filters that are related to morphology (support word declension) can use the following arguments:

  • m - morphological features: either in the form of Universal Dependencies features, as described in the section about the morph filter, or in the form of short features, as described in the section about the m filter.
  • ref - agreement reference id, described in the Agreement and reference section
  • agr - agreement target id, described in the Agreement and reference section
  • poss - possessor's agreement reference, described in the Possessor's agreement section
  • prep - preposition, described in Vocalization of prepositions

These arguments can be used in the following filters: name, unit, unitNum. They are used as keyword arguments, meaning that the key=value syntax has to be used, but the order of the arguments does not matter.

Czech examples

Passive voice

  • {{'být'|m('r@1')}} {{'zvolit'|m('s@1')}} {{'muž'|m('S1', ref=1)}} → byl zvolen muž
  • {{'být'|m('r@2')}} {{'zvolit'|m('s@2')}} {{'žena'|m('S1', ref=2)}} → byla zvolena žena
  • {{'být'|m('r@3')}} {{'zvolit'|morph('Voice=Pass', agr=3)}} {{'muž'|m('S1', ref=3)}} → byl zvolen muž
  • {{'být'|m('r@4')}} {{'zvolit'|morph('Voice=Pass', agr=4)}} {{'žena'|m('S1', ref=4)}} → byla zvolena žena

Future tense

Future is expressed differently for imperfective (psát, vyhrávat) and perfective (napsat, vyhrát) verbs:

  • for imperfective verbs, use the verb být in future tense plus infinitive:

    Volby {{'být'|morph('Tense=Fut|Person=3',agr=2)}} vyhrávat {{'ODS'|m('1#2')}}. → Volby bude vyhrávat ODS.

  • for perfective verbs, use the morphologically present tense:

    Volby {{'vyhrát'|morph('Tense=Pres|Person=3',agr=2)}} {{'ODS'|m('1#2')}}. → Volby vyhraje ODS.

Czech derivation

Derivation is a process of creating a related word from a base word, for example by adding a suffix.

Female forms

We provide the femForm filter for generating female forms of nouns, e.g., senátor → senátorka.

  • Direct usage:

    Paní Nováková je {{'senátor'|femForm(m='7')}}. → Paní Nováková je senátorkou.

  • Conditional usage:

    {{'pan'|femForm(person.isFemale, m='S1')}} {{person|name(m='1')}} je {{'senátor'|femForm(person.isFemale, m='7')}} → paní Nováková je senátorkou / pan Novák je senátorem

In this example, generating the female variant depends on the isFemale property of the entity person.

English pronouns

For English, we provide the pronoun filter which generates pronouns in agreement with a given expression, for example, the name of a person for which we want to generate he/she based on their gender.

Example: {{cand|name(ref='1')}} claimed fifth place with {{...}} percent of people voting for {{'PRS_OBJ'|pronoun(agr='1')}}

Values of the input argument:

  • PRS_SUBJ - personal pronoun, nominative: he, she, ...
  • PRS_OBJ - personal pronoun, accusative: him, her, ...
  • POSS_DET - possessive determiner: his, her, ...
  • POSS_PRON - possessive pronoun: his, hers, ...
  • REFL - reflexive pronoun (accusative): himself, herself, ...
  • EMPH - emphatic pronoun: himself, herself as in He himself did it.

Entities used for agreement have to be imported to GKB with gender info to enable the filter to work correctly.

Pronouns can also be generated without reference to another expression by providing the necessary morphological features:

{{'PRS_SUBJ'|pronoun(num='Sing', gen='Fem')}} → she
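
The selection of a pronoun from its type and morphological features can be pictured as a table lookup. The Python sketch below is an illustration under stated assumptions, not the pronoun filter itself: the function name pick_pronoun is hypothetical, and the table covers only the singular masculine and feminine forms listed above.

```python
# Forms taken from the list of input values above; other genders and
# numbers are deliberately left out of this sketch.
PRONOUNS = {
    ("Sing", "Masc"): {
        "PRS_SUBJ": "he", "PRS_OBJ": "him",
        "POSS_DET": "his", "POSS_PRON": "his",
        "REFL": "himself", "EMPH": "himself",
    },
    ("Sing", "Fem"): {
        "PRS_SUBJ": "she", "PRS_OBJ": "her",
        "POSS_DET": "her", "POSS_PRON": "hers",
        "REFL": "herself", "EMPH": "herself",
    },
}


def pick_pronoun(kind, num, gen):
    """Look up an English pronoun by type, number and gender."""
    try:
        return PRONOUNS[(num, gen)][kind]
    except KeyError:
        raise ValueError(f"no form for {kind!r} with num={num!r}, gen={gen!r}") from None


print(pick_pronoun("PRS_SUBJ", num="Sing", gen="Fem"))  # she
print(pick_pronoun("PRS_OBJ", num="Sing", gen="Fem"))   # her
```

With agr instead of explicit features, the number and gender would come from the referenced entity, which is why that entity needs gender information.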