5. Morphology

Genja’s morph filter (and its shorter variant m) inflects words and phrases according to Universal Dependencies (UD) features. The most commonly used UD features are:

  1. Case: Nom, Gen, Dat, Acc, Loc, Ins

  2. Tense: Pres, Fut, Past

  3. Voice: Act, Pass

Example:

Národní divadlo v Praze bylo postaveno v roce 1881.

  • Specification using the full morph filter:

    Národní divadlo v {{'Praha'|morph('Case=Loc')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} v {{'rok'|morph('Case=Loc')}} 1881.

  • Equivalent version with the shorter m filter:

    Národní divadlo v {{'Praha'|m('6')}} {{'být'|m('rNS')}} {{'postavit'|m('sNS')}} v {{'rok'|m('6')}} 1881.

5.1. Agreement and Reference

Agreement by reference ensures that syntactically linked items share the same grammatical features. It is needed when the features of one item are not known in advance, as with parties[0] in the example below. The word or phrase that copies its grammatical features is marked with the parameter agr (or @ in the shorter variant). The word or phrase the features are copied from is marked with the parameter ref (or # in the shorter variant). Both members of an agreement pair must use the same index so that they can be matched during agreement postprocessing. The index cannot be 0.

{{parties[0]|name|morph('Case=Nom', ref=1)}} {{'vyhrát'|morph('Tense=Past', agr=1)}} volby.

Piráti vyhráli volby. / ODS vyhrála volby. / ANO vyhrálo volby.
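The matching of ref and agr indices can be pictured with a small Python sketch. It is not Genja's actual postprocessing: the token dictionaries, the resolve_agreement function, and the choice of copied features (Gender, Animacy, Number) are assumptions made purely for illustration.

```python
# Hypothetical sketch of agreement postprocessing; not Genja's implementation.
# Assumption: only these features are copied from the ref item to the agr item.
AGREEMENT_FEATURES = ("Gender", "Animacy", "Number")


def resolve_agreement(tokens):
    """Copy agreement features from each ref token to the agr tokens sharing its index."""
    for t in tokens:
        if t.get("ref") == 0 or t.get("agr") == 0:
            raise ValueError("agreement index cannot be 0")
    # Map every index to the features of the token that declares ref=index.
    sources = {t["ref"]: t["features"] for t in tokens if "ref" in t}
    resolved = []
    for t in tokens:
        features = dict(t["features"])
        if "agr" in t:
            source = sources[t["agr"]]  # KeyError if no token declares this ref
            for name in AGREEMENT_FEATURES:
                if name in source:
                    features[name] = source[name]
        resolved.append({**t, "features": features})
    return resolved


tokens = [
    {"lemma": "ODS", "features": {"Case": "Nom", "Gender": "Fem", "Number": "Sing"}, "ref": 1},
    {"lemma": "vyhrát", "features": {"Tense": "Past"}, "agr": 1},
]
verb = resolve_agreement(tokens)[1]
print(verb["features"])  # {'Tense': 'Past', 'Gender': 'Fem', 'Number': 'Sing'}
```

With these features the verb would surface as vyhrála, matching the ODS variant of the example output.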

Note that agreement cannot currently be combined with text-modifying filters (e.g., capitalize):

  • unsupported: {{'senátorka'|m('#1')|capitalize}} {{'vyhrát'|m('r@1')}}; use {{'Senátorka'|m('#1')}} {{'vyhrát'|m('r@1')}}

  • unsupported: {{5|unitNum('senátorka', m='@1')|capitalize}} {{'vyhrát'|m('r@1')}}

  • unsupported: {{2|ord_numeral_in_words|m('@1')|capitalize}} {{'senátorka'|m('#1')}}

5.1.1. Possessor’s agreement

Possessor’s agreement is a special type of agreement between an object and its possessor. It differs from the “standard” agreement because it changes a different part of the word. In the templates, it is denoted by the parameter poss (or $ in the shorter variant). See the examples:

  • Přišel tam {{'předseda'|m('S1#1')}}. {{'Jeho'|m('@2$1')}} {{'syn'|m('S4#2')}} jsem neviděl. → Přišel tam předseda. Jeho syna jsem neviděl.

  • Přišel tam {{'předseda'|m('S1#3')}}. {{'Jeho'|m('@4$3')}} {{'dcera'|m('S4#4')}} jsem neviděl. → Přišel tam předseda. Jeho dceru jsem neviděl.

  • Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl. → Přišla tam předsedkyně. Jejího syna jsem neviděl.

  • Přišla tam {{'předsedkyně'|m('S1#7')}}. {{'Jeho'|m('@8$7')}} {{'dcera'|m('S4#8')}} jsem neviděl. → Přišla tam předsedkyně. Její dceru jsem neviděl.

The form of the word “jeho” depends on the gender and number of the possessor (“jeho” vs. “její” vs. “jejich”). In addition, in the examples above, the word also changes according to the gender of the possessed noun (“syna”/“dceru”); this is the “standard” agreement.

Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6', poss=5)}} {{'syn'|m('S4#6')}} jsem neviděl.

is equivalent to:

Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl. → Přišla tam předsedkyně. Jejího syna jsem neviděl.

5.2. morph (Cs/En)

Inflects a phrase as specified by the argument UD features. The UD features are passed as a single argument to the morph filter. The format is as follows:

  • Name and Value of a feature are separated by =, e.g. Case=Loc

  • Multiple features are separated by |, e.g. Voice=Pass|Gender=Neut|Number=Sing
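To make the argument format concrete, here is a minimal Python sketch that splits such a feature string into name/value pairs. The function parse_ud_features is hypothetical and only mirrors the two rules above; it says nothing about how Genja itself parses the argument.

```python
def parse_ud_features(spec: str) -> dict[str, str]:
    """Split a UD feature string such as 'Case=Loc|Number=Sing' into a dict."""
    features = {}
    for item in spec.split("|"):
        if not item:
            continue  # tolerate an empty spec or a trailing separator
        name, sep, value = item.partition("=")
        if not sep or not name or not value:
            raise ValueError(f"malformed feature: {item!r}")
        features[name] = value
    return features


print(parse_ud_features("Voice=Pass|Gender=Neut|Number=Sing"))
# {'Voice': 'Pass', 'Gender': 'Neut', 'Number': 'Sing'}
```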

Example:

  • {{'go'|morph('Tense=Past')}} → went (English)

  • {{'house'|morph('Number=Plur')}} → houses (English)

  • {{'this'|capitalize|morph('Number=Plur')}} {{'goose'|morph('Number=Plur')}} {{'go'|morph('Tense=Past')}} to see {{'PRS_OBJ'|pronoun(num='Sing', gen='Fem')}}. → These geese went to see her.

  • Národní divadlo {{'Praha'|morph('Case=Loc', prep='v')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} {{'rok'|morph('Case=Loc', prep='v')}} 1881.

    Národní divadlo v Praze bylo postaveno v roce 1881.

5.3. m

Shorter variant of the morph filter. Each morphological feature is represented by a single character. Agreement indices are denoted by “@” (equivalent of the agr parameter) and “#” (equivalent of the ref parameter); “$” corresponds to the poss parameter (see 5.1.1). This filter supports only the most common features. To use other features, use the morph filter.

List of supported features:

Category  m filter code  morph filter full form    Meaning
Gender    M              Gender=Masc|Animacy=Anim  masculine animate
Gender    F              Gender=Fem                feminine
Gender    N              Gender=Neut               neuter
Gender    I              Gender=Masc|Animacy=Inan  masculine inanimate
Number    P              Number=Plur               plural
Number    S              Number=Sing               singular
Case      1              Case=Nom                  nominative
Case      2              Case=Gen                  genitive
Case      3              Case=Dat                  dative
Case      4              Case=Acc                  accusative
Case      5              Case=Voc                  vocative
Case      6              Case=Loc                  locative
Case      7              Case=Ins                  instrumental
Tense     p              Tense=Pres                present
Tense     r              Tense=Past                past
Tense     f              Tense=Fut                 future
Voice     s              Voice=Pass                passive

Examples (the first four are equivalent; the last one additionally prepends the preposition z):

  • {{'hezký'|m('@21')}} {{'muž'|m('P2#21')}} → hezkých mužů

  • {{'hezký'|m('@21')}} {{'muž'|m('P2', ref='21')}} → hezkých mužů

  • {{'hezký'|m('@21')}} {{'muž'|m('P2', ref=21)}} → hezkých mužů

  • {{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21)}} → hezkých mužů

  • {{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21, prep='z')}} → z hezkých mužů
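The correspondence between m codes and morph arguments can be sketched in Python as follows. This is an illustration only, not Genja's parser: the names M_CODES, MARKERS and expand_m are hypothetical, and the sketch assumes that index markers (@, #, $) are always followed directly by their digits, so those digits are never read as case codes.

```python
import re

# Single-character codes and their UD equivalents, taken from the table above.
M_CODES = {
    "M": "Gender=Masc|Animacy=Anim", "F": "Gender=Fem", "N": "Gender=Neut",
    "I": "Gender=Masc|Animacy=Inan",
    "P": "Number=Plur", "S": "Number=Sing",
    "1": "Case=Nom", "2": "Case=Gen", "3": "Case=Dat", "4": "Case=Acc",
    "5": "Case=Voc", "6": "Case=Loc", "7": "Case=Ins",
    "p": "Tense=Pres", "r": "Tense=Past", "f": "Tense=Fut",
    "s": "Voice=Pass",
}
MARKERS = {"@": "agr", "#": "ref", "$": "poss"}


def expand_m(code):
    """Return (UD feature string, index parameters) for an m filter code."""
    params = {}

    def take(match):
        # Record e.g. '#21' as ref=21 and remove it from the code string.
        params[MARKERS[match.group(1)]] = int(match.group(2))
        return ""

    feature_part = re.sub(r"([@#$])(\d+)", take, code)
    features = "|".join(M_CODES[ch] for ch in feature_part)  # KeyError on unknown code
    return features, params


print(expand_m("P2#21"))  # ('Number=Plur|Case=Gen', {'ref': 21})
print(expand_m("@2$1"))   # ('', {'agr': 2, 'poss': 1})
```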

5.5. Czech examples

5.5.1. Passive voice

  • {{'být'|m('r@1')}} {{'zvolit'|m('s@1')}} {{'muž'|m('S1', ref=1)}} → byl zvolen muž

  • {{'být'|m('r@2')}} {{'zvolit'|m('s@2')}} {{'žena'|m('S1', ref=2)}} → byla zvolena žena

  • {{'být'|m('r@3')}} {{'zvolit'|morph('Voice=Pass', agr=3)}} {{'muž'|m('S1', ref=3)}} → byl zvolen muž

  • {{'být'|m('r@4')}} {{'zvolit'|morph('Voice=Pass', agr=4)}} {{'žena'|m('S1', ref=4)}} → byla zvolena žena

5.5.2. Future tense

Future is expressed differently for imperfective (psát, vyhrávat) and perfective (napsat, vyhrát) verbs:

  • for imperfective verbs, use být in the future tense plus the infinitive

  • for perfective verbs, use the morphological present tense

Examples:

  • Volby {{'vyhrát'|morph('Tense=Pres|Person=3',agr=2)}} {{'ODS'|m('1#2')}} → Volby vyhraje ODS

  • Volby {{'být'|morph('Tense=Fut|Person=3',agr=2)}} vyhrávat {{'ODS'|m('1#2')}} → Volby bude vyhrávat ODS
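If templates are generated programmatically, the choice between the two constructions could be wrapped in a small helper. The Python sketch below is hypothetical: future_template is not part of Genja, the caller has to supply the verb's aspect (no lexicon lookup is assumed), and Person=3 is hard-coded as in the examples above.

```python
def future_template(verb: str, perfective: bool, agr: int) -> str:
    """Build a Genja snippet expressing the future tense of a Czech verb.

    verb is the infinitive; perfective must be supplied by the caller.
    """
    if perfective:
        # Perfective verbs: the morphological present has future meaning.
        return "{{'%s'|morph('Tense=Pres|Person=3',agr=%d)}}" % (verb, agr)
    # Imperfective verbs: future form of být followed by the infinitive.
    return "{{'být'|morph('Tense=Fut|Person=3',agr=%d)}} %s" % (agr, verb)


print(future_template("vyhrát", perfective=True, agr=2))
# {{'vyhrát'|morph('Tense=Pres|Person=3',agr=2)}}
print(future_template("vyhrávat", perfective=False, agr=2))
# {{'být'|morph('Tense=Fut|Person=3',agr=2)}} vyhrávat
```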

5.6. Czech derivation

Derivation is a process of creating a related word from a base word, for example by adding a suffix.

5.6.1. Female forms

We provide the femForm filter for generating female forms of nouns, e.g., senátor → senátorka.

Direct usage:

  • Paní Nováková je {{'senátor'|femForm(m='7')}} → Paní Nováková je senátorkou

Conditional usage:

{{'pan'|femForm(person.isFemale, m='S1')}} {{person|name(m='1')}} je {{'senátor'|femForm(person.isFemale, m='7')}} → paní Nováková je senátorkou / pan Novák je senátorem

In this example, generating the female variant depends on the isFemale property of the entity person.

5.7. English pronouns

For English, we provide the pronoun filter which generates pronouns in agreement with a given expression, for example the name of a person for which we want to generate he/she based on their gender.

Example:

  • {{cand|name(ref='1')}} claimed fifth place with {{...}} percent of people voting for {{'PRS_OBJ'|pronoun(agr='1')}}

Values of the input argument:

  • PRS_SUBJ - personal pronoun, nominative: he, she, …

  • PRS_OBJ - personal pronoun, accusative: him, her, …

  • POSS_DET - possessive determiner: his, her, …

  • POSS_PRON - possessive pronoun: his, hers, …

  • REFL - reflexive pronoun (accusative): himself, herself, …

  • EMPH - emphatic pronoun: himself, herself as in “He himself did it.”

Entities used for agreement have to be imported to GKB with gender info to enable the filter to work correctly.

Pronouns can also be generated without reference to another expression by providing the necessary morphological features:

{{'PRS_SUBJ'|pronoun(num='Sing', gen='Fem')}} → she
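The relation between the input argument and the generated form can be illustrated with a simple lookup table in Python. This is only a sketch of the idea, not the filter's implementation: the PRONOUNS table and the pronoun function are hypothetical, cover third-person forms only, and borrow the parameter names num and gen from the filter for readability.

```python
# Third-person English pronouns by type; the "Plur" column ignores gender.
_REFLEXIVE = {"Masc": "himself", "Fem": "herself", "Neut": "itself", "Plur": "themselves"}
PRONOUNS = {
    "PRS_SUBJ": {"Masc": "he", "Fem": "she", "Neut": "it", "Plur": "they"},
    "PRS_OBJ": {"Masc": "him", "Fem": "her", "Neut": "it", "Plur": "them"},
    "POSS_DET": {"Masc": "his", "Fem": "her", "Neut": "its", "Plur": "their"},
    "POSS_PRON": {"Masc": "his", "Fem": "hers", "Neut": "its", "Plur": "theirs"},
    "REFL": _REFLEXIVE,
    "EMPH": _REFLEXIVE,  # emphatic pronouns share their forms with reflexives
}


def pronoun(kind: str, num: str, gen: str) -> str:
    """Look up a third-person pronoun by type, number and gender."""
    key = "Plur" if num == "Plur" else gen
    return PRONOUNS[kind][key]


print(pronoun("PRS_SUBJ", num="Sing", gen="Fem"))  # she
print(pronoun("PRS_OBJ", num="Sing", gen="Fem"))   # her
```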