5. Morphology
Genja’s morph filter (and its shorter variant m) allows inflection using Universal Dependencies (UD) features.
Most commonly used UD features:
Case: Nom, Gen, Dat, Acc, Loc, Ins
Tense: Pres, Fut, Past
Voice: Act, Pass
Example:
Národní divadlo v Praze bylo postaveno v roce 1881.
Specification using the full morph filter:
Národní divadlo v {{'Praha'|morph('Case=Loc')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} v {{'rok'|morph('Case=Loc')}} 1881.
Equivalent version with the shorter m filter:
Národní divadlo v {{'Praha'|m('6')}} {{'být'|m('rNS')}} {{'postavit'|m('sNS')}} v {{'rok'|m('6')}} 1881.
5.1. Agreement and Reference
Reference ensures that syntactically linked items share the same grammatical features.
We use it when the grammatical features of one of the items are not known in advance.
The word or phrase that copies grammatical features is marked with the agr parameter (or @ in the shorter variant).
The word or phrase that the features are copied from is marked with the ref parameter (or # in the shorter variant).
Both members of an agreement pair must carry the same index so that they can be matched during agreement postprocessing.
The index cannot be 0.
{{parties[0]|name|morph('Case=Nom', ref=1)}} {{'vyhrát'|morph('Tense=Past', agr=1)}} volby.
→ Piráti vyhráli volby. / ODS vyhrála volby. / ANO vyhrálo volby.
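The matching of ref and agr indices can be pictured with a small Python sketch. It is not Genja's implementation: the Slot class, the resolve_agreement function and the choice to copy only Gender and Number are assumptions made for illustration, and no actual inflection is performed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Slot:
    """One template slot; a hypothetical stand-in for a filtered expression."""
    lemma: str
    feats: Dict[str, str] = field(default_factory=dict)
    ref: Optional[int] = None  # features are copied FROM this slot
    agr: Optional[int] = None  # features are copied INTO this slot


def resolve_agreement(slots: List[Slot],
                      copied: Sequence[str] = ("Gender", "Number")) -> List[Slot]:
    """Copy the `copied` features from each ref slot to agr slots with the same index."""
    sources: Dict[int, Dict[str, str]] = {}
    for slot in slots:
        for index in (slot.ref, slot.agr):
            if index == 0:
                raise ValueError("agreement index cannot be 0")
        if slot.ref is not None:
            sources[slot.ref] = slot.feats
    for slot in slots:
        if slot.agr is None:
            continue
        if slot.agr not in sources:
            raise KeyError(f"no ref with index {slot.agr}")
        source = sources[slot.agr]
        for name in copied:
            if name in source:
                slot.feats[name] = source[name]
    return slots


# "ODS" is feminine singular, so the verb picks up those features (cf. "ODS vyhrála volby").
subject = Slot("ODS", {"Case": "Nom", "Gender": "Fem", "Number": "Sing"}, ref=1)
verb = Slot("vyhrát", {"Tense": "Past"}, agr=1)
resolve_agreement([subject, verb])
print(verb.feats)
```

The sketch resolves sources in a first pass and targets in a second, so the agr slot may appear before or after its ref slot in the template.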
Note that agreement cannot currently be combined with text-modifying filters (e.g., capitalize):
unsupported: {{'senátorka'|m('#1')|capitalize}} {{'vyhrát'|m('r@1')}}; use {{'Senátorka'|m('#1')}} {{'vyhrát'|m('r@1')}} instead
unsupported: {{5|unitNum('senátorka', m='@1')|capitalize}} {{'vyhrát'|m('r@1')}}
unsupported: {{2|ord_numeral_in_words|m('@1')|capitalize}} {{'senátorka'|m('#1')}}
5.1.1. Possessor’s agreement
Possessor’s agreement is a special type of agreement between an object and its possessor.
It differs from the “standard” agreement in that it changes a different part of the word.
In templates, it is marked with the poss parameter (or $ in the shorter variant). See the example:
Přišel tam {{'předseda'|m('S1#1')}}. {{'Jeho'|m('@2$1')}} {{'syn'|m('S4#2')}} jsem neviděl.
→ Přišel tam předseda. Jeho syna jsem neviděl.
Přišel tam {{'předseda'|m('S1#3')}}. {{'Jeho'|m('@4$3')}} {{'dcera'|m('S4#4')}} jsem neviděl.
→ Přišel tam předseda. Jeho dceru jsem neviděl.
Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl.
→ Přišla tam předsedkyně. Jejího syna jsem neviděl.
Přišla tam {{'předsedkyně'|m('S1#7')}}. {{'Jeho'|m('@8$7')}} {{'dcera'|m('S4#8')}} jsem neviděl.
→ Přišla tam předsedkyně. Její dceru jsem neviděl.
The form of the word “jeho” depends on the gender and number of the possessor (“jeho” vs. “její” vs. “jejich”). In addition, in the examples above its form also depends on the gender of the possessed noun (“syna” vs. “dceru”); this is the “standard” agreement.
Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6', poss=5)}} {{'syn'|m('S4#6')}} jsem neviděl.
is equivalent to:
Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl.
→ Přišla tam předsedkyně. Jejího syna jsem neviděl.
5.2. morph (Cs/En)
Inflects a phrase as specified by the UD features given as its argument.
The UD features are passed to the morph filter as a single argument in the following format:
The name and value of a feature are separated by =, e.g. Case=Loc
Multiple features are separated by |, e.g. Voice=Pass|Gender=Neut|Number=Sing
Example:
{{'go'|morph('Tense=Past')}}
→ went (English)
{{'house'|morph('Number=Plur')}}
→ houses (English)
{{'this'|capitalize|morph('Number=Plur')}} {{'goose'|morph('Number=Plur')}} {{'go'|morph('Tense=Past')}} to see {{'PRS_OBJ'|pronoun(num='Sing', gen='Fem')}}.
→ These geese went to see her
Národní divadlo {{'Praha'|morph('Case=Loc', prep='v')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} {{'rok'|morph('Case=Loc', prep='v')}} 1881.
→ Národní divadlo v Praze bylo postaveno v roce 1881.
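The argument format described above (Name=Value pairs joined by |) is easy to check outside the template engine. The following Python sketch parses and re-serializes such a string; the function names parse_ud_features and format_ud_features are hypothetical helpers for illustration and are not part of Genja.

```python
from typing import Dict


def parse_ud_features(spec: str) -> Dict[str, str]:
    """Split a UD feature string such as 'Voice=Pass|Gender=Neut' into a dict."""
    features: Dict[str, str] = {}
    if not spec:
        return features
    for pair in spec.split("|"):
        name, sep, value = pair.partition("=")
        if not sep or not name or not value:
            raise ValueError(f"malformed feature: {pair!r}")
        features[name] = value
    return features


def format_ud_features(features: Dict[str, str]) -> str:
    """Inverse of parse_ud_features: join Name=Value pairs with '|'."""
    return "|".join(f"{name}={value}" for name, value in features.items())


parsed = parse_ud_features("Voice=Pass|Gender=Neut|Number=Sing")
print(parsed)
print(format_ud_features(parsed))
```

A validation step like this can catch typos such as a missing = before the string reaches the morph filter.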
5.3. m
Shorter variant of the morph filter. Each morphological feature is represented by a single character. Agreement indices are marked with “@” (equivalent to the agr parameter) and “#” (equivalent to the ref parameter). This filter supports only the most common features; to use other features, use the morph filter.
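List of supported features:
| Category | m code | morph feature | Meaning |
|---|---|---|---|
| Gender | | | masculine animate |
| | | | feminine |
| | N | Gender=Neut | neuter |
| | | | masculine inanimate |
| Number | P | Number=Plur | plural |
| | S | Number=Sing | singular |
| Case | 1 | Case=Nom | nominative |
| | 2 | Case=Gen | genitive |
| | | Case=Dat | dative |
| | 4 | Case=Acc | accusative |
| | | | vocative |
| | 6 | Case=Loc | locative |
| | 7 | Case=Ins | instrumental |
| Tense | | Tense=Pres | present |
| | r | Tense=Past | past |
| | | Tense=Fut | future |
| Voice | s | Voice=Pass | passive |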
Examples (the first four are equivalent; the fifth additionally prepends a preposition via the prep parameter):
{{'hezký'|m('@21')}} {{'muž'|m('P2#21')}}
→ hezkých mužů
{{'hezký'|m('@21')}} {{'muž'|m('P2', ref='21')}}
→ hezkých mužů
{{'hezký'|m('@21')}} {{'muž'|m('P2', ref=21)}}
→ hezkých mužů
{{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21)}}
→ hezkých mužů
{{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21, prep='z')}}
→ z hezkých mužů
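To make the relation between the m shorthand and the full morph argument concrete, here is a Python sketch that expands a shorthand string into a UD feature string plus agreement indices. It is an illustration under stated assumptions, not Genja's parser: expand_m is a hypothetical name, and the code table contains only the characters that appear in the examples of this chapter (N, S, P, 1, 2, 4, 6, 7, r, s), so other supported codes are rejected.

```python
import re

# Only codes attested in this chapter's examples; the real filter supports more.
ATTESTED_CODES = {
    "N": ("Gender", "Neut"),
    "S": ("Number", "Sing"),
    "P": ("Number", "Plur"),
    "1": ("Case", "Nom"),
    "2": ("Case", "Gen"),
    "4": ("Case", "Acc"),
    "6": ("Case", "Loc"),
    "7": ("Case", "Ins"),
    "r": ("Tense", "Past"),
    "s": ("Voice", "Pass"),
}
MARKERS = {"@": "agr", "#": "ref", "$": "poss"}
INDEX_RE = re.compile(r"([@#$])(\d+)")  # a marker followed by its numeric index


def expand_m(spec: str):
    """Return (UD feature string, {'agr'/'ref'/'poss': index}) for a shorthand spec."""
    indices = {MARKERS[marker]: int(number) for marker, number in INDEX_RE.findall(spec)}
    if 0 in indices.values():
        raise ValueError("agreement index cannot be 0")
    feature_codes = INDEX_RE.sub("", spec)  # what remains are one-character features
    features = []
    for code in feature_codes:
        if code not in ATTESTED_CODES:
            raise ValueError(f"code not covered by this sketch: {code!r}")
        name, value = ATTESTED_CODES[code]
        features.append(f"{name}={value}")
    return "|".join(features), indices


print(expand_m("P2#21"))
print(expand_m("rNS"))
print(expand_m("@2$1"))
```

Digits that directly follow @, # or $ are read as an index, while all other digits are read as case codes; this is how 'P2#21' yields genitive plural with reference index 21.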
5.5. Czech examples
5.5.1. Passive voice
{{'být'|m('r@1')}} {{'zvolit'|m('s@1')}} {{'muž'|m('S1', ref=1)}}
→ byl zvolen muž
{{'být'|m('r@2')}} {{'zvolit'|m('s@2')}} {{'žena'|m('S1', ref=2)}}
→ byla zvolena žena
{{'být'|m('r@3')}} {{'zvolit'|morph('Voice=Pass', agr=3)}} {{'muž'|m('S1', ref=3)}}
→ byl zvolen muž
{{'být'|m('r@4')}} {{'zvolit'|morph('Voice=Pass', agr=4)}} {{'žena'|m('S1', ref=4)}}
→ byla zvolena žena
5.5.2. Future tense
The future is expressed differently for imperfective (psát, vyhrávat) and perfective (napsat, vyhrát) verbs:
for imperfective verbs, use být in the future tense plus the infinitive
for perfective verbs, use the morphological present tense
Volby {{'vyhrát'|morph('Tense=Pres|Person=3',agr=2)}} {{'ODS'|m('1#2')}}
→ Volby vyhraje ODS
Volby {{'být'|morph('Tense=Fut|Person=3',agr=2)}} vyhrávat {{'ODS'|m('1#2')}}
→ Volby bude vyhrávat ODS
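When templates are generated programmatically, the aspect rule above can be captured in a small helper. The Python sketch below only builds the template text for a third-person future form; future_template is a hypothetical function, and the aspect of the verb must be supplied by the caller because the sketch has no lexicon.

```python
def future_template(infinitive: str, perfective: bool, agr: int) -> str:
    """Build the template snippet for a 3rd-person future form of a Czech verb."""
    if agr == 0:
        raise ValueError("agreement index cannot be 0")
    if perfective:
        # Perfective verbs: the morphological present tense has future meaning.
        return "{{'%s'|morph('Tense=Pres|Person=3',agr=%d)}}" % (infinitive, agr)
    # Imperfective verbs: future form of 'být' followed by the plain infinitive.
    return "{{'být'|morph('Tense=Fut|Person=3',agr=%d)}} %s" % (agr, infinitive)


print(future_template("vyhrát", perfective=True, agr=2))
print(future_template("vyhrávat", perfective=False, agr=2))
```

The two calls reproduce the verb parts of the two templates shown above.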
5.6. Czech derivation
Derivation is the process of creating a related word from a base word, for example by adding a suffix.
5.6.1. Female forms
We provide the femForm filter for generating female forms of nouns, e.g., senátor → senátorka.
Direct usage:
Paní Nováková je {{'senátor'|femForm(m='7')}}
→ Paní Nováková je senátorkou
Conditional usage:
{{'pan'|femForm(person.isFemale, m='S1')}} {{person|name(m='1')}} je {{'senátor'|femForm(person.isFemale, m='7')}}
→ paní Nováková je senátorkou / pan Novák je senátorem
In this example, generating the female variant depends on the isFemale property of the entity person.
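The conditional behaviour can be summarized in a few lines of Python. This is only a sketch of the selection logic: fem_form and the two-entry FEMALE_FORMS table are hypothetical, the pairs are taken from the examples above, and the inflection requested through the m parameter is left out.

```python
# Pairs taken from the examples in this section; a real lexicon would be far larger.
FEMALE_FORMS = {"senátor": "senátorka", "pan": "paní"}


def fem_form(lemma: str, is_female: bool = True) -> str:
    """Return the female form of `lemma` when `is_female` is true, else the base form."""
    if not is_female:
        return lemma
    return FEMALE_FORMS[lemma]


print(fem_form("senátor"))                # direct usage
print(fem_form("pan", is_female=False))   # conditional usage, male entity
```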
5.7. English pronouns
For English, we provide the pronoun filter, which generates pronouns that agree with a given expression, for example the name of a person for whom we want to generate he/she based on their gender.
Example:
{{cand|name(ref='1')}} claimed fifth place with {{...}} percent of people voting for {{'PRS_OBJ'|pronoun(agr='1')}}
Values of the input argument:
PRS_SUBJ - personal pronoun, nominative: he, she, …
PRS_OBJ - personal pronoun, accusative: him, her, …
POSS_DET - possessive determiner: his, her, …
POSS_PRON - possessive pronoun: his, hers, …
REFL - reflexive pronoun (accusative): himself, herself, …
EMPH - emphatic pronoun: himself, herself as in “He himself did it.”
Entities used for agreement have to be imported into GKB with gender information for the filter to work correctly.
Pronouns can also be generated without reference to another expression by providing the necessary morphological features:
{{'PRS_SUBJ'|pronoun(num='Sing', gen='Fem')}}
→ she
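The mapping from pronoun type and morphological features to a word form can be written out as a lookup table. The Python sketch below is an illustration only: english_pronoun is a hypothetical function, the 'Masc' and 'Plur' values are assumed by analogy with the UD values used elsewhere in this chapter, and neuter forms are omitted.

```python
SINGULAR = {
    "PRS_SUBJ": {"Masc": "he", "Fem": "she"},
    "PRS_OBJ": {"Masc": "him", "Fem": "her"},
    "POSS_DET": {"Masc": "his", "Fem": "her"},
    "POSS_PRON": {"Masc": "his", "Fem": "hers"},
    "REFL": {"Masc": "himself", "Fem": "herself"},
    "EMPH": {"Masc": "himself", "Fem": "herself"},
}
PLURAL = {
    "PRS_SUBJ": "they",
    "PRS_OBJ": "them",
    "POSS_DET": "their",
    "POSS_PRON": "theirs",
    "REFL": "themselves",
    "EMPH": "themselves",
}


def english_pronoun(kind: str, num: str = "Sing", gen: str = "Masc") -> str:
    """Look up an English pronoun by type, number and (for singular) gender."""
    if num == "Plur":
        return PLURAL[kind]  # plural forms do not vary by gender
    if num != "Sing":
        raise ValueError(f"unsupported number: {num!r}")
    return SINGULAR[kind][gen]


print(english_pronoun("PRS_SUBJ", num="Sing", gen="Fem"))
print(english_pronoun("PRS_OBJ", num="Sing", gen="Fem"))
```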