Morphology
Genja's `morph` filter (and its shorter variant `m`) allows inflection using Universal Dependencies (UD) features.
The most commonly used UD features are:
- Case: `Nom`, `Gen`, `Dat`, `Acc`, `Loc`, `Ins`
- Tense: `Pres`, `Fut`, `Past`
- Voice: `Act`, `Pass`
Example: Národní divadlo v Praze bylo postaveno v roce 1881.
- Specification using the full `morph` filter:
  `Národní divadlo v {{'Praha'|morph('Case=Loc')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} v {{'rok'|morph('Case=Loc')}} 1881.`
- Equivalent version with the shorter `m` filter:
  `Národní divadlo v {{'Praha'|m('6')}} {{'být'|m('rNS')}} {{'postavit'|m('sNS')}} v {{'rok'|m('6')}} 1881.`
Agreement and reference
Agreement ensures that syntactically linked items share the same grammatical features.
Use it when the grammatical features of one of the items vary and are not known in advance.
The word/phrase that needs to copy its grammatical features is denoted by the parameter `agr` (or `@` in the shorter variant).
The word/phrase the features are copied from is denoted by the parameter `ref` (or `#` in the shorter variant).
The indices of the agreement members have to be the same in order to match during the agreement postprocessing.
The index cannot be `0`.
`{{parties[0]|name|morph('Case=Nom', ref=1)}} {{'vyhrát'|morph('Tense=Past', agr=1)}} volby.`
→ Piráti vyhráli volby. / ODS vyhrála volby. / ANO vyhrálo volby.
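The two index rules above (matching indices, no index `0`) can be checked before rendering. The following is a minimal Python sketch of such a check, not part of Genja: the function name `check_agreement` and the regular expressions are assumptions made for illustration, and the sketch only looks at `ref`/`#` and `agr`/`@` inside `{{ ... }}` expressions.

```python
import re

# Body of each {{ ... }} expression in a template.
EXPRESSION = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)
# Source side: '#N' in a short code, or ref=N / ref='N' as a keyword argument.
REF = re.compile(r"#(\d+)|\bref\s*=\s*'?(\d+)")
# Target side: '@N' in a short code, or agr=N / agr='N' as a keyword argument.
AGR = re.compile(r"@(\d+)|\bagr\s*=\s*'?(\d+)")


def _indices(pattern, text):
    """Collect every index matched by `pattern` in `text` as a set of ints."""
    return {int(short or keyword) for short, keyword in pattern.findall(text)}


def check_agreement(template):
    """Return a list of problems with agreement indices (hypothetical helper)."""
    refs, agrs = set(), set()
    for body in EXPRESSION.findall(template):
        refs |= _indices(REF, body)
        agrs |= _indices(AGR, body)

    problems = []
    if 0 in refs | agrs:
        problems.append("index 0 is not allowed")
    # An agr index needs a ref with the same index to copy features from.
    for index in sorted(agrs - refs - {0}):
        problems.append(f"agr index {index} has no matching ref")
    return problems
```

For the template shown above, `check_agreement` returns an empty list; changing `agr=1` to `agr=2` would report that index 2 has no matching `ref`.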
Note that agreement cannot be combined with text-modifying filters (e.g., `capitalize`) at this time:
- unsupported: `{{'senátorka'|m('#1')|capitalize}} {{'vyhrát'|m('r@1')}}`; use `{{'Senátorka'|m('#1')}} {{'vyhrát'|m('r@1')}}`
- unsupported: `{{5|unitNum('senátorka', m='@1')|capitalize}} {{'vyhrát'|m('r@1')}}`
- unsupported: `{{2|ord_numeral_in_words|m('@1')|capitalize}} {{'senátorka'|m('#1')}}`
Possessor's agreement
Possessor's agreement is a special type of agreement between an object and its possessor.
It is different from the "standard" agreement as it changes different parts of the word.
In the templates, it is denoted by the parameter `poss` (or `$` in the shorter variant).
See examples:

`Přišel tam {{'předseda'|m('S1#1')}}. {{'Jeho'|m('@2$1')}} {{'syn'|m('S4#2')}} jsem neviděl.`
→ Přišel tam předseda. Jeho syna jsem neviděl.

`Přišel tam {{'předseda'|m('S1#3')}}. {{'Jeho'|m('@4$3')}} {{'dcera'|m('S4#4')}} jsem neviděl.`
→ Přišel tam předseda. Jeho dceru jsem neviděl.

`Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl.`
→ Přišla tam předsedkyně. Jejího syna jsem neviděl.

`Přišla tam {{'předsedkyně'|m('S1#7')}}. {{'Jeho'|m('@8$7')}} {{'dcera'|m('S4#8')}} jsem neviděl.`
→ Přišla tam předsedkyně. Její dceru jsem neviděl.
The form of the word jeho differs based on the gender and number of the possessor (jeho vs. její vs. jejich). Moreover, in the examples above, it also differs based on the gender of the possessed noun (jejího syna vs. její dceru); this is the "standard" agreement.

The keyword form

`Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6', poss=5)}} {{'syn'|m('S4#6')}} jsem neviděl.`

is equivalent to the short form

`Přišla tam {{'předsedkyně'|m('S1#5')}}. {{'Jeho'|m('@6$5')}} {{'syn'|m('S4#6')}} jsem neviděl.`
→ Přišla tam předsedkyně. Jejího syna jsem neviděl.
`morph` – long morphology filter (Cs/En)
Inflects a phrase according to the UD features passed as its argument.
All UD features are passed to the `morph` filter as a single argument.
The format is as follows:
- The name and value of a feature are separated by `=`, e.g., `Case=Loc`
- Multiple features are separated by `|`, e.g., `Voice=Pass|Gender=Neut|Number=Sing`
Example:
- `{{'go'|morph('Tense=Past')}}` → went (English)
- `{{'house'|morph('Number=Plur')}}` → houses (English)
- `{{'this'|capitalize|morph('Number=Plur')}} {{'goose'|morph('Number=Plur')}} {{'go'|morph('Tense=Past')}} to see {{'PRS_OBJ'|pronoun(num='Sing', gen='Fem')}}.` → These geese went to see her.
- `Národní divadlo {{'Praha'|morph('Case=Loc', prep='v')}} {{'být'|morph('Tense=Past|Gender=Neut|Number=Sing')}} {{'postavit'|morph('Voice=Pass|Gender=Neut|Number=Sing')}} {{'rok'|morph('Case=Loc', prep='v')}} 1881.` → Národní divadlo v Praze bylo postaveno v roce 1881.
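The `Name=Value|Name=Value` format of the feature argument is simple enough to build or validate programmatically, for instance when templates are generated by a script. Below is a small Python sketch under that assumption; the helper names `parse_ud_features` and `format_ud_features` are hypothetical and not part of Genja.

```python
def parse_ud_features(spec):
    """Split a string such as 'Case=Loc|Number=Sing' into a dict."""
    features = {}
    for item in spec.split("|"):
        name, sep, value = item.partition("=")
        # Every item must have the shape Name=Value.
        if not sep or not name or not value:
            raise ValueError(f"expected Name=Value, got {item!r}")
        features[name] = value
    return features


def format_ud_features(features):
    """Inverse of parse_ud_features: join a dict into 'Name=Value|...'."""
    return "|".join(f"{name}={value}" for name, value in features.items())


passive = parse_ud_features("Voice=Pass|Gender=Neut|Number=Sing")
print(passive["Voice"])             # Pass
print(format_ud_features(passive))  # Voice=Pass|Gender=Neut|Number=Sing
```

The sketch checks only the shape of the string; it does not verify that a feature name or value is one that the filter actually supports.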
`m` – short morphology filter
Shorter variant of the `morph` filter.
Each morphological feature is represented by a single character.
Agreement indices are denoted by `@` (equivalent of the `agr` parameter) and `#` (equivalent of the `ref` parameter).
This filter supports only the most common features.
To use other features, use the `morph` filter.
List of supported features:
| Category | `m` filter code | `morph` filter full form | Meaning |
|---|---|---|---|
| Gender | `M` | `Gender=Masc\|Animacy=Anim` | masculine animate |
| Gender | `F` | `Gender=Fem` | feminine |
| Gender | `N` | `Gender=Neut` | neuter |
| Gender | `I` | `Gender=Masc\|Animacy=Inan` | masculine inanimate |
| Number | `P` | `Number=Plur` | plural |
| Number | `S` | `Number=Sing` | singular |
| Case | `1` | `Case=Nom` | nominative |
| Case | `2` | `Case=Gen` | genitive |
| Case | `3` | `Case=Dat` | dative |
| Case | `4` | `Case=Acc` | accusative |
| Case | `5` | `Case=Voc` | vocative |
| Case | `6` | `Case=Loc` | locative |
| Case | `7` | `Case=Ins` | instrumental |
| Tense | `p` | `Tense=Pres` | present |
| Tense | `r` | `Tense=Past` | past |
| Tense | `f` | `Tense=Fut` | future |
| Voice | `s` | `Voice=Pass` | passive |
All the following examples are equivalent, except the last one adds the preposition `z` as well:
- `{{'hezký'|m('@21')}} {{'muž'|m('P2#21')}}` → hezkých mužů
- `{{'hezký'|m('@21')}} {{'muž'|m('P2', ref='21')}}` → hezkých mužů
- `{{'hezký'|m('@21')}} {{'muž'|m('P2', ref=21)}}` → hezkých mužů
- `{{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21)}}` → hezkých mužů
- `{{'hezký'|m('@21')}} {{'muž'|morph('Number=Plur|Case=Gen', ref=21, prep='z')}}` → z hezkých mužů
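The correspondence between the two notations can be written down as a lookup. The Python sketch below expands a short `m` code into the equivalent `morph` feature string plus keyword arguments, using only the codes from the table above and the `@`, `#`, and `$` markers. The function `expand_m` is a hypothetical helper, not Genja's implementation, and it assumes that all digits following a marker belong to the index (so `@21` is index 21, as in the examples above).

```python
import re

# Single-character codes from the table of supported features.
M_CODES = {
    "M": "Gender=Masc|Animacy=Anim", "F": "Gender=Fem", "N": "Gender=Neut",
    "I": "Gender=Masc|Animacy=Inan",
    "P": "Number=Plur", "S": "Number=Sing",
    "1": "Case=Nom", "2": "Case=Gen", "3": "Case=Dat", "4": "Case=Acc",
    "5": "Case=Voc", "6": "Case=Loc", "7": "Case=Ins",
    "p": "Tense=Pres", "r": "Tense=Past", "f": "Tense=Fut",
    "s": "Voice=Pass",
}
# Index markers and the keyword argument each one stands for.
MARKERS = {"@": "agr", "#": "ref", "$": "poss"}
INDEX = re.compile(r"([@#$])(\d+)")


def expand_m(code):
    """Return (feature_string, keyword_args) for a short code (assumed rules)."""
    kwargs = {MARKERS[mark]: int(num) for mark, num in INDEX.findall(code)}
    rest = INDEX.sub("", code)  # what remains are the feature characters
    unknown = [ch for ch in rest if ch not in M_CODES]
    if unknown:
        raise ValueError(f"unsupported m code(s): {unknown}")
    return "|".join(M_CODES[ch] for ch in rest), kwargs


print(expand_m("P2#21"))  # ('Number=Plur|Case=Gen', {'ref': 21})
print(expand_m("rNS"))    # ('Tense=Past|Gender=Neut|Number=Sing', {})
```

Features are emitted in the order in which their codes appear, which matches the pairs of equivalent examples shown in this document.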
Morphology-related arguments
In general, all filters that are related to morphology (support word declension) can use the following arguments:
- `m` - morphological features: either in the form of Universal Dependencies features, as described in the section about the `morph` filter, or in the form of short features, as described in the section about the `m` filter.
- `ref` - agreement reference id, described in the Agreement and reference section
- `agr` - agreement target id, described in the Agreement and reference section
- `poss` - possessor's agreement reference, described in the Possessor's agreement section
- `prep` - preposition, described in Vocalization of prepositions
These arguments can be used in the following filters: name, unit, unitNum. They are used as keyword arguments, meaning that the key=value syntax has to be used, but the order of the arguments does not matter.
Czech examples
Passive voice
- `{{'být'|m('r@1')}} {{'zvolit'|m('s@1')}} {{'muž'|m('S1', ref=1)}}` → byl zvolen muž
- `{{'být'|m('r@2')}} {{'zvolit'|m('s@2')}} {{'žena'|m('S1', ref=2)}}` → byla zvolena žena
- `{{'být'|m('r@3')}} {{'zvolit'|morph('Voice=Pass', agr=3)}} {{'muž'|m('S1', ref=3)}}` → byl zvolen muž
- `{{'být'|m('r@4')}} {{'zvolit'|morph('Voice=Pass', agr=4)}} {{'žena'|m('S1', ref=4)}}` → byla zvolena žena
Future tense
Future is expressed differently for imperfective (psát, vyhrávat) and perfective (napsat, vyhrát) verbs:
- for imperfective verbs, use the verb být in future tense plus infinitive:
  `Volby {{'být'|morph('Tense=Fut|Person=3',agr=2)}} vyhrávat {{'ODS'|m('1#2')}}.`
  → Volby bude vyhrávat ODS.
- for perfective verbs, use the morphologically present tense:
  `Volby {{'vyhrát'|morph('Tense=Pres|Person=3',agr=2)}} {{'ODS'|m('1#2')}}.`
  → Volby vyhraje ODS.
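When templates are assembled by a script, the choice between the two constructions can be wrapped in a helper. The Python sketch below is only an illustration: `future_template` is a hypothetical function, it covers the third person only, and the caller must say whether the verb is perfective, since the sketch has no way to look up verbal aspect.

```python
def future_template(verb, perfective, agr):
    """Build a Genja snippet for the third-person future of a Czech verb.

    `verb` is the infinitive, `agr` the agreement index of the subject.
    """
    if perfective:
        # Perfective verbs: the morphological present carries future meaning.
        return "{{'%s'|morph('Tense=Pres|Person=3', agr=%d)}}" % (verb, agr)
    # Imperfective verbs: future of 'být' followed by the plain infinitive.
    return "{{'být'|morph('Tense=Fut|Person=3', agr=%d)}} %s" % (agr, verb)


subject = "{{'ODS'|m('1#2')}}"
print("Volby", future_template("vyhrávat", False, 2), subject + ".")
print("Volby", future_template("vyhrát", True, 2), subject + ".")
```

The two printed lines correspond to the two templates in the list above, apart from a space after the comma in the argument list.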
Czech derivation
Derivation is a process of creating a related word from a base word, for example by adding a suffix.
Female forms
We provide the `femForm` filter for generating female forms of nouns, e.g., senátor → senátorka.
- Direct usage:
  `Paní Nováková je {{'senátor'|femForm(m='7')}}.`
  → Paní Nováková je senátorkou.
- Conditional usage:
  `{{'pan'|femForm(person.isFemale, m='S1')}} {{person|name(m='1')}} je {{'senátor'|femForm(person.isFemale, m='7')}}`
  → paní Nováková je senátorkou / pan Novák je senátorem

In this example, generating the female variant depends on the `isFemale` property of the entity `person`.
English pronouns
For English, we provide the `pronoun` filter, which generates pronouns in agreement with a given expression, for example the name of a person for whom we want to generate he/she based on their gender.
Example:
`{{cand|name(ref='1')}} claimed fifth place with {{...}} percent of people voting for {{'PRS_OBJ'|pronoun(agr='1')}}`
Values of the input argument:
- `PRS_SUBJ` - personal pronoun, nominative: he, she, ...
- `PRS_OBJ` - personal pronoun, accusative: him, her, ...
- `POSS_DET` - possessive determiner: his, her, ...
- `POSS_PRON` - possessive pronoun: his, hers, ...
- `REFL` - reflexive pronoun (accusative): himself, herself, ...
- `EMPH` - emphatic pronoun: himself, herself, as in He himself did it.
Entities used for agreement have to be imported into GKB with gender information for the filter to work correctly.
Pronouns can also be generated without reference to another expression by providing the necessary morphological features:
`{{'PRS_SUBJ'|pronoun(num='Sing', gen='Fem')}}` → she
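The list of input values can be read as a lookup from pronoun type and gender to a word form. The Python sketch below spells this out for the singular masculine and feminine forms named in the list; `lookup_pronoun` is a hypothetical helper for illustration and says nothing about how the `pronoun` filter is implemented. The value `'Masc'` is assumed by analogy with `gen='Fem'`.

```python
# Singular forms only, taken from the list of input values above.
PRONOUNS = {
    "PRS_SUBJ":  {"Masc": "he",      "Fem": "she"},
    "PRS_OBJ":   {"Masc": "him",     "Fem": "her"},
    "POSS_DET":  {"Masc": "his",     "Fem": "her"},
    "POSS_PRON": {"Masc": "his",     "Fem": "hers"},
    "REFL":      {"Masc": "himself", "Fem": "herself"},
    "EMPH":      {"Masc": "himself", "Fem": "herself"},
}


def lookup_pronoun(kind, gen):
    """Return the singular pronoun for a type and gender (illustrative only)."""
    return PRONOUNS[kind][gen]


print(lookup_pronoun("PRS_SUBJ", "Fem"))   # she
print(lookup_pronoun("POSS_PRON", "Fem"))  # hers
```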