Inari Listenmaa

17 December 2022

Generalising agreement, part III: Beyond verbs

This is the third post in the agreement series. Reading the previous posts (part I, part II) is recommended for the general context, but as long as you know GF and some basic linguistics, the examples in this post should be understandable.

In the first post, we learned that many languages mark the core argument(s) like subject, object and indirect object in the verb inflection.

In the second post, we learned that some languages also mark the addressee. Then I argued that register can also be thought of as agreement: the argument that is marked is the situation itself.

In this post, I’m going to generalise a bit further still. If an orthographical word in any category depends on other elements in the tree, I’ll call that agreement. The rest of this post is dedicated to a single example: the contraction of adpositions, pronouns and the negation particle in Somali. We deal with them in the Somali resource grammar in GF just as we deal with any other agreement.

Examples of Somali contractions

I’ll start by showing 3 examples from Saeed (1999) pages 39 and 110, and Nilsson (2022) page 127.

Single adposition

The following pair from Saeed (1999, p. 110) shows a single adposition u ‘for’, introducing the oblique argument Cali ‘Ali’.

[Figure SomaliContr0: glossed example with the oblique argument Cali and the adposition u]

In the English translation, we have a prepositional phrase for Ali. In Somali, the construction is discontinuous, with the NP Cali at the start of the sentence, and the adposition u later, before the verb.

Discontinuity in itself is not a hard problem in GF: thanks to the record syntax, linearisation rules can add strings to a record, and postpone building the phrase until more features are known. However, what makes Somali adpositions a challenging exercise for GF is their obligatory contractions with other parts of speech.
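
To make the point about records concrete, here is a minimal sketch of how a discontinuous phrase can be postponed with plain strings. The module name, the record types and the field names are all hypothetical and not taken from the Somali resource grammar; only Cali and u come from the example above.

```gf
resource DiscontSketch = {
  oper
    -- Hypothetical record types; the field names are assumptions.
    Adverbial  : Type = {np : Str ; adp : Str} ;
    VerbPhrase : Type = {obl : Str ; adp : Str ; verb : Str} ;

    -- Adding an adverbial only stores its two parts in separate fields.
    addAdv : Adverbial -> VerbPhrase -> VerbPhrase = \adv,vp -> {
      obl  = adv.np ++ vp.obl ;
      adp  = vp.adp ++ adv.adp ;
      verb = vp.verb
      } ;

    -- The phrase is built only at the end: obliques first,
    -- adpositions right before the verb.
    linVP : VerbPhrase -> Str = \vp -> vp.obl ++ vp.adp ++ vp.verb ;

    -- 'for Ali' as a discontinuous adverbial
    forAli : Adverbial = {np = "Cali" ; adp = "u"} ;
}
```

This string-based version is enough for discontinuity alone. It breaks down as soon as the adp field has to contract with its neighbours, which is why the rest of the post stores parameters instead of strings.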

Two adpositions

The next example (Saeed 1999, p. 39) shows two adpositions merging.

[Figure SomaliContr1: glossed example with sidan, Faarax and the contraction ugu]

Again, the English translation features two prepositional phrases as continuous constituents: ‘in this way’ and ‘for Farah’. In Somali, the noun phrases sidan ‘this way’ and Faarax ‘Farah’ appear at the beginning of the sentence, and the two adpositions u and u merge into one orthographic word, ugu, before the verb.

Adposition + object pronoun + negation particle

The final example (Nilsson 2022, p. 127) features an adposition ka, the object pronoun i and the negation particle ma, all in a single orthographic word igama.

[Figure SomaliContr34: glossed example with the contraction igama]

Note that the adposition ka on its own has meanings such as ‘from’ or ‘about’. In this expression, it is part of the compound adposition ka dul[1] ‘over’, where the ka component merges with the object pronoun and the negation at the beginning of the sentence, and dul appears before the infinitive verb.

Morphology

From the systematic listing in Saeed (1999) pp. 38–41, we count at least 80 distinct combinations, many of which feature nontrivial assimilations and metathesis, and are therefore best stored as full forms, not combined from smaller pieces.

Neither Saeed nor Nilsson covers all of the possible combinations, so it is unclear whether the missing combinations do not contract, or are just unlikely to occur together in a single verbal group. For instance, there is no mention of combining the first item on the list, an impersonal subject pronoun plus an object pronoun (4 distinct forms), with one or more adpositions. We assume this is because such sentences were not attested in a corpus.

However, later examples in Saeed (1999) reveal forms that are missing from the systematic listing on pages 38–41, such as impersonal subject pronoun + reflexive pronoun. So in addition to the explicitly listed 80 forms, I have included in the GF implementation all forms I could find elsewhere in the two sources, which brings the total number to 88 unique forms.

This is still not a complete list, but based on the source material, it should cover the bulk of the combinations appearing spontaneously. Later you’ll see the params that are the LHS of the inflection tables, and you can count that there are many more than 88 forms. Those not found in the sources are at the moment linearised as "???". This is very unsatisfactory, and if you have a more complete resource, get in touch with me!

The negation particle ma can also attach to these 88+ combinations, but its addition is purely concatenative, so I have chosen to attach the negation in a separate step, using run-time gluing with the BIND token.

GF parameters

These 88+ forms are incorporated into an inflection table in GF. The inflection table is built out of three parameter types: Adposition, AdpCombination (adpositions + impersonal subject pronoun) and AdpObjAgr (all object pronouns that contract).

Adposition

First, we introduce a parameter for single adpositions, called Adposition.

param
  Adposition = NoAdp | U | Ku | Ka | La ;

This parameter Adposition is present in lexical categories, such as Prep, Adv, V2 and others that may introduce a nominal argument with an adposition. The lack of adposition, NoAdp, is included as an explicit value, corresponding to a direct object.
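
As a rough sketch of what this looks like in a lexicon, the module below stores the adposition as a parameter in hypothetical lexical records. The type names, the c2 field and the mkPreposition and mkVerb2 opers are assumptions for illustration, not the actual definitions in the resource grammar.

```gf
resource AdpLexSketch = {
  param
    Adposition = NoAdp | U | Ku | Ka | La ;

  oper
    -- Hypothetical lexical types; the field names are assumptions.
    Preposition : Type = {s : Str ; c2 : Adposition} ;
    Verb2       : Type = {s : Str ; c2 : Adposition} ;

    mkPreposition : Str -> Adposition -> Preposition =
      \s,adp -> {s = s ; c2 = adp} ;

    -- u 'for': no string of its own, only the parameter
    for_Prep  : Preposition = mkPreposition [] U ;

    -- ka dul 'over': dul stays a string, ka is stored as a parameter
    over_Prep : Preposition = mkPreposition "dul" Ka ;

    -- a plain transitive verb takes a direct object, hence NoAdp
    mkVerb2 : Str -> Verb2 = \s -> {s = s ; c2 = NoAdp} ;
}
```

The point of the design is that the adposition never enters the s field as a string, so it stays available for contraction later.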

AdpCombination

The second parameter, called AdpCombination, lists the combinations of adpositions and impersonal subject pronouns.[2] They are represented as one parameter because both of these components can combine with object pronouns, so it was natural to cluster the parts of speech into object pronouns vs. everything else.

param
  AdpCombination =
    Single Adposition     -- 0-1 adpositions (0 = NoAdp)
  | ImpersSubj Adposition -- impersonal subject + 0-1 adpositions
  | Ugu | Uga | Ula       -- two adpositions (6 distinct forms)
  | Kaga | Kula | Kala ;

The combinations of two adpositions exhibit syncretism: there are only 6 distinct forms, because some of the forms cover many combinations. AdpCombination is present in phrasal categories where the full information of all objects and obliques and negation is still open. Referring to the RGL category hierarchy, Adposition appears in V2*, V3, N2, A2, Prep, Adv, IAdv, and AdpCombination in VP, VPSlash and ClSlash.

AdpObjAgr

The third parameter, AdpObjAgr, lists all possible object pronouns that combine with the AdpCombination values to produce the 88+ distinct forms.

param
  AdpObjAgr =     -- NB. separate from verbal agreement
    Sg1Obj
  | Sg2Obj
  | Pl1Obj Inclusion
  | Pl2Obj
  | ReflexiveObj
  | ZeroObj ;    -- i.e. the AdpCombination value on its own

The last value is a zero morpheme, used for third person objects: it adds no extra segment to the final contraction. Combining an AdpCombination with a zero object just returns the combination itself: ZeroObj + Ugu returns ugu, whereas Sg1Obj + Ugu returns iigu.

AdpObjAgr appears in the categories where a nominal object or oblique argument can be added: VP, VPSlash, ClSlash and Adv. Note also that the AdpObjAgr parameter is separate from the parameter that affects the form of the finite verb of the clause. The category NP records the full agreement Agr, and AdpObjAgr is computed from Agr whenever an NP becomes the object of a VP or an Adv.
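
The conversion from Agr to AdpObjAgr could look roughly like the sketch below. The Agr, Gender and Inclusion types here are stand-ins that I made up for the example, and the oper name agr2adpObj is hypothetical; only AdpObjAgr follows the definition above.

```gf
resource AgrSketch = {
  param
    Inclusion = Excl | Incl ;   -- assumed values
    Gender    = Masc | Fem ;    -- assumed values
    -- Assumed shape of the full agreement; the real Agr may differ.
    Agr = Sg1 | Sg2 | Sg3 Gender | Pl1 Inclusion | Pl2 | Pl3 ;

    AdpObjAgr =
        Sg1Obj
      | Sg2Obj
      | Pl1Obj Inclusion
      | Pl2Obj
      | ReflexiveObj
      | ZeroObj ;

  oper
    -- Third person objects map to the zero morpheme.
    -- ReflexiveObj is not produced here: this sketch assumes it is
    -- set separately when the object is a reflexive pronoun.
    agr2adpObj : Agr -> AdpObjAgr = \agr -> case agr of {
      Sg1   => Sg1Obj ;
      Sg2   => Sg2Obj ;
      Pl1 i => Pl1Obj i ;
      Pl2   => Pl2Obj ;
      Sg3 _ => ZeroObj ;
      Pl3   => ZeroObj
      } ;
}
```

Keeping the two types separate means that gender and number distinctions in the third person, which matter for the finite verb, do not multiply the rows of the contraction table.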

The full inflection table

The inflection table is of type AdpObjAgr => AdpCombination => Str and it produces all combinations, within the limits of what data I could find.

allContrs : AdpObjAgr => AdpCombination => Str = table {
  Sg1Obj => table {
    Single NoAdp  => "i" ;   -- just the object pronoun
    Single Ka     => "iga" ;
    ...
    ImpersSubj NoAdp => "lay" ; -- impers.subj + obj.pron
    ...
    Kala          => "igala" -- object pronoun + Ka + La
  } ;
  ...
  ZeroObj => table {
    Single NoAdp => "" ;   -- P3 direct object = ∅
    Single Ka    => "ka" ; -- just the adposition
    ...
    ImpersSubj NoAdp => "la" ; -- just impers. subj
    ImpersSubj U => "loo" ; -- impers. subj + adposition
    ...
    Kala         => "kala"
  }
} ;

In the table, we see values from an empty string (for ZeroObj + Single NoAdp, i.e. third person direct object), up to combinations of 3 elements, such as igala for Sg1Obj + Kala (which originally comes from Ka + La).
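
To see the lookup and the run-time gluing of negation in one place, here is a minimal sketch with both parameter types cut down to two values each. The module name and the opers posContr and negContr are hypothetical, and the treatment of the empty form under negation is my assumption; the four strings are the ones shown in the table above.

```gf
resource ContrSketch = open Prelude in {
  param
    -- Reduced versions of the post's parameter types.
    Adposition = NoAdp | Ka ;
    AdpObjAgr  = Sg1Obj | ZeroObj ;

  oper
    contrs : AdpObjAgr => Adposition => Str = table {
      Sg1Obj  => table { NoAdp => "i" ; Ka => "iga" } ;
      ZeroObj => table { NoAdp => []  ; Ka => "ka" }
      } ;

    -- Positive polarity: a plain table lookup.
    posContr : AdpObjAgr -> Adposition -> Str = \o,a -> contrs ! o ! a ;

    -- Negative polarity: glue ma onto the looked-up form with BIND.
    negContr : AdpObjAgr -> Adposition -> Str = \o,a -> case <o,a> of {
      <ZeroObj,NoAdp> => "ma" ;  -- nothing to glue onto (assumed handling)
      _               => contrs ! o ! a ++ BIND ++ "ma"
      } ;
}
```

After loading the file in the GF shell with i -retain, evaluating cc negContr Sg1Obj Ka should give iga and ma joined by BIND, which comes out as the single token igama.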

Combination of parameters

Below is a part of the rules on how to combine two adpositions into a combination parameter.

  oper
    combine : Adposition -> Adposition -> AdpCombination = \adp1,adp2 ->
      case <adp1,adp2> of {
        <U, U|Ku> => Ugu ;
        <U, Ka>   => Uga ;
        <U, La>   => Ula ;
        <Ku|Ka,
         Ku|Ka>   => Kaga ; -- 4 combinations, same form
        <Ku, La>  => Kula ;
        <Ka, La>  => Kala ;
        <NoAdp, p> => Single p
        -- remaining cases omitted here
      } ;

Thanks to the syncretism of the contractions, we get away with a much smaller parameter type, which means smaller inflection tables and better performance. Suppose that a VP inherits the adposition Ku from a V2 and the adposition Ka from an Adv. Instead of storing the pair <Ku,Ka>, the combine function merges them into a single AdpCombination value Kaga.

Updating the AdpCombination value

As explained previously, lexical categories have an Adposition field, which is elevated to AdpCombination in phrasal categories. Below are some examples of how the values are updated at the application of different functions.

The same happens in questions: QuestIAdv : IAdv -> Cl -> QCl updates the Cl’s AdpCombination with the IAdv’s Adposition, and stores the result into the newly constructed QCl.
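
The simplest update is the lifting step itself, where a lexical Adposition is wrapped in the Single constructor. The sketch below shows that step only; the record types, the c2 field and the oper name slashV2 are assumptions of mine, not the actual linearisation rules.

```gf
resource LiftSketch = {
  param
    Adposition = NoAdp | U | Ku | Ka | La ;
    AdpCombination =
        Single Adposition
      | ImpersSubj Adposition
      | Ugu | Uga | Ula
      | Kaga | Kula | Kala ;

  oper
    -- Hypothetical record types; the field names are assumptions.
    Verb2      : Type = {s : Str ; c2 : Adposition} ;
    VerbPhrase : Type = {s : Str ; c2 : AdpCombination} ;

    -- Lexical category to phrasal category: the verb's own adposition
    -- becomes a combination of exactly one adposition.
    slashV2 : Verb2 -> VerbPhrase = \v2 ->
      {s = v2.s ; c2 = Single v2.c2} ;
}
```

Functions that add a further adposition would then replace the Single value with one of the two-adposition values.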

Parameters’ journey in the syntax tree

Below is a GF parse tree for igama dul boodi kartid ‘you cannot jump over me’, where the concrete strings are shown in black Times New Roman, and the internal parameters in colourful Courier font with dotted lines from the lexical categories. The string igama is only linearised when all three contributing parameters are known: negation from Pol, adposition from Prep and object pronoun from Pron.

[Figure SomaliParseTree: parse tree for igama dul boodi kartid]

All these arguments propagate their inherent parameter up to the S level, at which point the inflection table allContrs is consulted. The VP didn’t contain any other adposition, so the full value of the AdpCombination became Single Ka. The Adv’s object NP was built out of i_Pron, which contributed the AdpObjAgr value Sg1Obj. Finally, the negation morpheme ma was glued onto the iga that came from the inflection table, and the full contraction was linearised into a single token igama.

Footnotes

  1. Contrast with the sentence “I can jump over it”, where the adposition doesn’t contract, because the object is 3rd person (i.e. zero morpheme) and the sentence is positive. [Figure SomaliContr5: glossed example]

  2. Due to gaps in data, the impersonal subject pronoun can’t combine with two adpositions: such a state is simply not represented in the param type. Ideally, the ImpersSubj constructor would combine with one of the 6 combinations of 2 adpositions (and an object pronoun at a later step). I do allow the combination impersonal subject + one adposition + object pronoun, but some of the forms have a dummy linearisation, because I couldn’t find a form.

tags: gf, linguistics