Did you ever wonder just how exciting the life of a GF grammarian is? Ever wished that someone wore a helmet camera all day while creating new inflection tables and wondering about the scope of reflexive pronouns? Then this post is just for you!
We describe a collaboration between the Grammarian, who is an expert in GF, and the Tester, a native Dutch speaker. We generated a set of test sentences for each function in Dutch, with English translations, and gave them to the native tester to read. The tester replied with a list of sentences that were wrong, along with suggestions for improvement. Communication between the grammarian and the tester was conducted via email.
We can classify the bugs in two dimensions: how easy it is to understand what the problem is, and how easy it is to fix the grammar. Ease of understanding is relative to the grammarian: a trained linguist who is fluent in Dutch would have an easy time pinpointing the error from the generated test cases, having both intuition and technical names for things. Ease of fixing is relative to the grammar: a given grammatical phenomenon can be implemented in a variety of ways, some of which are harder to understand. Say that relative clauses are implemented as terrible spaghetti code in a Dutch grammar, but very elegantly in a German grammar, and both have a bug that results in similar ungrammatical sentences. In such a case, the problem would be equally easy to understand in both languages, but fixing the bug would be easier in the German grammar.
In more concrete terms, easy to fix means just some local changes in a single function. In contrast, bugs that are hard to fix usually involve modifying several functions, restructuring the code or adding new parameters.
Perhaps the easiest bug to fix is a wrong lexical choice. Below is an example of feedback from the tester.
“opschakelen” is not the right translation of “switch on”. “aanzetten” or “aandoen” is better.
Other examples include wrong inflection or agreement, e.g. the polite second person pronoun should take the third person singular verb form, but was mistakenly taking second person forms.
Typically, bugs that are due to an almost complete implementation are easy to fix. For instance, particle verbs were missing the particle in future tense. Looking at the generated sentences, we could see the particle being in the right place in all other tenses, except for the future. There was a single function that constructed all the tenses, and looking at the source code, we could see the line ++ verb.particle in all other tenses except the future. In such a case, fixing the bug is fairly trivial.
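To make the shape of this bug concrete, here is a minimal Python sketch rather than the actual GF code. The names Verb and linearise are hypothetical, only three tenses in the first person singular are modelled, and the buggy flag exists purely to show the before and after side by side.

```python
from dataclasses import dataclass


@dataclass
class Verb:
    present: str     # finite present form, e.g. "zet"
    past: str        # finite past form, e.g. "zette"
    infinitive: str  # e.g. "zetten"
    particle: str    # e.g. "aan"; empty string for verbs without a particle


def linearise(verb: Verb, tense: str, buggy: bool = False) -> str:
    """Build the verb group for one tense (hypothetical, heavily simplified)."""
    if tense == "present":
        tokens = [verb.present, verb.particle]
    elif tense == "past":
        tokens = [verb.past, verb.particle]
    elif tense == "future":
        # The buggy variant forgets the particle in this one branch only.
        prefix = "" if buggy else verb.particle
        tokens = ["zal", prefix + verb.infinitive]
    else:
        raise ValueError(f"unknown tense: {tense}")
    # Drop empty tokens so verbs without a particle linearise cleanly.
    return " ".join(t for t in tokens if t)


aanzetten = Verb(present="zet", past="zette", infinitive="zetten", particle="aan")
```

Because every tense is built in one place, comparing the branches is enough to spot the odd one out, and the fix touches a single line.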
Dutch negation uses two strategies: the clausal negation particle niet ‘not’, and the noun phrase negation geen ‘no’. There are some subtleties in their usage; the following quote comes from the tester:
In any case, one can never say “eet niet wormen” (don’t eat worms, literally).
That should always be “eet geen wormen” (don’t eat worms, correctly translated)
After that, we sent three more sentences as follow-up, and got the following answer:
eet niet deze wormen - maybe OK?, feels strange
eet deze wormen niet - definitely OK
eet niet 5 wormen - definitely OK
In the brain of a computational linguist, those two pieces of feedback translated into “clauses with indefinite noun phrases (worms) use noun phrase negation, but if the noun phrase is quantified (these worms, five worms), then clausal negation is okay”. This also makes sense semantically (if you’re the kind of person who reads this blog): the negation of “eat 5 worms” is not “eat no worms”, because you can still eat 4 worms or 400.
In the grammar, this fix required changes to 13 categories. Not all categories had to be changed manually, but e.g. a change in NP changes all categories that depend on it, such as Comp and VP. Depending on how modularly the grammar is implemented, this means that some functions that operate on VP or Comp need to be changed too, when NP changes.
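The rule distilled from the tester's feedback can be sketched in a few lines of Python. This is only an illustration under assumptions: NounPhrase and negation_particle are hypothetical names, a bare noun stands in for "indefinite", and word order (where niet ends up in the clause) is deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    noun: str
    determiner: str = ""  # "" means a bare indefinite NP such as "wormen"


def negation_particle(np: NounPhrase) -> str:
    """Pick NP negation for bare indefinites, clausal negation otherwise."""
    return "geen" if np.determiner == "" else "niet"


worms = NounPhrase("wormen")
these_worms = NounPhrase("wormen", determiner="deze")
five_worms = NounPhrase("wormen", determiner="5")
```

In the real grammar, the information that the sketch keeps in determiner has to travel with the NP, which is why a change to NP ripples out to the categories built on top of it.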
The following two sentences were generated by the same function, which turns superlative adjectives and ordinal numbers into complements. The tester reported problems with both of them, as follows:
ik wil roodst worden –> ik wil het roodst worden (‘I want to become reddest’)
ik wil tiend worden –> ik wil tiende worden (‘I want to become tenth’)
We gave some more sentences to the tester, and got the following feedback:
ik wil linker worden –> ik wil de linker worden (‘I want to become left’)
ik wil 224e worden = OK (‘I want to become 224th’)
This small example gave at least three different ways of using these complements: for numerals, no article and -e at the end (tiende ‘tenth’); for superlative adjectives, the article het and no -e at the end of the adjective (het roodst ‘the reddest’), and for a class of adjectives like left and right [TODO: or is it only those?], the article de (de linker ‘the left one’).
In addition, the grammar has a separate construction for combining a numeral and a superlative adjective, e.g. “tenth best”. Since the tests were generated per function, the main tester didn’t read those sentences at the same time. After noticing the additional function, we asked another informant how to say Nth best, and got an alternative construction op (N-1) na best. Eventually, we got an answer that the strategy used for superlative adjectives, i.e. with the article het and no -e in the number, is acceptable.
Once it was clear to the grammarian how to proceed, fixing the bug was easy. There was already a parameter for the adjective form: attributive in two forms (strong and weak) and one predicative, and the different classes of adjectives corresponded to the abstract syntax of the GF RGL. Thus it was easy to modify the predicative form in a different way for different adjective types. Earlier, the predicative was just identical to the other attributive form, but now the AP type actually contains 3 different strings for superlatives: beste and best for attributive and het best for predicative. Adjectives in positive or comparative don’t get the article: good is just goede, goed and goed (not ✱het goed).
If there hadn’t been already a parameter for different adjective forms, or if the classes of words with different behaviours hadn’t corresponded to the RGL categories, then this bug would’ve required more work to fix.
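A rough Python model of the adjective fix, under stated assumptions: AP and mk_ap are hypothetical names, the two attributive forms are labelled by their ending because the post does not say which one is strong and which is weak, and the de linker class is left out since its extent is still an open question.

```python
from dataclasses import dataclass


@dataclass
class AP:
    attr_e: str     # attributive form ending in -e, e.g. "beste"
    attr_bare: str  # attributive form without -e, e.g. "best"
    pred: str       # predicative form, used as a complement


def mk_ap(with_e: str, bare: str, kind: str) -> AP:
    """Choose the predicative form depending on the kind of adjective."""
    if kind == "superlative":
        pred = f"het {bare}"   # het best, het roodst
    elif kind == "ordinal":
        pred = with_e          # tiende: no article, -e at the end
    else:
        pred = bare            # positive and comparative: no article
    return AP(attr_e=with_e, attr_bare=bare, pred=pred)


best = mk_ap("beste", "best", "superlative")
good = mk_ap("goede", "goed", "positive")
tenth = mk_ap("tiende", "tiend", "ordinal")
```

The point of the sketch is that the decision lives in one constructor per adjective class, so no function that consumes an AP needs to know about articles.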
As an example of a problem that was hard to understand and hard to fix, we take the agreement of a reflexive construction in conjunction with a verbal complement. (Just the description sounds hard to understand!)
More concretely, consider the following sentences:
These seem like reasonable choices: if the object of liking was I in the second example, it wouldn’t be myself but me: “I help you like me”.
In the GF grammar, these sentences are constructed in a series of steps:
PredVP (UsePron i_Pron)
       (ComplSlash
          (SlashV2V help_V2V
             (ReflVP
                (SlashV2a like_V2)
             )
          )
          (UsePron they_Pron)
       )
The innermost subtree is SlashV2a like_V2: the transitive verb like is converted into a VPSlash (i.e. VP\NP). Right after, the function ReflVP fills the NP slot and creates a VP. However, no concrete string for the object is yet chosen, because the reflexive object depends on the subject. The status of the VP is as follows at the stage ReflVP (SlashV2a like_V2):
s = "like" ;
ncomp = table { I => "myself" ; You => "yourself" ; … } ;
vcomp = [] ;
If we added a subject at that point, the subject would choose the appropriate agreement: I like myself, you like yourself. But instead, we add another slash-making construction, SlashV2V help_V2V. Now the new verb help_V2V, which takes both a direct object and a verbal complement, becomes the main verb. The old verb like becomes a verbal complement.
s = "help" ;
ncomp = table { I => "myself" ; You => "yourself"; … } ;
vcomp = "like" ;
The next stage is to add an NP complement they_Pron, using the function ComplSlash. The standard way for ComplSlash is to insert its NP argument into the ncomp table, taking the vcomp field along.
In the old buggy version, ComplSlash just concatenated the new object and the vcomp with the reflexive that was already in the ncomp table. But the scope of the reflexive was wrong: when adding an object to a VPSlash that has a verbal complement clause, the object should complete the verbal complement and pick the agreement. The reflexive is not in the scope of the subject.
This was the old behaviour:
s = "help" ;
ncomp = table { I => "them like myself" ; You => "them like yourself" ; …} ;
vcomp = [] ;
And this is after fixing the bug:
s = "help" ;
ncomp = table { _ => "them like themselves" } ;
vcomp = [] ;
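The two behaviours can be replayed with a small Python model of the records. This is a sketch and not the GF implementation: the functions below only mimic the RGL functions of the same name, dictionaries stand in for GF tables, and the agreement keys and pronoun tables are cut down to what the example needs.

```python
# Reflexive and object forms per agreement value (illustrative subset).
REFLEXIVE = {"I": "myself", "You": "yourself", "They": "themselves"}
OBJECT = {"They": "them"}


def refl_vp(verb):
    """ReflVP: the reflexive stays open until an agreement is chosen."""
    return {"s": verb, "ncomp": dict(REFLEXIVE), "vcomp": ""}


def slash_v2v(v2v, vp):
    """SlashV2V: the new verb becomes the head, the old one a complement."""
    return {"s": v2v, "ncomp": vp["ncomp"], "vcomp": vp["s"]}


def compl_slash_old(vps, agr):
    """Buggy: the reflexive still depends on the subject."""
    obj = OBJECT[agr]
    ncomp = {subj: f"{obj} {vps['vcomp']} {refl}"
             for subj, refl in vps["ncomp"].items()}
    return {"s": vps["s"], "ncomp": ncomp, "vcomp": ""}


def compl_slash_new(vps, agr):
    """Fixed: the object picks the agreement of the reflexive."""
    fixed = f"{OBJECT[agr]} {vps['vcomp']} {vps['ncomp'][agr]}"
    ncomp = {subj: fixed for subj in vps["ncomp"]}
    return {"s": vps["s"], "ncomp": ncomp, "vcomp": ""}


def pred_vp(subj_agr, subj, vp):
    """PredVP: the subject selects its row from the ncomp table."""
    return f"{subj} {vp['s']} {vp['ncomp'][subj_agr]}"


slash = slash_v2v("help", refl_vp("like"))
```

With these definitions, pred_vp("I", "I", compl_slash_old(slash, "They")) gives the old output and the same call with compl_slash_new gives the corrected one.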
But this turned out not to be a perfect solution. The exception to this is when the VPSlash is formed by VPSlashPrep : VP -> Prep -> VPSlash. With the changes to ComplSlash, we suddenly got sentences such as “[I like ourselves] without us”.
This would be a valid linearisation for a tree where [ourselves without us] is a constituent (such a tree is formed by another set of functions and was linearised correctly), but in this case, the order of the constructors is as follows:
ReflVP like
  s = "like" ;
  ncomp = table { I => "myself" ; You => "yourself" ; … } ;
VPSlashPrep (ReflVP like) without
  s = "like" ;
  ncomp = table { I => "myself" ; You => "yourself" ; … } ;
  prep = "without"
ComplSlash (VPSlashPrep (ReflVP like) without) we_Pron
  s = "like" ;
  ncomp = table { I => "myself" ; You => "yourself" ; … } ;
  adv = "without us"
The desired behaviour is to put the complement into an adverbial slot and keep the agreement in ncomp open to wait for the subject. But the following happened after our initial changes in ComplSlash:
s = "like" ;
ncomp = table { _ => "ourselves without us" } ;
To fix this problem, we added another parameter to the category VPSlash. All VPSlashes constructed by VPSlashPrep now have missingAdv set to True: this signals that the VPSlash is not missing a core argument, so it shouldn’t affect the agreement. With the new parameter, ComplSlash can now distinguish when to choose the agreement from the NP argument and when to leave it open for the subject.
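As a hedged illustration of how such a flag could steer ComplSlash, here is a self-contained Python model. It is not the GF source: dictionaries stand in for records and tables, missing_adv mirrors the missingAdv parameter, and the prep and adv fields as well as the pronoun tables are simplifications made up for the example.

```python
REFLEXIVE = {"I": "myself", "You": "yourself",
             "We": "ourselves", "They": "themselves"}
OBJECT = {"We": "us", "They": "them"}


def refl_vp(verb):
    """ReflVP: the reflexive stays open until an agreement is chosen."""
    return {"s": verb, "ncomp": dict(REFLEXIVE), "vcomp": "", "adv": ""}


def slash_v2v(v2v, vp):
    """A VPSlash that is missing a core argument (the object)."""
    return {"s": v2v, "ncomp": vp["ncomp"], "vcomp": vp["s"],
            "adv": "", "prep": "", "missing_adv": False}


def vp_slash_prep(vp, prep):
    """A VPSlash that is only missing the NP of an adverbial."""
    return {**vp, "prep": prep, "missing_adv": True}


def compl_slash(vps, agr):
    obj = OBJECT[agr]
    if vps["missing_adv"]:
        # Adverbial slot: leave ncomp open for the subject.
        return {"s": vps["s"], "ncomp": vps["ncomp"],
                "vcomp": vps["vcomp"], "adv": f"{vps['prep']} {obj}"}
    # Core argument: the object fixes the agreement of the reflexive.
    parts = [obj, vps["vcomp"], vps["ncomp"][agr]]
    fixed = " ".join(p for p in parts if p)
    return {"s": vps["s"], "ncomp": {a: fixed for a in vps["ncomp"]},
            "vcomp": "", "adv": ""}


def pred_vp(subj_agr, subj, vp):
    """The subject selects its row from ncomp; the adverbial comes last."""
    parts = [subj, vp["s"], vp["ncomp"][subj_agr], vp["adv"]]
    return " ".join(p for p in parts if p)


without_us = compl_slash(vp_slash_prep(refl_vp("like"), "without"), "We")
help_them = compl_slash(slash_v2v("help", refl_vp("like")), "They")
```

A single boolean is enough here because the only question ComplSlash has to answer is whether the incoming NP is a core argument or part of an adverbial.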
The same bug was found in several languages, and we fixed it for Dutch, English and German, using the same strategy.
Excited, aren’t you! You thought they had already fixed ComplSlash, but then came VPSlashPrep and revealed that all of this was part of its plan. While trying to help, they actually planted more bugs into the grammar. Will the hero grammarian make it in time before a critical application outputs a wrong translation and loses the customer a million SEK?
Anyway. After all these personal tales, you might want to know how many bugs were fixed and how many were of which kind. Let’s skip ease of understanding, since it’s subjective anyway, so here’s just how easy the bugs were to fix. I’ve probably forgotten a bunch of bugs here.
APs: placement and the adjective form. “een getrouwde worm” is correct, but a heavier AP should become a postmodifier, and in that case, the adjective form should be without the e at the end. PastPartAP
ReflVP with VPSlash
Tester 1 has read probably hundreds or thousands of sentences by now. We wonder if he’s still sane. (TODO: get some more accurate numbers).
Tester 2 has been used as a backup when Tester 1 was not available and Grammarian wanted quick feedback. She’s read roughly tens of sentences.
If you’ve read this far, maybe you appreciate the kind of sentences we torture our testers with. Not to be confused with useful life advice.