S <: T
We say:
Terms of type S
are more informative, for example:
{s : Str ; b : Bool} <: {s : Str}
We can use S
anywhere we could use T
. In other words, if we want
to do something with an s : Str
, we don’t mind that the extra field
b : Bool
, we can just ignore it.
See also the GF reference manual
entry on subtyping.
The most important thing for GF syntax is the record extension **
, as follows:
oper
-- Record extension for types
A : Type = {s : Str} ;
B : Type = A ** {s2 : Str} ; -- {s,s2 : Str}
-- …and for terms
a : A = {s = "a"} ;
b : B = a ** {s2 = "b"} ; -- {s = "a" ; s2 = "b}
oper
sA common use case is to reuse a single oper
for several lin
s, even
though they are of different type (but share a supertype!)
Consider VP, VPSlash and ClSlash: VP has a verb and 0-n complements,
depending on the verb subcategory (intransitive, transitive, sentence
complement, …). VPSlash may also have any number of existing
complements but crucially, it is missing a complement. ClSlash is a
clause missing a complement: just like VPSlash, but it also has a
subject.
Here are types for VP, VPSlash and ClSlash in an unnamed RGL language1:
oper
VPhrase : Type = {- details don't matter -} ;
VPhraseSlash : Type =
VPhrase ** {post : Postposition ;
missing : MissingArg} ;
ClauseSlash : Type =
VPhraseSlash ** {subj : NounPhrase} ;
lincat
VP = VPhrase ;
VPSlash = VPhraseSlash ;
ClSlash = ClauseSlash ;
And here is the oper insertAdv
, which inserts an adverb into a VPhrase
.
oper
insertAdv : VPhrase -> {s : Str} -> VPhrase = \vp,a ->
vp ** {adv = vp.adv ++ a.s} ;
We can reuse the same oper insertAdv
for all subtypes of VPhrase
:
lin
-- : VP -> Adv -> VP ; -- sleep here
AdvVP = insertAdv ;
-- : VPSlash -> Adv -> VPSlash ; -- use (it) here
AdvVPSlash vps adv = vps ** insertAdv vps adv ;
-- : ClSlash -> Adv -> ClSlash ; -- (whom) he sees today
AdvSlash cls adv = cls ** insertAdv cls adv ;
AdvVP
is straightforward: the lincats of VP
and Adv
match
exactly VPhrase
and {s : Str}
.
But AdvVPSlash and AdvSlash manipulate subtypes of VPhrase
–how
does that work?
insertAdv
takes a VPhrase
, so it can also take
any of its subtypes, such as VPSlash
and ClSlash
. Thus calling
insertAdv cls adv
is perfectly fine.
insertAdv
also returns a VPhrase
, but AdvVPSlash
and
AdvSlash
expect more informative types. Thus the following would
be wrong:
ΑdvSlash cls adv = insertAdv cls adv ;
Why? Because insertAdv adv cls
returns a VPhrase
, but AdvSlash
needs to return a ClSlash
. The subj
, post
and missing
fields were
already present in the argument cls
, and inserting an adverb doesn’t
change that, so we add cls **
before calling insertAdv
: this
extends the original cls
by the changes made by insertAdv
. We
could be more verbose and write the same thing like this:
AdvSlash cls adv =
insertAdv cls adv ** {subj = cls.subj ;
post = cls.post ;
missing = cls.missing} ;
Here’s an exercise: what subtype relations of Det ⁇ Idet
and NP ⁇
IP
make the following code work?
lin
-- : Det -> CN -> NP -- the songs
DetCN det cn = {- actual implementation here -} ;
-- : IDet -> CN -> IP ; -- which five songs
IdetCN = DetCN ;
(Obviously, Det = Idet
and NP = IP
, but what if they have to be different?)
We return to this question, and meanwhile I explain that by the way,
we can also reuse lin
s, not just opers!
lin
soper
insertComp : Comp -> VPhrase -> VPhrase = \c,vp ->
vp ** {comp = {- details don't matter -}} ;
lin
-- : VV -> VP -> VP ;
ComplVV vv vp =
let vcomp : Comp = mkComp vp ;
in insertComp vcomp (useV vv) ;
-- : VV -> VPSlash -> VPSlash ;
SlashVV vv vps = vps ** ComplVV vv vps ;
Look, we just reused ComplVV
for implementing SlashVV
! As before,
ComplVV
returns a less informative type than what VPSlash
needs,
so we need to prefix the result with vps **
to keep all the necessary
fields of the vps
.
But since ComplVV
is a lin
and not an oper
, it adds some hidden
VP
-specific junk to the record (namely, a lock field). However,
VP-with-a-lock-field is simply a subtype of VP-without-a-lock-field,
so SlashVV
doesn’t care! It can just happily use the relevant
fields from the result of ComplVV vv vps
.
Now, let’s talk about lock fields in more detail.
You may have seen GF compiler output the following messages:
<some expression>
lin Y
And if you’ve looked around in the resource grammars, you may have seen this kind of pattern in several places:
oper
Noun : Type = {s : NForm => Str ; g : Gender} ;
lincat
N = Noun ;
N2 = Noun ** {c2 : Preposition} ;
N3 = Noun ** {c2,c3 : Preposition} ;
So here we define N2
and N3
as subtypes of N
, except that we do
it in an awkward way, extending an oper Noun
instead of the lincat
N
.
When you write a lincat for some cat, you make it into some concrete
record type. You can use the same concrete record for many different
lincats, e.g. lincat Adv, Conj = {s : Str} ;
.
To you these are identical, but the GF compiler inserts a hidden field,
so actually Adv
is {s : Str ; lock_Adv : {}}
and Conj is {s : Str ; lock_Conj : {}}
.
That’s why we don’t write lincat N2 = N ** {c2 : Prep}
: it would
make N2
into the following type:
{s : NForm => Str ; g : Gender ; lock_N : {} ;
c2 : Prep ; lock_N2 : {}} ;
This is also the reason why you might have seen lin X
in the
paradigms, like in the following:
oper
mkNoun : Str -> Noun = {- details don't matter -} ;
mkN : Str -> N = \s -> lin N (mkNoun s) ;
mkN2 : Str -> Prep -> N2 =
\s,prep -> lin N2 (mkNoun s ** {c2 = prep}) ;
mkN3 : Str -> Prep -> N3 =
\s,p,r -> lin N3 (mkNoun s ** {c2 = p ; c3 = r}) ;
The oper mkNoun
creates a record with fields {s : NForm => Str ; g : Gender}
.
Since N
, N2
and N3
all share those two fields, it makes sense to
reuse the common parts. Then we make the resulting Noun ** {c2 :
Prep}
into an actual N2
by wrapping it in lin N2
.
mkNoun s ** {c2 = prep} -- : Noun ** {c2:Prep}
lin N2 (mkNoun s ** {c2 = prep}) -- : N2
Now we get back to the earlier question: which subtype relations hold between Det ⁇ Idet
and NP ⁇ IP
(other than =)?
lin
-- : Det -> CN -> NP -- the songs
DetCN det cn = {- actual implementation here -} ;
-- : IDet -> CN -> IP ; -- which five songs
IdetCN = DetCN ;
Idet
as an argument to DetCN
, which expects a Det
. Thus Idet
has to have more fields than Det
.NP
as the result of IdetCN
, which expects to return an IP
. Thus NP
has to have more fields than IP
.Here we have the answer: IDet <: Det
and NP <: IP
.
How about subtypes on abstract syntax? Here’s a presentation by Hans Leiß.
1: I believe in the pedagogical principle of “don’t tell people what you’re teaching them is complex, so they won’t be scared of it”.
tags: gf