Here’s a simple grammar that parses expressions such as the following:
abstract Numbers = {
  flags startcat = Command ;
  cat
    Command ;
  fun
    Call : Int -> Command ;
    Follow : String -> Int -> Command ;
    Measure : Float -> Command ;
}
With the following concrete syntax in English:
concrete NumbersEng of Numbers = open
  Prelude,
  SymbolicEng,
  SyntaxEng,
  ParadigmsEng in {
  lincat
    Command = Imp ;
  lin
    Call i =
      mkImp (mkV2 "call") (symb i) ;
    Follow s i =
      mkImp (mkV2 "follow") (dashNP (symb s) (symb i)) ;
    Measure f =
      mkImp (mkV2 "measure") (symb f) ;
  oper
    dashNP : NP -> NP -> NP = \np1,np2 -> np1 **
      {s = \\c => np1.s ! c ++ "-" ++ np2.s ! c} ;
}
Let’s break this down into smaller pieces.
Look at the abstract syntax. In cat, I have only defined Command, but in fun I’m using the types Int, String and Float. These types are built into GF, and you can see their specification here.
They are present in all grammars, even if you don’t write them. If you type pg in your GF shell, you’d see something like the following:
symb and friends
In order to become an object of a verb such as “call” or “measure”, the literal needs to be turned into a GF category like NP. The module Symbolic has an overloaded function symb, which does just that. Check out the link (and the new fancy version of RGL source browser!) to see the API.
In the grammar, I can simply use the function symb that turns an Int into an NP as follows:
lin
Call i = -- : NP
mkImp (mkV2 "call") (symb i) ;
Parsing works just as expected:
Numbers> p "follow ABC - 123"
Follow "ABC" 123
Numbers> p "call 911"
Call 911
(With the grammar I pasted, you get spaces in the registration plate; use BIND to get rid of them.)
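As a minimal sketch of that fix, here is a variant of dashNP with BIND (from Prelude, which the grammar already opens) on both sides of the dash. It is meant as a drop-in replacement for the oper in NumbersEng; I'm assuming the rest of the grammar stays unchanged.

```gf
-- Sketch only: replaces the dashNP oper in NumbersEng.
-- BIND marks the tokens on either side of it to be glued without a space.
oper
  dashNP : NP -> NP -> NP = \np1,np2 -> np1 **
    {s = \\c => np1.s ! c ++ BIND ++ "-" ++ BIND ++ np2.s ! c} ;
```

Depending on your GF version, you may need to pass the -bind option to the linearise command (l -bind) for the tokens to actually be joined in the shell output.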
You can also linearise any tree you want to (up to integer overflow):
Numbers> l Call 99999999999999
call 99999999999999
Numbers> l Call 9999999999999999999999999999999999
call 4003012203950112767
If you type gt in, for example, the Foods grammar, you get a long and repetitive list of trees like “this very very very Italian wine is very Italian”, and at some point it stops, depending on the -depth parameter.
Generating trees with literals works differently. For both random and exhaustive generation in the GF shell, you only get one of each:
Numbers> gt -cat=Int
999
Numbers> gt -cat=String
"Foo"
Numbers> gt -cat=Float
3.14
This happens in every category that uses literals, so here’s all that gt does for the whole grammar:
Numbers> gt
Call 999
Follow "Foo" 999
Measure 3.14
In the RGL synopsis, you see functions that take a Str as an argument. For instance, the Str -> Num instance of mkNum, which takes an argument like “35” (in digits) and returns “thirty-five” spelled out. Would it then be possible to transform a literal (Int, Float or String) using such an RGL function?
The answer is no.
What does mkNum : Str -> Num have to do to map “35” to “thirty-five”? It has to pattern-match the string “35”, otherwise it won’t know which numeral to output.
It’s perfectly fine to use an oper like mkNum : Str -> Num when you’re constructing a lexicon. By lexicon, I mean any functions (fun in the abstract syntax) that don’t take arguments. As long as the function creates some string from scratch, it can do anything it wants to that string: pattern match, split, and concatenate it with other strings using +.
But all of the functions in this grammar, like Call : Int -> Command, take an argument. When a function takes an argument, we can manipulate the argument (literal or not) in much more limited ways.
So in this case, you may only take the literal’s string value unchanged (you can’t even look at it!) and construct a few limited categories out of it. That’s what symb does.
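To make the contrast concrete, here is a small toy grammar that doesn’t use the RGL at all. Everything in it (Spell, SpellEng, Fixed, Echo and the oper spell) is a hypothetical name made up for this illustration, and spell is only a stand-in for the kind of pattern matching that an oper like mkNum has to do. The two modules would go in separate files, Spell.gf and SpellEng.gf.

```gf
-- Hypothetical toy grammar, for illustration only.
abstract Spell = {
  flags startcat = S ;
  cat S ;
  fun
    Fixed : S ;         -- no arguments: lexicon-like
    Echo  : Int -> S ;  -- takes a literal as an argument
}

concrete SpellEng of Spell = {
  lincat S = {s : Str} ;
  oper
    -- Pattern matches on its argument, so the argument
    -- has to be known when the grammar is compiled.
    spell : Str -> Str = \s -> case s of {
      "35" => "thirty-five" ;
      _    => s
    } ;
  lin
    -- Fine: "35" is a constant, so spell is evaluated at compile time.
    Fixed = {s = spell "35"} ;
    -- Fine: the literal's string is passed through untouched.
    Echo i = {s = i.s} ;
    -- Not expected to compile: i.s is only known at runtime,
    -- so spell cannot pattern match on it.
    -- Echo i = {s = spell i.s} ;
}
```

The commented-out line is the situation you would be in if you tried to apply mkNum to the argument of Call: the string only exists once someone parses or linearises a concrete tree.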
(For a longer explanation on when you can or cannot pattern match and glue tokens, see the gotchas post.)
Part II of this post describes how to use arbitrary strings and numbers in more advanced settings than just independent NPs (and whether that is a good idea to do).