This post will show how to use GF grammars from an external program, and how to manipulate GF trees from that program. The topic is introduced in Lesson 7 of the tutorial, and I will cover the parts that I find missing from it.
Not all things are missing from the tutorial per se, but they are explained in different places. In contrast, I aim to make this post as self-contained as possible. If you have already installed GF and the PGF library in the language of your choice, you can go directly to Embedding grammars.
It is also possible to embed GF grammars into C#, JavaScript/TypeScript and Java, but I will not cover them in this tutorial. This is a complex enough choose-your-adventure already.
The relevant bits for embedding GF grammars into other programming languages are explained in the following.
A GF grammar consists of an abstract syntax and a number of concrete syntaxes. They live in files that end in .gf. The executable is called gf, and you can use it in two ways: as an interactive shell (the first example below), or as a batch compiler with the flag -make (the second example).
Assuming that you have a file called LangEng.gf in the current directory, you can run the following commands.
$ gf LangEng.gf
Lang> p -cat=Cl "I am an apple"
PredVP (UsePron i_Pron) (UseComp (CompCN (UseN apple_N)))
Lang> help
<lots of helpful output>
$ gf -make LangEng.gf
linking ... OK
Writing Lang.pgf...
If you specify nothing other than -make, you will get a .pgf file. (More on PGF in the next section.) You can get other formats using the flag -f:
$ gf -make -f haskell LangEng.gf
linking ... OK
Writing Lang.pgf...
Writing Lang.hs...
Lang.hs is a Haskell version of the abstract syntax. We’ll get back to it in the section about transforming GF trees.
You can find other arguments to -f if you run gf -h.
A GF grammar is compiled into the Portable Grammar Format, PGF for short. If we want to use a GF grammar from another program, we usually need to compile it into PGF first. (Sometimes we can skip the PGF level: see the tutorial for compiling the grammar directly to JavaScript.)
PGF is also the name of a Haskell library, which contains functions for reading and manipulating PGF files.
What if you don’t want to use Haskell? Not a problem, there’s another library called PGF, written in C. The recommended usage of the C library is through bindings to another programming language. Currently there are 4 options: Python, Java, C# and Haskell again (in which case it is called PGF2, to distinguish it from the native Haskell library called PGF). In this post, we will use Python.
If you haven’t installed GF yet, you most likely want to do it now. The options are: a) download a binary, b) install from Hackage, and c) compile from source.
If you have Mac or Ubuntu, the easiest way is to download the binary. Python bindings are included in the binary.
2022 update: Python bindings in the binary don’t seem to work for M1 Macbooks. In addition, some Mac users report that the Python bindings from binary only work for Python 2, not Python 3. If this happens to you, and you want it to work for Python 3, you may need to follow the steps for manual installation.
If you don’t have Mac or Ubuntu, you can install GF in any way you like—see instructions on the download page—but it won’t include the Python bindings, so you will need to set them up separately.
For any system, the easiest way is to install GF and the libraries from Hackage: type cabal install gf.
If you don’t (want to) have a system-wide GHC, you have two options:
stack install.
Now follow the installation instructions for the PGF library in Python and Haskell. To follow this tutorial, it is enough to choose only one.
If you downloaded the Mac or Ubuntu binary of GF, then you should have the Python bindings already.
To test if you have the Python bindings, open a Python shell and type import pgf:
$ python
<information about your python>
Type "help", "copyright", "credits" or "license" for more information.
>>> import pgf
If the import succeeds, you have the library, and you can skip all the way to Embedding grammars.
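If you would rather check from a script than from an interactive shell, the same test can be sketched as follows. This is only a minimal sketch: the helper name has_module is made up for this example, and it only tells you whether the interpreter running the script can find a module called pgf.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if this Python interpreter can find a top-level module `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Print the interpreter path too: a common problem is that pgf was
    # installed for a different Python than the one you are running.
    print("Python:", sys.executable)
    print("pgf available:", has_module("pgf"))
```

Run it with the same Python that you intend to use for the rest of the tutorial.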
Since June 2020, the PGF library is in PyPI. So the installation step, if you’re not using the binary, is as follows:
pip install pgf
And that’s it. Make sure that you install it for the right Python: substitute pip3 install pgf, or whichever version you want to use.
Now you can continue to Embedding grammars.
If pip install pgf didn’t work for you, you can install it manually.
I have no experience of installing the libraries on Windows. If it doesn’t work, we would appreciate it if you opened an issue at GF’s GitHub describing your problem.
If you can’t or don’t want to download the binary for Mac or Ubuntu (e.g. not having Mac or Ubuntu is a pretty solid reason!), then you need to install the C runtime and the Python bindings separately.
You need to download the source code at gf-core. Then, go to the directory gf-core/src/runtime/c, where you find installation instructions. Follow them to install the C runtime.
After installing the C runtime at gf-core/src/runtime/c, go to gf-core/src/runtime/python and run the following commands:
$ python setup.py build
$ sudo python setup.py install
If you have several versions of Python on your computer, make sure that you use the right one when installing. If desired, replace python in the above commands with python3 or the path to your custom Python binary.
ImportError?
On Linux: if you get ImportError: libgu.so.0: cannot open shared object file: No such file or directory, open the file /etc/ld.so.conf in an editor:
$ sudo nano /etc/ld.so.conf
Add /usr/local/lib to the file, and then run the following:
$ sudo ldconfig
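To check whether the dynamic linker can now find the C runtime, you can ask Python’s standard library before trying the import. This is a rough sketch, not part of GF: the function name locate_shared_library is made up here, and find_library uses platform-specific heuristics, so a missing result is a hint rather than proof.

```python
from ctypes.util import find_library


def locate_shared_library(short_name):
    """Return the name or path of a shared library, or None if it is not found.

    `short_name` is the library name without the `lib` prefix and without
    the file extension, e.g. "gu" for libgu.so.0.
    """
    return find_library(short_name)


if __name__ == "__main__":
    found = locate_shared_library("gu")
    if found is None:
        print("libgu not found; check /etc/ld.so.conf and rerun ldconfig")
    else:
        print("libgu found as", found)
```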
Now open a Python shell (with the same Python that you used to build and install in the previous step) and type import pgf. If it works, you can now skip to Embedding grammars.
How about if you have followed the steps until here, and it doesn’t work? Please open an issue on GF’s GitHub describing your setup, what steps you took and the output.
If you want to use Haskell, the first question is which library to use, PGF or PGF2? Remember, PGF is a native Haskell library, and PGF2 is Haskell bindings to a C library.
For this post, I chose PGF for three reasons:
PGF2.
There are many good reasons to use the PGF2 library. However, that is a topic for another post (which you can read here).
If you installed GF from Hackage (typing cabal install gf) or compiled it from source, then you should have the PGF library.
Open your ghci and type import PGF. If it succeeds, you have successfully installed the PGF library, and you can skip all the way to Embedding grammars.
If you have installed GF by other means, or you don’t want to have a system-wide GHC, read further.
In my repository gf-embedded-grammars-tutorial, you’ll find a Stack file, which downloads all relevant libraries for you in an isolated location.
Clone the repository and skip to Embedding grammars, where one of your first tasks is to run stack build.
Let’s see. Are you sure you don’t want to use Stack?
▶ Yes, I know what Stack is and don’t want to use it
You have ended up in the current branch of this choose-your-adventure, if you followed one of the red routes in this flowchart.
Let me quote docs.haskellstack.org:
Stack is a cross-platform program for developing Haskell projects. It is aimed at Haskellers both new and experienced.
It features:
- Installing GHC automatically, in an isolated location.
- Installing packages needed for your project.
- Building your project.
- …
When you install a program with Stack, it will not affect your previous Haskell ecosystem in any way. The downside is that it will download another version of GHC and libraries, which takes more space, but this is a trade-off for guaranteeing reproducible builds. If you use Stack just once for this project, you can still keep using Cabal only for all other projects in the past and future. So unless disk space is absolutely critical, I recommend this option.
First, install Stack. This is a simple process involving running one command on your terminal. After that, the rest of the process involves one extra stack build and then typing stack run <program> instead of runghc <program>. If you want to run a ghci with the libraries that are installed locally, you need to write stack ghci instead of ghci. Those are pretty much the only noticeable differences that affect your daily life. If you want to learn more, you can read the documentation at docs.haskellstack.org.
If you decided to give Stack a try, you can skip to Embedding grammars. Otherwise, read on.
If you haven’t installed GF: GOTO install GF and choose either from Hackage or compile from source.
If your current GF is the downloaded binary, you could do one of the following:
- cabal install only the libraries; OR
- cabal install gf. This reinstalls the GF executable (hopefully to a different place than where your binary is) but doesn’t include the RGL, so you likely don’t need to change your $GF_LIB_PATH or any other environment variables you might have.
If something weird happens from having multiple GF installations, or anything else goes wrong, you can open an issue at GF’s GitHub.
From this point on, I assume that you have managed to install the PGF library for Python or Haskell. Again, you can choose to follow the instructions for Python or Haskell further in this post.
Clone my tutorial repository:
git clone https://github.com/inariksit/gf-embedded-grammars-tutorial.git
In the main directory (i.e. the one called gf-embedded-grammars-tutorial), run
gf -make resource/MiniLangEng.gf
This creates the PGF file MiniLang.pgf.
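Before opening the notebook, here is a rough idea of what using the compiled grammar from Python looks like. Treat it as a sketch under assumptions: the helper names first_parse and demo are made up for this example, the concrete syntax name MiniLangEng is taken from the Haskell session later in this post, and the calls readPGF, languages, parse and linearize are the ones I know from the Python bindings. The helper takes the concrete syntax as an argument, so it can be tried out even without the bindings.

```python
def first_parse(concr, sentence):
    """Return (tree as string, linearisation) for the first parse of `sentence`.

    `concr` is a concrete syntax object from the pgf bindings, or anything
    with the same `parse` and `linearize` methods. `parse` yields
    (probability, expression) pairs.
    """
    for _prob, expr in concr.parse(sentence):
        return str(expr), concr.linearize(expr)
    return None  # the iterator was empty


def demo(pgf_path="MiniLang.pgf", lang="MiniLangEng"):
    # Imported here so that first_parse stays usable without the bindings.
    import pgf

    grammar = pgf.readPGF(pgf_path)
    concr = grammar.languages[lang]
    # Note: an unparsable sentence makes the bindings raise an exception,
    # which this sketch does not handle.
    print(first_parse(concr, "I sleep"))
```

Call demo() in the directory where you created MiniLang.pgf. The notebook covers the same steps in more detail.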
The repository contains a Jupyter notebook named ReflTransfer.ipynb. It’s meant to be opened with Jupyter, but if you can’t install Jupyter on the machine you’re reading this on, you can still view the notebook on GitHub, where it just looks like a standard non-interactive tutorial. Here’s the link: ReflTransfer.ipynb on GitHub.
If you have a chance to use Jupyter on your own computer, I recommend it: you can modify the code and add new features. If you haven’t used Jupyter notebooks before, here’s a tutorial and installation instructions.
Once you have installed Jupyter, go to the main directory of my repository (i.e. the one called gf-embedded-grammars-tutorial) and run the command jupyter notebook.
$ jupyter notebook
This will open your browser with the following view. Click the file ReflTransfer.ipynb.
Now you can use the notebook as an interactive tutorial. You can modify anything in the cells or write new cells and run them.
The rest of this post will be about Haskell, so unless you want to learn how to embed grammars bilingually, you’re done now! Here’s the last jump in this post, to links.
The first steps are:
Clone my tutorial repository:
git clone https://github.com/inariksit/gf-embedded-grammars-tutorial.git
In the main directory (i.e. the one called gf-embedded-grammars-tutorial), run
gf -make -f haskell --haskell=lexical --lexical=A,N,V,V2 resource/MiniLangEng.gf
This creates the PGF file MiniLang.pgf and the Haskell file MiniLang.hs.
The extra flags --haskell=lexical --lexical=A,N,V,V2 make the generated Haskell file more compact, so that there is only a single constructor for all As, Ns, Vs and V2s. I’ll explain that feature more in the sequel. You can ignore it for now, as it won’t be relevant for the rest of this tutorial.
If you use Stack: run stack build, still in the main directory.
If you are not using Stack, you can ignore both the Stack and the Cabal files in the repository; just running runghc ReflTransfer.hs will be enough later on.
The PGF library is documented at Hackage. The standard GF tutorial lists some of the most important functions, if you want to see fewer things at once. I will explain the functions I use in my code, but once you’re familiar with the small examples from the GF tutorial and this tutorial, do browse the full API at Hackage!
Open a Haskell shell (e.g. ghci or stack ghci) and import the PGF library. Do this in the main directory of my repository, the same place where you compiled MiniLangEng.gf into PGF and Haskell files.
$ stack ghci
…
Ok, two modules loaded.
> import PGF
Now you can open MiniLang.pgf in the shell as follows.
PGF> gr <- readPGF "MiniLang.pgf"
PGF> :t gr
gr :: PGF
PGF> languages gr
[MiniLangEng]
PGF> categories gr
[A,AP,Adv,CN,Cl,Conj, … ,VP]
In order to parse or linearise, you need a concrete language as well. Here’s one way to do it:
PGF> let eng = head $ languages gr
PGF> parse gr eng (startCat gr) "I sleep"
[EApp (EFun UttS) (EApp (EApp (EFun UsePresCl) (EFun PPos)) (EApp (EApp (EFun PredVP) (EApp (EFun UsePron) (EFun i_Pron))) (EApp (EFun UseV) (EFun sleep_V))))]
If you want to see trees that look like they do in the GF shell, you need to use showExpr from the PGF library, like this:
PGF> let trees = parse gr eng (startCat gr) "I sleep"
PGF> map (showExpr []) trees
["UttS (UsePresCl PPos (PredVP (UsePron i_Pron) (UseV sleep_V)))"]
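To see how the two printouts relate, note that the raw form is a chain of one-argument applications, while the GF notation shows each function with all of its arguments at once. The following sketch illustrates that flattening with plain Python tuples. It is not how the PGF library implements showExpr, and the names fun, app, unapply and show_expr are made up for this example.

```python
def fun(name):
    """A function symbol, mirroring EFun."""
    return ("EFun", name)


def app(f, x):
    """A one-argument application, mirroring EApp."""
    return ("EApp", f, x)


def unapply(expr):
    """Split a chain of applications into (function name, list of arguments)."""
    args = []
    while expr[0] == "EApp":
        _, f, x = expr
        args.append(x)
        expr = f
    return expr[1], list(reversed(args))


def show_expr(expr, nested=False):
    """Print the expression in GF notation, parenthesising nested applications."""
    head, args = unapply(expr)
    if not args:
        return head
    text = " ".join([head] + [show_expr(a, nested=True) for a in args])
    return "(" + text + ")" if nested else text


# The parse result of "I sleep" from the session above, rebuilt by hand.
tree = app(fun("UttS"),
           app(app(fun("UsePresCl"), fun("PPos")),
               app(app(fun("PredVP"), app(fun("UsePron"), fun("i_Pron"))),
                   app(fun("UseV"), fun("sleep_V")))))

print(show_expr(tree))
# prints: UttS (UsePresCl PPos (PredVP (UsePron i_Pron) (UseV sleep_V)))
```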
Before we go further into the technologies, let us have a concrete goal to keep it interesting! We want to do semantics-preserving syntactic transfer.
I added a function called ReflV2 into the good old miniresource abstract syntax. The enhanced miniresource is found in the tutorial repository, MiniGrammar.
UseV : V -> VP ; -- sleep
ComplV2 : V2 -> NP -> VP ; -- love it
ReflV2 : V2 -> VP ; -- use itself
UseAP : AP -> VP ; -- be small
AdvVP : VP -> Adv -> VP ; -- sleep here
And the implementation is in MiniGrammarEng.
ReflV2 v2 = {
verb = verb2gverb v2 ;
compl = table {
Agr Sg Per1 => "myself" ;
Agr Sg Per2 => "yourself" ;
Agr Sg Per3 => "itself" ; -- simplification, no human referent
Agr Pl Per1 => "ourselves" ;
Agr Pl Per2 => "yourselves" ;
Agr Pl Per3 => "themselves" }
} ;
Now what do we want to do: transform all sentences with the same subject and object into reflexive, otherwise leave sentence untouched. Some examples:
A program that does this modification is our goal. So far we have involved just the PGF library, for parsing and linearising. But the current goal involves more complex manipulation of the trees, and here we are going to introduce another way of interacting with the GF trees.
Remember the flag -f haskell when we compiled the GF grammar? It produced a file called MiniLang.hs, and now we are going to use that.
So first of all, why do we do this? Our overall goal is to manipulate trees, and this is much simpler using pure Haskell datatypes than using the PGF functions. I’m not even going to bother showing how to do it in pure PGF expressions. Check out the Python tutorial if you like your code awkward and type-unsafe.
Our goal is to go from PGF expressions to the GF abstract syntax in Haskell, do our transformations operating on the Haskell datatypes, and then go back to the PGF expressions.
Here’s (a sample of) what the Haskell module looks like:
data GCl = GPredVP GNP GVP

data GNP =
   GDetCN GDet GCN
 | GMassNP GCN
 | GUsePN GPN
 | GUsePron GPron

data GVP =
   GAdvVP GVP GAdv
 | GComplV2 GV2 GNP
 | GReflV2 GV2
 | GUseAP GAP
 | GUseV GV
And so on. If you’re familiar with the miniresource, you should recognise all these constructors: it’s a Haskell translation of the abstract syntax of MiniLang! Where in GF you had fun PredVP : NP -> VP -> Cl, in Haskell you have data GCl = GPredVP GNP GVP.
In addition, we have a way to relate these Haskell datatypes to the PGF of the same grammar that produced it. Here’s a type class Gf:
class Gf a where
  gf :: a -> Expr
  fg :: Expr -> a
The type Expr comes from the PGF library. In place of a, we will put the Haskell data types just defined, such as GAdv or GCl.
To make a datatype an instance of the type class Gf, we need to provide a translation to and from the PGF datatype Expr. (I will skip the details here; you can see them in the generated MiniLang.hs file if you are interested.) Thanks to the functions gf and fg, we can now have a workflow as follows:
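- Parse a sentence into an Expr, using the PGF library.
- Convert the Expr into the Haskell datatype, using fg.
- Do the transformations on the Haskell datatype.
- Convert the result back into an Expr, using gf.
Now let’s get back to the goal! We want to transform sentences with the same subject and object into reflexive, for example I like me -> I like myself. The first sentence is parsed as follows in the miniresource: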
I have highlighted the two arguments that are identical, and will trigger the change into reflexive.
The identical argument in question is UsePron i_Pron: there are two instances of the tree. Their lowest common ancestor is PredVP, which constructs a Cl. So we need to design a function that does the following:
At this point, just go and see the actual Haskell program! The full code is found in ReflTransfer.hs, and the relevant parts are pasted below.
transfer :: Tree -> Tree
transfer = gf . toReflexive . fg

-- Wrapper for the more interesting transfer functions.
-- Need this because Utt is the start category;
-- the strings we input are parsed as Utt by default.
toReflexive :: GUtt -> GUtt
toReflexive (GUttNP x) = GUttNP x -- NPs can't be made reflexive
toReflexive (GUttS s) = GUttS (toReflexiveS s)

-- Another layer of wrapper
toReflexiveS :: GS -> GS
toReflexiveS s = case s of
  GCoordS conj s1 s2 -> GCoordS conj (toReflexiveS s1) (toReflexiveS s2)
  GUsePresCl pol cl -> GUsePresCl pol (toReflexiveCl cl)

-- The relevant transfer function is Cl -> Cl
toReflexiveCl :: GCl -> GCl
toReflexiveCl cl@(GPredVP subj vp) = -- PredVP is the only constructor for Cl in the mini resource
  case vp of
    GComplV2 v2 obj
      -> if show subj == show obj -- GNP has no Eq instance, need to compare string
           then GPredVP subj (GReflV2 v2)
           else cl
    _ -> cl -- Any other way to form VP: keep it unchanged
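For comparison, the core rule can be sketched in Python on trees written as nested tuples, without the PGF library at all. This is an illustration under assumptions, not the code from the notebook: the name to_reflexive and the tuple encoding are made up here, and the lexical item see_V2 is only assumed to be in the lexicon. Unlike the Haskell version, this sketch walks through every node until it reaches a PredVP, which gives the same result as long as clauses only occur where the Haskell wrappers look for them.

```python
def to_reflexive(tree):
    """Rewrite PredVP subj (ComplV2 v obj) into PredVP subj (ReflV2 v) if subj == obj.

    A tree is either a string (a function without arguments, e.g. "i_Pron")
    or a tuple (function name, argument, argument, ...).
    """
    if not isinstance(tree, tuple):
        return tree
    name, *args = tree
    if name == "PredVP":
        subj, vp = args
        is_compl = isinstance(vp, tuple) and vp[0] == "ComplV2"
        if is_compl and vp[2] == subj:
            return ("PredVP", subj, ("ReflV2", vp[1]))
        return tree  # any other verb phrase: keep the clause unchanged
    # Not a clause yet: keep the node and transform its arguments.
    return (name, *[to_reflexive(arg) for arg in args])


i_np = ("UsePron", "i_Pron")
i_see_me = ("UttS", ("UsePresCl", "PPos",
                     ("PredVP", i_np, ("ComplV2", "see_V2", i_np))))

print(to_reflexive(i_see_me))
```

Tuples compare structurally in Python, so the subject and object can be compared directly, where the Haskell code compares their string representations.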
Now you can run the program, either with runghc ReflTransfer.hs or with stack run ReflTransfer. This is what it should look like:
EITHER
$ runghc ReflTransfer.hs
OR
$ stack run ReflTransfer
Write your sentence here, I will transform it into reflexive, if it has the same subject and object.
Write quit to exit.
I see me
I see myself
a car
a car
John sleeps and the water drinks the water
John sleeps and the water drinks itself
quit
bye
If you are unable to repeat these steps, please let me know! This time it’s not a GF core issue, just an issue with my tutorial, so create an issue in the gf-embedded-grammars-tutorial repository or email me.
Do you like GF and Python? Wish you could write GF in a more familiar environment? Here’s a Jupyter kernel for writing GF grammars and running GF shell in a Jupyter notebook: github.com/kwarc/gf_kernel
This post has concentrated on PGF the library. But maybe you’re curious about PGF the file format and the compilation? You can get the gist of it in my blog, and a thorough description in Krasimir’s PhD thesis.
Follow-up post comparing the two Haskell options: PGF (native Haskell library) vs. PGF2 (Haskell bindings to the C library) and other advanced topics