Installing OpenCCG

OpenCCG is a Java library which can handle both parsing and generation. I’ve mostly used it for surface realization, that is, converting fairly syntactic meaning representations into natural language text, but you can also use it for parsing or for generation from higher-level semantic representations if you’d like.

This tutorial is intended to help you:

  1. Install the current version of OpenCCG

Installing OpenCCG

These guidelines partly follow the OpenCCG README. Unless otherwise noted, defer to the official documentation wherever it conflicts with this tutorial.

Prerequisites

  • Java 1.6 or later
  • Python 2.4 or later (I have not verified Python 3 compatibility)
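A quick way to see which of these prerequisites are already on your PATH is sketched below. The helper names have_cmd and check_prereqs are made up for this tutorial and are not part of OpenCCG. The sketch also assumes the interpreter is called python; on some systems it is only available as python2 or python3.

```shell
# have_cmd NAME
# Hypothetical helper: succeed if NAME can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_prereqs
# Hypothetical helper: print one line per prerequisite tool.
check_prereqs() {
    for tool in java javac python; do
        if have_cmd "$tool"; then
            echo "$tool: found at $(command -v "$tool")"
        else
            echo "$tool: NOT found"
        fi
    done
}

check_prereqs
```

For any tool that is found, java -version, javac -version and python --version report the installed version, which you can compare against the minimums listed above.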

Getting the code

If you want to keep your version up to date with the official releases, the easiest way to do that is to clone the repository from GitHub:

git clone https://github.com/OpenCCG/openccg.git

This will create a directory openccg/ in your current working directory and fill it with the code from the project.

Alternatively, you can download a compressed archive of OpenCCG on SourceForge.

At the time of writing, the SourceForge page provides version 0.9.5 while GitHub has version 0.9.6. The differences between these two, however, appear to be purely cosmetic (i.e. the version number was updated when files were prepared for release on GitHub).

Getting required libraries for the Git Repo

The one downside to using git is that it is not intended to track binary files, so if you chose to clone the git repo above, you will need to download the archive from SourceForge anyway in order to get the following libraries:

  • ant-contrib.jar
  • ant.jar
  • ant-junit4.jar
  • ant-junit.jar
  • ant-launcher.jar
  • javacc.jar
  • jdom.jar
  • jgrapht-jdk1.6.jar
  • jline.jar
  • jopt-simple.jar
  • junit-4.10.jar
  • openccg.jar
  • serializer.jar
  • trove.jar
  • xalan.jar
  • xercesImpl.jar
  • xml-apis.jar
  • xsltc.jar

The libraries are located in the lib/ directory of openccg-0.9.5.tgz. If you installed via git, copy these files to the lib/ directory of your OpenCCG git repository.

If you’re installing from SourceForge, they’re already in the right place, so you can ignore this step.
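For the git route, the copy step can be scripted roughly as follows. This is only a sketch: copy_openccg_libs is a name invented here, and because I am not assuming what the top-level folder inside the archive is called, the helper looks for a lib/ directory at most one level down from wherever the archive unpacks.

```shell
# copy_openccg_libs ARCHIVE REPO_DIR
# Hypothetical helper: unpack ARCHIVE into a temporary directory, then copy
# every .jar from its lib/ directory into REPO_DIR/lib/.
copy_openccg_libs() {
    local archive="$1" repo="$2" tmp libdir rc
    tmp=$(mktemp -d) || return 1
    tar -xzf "$archive" -C "$tmp" || { rm -rf "$tmp"; return 1; }
    # Look for lib/ at the archive root or one folder below it.
    libdir=$(find "$tmp" -maxdepth 2 -type d -name lib | head -n 1)
    if [ -z "$libdir" ]; then
        echo "no lib/ directory found in $archive" >&2
        rm -rf "$tmp"
        return 1
    fi
    mkdir -p "$repo/lib"
    cp "$libdir"/*.jar "$repo/lib/"
    rc=$?
    rm -rf "$tmp"   # clean up the unpacked copy
    return $rc
}
```

With the archive downloaded next to your clone, the call would look like copy_openccg_libs openccg-0.9.5.tgz openccg.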

Building and installing OpenCCG

In order to use OpenCCG, we first have to compile the code. This requires setting a few ‘environment variables’ on your computer so it knows where to find (1) the Java Development Kit (JDK) and (2) OpenCCG.

Then we can run the ccg-build script to build the project and make it executable.

Setting Environment Variables

We have to set at least two environment variables for OpenCCG to work: JAVA_HOME, which is where our JDK is installed, and OPENCCG_HOME for the location of our openccg/ directory.

In bash, you can find out where Java is installed by running which java. The following examples show the result on my computer:

$ which java
/usr/bin/java

This is often a symbolic link (a shortcut), though, so it helps to list the file with ls -l and follow each link in turn until we reach a real file rather than another shortcut:

$ ls -l /usr/bin/java
/etc/alternatives/java
$ ls -l /etc/alternatives/java
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-1.b12.fc25.x86_64/jre/bin/java

This is the location of java itself. Several of the OpenCCG scripts, however, call javac (the Java compiler) by appending bin/javac to the value of the JAVA_HOME environment variable, so JAVA_HOME should be the JDK’s top-level directory, without the trailing jre/bin/java. Here that means setting JAVA_HOME to /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-1.b12.fc25.x86_64 instead.
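The trimming described above can be sketched as a small shell function. The name java_home_from_binary is made up for illustration. The commented usage lines rely on readlink -f, which follows a whole chain of symbolic links in one step on GNU systems; it may not be available on older macOS versions, so treat that part as an assumption.

```shell
# java_home_from_binary PATH_TO_JAVA
# Hypothetical helper: strip the trailing /bin/java (and /jre, if present)
# from a fully resolved java path to get a candidate JAVA_HOME.
java_home_from_binary() {
    local dir="${1%/bin/java}"
    dir="${dir%/jre}"
    printf '%s\n' "$dir"
}

# Possible usage (assumes GNU readlink):
#   resolved=$(readlink -f "$(command -v java)")
#   java_home_from_binary "$resolved"
```

The result is only a candidate: it is still worth confirming that the directory really contains bin/javac before relying on it.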

If you run into errors, double check that you have javac installed in addition to java. Plain Java is often distributed as the Java Runtime Environment (JRE) to run compiled Java code from, e.g., .jar files, while javac is a part of the JDK.

On Linux and macOS you can use the bash shell in a terminal window to set JAVA_HOME:

$ export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-1.b12.fc25.x86_64

(Note that the $ is not a part of the command. It just represents the command prompt.)
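To confirm that the directory you exported contains the compiler and not only the runtime, a check along these lines may help. has_javac is a hypothetical helper name, not something OpenCCG provides.

```shell
# has_javac DIR
# Hypothetical helper: succeed only if DIR/bin/javac exists and is executable.
has_javac() {
    [ -x "$1/bin/javac" ]
}

# Report on the current JAVA_HOME (treated as empty if it is unset).
if has_javac "${JAVA_HOME:-}"; then
    echo "javac found under JAVA_HOME"
else
    echo "no javac under JAVA_HOME; check the path or install a JDK"
fi
```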

If you are using something other than bash on Mac or Linux, try searching for ‘setting environment variables in X’, where X is your operating system or shell. The OpenCCG README also includes some details for configuring these variables in Windows.

We should similarly set the environment variable OPENCCG_HOME to point to the openccg/ directory. If you installed OpenCCG to the home directory of a user named user, this would be

$ export OPENCCG_HOME=/home/user/openccg

on most (all?) Linux distributions.

Setting persistent environment variables

In Linux, you can modify the .bashrc file located in your user’s home directory to include the export statements mentioned above. Otherwise you will have to re-set them every time you want to run OpenCCG in a new session.
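If you would rather not edit .bashrc by hand, the export lines can be appended from the command line. The sketch below uses a made-up helper, persist_export, which skips the append when an identical line is already present. It assumes values without spaces, which holds for the paths used in this tutorial.

```shell
# persist_export NAME VALUE [FILE]
# Hypothetical helper: append "export NAME=VALUE" to FILE (default ~/.bashrc)
# unless exactly that line is already there.
persist_export() {
    local line="export $1=$2"
    local file="${3:-$HOME/.bashrc}"
    touch "$file"
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Possible usage with the paths from this tutorial:
#   persist_export OPENCCG_HOME /home/user/openccg
#   persist_export JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-1.b12.fc25.x86_64
```

The new lines take effect in terminals opened afterwards; in the current one, run source ~/.bashrc to load them.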

Building OpenCCG

In a terminal, navigate to your openccg/ directory and execute the following:

$ ./bin/ccg-build

The OpenCCG project uses Apache’s “Ant” build system to handle all the compiling (and then some).

If this worked, you should see a bunch of output followed by “BUILD SUCCESSFUL” and the total time the build took.
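Since the build prints a lot of output, it can be convenient to save it to a file and search for the success message afterwards. This is a sketch only: build_succeeded and the file name build.log are my own choices, not OpenCCG conventions.

```shell
# build_succeeded LOGFILE
# Hypothetical helper: succeed if LOGFILE contains Ant's success message.
build_succeeded() {
    grep -q "BUILD SUCCESSFUL" "$1"
}

# Possible usage, run from the openccg/ directory:
#   ./bin/ccg-build 2>&1 | tee build.log
#   if build_succeeded build.log; then echo "ready"; else echo "see build.log"; fi
```

If the message is missing, the first error in the saved log is usually the best place to start looking, for example a complaint about a missing javac or a missing .jar file.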

And that’s it! Now you should be able to use OpenCCG. If you’re not sure what to do next, keep an eye out for additional OpenCCG tutorials in the near future.

Research Fellow in Natural Language Generation

Dave Howcroft is a computational linguist working at Edinburgh Napier University.