CS 4120: Introduction to Compilers
Fall 2013


Using Antlr

Antlr4 is a sophisticated lexer and parser generator. You may build your lexer any way you like: by hand, with another lexer generator, or with some mix of the two, as long as the final product runs on the JVM. If you are looking for a good default implementation strategy, however, we recommend Antlr, and we will offer some support for it.

Getting started

You will find installation instructions for Antlr4 at the Antlr homepage and a small getting-started tutorial here.

For the purposes of this guide, you will use Antlr to generate a lexer. Antlr names its output after the grammar, so a lexer grammar named CubexLexer produces a file called CubexLexer.java containing the class CubexLexer. Your code will create a CubexLexer object and use it to tokenize the input. You can find documentation of the Lexer class here.

Antlr comes with an associated testing harness, called grun, which will be set up if you follow the getting-started directions linked above. The workflow below shows how to use it to test your lexer grammar; alternatively, you could write your own small Java-based testing harness.

Example Antlr input files

The following example files are a lexer grammar for the Xi programming language (from the 2011 version of this course) and a dummy parser grammar, which enables us to test with grun.

Example Workflow

The following is an example build/test workflow for the Xi language. The workflow should work exactly the same for X3.

Start by having Antlr translate your grammars into Java source, turning off generation of the parse-tree listener, which is not needed here:

  antlr4 XiLexer.g4 XiParser.g4 -no-listener

Then compile the Java source:

  javac Xi*.java

Now test the lexer on the file hello.xi:

  grun Xi file -tokens hello.xi

The output should be:

  [@0,0:2='use',<12>,1:0]
  [@1,4:5='io',<12>,1:4]
  [@2,6:6=';',<21>,1:6]
  [@3,11:14='main',<12>,3:0]
  [@4,15:15='(',<18>,3:4]
  [@5,16:16=')',<19>,3:5]
  [@6,18:18='{',<22>,3:7]
  [@7,25:31='println',<12>,4:4]
  [@8,32:32='(',<18>,4:11]
  [@9,33:46='"Hello World!"',<1>,4:12]
  [@10,47:47=')',<19>,4:26]
  [@11,48:48=';',<21>,4:27]
  [@12,51:51='}',<23>,5:0]
  [@13,54:53='<EOF>',<-1>,6:0]

To place the generated lexer and parser in a Java package, pass the -package your.package.name flag to the antlr4 command.
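Each token line that grun -tokens prints in the workflow above has the form [@index,start:stop='text',<type>,line:column]. Here index is the token's position in the token stream, start and stop are inclusive character offsets into the input, type is the number Antlr assigned to the matching token rule (-1 marks end of file), line is 1-based and column is 0-based. Tokens on a non-default channel also carry a ,channel=N field after the type. The following is a minimal sketch of a decoder for one such line, which may help if you want to compare lexer output in your own tests. The class name GrunTokenLine and its fields are hypothetical and are not part of Antlr.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Antlr): decodes one line of grun -tokens output. */
public class GrunTokenLine {
    // [@index,start:stop='text',<type>,line:column], with an optional
    // ",channel=N" after the type for tokens on a non-default channel.
    private static final Pattern FORMAT = Pattern.compile(
        "\\[@(\\d+),(\\d+):(\\d+)='(.*)',<(-?\\d+)>(?:,channel=(\\d+))?,(\\d+):(\\d+)\\]");

    final int index;    // position of the token in the token stream
    final int start;    // offset of the token's first character (inclusive)
    final int stop;     // offset of the token's last character (inclusive)
    final String text;  // matched text, as printed between the quotes
    final int type;     // token type number; -1 means end of file
    final int line;     // 1-based line number
    final int column;   // 0-based column within the line

    GrunTokenLine(String s) {
        Matcher m = FORMAT.matcher(s.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("not a token line: " + s);
        }
        index = Integer.parseInt(m.group(1));
        start = Integer.parseInt(m.group(2));
        stop = Integer.parseInt(m.group(3));
        text = m.group(4);
        type = Integer.parseInt(m.group(5));
        // group(6) is the optional channel number; this sketch ignores it
        line = Integer.parseInt(m.group(7));
        column = Integer.parseInt(m.group(8));
    }

    public static void main(String[] args) {
        GrunTokenLine t = new GrunTokenLine("  [@9,33:46='\"Hello World!\"',<1>,4:12]");
        int length = t.stop - t.start + 1;
        // prints: "Hello World!" type=1 at 4:12 length=14
        System.out.println(t.text + " type=" + t.type
            + " at " + t.line + ":" + t.column + " length=" + length);
    }
}
```

The type numbers are specific to your grammar. Antlr writes the mapping from token names to numbers into a .tokens file next to the generated Java source, so in the Xi example you can look up there which rule <12> or <1> refers to.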

Adding Antlr to the build path in Eclipse

If you're using Eclipse to build your project, you will need to tell it to link your project against the Antlr JAR file. To do so, right-click on your project in the project explorer, navigate down to the "Build path" item, and select the "Add External Archives" option.

Finally, in the file-chooser dialog, find and select the Antlr JAR file (its name is likely "antlr-4.1-complete.jar").