I'm doing mostly Ruby these days, but all those JavaCC grammars are still accessible and useful through the magic of JRuby. With JRuby I can write a Ruby script that loads up a JavaCC-generated parser and rips right through whatever data I need to manage. Here's how.
Let's use the Java grammar as an example. Download this Java grammar and build it into a jar file - basically, you'll do this:
Or, if you're in a hurry, just download grammar.jar which has all that stuff in it. Now, install JRuby if you don't already have it somewhere on your system - rvm is probably the best path for this, or you can just download the latest binary and untar it somewhere on your computer. Finally, add a little test source file to the current directory - call it Hello.java and put this code in it:
With that setup in place, the nicest way to explore JavaCC and JRuby is to use JRuby's interactive interpreter, jirb:
Great, we're in. Let's try to use that jar file:
Oops, need to import java as well:
Now we'll import JavaParser to save some typing:
OK, let's load up that Hello.java file. First we'll create a Java object that reads the file:
Now we parse the file contents!
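For reference, here is a minimal sketch that gathers the steps so far into one script instead of an interactive session. It is not the original session: it assumes grammar.jar and Hello.java sit in the current directory, that the generated parser class is JavaParser in the default package (hence Java::JavaParser; use the fully qualified name if your jar puts it in a package), and that the start production is CompilationUnit and returns the AST root. The file name parse_hello.rb and the helper parse_source are made up for illustration.

```ruby
# parse_hello.rb: a sketch, meant to be run with `jruby parse_hello.rb`.
# Assumed (not confirmed by this post): parser class JavaParser in the
# default package, start production CompilationUnit returning the AST root.

# Builds a parser around a java.io.Reader and runs the start production.
# The parser class is passed in so the helper is not tied to one grammar.
def parse_source(parser_class, reader)
  parser = parser_class.new(reader)  # JavaCC parsers have a Reader constructor
  parser.CompilationUnit()           # method name matches the grammar production
end

# Only touch Java when running under JRuby with the expected files present.
if RUBY_PLATFORM == "java" && File.exist?("grammar.jar") && File.exist?("Hello.java")
  require "java"
  require File.expand_path("grammar.jar")  # adds the jar to the classpath
  reader = java.io.FileReader.new("Hello.java")
  root = parse_source(Java::JavaParser, reader)
  puts root                                 # calls the node's toString
end
```

Passing the parser class as an argument keeps the helper reusable with any other JavaCC grammar whose start production you rename accordingly.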
We now have a reference to the root of the abstract syntax tree (AST) that the parser has built from that source file. What can we do with it? Well, we can show the name of the class:
We can also do something a little more interesting - we can use a Visitor implementation that comes with this grammar to visit each node of the AST and print out the source:
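The Visitor route depends on classes specific to this grammar. As a grammar-independent alternative (a different technique, not the Visitor described above), here is a sketch that walks the tree by hand using only the two methods every JJTree node provides through the Node interface, jjtGetNumChildren and jjtGetChild. It assumes the grammar was built with JJTree; the method name dump_tree is made up for illustration.

```ruby
# Recursively prints one line per AST node, indented two spaces per level.
# Works with any JJTree-generated node, since it relies only on the Node
# interface methods jjtGetNumChildren and jjtGetChild(index).
def dump_tree(node, depth = 0, out = $stdout)
  out.puts("#{'  ' * depth}#{node}")   # string interpolation calls toString under JRuby
  node.jjtGetNumChildren.times do |i|
    dump_tree(node.jjtGetChild(i), depth + 1, out)
  end
end
```

With the root node from the parse step in hand, calling dump_tree(root) prints the node names as an indented outline, which is a quick way to see what the parser built before writing a real Visitor.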
We can also just use the tokenizer (i.e., the JavaParserTokenManager) if that's all we need. Here's a little program to do that - put this in a file called tokenize.rb:
When you run it with jruby tokenize.rb you'll see this:
This gives us the ability to use any JavaCC grammar's tokenizer to lex any data file written in the language that grammar describes. Very handy!
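As a rough idea of what a tokenize.rb along these lines might look like, here is a minimal sketch. It is not the original program: it assumes grammar.jar and Hello.java are in the current directory and that the generated classes are JavaParserTokenManager and JavaCharStream in the default package (grammars that don't enable Java unicode escapes generate SimpleCharStream instead). The helper each_token is made up for illustration. It relies on two things JavaCC does for every grammar: the token manager exposes getNextToken, and the EOF token always has kind 0.

```ruby
# tokenize.rb: a sketch, meant to be run with `jruby tokenize.rb`.

EOF_KIND = 0  # JavaCC always assigns kind 0 to the EOF token

# Yields each token from a JavaCC token manager, stopping at EOF.
# Returns an Enumerator when called without a block.
def each_token(token_manager)
  return enum_for(:each_token, token_manager) unless block_given?
  loop do
    token = token_manager.getNextToken
    break if token.kind == EOF_KIND
    yield token
  end
end

# Only touch Java when running under JRuby with the expected files present.
if RUBY_PLATFORM == "java" && File.exist?("grammar.jar") && File.exist?("Hello.java")
  require "java"
  require File.expand_path("grammar.jar")
  # Assumed class names, default package.
  stream  = Java::JavaCharStream.new(java.io.FileReader.new("Hello.java"))
  manager = Java::JavaParserTokenManager.new(stream)
  # kind and image are public fields on Token, which JRuby exposes as readers.
  each_token(manager) { |t| puts "#{t.kind}\t#{t.image}" }
end
```

Because each_token only needs an object that responds to getNextToken, the same loop works unchanged with the token manager of any other JavaCC grammar.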
There's a lot more we can do with JRuby and JavaCC, but this should give you a feel for the possibilities. Enjoy!
Check out my JavaCC book for a much deeper dive into JavaCC, JJTree, and all that.