Programs that rewrite Ruby programs

20 Sep 2016

A while back I ported an application from Rails 4.2 to Rails 5; all went well, but adding those params keys to the controller tests was pretty tedious. Then last week I was upgrading an app from rspec 2 to 3 and was ready to embark on another long journey of syntax fixes when I came across Yuji Nakayama's transpec. What a great utility, just a wonderful timesaver. Leigh Halliday has written a fine overview of transpec so I won't repeat that here, but, if you're upgrading rspec, this will save you many hours.

More generally, there are a bunch of utilities that parse Ruby code (flay, reek, flog, etc.), but not as many that actually rewrite it. I say "rewrite" rather than "transpile" only because the primary gem I found that supports rewriting Ruby code is parser, and the class which wraps up the code modification process is Parser::Source::Rewriter. Thanks to the rubygems.org API endpoint for fetching reverse dependencies (e.g., curl https://rubygems.org/api/v1/gems/parser/reverse_dependencies.json) it's easy to see which gems depend on parser. Here are a couple of interesting ones.
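If you'd rather poke at that endpoint from Ruby than from curl, here's a minimal sketch using only the standard library. The helper names (reverse_dependencies_uri, parse_reverse_dependencies, reverse_dependencies) are made up for this post, and I'm assuming the endpoint answers with a plain JSON array of gem names.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the reverse-dependencies URL for a gem (same endpoint as the curl
# example; the helper name is hypothetical).
def reverse_dependencies_uri(gem_name)
  URI("https://rubygems.org/api/v1/gems/#{gem_name}/reverse_dependencies.json")
end

# Assumption: the response body is a JSON array of gem names.
# Parse it and sort the names alphabetically.
def parse_reverse_dependencies(json_body)
  JSON.parse(json_body).sort
end

# Fetch and parse in one step. This performs a real HTTP request, so it
# needs network access.
def reverse_dependencies(gem_name)
  parse_reverse_dependencies(Net::HTTP.get(reverse_dependencies_uri(gem_name)))
end

# Usage (hits rubygems.org):
#   reverse_dependencies('parser').grep(/spec/)
```

Keeping the parsing separate from the HTTP call makes it easy to try the parsing on a canned response without touching the network.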

First up is transpec. As I mentioned, rspec 3 introduced a bunch of syntax changes from rspec 2, and transpec rewrites your specs to make those changes. It's a pretty mechanical translation, so it seems ideal to send a machine to do the job. Here's an example diff:

-      Foo.should_receive(:find).with("42").and_return(x)
+      expect(Foo).to receive(:find).with("42").and_return(x)

To do this, transpec has a Converter class with a process method that accepts a Transpec::AST::Node instance that's the root of the abstract syntax tree of a particular spec. Here's an example of the AST for x=42 (the AST for an actual spec is much bigger, of course):

require 'parser/current'
require 'transpec/ast/builder'
# Wrap the source text in a buffer, then parse it using transpec's AST builder.
buffer = Parser::Source::Buffer.new("test").tap { |b| b.source = "x=42" }
Parser::CurrentRuby.new(Transpec::AST::Builder.new).parse(buffer)
# => s(:lvasgn, :x,
#      s(:int, 42))

This AST is generated by the parser gem by way of astrolabe, which is a library that (per its README) provides "an object-oriented AST extension for Parser". It has all sorts of handy features, like providing the ability to iterate over all nodes of a particular type in a subtree. The lvasgn and int symbols are AST node types; lvasgn is "local variable assignment" and int is an integer literal. There's a list of all possible types in Parser::Meta::NODE_TYPES, and astrolabe has meta-programming that defines useful methods from that list.
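To make the "iterate by node type" and "methods generated from a list of types" ideas concrete, here's a toy sketch in plain Ruby. It is not astrolabe's code: ToyNode, the s helper, and the short NODE_TYPES list are stand-ins I made up, loosely modeled on the features described above.

```ruby
# Toy stand-in for an AST node; NOT astrolabe's implementation.
class ToyNode
  # A handful of the type names the parser gem uses. The real, much longer
  # list lives in Parser::Meta::NODE_TYPES.
  NODE_TYPES = %i[begin lvasgn lvar int send].freeze

  attr_reader :type, :children

  def initialize(type, *children)
    @type = type
    @children = children
  end

  # Meta-programming: generate one predicate per node type
  # (begin_type?, lvasgn_type?, int_type?, ...).
  NODE_TYPES.each do |node_type|
    define_method("#{node_type}_type?") { type == node_type }
  end

  # Yield this node and every descendant node, depth first. If types are
  # given, only nodes of those types are yielded.
  def each_node(*types, &block)
    return enum_for(:each_node, *types) unless block

    yield self if types.empty? || types.include?(type)
    children.each do |child|
      child.each_node(*types, &block) if child.is_a?(ToyNode)
    end
    self
  end
end

# Hypothetical shorthand so trees read like the s(...) output shown earlier.
def s(type, *children)
  ToyNode.new(type, *children)
end

# Hand-built tree for: x = 42; y = x + 1
tree = s(:begin,
         s(:lvasgn, :x, s(:int, 42)),
         s(:lvasgn, :y, s(:send, s(:lvar, :x), :+, s(:int, 1))))

int_values = tree.each_node(:int).map { |node| node.children.first }
assigned   = tree.each_node(:lvasgn).map { |node| node.children.first }
```

Here int_values collects the integer literals and assigned collects the names of the local variables being assigned, which is the kind of query a converter needs before it decides what to rewrite.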

transpec manages the code transformations via a collection of Transpec::Syntax subclasses, each one of which handles a specific type of change. For example the migration of should_receive into expect is handled by Transpec::Syntax::ShouldReceive. Each class analyzes a node, and if applicable, calls a method like insert_after_multi (to insert code after a particular location) on the source rewriter. An interesting thing about parser's source rewriter: calling a method like replace doesn't just slap in a new blob of code. Instead, it appends a Rewriter::Action to a queue for later processing. This lets the rewriter detect two changes that would clobber each other and raise an exception, while leaving the source in a good state.
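The queue-then-apply idea is easy to sketch in plain Ruby. What follows is a simplified illustration, not the parser gem's code: ToyRewriter, its Action struct, and its ClobberingError are names invented here, and ranges are assumed to be exclusive character offsets (like 0...3).

```ruby
# Toy rewriter that queues edits and applies them later.
# A simplified illustration; NOT the parser gem's implementation.
class ToyRewriter
  class ClobberingError < StandardError; end

  Action = Struct.new(:range, :replacement)

  def initialize(source)
    @source = source
    @queue = []
  end

  # Queue a replacement of the characters in `range` with `text`.
  def replace(range, text)
    enqueue(Action.new(range, text))
  end

  # Queue an insertion right after `range` (an empty-range replacement).
  def insert_after(range, text)
    enqueue(Action.new(range.end...range.end, text))
  end

  # Apply queued actions from the end of the source backwards, so the
  # offsets of actions that have not been applied yet stay valid.
  def process
    result = @source.dup
    ordered = @queue.sort_by { |action| [action.range.begin, action.range.end] }
    ordered.reverse_each { |action| result[action.range] = action.replacement }
    result
  end

  private

  # Refuse an action that overlaps one already queued. Nothing has been
  # modified at this point, so the original source is still intact.
  def enqueue(action)
    clash = @queue.find { |queued| overlap?(queued.range, action.range) }
    if clash
      raise ClobberingError, "#{action.range} overlaps queued edit #{clash.range}"
    end

    @queue << action
    self
  end

  def overlap?(a, b)
    a.begin < b.end && b.begin < a.end
  end
end

source = 'Foo.should_receive(:find)'
dot    = source.index('.')
paren  = source.index('(')

rewriter = ToyRewriter.new(source)
rewriter.replace(0...dot, 'expect(Foo)')      # the receiver
rewriter.replace(dot...paren, '.to receive')  # the .should_receive call
converted = rewriter.process

# A third edit that overlaps the first one is rejected when it is queued.
clobbered = begin
  rewriter.replace(1...5, 'Bar')
  false
rescue ToyRewriter::ClobberingError
  true
end
```

Because nothing touches the source until process runs, a conflicting edit can be refused up front and the caller is never left with half-rewritten code.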

At the beginning of this post I mentioned upgrading an app to Rails 5 and fixing up all the tests since, as Abhishek Jain explains nicely here, Rails 5 uses kwargs for controller tests. This particular project is using test-unit, but if I had been using rspec I could have used rails5-spec-converter, which does a straightforward conversion:

-     post :create, {}
+     post :create, params: {}

rails5-spec-converter is similar to transpec in that it's using parser's Parser::Source::Rewriter along with astrolabe. Internally there's a Rails5::SpecConverter::TextTransformer#transform method that spelunks around the AST and eventually calls Parser::Source::Rewriter#replace to effect the transformation if necessary. Scanning TextTransformer gives you a feel for how delicate a source code transformation is, especially if you need to preserve indentation, newlines, and so forth.

I couldn't find a similar project for test-unit, but 1) maybe I missed it and 2) seems like a rails5-testunit-converter would be doable or 3) maybe rails5-spec-converter could be generalized. By someone.

No post on Ruby source rewriting - or Ruby static analysis in general - would be complete without a mention of Bozhidar Batsov's rubocop. Rubocop is a well-known static analysis tool that can locate all sorts of problems with a codebase. More to the point for this post, it can also fix certain classes of issues. For example, it can replace old-style Ruby hash formatting like :foo => 42 with the modern style foo: 42. The rubocop base class RuboCop::Cop::Cop includes an AutocorrectLogic module which provides a support_autocorrect? method, which in turn checks whether the cop defines an autocorrect method. So, UselessArraySplat supports autocorrection because it just involves removing a splat operator, but UselessAssignment does not, because in the case of an unnecessary assignment a human should determine if the entire statement can be removed. Internally rubocop has a nice way of wrapping this up; there's a RuboCop::Cop::Corrector with a bunch of methods like insert_before, replace, and remove_leading, most of which delegate to similarly-named methods on the Parser::Source::Rewriter.
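Here's a toy model of that "does this cop know how to fix things?" check. It is not rubocop's source: the Toy* names are invented for this post, and I'm assuming the check boils down to asking whether the cop responds to an autocorrect method.

```ruby
# Toy model of the autocorrect check; NOT rubocop's source code.
module ToyAutocorrectLogic
  # Assumption: a cop supports autocorrection exactly when it defines an
  # autocorrect method (public or private).
  def support_autocorrect?
    respond_to?(:autocorrect, true)
  end
end

class ToyCop
  include ToyAutocorrectLogic
end

# Removing a stray splat is mechanical, so this cop can fix it itself.
class ToySplatCop < ToyCop
  private

  # Toy behaviour: drop a splat that sits directly in front of an array
  # literal and return the corrected source text.
  def autocorrect(source)
    source.sub('*[', '[')
  end
end

# Whether a useless assignment can be deleted is a human's call, so this
# cop deliberately defines no autocorrect method.
class ToyAssignmentCop < ToyCop
end

cops    = [ToySplatCop.new, ToyAssignmentCop.new]
fixable = cops.select(&:support_autocorrect?).map(&:class)
```

With this shape, adding autocorrection to a cop is just a matter of defining the one method; the reporting side doesn't need to know which cops can fix things and which can only complain.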

As I've been poking around these utilities I've been wondering what other tools could benefit from rewriting as opposed to just reading and reporting on Ruby source code. For example, a while back I wrote a small Rails cleanup utility, filter_decrufter; it reports issues like "EmployeesController after_filter 'set_name' has an :only constraint with a non-existent action name 'frobnicate'". But it could pretty easily make that change instead of just reporting it. Lots of possibilities!