Finding copy/pasted code in a Rails app

03 Dec 2007

Ryan Owens and I were looking at a Rails app a few days ago; we knew that some view code had been copied and pasted, but we weren't sure exactly where it was. When I was doing Java full time I worked on CPD, a copy/paste detector; this handy utility supports several languages including, thanks to Zev Blut, Ruby!

So we went to the CPD web page, fired it up via the Java Web Start link, had it scan app/controllers/ as a trial run, and lo and behold, it found a couple of duplicate methods! Then we twiddled the settings to check the .rhtml files - we selected the "by extension" setting from the "language" dropdown, and put "rhtml" in the "extension" text field. We kicked it off and what do you know - it found the exact duplication that we were looking for. A few minutes later we had cleaned all that up and checked it in. Good times.

The Ruby support isn't as good as it could be; it'd be nicer if we had a real JavaCC tokenizer for Ruby, and it'd also be nice if we had one for ERB. Right now we just skip spaces and comments, and pretty much every other character is treated as a token of its own. So someone should get my JavaCC book and do this work. If I get motivated, perhaps that someone will be me...
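
To make the "every character is a token" idea concrete, here's a rough sketch in Python. This is not CPD's actual code or its matching algorithm; the names `naive_tokens`, `find_duplicates`, and `min_tokens` are made up for illustration, and I'm assuming that a `#` always starts a comment (which is wrong inside string literals, and ignores `=begin`/`=end` blocks).

```python
def naive_tokens(source):
    """Yield (line_number, char) for each non-space character outside '#' comments.

    Assumption: '#' always starts a comment that runs to the end of the line.
    A real grammar-based tokenizer would know about strings, =begin/=end, etc.
    """
    for line_number, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # drop the comment, if there is one
        for char in code:
            if not char.isspace():
                yield line_number, char


def find_duplicates(source, min_tokens=20):
    """Return sorted (first_line, later_line) pairs where a run of
    min_tokens consecutive tokens shows up more than once."""
    tokens = list(naive_tokens(source))
    seen = {}      # window text -> line where that window first started
    pairs = set()
    for i in range(len(tokens) - min_tokens + 1):
        window = "".join(char for _, char in tokens[i:i + min_tokens])
        start_line = tokens[i][0]
        first_line = seen.setdefault(window, start_line)
        if first_line != start_line:
            pairs.add((first_line, start_line))
    return sorted(pairs)


# Hypothetical input: two methods with a copied-and-pasted body.
sample = """
def total(items)
  items.inject(0) { |sum, i| sum + i.price }  # first copy
end

def subtotal(items)
  items.inject(0) { |sum, i| sum + i.price }
end
"""

for first, later in find_duplicates(sample):
    print("line %d looks like a repeat of line %d" % (later, first))
```

On that sample the reported pairs include lines 3 and 7, the two copies of the `inject` line. The sketch also shows why a real tokenizer would be better: with one token per character, a renamed variable breaks the match, whereas a proper lexer could treat any identifier as the same kind of token.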