Using regular expressions in PMD rules

29 Dec 2005

XPath has a bunch of handy functions that you can use for writing PMD rules. For example, the AbstractNaming rule finds abstract classes that aren't named something like "AbstractFoo" using the starts-with function:

//ClassOrInterfaceDeclaration
 [@Abstract='true' and @Interface='false']
 [not (starts-with(@Image,'Abstract'))]
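
If you want to poke at that expression without firing up PMD, here's a rough sketch that runs it against a tiny XML document using the XPath engine that ships with the JDK. The XML is a made-up stand-in for PMD's AST (the real thing is a tree of Java node objects queried through Jaxen), and the AbstractNamingDemo class and its violations method are hypothetical names, so treat this as an illustration of the predicates rather than how PMD actually evaluates rules:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AbstractNamingDemo {
    // Hypothetical XML stand-in for a PMD AST; attribute names mirror the rule.
    static final String TOY_AST =
        "<CompilationUnit>"
      + "<ClassOrInterfaceDeclaration Image='AbstractFoo' Abstract='true' Interface='false'/>"
      + "<ClassOrInterfaceDeclaration Image='Bar' Abstract='true' Interface='false'/>"
      + "<ClassOrInterfaceDeclaration Image='Baz' Abstract='false' Interface='false'/>"
      + "<ClassOrInterfaceDeclaration Image='Qux' Abstract='true' Interface='true'/>"
      + "</CompilationUnit>";

    // The AbstractNaming expression from the post, joined into one string.
    static final String RULE =
        "//ClassOrInterfaceDeclaration"
      + "[@Abstract='true' and @Interface='false']"
      + "[not(starts-with(@Image,'Abstract'))]";

    // Returns the @Image of every node the expression selects.
    static List<String> violations() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(TOY_AST)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(RULE, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(hits.item(i).getAttributes().getNamedItem("Image").getNodeValue());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("toy AST could not be evaluated", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(violations()); // prints [Bar]
    }
}
```

Only Bar is reported: AbstractFoo passes the naming check, Baz isn't abstract, and Qux is an interface.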

But for real power, you need regular expressions - and thanks to Jaxen's extension functions and Jakarta ORO, PMD supports that now! Here's an example; this would find classes named 'Foo' or 'Bar':

//ClassOrInterfaceDeclaration
 [matches(@Image,'^(Foo|Bar)$')]

Note the use of the new matches function - that's where the magic happens. For a more complicated example, here's a rule that checks for hardcoded IP addresses:

//PrimaryExpression/PrimaryPrefix/Literal
 [matches(@Image,
  '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
 \.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
 \.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
 \.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
 )]
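
To try the pattern on its own, here's a minimal sketch using java.util.regex from the JDK instead of Jakarta ORO; the two should agree on a pattern this simple, but that swap is my assumption, as are the HardcodedIpDemo class and containsIp method names. The rule above is presumably wrapped for display, and whitespace inside the quotes would become part of the pattern, so the sketch joins the four octets into one string. It also assumes matches searches anywhere in the string (as fn:matches does in XPath 2.0) and that a string literal's @Image includes its surrounding quotes:

```java
import java.util.regex.Pattern;

class HardcodedIpDemo {
    // One octet, as written in the rule: 250-255, 200-249, or 0-199.
    static final String OCTET = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

    // Four octets separated by literal dots, with no embedded whitespace.
    static final Pattern IP = Pattern.compile(
        OCTET + "\\." + OCTET + "\\." + OCTET + "\\." + OCTET);

    // Unanchored search: true if the pattern occurs anywhere in the literal's image.
    static boolean containsIp(String literalImage) {
        return IP.matcher(literalImage).find();
    }

    public static void main(String[] args) {
        System.out.println(containsIp("\"127.0.0.1\"")); // true
        System.out.println(containsIp("\"hello\""));     // false
        System.out.println(containsIp("\"1.2.3\""));     // false
    }
}
```

Since nothing anchors the pattern, anything with four dotted numbers in it gets flagged too, a version string like "1.2.3.4" for instance. Wrap the pattern in ^" and "$ if you only want literals that are nothing but an address.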

Props to both the Jaxen and the ORO guys for their nice APIs; adding this regular expression support took only about 40 lines of code. Very nice!

More details on this feature are in the RFE. And for more info on PMD, get the book!

Updated 5 Jan 2006: the function is now called 'matches' (it was originally 'regexp'), per the XPath 2.0 spec. Thanks to Daniel Sheppard for the pointer!