Sphinx, Riddle, and escaping special characters

15 Jan 2009

If you're using Sphinx and Riddle, you'll notice that special characters in search queries don't get escaped. This means that if you do an extended mode search for apples -oranges, the dash in -oranges will be treated as a NOT operator. If you're accepting search terms from your users, this will lead to surprises unless you escape that and other special characters.

This functionality is built into the Sphinx PHP API, but I didn't find it in Riddle. Here it is, thanks to backreferences and the block form of gsub:

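# Prefix each Sphinx query operator character with a backslash.
# A nil input is treated as an empty string.
def self.escape_string(s)
  (s || "").gsub(/(:|@|-|!|~|&|"|\(|\)|\\|\|)/) { "\\#{$1}" }
end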
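
Here's a quick usage sketch to show what comes out the other end. The SearchHelper module is a made-up name for illustration only (it isn't part of Riddle or Sphinx); put the method wherever suits your app.

```ruby
# Hypothetical wrapper module; only escape_string itself comes from this post.
module SearchHelper
  def self.escape_string(s)
    (s || "").gsub(/(:|@|-|!|~|&|"|\(|\)|\\|\|)/) { "\\#{$1}" }
  end
end

# The dash is escaped, so it is no longer an operator character.
puts SearchHelper.escape_string("apples -oranges")  # prints: apples \-oranges

# Field operators and quotes typed by a user are escaped too.
puts SearchHelper.escape_string('@title "fruit"')   # prints: \@title \"fruit\"

# nil is handled and comes back as an empty string.
puts SearchHelper.escape_string(nil).inspect        # prints: ""
```

You'd run user input through this before handing the string to your Riddle client's query call.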

I think that covers all the cases, but if you notice anything missing here, please let me know, thanks!

Feb 27 2009 update: Added : and ), thanks Brian!