rcov crashing with [BUG] rb_gc_mark()

25 Aug 2008

While working on some Rails apps for RollStream with Mauricio Fernandez's excellent rcov plugin, we started hitting the "[BUG] rb_gc_mark(): unknown data type" crash. It only showed up when we ran our controller tests; running just the unit tests didn't trigger it. It was a bummer, though, because the crash left us unable to see where we were coverage-wise.

I poked around rcov for a while using Valgrind - there's no Mac OS X port, but I had a Linux VMware Fusion instance handy. After some flailing around I finally hit paydirt. This Valgrind invocation:

valgrind --tool=memcheck --error-limit=no --leak-check=no \
--leak-resolution=low \
--log-file=valgrind.out /usr/local/bin/rcov --rails \
--aggregate coverage.data --text-summary -Ilib --html \
[... lots of controller names here ...]

turned up this problem report:

==13390== Invalid write of size 4
==13390==    at 0x784BE8E: coverage_event_coverage_hook (rcovrt.c:103)
==13390==    by 0x416E85: rb_eval (eval.c:4127)
[... stack elided ...]
==13390==  Address 0x7e419e8 is not stack'd, malloc'd or (recently) free'd

Line 103 of rcovrt.c uses a cov_array struct, indexing its ptr array by source line number. I added some bounds checking like so:

$ diff -Naur rcovrt.c ~/new.rcovrt.c
--- rcovrt.c	2008-08-28 17:50:16.000000000 -0400
+++ /Users/tom/new.rcovrt.c	2008-08-28 17:52:15.000000000 -0400
@@ -64,7 +64,9 @@
           if(!carray->ptr[sourceline])
                   carray->ptr[sourceline] = 1;
   } else {
+   if (carray && carray->len > sourceline) {
          carray->ptr[sourceline]++;
+    }
   }

   return carray;
@@ -98,7 +100,7 @@
static void
coverage_increase_counter_cached(char *sourcefile, int sourceline)
{
- if(cached_file == sourcefile && cached_array) {
+ if(cached_file == sourcefile && cached_array && cached_array->len > sourceline) {
          cached_array->ptr[sourceline]++;
          return;
  }

I rebuilt the gem, reran the coverage task, and huzzah! It completes!

This isn't a great fix, of course - I'd much rather figure out what's wrong with the allocation of cached_array. Perhaps someone cleverer than I can come up with a better fix.
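
For anyone who wants to see the guard in isolation, here's a minimal sketch of the same idea. The struct below is a hypothetical stand-in with just the len and ptr fields the patch touches (the real cov_array in rcovrt.c may well be laid out differently), and increase_counter_checked is a name I made up for illustration. It also rejects negative line numbers explicitly, which the patch above doesn't check for.

```c
#include <stddef.h>

/* Hypothetical stand-in for rcov's cov_array: only the two fields the
 * patch uses. The real definition in rcovrt.c may differ. */
struct cov_array {
    size_t len;          /* number of counters allocated in ptr */
    unsigned int *ptr;   /* one hit counter per source line */
};

/* Bump the counter for sourceline only if it lies inside the array.
 * Returns 1 if the counter was incremented, 0 if the write was skipped. */
static int
increase_counter_checked(struct cov_array *carray, int sourceline)
{
    if (carray && sourceline >= 0 && (size_t)sourceline < carray->len) {
        carray->ptr[sourceline]++;
        return 1;
    }
    /* Out of range (or no array at all): skip the write instead of
     * scribbling past the end of ptr. */
    return 0;
}
```

Like the patch, this only hides the symptom: a skipped write means that line's hit count is silently lost, so whatever leaves the array too short is still worth tracking down.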

Updated 8/27/08: Modified to document a better fix - check the cached_array->len attribute and compare it to the sourceline.