Upgrading PostgreSQL with pg_migrator

30 Sep 2009

I've always done PostgreSQL upgrades via the time-honored "dump and load" method. This works, but it takes a while as you can see from the notes on the RubyForge PostgreSQL 8.2 => 8.3 upgrade.

Happily, with PostgreSQL 8.4, the dump/load routine is no longer necessary. There's a new game in town, and it's pg_migrator. This utility copies the appropriate bits from your old PostgreSQL installation over to your new one, resulting in much shorter downtime.

I've done two 8.3 => 8.4 upgrades so far; here are some notes:

  • If you don't compile PostgreSQL 8.4 with --disable-integer-datetimes you might get an error from pg_migrator: 8.4 stores datetimes as 64-bit integers by default, while 8.3's default was floating-point, and pg_migrator requires the old and new clusters to match. This has happened in both migrations for me. To fix, just recompile with that flag and move on. Also, since you probably did an initdb before running pg_migrator, you'll need to run that again. The easiest fix is probably to just rm -rf /path/to/pgsql84/, then recompile, reinstall, and initdb.
  • When installing pg_migrator I found I needed to first run make USE_PGXS=1 PG_CONFIG=/var/pgsql/bin/pg_config and then run sudo make USE_PGXS=1 PG_CONFIG=/var/pgsql/bin/pg_config install. The two files that need to be copied to your new pgsql directory are func/pg_migrator.so (which goes in /var/pgsql/lib/) and src/pg_migrator (which goes in /var/pgsql/bin/). Now that I'm thinking about it I'm wondering if that "sudo make install" is really necessary... I'll experiment next time I'm doing an upgrade.
  • In one database I had a test table that was a copy of pg_class. This table had a column of type name - an internal catalog type - which caused problems for pg_migrator. In my case I could just drop the table and rerun pg_migrator, so it was no big deal.
  • The docs recommend running vacuumdb --all --analyze on the new cluster once the migration is complete. I found this takes almost as long as the actual migration! But it's probably the right time to do it since the system is down anyhow. Update: In the comments of this post Greg Smith notes that this can be run after the system is brought back online, although performance won't be great until all those stats are recomputed. It would get the system back online more quickly, though.
  • pg_migrator has a --link parameter that does an in-place upgrade. This scares me - I want to be able to quickly fall back if something goes wrong - so I haven't used it. That would probably save a little time... not sure how much. It would definitely save some disk I/O, too. Update: In the comments of this post Bruce Momjian notes that this can save orders of magnitude of time. Update 2: Reran a migration using --link and migration time went from 13 minutes to 1 minute. Nice! Update 3: I used --link when migrating RubyForge from 8.3 to 8.4 - the actual database migration took only 1m17s! Compare that to the dump/load time of 40 minutes a year ago - and it would certainly have been longer this time!
  • For one system I was using the postgres gem to hook my Rails app to the PostgreSQL database. I should upgrade to the pg gem, but in any case, I didn't need to reinstall the gem after this upgrade.
  • I'm also running Sphinx against PostgreSQL and didn't need to reinstall that either; everything just kept on ticking.
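
To fix the integer-datetimes mismatch described above, the recompile-and-reinitialize step might look something like this. The paths and the source directory name are just examples from a setup like mine; adjust for yours:

```shell
# Wipe the half-initialized 8.4 install (assuming /var/pgsql84 holds
# nothing you need yet)
rm -rf /var/pgsql84

# Rebuild 8.4 with float datetimes so it matches the 8.3 cluster's
# on-disk format
cd /usr/local/src/postgresql-8.4.1
./configure --prefix=/var/pgsql84 --disable-integer-datetimes
make
sudo make install

# Re-run initdb, since any earlier run used the wrong datetime format
/var/pgsql84/bin/initdb -D /var/pgsql84/data
```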
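
If you want to check for the name-typed-column problem before running pg_migrator, a catalog query like this should flag any user tables with a column of type name (run here via psql; the database name is an example):

```shell
psql -d mydb -c "
  SELECT c.relname, a.attname
  FROM pg_class c
  JOIN pg_attribute a ON a.attrelid = c.oid
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE a.atttypid = 'name'::regtype
    AND a.attnum > 0  -- skip hidden system columns
    AND n.nspname NOT IN ('pg_catalog', 'information_schema');"
```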
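
Putting the pieces together, the migration run itself looks roughly like the sketch below. The binary and data paths are examples, and the short flag names follow the conventions I'm assuming pg_migrator uses - check pg_migrator --help for your version before relying on them:

```shell
# Stop both postmasters first; pg_migrator needs exclusive access to the clusters

# -b/-B point at the old/new bin directories, -d/-D at the old/new data
# directories; --link hard-links data files instead of copying them
/var/pgsql84/bin/pg_migrator \
    -b /var/pgsql83/bin  -B /var/pgsql84/bin \
    -d /var/pgsql83/data -D /var/pgsql84/data \
    --link

# Start the new cluster, then rebuild the planner stats; this can run while
# the system is back online, at the cost of poor query plans until it finishes
/var/pgsql84/bin/pg_ctl -D /var/pgsql84/data start
/var/pgsql84/bin/vacuumdb --all --analyze
```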

That's all I can think of for now. I'll add some notes to this after I do the RubyForge upgrade... these first couple of upgrades have been about gaining "area familiarity".